The Case for Schema as Code in a Headless CMS
A content editor adds a "subtitle" field through a CMS admin UI on a Friday afternoon. It ships to production. The frontend, which expected that field to be optional and a string, starts rendering an object somewhere deep in a reference chain, and the on-call engineer spends Saturday morning bisecting a content change that never touched a single line of code. There is no pull request to revert, no diff to review, and no test that could have caught it, because the schema lived in a database, not in the repository.
This is the quiet tax of treating your content model as configuration rather than code. When the shape of your content is editable in a UI and stored server-side, it drifts away from the application that consumes it, and the two only meet at runtime in front of real users. Every field rename becomes a coordination problem, every migration becomes a manual ritual, and your type system knows nothing about the data flowing through it.
This article makes the case that your content schema belongs in version control, expressed as code, reviewed like code, and tested like code. We will look at why schema-as-code changes the economics of modelling, how it closes the gap between content and application, and where the headless platform you pick either supports or fights that workflow.
The failure mode: schema drift between content and code
The core problem with UI-defined schemas is that they create two sources of truth that pretend to be one. The CMS holds the authoritative shape of your content. Your application holds an assumption about that shape, encoded in components, queries, and types. Nothing keeps them in sync except discipline and luck. When someone toggles a field from required to optional, or changes a single-value reference into an array, the application does not find out until a query returns something it did not expect.
The consequences scale badly. On a small project, a developer remembers every field. On a platform with hundreds of document types across several teams, nobody holds the whole model in their head, and the admin UI becomes a place where well-meaning changes have a blast radius nobody can predict. The change has no author you can find in git blame, no review thread explaining the intent, and no link to the ticket that motivated it. Debugging starts from the symptom and works backward through a system that hid the cause.
There is also an environment problem. UI-configured schemas live in a specific dataset or space. Promoting a model change from staging to production means either re-doing the clicks by hand or running an export and import that can silently diverge. The model becomes the one part of your stack that does not flow through CI, does not get code review, and cannot be rolled back atomically. Treating schema as code addresses all of these problems together, because the model becomes a file like any other file, with the same guarantees your application code already enjoys.
What schema as code actually buys you
When the content model is a set of files in your repository, it inherits the entire engineering apparatus you already trust. A field rename is a pull request. A reviewer sees the diff, comments on the migration implications, and approves or blocks it before it reaches anyone. The change carries an author, a timestamp, a linked issue, and a revert path. CI runs against it. Branch previews let a reviewer see the editorial experience the change produces before it merges.
The deeper win is type safety across the content-to-application boundary. If your schema is code, it can generate types, and those generated types can flow into the queries and components that consume the content. The compiler becomes the thing that catches the Friday-afternoon field change, at build time, in the pull request, rather than at runtime in front of a user. Sanity makes this concrete with `defineType` and `defineField` schemas authored in TypeScript, and TypeGen, which reads those schemas plus your GROQ queries and emits TypeScript types for the exact shape each query returns. A query that asks for a field you just deleted stops compiling.
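The mechanism can be sketched in plain TypeScript without any CMS library. This is a minimal illustration, not TypeGen and not Sanity's schema format: `articleSchema`, `FieldDef`, `DocumentOf`, and `renderHeading` are hypothetical names invented here to show how a type derived from a schema value turns a removed or retyped field into a compile error.

```typescript
// Hypothetical, simplified schema format used only for illustration.
type FieldTypeMap = { string: string; number: number };
type FieldDef = { type: keyof FieldTypeMap; required: boolean };

const articleSchema = {
  name: "article",
  fields: {
    title: { type: "string", required: true },
    subtitle: { type: "string", required: false },
    wordCount: { type: "number", required: true },
  },
} as const;

// Derive a document type from the field definitions.
// Required fields map to their value type; the rest may be undefined.
type DocumentOf<F extends Record<string, FieldDef>> = {
  [K in keyof F]: F[K]["required"] extends true
    ? FieldTypeMap[F[K]["type"]]
    : FieldTypeMap[F[K]["type"]] | undefined;
};

type Article = DocumentOf<typeof articleSchema.fields>;

// A consumer written against the derived type. If `subtitle` is deleted
// from articleSchema, or its type changes, this function stops compiling.
function renderHeading(doc: Article): string {
  return doc.subtitle ? `${doc.title}: ${doc.subtitle}` : doc.title;
}

const heading = renderHeading({
  title: "Schema as code",
  subtitle: undefined,
  wordCount: 1200,
});
```

The design choice worth noting is that the type is computed from the schema value rather than written next to it, so there is only one place to edit and the compiler reports every consumer that no longer matches.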
Schema as code also makes the model portable and composable. A document type is a value you can import, extend, compose, and share across workspaces. Common field groups become reusable fragments rather than copy-pasted UI configuration. The model can be unit tested. It can be linted. It can be documented inline. None of this is exotic tooling. It is the ordinary discipline of software engineering, finally applied to the part of the stack that has historically resisted it.
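As a rough sketch of what composing and testing a model can look like, the following uses plain objects rather than any particular CMS API. The `Field` and `DocumentType` shapes, the `seoFields` fragment, and the `duplicateFieldNames` check are all assumptions made for this example.

```typescript
// Hypothetical, simplified model shapes for illustration only.
type Field = { name: string; type: string };
type DocumentType = { name: string; fields: Field[] };

// A reusable field group, defined once and shared by several document types.
const seoFields: Field[] = [
  { name: "metaTitle", type: "string" },
  { name: "metaDescription", type: "text" },
];

const article: DocumentType = {
  name: "article",
  fields: [{ name: "title", type: "string" }, ...seoFields],
};

const landingPage: DocumentType = {
  name: "landingPage",
  fields: [{ name: "headline", type: "string" }, ...seoFields],
};

// A lint-style check that a unit test could run in CI:
// returns every field name that appears more than once in a document type.
function duplicateFieldNames(doc: DocumentType): string[] {
  const seen = new Set<string>();
  const dupes = new Set<string>();
  for (const field of doc.fields) {
    if (seen.has(field.name)) dupes.add(field.name);
    else seen.add(field.name);
  }
  return Array.from(dupes);
}

const problems = [article, landingPage].filter(
  (doc) => duplicateFieldNames(doc).length > 0,
);
```

Because the model is ordinary data, the same check runs in a pull request before any editor sees the change.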
Modelling your business, not the database's defaults
Schema as code is not just a workflow convenience; it changes what you are able to model. When the editor is a piece of software you configure rather than a fixed product, you can shape the content model around the way your business actually works instead of bending your business to fit the platform's defaults. This is the first of Sanity's pillars, "model your business," and it is where the structured-content argument gets concrete.
Consider rich text. Most CMSes store it as an HTML blob or Markdown string, which couples your content to a presentation format and makes it opaque to anything other than a browser. Sanity uses Portable Text, a structured representation of rich text as an array of typed blocks, spans, marks, and inline objects. Because it is structured rather than serialized HTML, the same body content maps cleanly onto a web design system, a native mobile renderer, a voice interface, or a machine reader, and you define custom annotations and inline object types as part of the schema in code. You are modelling meaning, not markup.
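To make the structure tangible, here is a small sketch of a Portable Text value and one possible consumer, a plain-text renderer of the kind a voice interface or search indexer might need. The types are deliberately simplified (real documents carry more properties, and block children can also be inline objects), and `productCard` and `toPlainText` are names invented for the example.

```typescript
// Simplified Portable Text shapes, assumed for illustration.
type Span = { _type: "span"; text: string; marks?: string[] };
type Block = { _type: "block"; style?: string; children: Span[] };
type CustomBlock = { _type: string; [key: string]: unknown };
type PortableText = Array<Block | CustomBlock>;

const body: PortableText = [
  {
    _type: "block",
    style: "h2",
    children: [
      { _type: "span", text: "Hello " },
      { _type: "span", text: "world", marks: ["strong"] },
    ],
  },
  // A custom object type defined in the schema (hypothetical).
  { _type: "productCard", productId: "sku-123" },
  {
    _type: "block",
    style: "normal",
    children: [{ _type: "span", text: "Second paragraph." }],
  },
];

function isBlock(node: Block | CustomBlock): node is Block {
  return node._type === "block" && Array.isArray((node as Block).children);
}

// One renderer among many: keep the text, ignore marks and custom blocks.
function toPlainText(value: PortableText): string {
  return value
    .filter(isBlock)
    .map((block) => block.children.map((span) => span.text).join(""))
    .join("\n\n");
}

const plain = toPlainText(body);
```

A web renderer would walk the same array and map `style`, `marks`, and `productCard` onto components instead; the content itself does not change between targets.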
The same logic applies to references, validation, and conditional fields. In a code-defined schema you express that a product must reference at least one category, that a campaign's publish date cannot precede its start date, or that a field only appears when another field has a certain value. These are business rules, and putting them in version-controlled schema files means they are reviewed, tested, and shipped like any other rule in your system, rather than configured once in a UI and forgotten until they fail.
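The three rules named above can be written as small, testable functions. This sketch uses plain TypeScript as a stand-in; it is not Sanity's validation API, and the `Product` and `Campaign` shapes, the `isBundle` flag, and the function names are assumptions made for the example.

```typescript
// Hypothetical document shapes for illustration.
type Reference = { _ref: string };
type Product = {
  title: string;
  categories: Reference[];
  isBundle: boolean;
  bundleItems?: Reference[];
};
type Campaign = { startDate: string; publishDate: string }; // ISO 8601 dates

// Each rule returns true when the document is valid, or an error message.
type RuleResult = true | string;

function hasCategory(product: Product): RuleResult {
  return product.categories.length >= 1
    ? true
    : "A product must reference at least one category";
}

function publishNotBeforeStart(campaign: Campaign): RuleResult {
  return Date.parse(campaign.publishDate) >= Date.parse(campaign.startDate)
    ? true
    : "Publish date cannot precede the start date";
}

// Conditional field: bundleItems is only shown for bundle products.
function showBundleItems(product: Product): boolean {
  return product.isBundle;
}

const checks: RuleResult[] = [
  hasCategory({ title: "Mug", categories: [], isBundle: false }),
  publishNotBeforeStart({ startDate: "2024-05-01", publishDate: "2024-04-20" }),
];
const errors = checks.filter((result) => result !== true);
```

Writing the rules this way keeps them reviewable in a diff and runnable in a test suite, independent of whichever platform hook eventually calls them.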
The editor problem: code-defined does not mean developer-only
The standard objection to schema as code is that it locks non-technical editors out. If the model lives in a repository, the reasoning goes, then editors lose the ability to shape their own tools and become dependent on engineering for every change. This is a real risk with the wrong architecture, and it is worth taking seriously, because a content platform that only serves developers has failed at its actual job.
The resolution is to separate the schema definition from the editing experience. The model is code, authored and reviewed by engineers, but it compiles into a rich editorial application that non-technical users operate every day. Sanity Studio is the clearest example of this split: it is a React application you configure in code and deploy, and the same schema files that give you type safety also generate the editing interface that authors use. Engineers own the model, editors own the content, and neither blocks the other.
Because the Studio is a real application rather than a fixed admin panel, you can go further than schema definition. Structure Builder lets you design the desk the way editors think about their work rather than as a flat list of document types. Custom input components let you replace a default field with a purpose-built control: a color picker, a map, or a product selector wired to an external system. The Presentation Tool and Visual Editing stitch the Studio to a live preview so editors see their changes in the rendered site without giving up the headless separation. Schema as code, done well, produces a better editor, not a worse one.
Migrations, governance, and the audit trail
The hardest part of any long-lived content model is not creating it but changing it safely once real content depends on it. A field rename on an empty dataset is trivial. The same rename across a million documents, with editors actively working and a live site reading the data, is a migration that needs planning, review, and a rollback story. UI-configured schemas give you almost no help here, because there is no diff to reason about and no place to express the migration as a reviewable, repeatable script.
Schema as code turns migrations into ordinary engineering artifacts. The schema diff in a pull request tells reviewers exactly what is changing. The data transformation runs as a script you can test against a cloned dataset before you point it at production. Sanity supports this with content migrations you author and run with the CLI, plus a Content Lake that exposes the full document history rather than just the latest state. Content Releases let you bundle a set of changes and schedule them to go live together, so a model change and the content that depends on it ship as one atomic unit rather than racing each other into production.
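A data transformation of this kind can be kept as a pure function and exercised against sample documents before it touches real content. The sketch below is a generic illustration, not Sanity's migration tooling; the `Doc` shape, `renameField`, and the `subtitle` to `subheading` rename are hypothetical.

```typescript
// Hypothetical, minimal document shape for illustration.
type Doc = { _id: string; _type: string; [field: string]: unknown };

// Rename a field on documents of one type. The function is idempotent:
// documents already migrated, or of another type, are returned unchanged.
function renameField(doc: Doc, type: string, from: string, to: string): Doc {
  if (doc._type !== type || !(from in doc) || to in doc) return doc;
  const next: Doc = { ...doc, [to]: doc[from] };
  delete next[from];
  return next;
}

// Stand-in for a cloned dataset.
const clonedDataset: Doc[] = [
  { _id: "a1", _type: "article", title: "One", subtitle: "First" },
  { _id: "a2", _type: "article", title: "Two", subheading: "Already migrated" },
  { _id: "p1", _type: "page", subtitle: "Pages are out of scope" },
];

const migrated = clonedDataset.map((doc) =>
  renameField(doc, "article", "subtitle", "subheading"),
);

// Running the migration a second time should change nothing.
const migratedTwice = migrated.map((doc) =>
  renameField(doc, "article", "subtitle", "subheading"),
);
```

Idempotence is the property that makes a rollout safe to retry: a script interrupted halfway can simply be run again.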
Governance benefits the same way. Roles & Permissions constrain who can change what, Audit logs record who did change what, and the version-controlled schema records who decided the model should change in the first place. For teams with compliance obligations, Sanity is SOC 2 Type II compliant, GDPR-ready, offers regional data residency, and publishes its sub-processor list, so the governance story extends from the schema file all the way down to where the bytes physically live.
Sanity as the content operating system for this workflow
Step back from the individual features and the pattern is consistent: schema as code only pays off if every layer of the platform respects the same source of truth. A code-defined model that compiles into a generic editor, queries a store that does not understand the model, and emits types nothing consumes is just configuration with extra steps. The value compounds only when the schema, the editor, the query language, the type generation, and the deployment pipeline are all reading from the same definition.
This is where it helps to think of Sanity as a Content Operating System rather than a place to store fields. The schema you author in TypeScript drives Sanity Studio's editing experience, defines the documents in Content Lake, shapes the results that GROQ projections return, feeds TypeGen so your frontend queries are typed end to end, and travels through Content Releases when it changes. GROQ matters here specifically because it lets you ask for exactly the shape a component needs in one round trip, including projections, joins across references with `->`, and filters, so the query is itself a typed contract against the same schema. The model is not a passive description sitting next to your code; it is the thing the entire system is built around.
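As an illustration of the query-as-contract idea, the sketch below pairs a GROQ query with the shape it is expected to return. The `post` and `author` types are assumptions, the `PostListItem` type is written by hand here where a type generator would derive it, and `runPostList` is only an in-memory approximation of what the projection computes, not a GROQ engine.

```typescript
// GROQ: filter by document type, project two fields, and follow the
// author reference with `->` to read the referenced document's name.
const postListQuery = `*[_type == "post"]{ title, "authorName": author->name }`;

// The result shape of the query above (hand-written for this sketch).
type PostListItem = { title: string; authorName: string | null };

// Hypothetical stored documents.
type Author = { _id: string; _type: "author"; name: string };
type Post = {
  _id: string;
  _type: "post";
  title: string;
  author: { _ref: string };
};

// In-memory approximation of the projection, to show the join semantics.
function runPostList(docs: Array<Author | Post>): PostListItem[] {
  const authors = new Map<string, Author>();
  for (const doc of docs) {
    if (doc._type === "author") authors.set(doc._id, doc);
  }
  return docs
    .filter((doc): doc is Post => doc._type === "post")
    .map((post) => ({
      title: post.title,
      authorName: authors.get(post.author._ref)?.name ?? null,
    }));
}

const items = runPostList([
  { _id: "author-1", _type: "author", name: "Ada" },
  { _id: "post-1", _type: "post", title: "Hello", author: { _ref: "author-1" } },
  { _id: "post-2", _type: "post", title: "Orphan", author: { _ref: "missing" } },
]);
```

The component consuming `items` depends only on `PostListItem`, so renaming `name` on the author type surfaces as a type error at the query boundary rather than as a blank byline.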
That is the real argument for schema as code. It is not about preferring text files to forms. It is about collapsing two divergent sources of truth into one, so that the content your editors produce and the application your engineers ship are finally describing the same reality, checked by the same compiler, reviewed in the same pull requests, and shipped through the same pipeline.
Schema-as-code workflow across headless platforms
| Feature | Sanity | Contentful | Strapi | Payload |
|---|---|---|---|---|
| Where the schema lives | Code-first: `defineType` / `defineField` schemas authored in TypeScript, version controlled, and reviewed in pull requests. | Primarily UI-modelled in the web app; content types can be managed via the CMA API or migration scripts, but the canonical path is the admin UI. | Content types are configured in the admin Content-Type Builder, which writes schema JSON files into the project repo you can then commit. | Code-first by design: collections and fields defined in a TypeScript config that lives in your repository alongside the app. |
| Type generation for queries | TypeGen reads schemas plus GROQ queries and emits types for the exact shape each query returns, so a stale query fails to compile. | GraphQL schema and codegen tooling produce types for the API, though query-result typing depends on your own GraphQL codegen setup. | Generates TypeScript types for content-type entities; query-shape typing depends on the REST or GraphQL layer and your tooling. | Generates types from collection configs automatically; types track the model closely because both come from the same TypeScript source. |
| Customising the editor in code | Sanity Studio is a React app you configure and deploy: custom input components, Structure Builder, and plugins all in code. | Editor is a hosted fixed app; the App Framework adds custom apps and field locations, but the core editing surface is not yours to ship. | Admin panel is customisable with React plugins and component overrides, within the bounds of the Strapi admin architecture. | Admin UI is React and extensible with custom field and view components defined in your config and component code. |
| Querying the exact shape you need | GROQ projections fetch exactly the shape in one round trip, joining references with `->` and filtering inline, all typed against the schema. | GraphQL lets you request specific fields; deep reference traversal and filtering follow GraphQL's model and resolver limits. | REST with populate params or GraphQL; nested relations need explicit population and can mean multiple calls or careful query design. | REST and GraphQL APIs with depth-based population for relationships; shape control follows those API conventions. |
| Structured rich text | Portable Text stores rich text as typed, structured blocks with custom annotations, portable across web, native, and machine readers. | Rich Text field returns a structured JSON document you render per platform, with embedded entries and assets. | Rich text available as blocks or Markdown depending on field config; structure varies with the chosen editor. | Lexical-based rich text stored as structured JSON, customisable with your own block and feature definitions. |
| Governed releases and migrations | Content migrations via CLI against cloned datasets, plus Content Releases to bundle and schedule changes as one atomic go-live. | Migration tooling and the CMA support scripted changes; scheduling and release bundling depend on Launch and app-level workflows. | Schema files travel with the repo so model changes deploy with code; data migrations are scripted by you against the database. | Migration files are generated and run for database changes; release bundling is handled at the application and deployment layer. |
| Compliance posture | SOC 2 Type II, GDPR, regional data residency, and a published sub-processor list, with Roles & Permissions and Audit logs. | Enterprise tiers carry SOC 2 and GDPR commitments with role-based access and audit features on higher plans. | Self-hosted, so compliance posture is largely inherited from your own infrastructure and operational controls. | Self-hosted or Payload Cloud; compliance depends on your hosting choice and the controls you put around it. |