Sanity vs Tina: Markdown-First vs Schema-First Editing
A docs site outgrows its Markdown. What started as a clean repo of files becomes a place where nobody can answer a simple question: show me every published product under $150 in this category that also mentions trail running. With Markdown-first tooling like Tina, the answer is a build script that reads files, because the shape of your content lives implicitly in the folder layout, not in a queryable model. Sanity takes the opposite bet, and this guide is about which bet fits your team.
Sanity is the Content Operating System for the AI era, the intelligent backend for teams building content operations at scale, and it models content as structured data in code rather than as prose in files. That single difference, schema-first versus Markdown-first, cascades into everything downstream: how you query, how content stays fresh, who is allowed to edit, and what breaks when you change the model.
TinaCMS is genuinely good at what it targets: Git-backed docs and blogs where the repo is the source of truth and developers own every change. This article does not pretend otherwise. Instead it draws the line honestly, so you can tell when Markdown-first is the right tool and when the shape, reuse, and governance of your content have outgrown files.
Markdown-first vs schema-first: where the content shape lives
The core split is not a UI preference; it is where the shape of your content lives. In a Markdown-first system like Tina, content is stored as Markdown, MDX, and JSON files in the repo, and the model is implicit: a heading is a heading because you wrote a hash, a related-product link is a link because you typed it, and the folder tree is the taxonomy. That implicitness is a feature for a solo developer on a docs site. It becomes a liability the moment two people disagree about what a field means, or a page type needs to be enforced across five hundred documents.
Sanity models content as structured data in code. You declare each document with defineType and each field with defineField, and that schema lives in version control next to your application. Crucially, Content Lake decouples structure from storage: the schema is in code, the content is in the cloud, and you can change one without breaking the other. Rename a field, add a validation rule, or split one type into two, and you are editing code under review, not hand-migrating a thousand files and hoping a regex caught every edge case.
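To make the schema-as-code idea concrete, here is a minimal sketch of a product document declared with defineType and defineField. It rests on two assumptions: the product type and all of its field names are hypothetical, and the two helpers are local stand-ins so the snippet runs on its own (in a Studio they would be imported from the `sanity` package).

```typescript
// Local stand-ins so the sketch runs without dependencies. In a Studio these
// helpers would be imported from the `sanity` package instead.
type FieldDef = {
  name: string;
  type: string;
  title?: string;
  to?: { type: string }[];
};
type TypeDef = {
  name: string;
  type: "document";
  title?: string;
  fields: FieldDef[];
};

const defineField = (field: FieldDef): FieldDef => field;
const defineType = (schemaType: TypeDef): TypeDef => schemaType;

// Hypothetical product model: every name below is illustrative.
const product = defineType({
  name: "product",
  type: "document",
  title: "Product",
  fields: [
    defineField({ name: "title", type: "string" }),
    defineField({ name: "price", type: "number" }),
    defineField({ name: "description", type: "text" }),
    // A typed reference instead of a hand-typed link in prose.
    defineField({ name: "category", type: "reference", to: [{ type: "category" }] }),
  ],
});

// The model is plain data under version control, so it can be inspected and reviewed.
const fieldNames = product.fields.map((f) => f.name);
```

Renaming `price` or adding a field here is a reviewed code change to the model, which is the contrast with conventions spread across frontmatter and folders.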
This maps directly to Sanity's first pillar, model your business. A Markdown file can represent a blog post well because a blog post is mostly prose. It represents a product, a pricing tier, a localized landing page, or an agent's system prompt badly, because those are structured objects with typed fields and references, and forcing them into frontmatter and folders is where Markdown-first stacks start to strain. Schema-first means the model is explicit, enforced, and versioned, so the shape is something your whole team can reason about instead of a convention that lives in one engineer's head.
Querying content: GROQ projections vs reading files at build time
Ask a Markdown-backed repo an interesting question and you quickly discover there is no query layer. To find every published article in a category that references a given product, you read files at build time, parse frontmatter, and stitch relationships in application code, or you stand up a separate search index and keep it in sync yourself. The content is portable and diff-friendly, but it is not queryable in any first-class way.
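As a rough illustration of what "read files, parse frontmatter, filter in application code" looks like, the sketch below answers the product question from the introduction against an in-memory stand-in for a content folder. The file paths, frontmatter keys, and the deliberately naive parser are all assumptions made for the example; a real build script would read from disk and use a proper YAML parser.

```typescript
// Hypothetical in-memory stand-in for files read from a content folder.
const files: Record<string, string> = {
  "products/ridge-runner.md": `---
title: Ridge Runner
status: published
price: 129
category: shoes
---
A light shoe built for trail running on loose rock.`,
  "products/summit-pro.md": `---
title: Summit Pro
status: published
price: 189
category: shoes
---
Stiff and protective for trail running in alpine terrain.`,
  "products/city-glide.md": `---
title: City Glide
status: draft
price: 99
category: shoes
---
A road shoe for daily commutes.`,
};

type Parsed = { path: string; data: Record<string, string>; body: string };

// Minimal frontmatter parser: flat `key: value` pairs only, no nesting or lists.
function parseFile(path: string, raw: string): Parsed {
  const match = /^---\n([\s\S]*?)\n---\n?([\s\S]*)$/.exec(raw);
  const data: Record<string, string> = {};
  if (!match) return { path, data, body: raw };
  for (const line of (match[1] ?? "").split("\n")) {
    const idx = line.indexOf(":");
    if (idx > 0) data[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return { path, data, body: match[2] ?? "" };
}

// The "query" is application code: parse everything, then filter in memory.
const hits = Object.entries(files)
  .map(([path, raw]) => parseFile(path, raw))
  .filter(
    (doc) =>
      doc.data["status"] === "published" &&
      doc.data["category"] === "shoes" &&
      Number(doc.data["price"]) < 150 &&
      doc.body.toLowerCase().includes("trail running"),
  )
  .map((doc) => doc.data["title"]);
```

Every value arrives as a string and every relationship is a convention, so the typing, the coercion (`Number(...)`), and the substring match all live in the script rather than in the content model.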
Sanity queries content with GROQ, which lets you filter, project, and follow references in a single round trip. You write the predicate and you get exactly the shape you asked for, the same contract you get from SQL or GraphQL, without over-fetching. Where GROQ goes further is blended retrieval. Pure structured query, in the words of the Sanity docs, falls over the moment the user says "something like X" or "the cozy one," because it requires you to know exactly what you are looking for. So GROQ can combine structured predicates with relevance ranking. A single query can filter on the predicates that have to hold, then rank with score and boost: a BM25 keyword match against the title, boosted by a factor of two because title hits matter more, blended with text::semanticSimilarity across the document, and ordered by _score descending.
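A blended query of the kind described above might look roughly like the following, held as a string next to its parameters. Treat it as a sketch under stated assumptions: the document type, field names, and parameter names are hypothetical, and the exact argument shape of text::semanticSimilarity is inferred from the description in this article rather than confirmed, so check the GROQ reference before relying on it.

```typescript
// Hypothetical document type and field names; $category, $maxPrice, and
// $phrase are query parameters supplied by the caller.
// ASSUMPTION: the argument shape of text::semanticSimilarity is not confirmed
// here; verify it against the GROQ reference.
const blendedQuery = `
*[_type == "product" && category->slug.current == $category && price < $maxPrice]
  | score(
      boost(title match text::query($phrase), 2),
      text::semanticSimilarity($phrase)
    )
  | order(_score desc)
  { _id, title, price, _score }
`;

// Parameters are passed separately from the query text rather than interpolated.
const params = { category: "trail-shoes", maxPrice: 150, phrase: "the cozy one" };
```

The filter inside the brackets holds the predicates that must be true; the score stage only reorders documents that already passed it.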
That is queryability flat Markdown files cannot offer without an added index and a second system to reason about. It also maps to the pillar power anything: one structured store you can ask for a shape, a filter, a reference-follow, and a semantic ranking, whether the caller is a frontend, a build step, or an agent. The moment your content needs to be reused across channels rather than rendered by one site, the difference between a query language and a folder of files stops being academic.
The freshness problem: who keeps the index in sync
Any team that bolts semantic search onto a Markdown repo inherits a maintenance obligation most people underestimate at the start. When a product description updates, when a price changes, when an article publishes, when a record is deleted, the search index has to know. Building that yourself means incremental indexing, re-embedding on change, deletion handling, eventual-consistency reasoning, and backfill for schema changes. In the Sanity docs' phrasing, that is a real project and a class of bug all its own, and with Markdown-in-Git plus a separate index it becomes a permanent roadmap line item rather than a solved problem.
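To show the shape of that obligation, here is a minimal in-memory sketch of the sync logic a team would own: skip unchanged documents, re-embed changed ones, and drop deleted ones. Everything in it is hypothetical, including the `embed` stand-in (a real one would call an embedding model) and the fingerprint scheme; it is meant to illustrate the moving parts, not to be a production indexer.

```typescript
type Doc = { id: string; text: string };
type IndexEntry = { hash: string; vector: number[] };

// Hypothetical stand-in for an embedding call; a real one would hit a model API.
function embed(text: string): number[] {
  const vector = [0, 0, 0, 0];
  for (let i = 0; i < text.length; i++) {
    vector[i % 4] = (vector[i % 4] ?? 0) + text.charCodeAt(i);
  }
  return vector;
}

// Cheap content fingerprint (length plus a djb2-style hash) to detect changes.
function fingerprint(text: string): string {
  let h = 5381;
  for (let i = 0; i < text.length; i++) h = ((h * 33) ^ text.charCodeAt(i)) >>> 0;
  return `${text.length}:${h.toString(16)}`;
}

class SearchIndex {
  private entries = new Map<string, IndexEntry>();
  embedCalls = 0;

  // Incremental indexing: re-embed only when the content actually changed.
  upsert(doc: Doc): void {
    const hash = fingerprint(doc.text);
    if (this.entries.get(doc.id)?.hash === hash) return;
    this.embedCalls++;
    this.entries.set(doc.id, { hash, vector: embed(doc.text) });
  }

  // Deletion handling: a removed record must also leave the index.
  remove(id: string): void {
    this.entries.delete(id);
  }

  // Reconcile against the full content set, for example after missed events.
  sync(docs: Doc[]): void {
    const live = new Set(docs.map((d) => d.id));
    for (const id of Array.from(this.entries.keys())) {
      if (!live.has(id)) this.remove(id);
    }
    for (const doc of docs) this.upsert(doc);
  }

  has(id: string): boolean {
    return this.entries.has(id);
  }

  get size(): number {
    return this.entries.size;
  }
}

const index = new SearchIndex();
index.sync([
  { id: "a", text: "Trail shoe" },
  { id: "b", text: "Road shoe" },
]);
// Second pass: "a" changed and must be re-embedded, "b" was deleted.
index.sync([{ id: "a", text: "Trail shoe, updated" }]);
```

Even this toy version leaves out retries, ordering of concurrent updates, and backfill when the embedding model or schema changes, which is where most of the ongoing cost tends to sit.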
Sanity handles this because retrieval is wired into the content backend. Content Lake keeps the search and retrieval index fresh as content changes, so when a document updates or a record is deleted, the index reflects it without a pipeline you own and page on. The freshness problem stops being something you maintain and becomes a property of the store.
This is one of the five reasons to reach for Sanity over a legacy or file-based approach: those systems stop at publishing, while Sanity operates content end to end, from modeling through query and retrieval. For a docs site that is read once by one frontend, the freshness tax may never come due. For a content operation feeding multiple channels, a search experience, and increasingly an agent that retrieves from your content, the difference between owning that pipeline and inheriting it is often the difference between shipping the feature this quarter and adding it to the backlog behind everything else that also broke the index.
Editing experience: a code-owned Studio vs a visual overlay on Git
TinaCMS earns real affection here, so be precise about what it does. Tina puts a visual editing overlay on your running site, and edits write back to Markdown, MDX, and JSON in the repo. For a developer editing their own docs or blog, that loop is fast and intuitive, and because the repo is the source of truth, everything is diffable and reviewable in the tools engineers already use. If your editors are engineers and your content is prose, that is a strong fit that needs no apology.
The tension appears when editors are not engineers. In a Git-backed, Markdown-first model, most substantive changes still flow through a pull request and a deploy. That is exactly the workflow marketing, legal, or operations reviewers are least equipped for, and it is why file-based editing tends to concentrate change in the hands of developers.
Sanity ships a fully customizable React Studio you build and deploy as code. It is not a fixed form: you write custom input components, shape the editing experience with Structure Builder, and edit rich content as Portable Text, structured rich text that maps cleanly to design systems and is readable by agents rather than locked into one renderer. Visual Editing and the Presentation Tool stitch that editor to a live preview without giving up the structured model underneath, so you get the immediacy of editing-in-context that made overlays appealing, while the content stays typed data instead of prose in files. The Studio is a React app you ship, which is the whole point: it adapts to how your team works instead of asking your team to adapt to a fixed UI or a Git workflow.
Governance: typed fields as access control
Here is the counter-intuitive part that most Markdown-versus-schema comparisons miss. Splitting a document into typed fields is not just modeling, it is access control. Consider an agent's system prompt, which many teams keep as a string in the codebase. The marketing team cannot read it, the compliance team cannot review it, and when it says something embarrassing in production, the fix is a pull request. That is the file-based trap generalized: content that matters to several teams gated behind a workflow only one team can operate.
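One way to picture fields as access control is a small sketch in which an agent prompt is split into typed fields and each field has an owning role. The roles, field names, and the `applyPatch` helper are all hypothetical and do not represent a Sanity API; the point is only that once content is fielded, ownership can be expressed per field instead of per file.

```typescript
type Role = "marketing" | "compliance" | "engineering";

// Hypothetical agent-prompt document split into typed fields.
type AgentPrompt = {
  toneOfVoice: string;
  legalDisclaimer: string;
  toolInstructions: string;
};

// Which role may edit which field. Illustrative policy, not a Sanity API.
const fieldOwners: Record<keyof AgentPrompt, Role> = {
  toneOfVoice: "marketing",
  legalDisclaimer: "compliance",
  toolInstructions: "engineering",
};

// Apply only the parts of a patch the role owns; report the rest as rejected.
function applyPatch(
  doc: AgentPrompt,
  patch: Partial<AgentPrompt>,
  role: Role,
): { doc: AgentPrompt; rejected: (keyof AgentPrompt)[] } {
  const next = { ...doc };
  const rejected: (keyof AgentPrompt)[] = [];
  for (const key of Object.keys(patch) as (keyof AgentPrompt)[]) {
    const value = patch[key];
    if (value === undefined) continue;
    if (fieldOwners[key] === role) next[key] = value;
    else rejected.push(key);
  }
  return { doc: next, rejected };
}

const original: AgentPrompt = {
  toneOfVoice: "Neutral.",
  legalDisclaimer: "Prices may vary by region.",
  toolInstructions: "Use the catalog lookup before answering.",
};

// Marketing may change the tone but not the legal text.
const result = applyPatch(
  original,
  { toneOfVoice: "Warm and direct.", legalDisclaimer: "None needed." },
  "marketing",
);
```

With a single string in the codebase there is nothing to attach such a rule to; with typed fields, the same document can be read by everyone and edited piecewise by the team responsible for each part.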
Enterprise, cost, and lock-in: what you own and what you inherit
On paper, Git-backed Markdown looks like the low-lock-in option, and in one narrow sense it is: your content is plain files in a repo you control, and exporting is trivial. That is a genuine advantage worth weighing, especially for content that is fundamentally documents. Be honest about it rather than arguing it away.
The lock-in that actually bites at scale is rarely the storage format. It is the surrounding machinery you build and cannot easily walk away from: the custom build scripts that turn files into a queryable shape, the bespoke search index and its sync pipeline, the deploy-gated editing workflow that only your engineers understand, and the growing pile of conventions that live in nobody's documentation. Migrating the Markdown is easy; migrating everything you wrapped around it to make it behave like a content platform is the hard part.
Sanity's answer is a shared foundation rather than a stack of glue. Schema-as-code means your model is portable and reviewable, TypeGen generates TypeScript from that schema so your queries are typed end to end, and Content Lake gives you one queryable store instead of a repo plus an index plus a build step. On the enterprise controls buyers ask about, Sanity carries SOC 2 Type II, GDPR compliance, regional hosting and data residency options, and a published sub-processor list, so the governance conversation has concrete answers rather than a promise to add auditing later. This is the underlying differentiator: legacy and homegrown systems create silos, while Sanity provides a shared foundation. The cost you are comparing is not license versus free; it is a maintained platform versus the ongoing engineering you would otherwise pour into making files act like a database.
A decision framework: when Markdown-first is right and when it isn't
Choose Tina, or Markdown-first tooling in general, when your content is genuinely documents, your editors are developers, and your repo being the single source of truth is a benefit rather than a constraint. Docs sites, engineering blogs, changelogs, and personal or small-team sites are the sweet spot. If most of your edits are prose, most of your editors live in a code editor already, and you never need to ask your content a structural question at runtime, the simplicity of files is a real win and the schema overhead would be ceremony you do not need.
Choose Sanity when the shape, reuse, and governance of your content have outgrown files. The signals are concrete: you need to query content by field, filter, and reference rather than read files at build time; the same content feeds several channels rather than rendering into one site; non-developers need to own and edit content without learning Git; you want blended retrieval where a GROQ query combines structural predicates with text::semanticSimilarity in one round trip; or you are starting to feed content to agents and RAG and want it structured and fresh rather than scraped from HTML.
Sanity is positioned where structured content becomes fuel for agents and retrieval, with Agent Actions exposing schema-aware APIs for generating, transforming, and translating content, and a strong Next.js story through next-sanity. The reframe of this whole comparison is that the question was never Markdown versus a database. It is whether your content should stay a set of files rendered by one site, or become a modeled, queryable, governed source of truth that many surfaces, including future ones you have not built yet, can draw from. Answer that, and the platform choice follows.
Sanity vs Tina vs Contentful vs Strapi: how content shape and queryability differ
| Feature | Sanity | Tina (TinaCMS) | Contentful | Strapi |
|---|---|---|---|---|
| Where content shape lives | Schema-as-code with defineType and defineField in version control; Content Lake decouples structure from storage so you change one without breaking the other. | Shape is implicit in Markdown, MDX, and JSON files in the repo; the file layout is the model, so structure and storage are the same thing. | Schema defined via GUI or CLI, but definitions live inside the platform and are coupled to the stored content. | Schema defined in code and admin UI, tied to a relational database you provision and run. |
| Query and retrieval | GROQ filters, projects, and follows references in one round trip, and can blend text::query with text::semanticSimilarity via score and boost for hybrid retrieval. | Flat files have no query layer; querying across content means adding a separate index or reading files at build time. | GraphQL and REST delivery APIs return content by type and field, without a native blended keyword-plus-semantic ranking primitive. | REST and GraphQL over the database; full-text and semantic ranking are add-ons, often via LangChain plus your own pipeline. |
| Editor customization | Fully customizable React Studio: custom input components, Structure Builder, Portable Text, and Visual Editing you ship as code. | Visual editing overlay on the running site, tuned for developer-owned docs and blogs where the repo is source of truth. | Fixed editorial UI extended only through UI extensions in predefined slots, not fully code-customizable. | Configurable admin panel with plugins, but not a code-first editor you fully own and reshape. |
| Index freshness | Content Lake keeps the retrieval index fresh as content changes, so re-embedding, deletion handling, and backfill stop being something you maintain. | A separate search or semantic index must be rebuilt on change, which becomes a permanent roadmap line item. | Delivery API reflects published content, but blended semantic freshness depends on external tooling you wire up. | Freshness of any external index is on you: incremental indexing and re-embedding on change are your project to build. |
| Governance for non-developers | Typed fields are access control: teams own different fields, with real-time collaboration, version history, Content Releases, scheduled publishing, and rollback in the Studio. | Most edits route through a pull request and a deploy, which suits engineers more than brand, legal, or ops reviewers. | Roles and workflows for non-developers are strong, though the editing surface itself stays within predefined extension points. | Role-based access and draft/publish exist, but review still centers on the admin UI and your own deployment flow. |
| Typed developer workflow | Schemas in code generate types via TypeGen, so query results are typed end to end alongside GROQ. | Types are hand-authored or generated from frontmatter conventions rather than a central schema-to-types tool. | GraphQL codegen gives typed queries against the delivery schema. | TypeScript support exists and GraphQL codegen is available against the API. |