How to Integrate LlamaIndex with Your Headless CMS
Connect LlamaIndex to structured Sanity content so your teams can build real-time RAG search, support agents, and editorial copilots without scraping pages or copying content by hand.
What is LlamaIndex?
LlamaIndex is an open-source data framework for building LLM applications, especially retrieval-augmented generation, agent workflows, and semantic search. It helps developers connect private data sources to models from OpenAI, Anthropic, Cohere, local models, and other providers. Teams use it when an LLM needs grounded answers from product docs, knowledge bases, policies, commerce data, or editorial content.
Why integrate LlamaIndex with a headless CMS?
LLMs are only useful when they can answer from the content your team actually publishes. If your product docs, help center articles, landing pages, and policy pages live in one system, but your LlamaIndex app uses a separate spreadsheet or stale export, answers drift fast. A support agent might cite last month’s return policy. A site search tool might miss a new product category. An editorial workflow might generate suggestions from archived copy.
Architecture overview
A typical Sanity and LlamaIndex integration follows one pipeline:

1. An editor publishes or updates content in Sanity Studio, and the content mutation lands in the Content Lake.
2. A GROQ-powered webhook filters for the document types you want to index, such as article, product, policy, or faq, and sends the changed document ID to a webhook endpoint or Sanity Function.
3. The handler uses @sanity/client to fetch the full document with a GROQ projection, including referenced fields like author name, category title, related product SKUs, and locale metadata.
4. The handler converts the result into one or more LlamaIndex Document objects, adds metadata such as sanityId, documentType, slug, locale, and publishedAt, and inserts the documents into a LlamaIndex index. In production, that index is usually backed by LlamaCloud or a vector database supported by LlamaIndex.
5. When an end user asks a question in your site search, support bot, or internal tool, your app queries LlamaIndex, which retrieves the most relevant indexed chunks, passes them to the selected LLM, and returns a grounded answer with source metadata that can link back to the Sanity document.
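As an illustration, the webhook in step 2 is configured with a GROQ filter and an optional projection in the Sanity dashboard. A sketch, using the example type names above; delta::operation() tells your handler whether the event was a create, update, or delete:

```groq
// Webhook filter: fire only for the document types you index
_type in ["article", "product", "policy", "faq"]

// Webhook projection: send the handler only what it needs
{
  _id,
  _type,
  "operation": delta::operation()
}
```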
Common use cases
RAG search for docs and help centers
Index published articles from Sanity, then use LlamaIndex to answer user questions with citations to the source page.
Support agent grounding
Feed policies, troubleshooting guides, and product specs into LlamaIndex so a support agent answers from approved content.
Product discovery assistants
Sync product descriptions, categories, fit notes, and buying guides so shoppers can ask natural-language questions like “Which jacket works for rain under $150?”
Editorial gap analysis
Use LlamaIndex over Sanity content to find missing FAQs, outdated claims, duplicate topics, and content that needs SME review.
Step-by-step integration
Step 1: Set up LlamaIndex
For local LlamaIndex, install the SDK and configure your model provider key, for example OPENAI_API_KEY. If you’re using LlamaCloud managed indexes or LlamaParse, create a LlamaCloud account and API key, then note your project and index name. For a TypeScript app, install the packages with npm install llamaindex @llamaindex/openai @sanity/client express.
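As a minimal sketch, the global model configuration for a TypeScript app might look like this; both model names are illustrative, not requirements:

```typescript
import { Settings } from "llamaindex";
import { OpenAI, OpenAIEmbedding } from "@llamaindex/openai";

// Global defaults LlamaIndex uses for generation and embeddings.
// Model names are examples; use whatever your provider account supports.
Settings.llm = new OpenAI({ model: "gpt-4o-mini" });
Settings.embedModel = new OpenAIEmbedding({ model: "text-embedding-3-small" });
```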
Step 2: Model indexable content in Sanity Studio
Define schema fields that map cleanly to retrieval, such as title, slug, excerpt, body, category, locale, product references, publish date, and review status. Keep source text and metadata separate. LlamaIndex can use the body for embeddings, while metadata fields handle filtering, citations, routing, and permissions.
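As a sketch, an article type modeled for retrieval might look like this in schema-as-code; the field names are examples, not requirements:

```typescript
import { defineField, defineType } from "sanity";

// Example "article" schema: body carries the searchable text, while the
// remaining fields become LlamaIndex metadata for filtering and citations.
export const article = defineType({
  name: "article",
  type: "document",
  title: "Article",
  fields: [
    defineField({ name: "title", type: "string" }),
    defineField({ name: "slug", type: "slug", options: { source: "title" } }),
    defineField({ name: "excerpt", type: "text" }),
    defineField({ name: "body", type: "array", of: [{ type: "block" }] }),
    defineField({ name: "category", type: "reference", to: [{ type: "category" }] }),
    defineField({ name: "locale", type: "string" }),
    defineField({ name: "publishedAt", type: "datetime" }),
    defineField({ name: "reviewStatus", type: "string" }),
  ],
});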
Step 3: Create a GROQ projection
Write a GROQ query that returns only the fields LlamaIndex needs. Include joins across references, for example category->title or author->name, so your sync code doesn’t need extra round trips.
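For example, a projection for the article type above might look like this; the field names assume the example schema, so adjust them to your own types:

```typescript
// GROQ projection that joins referenced fields in a single round trip,
// so the sync handler gets everything LlamaIndex needs in one fetch.
export const articleProjection = /* groq */ `*[_id == $id][0]{
  _id,
  _type,
  title,
  excerpt,
  "slug": slug.current,
  "category": category->title,
  "author": author->name,
  locale,
  publishedAt,
  body
}`;
```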
Step 4: Add a webhook or Function
Create a Sanity webhook filtered to published document types, or use Functions if you want the sync logic to run inside Sanity’s event system. Send the document ID, mutation type, and dataset to your handler. For deletes, remove the corresponding document from your LlamaIndex-backed index or mark it as unpublished.
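For the delete case, a minimal sketch of a handler branch, assuming the webhook projection includes delta::operation() and that each Document was inserted with its Sanity _id as its id_ (both are choices this guide makes, not defaults):

```typescript
// Hypothetical delete branch inside the webhook handler (see the full code
// example below for how `id` and `index` are defined). The `operation` field
// comes from a webhook projection that includes delta::operation().
const operation = String(req.body.operation || "");
if (operation === "delete") {
  // Remove the document's chunks from the index; `true` also clears it from
  // the doc store. Assumes Documents were inserted with id_ = Sanity _id.
  await index.deleteRefDoc(id, true);
  return res.json({ ok: true, removed: id });
}
```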
Step 5: Insert documents into LlamaIndex
Convert Sanity documents into LlamaIndex Document objects. Put searchable text in the text field and fields like sanityId, slug, locale, and documentType in metadata, then insert the document into your index; the code example below shows this end to end. For small prototypes, a local persisted index is fine. For production, use LlamaCloud or a vector database through LlamaIndex so multiple app instances read the same index.
Step 6: Test retrieval in the frontend
Build a small query route that calls LlamaIndex, ask it 10 to 20 real user questions, and check whether the returned source IDs match the expected Sanity documents; a sketch of such a route follows the code example below. Test updates too: publish a changed policy, trigger the webhook, and confirm the new answer appears without a manual re-index.
Code example
```typescript
import express from "express";
import { createClient } from "@sanity/client";
import { Document, Settings, VectorStoreIndex, storageContextFromDefaults } from "llamaindex";
import { OpenAIEmbedding } from "@llamaindex/openai";

// Embedding model used when documents are inserted into the index.
Settings.embedModel = new OpenAIEmbedding({ model: "text-embedding-3-small" });

// Sanity client for fetching full documents when the webhook fires.
// useCdn: false ensures the freshly published version is returned.
const sanity = createClient({
  projectId: process.env.SANITY_PROJECT_ID!,
  dataset: process.env.SANITY_DATASET!,
  apiVersion: "2025-02-19",
  token: process.env.SANITY_READ_TOKEN!,
  useCdn: false,
});

// Local persisted index; swap in LlamaCloud or a vector database for production.
const storageContext = await storageContextFromDefaults({ persistDir: "./llamaindex-storage" });
const index = await VectorStoreIndex.init({ storageContext });

const app = express();
app.use(express.json());

app.post("/api/sanity-webhook", async (req, res) => {
  // Normalize draft IDs so drafts and published documents map to one entry.
  const id = String(req.body._id || "").replace(/^drafts\./, "");
  if (!id) return res.status(400).json({ error: "Missing document ID" });

  // Fetch only the fields retrieval needs, joining the category reference.
  const doc = await sanity.fetch(
    `*[_id == $id][0]{
      _id,
      _type,
      title,
      excerpt,
      "slug": slug.current,
      "category": category->title,
      body[]{children[]{text}}
    }`,
    { id }
  );
  if (!doc) return res.status(404).json({ error: "Document not found" });

  // Flatten Portable Text blocks into plain text for embedding.
  const bodyText = (doc.body || [])
    .flatMap((block: any) => (block.children || []).map((child: any) => child.text))
    .join("\n");

  // Searchable text goes in `text`; everything else becomes filter and
  // citation metadata. A stable id_ lets later updates and deletes target
  // this document.
  await index.insert(new Document({
    id_: doc._id,
    text: [doc.title, doc.excerpt, bodyText].filter(Boolean).join("\n"),
    metadata: {
      sanityId: doc._id,
      type: doc._type,
      slug: doc.slug,
      category: doc.category,
    },
  }));

  // Persist so the index survives restarts.
  await index.storageContext.persist({ persistDir: "./llamaindex-storage" });
  res.json({ ok: true, indexed: doc._id });
});

app.listen(3000);
```
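To test retrieval (step 6), a minimal query script against the same persisted index might look like this; the question text is an example, and the persistDir assumes the setup above:

```typescript
import { VectorStoreIndex, storageContextFromDefaults } from "llamaindex";

// Load the same persisted index the webhook handler writes to.
const storageContext = await storageContextFromDefaults({ persistDir: "./llamaindex-storage" });
const index = await VectorStoreIndex.init({ storageContext });

// Ask a real user question and inspect which Sanity documents were cited.
const queryEngine = index.asQueryEngine();
const response = await queryEngine.query({
  query: "Which jacket works for rain under $150?",
});

console.log(response.toString()); // the grounded answer
for (const source of response.sourceNodes ?? []) {
  // Metadata added at index time links each chunk back to Sanity.
  console.log(source.node.metadata.sanityId, source.score);
}
```

Check that the logged sanityId values match the documents you expect for each question before wiring the route into your frontend.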
Build your LlamaIndex integration on Sanity
Sanity gives you the structured content foundation, real-time event system, and flexible APIs you need to connect LlamaIndex to published content without crawler-based indexing.
Start building free →

CMS approaches to LlamaIndex
| Capability | Traditional CMS | Sanity |
|---|---|---|
| Source data for retrieval | Content often lives as rendered pages or large HTML fields, so indexing needs scraping, cleanup, and custom parsers. | The Content Lake exposes typed JSON, references, and metadata that LlamaIndex can turn into documents and filters. |
| Sync on publish | Teams often use scheduled exports or crawler refreshes, which can leave answers stale for hours. | GROQ-filtered webhooks and Functions can trigger indexing only for relevant content events. |
| Field-level query control | Indexing frequently pulls whole pages, including navigation, footer copy, and unrelated modules. | GROQ selects exact fields and joins references in one query, so LlamaIndex receives cleaner text and useful metadata. |
| Editorial workflow fit | Editors publish pages, then technical teams handle separate search or AI indexing steps. | Sanity Studio can include review fields, Tasks, Content Releases, and indexing status fields in the same editorial workflow. |
| Production AI agent access | Agents usually query a copied index or scraped site content, which makes permissions and freshness harder to control. | Agent Context gives production AI agents read-only, scoped access to structured content when direct content querying is a better fit than vector retrieval. |
| Trade-off to plan for | Fast to start if all content is page-based, but AI retrieval quality depends on cleanup. | You’ll spend time modeling content well in schema-as-code, but that structure pays off when LlamaIndex needs clean retrieval data. |
Keep building
Explore related integrations to complete your content stack.
Sanity + OpenAI
Use OpenAI models with structured Sanity content for generation, summarization, embeddings, and RAG answers.
Sanity + Anthropic (Claude)
Connect Claude to approved Sanity content for long-context analysis, editorial review, and grounded agent workflows.
Sanity + Writer
Pair Writer with Sanity to create governed content workflows that use brand rules, approved terminology, and structured fields.