How to Integrate Qdrant with Your Headless CMS
Connect Qdrant to your headless CMS to turn structured content into vector search, recommendations, and RAG-ready answers that update when editors publish.
What is Qdrant?
Qdrant is an open-source vector database built for similarity search and retrieval. Teams use it to store embeddings, filter vectors by metadata, and serve search or RAG results through Qdrant Cloud or self-hosted clusters. It’s commonly used by AI product teams building semantic search, support bots, recommendation systems, and retrieval pipelines.
Why integrate Qdrant with a headless CMS?
Vector search is only as good as the content you index. If product descriptions, docs, pricing notes, and support articles live in separate tools, your Qdrant collection gets stale fast. Editors publish a new troubleshooting guide, but your support bot still answers from last week’s content. A product name changes, but search results keep returning the old label.
Connecting Qdrant to a headless CMS fixes the handoff between editorial work and retrieval. With structured content, you can create embeddings from clean fields like title, summary, body, category, locale, and product references instead of scraping rendered HTML. With Sanity’s Content Lake, GROQ, webhooks, and Functions, a publish event can fetch exactly the fields Qdrant needs, create or update a vector point, and attach metadata for filtering, such as language, content type, slug, and publish date.
The trade-off is that Qdrant doesn’t create embeddings for you. You still need an embedding model, a chunking strategy, and delete handling. But once those pieces are defined, the integration replaces manual exports, nightly batch jobs, and one-off scripts with an event-driven sync path that keeps retrieval close to editorial truth.
Architecture overview
A typical Sanity and Qdrant integration starts with structured documents in the Content Lake. When an editor publishes, updates, or deletes content, a Sanity webhook fires with the document ID and the operation. That event can call a Sanity Function, a Next.js route, or another server-side listener.
The sync handler uses @sanity/client and GROQ to fetch only the fields needed for retrieval, for example title, excerpt, Portable Text body, slug, locale, and referenced product names. It then turns the selected text into one or more embeddings using your chosen embedding provider. Each embedding is sent to Qdrant through the REST API or @qdrant/js-client-rest by calling upsert on a collection, with the vector plus a payload containing Sanity metadata. For deletions, the handler calls Qdrant delete for the matching point ID, or for every matching point if the document was split into chunks.
At query time, your app embeds the user’s search text, sends the vector to Qdrant’s search or query endpoint, applies payload filters like locale or content type, and receives the matching points with their scores and payloads. The frontend can render results directly from those payloads, or use the Sanity document IDs stored in them to fetch fresh content from the Content Lake.
Functions are a good fit when you want this sync to run server-side on content events without running separate infrastructure.
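The sync path described above can be sketched as a small planner that turns one content event into one Qdrant REST request. This is a minimal sketch, not a reference implementation: the event shape, the `ContentDoc` fields, the `content` collection name, and the `sanityId` payload key are all assumptions made for illustration. `fetchDoc` stands in for a GROQ query through @sanity/client, `embed` for your embedding provider, and `toPointId` for whatever mapping you use from Sanity IDs to Qdrant point IDs (which must be unsigned integers or UUIDs).

```typescript
// Hypothetical shapes: Sanity webhook payloads are configurable, so treat
// these fields as assumptions rather than a fixed format.
type SyncEvent = { id: string; operation: "create" | "update" | "delete" };
type ContentDoc = { _id: string; _type: string; title: string; text: string; locale: string; slug: string };
type QdrantRequest = { method: "PUT" | "POST"; path: string; body: unknown };

const COLLECTION = "content"; // assumed collection name

// Upsert one point (vector plus payload) via PUT /collections/{name}/points.
function buildUpsert(pointId: string, vector: number[], doc: ContentDoc): QdrantRequest {
  return {
    method: "PUT",
    path: `/collections/${COLLECTION}/points?wait=true`,
    body: {
      points: [{
        id: pointId,
        vector,
        payload: { sanityId: doc._id, type: doc._type, title: doc.title, locale: doc.locale, slug: doc.slug },
      }],
    },
  };
}

// Delete every point whose payload references the given Sanity document.
function buildDelete(sanityId: string): QdrantRequest {
  return {
    method: "POST",
    path: `/collections/${COLLECTION}/points/delete?wait=true`,
    body: { filter: { must: [{ key: "sanityId", match: { value: sanityId } }] } },
  };
}

// Decide what to send to Qdrant for one content event.
async function planSync(
  event: SyncEvent,
  fetchDoc: (id: string) => Promise<ContentDoc | null>,
  embed: (text: string) => Promise<number[]>,
  toPointId: (sanityId: string) => string,
): Promise<QdrantRequest> {
  if (event.operation === "delete") return buildDelete(event.id);
  const doc = await fetchDoc(event.id);
  if (!doc) return buildDelete(event.id); // document no longer available: clean up
  const vector = await embed(`${doc.title}\n\n${doc.text}`);
  return buildUpsert(toPointId(doc._id), vector, doc);
}

// Send a planned request; baseUrl is your cluster URL, apiKey your Qdrant key.
async function send(baseUrl: string, apiKey: string, req: QdrantRequest): Promise<void> {
  const res = await fetch(baseUrl + req.path, {
    method: req.method,
    headers: { "content-type": "application/json", "api-key": apiKey },
    body: JSON.stringify(req.body),
  });
  if (!res.ok) throw new Error(`Qdrant ${req.method} ${req.path} failed: ${res.status}`);
}
```

Keeping request planning separate from sending makes the handler easy to test without a live cluster. Deleting by payload filter rather than by a single point ID also covers documents that were split into several chunks.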
Common use cases
Semantic site search
Index articles, product pages, and docs in Qdrant so users can search by meaning instead of exact keywords.
RAG for support agents
Retrieve the most relevant Sanity-authored help content before an AI agent generates an answer.
Product recommendations
Use product descriptions, tags, and editorial buying guides as vectors for similarity-based recommendations.
Localized retrieval
Store locale, region, and audience metadata in Qdrant payloads so search results match the user’s language and market.
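For the search and localized-retrieval cases above, a query typically combines a vector with payload filters. The sketch below builds a request body for Qdrant's `POST /collections/{name}/points/query` endpoint and collapses chunk-level hits to one result per document. The payload keys (`locale`, `type`, `sanityId`, `slug`, `title`) and the helper names are assumptions for illustration; they need to match whatever your sync code actually writes.

```typescript
type SearchOptions = { locale: string; types?: string[]; limit?: number };

// Build the JSON body for a filtered nearest-neighbour query.
function buildQueryBody(queryVector: number[], opts: SearchOptions) {
  // Always restrict to the user's locale (assumed payload key: "locale").
  const must: object[] = [{ key: "locale", match: { value: opts.locale } }];
  // Optionally restrict to a set of content types (assumed payload key: "type").
  if (opts.types && opts.types.length > 0) {
    must.push({ key: "type", match: { any: opts.types } });
  }
  return {
    query: queryVector,
    filter: { must },
    limit: opts.limit ?? 5,
    with_payload: true,
  };
}

type ScoredPoint = {
  id: string | number;
  score: number;
  payload?: { sanityId?: string; slug?: string; title?: string };
};
type Hit = { sanityId: string; slug?: string; title?: string; score: number };

// Keep the first (best-scoring) hit per Sanity document.
// Assumes points arrive sorted from most to least similar.
function bestPerDocument(points: ScoredPoint[]): Hit[] {
  const seen = new Set<string>();
  const hits: Hit[] = [];
  for (const p of points) {
    const sanityId = p.payload?.sanityId;
    if (!sanityId || seen.has(sanityId)) continue;
    seen.add(sanityId);
    hits.push({ sanityId, slug: p.payload?.slug, title: p.payload?.title, score: p.score });
  }
  return hits;
}
```

The resulting `sanityId` values can be used to fetch current content from the Content Lake when you would rather not render from Qdrant payloads directly.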
Step-by-step integration
1. Set up Qdrant
Create a Qdrant Cloud cluster or run Qdrant locally with Docker. For a cloud cluster, create an API key and note your cluster URL. Then create a collection whose vector size matches your embedding model (for example, 1536 for OpenAI text-embedding-3-small) and choose a distance metric such as cosine.
2. Install the SDKs
In your sync service, install the Qdrant client, Sanity client, and your embedding provider SDK. For a TypeScript project, use npm install @qdrant/js-client-rest @sanity/client openai.
3. Model retrieval-ready content in Sanity Studio
Define schemas with fields that map cleanly to embeddings and filters, such as title, summary, body, slug, locale, product, category, audience, and publishedAt. Keep filter fields typed as strings, references, arrays, or dates so they can be copied into Qdrant payloads.
4. Create the sync trigger
Use a Sanity webhook for publish, update, and delete events, or run the logic in a Sanity Function. The event should include the document ID and operation so your handler can fetch the latest version or remove deleted points from Qdrant.
5. Fetch, embed, and upsert
Use GROQ to fetch only the fields Qdrant needs. Convert Portable Text to plain text, split long documents into chunks if needed, generate embeddings, and call Qdrant upsert with a stable point ID plus payload metadata. Qdrant point IDs must be unsigned integers or UUIDs, so derive them deterministically from the Sanity document ID rather than using that ID as-is.
6. Test the user experience
Search for known phrases, synonyms, and partial questions. Verify that publish events update Qdrant, deleted documents disappear, locale filters work, and result links resolve back to current Sanity content.
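The text preparation in the fetch, embed, and upsert step can be sketched as three small helpers: flatten Portable Text, pack paragraphs into chunks, and derive a stable point ID per chunk. This is one possible approach under stated assumptions, not the only chunking strategy. The 1200-character default, the `sanityId` and `chunkIndex` payload keys, and the hash-based ID scheme are illustrative choices; the ID is a SHA-256 digest formatted as a UUID with the version 8 (custom) marker, not a standard UUIDv5.

```typescript
import { createHash } from "node:crypto";

// Minimal Portable Text shapes; real documents carry more fields and custom block types.
type Span = { _type: string; text?: string };
type Block = { _type: string; children?: Span[] };

// Flatten Portable Text to plain text: one paragraph per text block.
// Non-text blocks (images, embeds) and empty paragraphs are skipped.
function toPlainText(blocks: Block[]): string {
  return blocks
    .filter((b) => b._type === "block" && Array.isArray(b.children))
    .map((b) => (b.children ?? []).map((s) => s.text ?? "").join(""))
    .filter((p) => p.trim().length > 0)
    .join("\n\n");
}

// Greedy paragraph packing: start a new chunk when adding the next paragraph
// would exceed maxChars. A single paragraph longer than maxChars is kept whole.
function chunkText(text: string, maxChars = 1200): string[] {
  const chunks: string[] = [];
  let current = "";
  for (const para of text.split("\n\n")) {
    if (current && current.length + para.length + 2 > maxChars) {
      chunks.push(current);
      current = para;
    } else {
      current = current ? `${current}\n\n${para}` : para;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

// Deterministic, UUID-formatted point ID for a (document, chunk) pair.
// The same inputs always yield the same ID, so re-publishing overwrites points.
function pointIdFor(sanityId: string, chunkIndex: number): string {
  const h = createHash("sha256").update(`${sanityId}:${chunkIndex}`).digest("hex");
  return `${h.slice(0, 8)}-${h.slice(8, 12)}-8${h.slice(13, 16)}-a${h.slice(17, 20)}-${h.slice(20, 32)}`;
}

// Combine the helpers: one record per chunk, ready to embed and upsert.
function chunkPoints(sanityId: string, blocks: Block[], maxChars = 1200) {
  return chunkText(toPlainText(blocks), maxChars).map((text, chunkIndex) => ({
    id: pointIdFor(sanityId, chunkIndex),
    text,
    payload: { sanityId, chunkIndex },
  }));
}
```

Stable IDs make re-publishing idempotent, but a document that shrinks from five chunks to three would leave two stale points behind. One way to handle that is to delete all points for the document by payload filter before upserting the new set.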
Build your Qdrant integration on Sanity
Sanity gives you the structured content foundation, real-time event system, and flexible APIs to keep Qdrant aligned with what your team publishes.
CMS approaches to Qdrant
| Capability | Traditional CMS | Sanity |
|---|---|---|
| Vector-ready content structure | Often stores content as rendered pages or mixed HTML, so indexing usually needs scraping and cleanup. | Structures content as typed JSON in the Content Lake, with schemas that match your domain and retrieval needs. |
| Real-time sync to Qdrant | Usually depends on plugins, cron jobs, or manual exports, which can leave stale vectors in production. | Webhooks and Functions can trigger server-side Qdrant upserts or deletes on publish, update, and delete events. |
| Field-level control for embeddings | Indexing often starts from full rendered pages, which can mix navigation, footers, and unrelated text into vectors. | GROQ fetches only the fields you choose, including referenced data, so embeddings stay focused and payloads stay useful. |
| Metadata filters for retrieval | Filters like locale, audience, product, or publish date may be inconsistent or trapped in templates. | Schema-defined fields can map directly into Qdrant payload filters, such as locale, type, category, region, and audience. |
| Editorial control over AI answers | Editors may not see which content feeds AI retrieval, so fixes happen outside their normal workflow. | Editors work in Sanity Studio while developers sync the same published fields to Qdrant for search, RAG, and agents. |
| Delete and unpublish handling | Removed pages can remain in vector indexes unless a separate cleanup job catches them. | Publish and delete events can carry document IDs into your sync code, where stable Qdrant point IDs make cleanup predictable. |
Keep building
Explore related integrations to complete your content stack.
Sanity + Pinecone
Build managed vector search and RAG pipelines from structured Sanity content.
Sanity + Weaviate
Index Sanity documents in Weaviate for hybrid search, semantic retrieval, and metadata-filtered results.
Sanity + Chroma
Prototype local or developer-focused retrieval workflows using Sanity content and Chroma collections.