How to Integrate Elasticsearch with Your Headless CMS
Add fast, typo-tolerant search by syncing structured content from your headless CMS into Elasticsearch the moment it's published.
What is Elasticsearch?
Elasticsearch is a distributed search and analytics engine built on Apache Lucene. Teams use it to index JSON documents, run full-text search, filter large result sets, rank by relevance, and support features like autocomplete, faceting, and geo search. It's widely used in product catalogs, documentation sites, media archives, internal knowledge bases, and observability systems.
Why integrate Elasticsearch with a headless CMS?
Search starts getting hard when your content lives in one system and your search index lives somewhere else. A documentation site with 2,000 articles, 12 locales, authors, product tags, and versioned pages needs more than a simple SQL LIKE query. Elasticsearch gives you analyzers, mappings, scoring, filters, synonyms, and fast aggregations, but it needs clean documents to index.
A headless CMS integration solves the handoff. Instead of scraping rendered HTML or running a nightly export, you send structured JSON into Elasticsearch whenever content changes. With Sanity, content is already typed in the Content Lake, GROQ can select only the fields your search index needs, and webhooks or Functions can trigger on publish, update, and delete events.
The alternative is usually a fragile sync job. Someone exports CSV files, a script strips tags from body fields, references are missing, and search results lag behind published content by hours. That can work for a small site, but it breaks down when editors expect a newly published product page, support article, or campaign landing page to appear in search within seconds.
Architecture overview
A typical Sanity and Elasticsearch integration has five parts.
First, editors publish structured content in Sanity Studio. That content is written to the Content Lake as typed JSON, such as articles, products, authors, categories, and Portable Text body content.
Second, a GROQ-powered webhook or Sanity Function listens for content mutations, usually publish, update, and delete events on document types you want searchable. The trigger payload includes the document ID, type, and mutation details. For a publish or update, the handler fetches the current document from the Content Lake with GROQ, including referenced fields like author name, category slug, or related product metadata.
Third, the handler transforms the Sanity document into an Elasticsearch document. This is where you flatten Portable Text into searchable text, normalize slugs, remove fields that shouldn't be public, and shape nested data for Elasticsearch mappings. Then it calls the Elasticsearch API through the official SDK, usually client.index(), client.update(), client.delete(), or client.bulk() for larger backfills.
Fourth, Elasticsearch stores the indexed document in an index such as content-search-v1 with mappings for text, keyword, date, nested, and completion fields.
Finally, your frontend calls a search endpoint that queries Elasticsearch with match, multi_match, bool filters, aggregations, highlighting, or suggestions, and returns ranked results to the user.
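The transform step in the third part can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the SanityArticle and SearchDocument shapes, the internalNotes field, and the function names are all hypothetical, and the Portable Text types cover only the fields the sketch reads.

```typescript
// Minimal Portable Text shapes: only the fields this sketch reads.
interface PortableTextSpan { _type: string; text?: string }
interface PortableTextBlock { _type: string; children?: PortableTextSpan[] }

// Hypothetical shape of a document returned by a GROQ projection.
interface SanityArticle {
  _id: string;
  _type: string;
  title: string;
  slug: string;
  locale: string;
  publishedAt: string;
  tags?: string[];
  body?: PortableTextBlock[];
  internalNotes?: string; // example of a field that should not reach the index
}

// Hypothetical shape of the document sent to Elasticsearch.
interface SearchDocument {
  title: string;
  slug: string;
  type: string;
  locale: string;
  publishedAt: string;
  tags: string[];
  bodyText: string;
}

// Join the text of every span in every "block" node; images and other
// custom node types are skipped. Paragraphs are separated by a blank line.
function portableTextToPlainText(blocks: PortableTextBlock[] = []): string {
  return blocks
    .filter((block) => block._type === "block")
    .map((block) => (block.children ?? []).map((span) => span.text ?? "").join(""))
    .filter((paragraph) => paragraph.trim().length > 0)
    .join("\n\n");
}

// Draft documents carry a "drafts." ID prefix; stripping it lets the draft
// and published versions map to the same Elasticsearch _id.
function toPublishedId(id: string): string {
  return id.startsWith("drafts.") ? id.slice("drafts.".length) : id;
}

// Copy only the public, searchable fields; internalNotes is dropped here.
function toSearchDocument(doc: SanityArticle): { id: string; document: SearchDocument } {
  return {
    id: toPublishedId(doc._id),
    document: {
      title: doc.title.trim(),
      slug: doc.slug.trim().toLowerCase(),
      type: doc._type,
      locale: doc.locale,
      publishedAt: doc.publishedAt,
      tags: doc.tags ?? [],
      bodyText: portableTextToPlainText(doc.body),
    },
  };
}
```

Building the output field by field, rather than spreading the source document, means a new private field added to the schema later does not leak into the index by default.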
Common use cases
Documentation search with filters
Index articles, API references, product versions, and tags so users can search docs by keyword, version, language, and topic.
Product catalog search
Sync product content, categories, specs, and merchandising copy into Elasticsearch for faceted search across thousands of SKUs.
Localized site search
Index locale-specific titles, slugs, body text, and metadata so search results stay scoped to the user's language and region.
Autocomplete and suggestions
Use Elasticsearch completion suggesters or search-as-you-type fields for fast query suggestions from Sanity-managed titles and taxonomy terms.
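As a rough sketch of how the filtered and localized use cases translate into a query, the function below builds a search request body that combines a typo-tolerant multi_match with locale and tag filters, a tag facet, and highlighting. The function name, the field names (title, bodyText, locale, tags), and the title boost are assumptions for illustration; the filters and the facet assume locale and tags are mapped as keyword fields.

```typescript
interface SearchParams {
  query: string;
  locale: string;
  tags?: string[];
  size?: number;
}

// Builds an Elasticsearch search request body (illustrative field names).
function buildSearchRequest(params: SearchParams) {
  // Filters narrow the result set without affecting relevance scores.
  const filters: Record<string, unknown>[] = [{ term: { locale: params.locale } }];
  if (params.tags && params.tags.length > 0) {
    filters.push({ terms: { tags: params.tags } });
  }

  return {
    size: params.size ?? 10,
    query: {
      bool: {
        must: [
          {
            multi_match: {
              query: params.query,
              fields: ["title^3", "bodyText"], // boost title matches (arbitrary weight)
              fuzziness: "AUTO", // tolerate small typos
            },
          },
        ],
        filter: filters,
      },
    },
    // Facet counts per tag, for a filter sidebar.
    aggs: { tags: { terms: { field: "tags", size: 20 } } },
    // Return matching snippets from the body for result cards.
    highlight: { fields: { bodyText: {} } },
  };
}
```

With the 8.x JavaScript client, a body like this would typically be spread into client.search() alongside the index name. Autocomplete usually needs a separate request against a completion or search-as-you-type field, which this sketch does not cover.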
Step-by-step integration
- 1
Create an Elasticsearch deployment
Set up Elastic Cloud or a self-managed Elasticsearch cluster. Create an index, define mappings for fields like title, slug, bodyText, type, locale, publishedAt, and tags, then create an API key with index and delete permissions for that index.
- 2
Install the SDKs
In your sync service, Function, or webhook handler, install the official Elasticsearch JavaScript client with npm install @elastic/elasticsearch and the Sanity client with npm install @sanity/client.
- 3
Model searchable content in Sanity Studio
Define schemas for the content you want to index, such as article, product, author, category, and locale fields. Include stable fields like slug, title, excerpt, publish date, and references that search results need to display.
- 4
Create the sync trigger
Use a Sanity webhook for a hosted endpoint or a Sanity Function if you want the server-side sync to run on content events without maintaining separate infrastructure. Filter triggers to only the document types that belong in Elasticsearch.
- 5
Fetch, transform, and index documents
Use GROQ to fetch the full document and join referenced fields in one request. Transform the response into your Elasticsearch index shape, then call client.index() for creates and updates, or client.delete() when a document is unpublished or removed.
- 6
Test search behavior in the frontend
Run searches against Elasticsearch using multi_match, filters, highlighting, and aggregations. Test misspellings, empty results, permissions, locale filters, and publish latency before sending production traffic to the new index.
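For step 1, an index definition covering the fields listed there might look like the sketch below. The index name comes from the architecture overview; the specific field types, the keyword sub-field on title, and the strict dynamic setting are assumptions chosen for illustration, and the unmappedFields helper is hypothetical.

```typescript
const INDEX_NAME = "content-search-v1";

// Illustrative index definition for the fields named in step 1.
const indexDefinition = {
  mappings: {
    // "strict" makes Elasticsearch reject documents with unmapped fields,
    // which surfaces transform mistakes early instead of silently indexing them.
    dynamic: "strict",
    properties: {
      title: { type: "text", fields: { keyword: { type: "keyword" } } },
      slug: { type: "keyword" },
      bodyText: { type: "text" },
      type: { type: "keyword" },
      locale: { type: "keyword" },
      publishedAt: { type: "date" },
      tags: { type: "keyword" },
    },
  },
};

// Hypothetical pre-flight check: list the keys of a document that the
// mapping above does not define, before sending it to Elasticsearch.
function unmappedFields(doc: Record<string, unknown>): string[] {
  const mapped = Object.keys(indexDefinition.mappings.properties);
  return Object.keys(doc).filter((key) => mapped.indexOf(key) === -1);
}
```

With the 8.x JavaScript client, this definition would typically be passed to client.indices.create() together with INDEX_NAME. Putting a version in the index name makes it easier to build a new index with changed mappings and switch over once it has been backfilled.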
How Sanity + Elasticsearch works
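A minimal sketch of the sync handler, under several assumptions: the ContentClient and SearchClient interfaces are hypothetical and only mirror the shape of the calls used here (fetch(query, params) on the Sanity client, and index() and delete() with request objects on the 8.x Elasticsearch client), so the real clients may need a thin wrapper. The GROQ projection assumes tags are references with a title field and that a locale field exists on the document. Here the query flattens the body with GROQ's pt::text() function rather than in handler code; either approach works.

```typescript
// Structural types covering only the client methods this sketch calls.
interface ContentClient {
  fetch(query: string, params: Record<string, unknown>): Promise<unknown>;
}
interface SearchClient {
  index(request: { index: string; id: string; document: Record<string, unknown> }): Promise<unknown>;
  delete(request: { index: string; id: string }): Promise<unknown>;
}

const INDEX = "content-search-v1";

// Fetch one document by ID and project only index-ready fields.
// [0] yields null when no published document has this ID.
const SEARCH_DOCUMENT_QUERY = `*[_id == $id][0]{
  title,
  "slug": slug.current,
  "type": _type,
  locale,
  publishedAt,
  "tags": tags[]->title,
  "bodyText": pt::text(body)
}`;

// Reconcile one document: whatever the event was (publish, update, delete),
// look up the current published state and make the index match it.
async function syncDocument(
  documentId: string,
  content: ContentClient,
  search: SearchClient
): Promise<"indexed" | "deleted"> {
  // Events for drafts are mapped to the published ID, so discarding a draft
  // never removes a published document from the index.
  const id = documentId.replace(/^drafts\./, "");

  const doc = await content.fetch(SEARCH_DOCUMENT_QUERY, { id });

  if (doc === null || typeof doc !== "object") {
    // No published document: remove it from the index.
    // Note: the real client may reject if the ID was never indexed;
    // callers may want to treat "not found" as success.
    await search.delete({ index: INDEX, id });
    return "deleted";
  }

  // index() with an explicit id creates or fully replaces the document.
  await search.index({ index: INDEX, id, document: doc as Record<string, unknown> });
  return "indexed";
}
```

Re-fetching the current state instead of trusting the event payload keeps the handler idempotent, so a retried or out-of-order webhook delivery converges on the same result. For this to hold, the fetch should bypass any CDN cache so it sees the latest published content.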
Build your Elasticsearch integration on Sanity
Sanity gives you the structured content foundation, real-time event system, and flexible APIs to keep Elasticsearch indexed from the same source that powers your sites, apps, and AI agents.
CMS approaches to Elasticsearch
| Capability | Traditional CMS | Sanity |
|---|---|---|
| Structured data for indexing | | The Content Lake stores typed JSON, and GROQ can return one index-ready document with referenced fields included. |
| Real-time sync on publish | | Webhooks can notify your endpoint, and Functions can run sync logic on content events without separate infrastructure. |
| Field-level query control | | GROQ filters, joins, and projections let you send Elasticsearch only the fields needed for ranking, filters, and result cards. |
| Delete and unpublish handling | | Mutation events can trigger client.delete() in Elasticsearch, and draft IDs can be normalized before indexing. |
| Search result context | | Search documents can include the exact display context users need, such as title, excerpt, slug, author, category, tags, and locale. |
| Operational trade-offs | | You still need to design Elasticsearch mappings and ranking rules, but Sanity reduces the work of getting clean, current content into the index. |
Keep building
Explore related integrations to complete your content stack.
Sanity + Algolia
Build hosted site search with fast indexing, typo tolerance, facets, and relevance controls for content from Sanity.
Sanity + Typesense
Connect Sanity content to an open-source search engine built for instant search, filtering, and typo-tolerant results.
Sanity + Meilisearch
Sync structured Sanity content into Meilisearch for fast search experiences with simple setup and clear ranking rules.