Search · 8 min read

How to Integrate Elasticsearch with Your Headless CMS

Add fast, typo-tolerant search by syncing structured content from your headless CMS into Elasticsearch the moment it's published.

Published April 29, 2026
01 - Overview

What is Elasticsearch?

Elasticsearch is a distributed search and analytics engine built on Apache Lucene. Teams use it to index JSON documents, run full-text search, filter large result sets, rank by relevance, and support features like autocomplete, faceting, and geo search. It's widely used in product catalogs, documentation sites, media archives, internal knowledge bases, and observability systems.


02 - The case for integration

Why integrate Elasticsearch with a headless CMS?

Search starts getting hard when your content lives in one system and your search index lives somewhere else. A documentation site with 2,000 articles, 12 locales, authors, product tags, and versioned pages needs more than a simple SQL LIKE query. Elasticsearch gives you analyzers, mappings, scoring, filters, synonyms, and fast aggregations, but it needs clean documents to index.

A headless CMS integration solves the handoff. Instead of scraping rendered HTML or running a nightly export, you send structured JSON into Elasticsearch whenever content changes. With Sanity, content is already typed in the Content Lake, GROQ can select only the fields your search index needs, and webhooks or Functions can trigger on publish, update, and delete events.
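
To make the field selection concrete, here is a minimal sketch of a GROQ projection held in a TypeScript constant. The document type `article` and the field names (`excerpt`, `author`, `categories`, `locale`, `body`) are assumptions for illustration; your own schema will differ.

```typescript
// Hypothetical GROQ query: fetch one article by ID and project only the
// fields a search index needs. Type and field names are assumed.
const articleForIndexQuery = `
  *[_type == "article" && _id == $id][0]{
    _id,
    _type,
    title,
    excerpt,
    "slug": slug.current,
    "author": author->name,
    "categories": categories[]->slug.current,
    locale,
    publishedAt,
    body
  }
`;

// Parameters are passed separately so the ID is never concatenated
// into the query string.
function articleQueryParams(documentId: string): { id: string } {
  return { id: documentId };
}
```

With `@sanity/client`, a query like this would typically be run as `client.fetch(articleForIndexQuery, articleQueryParams(id))`. The `->` operator follows references, which is how the author name and category slugs end up inside the same result.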

The alternative is usually a fragile sync job. Someone exports CSV files, a script strips tags from body fields, references are missing, and search results lag behind published content by hours. That can work for a small site, but it breaks down when editors expect a newly published product page, support article, or campaign landing page to appear in search within seconds.


03 - Architecture

Architecture overview

A typical Sanity and Elasticsearch integration has five parts.

First, editors publish structured content in Sanity Studio. That content is written to the Content Lake as typed JSON, such as articles, products, authors, categories, and Portable Text body content.

Second, a GROQ-powered webhook or Sanity Function listens for content mutations, usually publish, update, and delete events on document types you want searchable. The trigger payload includes the document ID, type, and mutation details. For a publish or update, the handler fetches the current document from the Content Lake with GROQ, including referenced fields like author name, category slug, or related product metadata.

Third, the handler transforms the Sanity document into an Elasticsearch document. This is where you flatten Portable Text into searchable text, normalize slugs, remove fields that shouldn't be public, and shape nested data for Elasticsearch mappings. Then it calls the Elasticsearch API through the official SDK, usually client.index(), client.update(), client.delete(), or client.bulk() for larger backfills.

Fourth, Elasticsearch stores the indexed document in an index such as content-search-v1 with mappings for text, keyword, date, nested, and completion fields.

Finally, your frontend calls a search endpoint that queries Elasticsearch with match, multi_match, bool filters, aggregations, highlighting, or suggestions, and returns ranked results to the user.
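
The transform step is the part most specific to your content, so a small sketch may help. It assumes the fetched document has roughly the shape below (`SanityArticle` and `SearchDocument` are hypothetical types, not Sanity or Elasticsearch APIs) and flattens Portable Text by joining the text of each block's spans.

```typescript
// Minimal shapes for the parts of Portable Text this sketch reads.
interface PortableTextSpan { _type: string; text?: string }
interface PortableTextBlock { _type: string; children?: PortableTextSpan[] }

// Assumed shape of the document returned by the GROQ fetch.
interface SanityArticle {
  _id: string;
  _type: string;
  title: string;
  slug: string;
  locale?: string;
  publishedAt?: string;
  body?: PortableTextBlock[];
}

// Assumed shape of the document sent to Elasticsearch.
interface SearchDocument {
  title: string;
  slug: string;
  type: string;
  locale: string | null;
  publishedAt: string | null;
  bodyText: string;
}

// Join the spans of each text block; skip non-text blocks such as images.
function portableTextToPlainText(blocks: PortableTextBlock[] = []): string {
  return blocks
    .filter((block) => block._type === "block" && Array.isArray(block.children))
    .map((block) => (block.children ?? []).map((span) => span.text ?? "").join(""))
    .join("\n\n");
}

// Copy only the fields that should be searchable or displayed.
// Anything not listed here (internal notes, private fields) is dropped.
function toSearchDocument(doc: SanityArticle): SearchDocument {
  return {
    title: doc.title,
    // Normalize slugs: trim, lowercase, strip leading and trailing slashes.
    slug: doc.slug.trim().toLowerCase().replace(/^\/+|\/+$/g, ""),
    type: doc._type,
    locale: doc.locale ?? null,
    publishedAt: doc.publishedAt ?? null,
    bodyText: portableTextToPlainText(doc.body),
  };
}
```

Building the output from an explicit list of fields, rather than deleting unwanted ones from the input, means a newly added private field is left out of the index by default.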


04 - Use cases

Common use cases

🔎

Documentation search with filters

Index articles, API references, product versions, and tags so users can search docs by keyword, version, language, and topic.

🛍️

Product catalog search

Sync product content, categories, specs, and merchandising copy into Elasticsearch for faceted search across thousands of SKUs.

🌐

Localized site search

Index locale-specific titles, slugs, body text, and metadata so search results stay scoped to the user's language and region.

💡

Autocomplete and suggestions

Use Elasticsearch completion suggesters or search-as-you-type fields for fast query suggestions from Sanity-managed titles and taxonomy terms.
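
As one illustration of the autocomplete use case, the sketch below shows a mapping fragment with a `completion` field and a helper that builds the request body for a completion suggester. The field name `titleSuggest` and the helper `buildSuggestBody` are hypothetical; the body is plain JSON that would be sent to the `_search` endpoint.

```typescript
// Mapping fragment: a `completion` field alongside the regular title field.
// The field name `titleSuggest` is an assumption for this sketch.
const suggestMappingFragment = {
  properties: {
    title: { type: "text" },
    titleSuggest: { type: "completion" },
  },
};

// Build the body of a search request that asks the completion suggester
// for up to `size` entries starting with what the user has typed so far.
function buildSuggestBody(prefix: string, size = 5) {
  return {
    // Return only what a suggestion dropdown needs to render.
    _source: ["title", "slug"],
    suggest: {
      titleSuggest: {
        prefix: prefix.trim(),
        completion: {
          field: "titleSuggest",
          size,
          skip_duplicates: true,
          // Tolerate small typos in the typed prefix.
          fuzzy: { fuzziness: "AUTO" },
        },
      },
    },
  };
}
```

For this to return anything, the indexing step would also need to populate `titleSuggest`, for example with the document title and its taxonomy terms.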


05 - Implementation

Step-by-step integration

  1. Create an Elasticsearch deployment

    Set up Elastic Cloud or a self-managed Elasticsearch cluster. Create an index, define mappings for fields like title, slug, bodyText, type, locale, publishedAt, and tags, then create an API key with index and delete permissions for that index.

  2. Install the SDKs

    In your sync service, Function, or webhook handler, install the official Elasticsearch JavaScript client with npm install @elastic/elasticsearch and the Sanity client with npm install @sanity/client.

  3. Model searchable content in Sanity Studio

    Define schemas for the content you want to index, such as article, product, author, category, and locale fields. Include stable fields like slug, title, excerpt, publish date, and references that search results need to display.

  4. Create the sync trigger

    Use a Sanity webhook if you host your own endpoint, or a Sanity Function if you want the server-side sync to run on content events without maintaining separate infrastructure. Filter triggers to only the document types that belong in Elasticsearch.

  5. Fetch, transform, and index documents

    Use GROQ to fetch the full document and join referenced fields in one request. Transform the response into your Elasticsearch index shape, then call client.index() for creates and updates, or client.delete() when a document is unpublished or removed.

  6. Test search behavior in the frontend

    Run searches against Elasticsearch using multi_match, filters, highlighting, and aggregations. Test misspellings, empty results, permissions, locale filters, and publish latency before sending production traffic to the new index.
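
To tie the first and last steps together, here is a hedged sketch of index mappings for the fields named in step 1 and a helper that builds a search request body of the kind described in step 6. The field types chosen, the `title^3` boost, and the helper name `buildSearchBody` are assumptions to tune for your own content, not recommendations from Elasticsearch or Sanity.

```typescript
// Mappings for the fields listed in step 1. With the 8.x JavaScript client
// this object would be passed as `mappings` to client.indices.create().
const contentSearchMappings = {
  properties: {
    title: { type: "text" },
    slug: { type: "keyword" },
    bodyText: { type: "text" },
    type: { type: "keyword" },
    locale: { type: "keyword" },
    publishedAt: { type: "date" },
    tags: { type: "keyword" },
  },
};

interface SearchOptions {
  text: string;
  locale?: string;
  tags?: string[];
  size?: number;
}

// Build a request body: fuzzy full-text match, exact filters,
// highlighted body snippets, and a tag facet.
function buildSearchBody(options: SearchOptions) {
  const filter: Array<Record<string, unknown>> = [];
  if (options.locale) filter.push({ term: { locale: options.locale } });
  if (options.tags && options.tags.length > 0) {
    filter.push({ terms: { tags: options.tags } });
  }

  return {
    size: options.size ?? 10,
    query: {
      bool: {
        must: [
          {
            multi_match: {
              query: options.text,
              // Assumed boost: title matches count three times as much.
              fields: ["title^3", "bodyText"],
              // Lets a misspelled query still match indexed terms.
              fuzziness: "AUTO",
            },
          },
        ],
        // Filters narrow results without affecting relevance scores.
        filter,
      },
    },
    highlight: { fields: { bodyText: {} } },
    aggs: { tags: { terms: { field: "tags", size: 20 } } },
  };
}
```

Keeping `locale` and `tags` as `keyword` fields is what makes the exact-match filters and the tag aggregation work; a `text` field would be analyzed into tokens instead.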



07 - Why Sanity

How Sanity + Elasticsearch works

Build your Elasticsearch integration on Sanity

Sanity gives you the structured content foundation, real-time event system, and flexible APIs to keep Elasticsearch indexed from the same source that powers your sites, apps, and AI agents.

Start building free →

08 - Comparison

CMS approaches to Elasticsearch

Capability | Traditional CMS | Sanity
Structured data for indexing | | The Content Lake stores typed JSON, and GROQ can return one index-ready document with referenced fields included.
Real-time sync on publish | | Webhooks can notify your endpoint, and Functions can run sync logic on content events without separate infrastructure.
Field-level query control | | GROQ filters, joins, and projections let you send Elasticsearch only the fields needed for ranking, filters, and result cards.
Delete and unpublish handling | | Mutation events can trigger client.delete() in Elasticsearch, and draft IDs can be normalized before indexing.
Search result context | | Search documents can include the exact display context users need, such as title, excerpt, slug, author, category, tags, and locale.
Operational trade-offs | | You still need to design Elasticsearch mappings and ranking rules, but Sanity reduces the work of getting clean, current content into the index.
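
The delete and draft-ID handling in the table can be sketched as a small dispatch function. Everything here is illustrative: `SyncEvent` is a hypothetical event shape rather than the actual Sanity webhook payload, and `SearchIndexClient` is a stand-in interface whose request shapes follow the 8.x Elasticsearch JavaScript client (`document` rather than the older `body`). The sketch also assumes the trigger filter has already decided which documents are searchable.

```typescript
// Stand-in for the two client calls used here, so the sketch does not
// depend on a specific SDK version. A thin wrapper may be needed in practice.
interface SearchIndexClient {
  index(request: { index: string; id: string; document: Record<string, unknown> }): Promise<unknown>;
  delete(request: { index: string; id: string }): Promise<unknown>;
}

// Hypothetical event shape produced by your webhook or Function handler.
type SyncEvent =
  | { operation: "publish" | "update"; documentId: string; document: Record<string, unknown> }
  | { operation: "delete"; documentId: string };

const DRAFT_PREFIX = "drafts.";

function isDraftId(id: string): boolean {
  return id.startsWith(DRAFT_PREFIX);
}

// Strip the draft prefix so the index uses one ID per piece of content.
function normalizeDocumentId(id: string): string {
  return isDraftId(id) ? id.slice(DRAFT_PREFIX.length) : id;
}

async function syncToElasticsearch(
  client: SearchIndexClient,
  indexName: string,
  event: SyncEvent,
): Promise<"indexed" | "deleted" | "skipped"> {
  if (event.operation === "delete") {
    // Discarding a draft does not unpublish the content, so the index
    // entry for the published document is left alone.
    if (isDraftId(event.documentId)) return "skipped";
    await client.delete({ index: indexName, id: event.documentId });
    return "deleted";
  }
  // Indexing with the same ID overwrites the previous version.
  await client.index({
    index: indexName,
    id: normalizeDocumentId(event.documentId),
    document: event.document,
  });
  return "indexed";
}
```

Passing the client in as a parameter keeps the dispatch logic testable with a fake object before it is pointed at a real cluster.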

09 - Next steps

Keep building

Explore related integrations to complete your content stack.

Ready to try Sanity?

See how Sanity's Content Operating System powers integrations with Elasticsearch and 200+ other tools.