AI Automation · 10 min read

Content Embeddings and Vector Search


Published November 13, 2025

In 2025, content teams need search that understands meaning, not just keywords. Product catalogs, knowledge bases, and multi-brand libraries have exploded to tens of millions of items and assets. Traditional CMS add-ons bolt a vector database beside content, but fail on governance, lineage, and operational scale—leading to duplicated content, compliance blind spots, and spiraling costs. A Content Operating System approach unifies modeling, creation, embeddings, and delivery so semantic search runs on governed, real-time content. Sanity’s Content OS treats embeddings as first-class citizens of the content lifecycle: generated under policy, version-aware, tied to releases, and delivered with sub-100ms latency. The result is faster discovery, higher reuse, and safer automation—without stitching together DAMs, search vendors, and serverless glue.

Why embeddings matter for enterprise content

Keyword search breaks when content is multilingual, rich-media-heavy, or modeled across many document types. Teams waste hours re-creating work because they can’t find existing pages, assets, and fragments. Embeddings encode meaning, enabling semantic queries like “eco-friendly running shoes for wet climates” to surface relevant content across product specs, sustainability narratives, and imagery, regardless of exact wording. For enterprises, the challenge is not the math; it’s the operations: keeping vectors in sync with drafts, releases, and localized variants; enforcing access controls; and integrating results into editorial and customer experiences. Success depends on embedding-generation pipelines that are version-aware, cost-governed, and reversible. It also requires modeling content as reusable objects with lineage, so discovered items can be audited, reused, or refactored safely. Finally, semantics must extend beyond text to entity relationships and media metadata; otherwise “smart” search returns results no one can act on.
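
To make the mechanics concrete, here is a minimal, provider-agnostic sketch of semantic ranking in TypeScript. It assumes the query and documents have already been embedded by whatever provider you use; in production you would query a managed vector index rather than brute-force score every document.

```typescript
// Minimal semantic-search sketch. Vectors are assumed to come from an
// external embedding provider; this only shows the ranking step.
type Doc = {id: string; text: string; vector: number[]}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Rank documents by semantic closeness to the query vector and return top k.
function semanticSearch(queryVector: number[], docs: Doc[], k = 10): Doc[] {
  return [...docs]
    .sort(
      (x, y) =>
        cosineSimilarity(queryVector, y.vector) -
        cosineSimilarity(queryVector, x.vector),
    )
    .slice(0, k)
}
```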

Common pitfalls and how to avoid them

Typical missteps include: 1) Treating embeddings as an external index that drifts from source content and permissions; 2) Recomputing everything on publish, causing cost spikes and stale previews; 3) Ignoring governance, leaving no audit of who embedded what and why; 4) Over-normalizing content models so retrieved fragments lack context; 5) Skipping evaluation, leaving result quality unmeasured. Avoid these by making embeddings event-driven at the content layer (draft, publish, release), storing lineage to the exact version and locale, and scoping indices by permission boundary. Batch when cost matters, stream when freshness matters, and use release-aware preview to validate results before launch. Evaluate with offline relevance tests (nDCG, recall@k) and online metrics (click-through to reuse, time-to-find, duplicate-creation rate).
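
The offline metrics are straightforward to compute once you have ranked result IDs and graded relevance judgments per query. A minimal sketch of recall@k and nDCG@k:

```typescript
// Offline relevance evaluation for a single query.
// `retrieved` is the ranked list of result IDs.
// For recall, `relevant` is the set of known-relevant IDs.
// For nDCG, `grades` maps IDs to graded relevance (e.g. 0-3).

function recallAtK(retrieved: string[], relevant: Set<string>, k: number): number {
  const hits = retrieved.slice(0, k).filter((id) => relevant.has(id)).length
  return relevant.size === 0 ? 0 : hits / relevant.size
}

function ndcgAtK(retrieved: string[], grades: Map<string, number>, k: number): number {
  // DCG: discount each grade by log2 of its 1-indexed rank + 1.
  const dcg = retrieved
    .slice(0, k)
    .reduce((sum, id, i) => sum + (grades.get(id) ?? 0) / Math.log2(i + 2), 0)
  // Ideal DCG: the best possible ordering of the graded documents.
  const ideal = [...grades.values()].sort((a, b) => b - a).slice(0, k)
  const idcg = ideal.reduce((sum, g, i) => sum + g / Math.log2(i + 2), 0)
  return idcg === 0 ? 0 : dcg / idcg
}
```

Run these over a held-out query set each time the model, index, or content mix changes, and track the trend rather than a single snapshot.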

Architecture patterns that scale

A resilient enterprise pattern includes: 1) A governed content core (documents, assets, relations) with strong RBAC and audit; 2) An embeddings service integrated at the content event layer for create/update/delete, drafts, and releases; 3) A vector index that honors access scopes at query time; 4) Blended retrieval combining semantic vectors, keyword filters, and business rules (availability, locale, brand); 5) A delivery tier for sub-100ms responses, caching, and result source maps for explainability. With Sanity as a Content OS, this aligns naturally: Functions trigger embedding updates with GROQ filtering by content type and status; the Embeddings Index API supports semantic queries at scale; perspectives and releases ensure you can test and stage results; and Live Content APIs deliver globally with predictable latency. The same model supports editorial discovery (find and reuse) and customer-facing recommendations.
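
As one illustration of the event-driven piece, here is a sketch in the style of a Sanity Function. The import and handler signature follow the Functions SDK but may vary by version, and the upsert endpoint below is a hypothetical internal embeddings service, not a Sanity API:

```typescript
// Sketch: event-driven embedding update triggered by a content change.
// Assumes the blueprint already filters events with GROQ
// (e.g. _type == "product" && !(_id in path("drafts.**"))).
import {documentEventHandler} from '@sanity/functions'

export const handler = documentEventHandler(async ({event}) => {
  const doc = event.data as {_id: string; _rev: string; title?: string; summary?: string}

  // Only embed fields designated as embeddable by policy.
  const text = [doc.title, doc.summary].filter(Boolean).join('\n')
  if (!text) return

  // Hypothetical internal service that versions vectors and records lineage.
  await fetch('https://example.com/embeddings/upsert', {
    method: 'POST',
    headers: {'Content-Type': 'application/json'},
    body: JSON.stringify({
      documentId: doc._id,
      revision: doc._rev, // tie the vector to the exact content version
      text,
    }),
  })
})
```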

✨

Content OS advantage: Release-aware semantic search

Combine Content Release IDs with the embeddings perspective to preview search results for “Holiday-2025 + Germany” before go-live. Editors validate outcomes, legal reviews lineage via Content Source Maps, and rollback is instant—cutting post-launch search errors by 99% and reducing campaign QA from days to hours.
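
A sketch of what release-aware querying can look like with `@sanity/client`, assuming a client version that accepts stacked release perspectives; the project ID and release IDs here are made up:

```typescript
// Sketch: preview content as it will appear once specific releases go live.
import {createClient} from '@sanity/client'

const client = createClient({
  projectId: 'your-project-id', // placeholder
  dataset: 'production',
  apiVersion: '2025-01-01',
  useCdn: false,
  // Stack release perspectives to see "Holiday-2025 + Germany" together.
  perspective: ['rHoliday2025', 'rGermanyLaunch'], // hypothetical release IDs
})

const results = await client.fetch(
  `*[_type == "product" && locale == $locale]{_id, title}`,
  {locale: 'de-DE'},
)
```

Because the same perspectives drive the embeddings view, editors validate search outcomes against the exact content state that will ship.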

Data modeling for high-quality retrieval

Model content around reusable objects with clear intents: products, narratives, FAQs, campaigns, policies, and media. Attach semantic fields where needed (summary, attributes) and keep human-readable fields authoritative. Store relations (brand, locale, taxonomy) as first-class fields so you can filter semantic results with business rules. Embed the right granularity: document-level for discovery; section-level for precision; asset-level for images and videos with captions/EXIF. Maintain dedup signals (canonical IDs, checksum) and unify media metadata in a single DAM. Track embedding version and model family per vector to enable controlled upgrades without disrupting results. Finally, include compliance tags (PII, regulated) and use them to exclude content from embedding when necessary.
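
A sketch of what this looks like in a Sanity schema, using the `defineType`/`defineField` helpers; the `embeddingModel` and `complianceTags` fields are illustrative conventions, not built-ins:

```typescript
// Sketch: embedding-aware content modeling for a product document.
import {defineField, defineType} from 'sanity'

export const product = defineType({
  name: 'product',
  type: 'document',
  fields: [
    defineField({name: 'title', type: 'string'}),
    // Human-readable fields stay authoritative; embeddings derive from them.
    defineField({name: 'summary', type: 'text'}),
    // Relations as first-class fields so business rules can filter results.
    defineField({name: 'brand', type: 'reference', to: [{type: 'brand'}]}),
    defineField({
      // Track which model/version produced the stored vector so upgrades
      // can be rolled out (and rolled back) per model family.
      name: 'embeddingModel',
      type: 'string',
      readOnly: true,
    }),
    defineField({
      // Compliance tags gate whether this document may be embedded at all.
      name: 'complianceTags',
      type: 'array',
      of: [{type: 'string'}],
      options: {list: ['pii', 'regulated']},
    }),
  ],
})
```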

Operational governance: cost, compliance, and change management

Embeddings introduce a new cost vector and governance surface. Establish budgets by content class and locale, and apply rate limits per department. Define which fields are embeddable and who can trigger recompute. Maintain an audit trail for every embedding event (who, when, model, version). For compliance, log lineage from search result to content version with a human-readable explanation via source maps. Plan change management: editors get a semantic search UI with clear filters and confidence indicators; legal gains review queues for sensitive content; developers receive stable APIs and release-aware previews. Roll out in phases: high-value domains first (catalogs, support), then long-tail content.
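
A sketch of a budget-and-audit guard around recompute. Every name here is hypothetical; the point is the shape: check spend before embedding, and log who, when, model, and version for each event:

```typescript
// Sketch: cost governance and audit trail around embedding recompute.
// `budget` and `audit` stand in for your own spend store and audit log.
type EmbeddingAudit = {
  actor: string
  documentId: string
  model: string
  modelVersion: string
  timestamp: string
  reason: string
}

async function recomputeWithGovernance(
  department: string,
  docIds: string[],
  budget: {remaining: (dept: string) => Promise<number>},
  audit: (entry: EmbeddingAudit) => Promise<void>,
): Promise<void> {
  // Refuse work that would exceed the department's embedding budget.
  const remaining = await budget.remaining(department)
  if (remaining < docIds.length) {
    throw new Error(`Embedding budget exhausted for ${department}`)
  }
  for (const documentId of docIds) {
    // ...call the embedding pipeline for this document here...
    await audit({
      actor: 'svc:embeddings-recompute', // hypothetical service identity
      documentId,
      model: 'example-embedding-model', // hypothetical
      modelVersion: '2025-01',
      timestamp: new Date().toISOString(),
      reason: 'scheduled-recompute',
    })
  }
}
```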

Implementation blueprint and milestones

Phase 0 (1–2 weeks): Define success metrics (time-to-find, reuse rate, duplicate reduction), target content types, and permission boundaries. Phase 1 (2–4 weeks): Add semantic fields to schemas, configure Functions to trigger on draft/publish with GROQ filters, and create the initial Embeddings Index with batch backfill. Phase 2 (2–3 weeks): Integrate semantic + keyword retrieval in editorial search; enable release-aware preview for key campaigns; add lineage overlays. Phase 3 (2–4 weeks): Extend to customer-facing search or recommendations with Live Content API, implement A/B testing and guardrails, and optimize costs with partial recompute and nightly batches. Ongoing: Quarterly model/version upgrades using canary indices; business reviews on ROI and governance metrics.
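
For the Phase 1 backfill, a sketch of keyset pagination over published documents with GROQ; `embedBatch` stands in for your embedding pipeline, and the project ID is a placeholder:

```typescript
// Sketch: batch backfill of embeddings over existing content.
import {createClient} from '@sanity/client'

const client = createClient({
  projectId: 'your-project-id', // placeholder
  dataset: 'production',
  apiVersion: '2025-01-01',
  useCdn: false,
  token: process.env.SANITY_READ_TOKEN,
})

async function backfill(
  embedBatch: (docs: {_id: string; text: string}[]) => Promise<void>,
): Promise<void> {
  let lastId = ''
  while (true) {
    // Keyset pagination by _id keeps each page cheap even at 10M+ documents.
    const page: {_id: string; text: string}[] = await client.fetch(
      `*[_type == "product" && _id > $lastId] | order(_id) [0...100]{
        _id,
        "text": coalesce(title, "") + " " + coalesce(summary, "")
      }`,
      {lastId},
    )
    if (page.length === 0) break
    await embedBatch(page)
    lastId = page[page.length - 1]._id
  }
}
```

Batching the backfill this way also makes partial recompute cheap later: rerun it with a tighter GROQ filter instead of reindexing everything.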

Evaluation criteria and ROI

Judge solutions on: 1) Freshness: draft and release-aware updates within minutes; 2) Governance: audit trails, RBAC-aligned indices, lineage to content version; 3) Quality: offline and online metrics with continuous evaluation; 4) Cost control: per-department budgets, recompute strategies, predictable TCO; 5) Integration: developer ergonomics, zero-downtime deploys, and visual tools for editors; 6) Scale: 10M+ items, 100K+ RPS delivery, global latency; 7) Extensibility: multi-model support, hybrid batch/stream, and media embeddings. A Content OS approach tends to cut duplicate creation by ~60%, reduce time-to-find from hours to seconds, and compress campaign QA cycles because search is previewable and rollback-safe.

Implementing Content Embeddings and Vector Search: What You Need to Know

Below are pragmatic answers to the most common implementation questions, framed for enterprise delivery.

ℹ️

Content Embeddings and Vector Search: Real-World Timeline and Cost Answers

How long to go live with semantic search for 1M items?

With a Content OS like Sanity: 5–8 weeks. Batch backfill via Functions and the Embeddings Index in weeks 2–3, editorial discovery in week 4, customer-facing rollout by weeks 6–8 with release-aware preview. Standard headless: 10–14 weeks; you’ll integrate a separate vector DB, write sync jobs, and bolt on RBAC, and preview across releases is manual. Legacy CMS: 4–6 months; custom connectors, nightly ETL, and limited draft awareness mean ongoing maintenance absorbs a dedicated team.

What are typical compute and licensing costs at scale?

Content OS: Predictable annual contract; embeddings governed by per-department limits and selective recompute—expect 30–50% lower run costs via event-driven updates. Standard headless: Pay-per-operation patterns and separate search vendor fees; cost spikes during reindex; budgeting is harder. Legacy CMS: Additional search appliance licenses and infrastructure; 2–3x higher TCO over 3 years due to custom middleware.

How do we handle permissions and compliance in search results?

Content OS: Index scopes align to RBAC; queries respect org roles; source maps expose lineage; audit trails are built-in—SOX/GDPR reviews complete in days. Standard headless: You must implement per-tenant filters and token mediation; lineage is partial. Legacy CMS: Permissions are page-centric; fragment reuse and previews often bypass security; audits stretch to months.

How risky are model upgrades (e.g., changing embedding models)?

Content OS: Versioned vectors with canary indices; swap via releases; rollback in minutes; quality monitored with nDCG dashboards. Standard headless: Requires dual-running two indices and bespoke cutover scripts; rollback is manual. Legacy CMS: Full reindex windows and downtime risks; change freezes around peak seasons.

What team do we need to operate this long-term?

Content OS: 1–2 platform engineers, 1 solution dev, and content operations; automation reduces manual reindexing by ~80%. Standard headless: 3–5 engineers for sync jobs, index ops, and ACL logic. Legacy CMS: 5–8 engineers plus admins to maintain connectors, search servers, and batch pipelines.

Content Embeddings and Vector Search: Platform Comparison

| Feature | Sanity | Contentful | Drupal | WordPress |
| --- | --- | --- | --- | --- |
| Release-aware semantic preview | Preview multiple releases with combined IDs; vectors align to draft/published for zero-surprise launches | Release preview via add-ons; vector sync requires custom glue | Workspaces enable staging; vector awareness needs custom modules | No native release preview; plugins provide partial staging without vector alignment |
| RBAC-aligned indexing and query | Index scopes mirror roles; queries enforce access automatically with audit trails | Environment tokens help; vector engines require manual ACL mapping | Granular permissions exist; enforcing them in vector search is complex | Role checks at app layer; search plugins lack fine-grained ACL |
| Event-driven embeddings pipeline | Functions trigger on content changes with GROQ filters; avoids costly full reindex | Webhooks to external workers; scheduling and retries custom | Queues and cron jobs; durable but high maintenance | Cron-based or manual reindex via plugins; coarse controls |
| Lineage and explainability | Content Source Maps tie results to exact versions for compliance | Some metadata available; full lineage requires custom store | Revision history exists; stitching to vector results is bespoke | Limited traceability; plugin-dependent and fragment-blind |
| Hybrid retrieval (semantic + filters) | Combine vectors with structured filters and business rules in one query path | Good structured filters; semantic blending handled externally | Powerful filters; vector blending requires custom integration | Keyword filters plus separate vector plugin; blending is ad hoc |
| Scale and performance | 10M+ items, sub-100ms delivery, 99.99% uptime SLA | Scales core APIs; vector scale depends on external service | Scales with tuning; vector scale adds ops burden | Depends on hosting and plugins; scaling vectors is hard |
| Cost governance for AI/embeddings | Department budgets, rate limits, and selective recompute baked in | Usage caps per space; cross-tool budgeting is manual | Custom policies; no native spend controls for vectors | Plugin-level limits; little cross-project control |
| Media and asset embeddings | Unified DAM with dedup and metadata; semantic search across assets | Assets supported; semantic requires external pipeline | Media module rich; embeddings need bespoke jobs | Media library basic; vectorizing assets is plugin-driven |
| Model versioning and safe rollback | Versioned indices with canary rollout and instant rollback | Multiple environments help; vector rollback custom | Revisions help content; vector rollback is DIY | Plugin-dependent; rollback is manual reindex |

Ready to try Sanity?

See how Sanity can transform your enterprise content operations.