How to Integrate Redshift with Your Headless CMS
Send structured content changes into Redshift so analysts can join publish data, campaign metadata, and product copy with revenue, traffic, and customer events.
What is Redshift?
Amazon Redshift is AWS's cloud data warehouse for SQL analytics across structured and semi-structured data. Data teams use it to run reporting, BI dashboards, customer analysis, and data science workflows on large datasets, often alongside tools like QuickSight, dbt, Looker, Tableau, and SageMaker.
Why integrate Redshift with a headless CMS?
Your analytics warehouse already knows what happened: page views, conversions, signups, purchases, support tickets, and retention. What it often doesn't know is what content caused it. Without content metadata in Redshift, analysts end up asking for CSV exports, copying campaign names from spreadsheets, or guessing which article version was live when a conversion spike happened.
Architecture overview
A typical Sanity to Redshift integration starts when an editor publishes, updates, or deletes a document in Sanity Studio. A webhook filtered by GROQ, or a Sanity Function triggered by the mutation, receives the event and extracts the document ID.

The sync code then uses @sanity/client to fetch the current document from the Content Lake with a GROQ projection, including referenced fields such as author name, category slug, locale, campaign ID, and publish date.

Next, the sync code calls Amazon Redshift through the Redshift Data API using the AWS SDK, usually @aws-sdk/client-redshift-data in TypeScript. For Redshift Serverless, the call includes WorkgroupName, Database, SecretArn, and a parameterized SQL statement; for provisioned clusters, it uses ClusterIdentifier instead of WorkgroupName. The SQL writes into an analytics table such as content_documents, often with MERGE for upserts, or an INSERT into a staging table followed by a dbt model.

Once content metadata is in Redshift, BI tools, reverse ETL jobs, data apps, and internal dashboards can join it with event tables, revenue tables, search logs, and experimentation data.
Common use cases
Content performance reporting
Join Sanity document metadata with page views, conversions, and revenue events in Redshift to compare performance by topic, author, locale, and publish date.
Experiment and variant analysis
Send A/B test labels, audience segments, and content variant IDs into Redshift so analysts can measure which copy or layout performed best.
Product content and revenue analysis
Connect product descriptions, category copy, buying guides, and merchandising fields from Sanity with orders, returns, and margin data in Redshift.
Editorial operations dashboards
Track publishing volume, review cycles, authorship, campaign coverage, and localization status next to downstream engagement metrics.
Step-by-step integration
1. Set up Redshift access
Create a Redshift Serverless workgroup or provisioned cluster, create a database such as analytics, and enable access through the Redshift Data API. Store database credentials in AWS Secrets Manager, grant your IAM user or role permissions for redshift-data:ExecuteStatement and secretsmanager:GetSecretValue, and install the SDK with npm install @aws-sdk/client-redshift-data @sanity/client.
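The IAM permissions above can be sketched as a policy document. The account ID, region, and secret name below are placeholders, and production policies should scope resources more tightly than this minimal example:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "redshift-data:ExecuteStatement",
        "redshift-data:DescribeStatement",
        "redshift-data:GetStatementResult"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": "secretsmanager:GetSecretValue",
      "Resource": "arn:aws:secretsmanager:us-east-1:123456789012:secret:redshift-analytics-*"
    }
  ]
}
```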
2. Create a Redshift table for content metadata
Start with a narrow analytics table before adding more fields. A practical first table includes document_id, type, title, slug, locale, author_name, campaign_id, published_at, updated_at, and raw_content as SUPER for flexible analysis.
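A minimal DDL sketch of that first table. The column types and lengths are assumptions to adjust for your data; note that Redshift treats the primary key as informational rather than enforced:

```sql
CREATE TABLE IF NOT EXISTS analytics.content_documents (
  document_id   VARCHAR(256) NOT NULL,
  type          VARCHAR(64),
  title         VARCHAR(512),
  slug          VARCHAR(512),
  locale        VARCHAR(16),
  author_name   VARCHAR(256),
  campaign_id   VARCHAR(128),
  published_at  TIMESTAMP,
  updated_at    TIMESTAMP,
  raw_content   SUPER,
  PRIMARY KEY (document_id)
);
```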
3. Model analytics-ready fields in Sanity Studio
Add fields that analysts can join on, not just fields editors see on the page. Common examples include slug, locale, canonicalUrl, campaignId, audience, topic references, author reference, lifecycleStatus, and publishedAt.
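As a sketch, those fields might look like this in a Sanity schema, written in plain-object form. The document type name and reference targets (author, topic) are assumptions for illustration:

```typescript
// Hypothetical analytics-friendly document schema for a Sanity Studio.
// Each field maps to a column (or joinable key) in the Redshift table.
export const post = {
  name: 'post',
  type: 'document',
  title: 'Post',
  fields: [
    {name: 'title', type: 'string'},
    {name: 'slug', type: 'slug', options: {source: 'title'}},
    {name: 'locale', type: 'string'},
    {name: 'canonicalUrl', type: 'url'},
    {name: 'campaignId', type: 'string'},
    {name: 'audience', type: 'string'},
    {name: 'author', type: 'reference', to: [{type: 'author'}]},
    {name: 'topics', type: 'array', of: [{type: 'reference', to: [{type: 'topic'}]}]},
    {name: 'lifecycleStatus', type: 'string'},
    {name: 'publishedAt', type: 'datetime'},
  ],
}
```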
4. Create the sync trigger
Use a Sanity webhook for publish, update, and delete events, or use a Sanity Function if you want server-side sync logic tied directly to content mutations. Add a GROQ filter such as _type in ['post', 'productPage', 'landingPage'] so Redshift only receives analytics-relevant content.
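A sketch of the webhook configuration: the filter limits which documents fire the webhook, and the projection keeps the payload small, since only the ID is strictly needed when the handler re-fetches the document anyway. The exact settings live in the Sanity webhook UI:

```groq
// Filter: which documents trigger the sync
_type in ["post", "productPage", "landingPage"]

// Projection: what the webhook payload contains
{ "documentId": _id }
```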
5. Fetch the exact Sanity fields with GROQ
In the handler, fetch the document from the Content Lake by ID and project only the fields Redshift needs. Resolve references in the query, for example author->name and categories[]->slug.current, so the warehouse gets analysis-ready rows.
6. Write to Redshift and test the reporting path
Call ExecuteStatementCommand with a parameterized MERGE or INSERT statement, then verify the row in Redshift with SELECT * FROM analytics.content_documents WHERE document_id = '...'. Build a small Looker, Tableau, QuickSight, or SQL dashboard that joins content_documents to your events table before expanding the integration.
Code example
```typescript
import {createClient} from '@sanity/client'
import {ExecuteStatementCommand, RedshiftDataClient} from '@aws-sdk/client-redshift-data'

const sanity = createClient({
  projectId: process.env.SANITY_PROJECT_ID!,
  dataset: process.env.SANITY_DATASET!,
  apiVersion: '2025-01-01',
  token: process.env.SANITY_READ_TOKEN,
  useCdn: false,
})

const redshift = new RedshiftDataClient({region: process.env.AWS_REGION})

export async function POST(req: Request) {
  const {documentId} = await req.json()

  // Re-fetch the document so the row reflects the current published state,
  // resolving the author reference in the same query.
  const doc = await sanity.fetch(
    `*[_id == $id][0]{
      _id, _type, title, "slug": slug.current,
      locale, campaignId, publishedAt, _updatedAt,
      "authorName": author->name
    }`,
    {id: documentId},
  )

  if (!doc) return Response.json({ok: true, skipped: true})

  await redshift.send(
    new ExecuteStatementCommand({
      WorkgroupName: process.env.REDSHIFT_WORKGROUP!,
      Database: process.env.REDSHIFT_DATABASE!,
      SecretArn: process.env.REDSHIFT_SECRET_ARN!,
      // Data API parameter values are strings, so timestamps are passed as
      // text and cast in SQL; nullif() turns an empty string back into NULL.
      // If your Redshift version rejects a subquery as the MERGE source,
      // insert into a staging table first and MERGE from that table instead.
      Sql: `merge into analytics.content_documents using (
        select :id as document_id, :type as type, :title as title,
               :slug as slug, :locale as locale, :author as author_name,
               :campaign as campaign_id,
               cast(nullif(:published, '') as timestamp) as published_at,
               cast(:updated as timestamp) as updated_at
      ) s on content_documents.document_id = s.document_id
      when matched then update set
        type = s.type, title = s.title, slug = s.slug, locale = s.locale,
        author_name = s.author_name, campaign_id = s.campaign_id,
        published_at = s.published_at, updated_at = s.updated_at
      when not matched then insert values (
        s.document_id, s.type, s.title, s.slug, s.locale,
        s.author_name, s.campaign_id, s.published_at, s.updated_at
      );`,
      Parameters: [
        {name: 'id', value: doc._id},
        {name: 'type', value: doc._type},
        {name: 'title', value: doc.title ?? ''},
        {name: 'slug', value: doc.slug ?? ''},
        {name: 'locale', value: doc.locale ?? 'en-US'},
        {name: 'author', value: doc.authorName ?? ''},
        {name: 'campaign', value: doc.campaignId ?? ''},
        {name: 'published', value: doc.publishedAt ?? ''},
        {name: 'updated', value: doc._updatedAt},
      ],
    }),
  )

  return Response.json({ok: true})
}
```

How Sanity + Redshift works
Build your Redshift integration on Sanity
Sanity gives you the structured content foundation, real-time event system, and flexible APIs to send clean content metadata into Redshift.
CMS approaches to Redshift
| Capability | Traditional CMS | Sanity |
|---|---|---|
| Analytics-ready content shape | Content is often stored as pages or HTML-heavy fields, so data teams need parsing jobs before loading Redshift. | Schemas define typed fields such as campaignId, locale, topic, author, and publish date, which map cleanly to Redshift columns. |
| Sync timing | Teams often depend on manual exports, database replicas, or nightly ETL jobs, which can leave dashboards stale. | GROQ-filtered webhooks and Functions can trigger sync logic on publish, update, and delete events without polling. |
| Field-level control | Exports may include too much page data or too little metadata, which creates extra warehouse modeling work. | GROQ can project the exact Redshift row shape in one query, including referenced authors, categories, campaigns, and localized fields. |
| Handling high-volume backfills | Backfills often require database access or custom scripts against legacy tables. | Use APIs for targeted backfills, then load large historical datasets into Redshift through S3 and COPY when volume gets high. |
| Multi-channel consistency | Website data, app data, and warehouse data often drift because each channel has its own content copy. | The Content Lake can feed websites, apps, Redshift, and AI agents from the same structured source. |
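For the high-volume backfill path in the table above, the S3 load can be sketched as follows. The bucket name, prefix, and IAM role ARN are placeholders, and this assumes the exported files are newline-delimited JSON with keys matching the column names:

```sql
COPY analytics.content_documents
FROM 's3://my-content-exports/content_documents/'
IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-s3-read'
FORMAT AS JSON 'auto'
TIMEFORMAT 'auto';
```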
Keep building
Explore related integrations to complete your content stack.
Sanity + Google Analytics
Connect content metadata with traffic, engagement, and conversion data to see which pages and campaigns perform.
Sanity + Segment
Send consistent content IDs and campaign fields into your customer event pipeline before they land in Redshift.
Sanity + Snowflake
Load structured Sanity content into Snowflake for teams that run analytics outside AWS or across multiple clouds.