SEO & Web Optimization · 8 min read

How to Integrate Screaming Frog with Your Headless CMS

Connect Screaming Frog to structured content so every publish can trigger targeted crawls, catch SEO regressions, and help teams fix issues before they reach search results.

Published April 29, 2026
01 Overview

What is Screaming Frog?

Screaming Frog is a desktop-based SEO crawler used by technical SEO teams, agencies, and site owners to audit URLs, metadata, redirects, canonicals, status codes, hreflang, structured data, and more. Its SEO Spider is widely used for technical audits because it can crawl small sites, large sites, staging environments, and URL lists with configurable extraction and export settings.


02 The case for integration

Why integrate Screaming Frog with a headless CMS?

SEO issues usually show up after content changes. An editor updates a title, a developer changes a route, or a localization team publishes 400 translated pages, and nobody notices that 37 pages now have missing canonicals or duplicate H1s until the next scheduled audit. Connecting Screaming Frog to your content workflow lets you run focused crawls when content changes, instead of waiting for a monthly site-wide scan.


03 Architecture

Architecture overview

A common flow starts when an editor publishes or updates a document in Sanity Studio. A Sanity webhook fires on the publish mutation, filtered with GROQ so only SEO-relevant document types, such as page, article, product, or landingPage, trigger the workflow. The webhook calls a Sanity Function or a small middleware endpoint.

That code uses @sanity/client and a GROQ query to fetch the changed document, join referenced fields, and build the exact public URL or a short URL list for nearby pages that should be checked. Because Screaming Frog SEO Spider does not expose a hosted REST API, the middleware calls the installed SEO Spider command-line binary, for example screamingfrogseospider, with headless list-mode crawl options.

The runner writes a urls.txt file, starts the crawl, exports tabs such as Internal:All, Page Titles:Missing, Meta Description:Duplicate, H1:Missing, Response Codes:Client Error (4xx), and Canonicals:Missing, then saves the CSV output to a shared location or parses it into your reporting system. The SEO team reviews the issues, fixes the source fields in Sanity Studio, and the published site updates for visitors and search crawlers.


04 Use cases

Common use cases

🕷️

Crawl new pages after publish

Run a Screaming Frog list-mode crawl for newly published URLs and catch missing titles, 404s, noindex tags, and canonical issues within minutes.

🌎

Check localized URL sets

When a market publishes translated pages, crawl the locale-specific URLs and export hreflang, status code, and canonical reports for that language.
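A locale-specific crawl needs per-language URL lists first. This is a minimal sketch of that grouping step; the `language`/`slug` field names, the `https://www.example.com` base domain, and the `/<locale>/<slug>` route pattern are all assumptions to adapt to your own schema and routing.

```typescript
// Sketch: group published documents by locale and build per-locale URL lists
// that a crawl runner can write to one urls.txt file per language.
// A hypothetical GROQ query feeding this function might look like:
//   *[_type == "page" && language == $lang]{ "slug": slug.current, language }

interface LocalizedDoc {
  language: string;
  slug: string;
}

function buildLocaleUrlLists(
  docs: LocalizedDoc[],
  baseUrl = 'https://www.example.com'
): Record<string, string[]> {
  const lists: Record<string, string[]> = {};
  for (const doc of docs) {
    // The /<locale>/<slug> pattern is an assumption; match your site's routes.
    const url = `${baseUrl}/${doc.language}/${doc.slug}`;
    (lists[doc.language] ??= []).push(url);
  }
  return lists;
}
```

Each resulting list can then be crawled separately so hreflang and canonical exports stay scoped to one market.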

🔁

Audit redirects after slug changes

Use Sanity webhook events to detect slug updates, crawl old and new URLs, and verify 301 behavior before search engines recrawl the page.
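Verifying 301 behavior boils down to checking the status code and Location header returned for the old URL. A sketch of that classification logic, separated from the HTTP fetch so it is easy to test; `verifyRedirect` and the verdict names are illustrative, not part of any library.

```typescript
// Sketch: classify the redirect check for an old slug after a slug change.
// A crawl runner could fetch the old URL with `redirect: 'manual'` and pass
// the response status and Location header into this helper.

type RedirectVerdict = 'ok' | 'temporary-redirect' | 'wrong-target' | 'not-redirecting';

function verifyRedirect(
  status: number,
  location: string | null,
  expectedTarget: string
): RedirectVerdict {
  if (status === 301 || status === 308) {
    // Permanent redirect: also confirm it points at the new canonical URL.
    return location === expectedTarget ? 'ok' : 'wrong-target';
  }
  if (status === 302 || status === 307) return 'temporary-redirect';
  return 'not-redirecting';
}
```

Anything other than `'ok'` is worth flagging before search engines recrawl the old URL.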

📄

Validate metadata at scale

Compare structured title, description, Open Graph, and canonical fields in Sanity against what Screaming Frog finds in rendered HTML.
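The comparison itself can be a small diff between the fields stored in Sanity and a parsed crawl row. A sketch under stated assumptions: `seoTitle`/`seoDescription` are the hypothetical Sanity field names from this article, and the crawl-row shape stands in for whatever your Screaming Frog export parser produces.

```typescript
// Sketch: diff SEO fields stored in Sanity against what the crawler saw in
// the rendered HTML. Field and property names here are illustrative.

interface SanitySeoFields {
  seoTitle: string;
  seoDescription: string;
}

interface CrawlRow {
  title: string;
  metaDescription: string;
}

function diffMetadata(source: SanitySeoFields, crawled: CrawlRow): string[] {
  const issues: string[] = [];
  if (source.seoTitle !== crawled.title) {
    issues.push(`title mismatch: CMS "${source.seoTitle}" vs HTML "${crawled.title}"`);
  }
  if (source.seoDescription !== crawled.metaDescription) {
    issues.push('meta description mismatch');
  }
  return issues;
}
```

An empty array means the rendered page matches the structured source; anything else points to a template or caching problem rather than an editorial one.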


05 Implementation

Step-by-step integration

  1. Install and license Screaming Frog SEO Spider

    Install SEO Spider on the machine that will run crawls, activate a paid license if you need more than the free crawl limits, and confirm the command-line binary works with a test command such as screamingfrogseospider --headless --crawl https://example.com.

  2. Create a reusable Screaming Frog configuration

    In the SEO Spider app, configure crawl settings, rendering mode, user agent, authentication if needed, custom extraction, and PageSpeed Insights settings if you use that API. Save the configuration as a .seospiderconfig file for the runner.

  3. Model SEO fields in Sanity Studio

    Add fields such as slug, seoTitle, seoDescription, canonicalUrl, noindex, language, market, and parent references to the relevant schemas. This gives your crawl workflow typed source data instead of relying on HTML scraping.
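As a sketch of that modeling step, here is the field list as a plain schema object. In a real Studio you would typically wrap this with `defineType`/`defineField` from the `sanity` package; it is shown as a bare object so the field list is easy to scan, and the `page` type name is just an example.

```typescript
// Sketch of an SEO-focused `page` document schema. Plain object form;
// in Sanity Studio you would normally use defineType/defineField.
const pageSchema = {
  name: 'page',
  type: 'document',
  fields: [
    {name: 'slug', type: 'slug', options: {source: 'title'}},
    {name: 'seoTitle', type: 'string'},
    {name: 'seoDescription', type: 'text'},
    {name: 'canonicalUrl', type: 'url'},
    {name: 'noindex', type: 'boolean'},
    {name: 'language', type: 'string'},
    {name: 'market', type: 'string'},
    // Parent reference lets crawl queries include nearby pages in one URL list.
    {name: 'parent', type: 'reference', to: [{type: 'page'}]},
  ],
};
```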

  4. Add a publish webhook or Sanity Function trigger

    Create a webhook that fires on publish events for page-like documents. Use a GROQ filter such as _type in ["page", "article", "product"] so crawl automation does not run for unrelated content changes.

  5. Run Screaming Frog from a crawl runner

    Because Screaming Frog does not provide a hosted API or official JavaScript SDK, call the SEO Spider command-line interface from a licensed runner. The runner can receive the Sanity webhook, fetch the changed URL with @sanity/client, write a URL list, and start a headless crawl.

  6. Test the full loop

    Publish a test page, confirm the webhook fires, verify the generated URL list, inspect the exported Screaming Frog CSV files, and decide where issues go next, such as Slack, Jira, GitHub, or a custom Sanity Studio dashboard.


06 Code

Code example

screaming-frog-webhook.ts
import {createClient} from '@sanity/client';
import {writeFile, mkdir} from 'node:fs/promises';
import {execFile} from 'node:child_process';
import {promisify} from 'node:util';

const exec = promisify(execFile);
const client = createClient({
  projectId: process.env.SANITY_PROJECT_ID!,
  dataset: process.env.SANITY_DATASET!,
  apiVersion: '2025-01-01',
  token: process.env.SANITY_READ_TOKEN,
  useCdn: false
});

export async function handleWebhook(req: any, res: any) {
  const {_id} = req.body;
  const page = await client.fetch(
    `*[_id == $id][0]{"url": "https://www.example.com/" + slug.current}`,
    {id: _id.replace('drafts.', '')}
  );

  if (!page?.url) return res.status(204).end();

  await mkdir('/tmp/sf', {recursive: true});
  await writeFile('/tmp/sf/urls.txt', page.url + '\n');

  await exec('screamingfrogseospider', [
    '--headless',
    '--crawl-list', '/tmp/sf/urls.txt',
    '--config', '/opt/screamingfrog/sanity.seospiderconfig',
    '--output-folder', '/tmp/sf/out',
    '--export-tabs', 'Internal:All,Page Titles:Missing,Meta Description:Missing,Response Codes:Client Error (4xx)'
  ]);

  res.json({crawled: page.url});
}
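Once the crawl finishes, the exported CSV files in the output folder need parsing before the issues can go anywhere useful. A minimal sketch of that step: the naive parser below assumes simple comma-separated values with no embedded commas or quotes (use a real CSV library for production exports), and the `Address`/`Title 1` column names are examples of what an export may contain.

```typescript
// Sketch: turn a Screaming Frog CSV export into plain row objects for
// reporting. Naive parsing only; real exports can contain quoted fields.

function parseCrawlExport(csv: string): Record<string, string>[] {
  const [headerLine, ...rows] = csv.trim().split('\n');
  const headers = headerLine.split(',').map((h) => h.trim());
  return rows.map((row) => {
    const cells = row.split(',');
    const record: Record<string, string> = {};
    headers.forEach((header, i) => {
      // Normalize headers like "Title 1" to compact keys such as "title1".
      const key = header.replace(/\s+/g, '').toLowerCase();
      record[key] = (cells[i] ?? '').trim();
    });
    return record;
  });
}
```

The resulting rows can be filtered (for example, empty `title1` values) and forwarded to Slack, Jira, or a Studio dashboard.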

07 Why Sanity

How Sanity + Screaming Frog works

Build your Screaming Frog integration on Sanity

Sanity gives you the structured content foundation, real-time event system, and flexible APIs to connect publish workflows with Screaming Frog audits.

Start building free →

08 Comparison

CMS approaches to Screaming Frog

| Capability | Traditional CMS | Sanity |
| --- | --- | --- |
| Generating crawl URL lists | Often requires sitemap scraping, database exports, or plugins that vary by site setup. | Uses GROQ to query published URLs, locale variants, parent references, and SEO fields directly from the Content Lake. |
| Triggering crawls on publish | Usually depends on plugin hooks or scheduled jobs, which can miss custom publish flows. | Uses GROQ-powered webhooks to trigger only for relevant document types, publish events, or field changes. |
| Running Screaming Frog automation | Typically needs an external script that polls the site or waits for manual URL exports. | Functions can handle lightweight event processing, while a licensed crawl runner executes the Screaming Frog command-line workflow. |
| Comparing source fields to rendered HTML | SEO teams often compare spreadsheets against pages manually. | A single GROQ query can return title, description, canonical, noindex, language, and related content for comparison against crawl exports. |
| Handling multi-market SEO checks | Locale rules are often embedded in templates, plugins, or separate site instances. | Schemas can model markets, languages, routes, and hreflang relationships so each crawl targets the right regional URL set. |

09 Next steps

Keep building

Explore related integrations to complete your content stack.

Ready to try Sanity?

See how Sanity's Content Operating System powers integrations with Screaming Frog and 200+ other tools.