Top 5 Patterns for Image Optimization in a Headless CMS
Your hero image is 1.8 MB of unoptimized JPEG, it loads last, and your Largest Contentful Paint sits at 4 seconds on a mid-tier phone. Worse, when it finally arrives it shoves the headline down the page because nobody set explicit dimensions, so your Cumulative Layout Shift score tanks too. Images are among the most common drivers of Core Web Vitals failures, and in a headless setup the responsibility for fixing that lands squarely on the frontend team. Sanity changes where that work lives. Rather than treating an image as a binary blob you bolt a CDN onto after the fact, Sanity, which positions itself as the Content Operating System for the AI era, models the image as structured, queryable data: hotspot, crop, alt text, dimensions, dominant color, and a low-quality placeholder all travel with the asset. This article ranks five patterns for image optimization in a headless CMS, from URL-driven transforms to ingest-time automation. Each pattern stands on its own, and together they take you from a slow, layout-shifting hero to images that render correctly on every viewport and channel without a bespoke image service to maintain.
1. URL-driven transforms against the asset CDN
The highest-leverage pattern is also the simplest: stop generating image variants at build time and start requesting exactly the image you need at request time, by URL. In Sanity, every transformation is a query parameter against the Content Lake asset CDN at cdn.sanity.io. You append w for width, h for height, dpr for device pixel ratio (1 to 3), fit to control scaling behavior (clip, crop, fill, fillmax, max, scale, or min), and q for quality, which trades bytes for fidelity. There is no pre-baked size matrix and no rebuild when a layout changes. The killer parameter is auto=format, which inspects the browser's Accept header and returns the most optimized format available, typically AVIF or WebP, falling back gracefully for older clients. In non-browser contexts you force a format with fm=webp instead. This matters because serving modern formats is the first lever Core Web Vitals guidance reaches for, and here it is one query string away rather than a transcoding pipeline you own. The asset reference ID itself, for example image-G3i4emG6B8JnTmGoN0UjgAp8-300x450-jpg, encodes the original width, height, and format, so the frontend can construct every transform URL without an extra fetch to discover the source dimensions. Contentful and Storyblok both expose comparable transform endpoints with format conversion, so this is table stakes among serious platforms. The differentiator is that the transform reads from a content store you also query with GROQ, not a delivery layer stapled on the side.
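As a rough illustration of the URL-driven approach, the sketch below parses an asset reference ID and appends transform parameters by hand. The project ID and dataset are placeholder assumptions, and the helper names parseAssetRef and imageUrl are hypothetical; in practice the @sanity/image-url helper covers this ground.

```typescript
// Placeholder values (assumptions): substitute your own project and dataset.
const PROJECT_ID = "yourProjectId";
const DATASET = "production";

// Transform parameters named in the text above.
interface TransformOptions {
  w?: number;
  h?: number;
  dpr?: 1 | 2 | 3;
  fit?: "clip" | "crop" | "fill" | "fillmax" | "max" | "scale" | "min";
  q?: number;
  auto?: "format";
  fm?: string;
}

interface ParsedAsset {
  id: string;
  width: number;
  height: number;
  format: string;
}

// Split "image-<id>-<width>x<height>-<format>" into its parts.
function parseAssetRef(ref: string): ParsedAsset {
  const match = /^image-([A-Za-z0-9]+)-(\d+)x(\d+)-([a-z0-9]+)$/.exec(ref);
  if (!match) throw new Error(`Unrecognised asset reference: ${ref}`);
  return {
    id: match[1],
    width: Number(match[2]),
    height: Number(match[3]),
    format: match[4],
  };
}

// Build a CDN URL and append only the parameters that were supplied.
function imageUrl(ref: string, options: TransformOptions = {}): string {
  const { id, width, height, format } = parseAssetRef(ref);
  const base = `https://cdn.sanity.io/images/${PROJECT_ID}/${DATASET}/${id}-${width}x${height}.${format}`;
  const query = Object.entries(options)
    .filter(([, value]) => value !== undefined)
    .map(([key, value]) => `${key}=${encodeURIComponent(String(value))}`)
    .join("&");
  return query ? `${base}?${query}` : base;
}
```

Calling imageUrl("image-G3i4emG6B8JnTmGoN0UjgAp8-300x450-jpg", { w: 200, auto: "format" }) yields a URL ending in G3i4emG6B8JnTmGoN0UjgAp8-300x450.jpg?w=200&auto=format, and the source dimensions come from the ID alone, with no extra fetch.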
2. Edge-cached size discipline for repeatable speed
Once transforms are free to generate, the temptation is to request a slightly different width on every component, and that quietly undermines performance. Sanity caches each unique transform URL at the global CDN edge, which means a URL someone already requested is served from the node closest to the next user with no origin work. The corollary is a real pattern: limit the number of distinct sizes and crops your frontend asks for, so requests collapse onto a small set of hot, edge-warm URLs instead of scattering across thousands of one-off dimensions. Pick a srcset ladder of four or five widths, reuse it everywhere, and let the cache do its job. Two sharp edges are worth knowing. The docs explicitly warn that non-integer values for integer parameters like w and h can cause performance issues or timeouts, so round your computed widths before they hit the URL. And by default small source images are scaled up to meet your requested width, which wastes bytes and softens the result; pass fit=max when you want the CDN to cap at the source size instead of upscaling. None of this requires infrastructure. It is a discipline applied to URLs against a managed cache. A DIY pipeline of sharp plus an object store plus a CDN can reach the same place, but you own the cache-key design, the invalidation, and the upscale guard yourself, which is exactly the operational work a managed asset CDN absorbs.
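One way to enforce that discipline is to route every width through a single shared ladder. The sketch below is a minimal version under stated assumptions: the five widths are illustrative rather than recommended values, and snapToLadder and buildSrcset are hypothetical helper names. It rounds computed widths to integers before they reach the URL and passes fit=max so small sources are not upscaled.

```typescript
// Illustrative ladder (an assumption): choose widths that match your layouts.
const WIDTH_LADDER: readonly number[] = [320, 640, 960, 1280, 1920];

// Round a computed width to an integer, then snap it up to the nearest
// ladder step so components share a small set of cacheable URLs.
function snapToLadder(requested: number): number {
  const rounded = Math.round(requested);
  for (const step of WIDTH_LADDER) {
    if (step >= rounded) return step;
  }
  // Anything wider than the ladder falls back to the largest step.
  return Math.max(...WIDTH_LADDER);
}

// Build a srcset string over the shared ladder for one image base URL.
// fit=max asks the CDN to cap at the source size instead of upscaling.
function buildSrcset(baseUrl: string): string {
  return WIDTH_LADDER
    .map((w) => `${baseUrl}?w=${w}&fit=max&auto=format ${w}w`)
    .join(", ");
}
```

A component that measures itself at 412.5 CSS pixels would call snapToLadder(412.5) and request the 640 step, the same URL every other mid-sized slot uses.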
3. Editor-set hotspot and crop that travel with the asset
A 16:9 hero and a 1:1 thumbnail need different crops of the same photo, and an automated center crop will happily decapitate the subject. This is where treating the image as structured data pays off. Hotspot and crop in Sanity are metadata an editor sets once in the Studio, stored directly on the image field, not baked into a derivative file. The @sanity/image-url helper respects that metadata automatically: pass it the full image object plus both a width and a height, and it produces an art-directed crop that keeps the subject in frame across whatever aspect ratio you request. A typical call reads builder.image(source).width(800).height(450).fit('crop').auto('format').url(), with builder methods including width(), height(), fit(), crop(), dpr(), quality(), and format(). The point is that the crop rule lives with the content. The same hero, cropped correctly, can ship to a web grid, an email header, and a native app card because each consumer asks for its own dimensions and the editor's focal point comes along for free. Contentful supports focus-area cropping and Storyblok ships smart crop with focal points, so editorial cropping is not unique. What is distinct is that Sanity maps the editor experience and the delivery helper to the same field on the same document, so this is the Model your business pillar in practice: the asset is modeled with intent, and that intent is enforced at delivery rather than re-decided per channel by whoever happens to be building the frontend that week.
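To make the idea of a focal-point crop concrete, here is a deliberately simplified sketch. It is not the algorithm @sanity/image-url uses, it ignores the editor's crop rectangle and the hotspot's own width and height, and focalRect is a hypothetical name. It only shows how a hotspot stored as fractions of the image can steer a crop toward the subject for any target aspect ratio.

```typescript
// Hotspot centre as fractions of the image, 0 to 1 on each axis.
interface Hotspot {
  x: number;
  y: number;
}

interface Rect {
  left: number;
  top: number;
  width: number;
  height: number;
}

const clamp = (value: number, min: number, max: number): number =>
  Math.min(Math.max(value, min), max);

// Largest rectangle with the target aspect ratio that fits inside the source,
// centred on the hotspot and pushed back inside the image where it overflows.
function focalRect(
  srcWidth: number,
  srcHeight: number,
  hotspot: Hotspot,
  targetWidth: number,
  targetHeight: number,
): Rect {
  const targetRatio = targetWidth / targetHeight;
  let width = srcWidth;
  let height = srcHeight;
  if (srcWidth / srcHeight > targetRatio) {
    // Source is wider than the target shape: trim the sides.
    width = Math.round(srcHeight * targetRatio);
  } else {
    // Source is taller than the target shape: trim top and bottom.
    height = Math.round(srcWidth / targetRatio);
  }
  const left = clamp(Math.round(hotspot.x * srcWidth - width / 2), 0, srcWidth - width);
  const top = clamp(Math.round(hotspot.y * srcHeight - height / 2), 0, srcHeight - height);
  return { left, top, width, height };
}
```

For a 300x450 source with the hotspot at (0.5, 0.8), a square target gives { left: 0, top: 150, width: 300, height: 300 }, so the crop slides down toward the subject instead of staying centred.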
4. Query-time metadata to kill layout shift
The fastest format in the world still tanks your Cumulative Layout Shift if the browser does not know how tall the image will be before it loads. The standard guidance is unambiguous: set explicit width and height so the box is reserved, keep CLS under 0.1 and LCP under 2.5 seconds, and render a placeholder so the layout is stable from first paint. Sanity makes both cheap because the asset metadata is queryable with GROQ at request time. The asset document carries dimensions and aspect ratio, a BlurHash or LQIP (low-quality image placeholder) string, the palette and dominant color, and EXIF data. A single GROQ projection fetches the image reference, its width and height, its LQIP string, and its alt text in one round trip, which you then spread directly onto the img element and the placeholder. That single-round-trip shape is the GROQ advantage worth naming: where a GraphQL-native platform like Hygraph has you assemble a selection set across fields and transformation arguments, you ask GROQ for exactly the object shape the component needs, references resolved, in one query. With dimensions in hand you set the intrinsic size to eliminate shift, with the dominant color or LQIP you paint a placeholder so there is no flash of empty box, and with alt text queried alongside you keep accessibility data attached to the same record rather than in a separate system that can drift out of sync.
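A projection along those lines might look like the sketch below. The document type page and the field names heroImage and alt are assumptions about the schema, not something this article defines, and toImgProps is a hypothetical helper; the metadata paths (dimensions.width, dimensions.height, lqip) follow the asset metadata described above.

```typescript
// GROQ projection. `page`, `heroImage`, and `alt` are hypothetical schema names.
const HERO_QUERY = `*[_type == "page" && slug.current == $slug][0]{
  "image": heroImage{
    alt,
    "ref": asset._ref,
    "width": asset->metadata.dimensions.width,
    "height": asset->metadata.dimensions.height,
    "lqip": asset->metadata.lqip
  }
}`;

// Shape of the projected image object.
interface HeroImage {
  alt: string | null;
  ref: string;
  width: number;
  height: number;
  lqip: string | null;
}

interface ImgProps {
  width: number;
  height: number;
  alt: string;
  style: Record<string, string>;
}

// Map the query result onto attributes for an img element: explicit
// dimensions reserve the box, and the LQIP paints a placeholder behind it.
function toImgProps(image: HeroImage): ImgProps {
  const style: Record<string, string> = {};
  if (image.lqip) {
    style.backgroundImage = `url(${image.lqip})`;
    style.backgroundSize = "cover";
  }
  return {
    width: image.width,
    height: image.height,
    // An empty alt marks the image as decorative; prefer real alt text.
    alt: image.alt ?? "",
    style,
  };
}
```

The returned object can be spread onto an img element alongside a src built from ref, so the browser knows the intrinsic size before a single image byte arrives.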
5. Ingest-time enrichment with Functions
The last pattern moves work earlier so the runtime never pays for it. Sanity Functions are serverless and event-driven, and they can run the moment an asset is ingested. That is the natural home for the enrichment a fast, accessible image needs but that nobody wants to do by hand at scale: generate or validate alt text, extract the palette and dominant color, run content moderation, or kick off downstream processing. This is the Automate everything pillar applied to the asset pipeline. Instead of an editor remembering to write alt text on a thousand product photos, a Function proposes it at upload and the editor approves, which is the kind of leverage that lets a team scale output without scaling headcount. If a team needs a bespoke asset-management surface inside the editor, the App SDK supports custom in-Studio inputs, so the automation and the human review live in the same workspace. Strapi can reach similar outcomes with upload hooks and a provider, but because it is self-hosted by default the team owns the function runtime, the queueing, and the failure handling rather than consuming a managed event system. The broader framing matters here: legacy CMSes bolt AI and automation on as an afterthought, whereas Sanity is the intelligent backend for companies building content operations at scale, where ingest-time enrichment is a first-class event on the same Content Lake your frontend already queries, not a separate pipeline to wire up and babysit.
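The decision logic inside such a handler can stay small. The sketch below deliberately does not use the Sanity Functions SDK: the event shape, the task names, and planEnrichment are all hypothetical, chosen only to illustrate the "propose at upload, approve in the editor" flow described above.

```typescript
// Hypothetical event payload; a real Function receives its own event shape.
interface AssetEvent {
  assetId: string;
  altText?: string;
  moderated?: boolean;
}

// Hypothetical task record that a handler could queue or act on.
interface EnrichmentTask {
  kind: "proposeAltText" | "moderate";
  assetId: string;
  requiresEditorApproval: boolean;
}

// Decide which enrichment steps a newly ingested asset still needs.
function planEnrichment(event: AssetEvent): EnrichmentTask[] {
  const tasks: EnrichmentTask[] = [];
  if (!event.altText || event.altText.trim() === "") {
    // Alt text is proposed, never auto-published: an editor approves it.
    tasks.push({
      kind: "proposeAltText",
      assetId: event.assetId,
      requiresEditorApproval: true,
    });
  }
  if (!event.moderated) {
    tasks.push({
      kind: "moderate",
      assetId: event.assetId,
      requiresEditorApproval: false,
    });
  }
  return tasks;
}
```

Keeping the planning step as a pure function makes it easy to test without an event runtime; the code that generates the alt text or calls a moderation service would sit behind each task.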