2026 · Edge SEO and AI crawlers

LLM Proxy Worker

Project resources

Technologies

TypeScript, Cloudflare Workers, Cloudflare KV, Cloudflare Browser Rendering, Puppeteer, DataForSEO, OpenAI, Anthropic Claude, Wrangler

Features

LLM crawler markdown, AI-generated SEO, and HTML prerendering in one edge Worker

Overview

LLM Proxy Worker is a serverless backend built in TypeScript to sit in front of a website as an intermediate layer. The Worker detects traditional browsers, search crawlers, and AI crawlers, then decides whether to return the original page, an LLM-readable markdown version, HTML enriched with dynamically generated SEO metadata, or a prerendered version for JavaScript-heavy pages. The goal is to improve readability, technical SEO, and crawler compatibility without rewriting the origin site.

Role

End-to-end design and development of the Worker: request classification, HTML-to-markdown conversion, SEO generation with AI providers, Chromium/Puppeteer prerendering, KV cache, fallbacks, and debug routes.

What I did

Designed the main Worker flow to distinguish browsers, AI crawlers, search crawlers, and debug routes. Implemented HTML-to-markdown conversion, removing noisy elements such as scripts, navigation, footers, forms, and banners. Built dynamic SEO metadata generation using SERP data, OpenAI as the primary provider, and Claude as fallback. Validated AI output before injecting title, meta description, Open Graph, canonical, and JSON-LD into the HTML document. Integrated Cloudflare Browser Rendering with Puppeteer to serve prerendered HTML to crawlers when a page depends on JavaScript. Used Cloudflare KV, configurable timeouts, and local fallbacks to reduce cost, latency, and dependency on external providers.
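The noise removal before markdown conversion can be sketched as a small cleanup pass. The tag list and the regex-based approach here are simplified assumptions for illustration; a production version could use a proper HTML parser instead.

```typescript
// Illustrative sketch: strip noisy elements from origin HTML before
// converting it to markdown. Tag list and regex approach are assumptions.
const NOISY_TAGS = ["script", "style", "nav", "footer", "form"];

function stripNoise(html: string): string {
  let out = html;
  for (const tag of NOISY_TAGS) {
    // Remove each noisy element together with its contents.
    const re = new RegExp(`<${tag}[\\s\\S]*?</${tag}>`, "gi");
    out = out.replace(re, "");
  }
  return out;
}
```

What survives this pass is the main content, which is then handed to the actual markdown converter.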

Problems solved

Different responses for different clients

The Worker classifies requests through technical routes, debug queries, User-Agent signals, and purpose headers. This lets browsers, AI crawlers, and search crawlers receive the most suitable path without adding cost to every visit.
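A hypothetical sketch of that classification, combining debug query parameters and User-Agent signals; the crawler patterns and class names are illustrative assumptions, not the project's actual identifiers.

```typescript
// Hypothetical request classifier. Patterns and names are assumptions.
type ClientClass = "debug" | "ai-crawler" | "search-crawler" | "browser";

function classifyRequest(url: URL, userAgent: string): ClientClass {
  // Explicit debug queries win, so any path can be inspected on demand.
  if (url.searchParams.has("debug")) return "debug";
  if (/GPTBot|ClaudeBot|PerplexityBot|CCBot/i.test(userAgent)) return "ai-crawler";
  if (/Googlebot|bingbot|DuckDuckBot/i.test(userAgent)) return "search-crawler";
  // Anything else is treated as a regular browser and passed through.
  return "browser";
}
```

Because browsers fall through to the default branch, ordinary visits never trigger the more expensive markdown, SEO, or prerender paths.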

Controlled AI token usage

SEO generation uses SERP data and constrained prompts, then stores the result in KV. Later requests reuse the cache, avoiding repeated calls to DataForSEO, OpenAI, and Claude.
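The cache-or-generate pattern looks roughly like this. The interface mirrors Cloudflare KV's `get`/`put`, but the key scheme, TTL, and function names are illustrative assumptions.

```typescript
// Sketch of cache-or-generate around SEO metadata. Key scheme, TTL,
// and names are assumptions; the KV shape mirrors Cloudflare KV.
interface KvLike {
  get(key: string): Promise<string | null>;
  put(key: string, value: string, opts?: { expirationTtl?: number }): Promise<void>;
}

async function getSeo(
  kv: KvLike,
  pageUrl: string,
  generate: () => Promise<string>, // the DataForSEO + OpenAI/Claude call in the real flow
): Promise<string> {
  const key = `seo:${pageUrl}`;
  const cached = await kv.get(key);
  if (cached !== null) return cached; // reuse: no provider calls, no token cost
  const fresh = await generate();
  await kv.put(key, fresh, { expirationTtl: 60 * 60 * 24 }); // e.g. cache for a day
  return fresh;
}
```

Only the first request for a page pays the provider cost; every later request is a single KV read at the edge.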

Prerendering without blocking the site

Chromium and Puppeteer are used only for crawlers or explicit debug requests. If rendering fails, the browser binding is missing, or cache is unavailable, the Worker falls back to the original page.
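That fallback behavior can be sketched as a wrapper around the render call. Here `render` stands in for the Cloudflare Browser Rendering + Puppeteer invocation, and the timeout value and function names are assumptions; the fall-back-to-origin behavior is the one described above.

```typescript
// Sketch of prerender-with-fallback. `render` stands in for the
// Browser Rendering + Puppeteer call; names and timeout are assumptions.
async function prerenderOrOrigin(
  originHtml: string,
  render: (() => Promise<string>) | null, // null when the browser binding is missing
  timeoutMs = 8000,
): Promise<string> {
  if (render === null) return originHtml; // no binding: serve the original page
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<string>((resolve) => {
    // A slow render resolves to the original page instead of blocking.
    timer = setTimeout(() => resolve(originHtml), timeoutMs);
  });
  try {
    return await Promise.race([render(), timeout]);
  } catch {
    // Rendering failed: never block the site, fall back to origin HTML.
    return originHtml;
  } finally {
    clearTimeout(timer);
  }
}
```

The key design choice is that every failure mode degrades to the origin response, so crawlers always receive a valid page.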