A web page is no longer read only by people. It is visited by search engines, AI crawlers, preview systems, SEO tools, and automations that often need responses different from the ones designed for a human browser.
The web no longer has a single reader
For years we designed websites by imagining a user in front of a screen. Today that is only part of the story. The same page can be requested by Googlebot, an AI crawler, a preview tool, or a traditional browser.
The issue is that these clients do not read a website in the same way. A browser can handle JavaScript, animations, and interactions. An AI crawler may prefer compact, readable content. A search engine needs clear metadata, consistent canonical URLs, and structured data.
LLM Proxy Worker starts from this idea: do not rewrite the origin site, but add a layer in front of it that adapts the response based on who is reading.
An edge proxy as an intelligent layer
The Worker runs on Cloudflare Workers and behaves like edge middleware. It receives the request, looks at the URL, query params, User-Agent, and headers, then decides which path to apply.
If the request comes from a human browser, the site keeps behaving normally, with optional SEO enrichment. If it comes from an AI crawler or an LLM debug route, the Worker can transform HTML into markdown. If it comes from a crawler that struggles with JavaScript, it can serve prerendered HTML.
This architecture is useful because it adds cross-cutting capabilities without rewriting the CMS, frontend, or main backend. In practice that means:
- browser: original page or SEO-enriched page
- AI crawler: cleaner and more readable markdown version
- search crawler: prerendered HTML when needed
- debug route: diagnostic responses to inspect behavior
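To make the dispatch concrete, here is a minimal sketch of how a Worker could classify the client and pick a path. The user-agent patterns, the debug route, and the branch bodies are illustrative placeholders, not the project's actual rules.

```ts
// Minimal routing sketch. The UA patterns, the debug path, and the branch
// bodies are illustrative placeholders, not the project's actual rules.
type ClientKind = "browser" | "ai-crawler" | "search-crawler" | "debug";

function detectClient(request: Request): ClientKind {
  const url = new URL(request.url);
  const ua = (request.headers.get("user-agent") ?? "").toLowerCase();

  if (url.pathname.startsWith("/__llm-debug")) return "debug"; // hypothetical debug route
  if (/gptbot|claudebot|perplexitybot|ccbot/.test(ua)) return "ai-crawler";
  if (/googlebot|bingbot|duckduckbot/.test(ua)) return "search-crawler";
  return "browser";
}

export default {
  async fetch(request: Request): Promise<Response> {
    const kind = detectClient(request);

    if (kind === "debug") {
      // Diagnostic response: expose what the Worker decided for this request.
      return Response.json({ kind, userAgent: request.headers.get("user-agent") });
    }
    if (kind === "ai-crawler") {
      // Markdown path would go here (sketched in the next section).
      return fetch(request);
    }
    if (kind === "search-crawler") {
      // Prerendering path would go here, only when the page depends on JavaScript.
      return fetch(request);
    }
    // Human browser: pass through to the origin, optionally enriching SEO tags.
    return fetch(request);
  },
};
```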
Markdown for LLMs: less noise, more content
A complete HTML page contains much more than a language model really needs to read: scripts, styles, footers, navigation, banners, SVG, forms, and other technical elements.
For an AI crawler, that noise consumes tokens and can pull attention away from the actual content. Markdown conversion returns a more compact, ordered version that stays closer to the useful text of the page.
It is not a magic shortcut. It is an interface choice between a website and automated systems: when the reader is not human, the response can be shaped around how that reader processes information.
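As a rough illustration of the reduction involved, a naive converter could strip the noisy blocks and keep only headings, links, and text. This is only a sketch of the idea, not the project's actual converter.

```ts
// Naive illustration of the HTML -> markdown reduction, not the project's converter:
// drop structural noise, keep headings, links, and text, and collapse whitespace.
function htmlToCompactMarkdown(html: string): string {
  const reduced = html
    // Remove blocks an LLM rarely needs: scripts, styles, navigation, forms, SVG.
    .replace(/<(script|style|nav|footer|form|svg)[\s\S]*?<\/\1>/gi, "")
    // Turn headings into markdown headings.
    .replace(/<h([1-6])[^>]*>([\s\S]*?)<\/h\1>/gi, (_m, level: string, text: string) =>
      `\n${"#".repeat(Number(level))} ${text.trim()}\n`,
    )
    // Keep link text and target in markdown form.
    .replace(/<a[^>]*href="([^"]*)"[^>]*>([\s\S]*?)<\/a>/gi, "[$2]($1)");

  // Strip any remaining tags and normalize spacing.
  return reduced
    .replace(/<[^>]+>/g, " ")
    .replace(/[ \t]+/g, " ")
    .replace(/\n{3,}/g, "\n\n")
    .trim();
}
```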
AI-generated SEO, but constrained
The delicate part is not calling a model. It is preventing the model from becoming a black box. In this project, SEO generation starts from real data: page HTML, keyword, SERP data, search intent, competitors, and rules computed by the code.
OpenAI is the primary provider and Claude is the fallback, but the output is not accepted blindly. The result must be valid JSON and must pass checks on the title, description, Open Graph tags, canonical, and JSON-LD.
That changes the role of AI: it does not decide alone what matters, but works inside a boundary defined by the system.
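To make that boundary concrete, the validation side could be sketched roughly like this. The field names, length bounds, and provider call signatures are assumptions for illustration, not the project's actual schema.

```ts
// Illustrative output contract and checks; field names, length bounds, and the
// provider call signatures are assumptions, not the project's actual schema.
interface SeoPayload {
  title: string;
  description: string;
  canonical: string;
  openGraph: Record<string, string>;
  jsonLd: Record<string, unknown>;
}

type ProviderCall = (prompt: string) => Promise<string>;

function validateSeo(raw: string): SeoPayload | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw); // must be valid JSON, not prose with JSON buried inside
  } catch {
    return null;
  }
  const seo = parsed as SeoPayload;
  if (!seo.title || seo.title.length > 65) return null;            // assumed title bound
  if (!seo.description || seo.description.length > 160) return null;
  if (!seo.canonical?.startsWith("https://")) return null;          // canonical must be absolute
  if (!seo.openGraph?.["og:title"]) return null;
  if (!seo.jsonLd?.["@type"]) return null;                          // JSON-LD needs a type
  return seo;
}

// Primary provider first, fallback second, null if neither output passes the checks.
async function generateSeo(
  prompt: string,
  primary: ProviderCall,   // e.g. the OpenAI call
  fallback: ProviderCall,  // e.g. the Claude call
): Promise<SeoPayload | null> {
  const first = validateSeo(await primary(prompt).catch(() => ""));
  if (first) return first;
  return validateSeo(await fallback(prompt).catch(() => ""));
}
```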
Prerendering only when it is actually needed
Many modern pages depend on JavaScript. A human browser sees them correctly, but some crawlers may stop at the initial HTML and miss part of the content.
For these cases, the Worker can use Cloudflare Browser Rendering with Puppeteer: it opens the page, lets JavaScript run, and returns the final HTML. It is powerful, but also expensive compared with a normal request.
That is why prerendering is not applied to everyone. It is triggered only for crawlers, specific policies, or explicit debug requests. Human users stay on the lighter path.
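A minimal version of the rendering step with Cloudflare Browser Rendering and Puppeteer could look like this; the binding name and the timeout are assumptions, not the project's configuration.

```ts
import puppeteer from "@cloudflare/puppeteer";

// Minimal prerendering sketch with Cloudflare Browser Rendering.
// `BROWSER` is an assumed binding name; the timeout is illustrative.
interface Env {
  BROWSER: Fetcher;
}

async function prerender(url: string, env: Env): Promise<string> {
  const browser = await puppeteer.launch(env.BROWSER);
  try {
    const page = await browser.newPage();
    // Let JavaScript run until the network settles, then capture the final HTML.
    await page.goto(url, { waitUntil: "networkidle0", timeout: 15_000 });
    return await page.content();
  } finally {
    await browser.close();
  }
}
```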
Cache, fallbacks, and sustainability
A solution based on LLMs and Chromium can become fragile if every request creates expensive work. That is why the project uses Cloudflare KV to store SEO metadata and prerendered HTML.
The first request can generate the result. Later requests can reuse it. This reduces tokens, external calls, latency, and output variability.
The most important part is behavior under failure. If DataForSEO does not respond, OpenAI fails, Claude is unavailable, or Chromium times out, the site should not break. The Worker returns reasonable fallbacks or the original page.
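Put together, a cache-first path with a graceful fallback could be sketched as follows, assuming a KV binding and a render function like the one above; the binding name, key scheme, and TTL are illustrative.

```ts
// Cache-first sketch: KV in front of the expensive work, the origin as the last resort.
// The binding name, key scheme, and TTL are assumptions, not the project's configuration.
interface Env {
  PAGE_CACHE: KVNamespace;
}

async function cachedPrerender(
  request: Request,
  env: Env,
  render: (url: string) => Promise<string>, // e.g. the Browser Rendering call sketched above
): Promise<Response> {
  const key = `prerender:${new URL(request.url).pathname}`;
  const htmlHeaders = { "content-type": "text/html; charset=utf-8" };

  // 1. Reuse previous work whenever possible.
  const cached = await env.PAGE_CACHE.get(key);
  if (cached) return new Response(cached, { headers: htmlHeaders });

  try {
    // 2. Generate once and store with a TTL so stale entries age out on their own.
    const html = await render(request.url);
    await env.PAGE_CACHE.put(key, html, { expirationTtl: 60 * 60 * 24 });
    return new Response(html, { headers: htmlHeaders });
  } catch {
    // 3. If Chromium or any provider fails, never break the site: serve the origin page.
    return fetch(request);
  }
}
```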
The senior part: deciding what not to do
The interesting part of a project like this is not using as many services as possible. It is deciding when not to use them. An LLM on every request would look impressive, but it would be expensive, slow, and unpredictable. A headless browser for every visit would be powerful, but disproportionate. A cache treated like a database would quickly become a consistency problem.
The architectural work is about drawing clear boundaries: which requests deserve AI, which ones deserve prerendering, which ones should stay on the standard path, and what happens when an external provider does not respond.
That is the difference between a demo that works in the ideal case and a system that can sit in front of real traffic. Adding intelligence is not enough: it has to be contained, measured, and replaceable.
A web that machines can read too
LLM Proxy Worker does not try to replace the website. It tries to make it more readable for the different readers of the modern web: people, search engines, AI crawlers, and automated tools. It is a small example of adaptive architecture: one origin, different responses, controlled costs, and fallbacks always in place.