What AI Tools Actually Read On Your Website (And What They Ignore)

When firms ask me why they aren't being mentioned in AI answers, the conversation almost always turns to their website. They walk me through the homepage, the hero video, the testimonials carousel, the polished services section. And almost none of it matters to the models they're trying to be cited by. AI tools don't experience your website. They harvest it. Understanding that distinction is what separates being recommended from being invisible.

The model's actual diet

Large language models — and the retrieval systems built on top of them, like Perplexity, ChatGPT search, and Google's AI Overviews — consume web pages as plain, structured text. They strip out the visual chrome, ignore the JavaScript-driven flourishes, and look for clear, declarative answers to specific questions. What they reward is the opposite of what most firms have spent the last ten years optimising for.

Specifically, here's what AI tools actually read on your site:

The first 200 words of any page, in plain text, above any interactive element. This is the snippet they're most likely to lift verbatim.
Headings (H1, H2, H3) used as topic anchors. They use these to understand what the page is about and which sub-questions it answers.
Lists, tables, and short answer paragraphs. These are the easiest formats to extract, cite, and reuse inside an AI answer.
Schema markup — specifically Organization, Person, LocalBusiness, Service, FAQPage, and Article. Schema tells the model who you are, where you are, and what you do, with no interpretation required.
Internal links with descriptive anchor text. They're how the model maps the relationships between your services, your locations, and your specialties.
The footer. Address, phone, founding year, parent company. Boring, structured, gold for entity disambiguation.
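To make the schema item concrete, here is a minimal sketch of generating LocalBusiness markup as JSON-LD, one common way to embed schema.org data in a page. The function names and every value passed in are hypothetical placeholders, and the set of properties is an assumption about what a typical firm would want to expose, not a complete or required list.

```python
import json


def build_local_business_jsonld(name, url, phone, founding_year,
                                street, city, region, postal_code):
    """Return a JSON-LD string describing a firm as a schema.org LocalBusiness.

    The chosen properties are an illustrative assumption, not a required set.
    """
    data = {
        "@context": "https://schema.org",
        "@type": "LocalBusiness",
        "name": name,
        "url": url,
        "telephone": phone,
        "foundingDate": str(founding_year),
        "address": {
            "@type": "PostalAddress",
            "streetAddress": street,
            "addressLocality": city,
            "addressRegion": region,
            "postalCode": postal_code,
        },
    }
    return json.dumps(data, indent=2)


def as_script_tag(jsonld):
    """Wrap a JSON-LD string in the script tag used to embed it in HTML."""
    return '<script type="application/ld+json">\n' + jsonld + "\n</script>"


# Hypothetical example values, not a real firm.
markup = build_local_business_jsonld(
    name="Example Estate Law",
    url="https://example.com",
    phone="+1-555-0100",
    founding_year=2005,
    street="1 Example Street",
    city="Springfield",
    region="IL",
    postal_code="00000",
)
print(as_script_tag(markup))
```

The printed tag would sit in the page's HTML, so the same address, phone, and founding year that appear in the footer are also stated in a form that needs no interpretation.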

What the model ignores

Just as important is what the model throws away. Most of the budget firms spend on their websites lands in this category:

Hero videos and image carousels. The model cannot see them. If your value proposition only lives inside a video, it does not exist to AI.
JavaScript-injected text that doesn't render in the initial HTML. If a model fetches your page and the content arrives only after a client-side render, much of it is invisible.
Decorative copy. Taglines like "Trusted advisors. Proven results." carry no extractable information. They're discarded as noise.
Stock imagery and team photos. Useful for humans, irrelevant for retrieval. Models do not award authority for design.
Long brand-narrative paragraphs that don't answer a specific question. The model is looking for "what does this firm do, for whom, where, and how," not your origin story.
Pop-ups, chat widgets, and consent banners. At best ignored. At worst, they break the parse.

The pattern is consistent. The model rewards declarative, structured, retrievable text and ignores everything that exists primarily for visual or emotional impact.
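The JavaScript point is the easiest one to check mechanically. The sketch below uses only Python's standard library and assumes the retrieval system reads the raw HTML response without running scripts, which will not be true of every crawler. The class and function names are hypothetical, and the two sample pages are made up for illustration.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text from raw HTML, skipping script and style contents."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only text that sits outside script and style elements.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def text_in_initial_html(html, phrase):
    """Return True if the phrase appears in the text of the unrendered HTML."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split())
    return phrase.lower() in text.lower()


# Two made-up pages: one server-rendered, one that injects its copy with JavaScript.
server_rendered = (
    "<html><body><h1>Estate planning for blended families</h1>"
    "<p>We draft revocable trusts.</p></body></html>"
)
client_rendered = (
    "<html><body><div id='root'></div><script>"
    "document.getElementById('root').textContent = 'We draft revocable trusts.';"
    "</script></body></html>"
)

print(text_in_initial_html(server_rendered, "revocable trusts"))
print(text_in_initial_html(client_rendered, "revocable trusts"))
```

In practice the `html` argument would be the body of a plain HTTP fetch of your own page. If a sentence you care about is missing from that text, a fetcher that does not execute JavaScript will likely miss it too.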

The "extractable answer" rule

The single most useful idea I can give you is this: every page on your site should contain at least one passage that, if pulled out and shown alone, would still be a clear, complete, useful answer to a real buyer question. That passage is what the model will lift, attribute, and surface inside an AI answer.

An extractable answer looks like this: "An estate planning attorney for blended families typically coordinates three documents: a revocable trust, a pour-over will, and beneficiary designations across retirement and insurance accounts. The goal is to prevent the surviving spouse and children from prior marriages from ending up in court over assumptions the deceased never wrote down." That paragraph can be lifted whole. It answers a specific question, in plain language, with concrete detail. A model can cite it confidently.

A non-extractable paragraph looks like this: "We bring decades of experience and a personalised approach to every estate planning engagement, helping families navigate complex situations with care and discretion." That sentence answers nothing. No model will lift it, because lifting it would tell the user nothing they didn't already know.

A practical audit you can run this week

You don't need a tool. You need an hour and a notepad. Open every important page on your site and ask three questions:

Is the value proposition stated in plain text within the first 200 words, or is it locked inside a video, image, or animation?
Is there at least one passage on the page that would survive being pulled out and shown alone as an answer to a buyer question?
Does the page have proper schema — Organization, Service, LocalBusiness, FAQPage, or Article — that tells the model what kind of content it is?

If a page fails any of those checks, it is effectively invisible to AI tools, regardless of how good it looks to humans. Fix the highest-traffic pages first: homepage, primary service pages, the about page, and your top three blog posts. The fixes are unglamorous: rewriting hero text in plain language, adding schema, breaking long paragraphs into extractable answers, and ensuring critical text isn't trapped inside JavaScript.
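None of this needs code, but if you have dozens of pages, the first and third questions can be roughly automated. The sketch below is one possible approach using only Python's standard library. The names `PageAudit`, `audit_page`, and `EXPECTED_TYPES` are hypothetical. It assumes schema is embedded as JSON-LD with a top-level `@type`, so it ignores microdata and nested `@graph` structures. The second question, whether a passage stands alone as an answer, still needs a human reader.

```python
import json
from html.parser import HTMLParser

# Schema types named in the audit question above.
EXPECTED_TYPES = {"Organization", "Service", "LocalBusiness", "FAQPage", "Article"}


class PageAudit(HTMLParser):
    """Collect plain-text words and JSON-LD @type values from raw HTML."""

    def __init__(self):
        super().__init__()
        self._skipping = False   # inside a script or style element
        self._in_jsonld = False  # inside a JSON-LD script element
        self._buf = []
        self.words = []
        self.schema_types = set()

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skipping = True
            script_type = (dict(attrs).get("type") or "").lower()
            self._in_jsonld = tag == "script" and script_type == "application/ld+json"
            self._buf = []

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            if self._in_jsonld:
                self._record_types("".join(self._buf))
            self._skipping = False
            self._in_jsonld = False

    def handle_data(self, data):
        if self._in_jsonld:
            self._buf.append(data)
        elif not self._skipping:
            self.words.extend(data.split())

    def _record_types(self, raw):
        try:
            doc = json.loads(raw)
        except json.JSONDecodeError:
            return  # malformed markup counts as missing
        for item in doc if isinstance(doc, list) else [doc]:
            if not isinstance(item, dict):
                continue
            declared = item.get("@type")
            for name in declared if isinstance(declared, list) else [declared]:
                if isinstance(name, str):
                    self.schema_types.add(name)


def audit_page(html, key_phrase):
    """Check question 1 (phrase in the first 200 words) and question 3 (schema)."""
    parser = PageAudit()
    parser.feed(html)
    parser.close()
    opening = " ".join(parser.words[:200]).lower()
    return {
        "phrase_in_first_200_words": key_phrase.lower() in opening,
        "schema_types": sorted(parser.schema_types & EXPECTED_TYPES),
    }
```

Here `key_phrase` is whatever short statement of your value proposition you expect to find near the top of the page. An empty `schema_types` list or a `False` on the phrase check marks a page worth opening by hand.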

Why this is the cheapest authority work you'll ever do

Most AI visibility advice ends up sounding like a giant content program: write a hundred articles, get cited everywhere, become a thought leader. That work matters, but it takes years. Making your existing site machine-legible takes weeks and produces visible changes in AI mentions within a quarter. It's the lowest-hanging fruit in modern marketing — and it's almost completely ignored by firms still optimising their hero animation.

The firms that quietly dominate AI answers in their category aren't the loudest. They're the most legible. They've stopped designing their sites for awards juries and started writing them for the retrieval systems their buyers now ask first. That choice — boring, structural, deeply unsexy — is what compounds into being the named firm when ChatGPT is asked who to call.
