GEO playbook 2026: how to get cited by ChatGPT, Claude, and Perplexity (with what worked for us)
Generative Engine Optimization (GEO) makes your business citable by AI search engines. Eight things matter: schema.org markup, FAQ pages, /llms.txt, AI-crawler allowlist in robots.txt, sitemap, server-rendered HTML, consistent NAP data, and fresh content. We rebuilt findloc.ai around these and tracked the result.
What is GEO?
Generative Engine Optimization (GEO) is the discipline of structuring a website so AI search engines — ChatGPT, Claude, Perplexity, Google AI Overviews — can read it, trust it, and cite it by name when a user asks a question.
Traditional SEO optimises for being one of ten ranked blue links. GEO optimises for being one of two or three sources synthesised inside a single AI-generated answer. The mechanics are different: AI engines do not weigh keyword density the way 2010-era search ranking did; what matters is how easily a language model can parse your content without ambiguity. That means structure (schema.org, FAQPage, /llms.txt) wins over volume.
Rule of thumb: if a 12-year-old skimming your page in 30 seconds can correctly answer "what does this business do, where is it, what are its hours, what are people's top questions about it?" — your page is GEO-ready. If they can't, neither can an LLM.
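A rough way to automate that 30-second skim is to look only at the text present in the raw HTML, before any JavaScript runs, and check whether your key facts are in it. The sketch below uses only Python's standard library; the class name, function name, and sample pages are hypothetical, and a real check would fetch your live page instead of using inline strings.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects the text a non-JavaScript crawler would find in raw HTML."""

    SKIP = {"script", "style", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def missing_facts(raw_html, facts):
    """Return the facts that do not appear in the page's static text."""
    parser = VisibleText()
    parser.feed(raw_html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    return [fact for fact in facts if fact.lower() not in text]


# Hypothetical pages: one server-rendered, one client-only shell.
server_rendered = "<html><body><h1>Example Cafe</h1><p>Open 7am to 3pm</p></body></html>"
client_only = '<html><body><div id="root"></div><script>render("Example Cafe")</script></body></html>'
facts = ["Example Cafe", "7am to 3pm"]

print(missing_facts(server_rendered, facts))  # []
print(missing_facts(client_only, facts))      # ['Example Cafe', '7am to 3pm']
```

An empty list means every fact is readable without JavaScript. Anything returned is a fact that only exists after client-side rendering, or not at all.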
The 8-point GEO checklist
These are the eight signals we found AI engines actually use, ranked roughly by impact. Every findloc.ai mini-page ships with all eight automatically; if you're hand-building, work top to bottom.
| # | Signal | Why it matters | Effort |
|---|---|---|---|
| 1 | AI-crawler allowlist in robots.txt | Blocking any major bot = invisible to that engine. Many sites block some of them without realising it. | 5 min |
| 2 | schema.org JSON-LD (LocalBusiness / Organization) | AI engines extract structured facts without ambiguity. Single highest-impact change. | 15 min |
| 3 | FAQPage schema on a real Q&A section | AI engines preferentially quote Q→A pairs verbatim. | 30 min |
| 4 | /llms.txt summary file | One-fetch overview for AI agents. Crawler-side proof of intent. | 20 min |
| 5 | Sitemap.xml that includes every cite-worthy page | Tells engines what to crawl. Without it, deep pages are invisible. | 10 min |
| 6 | Server-rendered HTML (not client-only React) | Most AI crawlers do not execute JavaScript. Client-only sites look blank to them. | depends |
| 7 | Consistent NAP (name, address, phone) across the web | Inconsistency triggers AI engines to lower confidence. | ongoing |
| 8 | Fresh content (last-modified timestamps in sitemap) | AI engines weight recency for time-sensitive answers. | ongoing |
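
For signal 2, the markup itself is a small JSON object inside a script tag. Here is a minimal sketch of generating it in Python, assuming a hand-built site. The function names and the business details are placeholders, and the properties shown are a common subset of schema.org's LocalBusiness type rather than a complete or required set.

```python
import json


def local_business_jsonld(name, street, city, country, phone, url, hours):
    """Build a schema.org LocalBusiness object as a JSON-LD dict."""
    return {
        "@context": "https://schema.org",
        "@type": "LocalBusiness",
        "name": name,
        "url": url,
        "telephone": phone,
        "address": {
            "@type": "PostalAddress",
            "streetAddress": street,
            "addressLocality": city,
            "addressCountry": country,
        },
        "openingHours": hours,
    }


def as_script_tag(data):
    """Wrap JSON-LD in the script tag that goes into the page's HTML."""
    body = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so a stray value cannot close the script element early.
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Placeholder business details for illustration only.
business = local_business_jsonld(
    name="Example Cafe",
    street="12 Example Street",
    city="Exampleville",
    country="AU",
    phone="+61 2 0000 0000",
    url="https://example.com/",
    hours="Mo-Fr 07:00-15:00",
)
print(as_script_tag(business))
```

Use exactly the same name, address, and phone here as everywhere else you are listed, since this is also where signal 7 (consistent NAP) starts.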
Which AI bots do I actually need to allow?
This is the cheapest 5-minute win in GEO. Open your robots.txt. Make sure none of these are in a Disallow rule:
- GPTBot — OpenAI's training crawler
- OAI-SearchBot — ChatGPT's real-time search crawler (the one that fetches when you ask "what's happening")
- ChatGPT-User — sent when ChatGPT fetches a page on a user's behalf, for example while answering a question that needs live information
- ClaudeBot — Anthropic's training crawler
- anthropic-ai — a legacy Anthropic user-agent token that many robots.txt files still list
- PerplexityBot — Perplexity's indexing crawler
- Perplexity-User — sent when Perplexity fetches a page on a user's behalf while answering a question
- Google-Extended — a robots.txt control token, not a separate crawler, that governs whether Google may use your content for its AI models (separate from regular Googlebot)
- Applebot-Extended — Apple Intelligence training
- Amazonbot — used by Alexa and Amazon's AI products
- CCBot — Common Crawl, the upstream dataset most LLMs train on
- Bytespider — ByteDance / TikTok AI
Some WordPress setups, Cloudflare's "Block AI Bots" toggle, and several popular plugins can block many of these, and not all of them do it through robots.txt. Open yourdomain.com/robots.txt in an incognito tab and read it line by line, then check your CDN or firewall settings for any bot-blocking rule that is switched on.
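If reading robots.txt line by line sounds error-prone, the check can be scripted. This sketch uses Python's standard `urllib.robotparser` to report which of the 12 user agents above are disallowed from a given URL. The function name and the sample robots.txt are hypothetical, and it only evaluates robots.txt rules, so blocking done at a CDN or firewall will not show up here.

```python
from urllib.robotparser import RobotFileParser

AI_BOTS = [
    "GPTBot", "OAI-SearchBot", "ChatGPT-User", "ClaudeBot", "anthropic-ai",
    "PerplexityBot", "Perplexity-User", "Google-Extended",
    "Applebot-Extended", "Amazonbot", "CCBot", "Bytespider",
]


def blocked_bots(robots_txt, page_url="https://example.com/"):
    """Return the AI user agents that robots.txt disallows for page_url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_BOTS if not parser.can_fetch(bot, page_url)]


# Hypothetical robots.txt that blocks two of the twelve.
sample_robots = """
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

print(blocked_bots(sample_robots))  # ['GPTBot', 'CCBot']
```

To test a live site, paste the contents of its robots.txt into `sample_robots`, or use the parser's `set_url()` and `read()` methods to fetch it.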
Why /llms.txt is your single most underrated GEO file
/llms.txt is a plain-text summary of your site written for AI agents, whereas robots.txt only tells crawlers what they may fetch. It tells an LLM what the site is, which pages are key, what content you would like to be cited for, and what to attribute when quoting.
Most websites do not have one. The ones that do see real crawler activity to /llms.txt within days of publishing. findloc.ai's /llms.txt logs hits from OAI-SearchBot, PerplexityBot, and ClaudeBot multiple times per week.
Every findloc.ai mini-page is automatically included in findloc.ai/llms.txt — meaning your business becomes part of the file AI engines fetch when they want a structured view of who's on the platform.
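For a hand-built site, the file is easy to generate. The sketch below follows the layout in the public llms.txt proposal as we understand it (a title line, a one-line summary, then headed lists of links written in Markdown style). It is not findloc.ai's generator: the function name, site, and URLs are placeholders.

```python
def build_llms_txt(site_name, summary, sections):
    """Render an llms.txt body.

    sections maps a heading to a list of (title, url, note) tuples.
    """
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    # End with exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


# Placeholder content for illustration only.
llms_txt = build_llms_txt(
    site_name="Example Cafe",
    summary="Neighbourhood cafe in Exampleville, open weekdays 7am to 3pm.",
    sections={
        "Key pages": [
            ("Menu", "https://example.com/menu", "Current menu and prices"),
            ("FAQ", "https://example.com/faq", "Answers to common customer questions"),
        ],
    },
)
print(llms_txt)
```

Save the output at the root of your domain so it is served at /llms.txt, and regenerate it whenever the key pages change.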
Our own case study: rebuilding findloc.ai around GEO
In May 2026 we shipped findloc.ai with the full 8-point GEO stack baked in. The numbers below are real SQL aggregates from our own ai_crawler_visits and businesses tables — the live, always-current versions of the same numbers are pinned to the top of findloc.ai/directory.
| Metric | Snapshot · 2026-06-05 |
|---|---|
| AI crawler hits, last 7 days | 35,890 (≈5,127/day avg) |
| Distinct AI bots, last 30 days | 7 (GPTBot, ClaudeBot, OAI-SearchBot, ChatGPT-User, PerplexityBot, Amazonbot, CCBot) |
| ChatGPT-User fetches, last 30 days | 38 (ChatGPT retrieving our pages on behalf of users) |
| Mini-pages claimed | 15 across 10 countries |
| Visitor → claim conversion (PH launch cohort) | 1.6% (PH average: ~0.5%) |
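The single most exciting metric is the third row: 38 ChatGPT-User hits. OpenAI documents that user agent as the one ChatGPT uses when it fetches a page on behalf of a user, so each hit means ChatGPT pulled a findloc.ai page while handling a real person's request. Not a guess: each one is a logged HTTP request. That is the closest thing to a unit of GEO success that server logs can show; actual click-throughs come from the visitor's own browser and are better tracked through referrer data.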
We publish the live versions of these numbers as Supabase aggregates on /directory so visitors can verify they're real, not vendor whitepaper claims. Numbers are updated automatically on every page render.
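You can produce a similar tally from your own access logs without a database. The sketch below counts requests per AI bot by matching user-agent tokens in each log line. It is not our ai_crawler_visits pipeline: the function name and the sample log lines are invented for illustration, and because user-agent strings can be spoofed, treat the counts as an upper bound unless you also verify source IPs.

```python
from collections import Counter

# The seven bots named in the table above.
AI_BOT_TOKENS = [
    "GPTBot", "ClaudeBot", "OAI-SearchBot", "ChatGPT-User",
    "PerplexityBot", "Amazonbot", "CCBot",
]


def count_ai_hits(log_lines):
    """Count requests per AI bot by matching tokens in each log line."""
    hits = Counter()
    for line in log_lines:
        lowered = line.lower()
        for token in AI_BOT_TOKENS:
            if token.lower() in lowered:
                hits[token] += 1
                break  # count each request once
    return hits


# Made-up log lines; real user-agent strings are longer.
sample_log = [
    '203.0.113.5 "GET /llms.txt HTTP/1.1" 200 "ExampleAgent GPTBot/1.0"',
    '203.0.113.6 "GET / HTTP/1.1" 200 "ExampleAgent ChatGPT-User/1.0"',
    '203.0.113.7 "GET / HTTP/1.1" 200 "Mozilla/5.0 (ordinary browser)"',
    '203.0.113.5 "GET /directory HTTP/1.1" 200 "ExampleAgent GPTBot/1.0"',
]

hits = count_ai_hits(sample_log)
print(hits["GPTBot"], hits["ChatGPT-User"])  # 2 1
```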
How long until AI engines start citing you?
Three different timelines, ordered fastest to slowest:
- Crawler activity (1-2 weeks). Once robots.txt, sitemap, and /llms.txt are live, AI crawlers start fetching. Visible immediately in server logs or — if you have a findloc.ai mini-page — in your /my dashboard.
- Long-tail citation in AI answers (4-8 weeks). Specific queries like "best dentist in [your suburb]" can surface you within a month. Lower competition = faster.
- Brand-name citation when someone asks for you directly (8-16 weeks). The slowest because it requires both indexing AND the AI engine learning that your brand name maps to your business.
These are real findloc.ai timelines, not benchmarks from a vendor whitepaper. Your mileage varies with site authority, content depth, and the specificity of queries you care about.
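The first of those timelines only starts once the sitemap is live, and checklist item 8 depends on last-modified timestamps inside it. A minimal sketch of building such a sitemap with Python's standard library follows; the function name, URLs, and dates are placeholders.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Render a sitemap from (url, last_modified_date) pairs.

    Dates are expected as YYYY-MM-DD strings.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    xml_body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + xml_body


# Placeholder pages for illustration only.
sitemap_xml = build_sitemap([
    ("https://example.com/", "2026-06-05"),
    ("https://example.com/faq", "2026-06-01"),
])
print(sitemap_xml)
```

Only update a page's lastmod when its content actually changes; a timestamp that moves on every build tells crawlers nothing about freshness.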
What we DON'T do (and you shouldn't either)
- Keyword stuffing. AI engines penalise it harder than Google does — repeated phrases trip language model heuristics for "low-quality content".
- AI-generated FAQ filler with no real Q&A behind it. Easy to spot, hurts trust signals.
- Cloaking (serving different HTML to bots vs users). One of the few hard violations.
- Buying schema-injection services that charge per page. Schema.org JSON-LD is free; any developer can add it in 15 minutes.
- Paying for "AI SEO" consultants who can't name the 12 bots above. The field is too new for credentialing; expertise is on display in the answers.
Where to start this week
- Run findloc.ai's free Visibility Checker on your own site. Takes 60 seconds, no signup. See what you're missing.
- Read your robots.txt in an incognito tab. Compare against the 12-bot allowlist above.
- Claim a free findloc.ai mini-page. It ships the full 8-point stack automatically — schema, FAQ, /llms.txt inclusion, sitemap, crawler allowlist.
- Check back in 2 weeks. Your /my dashboard shows real AI bot visits. That's ground truth.
Frequently asked
- What is GEO (Generative Engine Optimization)?
GEO is the practice of structuring a website so AI search engines (ChatGPT, Claude, Perplexity, Google AI Overviews) can read it, trust it, and cite it by name when answering user questions. It overlaps with traditional SEO but prioritises schema.org markup, FAQ-style content, and explicit AI-crawler permissions over keyword density.
- How is GEO different from SEO?
SEO optimises for ranked search results; GEO optimises for being quoted inside an AI answer. AI engines do not show a ranked list — they synthesise one answer from a few trusted sources. Winning means being one of those sources, which depends on machine-readable structure (schema.org, FAQPage, /llms.txt) more than keyword volume.
- Which AI crawlers should I allow in robots.txt?
At minimum: GPTBot, OAI-SearchBot, ChatGPT-User, ClaudeBot, anthropic-ai, PerplexityBot, Perplexity-User, Google-Extended, Applebot-Extended, Amazonbot. Blocking one of these keeps that engine's crawler off your content, which makes a citation from it far less likely. Many websites block several by accident through plugin or Cloudflare settings.
- Do I need a /llms.txt file?
Recommended. /llms.txt is a plain-text summary of your site written for AI agents. It lets you control what AI engines learn about you in one fetch instead of crawling dozens of pages. We see real crawler hits to findloc.ai/llms.txt from OAI-SearchBot, PerplexityBot, and ClaudeBot.
- Does schema.org markup actually help GEO?
Yes — measurably. AI engines parse JSON-LD to extract structured facts (business name, hours, address, FAQs) without ambiguity. A LocalBusiness or Organization schema is the single highest-impact change for a local business. Add FAQPage to mark up Q&A sections, and Article for blog posts.
- How long until AI engines start citing me after I implement GEO?
Crawler activity typically starts within 1-2 weeks of robots.txt and sitemap changes. Citations in actual AI answers take longer, usually 4-8 weeks, because AI engines re-train or re-index on a schedule. The fastest signal is the crawler hit count in your /my dashboard; the slowest is brand-name citations in ChatGPT.
- Is GEO worth it for a small local business?
It is the lowest-cost discovery channel available right now. Setting up a free findloc.ai mini-page gets you the full GEO stack in 5 minutes. The downside risk is zero (it is free); the upside is being the business ChatGPT recommends when someone asks "best cafe in [your city]".
- How can I check if my current website is AI-visible?
findloc.ai runs a free, no-signup Visibility Checker that scores any URL on six GEO signals: SEO basics, schema.org markup, FAQ structure, AI-bot allowlist, /llms.txt presence, and sitemap. You also see exactly which fixes will move the score most.
- What does findloc.ai cost?
Free during this build period. Every feature — visibility check, mini-page, schema generation, FAQ generator, crawler-hit dashboard — is free, no credit card. A paid Pro tier (citation monitoring, competitor delta, multi-location) is planned for later; current free users lock in 50% off Pro for life.
- Can I do GEO without a developer?
Yes for any business that just needs a citable presence — a free findloc.ai mini-page ships the full GEO stack (schema.org, FAQ, /llms.txt inclusion, AI-bot allowlist) without touching code. If you own a website and want to GEO-optimise that, you need a developer for some of the steps (schema injection, /llms.txt creation).
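A question-and-answer list like the one above is exactly what FAQPage markup describes. The sketch below turns Q&A pairs into schema.org FAQPage JSON-LD; the function name is hypothetical and the single pair shown is shortened from this page's own FAQ. The resulting JSON goes inside a script tag of type application/ld+json on the same page as the visible questions.

```python
import json


def faq_page_jsonld(pairs):
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


faq = faq_page_jsonld([
    (
        "Do I need a /llms.txt file?",
        "Recommended. /llms.txt is a plain-text summary of your site written for AI agents.",
    ),
])
print(json.dumps(faq, indent=2))
```

Keep the marked-up answers word-for-word identical to what visitors see on the page; markup that differs from the visible text drifts toward the cloaking problem described earlier.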
A free findloc.ai mini-page ships the full 8-point GEO stack automatically — schema.org markup, FAQPage, /llms.txt inclusion, sitemap, AI-bot allowlist. Five minutes, no credit card.