GEO Engine v1.2 · methodology

How findloc scores AI visibility — the full math.

Most AI-visibility tools give you a 0-100 score with no way to verify it. We publish the formula, the dimensions, the evidence base, and the source code. If we’re wrong, you can tell us why.

Design principles

Rule-based, not LLM. Every score has a transparent formula. Same input → same output. No model jitter, no hidden weights.
Evidence-anchored. Every recommendation cites a structured source: the Princeton GEO paper, per-vertical industry research, observed competitor patterns, or a specific AI engine mechanism.
Vertical-agnostic core. The 4 highest-weighted dimensions (content / citation / competitor / mini-page) work across lodging, dental, legal, HVAC, med spa, and wedding venues. Vertical-specific signals get a small adapter weight (0.05).
Honest about coverage. If we can’t measure a dimension yet (no GBP connected, no website set), we say so explicitly. We never inflate a score by hiding stubs.
Auditable. Every dimension surfaces the signals it inspected. Click into a score and you see what we looked at.

The 7 dimensions

Each business is scored on 7 independent dimensions. The composite score is a weighted average of the dimension scores. Dimensions that can’t be measured (e.g. no GBP linked) are excluded from the composite: we renormalise the weights over the dimensions we can measure rather than artificially deflating the number.

Dimension	Weight	What it scores
Content density content_density	25%	Princeton GEO patterns on your mini-page text + FAQ: presence of statistics, quoted testimonials, external citations, concrete features. Negative score for keyword stuffing.
Citation performance citation_performance	25%	Your actual rank in the 4 AI engines (ChatGPT, Claude, Perplexity, Gemini) based on weekly market snapshots. Cross-engine coverage × rank bonus.
Competitor gap competitor_gap	15%	Distance to the top 3 cited competitors in your market. Banded: rank 1-2 = 100, 3-5 = 80, 6-10 = 60, 11-20 = 40, beyond = 20.
Mini-page completeness minipage_completeness	15%	How fully you've filled in your findloc mini-page: FAQ items (35pts), description quality (25pts), photos (15pts), hours (15pts), contact channels (10pts).
Distribution footprint distribution_footprint	10%	Google Business Profile presence + review volume + featured reviews embedded into your mini-page. TripAdvisor cross-check arrives in v1.2.
AI readability (external site) ai_readability	5%	Schema.org JSON-LD, /llms.txt, sitemap.xml, robots.txt GPTBot allow — scored against your external business website (not the mini-page, which already passes all of these).
Industry signals industry_signals	5%	Vertical-specific adapter. Hospitality adapter (v1.2): Google rating + review count + photo count + hours. Restaurant / real-estate / accounting adapters arrive in v1.3.
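
The competitor_gap banding in the table can be written as a small lookup. This is a minimal sketch rather than the engine's actual code: the function name and the handling of invalid ranks are assumptions, and a business that is not ranked at all is not covered here.

```python
# Bands from the competitor_gap row: (worst rank in the band, score).
BANDS = [(2, 100), (5, 80), (10, 60), (20, 40)]
BEYOND_SCORE = 20  # any rank past 20


def competitor_gap_score(rank: int) -> int:
    """Map a 1-based market rank to its banded score."""
    if rank < 1:
        raise ValueError("rank is 1-based")
    for worst_rank, score in BANDS:
        if rank <= worst_rank:
            return score
    return BEYOND_SCORE


print([competitor_gap_score(r) for r in (1, 4, 10, 11, 35)])
# [100, 80, 60, 40, 20]
```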

How recommendations are structured

Every recommendation we give Pro subscribers has 4 prose layers (problem / evidence / action / expected_impact) plus structured metadata. The evidence layer is the difference between "vague suggestion" and "report worth paying for".

Problem. Plain observable fact. "Your description contains zero quoted testimonials."
Evidence. Why it matters — research / observation / engine mechanism. "Princeton GEO research (KDD 2024, 10k-query controlled test) measured Quotation Addition as the single highest-impact textual optimization — +41% citation rate."
Action. Specific enough to complete in one sitting. "Add one real guest quote to the first or second paragraph. Format: ‘The kitchen was huge — perfect for our family of 5.’ — Sarah, June 2026."
Expected impact. Banded estimate: high / medium / low. High = Princeton top-tier or strong observed pattern.
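
One way to picture the four layers plus metadata is as a typed record. The sketch below is illustrative only: the four prose field names come from the text above, while the class name and the two metadata fields (`dimension`, `evidence_source`) are hypothetical and not taken from the engine's output schema.

```python
from dataclasses import dataclass

IMPACT_BANDS = ("high", "medium", "low")


@dataclass(frozen=True)
class Recommendation:
    # The four prose layers named in the text.
    problem: str
    evidence: str
    action: str
    expected_impact: str  # one of IMPACT_BANDS
    # Hypothetical metadata fields, for illustration only.
    dimension: str = ""
    evidence_source: str = ""

    def __post_init__(self) -> None:
        # Reject anything outside the three published impact bands.
        if self.expected_impact not in IMPACT_BANDS:
            raise ValueError(f"expected_impact must be one of {IMPACT_BANDS}")


rec = Recommendation(
    problem="Your description contains zero quoted testimonials.",
    evidence="Princeton GEO research (KDD 2024): Quotation Addition, +41% citation rate.",
    action="Add one real guest quote to the first or second paragraph.",
    expected_impact="high",
    dimension="content_density",
    evidence_source="princeton_geo",
)
print(rec.expected_impact)  # high
```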

Evidence base — what we built this on

Aggarwal et al., “GEO: Generative Engine Optimization” · Princeton + Georgia Tech, KDD 2024
10,000-query controlled test. Measured citation lift for 9 optimization techniques. Top 3 (each: +28-41%): Quotation Addition, Statistics Addition, Cite Sources. Keyword Stuffing has a NEGATIVE effect on AI citation — opposite of traditional SEO. Our content_density rules directly encode these findings.
Customer Alliance + Hospitality Net 2026 research
Hospitality-specific findings, representative of how vertical directories shape AI recommendations: GBP and TripAdvisor presence dominate AI hotel recommendations, and Perplexity has a published TripAdvisor partnership. The same pattern holds in other verticals: Avvo / Justia for legal, Healthgrades for dental, Angi / BBB for HVAC, RealSelf for med spa. Encoded in our industry_signals and distribution_footprint dimensions.
Ahrefs 15,000-query ChatGPT-Bing alignment study
ChatGPT’s browsing mode shows 87% alignment with Bing’s top organic results. Drives our recommendation when a business is missing from ChatGPT specifically: optimise for Bing indexing via IndexNow + Webmaster Tools.
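
For readers who want to act on the IndexNow suggestion, the sketch below builds (but does not send) a bulk submission request. It assumes the shared `api.indexnow.org` endpoint and a key file hosted at the site root as `<key>.txt`; the function name, the example URLs, and the key are placeholders, so check the IndexNow documentation before relying on it.

```python
import json
from urllib.parse import urlsplit
from urllib.request import Request

INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_indexnow_request(urls, key):
    """Build a POST request announcing changed URLs for a single host."""
    if not urls:
        raise ValueError("at least one URL is required")
    host = urlsplit(urls[0]).netloc
    if any(urlsplit(u).netloc != host for u in urls):
        raise ValueError("all URLs in one submission must share a host")
    payload = {
        "host": host,
        "key": key,
        # Assumption: the key file lives at the site root.
        "keyLocation": f"https://{host}/{key}.txt",
        "urlList": list(urls),
    }
    return Request(
        INDEXNOW_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


req = build_indexnow_request(
    ["https://example.com/", "https://example.com/faq"], key="placeholder-key"
)
print(req.get_method(), json.loads(req.data)["urlList"])
```

Passing `req` to `urllib.request.urlopen` would perform the actual submission; it is left out here so the example runs without network access.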

Source code

The engine itself lives at src/lib/geo-engine/ in our repo. Every dimension is a self-contained file you can read. Every rule has its evidence basis as a structured field in the output, not buried in a prompt.

What we don’t measure (yet)

Being honest about gaps is the trust signal. Things missing from v1.2:

Refresh-cycle latency per engine. No public study has measured "I changed X → engine Y started citing me N days later" at scale. We’re building a controlled experiment to ship the first dataset on this.
Causal model. We currently identify correlation between content patterns and citation rates. A real "if you do X you’ll lift N positions on engine Z" model requires ≥12 weeks of longitudinal data across ≥50 businesses. We’re collecting it now.
External backlink graph. We don’t crawl who links to whom. Adding this is a v1.5 item.
Vertical adapters still maturing. The industry_signals dimension ships adapters for lodging, dental, legal (PI), HVAC, med spa, and wedding venues. Restaurant / real-estate / accounting adapters arrive in v1.3 once we collect enough probe data to anchor them.

FAQ

Why a rule-based engine instead of letting an LLM generate suggestions?

Three reasons. (1) Auditability: when an LLM tells you "add family-friendly keywords", you can't check whether that's sound advice or a hallucination. Our rules cite Princeton GEO research with measured effect sizes. (2) Reproducibility: same input → same output. (3) Cost: zero LLM calls per analysis means we can re-run the engine weekly on every Pro business with no per-run model cost.

Why isn't the composite score just an average of all 7 dimensions?

Because weights matter: based on the research evidence, content_density and citation_performance each carry 5x the weight of industry_signals (25% vs 5%). And because some dimensions can't be measured for every business (e.g. no GBP linked → distribution_footprint stays at null). We renormalise the weights over the dimensions we can score, so you never get penalised for an un-measurable dimension.
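
The renormalisation can be sketched in a few lines. This is an illustration under stated assumptions, not the engine's source: the weights are copied from the dimensions table, while the function name and the use of `None` for an un-measurable dimension are choices made for this example.

```python
from typing import Dict, Optional

# Weights copied from the dimensions table.
WEIGHTS: Dict[str, float] = {
    "content_density": 0.25,
    "citation_performance": 0.25,
    "competitor_gap": 0.15,
    "minipage_completeness": 0.15,
    "distribution_footprint": 0.10,
    "ai_readability": 0.05,
    "industry_signals": 0.05,
}


def composite_score(scores: Dict[str, Optional[float]]) -> Optional[float]:
    """Weighted average over measured dimensions only (None = not measurable)."""
    measured = {dim: s for dim, s in scores.items() if s is not None}
    total_weight = sum(WEIGHTS[dim] for dim in measured)
    if total_weight == 0:
        return None  # nothing could be measured
    weighted_sum = sum(WEIGHTS[dim] * s for dim, s in measured.items())
    return weighted_sum / total_weight


example = {
    "content_density": 80,
    "citation_performance": 60,
    "competitor_gap": 60,
    "minipage_completeness": 100,
    "distribution_footprint": None,  # no GBP linked
    "ai_readability": None,          # no website set
    "industry_signals": 40,
}
print(round(composite_score(example), 1))  # about 71.8, i.e. 61 / 0.85
```

In this example, treating the two missing dimensions as zeros would give 61 instead; dividing by the measured weight (0.85) is what keeps the un-measurable dimensions from dragging the composite down.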

What's the difference between a score's "confidence" and a recommendation's "expected impact"?

Confidence is about how sure we are that the measurement is right. Expected impact is about how big the lift would be if you implemented the recommendation. A high-impact, low-confidence recommendation is an experiment worth trying. A low-impact, high-confidence recommendation is a reliable small win.

How often do scores update?

We re-probe each market's AI Reports weekly (Sunday night cron). Each Pro business in that market gets a fresh engine run automatically. You can also click "Regenerate" anytime to force a fresh analysis on demand.

Will the score predict whether I'll be cited by AI in 4 weeks?

Not yet, and we're honest about that. Right now the score is a diagnostic and improvement framework based on research correlation. A real causal model ("if you do X you'll rank +N in Y weeks") needs ≥12 weeks of longitudinal data across ≥50 businesses. We're collecting it now and will publish the model once its results are statistically significant.

See the engine on your own business

Free to claim. Pro $49/mo runs the full 7-dimension engine on your business weekly + serves you the evidence-anchored recommendations.
