How Yogoo measures AI citation
When buyers ask ChatGPT, Perplexity, Gemini, Claude, or Google's AI Overviews which products to consider, who do those AI assistants actually name?
That's what Yogoo measures.
We run thousands of buyer-discovery queries across the five major AI engines, count how often each domain appears in the answers, and report it as a Yogoo Score — a single number from 0 to 100 with a confidence band.
No hype. No academic theater. Here's exactly how it works.
On this page: Citation Hit Rate · Share of Voice · Sentiment · Multi-engine sampling · Confidence band · Category coverage · What we don't measure
What we measure
Yogoo Score is one number — but it answers five questions about how AI assistants cite your brand.
| Question we answer | What we report |
|---|---|
| What fraction of AI answers in your category mention your domain? | Citation Hit Rate — drives the score |
| When AI assistants talk about brands in your category, what fraction of those mentions is yours? | Share of Voice — competitive-share signal |
| How does AI talk about your brand — favorably, neutrally, critically? | Sentiment — qualitative signal, peer to the score |
| How many of the 5 major AI engines cite you? | Multi-engine breadth — drives the score |
| How reliable is the measurement, given that LLMs are stochastic? | Confidence band — annotates the score (e.g. 67 [62–72]) |
Citation Hit Rate and Multi-engine breadth combine into the Yogoo Score itself, scaled 0–100. Share of Voice and Sentiment are reported alongside as peer signals — they answer different questions than the score does. The confidence band describes how much to trust the measurement.
A score of 67 with a band of ±5 means: in your category, your brand shows up in AI answers reliably more often than most competitors, but less often than the category leaders.
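As an illustration of how the two score-driving signals can blend, here's a minimal Python sketch. The 70/30 weighting is invented for the example, not our production formula:

```python
def yogoo_score(hit_rate: float, engines_citing: int, total_engines: int = 5) -> float:
    """Illustrative only: blends Citation Hit Rate with engine breadth.
    The 70/30 split is a made-up weighting, not the published formula."""
    breadth = engines_citing / total_engines
    return round(100 * (0.7 * hit_rate + 0.3 * breadth), 1)

# e.g. a 42% hit rate, cited by 4 of 5 engines -> 53.4 under these invented weights
print(yogoo_score(0.42, 4))
```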
1. Citation Hit Rate
What it measures
The fraction of buyer-discovery queries in your category where AI assistants mention your brand in their answer.
How we compute it
For each category (e.g., "Project management"), we curate prompts that match what real buyers actually ask:
- Best project management tools for SaaS teams
- Linear vs. Asana vs. Jira
- Most-loved project management software in 2026
The full prompt set per category is shared with subscribers; the patterns are public — see Category coverage.
We run each prompt many times across each of the 5 engines, then count how often your brand appears in the response text — matched against your brand name, your domain, and any aliases you declare. When AI says "Linear" or "linear.app" or "Linear, Inc.", we count all of them. The Citation Hit Rate is the fraction of AI answers where your brand showed up.
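The counting logic is deliberately simple. A minimal Python sketch, assuming case-insensitive alias matching (real matching needs more care with word boundaries and disambiguation):

```python
import re

def mentions_brand(answer: str, aliases: list[str]) -> bool:
    """True if any declared alias (brand name, domain, legal name) appears in the answer.
    Simplified: matches at word starts only; production matching is stricter."""
    return any(re.search(r"\b" + re.escape(alias), answer, re.IGNORECASE)
               for alias in aliases)

def citation_hit_rate(answers: list[str], aliases: list[str]) -> float:
    """Fraction of sampled AI answers that mention the brand under any alias."""
    return sum(mentions_brand(a, aliases) for a in answers) / len(answers)

# Aliases declared for linear.app, per the example above
linear_aliases = ["Linear", "linear.app", "Linear, Inc."]
```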
Example
If linear.app shows up in 42 of every 100 AI answers for the "Project management" prompts, the Citation Hit Rate is 42%.
Why it matters
When a buyer asks an AI assistant for a recommendation in your category, your domain either appears in the answer — or it doesn't. Citation Hit Rate is the first-order measure of whether you're in the AI's consideration set at all.
A 0% Citation Hit Rate isn't an error. It's a diagnosis.
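2. Share of Voice
What it measures
When AI assistants talk about brands in your category, the fraction of those mentions that is yours. Citation Hit Rate asks whether you appear in answers at all; Share of Voice asks how much of the brand conversation you own. It's the competitive-share signal reported alongside the score.
How we compute it
Across the AI answers we sample for your category, we count every brand mention and report your brand's share of the total.
Example
Suppose the AI answers for the "Project management" prompts contain 1,000 brand mentions in total, and 180 of them are Linear. Linear's Share of Voice is 18%.
Why it matters
Two brands can have similar Citation Hit Rates while one dominates the answers it appears in and the other is named once among ten alternatives. Share of Voice tells you which one you are.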
3. Sentiment
What it measures
How AI assistants talk about your brand when they mention it: favorably, neutrally, or critically. Citation rate and Share of Voice tell you whether AI is talking about you. Sentiment tells you what AI is saying.
How we compute it
For every AI answer that mentions your brand in our coverage, we classify the mention as positive, neutral, or negative. We aggregate across the category to give you an overall sentiment indicator: mostly positive, mixed, or mostly negative.
Example
Across the AI answers that cite linear.app for "Project management" prompts, suppose 78 of every 100 cited mentions are framed positively ("Linear is a great choice for fast-moving teams"), 18 are neutral ("Linear is among the options for project management"), and 4 are negative ("Linear may be limited for larger orgs"). The category-level indicator reads mostly positive.
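The aggregation step, sketched in Python. The 60% cutoffs are illustrative, not our published thresholds:

```python
from collections import Counter

def sentiment_indicator(labels: list[str]) -> str:
    """Roll per-mention labels up into a category-level indicator.
    The 60% thresholds here are illustrative, not published cutoffs."""
    counts = Counter(labels)
    total = len(labels)
    if counts["positive"] / total >= 0.6:
        return "mostly positive"
    if counts["negative"] / total >= 0.6:
        return "mostly negative"
    return "mixed"

# Mirrors the linear.app example: 78 positive, 18 neutral, 4 negative
labels = ["positive"] * 78 + ["neutral"] * 18 + ["negative"] * 4
print(sentiment_indicator(labels))  # mostly positive
```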
What's available now vs. coming next
Available now (every tier, free included): category-level sentiment indicator.
Coming with our next release: per-engine sentiment breakdown (which engines frame you positively vs. critically); per-prompt sentiment with competitor deltas; sentiment by feature category (e.g., your sentiment on "data" vs. "pricing" vs. "integrations"). Tier breakdown at /pricing.
Why it matters
A high citation rate with negative sentiment is a different problem than a low citation rate. The first means AI knows you and is critical; the second means AI doesn't know you yet. Different prescriptions, different content priorities. Sentiment is the qualitative signal that makes the quantitative metrics actionable.
4. Multi-engine sampling
Why we sample 5 engines
Different AI engines train on different data, retrieve from different web indexes, and have different prompt-following behaviors. A brand strong in ChatGPT may be invisible in Gemini. A single-engine measurement misses most of the picture.
Yogoo samples all 5 of the engines that drive B2B SaaS buyer discovery today:
- Perplexity — pure retrieval-augmented; closest to "search 2.0"
- ChatGPT (gpt-4o, web tool enabled) — most-used by buyers
- Gemini (Pro, with Search) — Google's AI discovery surface
- Claude (Sonnet, with WebSearch) — fastest-growing in 2026
- Google AI Overviews (AIO) — embedded in Google Search; shapes default awareness for buyers who don't yet use a chatbot
How the breakdown is reported
The public widget shows the aggregate Citation Hit Rate across all 5 engines.
Subscribers see the per-engine breakdown — which engines cite you, which don't, and how the gap has trended over the last 30 days.
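A sketch of how per-engine tallies roll up into the aggregate, assuming samples are pooled with equal weight across engines (the counts below are made up):

```python
# Hypothetical per-engine tallies: (hits, samples); real counts come from our runs.
per_engine = {
    "perplexity": (120, 200),
    "chatgpt":    (90, 200),
    "gemini":     (40, 200),
    "claude":     (70, 200),
    "aio":        (55, 200),
}

def aggregate_hit_rate(tallies: dict[str, tuple[int, int]]) -> float:
    """Pool samples across engines; assumes equal weight per sample, not per engine."""
    hits = sum(h for h, _ in tallies.values())
    total = sum(n for _, n in tallies.values())
    return hits / total

def per_engine_breakdown(tallies: dict[str, tuple[int, int]]) -> dict[str, float]:
    """The subscriber view: which engines cite you, and how strongly."""
    return {engine: h / n for engine, (h, n) in tallies.items()}

print(aggregate_hit_rate(per_engine))  # 0.375
```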
5. Confidence band
LLMs are stochastic
Run the same prompt against ChatGPT twice; you'll get two different answers. That isn't a bug — it's how LLMs work. They sample from a probability distribution every time they generate text.
Ask an AI assistant "best CRM" once and you might get Salesforce, HubSpot, and Pipedrive. Ask again and you might get Salesforce, Attio, and Close.com. Both are valid outputs.
A single AI answer is a coin flip. To estimate Citation Hit Rate, you need many flips.
How we report confidence
Yogoo samples many AI answers per category. From that sample we compute a 95% confidence interval — standard binomial-distribution math. The more samples, the tighter the band.
A score reported as 67 [62–72] means: we're 95% confident the true score sits between 62 and 72.
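"Standard binomial math" in practice: the Wilson score interval is one common construction for a binomial proportion, sketched here for illustration (whether we use Wilson or another interval variant, the behavior is the same: more samples, tighter band):

```python
import math

def wilson_interval(hits: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """95% Wilson score interval for a binomial proportion.
    One standard construction; shown for illustration."""
    p = hits / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return center - half, center + half

# 335 hits in 500 sampled answers -> roughly (0.63, 0.71),
# i.e. about 67 [63-71] on the 0-100 scale
low, high = wilson_interval(335, 500)
```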
Category coverage at launch
Yogoo measures Citation Hit Rate within 30 categories. Categories are clusters of buyer-discovery prompts that map to how real buyers ask AI assistants for product recommendations.
| # | Category | Example prompts |
|---|---|---|
| 1 | E-commerce platforms | "best e-commerce platforms" / "Shopify alternatives" |
| 2 | Customer support / Help desk | "best customer support software" / "Zendesk alternatives" |
| 3 | Content management systems | "best CMS for SaaS" / "Webflow vs. WordPress" |
| 4 | Email marketing | "best email marketing platforms" / "Mailchimp alternatives" |
| 5 | Project management | "best project management tools" / "Asana vs. Linear" |
| 6 | CRM | "best CRM for B2B SaaS" / "Salesforce alternatives" |
| 7 | Marketing automation | "marketing automation platforms" / "HubSpot competitors" |
| 8 | Analytics / Business intelligence | "best analytics platforms" / "Amplitude vs. Mixpanel" |
| 9 | Recruiting / ATS | "best applicant tracking systems" |
| 10 | HR software | "best HRIS platforms" / "Rippling alternatives" |
| 11 | Accounting / Finance SaaS | "best accounting software for startups" |
| 12 | Legal SaaS | "best contract management tools" |
| 13 | Developer tools / IDE | "VS Code vs. JetBrains" |
| 14 | API platforms | "best API platforms" / "Postman alternatives" |
| 15 | Observability / APM | "best observability platforms" / "Datadog alternatives" |
| 16 | Logging / Tracing | "best logging tools" |
| 17 | SaaS security | "best SaaS security platforms" |
| 18 | Identity / SSO / Auth | "Auth0 vs. Okta" |
| 19 | Cloud hosting / IaaS | "AWS vs. GCP" |
| 20 | CDN / Edge | "best CDN providers" / "Cloudflare alternatives" |
| 21 | CI/CD | "best CI/CD platforms" |
| 22 | Data warehouse / ETL | "Snowflake vs. BigQuery" |
| 23 | Customer data platforms | "best CDP" / "Segment alternatives" |
| 24 | Product analytics | "best product analytics" |
| 25 | A/B testing | "best A/B testing platforms" |
| 26 | Knowledge base / Docs | "best documentation platforms" |
| 27 | Design tools / UX research | "best design tools" / "Figma alternatives" |
| 28 | GEO / SEO / AI-citation tools | "best GEO tools" / "Yogoo vs. Profound" |
| 29 | Workflow automation / iPaaS | "best workflow automation" |
| 30 | Async video communication | "best async video tools" / "Loom alternatives" |
The full prompt set within each category is shared with subscribers. We add new categories monthly when ≥5 people request the same one.
Don't see your category? Request it and we'll email you when it's added →
What we don't measure
Yogoo measures whether AI assistants cite your domain in buyer-discovery queries. Here's what we don't:
- Cross-vertical credit. A brand strong in marketing automation doesn't get credit in CRM if buyers don't ask about it that way. Citation Hit Rate is bounded to the category you're measured in.
- Query bias. Some prompts skew toward incumbent brands ("best CRM" disproportionately cites Salesforce regardless of context, simply because Salesforce dominated CRM media for two decades). Bias-adjusted scores are a hard measurement problem; the score you see reflects raw citation, not bias-adjusted citation.
If you want a measurement we don't yet provide, tell us: contact@yogoo.ai