How AI visibility is measured

Two tools can check the same brand and still report different numbers. That is rarely a bug; it is almost always the measurement method.

The short version

✓ Two tools can measure the same brand and legitimately report different numbers. None of them alone is "the truth".
✓ Collection: some tools scrape a personalized consumer session; others query the API with live web search, a path that is reproducible and auditable.
✓ Sampling: a single reading per day carries the full random variance; deterministic repeats with fixed parameters lower it.
✓ Tools count different things, from a bare mention to the source actually cited. Judge a tool by whether its method is disclosed.

The sections below walk through each of these levers, without crowning a winner. Once you understand how the measuring is done, you can read any number in context, including the ones from Achtung.app.

1. How the answers are collected

Before anything gets counted, a tool has to query the AI platform in the first place. Two schools have emerged for doing that.

Browser simulation (UI scraping)

The tool simulates a real browser session and reads the answer from the consumer interface of ChatGPT, Gemini and others. Upside: it sees exactly what the platform serves a user at that moment. Downside: the answer can be shaped by account, memory, plan tier and A/B tests, is harder to reproduce, and depends on the interface not changing. Peec.ai is a well-known representative of this school.

API with live web search

The tool calls the providers' official search-grounded APIs, which run a live web search per query. Upside: reproducible, auditable thanks to fixed sampling parameters, and independent of UI changes or anti-scraping measures. Downside: it measures the path agents and applications take, not every personalization of the consumer interface.

On top of that, scraping an interface means committing to exactly one configuration. A free account often gets a different model than a paid one, and a fast answer mode can cite different sources than a thorough reasoning mode. The scraped result therefore holds only for the account and mode that were used. The models themselves change on both paths, because the providers keep updating them. With the API, though, it is on record which model answered a query, so a change is visible instead of silent.

How Achtung.app does it

Achtung.app measures through the official APIs with live web search per query, deterministic sampling parameters, and a citation with URL per response. Which model answered a query is logged, so every measurement stays reproducible. More on this in the methodology.
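To make "on record" concrete, here is a minimal sketch of what an auditable measurement record could look like. It is not Achtung.app's implementation: the field names, the `measure` function and the `call` adapter are all hypothetical, and the adapter is assumed to wrap a provider's search-grounded API and return a dict with "model", "answer" and "cited_urls".

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Callable, List, Optional


@dataclass
class Measurement:
    """One logged API response. All field names are illustrative."""
    prompt: str
    provider: str
    model: str               # model identifier as reported in the response
    temperature: float
    seed: Optional[int]      # None where the provider offers no seed parameter
    answer: str
    cited_urls: List[str]
    collected_at: str        # UTC timestamp in ISO 8601 format


def measure(prompt: str, provider: str, call: Callable[..., dict],
            temperature: float = 0.0, seed: Optional[int] = None) -> Measurement:
    # `call` is a hypothetical adapter, assumed to return a dict with the
    # keys "model", "answer" and "cited_urls".
    response = call(prompt, temperature=temperature, seed=seed)
    return Measurement(
        prompt=prompt,
        provider=provider,
        model=response["model"],
        temperature=temperature,
        seed=seed,
        answer=response["answer"],
        cited_urls=list(response["cited_urls"]),
        collected_at=datetime.now(timezone.utc).isoformat(),
    )


def to_log_line(m: Measurement) -> str:
    """Serialize the record as one JSON line for an append-only audit log."""
    return json.dumps(asdict(m), sort_keys=True)
```

Because the model identifier and the sampling parameters are stored next to the answer, a later shift in the numbers can be checked against a model change instead of being guessed at.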

2. Single sample versus deterministic repeats

AI models are non-deterministic by default: the same question can yield slightly different answers from one run to the next, not just on two different days. How a tool handles this decides how reliable a trend is.

Many tools run each prompt once a day. That is cheap and produces a trend line, but every single measurement carries the model's full random variance. A spike up or down can be real movement or simply noise.

How Achtung.app does it

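Achtung.app runs every query at temperature 0 with fixed seeds where the provider supports them, plus multiple runs per keyword. That lowers variance, so a change in the curve is more likely real movement than noise. It does not remove variance entirely, because the live web search behind each query can return different results over time.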
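The effect of multiple runs can be illustrated with the standard error of a proportion. The sketch below is a simplified model, not a description of any tool's pipeline: it assumes each run independently mentions the brand with a fixed probability, and `simulate_runs` is a hypothetical stand-in for real queries. Fixed sampling parameters reduce the per-run variance separately and are not modeled here.

```python
import math
import random
from typing import List


def mention_rate(results: List[bool]) -> float:
    """Share of runs in which the brand was mentioned."""
    return sum(results) / len(results)


def standard_error(p: float, n: int) -> float:
    """Standard error of a proportion p estimated from n independent runs."""
    return math.sqrt(p * (1 - p) / n)


def simulate_runs(true_rate: float, n: int, rng: random.Random) -> List[bool]:
    """Stand-in for n real queries: each run mentions the brand
    with probability true_rate (a simplifying assumption)."""
    return [rng.random() < true_rate for _ in range(n)]


if __name__ == "__main__":
    # Assumed underlying mention rate of 40 percent.
    for n in (1, 5, 20):
        print(n, round(standard_error(0.4, n), 2))
    # prints: 1 0.49, then 5 0.22, then 20 0.11
```

Under these assumptions a single daily reading is uncertain by roughly 49 percentage points, while twenty runs bring that down to about 11, which is why a one-run spike is hard to tell apart from noise.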

3. What actually gets counted

"Visibility" is not a single value. Tools count different things, and a metric's name does not always reveal what sits underneath:

✓ Mention: the brand is named in the answer, or it is not.
✓ Cited source: the model pulls in certain URLs to support the answer. This is the layer you can actually influence.
✓ Position and share: the brand appears at a certain spot and with a certain share against competitors.
✓ Sentiment: the mention is framed positively, neutrally or negatively.

A pure mention count and a source analysis can diverge sharply for the same brand. Achtung.app tracks both and discloses which sources dominate per platform. More on this under cited sources.
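The difference between the first two layers can be sketched in a few lines. This is an illustration only: the brand name, the domains and the simple substring match are assumptions, and a real tool would need more careful matching (aliases, misspellings, redirects).

```python
from typing import List
from urllib.parse import urlparse


def is_mentioned(answer: str, brand: str) -> bool:
    """Mention layer: the brand name appears in the answer text."""
    return brand.lower() in answer.lower()


def cited_domains(urls: List[str]) -> List[str]:
    """Reduce cited URLs to host names, dropping a leading 'www.'."""
    domains = []
    for url in urls:
        host = urlparse(url).netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        domains.append(host)
    return domains


def is_cited(urls: List[str], brand_domain: str) -> bool:
    """Source layer: one of the cited URLs is on the brand's own domain
    or on one of its subdomains."""
    return any(
        d == brand_domain or d.endswith("." + brand_domain)
        for d in cited_domains(urls)
    )


# Hypothetical answer: the brand is named, but the cited source is a
# third-party review site, so the two metrics disagree.
answer = "Popular options include ExampleBrand and OtherCo."
urls = ["https://www.review-site.example/best-tools"]
print(is_mentioned(answer, "ExampleBrand"))    # True
print(is_cited(urls, "examplebrand.example"))  # False
```

A tool that reports only the first value and a tool that reports only the second would both be correct here, and would still show different numbers.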

4. Which platforms, and why the count misleads

Tools advertise three to seven "models". Read that number with care: several of them are often different surfaces of the same provider, such as Google AI Mode, AI Overviews and Gemini, not independent sources.

More telling than the raw count is the overlap: the same brand can be heavily cited on one platform and absent on another. This asymmetry between providers says more about your position than a high average across many surfaces.
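Both points, counting real providers and looking at asymmetry, can be sketched as follows. The surface-to-provider mapping and the record format are assumptions made for illustration, not a description of how any particular tool groups its data.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Illustrative mapping from advertised surface to the provider behind it.
SURFACE_PROVIDER = {
    "Google AI Mode": "Google",
    "AI Overviews": "Google",
    "Gemini": "Google",
    "ChatGPT": "OpenAI",
}


def independent_providers(surfaces: List[str]) -> Set[str]:
    """Collapse advertised surfaces into the distinct providers behind them."""
    return {SURFACE_PROVIDER[s] for s in surfaces}


def citation_rate_by_provider(
    records: Iterable[Tuple[str, bool]]
) -> Dict[str, float]:
    """records: (surface, brand_was_cited) pairs, one per measured answer.
    Returns the share of answers citing the brand, per provider."""
    hits: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for surface, cited in records:
        provider = SURFACE_PROVIDER[surface]
        totals[provider] += 1
        hits[provider] += int(cited)
    return {p: hits[p] / totals[p] for p in totals}


surfaces = ["Google AI Mode", "AI Overviews", "Gemini", "ChatGPT"]
print(len(surfaces), len(independent_providers(surfaces)))  # 4 2

records = [("Gemini", True), ("AI Overviews", True),
           ("ChatGPT", False), ("ChatGPT", False)]
print(citation_rate_by_provider(records))
# {'Google': 1.0, 'OpenAI': 0.0}
```

In this made-up example an average across all four answers would read 50 percent, which hides that the brand is always cited on one provider and never on the other.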

5. Why the same brand scores differently

Put the four levers together and it is clear: two credible tools can measure the same brand and legitimately report different numbers. One scrapes the interface and captures a personalized session; the other queries the API with a fixed sample. One counts mentions; the other counts cited sources.

None of these values is "the truth". Each measures a particular slice under particular assumptions. What matters is that the assumptions are on the table, so you know what the number means and what it does not.

A side-by-side of the common tools by provider, collection method and price is in the tool comparison.

How to read any tool's number

Five questions that put any AI visibility number in context, no matter which vendor it comes from:

How is it collected? Through a scraped consumer interface or through the API with live web search?
How often and how stable? A single daily reading or multiple runs with fixed sampling parameters?
What is counted? Bare mentions or the sources actually cited?
How many real providers? Independent platforms or several surfaces of the same house?
Is the method disclosed? Can you read up on how it measures, or does it stay a black box?

FAQ

Why do two AI visibility tools report different numbers for the same brand?

Because they measure differently. Collection (UI scraping versus API with live web search), sampling (once a day versus multiple deterministic runs), the quantity counted (mention versus cited source) and provider selection all vary from tool to tool. Each number describes a particular slice under particular assumptions; none of them alone is "the truth".

Is browser scraping or the API the better approach?

Both have merit. Scraping the consumer interface shows what a user sees at that moment, but is harder to reproduce and depends on personalization and UI changes. The API with live web search is reproducible and auditable and reflects the path agents and applications take. Achtung.app measures through the API because reproducibility and citation evidence are decisive for trend data.

What does deterministic sampling mean?

AI models answer slightly differently by nature. With temperature 0 and fixed seeds where supported, plus multiple runs per keyword, that variance drops. A change in the curve is then more likely real movement than random noise.

Does a mention say the same thing as a cited source?

No. A mention means the brand is named in the text. A cited source is the URL the model draws on to support its answer. The source layer is where you can actually change something, which is why Achtung.app tracks both separately.

How many AI platforms should a tool cover?

The count alone says little, because several advertised "models" are often just different surfaces of the same provider. What matters more is that the covered platforms run a live web search per query and that the tool discloses the overlap between them. Achtung.app tracks four search-grounded providers on every plan.