Agent Disco

Check catalogue

Every check we run when grading a site, grouped by category. Higher-weight checks move the grade more when they pass or fail.
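
To make the role of weights concrete, here is a minimal sketch that assumes the grade is simply the weight-weighted share of passing checks. The real grading formula is not stated on this page, so `CheckResult` and `weighted_score` are hypothetical names and the formula itself is an assumption.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    check_id: str   # e.g. "crawl.robots_txt"
    weight: int     # weight column from the catalogue
    passed: bool


def weighted_score(results: list[CheckResult]) -> float:
    """Return a 0-100 score: passing weight divided by total weight.

    Assumption: every check is pass/fail and skipped checks are left out
    of `results` by the caller.
    """
    total = sum(r.weight for r in results)
    if total == 0:
        return 0.0
    earned = sum(r.weight for r in results if r.passed)
    return 100.0 * earned / total
```

Under this assumption, failing a weight-13 check costs more than four times as much as failing a weight-3 check.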

Well-known URIs

Well-known URIs checks
| Check | ID | Phase | Weight | Description |
| --- | --- | --- | --- | --- |
| /.well-known/agent.json (A2A AgentCard) | `well_known.agent_json` | passive | 10 | Looks for an A2A AgentCard at `/.well-known/agent.json` ([a2aprotocol.ai](https://a2aprotocol.ai)) and checks for the four load-bearing top-level keys: `name`, … |
| /.well-known/ai-plugin.json manifest | `well_known.ai_plugin_json` | passive | 8 | Looks for the ChatGPT-plugin manifest at `/.well-known/ai-plugin.json` and counts how many of the load-bearing OpenAI-schema keys are present (`name_for_human`,… |
| /.well-known/mcp.json (Model Context Protocol) | `well_known.mcp_json` | passive | 8 | Looks for an MCP (Model Context Protocol) manifest at `/.well-known/mcp.json`. Records the top-level keys but only insists on the presence of at least one MCP i… |
| /.well-known/openapi.{json,yaml} | `well_known.openapi` | passive | 6 | Probes the two RFC 8615 well-known OpenAPI paths (`/.well-known/openapi.json`, `/.well-known/openapi.yaml`) and confirms the body is OpenAPI 3.x. This is narrow… |
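
The OpenAPI 3.x confirmation in the last row can be sketched as follows. This is an illustration only: it handles the JSON variant (the YAML path would need a YAML parser), and the helper names `candidate_urls` and `is_openapi_3_json` are hypothetical.

```python
import json

# The two well-known paths named in the check description.
WELL_KNOWN_OPENAPI_PATHS = (
    "/.well-known/openapi.json",
    "/.well-known/openapi.yaml",
)


def candidate_urls(origin: str) -> list[str]:
    """Build the URLs to probe for a given origin such as https://example.com."""
    return [origin.rstrip("/") + path for path in WELL_KNOWN_OPENAPI_PATHS]


def is_openapi_3_json(body: str) -> bool:
    """True when the body is a JSON object whose `openapi` key starts with '3.'."""
    try:
        doc = json.loads(body)
    except json.JSONDecodeError:
        return False
    if not isinstance(doc, dict):
        return False
    version = doc.get("openapi")
    return isinstance(version, str) and version.startswith("3.")
```

A Swagger 2.0 document carries a `swagger` key instead of `openapi`, so it is rejected by this test.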

Root-level files

Root-level files checks
| Check | ID | Phase | Weight | Description |
| --- | --- | --- | --- | --- |
| /ai.txt AI-crawler directives | `root_level.ai_txt` | passive | 4 | Looks for `/ai.txt`, a secondary, still-draft root-level declaration of AI-crawler directives. Multiple drafts compete, so this check only records presence + a… |
| /llms-full.txt long-form index | `root_level.llms_full_txt` | passive | 8 | Looks for `/llms-full.txt`, the long-form companion to `/llms.txt` ([llmstxt.org](https://llmstxt.org)). Passes on a substantive (≥ 1 KB) text body; warns on s… |
| /llms.txt index for LLMs | `root_level.llms_txt` | passive | 8 | Looks for `/llms.txt` at the site root: a Markdown index of important URLs + summaries for LLM consumers ([llmstxt.org](https://llmstxt.org)). Presence is the … |
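
The size rule for `/llms-full.txt` might look like the sketch below. Only the 1 KB pass threshold comes from the description above; treating a shorter non-empty body as a warning and a missing or empty body as a failure are assumptions, and `grade_llms_full` is a hypothetical name.

```python
from typing import Optional

MIN_SUBSTANTIVE_BYTES = 1024  # the "≥ 1 KB" threshold from the check description


def grade_llms_full(status: int, body: Optional[bytes]) -> str:
    """Classify a fetched /llms-full.txt response as 'pass', 'warn' or 'fail'."""
    # Assumption: anything other than a 200 with a non-blank body is a failure.
    if status != 200 or body is None or not body.strip():
        return "fail"
    if len(body) >= MIN_SUBSTANTIVE_BYTES:
        return "pass"
    # Assumption: a present but short body earns a warning.
    return "warn"
```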

Crawl & indexing

Crawl & indexing checks
| Check | ID | Phase | Weight | Description |
| --- | --- | --- | --- | --- |
| RSS/Atom feed | `crawl.feed` | passive | 4 | Looks for an RSS/Atom feed via the conventional paths (`/feed`, `/feed.xml`, `/rss`, `/atom.xml`) and the homepage `<link rel="alternate">` tag. The link-alternate declaration is the… |
| robots.txt AI-agent rules | `crawl.robots_txt` | passive | 13 | Parses `/robots.txt` and checks whether the major AI crawlers (GPTBot, ClaudeBot, PerplexityBot, CCBot, Google-Extended, and others) are allowed to crawl the si… |
| XML sitemap discovery | `crawl.sitemap` | passive | 10 | Looks for a sitemap via both the conventional paths (`/sitemap.xml`, `/sitemap_index.xml`) and any `Sitemap:` directives in `/robots.txt`. Accepts `<urlset>` and `<sitemapindex>`. Ab… |
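
The robots.txt check can be approximated with Python's standard `urllib.robotparser`. The sketch below works on an already-fetched robots.txt body and covers only the five crawlers named above; `ai_crawler_access` is a hypothetical helper, and the real check may evaluate more agents or more paths.

```python
from urllib.robotparser import RobotFileParser

# The crawlers named in the check description ("and others" are not listed).
AI_CRAWLERS = ("GPTBot", "ClaudeBot", "PerplexityBot", "CCBot", "Google-Extended")


def ai_crawler_access(robots_txt: str, url: str = "https://example.com/") -> dict[str, bool]:
    """Map each AI crawler to whether robots.txt lets it fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in AI_CRAWLERS}


if __name__ == "__main__":
    sample = "User-agent: GPTBot\nDisallow: /\n\nUser-agent: *\nAllow: /\n"
    for agent, allowed in ai_crawler_access(sample).items():
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Crawlers without their own `User-agent` group fall back to the `*` group, which is why a site that blocks only GPTBot still allows the others in this sample.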

HTML & meta

HTML & meta checks
| Check | ID | Phase | Weight | Description |
| --- | --- | --- | --- | --- |
| meta description | `html_meta.description` | passive | 3 | Looks for a `<meta name="description">` on the homepage, and checks it's in the 50-300 character range most search engines and AI agents actually surface. Absence fails; an SPA shell (h… |
| JSON-LD structured data | `html_meta.json_ld` | passive | 8 | Parses every `<script type="application/ld+json">` block on the homepage and categorises by `@type`. `WebAPI` or `SoftwareApplication` is the primary agent signal: those entries explicitly decla… |
| Open Graph tags | `html_meta.open_graph` | passive | 4 | Looks for the three load-bearing Open Graph tags on the homepage: `og:title`, `og:description`, `og:type`. LLM-driven link previews and search result cards surf… |
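
The meta description and Open Graph rules can be sketched together with the standard-library HTML parser. `MetaCollector` and `inspect_head` are hypothetical names; the 50-300 character range and the three required `og:` tags come from the descriptions above, while the result shape is an assumption.

```python
from html.parser import HTMLParser

REQUIRED_OG = ("og:title", "og:description", "og:type")


class MetaCollector(HTMLParser):
    """Collects the meta description and any og:* properties from a page."""

    def __init__(self):
        super().__init__()
        self.description = None
        self.og = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        content = attributes.get("content") or ""
        if (attributes.get("name") or "").lower() == "description":
            self.description = content
        prop = attributes.get("property") or ""
        if prop.startswith("og:"):
            self.og[prop] = content


def inspect_head(html: str) -> dict:
    """Report whether the description length is in range and which OG tags are missing."""
    collector = MetaCollector()
    collector.feed(html)
    description_ok = (
        collector.description is not None
        and 50 <= len(collector.description) <= 300
    )
    missing_og = [tag for tag in REQUIRED_OG if not collector.og.get(tag)]
    return {"description_ok": description_ok, "missing_og": missing_og}
```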

API discoverability

API discoverability checks
| Check | ID | Phase | Weight | Description |
| --- | --- | --- | --- | --- |
| GraphQL introspection | `api.graphql_introspection` | passive | 8 | POSTs a minimal GraphQL introspection query (`{ __schema { queryType { name } } }`) to `/graphql`, `/api/graphql`, `/query`. Passes when the server returns the … |
| JSON error bodies for API callers | `api.json_error_body` | passive | 5 | Requests a random non-existent path with `Accept: application/json`. Passes when the server returns a JSON error body (`application/json` or `application/proble… |
| OpenAPI specification discovery | `api.openapi_discovery` | passive | 10 | Probes nine conventional paths for an OpenAPI spec (`/openapi.json`, `/api/openapi.yaml`, `/swagger.json`, etc.) and confirms the top-level `openapi` key declar… |
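
For the GraphQL introspection row, the request body and a plausible pass condition can be sketched as below. The query and the three paths are taken from the description; the pass condition (a `data.__schema.queryType.name` string in the response) is an assumption, since the description is cut off, and the function names are hypothetical.

```python
import json

INTROSPECTION_QUERY = "{ __schema { queryType { name } } }"
GRAPHQL_PATHS = ("/graphql", "/api/graphql", "/query")


def build_request_body() -> bytes:
    """JSON body for a POST with Content-Type: application/json."""
    return json.dumps({"query": INTROSPECTION_QUERY}).encode("utf-8")


def introspection_enabled(response_body: str) -> bool:
    """Assumed pass condition: the response names the schema's query type."""
    try:
        payload = json.loads(response_body)
    except json.JSONDecodeError:
        return False
    if not isinstance(payload, dict):
        return False
    data = payload.get("data")
    if not isinstance(data, dict):
        return False
    schema = data.get("__schema")
    if not isinstance(schema, dict):
        return False
    query_type = schema.get("queryType")
    return isinstance(query_type, dict) and isinstance(query_type.get("name"), str)
```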

Protocols

Protocols checks
| Check | ID | Phase | Weight | Description |
| --- | --- | --- | --- | --- |
| A2A AgentCard conformance | `protocols.a2a_agent_card` | passive | 8 | Deep conformance check on the A2A AgentCard. Requires `version`, a non-empty `skills` array (each with `name` + `description`) and a non-empty `endpoints` array… |
| Public MCP registry listing | `protocols.mcp_registry_presence` | passive | 10 | Searches the major public MCP registries (Smithery, mcp.so, PulseMCP, Glama) for the target host. A listing in any registry earns full credit: being catalogued… |
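
The AgentCard conformance rules that are visible above can be expressed as a small validator. This sketch covers only those stated requirements (`version`, non-empty `skills` with `name` and `description`, non-empty `endpoints`); the truncated description may impose more, and `agent_card_problems` is a hypothetical name.

```python
def agent_card_problems(card: dict) -> list[str]:
    """Return a list of conformance problems; an empty list means the card passes."""
    problems = []
    if not card.get("version"):
        problems.append("missing version")

    skills = card.get("skills")
    if not isinstance(skills, list) or not skills:
        problems.append("skills must be a non-empty array")
    else:
        for index, skill in enumerate(skills):
            has_fields = (
                isinstance(skill, dict)
                and skill.get("name")
                and skill.get("description")
            )
            if not has_fields:
                problems.append(f"skills[{index}] needs name and description")

    endpoints = card.get("endpoints")
    if not isinstance(endpoints, list) or not endpoints:
        problems.append("endpoints must be a non-empty array")
    return problems
```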

Registries

Registries checks
| Check | ID | Phase | Weight | Description |
| --- | --- | --- | --- | --- |
| GitHub public repository | `registries.github_repo` | passive | 5 | Searches GitHub for repositories whose name or topic matches the target host. Extra note in evidence when the repo carries an agent-relevant topic (`mcp-server`… |
| npm SDK package | `registries.npm_package` | passive | 6 | Queries the npm registry for packages plausibly attributable to the target: scoped packages at `@<name>/*`, plus any package whose name carries the target's bare hos… |
| PyPI SDK package | `registries.pypi_package` | passive | 6 | Direct-probes PyPI for `<name>` and `<name>-sdk`, the two names most likely to exist if an official Python SDK does. Skips when neither probe succeeds; does not fail, beca… |

Documentation

Documentation checks
| Check | ID | Phase | Weight | Description |
| --- | --- | --- | --- | --- |
| Docs platform discoverability | `docs.platform` | passive | 6 | Probes the conventional docs paths (`/docs`, `/documentation`, `/api`, `/api/docs`, `/reference`, `/developers`) and fingerprints the first hit. Recognises Mint… |
| SDK availability across languages | `docs.sdk_availability` | passive | 8 | Counts language-level SDKs by combining the npm + PyPI registry findings with any install commands scraped from the docs homepage (`npm install`, `pip install`,… |
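
The SDK-counting idea can be illustrated with the two install commands that are visible in the description. The remaining commands are cut off and are deliberately not guessed here; `sdk_languages` and the language labels are hypothetical.

```python
import re

# Only the two commands visible in the description; the full list is truncated.
INSTALL_PATTERNS = {
    "javascript": re.compile(r"\bnpm install\b"),
    "python": re.compile(r"\bpip install\b"),
}


def sdk_languages(npm_found: bool, pypi_found: bool, docs_text: str) -> set[str]:
    """Union of languages evidenced by registry hits and scraped install commands."""
    languages = set()
    if npm_found:
        languages.add("javascript")
    if pypi_found:
        languages.add("python")
    for language, pattern in INSTALL_PATTERNS.items():
        if pattern.search(docs_text):
            languages.add(language)
    return languages
```

Using a set means a language counted by both the registry probe and the docs scrape is only counted once.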

LLM training data

LLM training data checks
| Check | ID | Phase | Weight | Description |
| --- | --- | --- | --- | --- |
| Common Crawl index presence | `llm_training.common_crawl` | passive | 8 | Queries the Common Crawl CDX endpoint for the most recent monthly snapshot and counts the target's pages. Common Crawl's corpus underpins most open-source LLM t… |
| Hacker News mentions | `llm_training.hn_mentions` | passive | 5 | Queries the Algolia-hosted HN Search API for mentions of the target host. A handful of stories or comments indicate the service has been discussed enough to sho… |
| Wikipedia article | `llm_training.wikipedia` | passive | 8 | Searches Wikipedia for an article about the service (derived from the org part of the host) and checks whether the domain appears in the article's external link… |
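
The Hacker News row can be sketched against the public HN Search API, whose search responses include an `nbHits` total. The threshold for "a handful" is not given above, so `minimum=3` is an assumption, and the function names are hypothetical.

```python
import json
from urllib.parse import urlencode

HN_SEARCH_ENDPOINT = "https://hn.algolia.com/api/v1/search"


def hn_search_url(host: str) -> str:
    """URL that searches stories and comments for the target host."""
    return f"{HN_SEARCH_ENDPOINT}?{urlencode({'query': host})}"


def mention_count(response_body: str) -> int:
    """Total number of matches reported by the API (the nbHits field)."""
    payload = json.loads(response_body)
    return int(payload.get("nbHits", 0))


def has_enough_mentions(response_body: str, minimum: int = 3) -> bool:
    # Assumption: "a handful" is taken to mean at least `minimum` hits.
    return mention_count(response_body) >= minimum
```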

Anti-bot posture

Anti-bot posture checks
| Check | ID | Phase | Weight | Description |
| --- | --- | --- | --- | --- |
| Anti-bot interstitial | `anti_bot.cloudflare_interstitial` | passive | 10 | Fetches the homepage and flags Cloudflare / PerimeterX / Akamai / HUMAN interstitials by header (`cf-mitigated`, `cf-chl-*`, `x-px-captcha`, `x-akamai-session-i… |
| User-agent sniffing | `anti_bot.user_agent_sniffing` | passive | 5 | Fetches the homepage as `AgentDisco/1.0` and again as `curl/7.88.0`, then compares status, content-type, and body length. A large divergence indicates user-agen… |
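
The user-agent comparison can be sketched as a pure function over two already-fetched responses. What counts as a "large divergence" is not specified above, so the 50% body-length tolerance is an assumption, and `Snapshot` and `diverges` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """What one fetch of the homepage looked like for a given User-Agent."""
    status: int
    content_type: str
    body_length: int


def _media_type(content_type: str) -> str:
    # Drop parameters such as "; charset=utf-8" before comparing.
    return content_type.split(";")[0].strip().lower()


def diverges(a: Snapshot, b: Snapshot, length_tolerance: float = 0.5) -> bool:
    """True when the two responses differ enough to suggest UA sniffing."""
    if a.status != b.status:
        return True
    if _media_type(a.content_type) != _media_type(b.content_type):
        return True
    longest = max(a.body_length, b.body_length)
    if longest == 0:
        return False
    # Assumption: a relative length gap above the tolerance counts as "large".
    return abs(a.body_length - b.body_length) / longest > length_tolerance
```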

Identity & verification

Identity & verification checks
| Check | ID | Phase | Weight | Description |
| --- | --- | --- | --- | --- |
| Email auth (SPF, DMARC, DKIM) | `identity.email_auth` | passive | 5 | Looks up TXT records on the domain for SPF (`v=spf1`), DMARC (`_dmarc.<domain>`), and DKIM at the four most common selector names (`default`, `google`, `s1`, `selector1… |
| TLS + HSTS + HTTPS redirect | `identity.tls` | passive | 10 | Three-part trust check: TLS certificate validity (chain + dates), `Strict-Transport-Security` with `max-age >= 15552000` (180 days, roughly 6 months), a… |
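
The HSTS part of the TLS check reduces to parsing the `max-age` directive and comparing it with the 15552000-second threshold quoted above. The sketch below works on a header value that has already been fetched; `hsts_max_age` and `hsts_ok` are hypothetical names.

```python
import re
from typing import Optional

MIN_HSTS_MAX_AGE = 15552000  # threshold quoted in the check description

_MAX_AGE = re.compile(r'\s*max-age\s*=\s*"?(\d+)"?\s*', re.IGNORECASE)


def hsts_max_age(header_value: str) -> Optional[int]:
    """Extract max-age (in seconds) from a Strict-Transport-Security value."""
    for directive in header_value.split(";"):
        match = _MAX_AGE.fullmatch(directive)
        if match:
            return int(match.group(1))
    return None


def hsts_ok(header_value: Optional[str]) -> bool:
    """True when the header is present and max-age meets the threshold."""
    if not header_value:
        return False
    age = hsts_max_age(header_value)
    return age is not None and age >= MIN_HSTS_MAX_AGE
```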

Agent onboarding

Agent onboarding checks
| Check | ID | Phase | Weight | Description |
| --- | --- | --- | --- | --- |
| API-key / signup path discoverability | `onboarding.api_key_path` | passive | 6 | Looks for API-key signup discoverability: probes conventional paths (`/signup`, `/register`, `/developers`, `/api-keys`, `/account/api`) plus anchors on the homepage and … |