Open Specification · Version 1.2 · 2026-05-15

Agenstry Conformance Methodology

The transparent, open specification we apply to every A2A agent in our index. Implementations may reuse this methodology under CC BY 4.0 with attribution to Agenstry.

Why publish this

Opaque scoring defeats the trust-layer pitch. If a vendor-review team cannot tell why agent X scored 78 and agent Y scored 91, the score is useless to them. We publish the methodology in full so that:

  • Operators can see exactly which signals to improve to raise their grade.
  • Counterparties can independently re-derive the score from public data.
  • Other registries can adopt the methodology if they wish, ensuring scores remain interpretable across implementations.
  • The Linux Foundation A2A project has a reference for what "production-ready" means in measurable terms.

The nine criteria (total 100 points)

Every criterion contributes a documented number of points to the agent's total. Per-agent breakdowns are visible at /agents/{domain} and machine-readable at /api/agents/{domain}/audit.json.

# | Criterion | Max pts | What it measures
1 | Valid AgentCard | 10 | A schema-valid agent-card.json is reachable at the well-known URL.
2 | Live JSON-RPC | 25 | Endpoint answers message/send with a valid JSON-RPC 2.0 response. Sub-buckets cover auth-gated, wrong-shape, and unreachable.
3 | Protocol version | 10 | Declares modern A2A. Bonus for supportedInterfaces[] (added in v1.0). Pre-1.0 is partial credit.
4 | JWS signature | 10 | Card carries a JWS signature and Agenstry verifies it against the provider's /.well-known/jwks.json (provider-bound, never inline JWK).
5 | Uptime track record | 15 | Linear in the success ratio of historical probes. Requires ≥5 probes for a graded score.
6 | Skill declaration | 10 | Number of structured skills[] entries: ≥3 → full, 1-2 → partial, 0 → fail.
7 | Verified Identity | 10 | Provider attribution PLUS authoritative-registry verification (GLEIF / Companies House / KvK / ABN / Handelsregister / EU BRIS / ISED / OpenCorporates). Active and name-matching → 10. Active but mismatched → 7. Declared but inactive → 2.
8 | Freshness + modern flags | 5 | Last seen in upstream sources within 7 days → 4 pts; +1 per declared modern capability flag (AP2, x402, UCP, …).
9 | Security declaration | 5 | mTLS → 5 pts; OAuth2 + PKCE (S256) → 4 pts; any scheme → 2; none → 0 (info, no penalty).
Total | | 100 |
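The criteria whose point values are fully stated in the table (5, 7, 8, and 9) can be sketched in a few lines. This is a minimal illustration, not the code in app/conformance.py. The function names, the scheme labels "mtls" and "oauth2_pkce_s256", the None result for fewer than 5 probes, the 0 for an undeclared identity, and the cap of criterion 8 at its 5-point maximum are all assumptions. Criteria 2, 3, and 6 are left out of the point arithmetic because the table does not state their partial-credit values; criterion 6 is shown as a tier only.

```python
from typing import Iterable, Optional


def uptime_points(successes: int, probes: int) -> Optional[float]:
    """Criterion 5: linear in the probe success ratio, max 15 points."""
    if probes < 5:
        # Assumption: below the 5-probe minimum there is no graded score.
        return None
    return 15.0 * successes / probes


def skill_tier(skill_count: int) -> str:
    """Criterion 6: tier only; partial-credit points are not stated in the table."""
    if skill_count >= 3:
        return "full"
    if skill_count >= 1:
        return "partial"
    return "fail"


def identity_points(declared: bool, active: bool, name_matches: bool) -> int:
    """Criterion 7: registry verification outcome mapped to points."""
    if not declared:
        return 0  # Assumption: no declared provider identity earns nothing.
    if not active:
        return 2
    return 10 if name_matches else 7


def freshness_points(days_since_seen: float, modern_flags: int) -> int:
    """Criterion 8: 4 points if seen within 7 days, +1 per modern flag."""
    base = 4 if days_since_seen <= 7 else 0
    # Assumption: the sum is capped at the criterion's 5-point maximum.
    return min(5, base + modern_flags)


def security_points(schemes: Iterable[str]) -> int:
    """Criterion 9: best declared scheme wins (labels are hypothetical)."""
    declared = set(schemes)
    if "mtls" in declared:
        return 5
    if "oauth2_pkce_s256" in declared:
        return 4
    return 2 if declared else 0
```

For example, an agent with 9 successful probes out of 10 would earn 13.5 of the 15 uptime points under this reading.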

Grade boundaries

  • A: ≥ 90
  • B: ≥ 75
  • C: ≥ 60
  • D: ≥ 40
  • F: < 40
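The boundaries above map a total score to a letter as follows. This is a small sketch; the function name grade is an assumption and is not taken from the reference implementation.

```python
def grade(score: float) -> str:
    """Map a 0-100 conformance score to its letter grade."""
    # Thresholds are inclusive lower bounds, checked from highest to lowest.
    for letter, floor in (("A", 90), ("B", 75), ("C", 60), ("D", 40)):
        if score >= floor:
            return letter
    return "F"
```

Under these boundaries the two agents mentioned earlier, scoring 78 and 91, land on B and A respectively.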

Revenue measurement scope

All revenue figures published on /flows, /leaderboard/earnings, /reports/state-of-agent-economy, and per-agent pages are derived strictly from public on-chain settlement. Specifically: direct eth_getLogs scans of USDC and EURC Transfer events into each indexed agent's payment_wallet, on Base mainnet today (Solana / TON on the roadmap). Any third party with an RPC node can reproduce these numbers.
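A third party reproducing these numbers would issue an eth_getLogs call filtered on the ERC-20 Transfer event with the agent's wallet as recipient. The sketch below only builds the JSON-RPC request and decodes an amount; it does not contact a node. The helper names are hypothetical, the token contract address must be supplied by the caller, and the default of 6 decimals is an assumption that should be checked against the token contract.

```python
from decimal import Decimal

# keccak256("Transfer(address,address,uint256)"): the standard ERC-20 event topic.
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def address_topic(address: str) -> str:
    """Left-pad a 20-byte address to the 32-byte form used for indexed topics."""
    hex_part = address.lower()
    if hex_part.startswith("0x"):
        hex_part = hex_part[2:]
    if len(hex_part) != 40:
        raise ValueError("expected a 20-byte hex address")
    return "0x" + hex_part.rjust(64, "0")


def build_get_logs_request(token_contract: str, payment_wallet: str,
                           from_block: int, to_block: int,
                           request_id: int = 1) -> dict:
    """Build an eth_getLogs request for transfers INTO payment_wallet."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_getLogs",
        "params": [{
            "address": token_contract,
            "fromBlock": hex(from_block),
            "toBlock": hex(to_block),
            # topics: [event signature, sender (None = any), recipient]
            "topics": [TRANSFER_TOPIC, None, address_topic(payment_wallet)],
        }],
    }


def transfer_amount(log: dict, decimals: int = 6) -> Decimal:
    """Decode the uint256 value carried in a Transfer log's data field."""
    return Decimal(int(log["data"], 16)) / (Decimal(10) ** decimals)
```

The request body can be POSTed to any Base mainnet RPC endpoint; summing transfer_amount over the returned logs gives the inbound total for that token and block range. Decimal is used instead of float so that token amounts are not subject to binary rounding.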

Excluded from every public total: revenue agents earn via Stripe (each operator's Stripe account is private), AP2 / Stripe MPP in their pre-publication phase, L402 / Lightning invoices, Patreon, GitHub Sponsors, direct credit-card processors, and PayPal. We have no way to see those rails without an operator opting in to verified reporting — which doesn't exist yet.

Agenstry's own platform-skill revenue is also excluded. Calls to our paid skills (compose, agent_stats, provider_intel, money_flows, etc.) settle into the paid_calls table — a private accounting ledger that powers our internal staff dashboard only. It is never summed into any public total, leaderboard, or report. The figures you see on /flows are about other agents earning money — not about us earning money from running Agenstry.

The honest framing: this is "the on-chain slice of the agent economy". The off-chain slice is real but unmeasurable from the outside. We will only surface off-chain numbers when there is a verifiable signal — for example an operator-signed monthly attestation — so our published totals stay third-party-reproducible.

Reference implementation

The canonical implementation is the open-source code in the Agenstry repository under app/conformance.py. The methodology here and the reference code MUST stay in sync at every release; a mismatch is a bug.

  • Methodology version: 1.2
  • Machine-readable schema: GET /api/schemas/conformance.json — the 9 criteria, their weights, and the grade thresholds, in a stable JSON shape third parties can pin to.
  • Conformance reference: app/conformance.py
  • Per-agent JSON: GET /api/agents/{domain}/audit.json
  • JWS-signed audit bundle: GET /api/agents/{domain}/audit.json?sign=true
  • Verifier JWKS: GET /.well-known/jwks.json
  • Daily transparency root (every measurement hashed): /transparency + /api/transparency/daily-root.json
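The per-agent endpoints listed above can be addressed with a small helper. This is a sketch under assumptions: base_url is a placeholder for the registry's origin (this document does not state it), the helper names are hypothetical, and only the unsigned response is parsed as JSON because the format of the signed bundle is not described here.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def audit_url(base_url: str, domain: str, signed: bool = False) -> str:
    """Build the audit.json URL for an agent domain, optionally requesting a signed bundle."""
    url = f"{base_url.rstrip('/')}/api/agents/{quote(domain, safe='')}/audit.json"
    if signed:
        url += "?" + urlencode({"sign": "true"})
    return url


def fetch_audit(base_url: str, domain: str, timeout: float = 10.0) -> dict:
    """Fetch and parse the unsigned per-agent audit JSON."""
    with urlopen(audit_url(base_url, domain), timeout=timeout) as resp:
        return json.load(resp)
```

A signed bundle would then be checked against the keys published at /.well-known/jwks.json with a JWS library of the caller's choice.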

Adoption + extension

Other registries are encouraged to adopt this methodology so scores remain comparable across the agent web. Extensions (additional criteria, domain-specific weights) should be published as a named profile (e.g. "Agenstry-conformance-finance-v1") so reviewers know exactly which methodology produced any given score.

Versioning

Major version changes (e.g. 1.x → 2.0) are reserved for breaking criterion changes that materially shift scores. Minor versions (1.0 → 1.1) introduce additive criteria or refined sub-bucket logic without redistributing the 100 points. Audit bundles always declare audit.version so historical reports stay interpretable.