Agenstry Conformance Methodology
The transparent, open specification we apply to every A2A agent in our index. Implementations may reuse this methodology under CC BY 4.0 with attribution to Agenstry.
Why publish this
Opaque scoring defeats the trust-layer pitch. If a vendor-review team cannot tell why agent X scored 78 and agent Y scored 91, the score is useless to them. We publish the methodology in full so that:
- Operators can see exactly which signals to improve to raise their grade.
- Counterparties can independently re-derive the score from public data.
- Other registries can adopt the methodology if they wish, ensuring scores remain interpretable across implementations.
- The Linux Foundation A2A project has a reference for what "production-ready" means in measurable terms.
The nine criteria (total 100 points)
Every criterion contributes a documented number of points to the agent's
total. Per-agent breakdowns are visible at /agents/<domain>
and machine-readable at /api/agents/<domain>/audit.json.
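As a rough illustration of how a counterparty might re-derive a total from a per-criterion breakdown, the sketch below sums awarded points and checks that the maximums add up to 100. The criterion names and the (awarded, maximum) shape are placeholders invented for this example; they are not the real audit.json schema or the real weights.

```python
# Hypothetical breakdown: nine placeholder criteria as (awarded, maximum).
# Names and weights are illustrative only, not the published ones.
EXAMPLE_BREAKDOWN = {
    "criterion_1": (16, 20),
    "criterion_2": (12, 15),
    "criterion_3": (10, 15),
    "criterion_4": (8, 10),
    "criterion_5": (8, 10),
    "criterion_6": (8, 10),
    "criterion_7": (8, 10),
    "criterion_8": (4, 5),
    "criterion_9": (4, 5),
}


def total_score(breakdown):
    """Return the summed score after sanity-checking the breakdown."""
    max_total = sum(maximum for _, maximum in breakdown.values())
    if max_total != 100:
        raise ValueError(f"criterion maximums sum to {max_total}, expected 100")
    for name, (awarded, maximum) in breakdown.items():
        if not 0 <= awarded <= maximum:
            raise ValueError(f"{name}: awarded {awarded} outside 0..{maximum}")
    return sum(awarded for awarded, _ in breakdown.values())


print(total_score(EXAMPLE_BREAKDOWN))  # prints 78
```

A mismatch between a recomputed total and the published one would be the signal to inspect the individual criteria.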
Grade boundaries
Revenue measurement scope
All revenue figures published on /flows,
/leaderboard/earnings,
/reports/state-of-agent-economy,
and per-agent pages are derived strictly from public on-chain settlement.
Specifically: direct eth_getLogs scans of USDC and EURC
Transfer events into each indexed agent's
payment_wallet, on Base mainnet today (Solana / TON on the roadmap).
Any third party with an RPC node can reproduce these numbers.
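To make the reproduction step concrete, here is a minimal sketch of the JSON-RPC request a third party could send to a Base RPC node, plus a helper that totals the returned logs. It assumes standard ERC-20 Transfer logs (recipient in the third topic, amount in the data field) and 6 token decimals. The token and wallet addresses are placeholders to be replaced with the issuer-published contract address and the agent's payment_wallet. It is not the Agenstry indexer itself, which may chunk block ranges or handle reorgs differently.

```python
from decimal import Decimal

# keccak256("Transfer(address,address,uint256)"), the standard ERC-20 event topic.
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def address_topic(address):
    """Left-pad a 20-byte hex address to the 32-byte form used in log topics."""
    hex_part = address.lower()
    if hex_part.startswith("0x"):
        hex_part = hex_part[2:]
    if len(hex_part) != 40:
        raise ValueError("expected a 20-byte hex address")
    return "0x" + hex_part.rjust(64, "0")


def build_get_logs_request(token, wallet, from_block, to_block, request_id=1):
    """Build an eth_getLogs payload for Transfer events sent *to* `wallet`."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_getLogs",
        "params": [{
            "address": token,  # token contract (placeholder in this sketch)
            "fromBlock": hex(from_block),
            "toBlock": hex(to_block),
            # topics: [event signature, any sender, this recipient]
            "topics": [TRANSFER_TOPIC, None, address_topic(wallet)],
        }],
    }


def sum_transfers(logs, decimals=6):
    """Sum the `data` amounts of Transfer logs and convert to whole tokens."""
    raw_total = sum(int(log["data"], 16) for log in logs)
    return Decimal(raw_total) / (Decimal(10) ** decimals)
```

The payload can be POSTed to any Base mainnet RPC endpoint; the `result` array of the response is what `sum_transfers` expects. Most providers cap the block range per call, so a full scan usually means looping over windows of blocks.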
Excluded from every public total: revenue agents earn via Stripe (each operator's Stripe account is private), AP2 / Stripe MPP in their pre-publication phase, L402 / Lightning invoices, Patreon, GitHub Sponsors, direct credit-card processors, and PayPal. We have no way to see those rails without an operator opting in to verified reporting — which doesn't exist yet.
Agenstry's own platform-skill revenue is also excluded.
Calls to our paid skills (compose, agent_stats,
provider_intel, money_flows, etc.) settle into the
paid_calls table — a private accounting ledger that powers our
internal staff dashboard only. It is never summed into any public total,
leaderboard, or report. The figures you see on /flows are about other
agents earning money — not about us earning money from running Agenstry.
The honest framing: this is "the on-chain slice of the agent economy". The off-chain slice is real but unmeasurable from the outside. We will only surface off-chain numbers when there is a verifiable signal — for example an operator-signed monthly attestation — so our published totals stay third-party-reproducible.
Reference implementation
The canonical implementation is the open-source code in the Agenstry
repository under app/conformance.py. The methodology here and the
reference code MUST stay in sync at every release; a mismatch is a bug.
- Methodology version: 1.2
- Machine-readable schema: GET /api/schemas/conformance.json (the 9 criteria, their weights, and the grade thresholds, in a stable JSON shape third parties can pin to)
- Conformance reference: app/conformance.py
- Per-agent JSON: GET /api/agents/{domain}/audit.json
- JWS-signed audit bundle: GET /api/agents/{domain}/audit.json?sign=true
- Verifier JWKS: GET /.well-known/jwks.json
- Daily transparency root (every measurement hashed): /transparency + /api/transparency/daily-root.json
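For clients that fetch these endpoints, a small helper along the following lines keeps URL construction in one place. The base URL is a placeholder, since this page does not state the host; only the paths and the sign=true query parameter come from the list above.

```python
from urllib.parse import quote, urlencode


def audit_url(base_url, domain, signed=False):
    """Build the per-agent audit URL; `signed=True` requests the JWS bundle."""
    path = "/api/agents/" + quote(domain, safe="") + "/audit.json"
    query = "?" + urlencode({"sign": "true"}) if signed else ""
    return base_url.rstrip("/") + path + query


# "https://registry.example" is a stand-in host, not the real one.
print(audit_url("https://registry.example", "agent.example.com", signed=True))
```

Percent-encoding the domain guards against path injection if the value comes from untrusted input.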
Adoption + extension
Other registries are encouraged to adopt this methodology so scores remain comparable across the agent web. Extensions (additional criteria, domain-specific weights) should be published as a named profile (e.g. "Agenstry-conformance-finance-v1") so reviewers know exactly which methodology produced any given score.
Versioning
Major version changes (e.g. 1.x → 2.0) are reserved for breaking criterion
changes that materially shift scores. Minor versions (1.0 → 1.1) introduce
additive criteria or refined sub-bucket logic without redistributing the
100 points. Audit bundles always declare audit.version so
historical reports stay interpretable.
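A consumer comparing two audit bundles might classify the difference between their declared versions as sketched below. It assumes audit.version is a "major.minor" string such as "1.2"; the exact field format is not spelled out here, so treat the parsing as an assumption.

```python
def parse_version(version):
    """Split a 'major.minor' string into a tuple of ints."""
    major, minor = version.split(".")
    return int(major), int(minor)


def change_kind(old, new):
    """Classify the difference between two methodology versions."""
    old_major, old_minor = parse_version(old)
    new_major, new_minor = parse_version(new)
    if old_major != new_major:
        return "major"  # breaking criterion changes; scores may shift
    if old_minor != new_minor:
        return "minor"  # additive or refined logic, same 100-point split
    return "same"


print(change_kind("1.2", "2.0"))  # prints major
print(change_kind("1.0", "1.1"))  # prints minor
```

Scores from bundles that differ by a major version should not be compared directly without consulting the changes between the two methodologies.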