Scoring Methodology

Every tool is assessed across five pillars, each worth up to 20 points, for a maximum total score of 100. Scores reflect how well a tool supports autonomous AI agents, not general developer experience.

MCP Server

20 pts max
| Signal | Points |
|---|---|
| Official MCP server | +20 |
| Community MCP server | +12 |
| None | 0 |

The Model Context Protocol (MCP) is a standardized interface that lets LLMs discover and call tools. Official support scores higher because an official server is maintained alongside the product and is more likely to stay in sync with its API surface.
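Concretely, MCP messages are JSON-RPC 2.0, and calling a tool uses the `tools/call` method. A minimal sketch of constructing such a request (the tool name and arguments here are hypothetical, for illustration only):

```python
import json

def make_tools_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })

# "create_issue" and its arguments are invented for this example.
msg = make_tools_call(1, "create_issue", {"title": "Fix login bug"})
print(msg)
```

An agent-side MCP client sends messages like this over stdio or HTTP and reads the structured result back, which is what makes the protocol attractive for autonomous use.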

Platform API

20 pts max
| Signal | Points |
|---|---|
| Official REST/GraphQL API | +20 |
| Community API wrapper | +12 |
| None | 0 |

A well-documented programmatic API lets agents perform actions via HTTP without needing human-facing interfaces.
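For example, an agent can act on a tool by constructing an authenticated HTTP request against its API. A stdlib-only sketch (the endpoint, payload, and token are placeholders, not a real API):

```python
import json
import urllib.request

# Hypothetical endpoint and bearer token, for illustration; real APIs vary.
req = urllib.request.Request(
    url="https://api.example.com/v1/issues",
    data=json.dumps({"title": "Fix login bug"}).encode(),
    headers={
        "Authorization": "Bearer <token>",
        "Content-Type": "application/json",
    },
    method="POST",
)

# An agent would send this with urllib.request.urlopen(req) and parse the
# JSON response; here the request is only constructed, not sent.
print(req.get_method(), req.full_url)
```

The key property is that every step is machine-readable: the request body, the response, and the error codes, with no HTML scraping involved.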

CLI

20 pts max
| Signal | Points |
|---|---|
| Official CLI | +12 |
| Community CLI | +7 |
| None | 0 |
| JSON output flag (`--json`) | +4 |
| Non-interactive mode | +4 |

CLI tools let agents invoke commands in sandboxed environments. JSON output and non-interactive flags are critical for agent pipelines — they determine whether output can be parsed reliably.
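A typical agent pattern is to run the CLI non-interactively and parse its JSON output. A runnable sketch, which assumes the tool prints a single JSON document when passed a flag like `--json` (real flags vary by tool; the demo substitutes the Python interpreter for a real CLI so it runs anywhere):

```python
import json
import subprocess
import sys

def run_json_cli(argv: list[str]) -> dict:
    """Run a CLI non-interactively and parse its stdout as JSON.

    check=True raises on a non-zero exit code, so failures surface as
    exceptions instead of garbage being fed to json.loads.
    """
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)

# Stand-in command: the current interpreter emitting JSON, so the sketch
# is runnable without any particular tool installed.
data = run_json_cli(
    [sys.executable, "-c", "import json; print(json.dumps({'status': 'ok'}))"]
)
print(data["status"])  # prints "ok"
```

Without a JSON flag, an agent is left parsing human-formatted tables, which breaks whenever the tool changes its output layout.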

Agent Skills

20 pts max
| Signal | Points |
|---|---|
| Official skill/plugin | +8 |
| Community skill | +5 |
| None | 0 |
| Skill file (.md / config) | +4 |
| Agent rules file | +4 |
| Prompts library | +4 |

Skills are structured prompts or integrations that let agents use a tool with minimal guesswork. Skill files, agent rules, and prompt libraries reduce hallucination and improve task success rates.
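A skill file is typically a short markdown document the agent loads before using the tool. An entirely hypothetical sketch (the tool name, frontmatter fields, and commands are invented for illustration):

```markdown
---
name: example-tool
description: Create and manage issues with the example-tool CLI.
---

# Using example-tool

- List issues: `example-tool issues list --json`
- Create an issue: `example-tool issues create --title "<title>" --json`

Always pass `--json` and parse stdout; never scrape human-readable output.
```

Giving the agent the exact commands and output contract up front is what removes the guesswork.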

AI Docs

20 pts max
| Signal | Points |
|---|---|
| llms.txt | +5 |
| OpenAPI spec | +5 |
| AI-optimized quickstart | +5 |
| Copy-as-markdown on docs pages | +5 |

Documentation quality directly affects how well LLMs can reason about a tool. llms.txt provides a curated index; OpenAPI specs enable typed tool-calling; AI quickstarts reduce cold-start errors; copy-as-markdown means docs can be injected into context without manual reformatting.
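By convention (llmstxt.org), an llms.txt file is an H1 title, a blockquote summary, and H2 sections of annotated links. A minimal sketch with placeholder URLs:

```markdown
# Example Tool

> Example Tool is an issue tracker with a REST API, CLI, and MCP server.

## Docs

- [Quickstart](https://docs.example.com/quickstart.md): Install and authenticate
- [API reference](https://docs.example.com/api.md): REST endpoints and schemas
```

Because the file is plain markdown at a well-known path, an agent can fetch it and follow the links without crawling the whole docs site.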

Tier Classification

| Score | Tier |
|---|---|
| 95–100 | Tier 1 |
| 80–94 | Tier 2 |
| 60–79 | Tier 3 |
| < 60 | Needs Improvement |
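The mapping from pillar scores to a tier can be sketched as follows (the example pillar scores are illustrative, not a real assessment):

```python
PILLARS = ("MCP Server", "Platform API", "CLI", "Agent Skills", "AI Docs")

def tier(score: int) -> str:
    """Map a 0-100 total score to a tier, matching the table above."""
    if score >= 95:
        return "Tier 1"
    if score >= 80:
        return "Tier 2"
    if score >= 60:
        return "Tier 3"
    return "Needs Improvement"

# Hypothetical assessment: each pillar contributes at most 20 points.
scores = {"MCP Server": 20, "Platform API": 20, "CLI": 16,
          "Agent Skills": 12, "AI Docs": 15}
total = sum(scores[p] for p in PILLARS)
print(total, tier(total))  # prints "83 Tier 2"
```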

Assessment Process

All assessments are done manually. Each tool is researched against its official docs, GitHub repos, npm/PyPI packages, and community sources. Scores reflect the state of tooling at the time of last research.

The lastResearched date on each tool page indicates when the assessment was last verified. If you find outdated or incorrect data, open an issue or PR on the Agent Stack repository.

Community contributions (e.g. a third-party MCP server or CLI wrapper) are counted but scored lower than official offerings, since they may fall out of sync with the tool's API surface.
