MCP Servers Tool Catalog
922 public Model Context Protocol servers spawned over stdio. 359 introspected successfully. 9,922 real tool schemas captured.
Not scraped from READMEs - this is the verbatim response to a JSON-RPC tools/list call,
the same handshake every MCP host uses. Refreshed monthly.
What you get
Two tables and a monthly diff. Every record is the output of code that actually ran - no human curation, no LLM summarization between source and dataset.
What each record contains
servers.jsonl - one row per attempted package (922 rows), regardless of whether introspection succeeded. The status field is the key column: ok means the server responded to tools/list; anything else is one of 16 classified failure reasons.
| Field | Type | Notes |
|---|---|---|
| name | string | Short identifier from the curated seed list (e.g. brave-search, stripe). |
| package | string | npm package name. What npx would resolve to. |
| status | string | ok on success. On failure: init_timeout, needs_env_var, needs_cli_args, npm_install_generic, broken_install, or one of 11 vendor-specific buckets (needs_slack_token, needs_aws_creds, etc.). |
| elapsed_s | number | Wall-clock seconds from spawn to terminal state. Useful for spotting cold-start sluggishness. |
| tool_count | number | Number of tools returned by tools/list. 0 on failure. |
| server_info | object | The serverInfo block from the MCP initialize response: name and version self-reported by the server. |
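As a minimal sketch of how the status column might be used, the snippet below parses a few servers.jsonl-style rows and tallies them by status. The sample rows are invented for illustration (package names, timings, and counts are not taken from the dataset); only the field names follow the table above.

```python
import io
import json
from collections import Counter

# Hypothetical rows shaped like servers.jsonl. Package names, timings, and
# counts are invented for illustration; only the field names follow the table.
SAMPLE = """\
{"name": "alpha", "package": "@example/alpha-mcp", "status": "ok", "elapsed_s": 6.1, "tool_count": 3, "server_info": {"name": "alpha", "version": "1.0.0"}}
{"name": "beta", "package": "@example/beta-mcp", "status": "needs_env_var", "elapsed_s": 2.4, "tool_count": 0, "server_info": null}
{"name": "gamma", "package": "@example/gamma-mcp", "status": "init_timeout", "elapsed_s": 120.0, "tool_count": 0, "server_info": null}
{"name": "delta", "package": "@example/delta-mcp", "status": "ok", "elapsed_s": 9.8, "tool_count": 12, "server_info": {"name": "delta", "version": "0.2.1"}}
"""


def load_jsonl(handle):
    """Yield one dict per non-empty line of a JSONL file-like object."""
    for line in handle:
        line = line.strip()
        if line:
            yield json.loads(line)


rows = list(load_jsonl(io.StringIO(SAMPLE)))

# Tally rows by status: "ok" versus each failure bucket.
counts = Counter(row["status"] for row in rows)

# The slowest server by wall-clock time from spawn to terminal state.
slowest = max(rows, key=lambda row: row["elapsed_s"])

print(counts.most_common())
print("slowest:", slowest["name"], slowest["elapsed_s"])
```

To run this against the real file, `io.StringIO(SAMPLE)` would be swapped for `open("servers.jsonl", encoding="utf-8")`.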
tools.jsonl - one row per (server, tool) pair on successful introspections (9,922 rows). Each tool record carries the verbatim inputSchema the server returned, usable as the parameter schema of an OpenAI / Anthropic function-calling tool definition with no transformation.
| Field | Type | Notes |
|---|---|---|
| server_name | string | Foreign key into servers.jsonl. |
| package | string | Denormalized npm package for convenience. |
| tool_name | string | The tool's machine name (e.g. brave_web_search). What an LLM would call. |
| tool_description | string | The description string the server reported. This is the text an LLM reads to decide whether to use the tool. |
| input_schema | object (JSON Schema) | The verbatim inputSchema from tools/list. Compatible with Anthropic tool.input_schema and OpenAI function.parameters. |
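Because server_name is a foreign key into servers.jsonl, the two tables can be cross-checked without any extra tooling. The sketch below groups tool rows by server and compares the group size with tool_count; the rows are hypothetical and the helper names are ours, not part of the dataset.

```python
from collections import defaultdict

# Hypothetical rows; only the field names follow the two tables above.
servers = [
    {"name": "alpha", "package": "@example/alpha-mcp", "status": "ok", "tool_count": 2},
    {"name": "beta", "package": "@example/beta-mcp", "status": "init_timeout", "tool_count": 0},
]
tools = [
    {"server_name": "alpha", "package": "@example/alpha-mcp",
     "tool_name": "alpha_search", "tool_description": "Search alpha.",
     "input_schema": {"type": "object", "properties": {"query": {"type": "string"}}}},
    {"server_name": "alpha", "package": "@example/alpha-mcp",
     "tool_name": "alpha_fetch", "tool_description": "Fetch one alpha item.",
     "input_schema": {"type": "object", "properties": {"id": {"type": "string"}}}},
]


def tools_by_server(tool_rows):
    """Group tool rows under their server_name."""
    grouped = defaultdict(list)
    for row in tool_rows:
        grouped[row["server_name"]].append(row)
    return dict(grouped)


def count_mismatches(server_rows, tool_rows):
    """Return names of servers whose tool_count disagrees with the tools table."""
    grouped = tools_by_server(tool_rows)
    return [
        s["name"]
        for s in server_rows
        if s["tool_count"] != len(grouped.get(s["name"], []))
    ]


grouped = tools_by_server(tools)
print({name: [t["tool_name"] for t in rows] for name, rows in grouped.items()})
print("mismatches:", count_mismatches(servers, tools))
```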
What makes it different
Introspected, not scraped
Existing MCP catalogs (modelcontextprotocol/servers, awesome-mcp-servers, mcp.so) list READMEs. We spawn each server and capture what it actually responds with. The schemas are real. The descriptions are what the LLM will see at runtime.
Failures are first-class data
The 563 servers that didn't introspect aren't dropped - they're classified. needs_env_var, needs_cli_args, init_timeout, and 13 other buckets tell you what each server expects before it'll boot.
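The actual classifier lives in the GitHub repo; the following is only a rough sketch of what pattern-based bucketing of stderr could look like. The regular expressions and the `unclassified` fallback are assumptions made for illustration, and only a handful of the bucket names mentioned on this page are covered.

```python
import re

# Hypothetical patterns: the real classifier may use different rules.
# Order matters: vendor-specific buckets are tried before generic ones.
RULES = [
    ("needs_slack_token", re.compile(r"SLACK_\w*TOKEN", re.I)),
    ("needs_aws_creds", re.compile(r"AWS_(ACCESS_KEY_ID|SECRET_ACCESS_KEY)|aws credentials", re.I)),
    ("needs_env_var", re.compile(
        r"(?i:environment variable)|\b[A-Z][A-Z0-9_]+ (?:is required|is not set|must be set)")),
    ("needs_cli_args", re.compile(r"usage:|missing required argument", re.I)),
    ("npm_install_generic", re.compile(r"npm (?:err!|error)", re.I)),
]


def classify(stderr, timed_out=False):
    """Map a failed run to a status bucket using the first matching rule."""
    if timed_out:
        return "init_timeout"
    for status, pattern in RULES:
        if pattern.search(stderr):
            return status
    return "unclassified"  # hypothetical fallback, not a bucket named on this page


print(classify("Error: SLACK_BOT_TOKEN environment variable is required"))
print(classify("API_KEY is not set"))
print(classify("", timed_out=True))
```

Checking vendor-specific patterns first keeps a Slack or AWS failure from being absorbed by the generic needs_env_var bucket, which is what makes the vendor buckets filterable.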
Drop-in for function calling
The input_schema field is a valid JSON Schema as defined by the MCP spec. Pass it straight into Anthropic tool.input_schema or OpenAI function.parameters with no transform.
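A minimal sketch of that hand-off, assuming a row shaped like tools.jsonl: the schema passes through untouched, and only the surrounding keys are renamed to what each API expects. The sample row's description and schema are invented for illustration, and the two helper functions are hypothetical names.

```python
# Hypothetical tools.jsonl row; the schema contents are illustrative only.
row = {
    "server_name": "brave-search",
    "package": "@example/brave-search-mcp",
    "tool_name": "brave_web_search",
    "tool_description": "Run a web search and return the top results.",
    "input_schema": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}


def to_anthropic_tool(tool_row):
    """Build an Anthropic-style tool definition (name, description, input_schema)."""
    return {
        "name": tool_row["tool_name"],
        "description": tool_row["tool_description"],
        "input_schema": tool_row["input_schema"],  # passed through unchanged
    }


def to_openai_tool(tool_row):
    """Build an OpenAI-style function tool definition."""
    return {
        "type": "function",
        "function": {
            "name": tool_row["tool_name"],
            "description": tool_row["tool_description"],
            "parameters": tool_row["input_schema"],  # passed through unchanged
        },
    }


anthropic_tool = to_anthropic_tool(row)
openai_tool = to_openai_tool(row)
print(anthropic_tool["name"], openai_tool["function"]["name"])
```

The resulting dicts would go into the `tools` list of a request to the respective API; no client library is needed to build them.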
Monthly auto-refresh
A GitHub Actions cron re-runs the full pipeline on the 1st of every month. Each refresh produces a diff (new servers, removed servers, changed tool schemas) committed to the repo as a changelog. Older snapshots stay reproducible.
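The diff itself is produced by the pipeline in the repo. As a hedged illustration of the three categories it reports, the sketch below compares two in-memory snapshots; the function names, the choice of package as the server key, and the sample rows are all assumptions.

```python
import json


def index_tools(tool_rows):
    """Map (package, tool_name) to a canonical JSON string of the schema."""
    return {
        (row["package"], row["tool_name"]): json.dumps(row["input_schema"], sort_keys=True)
        for row in tool_rows
    }


def diff_snapshots(old_servers, new_servers, old_tools, new_tools):
    """Report new servers, removed servers, and tools whose schema changed."""
    old_packages = {s["package"] for s in old_servers}
    new_packages = {s["package"] for s in new_servers}
    old_index, new_index = index_tools(old_tools), index_tools(new_tools)
    changed = sorted(
        key for key in old_index.keys() & new_index.keys()
        if old_index[key] != new_index[key]
    )
    return {
        "new_servers": sorted(new_packages - old_packages),
        "removed_servers": sorted(old_packages - new_packages),
        "changed_tool_schemas": changed,
    }


# Hypothetical snapshots for illustration.
old_servers = [{"package": "@example/a"}, {"package": "@example/b"}]
new_servers = [{"package": "@example/a"}, {"package": "@example/c"}]
old_tools = [{"package": "@example/a", "tool_name": "search",
              "input_schema": {"type": "object", "properties": {"q": {"type": "string"}}}}]
new_tools = [{"package": "@example/a", "tool_name": "search",
              "input_schema": {"type": "object",
                               "properties": {"q": {"type": "string"},
                                              "limit": {"type": "integer"}}}}]

report = diff_snapshots(old_servers, new_servers, old_tools, new_tools)
print(json.dumps(report, indent=2))
```

Serializing each schema with `sort_keys=True` means a server that merely reorders its JSON keys between runs is not reported as changed.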
How introspection works
The pipeline is small enough to read in one sitting - the whole thing is in the GitHub repo. For each candidate package, the introspector:
- Validates the package still exists on npm. A separate validate_npm.py pass catches packages that vanished between snapshots.
- Spawns npx -y <package> with stdio piped and a 120-second timeout. Each server runs in its own subprocess; up to 8 run in parallel.
- Sends a JSON-RPC initialize with protocol version 2024-11-05 and standard client capabilities. Same handshake Claude Desktop sends.
- Sends tools/list and reads the response. The tools array in the result becomes the tool rows verbatim.
- On failure: classifies stderr by pattern matching into one of 16 status buckets. Vendor-specific tokens (needs_slack_token, needs_aws_creds) get their own bucket so you can filter for them.
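The spawn-and-handshake steps above can be sketched roughly as follows. This is not the repo's introspector: `list_tools`, the client name `catalog-sketch`, and the tiny fake server used for the demo are all hypothetical, and stderr is discarded here where a real pipeline would capture it for classification.

```python
import json
import queue
import subprocess
import sys
import threading
import time

PROTOCOL_VERSION = "2024-11-05"

# Tiny stand-in server so the sketch runs without npx or network access.
FAKE_SERVER = r'''
import json, sys
for line in sys.stdin:
    msg = json.loads(line)
    if msg.get("method") == "initialize":
        result = {"protocolVersion": "2024-11-05", "capabilities": {"tools": {}},
                  "serverInfo": {"name": "fake-server", "version": "0.1.0"}}
    elif msg.get("method") == "tools/list":
        result = {"tools": [{"name": "echo", "description": "Echo text.",
                             "inputSchema": {"type": "object"}}]}
    else:
        continue  # notifications carry no id and get no reply
    print(json.dumps({"jsonrpc": "2.0", "id": msg["id"], "result": result}), flush=True)
'''


def _pump(stream, out):
    """Copy lines from the child's stdout into a queue; None marks EOF."""
    for line in stream:
        out.put(line)
    out.put(None)


def list_tools(cmd, timeout=120.0):
    """Spawn cmd, run initialize then tools/list, return (server_info, tools)."""
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                            stderr=subprocess.DEVNULL, text=True, bufsize=1)
    lines = queue.Queue()
    threading.Thread(target=_pump, args=(proc.stdout, lines), daemon=True).start()
    deadline = time.monotonic() + timeout  # one budget for the whole exchange

    def send(message):
        # stdio transport: one JSON message per line
        proc.stdin.write(json.dumps(message) + "\n")
        proc.stdin.flush()

    def wait_for(request_id):
        while True:
            remaining = max(0.0, deadline - time.monotonic())
            line = lines.get(timeout=remaining)  # raises queue.Empty at the deadline
            if line is None:
                raise RuntimeError("server exited before replying")
            try:
                reply = json.loads(line)
            except json.JSONDecodeError:
                continue  # ignore non-JSON noise on stdout
            if isinstance(reply, dict) and reply.get("id") == request_id:
                if "error" in reply:
                    raise RuntimeError(f"JSON-RPC error: {reply['error']}")
                return reply["result"]

    try:
        send({"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {
            "protocolVersion": PROTOCOL_VERSION,
            "capabilities": {},
            "clientInfo": {"name": "catalog-sketch", "version": "0.0.0"},
        }})
        init = wait_for(1)
        send({"jsonrpc": "2.0", "method": "notifications/initialized"})
        send({"jsonrpc": "2.0", "id": 2, "method": "tools/list"})
        return init.get("serverInfo"), wait_for(2)["tools"]
    finally:
        proc.kill()
        proc.wait()


server_info, tools = list_tools([sys.executable, "-c", FAKE_SERVER], timeout=10)
print(server_info, [t["name"] for t in tools])
```

Against a real package the command would be `["npx", "-y", "<package>"]`. The sketch reads only the first page of tools/list and ignores any pagination cursor.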
Concurrency is configurable (CONCURRENCY=8 in CI). The whole run takes about 3 hours of runner time on a clean machine. The transport is the official MCP stdio transport - we don't touch the HTTP transport because most published servers ship stdio-only.
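One way to get bounded parallelism like this is a thread pool capped at the configured width. The sketch below assumes CONCURRENCY is read from an environment variable and uses a stub in place of real introspection; `introspect_stub`, `run_all`, and the `unclassified` status are hypothetical.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

# Assumption: the width comes from an environment variable named CONCURRENCY.
CONCURRENCY = int(os.environ.get("CONCURRENCY", "8"))


def introspect_stub(package):
    """Stand-in for a real introspection run; sleeps briefly instead of spawning npx."""
    start = time.monotonic()
    time.sleep(0.01)
    status = "broken_install" if package.endswith("-broken") else "ok"
    return {"package": package, "status": status,
            "elapsed_s": round(time.monotonic() - start, 3)}


def run_all(packages, worker, concurrency=CONCURRENCY):
    """Run worker over packages with at most `concurrency` in flight; keep input order."""
    results = {}
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = {pool.submit(worker, pkg): pkg for pkg in packages}
        for future in as_completed(futures):
            pkg = futures[future]
            try:
                results[pkg] = future.result()
            except Exception as exc:
                # Hypothetical placeholder status for a crashed worker.
                results[pkg] = {"package": pkg, "status": "unclassified", "error": repr(exc)}
    return [results[pkg] for pkg in packages]


packages = [f"@example/server-{i}" for i in range(20)] + ["@example/server-broken"]
rows = run_all(packages, introspect_stub)
print(sum(r["status"] == "ok" for r in rows), "ok of", len(rows))
```

Threads are sufficient here because each worker spends its time waiting on a child process rather than running Python code.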
Three ways to access it
- HuggingFace datasets API - load servers or tools as a separate split

      pip install datasets

      from datasets import load_dataset

      # tools split: one row per (server, tool)
      tools = load_dataset("automatelab/mcp-servers-tool-catalog", "tools", split="train")

      # all tools whose description mentions "search" (guards against a missing description)
      search_tools = tools.filter(lambda r: "search" in (r["tool_description"] or "").lower())
      print(len(search_tools), "search-related MCP tools")

- Parquet + pandas - join servers and tools locally

      pip install pandas pyarrow

      import pandas as pd

      servers = pd.read_parquet("servers.parquet")
      tools = pd.read_parquet("tools.parquet")

      # Servers that failed with a needs_* status, counted per bucket
      needs_auth = servers[servers["status"].str.startswith("needs_")]
      print(needs_auth.groupby("status").size().sort_values(ascending=False))

- DuckDB SQL - cross-table queries with no setup

      SELECT s.name, COUNT(t.tool_name) AS tools
      FROM read_parquet('servers.parquet') s
      LEFT JOIN read_parquet('tools.parquet') t USING (package)
      WHERE s.status = 'ok'
      GROUP BY s.name
      ORDER BY tools DESC
      LIMIT 10;
Full schema and monthly diff history: dataset card on HuggingFace. Pipeline source, classifier logic, and monthly cron: AutomateLab-tech/mcp-tool-catalog on GitHub.
FAQ
What is the license?
How is this different from modelcontextprotocol/servers or awesome-mcp-servers?
Why did so many servers fail to introspect?
[...] initialize. A clean-room spawn with no env doesn't satisfy them. We classify each failure - needs_env_var, needs_cli_args, needs_slack_token, etc. - so the failure list is itself a useful "what setup does this server need" lookup.
Can I add a server to the catalog?
[...] servers-final.validated.json in the GitHub repo with a one-line entry ({"name": "...", "package": "@scope/pkg"}) and open a PR. The next monthly refresh picks it up. We accept any public npm package that ships an MCP server. validate_npm.py will reject hallucinated packages automatically.
Why JSONL and Parquet for the same data?
[...] input_schema object as a real struct instead of a string, and the HuggingFace dataset viewer renders it directly.
Does the catalog include Python or Go MCP servers?
[...] uvx-runnable) are a planned addition; the bottleneck is environment isolation, not transport (the JSON-RPC layer is identical). Watch the GitHub repo for the multi-runtime PR.
Need this wired into an agent pipeline?
We use this catalog to pick the right MCP server for jobs at AutomateLab - and to build retrieval over tool schemas so an agent can find the right tool from thousands of options. If you want it integrated into your own agent harness, MCP host, or fine-tuning run, we can scope and build it.