Free · CC-BY-4.0

MCP Servers Tool Catalog

922 public Model Context Protocol servers spawned over stdio. 359 introspected successfully. 9,922 real tool schemas captured. Not scraped from READMEs - this is the verbatim response to a JSON-RPC tools/list call, the same handshake every MCP host uses. Refreshed monthly.

What you get

Two tables and a monthly diff. Every record is the output of code that actually ran - no human curation, no LLM summarization between source and dataset.

922 npm packages attempted
359 servers that responded to tools/list
9,922 tool schemas captured
16 classified failure categories

What each record contains

servers.jsonl - one row per attempted package (922 rows), regardless of whether introspection succeeded. The status field is the key column: ok means the server responded to tools/list; anything else is one of 16 classified failure reasons.

Field | Type | Notes
name | string | Short identifier from the curated seed list (e.g. brave-search, stripe).
package | string | npm package name. What npx would resolve to.
status | string | ok on success. On failure: init_timeout, needs_env_var, needs_cli_args, npm_install_generic, broken_install, or one of 11 vendor-specific buckets (needs_slack_token, needs_aws_creds, etc.).
elapsed_s | number | Wall-clock seconds from spawn to terminal state. Useful for spotting cold-start sluggishness.
tool_count | number | Number of tools returned by tools/list. 0 on failure.
server_info | object | The serverInfo block from the MCP initialize response: name and version self-reported by the server.
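As a quick illustration of the status column, here is a minimal sketch that tallies servers.jsonl by status using only the standard library. It assumes a local copy of the file named servers.jsonl; the helper names load_jsonl and status_counts are ours, not part of the pipeline.

```python
import json
from collections import Counter
from pathlib import Path


def load_jsonl(path):
    """Yield one dict per non-empty line of a JSONL file."""
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield json.loads(line)


def status_counts(rows):
    """Return (status, count) pairs, most common status first."""
    return Counter(row["status"] for row in rows).most_common()


if __name__ == "__main__":
    path = Path("servers.jsonl")  # assumed local copy of the file
    if path.exists():
        for status, n in status_counts(load_jsonl(path)):
            print(f"{n:5d}  {status}")
```

The first line of output should be the ok bucket; everything below it is the failure breakdown.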

tools.jsonl - one row per (server, tool) pair on successful introspections (9,922 rows). Each tool record carries the verbatim inputSchema the server returned - usable as the parameter schema of an OpenAI / Anthropic function-calling tool definition with no transformation.

Field | Type | Notes
server_name | string | Foreign key into servers.jsonl.
package | string | Denormalized npm package for convenience.
tool_name | string | The tool's machine name (e.g. brave_web_search). What an LLM would call.
tool_description | string | The description string the server reported. This is the text an LLM reads to decide whether to use the tool.
input_schema | object (JSON Schema) | The verbatim inputSchema from tools/list. Compatible with Anthropic tool.input_schema and OpenAI function.parameters.

What makes it different

Introspected, not scraped

Existing MCP catalogs (modelcontextprotocol/servers, awesome-mcp-servers, mcp.so) list READMEs. We spawn each server and capture what it actually responds with. The schemas are real. The descriptions are what the LLM will see at runtime.

Failures are first-class data

The 563 servers that didn't introspect aren't dropped - they're classified. needs_env_var, needs_cli_args, init_timeout, and 13 other buckets tell you what each server expects before it'll boot.
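To make the idea concrete, here is a minimal sketch of stderr classification by pattern matching. The regular expressions, their order, and the broken_install fallback are assumptions chosen for illustration; the actual classifier rules live in the GitHub repo and cover all 16 buckets.

```python
import re

# Hypothetical patterns for illustration only. Vendor-specific buckets are
# listed first so they win over the generic needs_env_var bucket.
PATTERNS = [
    ("needs_slack_token", re.compile(r"SLACK_(BOT_)?TOKEN", re.I)),
    ("needs_aws_creds", re.compile(r"AWS_(ACCESS_KEY_ID|SECRET_ACCESS_KEY)", re.I)),
    ("needs_env_var", re.compile(r"environment variable|env var|is not set", re.I)),
    ("needs_cli_args", re.compile(r"usage:|missing (required )?argument", re.I)),
    ("npm_install_generic", re.compile(r"npm (ERR!|error)", re.I)),
]


def classify(stderr, timed_out=False):
    """Map a failed run to a status bucket using the first matching pattern."""
    if timed_out:
        return "init_timeout"
    for status, pattern in PATTERNS:
        if pattern.search(stderr):
            return status
    # Assumed fallback when nothing matches.
    return "broken_install"
```

Ordering matters in a first-match design like this one: a message such as "SLACK_BOT_TOKEN environment variable is required" lands in needs_slack_token rather than the generic bucket.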

Drop-in for function calling

The input_schema field is a valid JSON Schema as defined by the MCP spec. Pass it straight into Anthropic tool.input_schema or OpenAI function.parameters with no transform.
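A minimal sketch of that mapping, assuming a row shaped like the tools table. EXAMPLE_ROW is made up for illustration (only the tool name brave_web_search appears in this page; the description and schema are placeholders), and the two helper functions are ours.

```python
# Illustrative row; the description and schema here are placeholders,
# not the real output of any server.
EXAMPLE_ROW = {
    "server_name": "brave-search",
    "tool_name": "brave_web_search",
    "tool_description": "Search the web.",
    "input_schema": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}


def to_anthropic_tool(row):
    """Build an Anthropic-style tool definition from one tools row."""
    return {
        "name": row["tool_name"],
        "description": row["tool_description"] or "",
        "input_schema": row["input_schema"],  # passed through unchanged
    }


def to_openai_tool(row):
    """Build an OpenAI-style function tool definition from one tools row."""
    return {
        "type": "function",
        "function": {
            "name": row["tool_name"],
            "description": row["tool_description"] or "",
            "parameters": row["input_schema"],  # passed through unchanged
        },
    }
```

Only the envelope around the schema differs between the two providers; the schema object itself is the same in both.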

Monthly auto-refresh

A GitHub Actions cron re-runs the full pipeline on the 1st of every month. Each refresh produces a diff (new servers, removed servers, changed tool schemas) committed to the repo as a changelog. Older snapshots stay reproducible.
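For readers who want to compare two snapshots themselves, here is a minimal sketch of a tool-level diff. It is not the repo's diff code; it assumes each snapshot is a list of tools rows and treats a tool as changed when its input_schema differs after key-sorted JSON serialization.

```python
import json


def index_tools(rows):
    """Map (server_name, tool_name) to a canonical JSON string of the schema."""
    return {
        (r["server_name"], r["tool_name"]): json.dumps(r["input_schema"], sort_keys=True)
        for r in rows
    }


def diff_snapshots(old_rows, new_rows):
    """Return added, removed, and changed (server_name, tool_name) keys."""
    old, new = index_tools(old_rows), index_tools(new_rows)
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }
```

Sorting keys before comparing means a server that merely reorders properties in its schema does not show up as changed.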

How introspection works

The pipeline is small enough to read in one sitting - the whole thing is in the GitHub repo. For each candidate package, the introspector:

  1. Validates the package still exists on npm. A separate validate_npm.py pass catches packages that vanished between snapshots.
  2. Spawns npx -y <package> with stdio piped and a 120-second timeout. Each package runs in its own child process; up to 8 run in parallel.
  3. Sends a JSON-RPC initialize with protocol version 2024-11-05 and standard client capabilities. Same handshake Claude Desktop sends.
  4. Sends tools/list and reads the response. The tools array in the result becomes the tool rows verbatim.
  5. On failure: classifies stderr by pattern matching into one of 16 status buckets. Vendor-specific tokens (needs_slack_token, needs_aws_creds) get their own bucket so you can filter for them.

Concurrency is configurable (CONCURRENCY=8 in CI). The whole run takes about 3 hours of runner time on a clean machine. The transport is the official MCP stdio transport - we don't touch the HTTP transport because most published servers ship stdio-only.
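One way to bound parallelism with an environment variable is sketched below. The page does not say whether the pipeline uses threads, asyncio, or processes, so the thread pool here is an assumption; run_all and its worker argument are hypothetical names.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def run_all(packages, worker, concurrency=None):
    """Apply worker to every package with at most `concurrency` running at once."""
    if concurrency is None:
        # Default mirrors the CI setting mentioned above.
        concurrency = int(os.environ.get("CONCURRENCY", "8"))
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # pool.map preserves input order, so zip pairs each package with its result.
        return dict(zip(packages, pool.map(worker, packages)))
```

Threads are a reasonable fit when each worker mostly waits on a child process, since the waiting happens outside the interpreter.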

Three ways to access it

  1. HuggingFace datasets API - load servers or tools as a separate configuration
    pip install datasets
    from datasets import load_dataset
    
    # tools configuration: one row per (server, tool)
    tools = load_dataset("automatelab/mcp-servers-tool-catalog", "tools", split="train")
    
    # all tools whose description mentions "search" (a description may be missing)
    search_tools = tools.filter(lambda r: "search" in (r["tool_description"] or "").lower())
    print(len(search_tools), "search-related MCP tools")
  2. Parquet + pandas - join servers and tools locally
    pip install pandas pyarrow
    import pandas as pd
    
    servers = pd.read_parquet("servers.parquet")
    tools = pd.read_parquet("tools.parquet")
    
    # Servers that failed on missing credentials or arguments, counted per status
    needs_auth = servers[servers["status"].str.startswith("needs_")]
    print(needs_auth.groupby("status").size().sort_values(ascending=False))
  3. DuckDB SQL - cross-table queries with no setup
    SELECT s.name, COUNT(t.tool_name) AS tools
    FROM read_parquet('servers.parquet') s
    LEFT JOIN read_parquet('tools.parquet') t USING (package)
    WHERE s.status = 'ok'
    GROUP BY s.name
    ORDER BY tools DESC
    LIMIT 10;

Full schema and monthly diff history: dataset card on HuggingFace. Pipeline source, classifier logic, and monthly cron: AutomateLab-tech/mcp-tool-catalog on GitHub.

FAQ

What is the license?
The pipeline code is MIT. The dataset rows on HuggingFace are CC-BY-4.0. The introspected tool descriptions and schemas come from each individual MCP server's source - attribute the original server author when you redistribute, and check their license if you're using a specific server's tools downstream. We're cataloging public-facing JSON-RPC responses, not the server source code itself.
How is this different from modelcontextprotocol/servers or awesome-mcp-servers?
Those are curated lists - human-maintained README directories of MCP servers. They tell you what exists. This is a catalog of the runtime behavior of those servers: the tool names they actually expose, the input schemas they actually accept, and a classified reason for the ones that don't boot cleanly. If you're building an agent and want to pick the right MCP server for a job, the README tells you the pitch and this dataset tells you the API.
Why did so many servers fail to introspect?
Most MCP servers expect environment variables (API tokens, database URLs), CLI arguments, or a working credential chain before they'll respond to initialize. A clean-room spawn with no env doesn't satisfy them. We classify each failure - needs_env_var, needs_cli_args, needs_slack_token, etc. - so the failure list is itself a useful "what setup does this server need" lookup.
Can I add a server to the catalog?
Yes. Edit servers-final.validated.json in the GitHub repo with a one-line entry ({"name": "...", "package": "@scope/pkg"}) and open a PR. The next monthly refresh picks it up. We accept any public npm package that ships an MCP server. validate_npm.py will reject hallucinated packages automatically.
Why JSONL and Parquet for the same data?
JSONL is human-readable and diff-friendly - the rolling state in the repo is JSONL so each monthly commit is a meaningful changelog. Parquet is what you want for analysis - it's 5-10x smaller, preserves the nested input_schema object as a real struct instead of a string, and the HuggingFace dataset viewer renders it directly.
Does the catalog include Python or Go MCP servers?
Not yet - the introspector currently spawns npm packages only. Python MCP servers (typically uvx-runnable) are a planned addition; the bottleneck is environment isolation, not transport (the JSON-RPC layer is identical). Watch the GitHub repo for the multi-runtime PR.

Need this wired into an agent pipeline?

We use this catalog to pick the right MCP server for jobs at AutomateLab - and to build retrieval over tool schemas so an agent can find the right tool from thousands of options. If you want it integrated into your own agent harness, MCP host, or fine-tuning run, we can scope and build it.

Get in touch