Free · CC-BY-4.0

n8n Nodes Catalog

A structured, machine-readable catalog of 524 n8n nodes: the operations each one supports, the credentials it requires, and its top-level properties schema. Extracted monthly from n8n source. The node-level metadata layer that lets an AI agent reason about which nodes to use - without guessing from stale training data.

What each record contains

524 records. One per node. 13 fields extracted from n8n source - covering both packages/nodes-base (431 nodes) and packages/@n8n/nodes-langchain (93 nodes). All fields are locked to what the extraction script produces.

Field Type Notes
node_name string Internal identifier (e.g. slack, airtable). Matches INodeTypeDescription.name.
display_name string Human-readable label shown in the n8n UI.
categories list[string] Category tags from the node's codex file. Examples: Communication, AI, Data & Storage.
subcategories list[string] Subcategory leaf values, flattened from codex.subcategories.
group list[string] n8n execution group. Examples: input, output, transform.
version string Explicit version for single-version nodes; defaultVersion for multi-version nodes.
description string One-line description from INodeTypeDescription.description.
credentials_required list[string] Credential type names from the node's credentials array. Empty for nodes that declare no credentials, e.g. most core and trigger nodes.
operations_supported list[string] Values from the operation property options; falls back to resource options. Empty for nodes without a resource/operation picker.
properties_schema string (JSON) Compact top-level property descriptors: [{"name":"...","displayName":"...","type":"..."}]. Serialized as a JSON string.
source_package string nodes-base or @n8n (for nodes-langchain nodes).
source_file_path string Repo-relative path to the primary .node.ts file.
github_permalink string Permanent GitHub link at the exact tag the record was extracted from.

Parquet note: list fields (categories, credentials_required, etc.) are stored as JSON strings in the Parquet file. Parse with json.loads().
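
For example, decoding one record with pandas - a minimal sketch; every field name below comes from the table above:

    import json
    import pandas as pd

    df = pd.read_parquet("nodes.parquet")
    row = df.iloc[0]

    # list fields and properties_schema are JSON strings until parsed
    categories = json.loads(row["categories"])
    props = json.loads(row["properties_schema"])
    print(row["node_name"], categories)
    print([(p["name"], p["type"]) for p in props])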

What makes it different

Fills the gap below workflow datasets

Existing n8n datasets on HuggingFace catalog workflow examples - they train models on how to assemble nodes. None catalog what each node is. This dataset is the metadata layer underneath: what a node does, what it accepts, what it requires.

Agent tooling at inference time

An agent building an n8n workflow can load this dataset as context to pick the right node, validate operation names, and check credential requirements - before generating a single line of workflow JSON.
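
A minimal sketch of that pre-flight check - the helper and dict names are ours, the field names follow the schema above:

    import json
    from datasets import load_dataset

    # index the catalog by node_name for O(1) lookups
    ds = load_dataset("automatelab/n8n-nodes-catalog", split="train")
    catalog = {r["node_name"]: r for r in ds}

    def check_step(node_name, operation=None):
        """Check a planned workflow step before emitting any workflow JSON."""
        rec = catalog.get(node_name)
        if rec is None:
            return {"ok": False, "error": f"unknown node: {node_name}"}
        ops = json.loads(rec["operations_supported"])
        if operation is not None and ops and operation not in ops:
            return {"ok": False, "error": f"{node_name}: no operation '{operation}'"}
        return {"ok": True, "credentials": json.loads(rec["credentials_required"])}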

Monthly auto-update from source

The extraction pipeline runs monthly against the latest n8n release. The github_permalink field anchors every record to the tag it was extracted from, so older rows remain stable across updates.
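
One way to see that anchoring - assuming the standard github.com/<org>/<repo>/blob/<tag>/<path> permalink shape - is to recover the tag from each record:

    import pandas as pd

    df = pd.read_parquet("nodes.parquet")

    # assumes permalinks look like https://github.com/<org>/<repo>/blob/<tag>/<path>
    df["source_tag"] = df["github_permalink"].str.extract(r"/blob/([^/]+)/", expand=False)
    print(df["source_tag"].value_counts())  # one tag per extraction run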

Queryable in one line

"What n8n nodes support OAuth2?" is currently a docs-browsing exercise. With this dataset it is a single pandas filter or DuckDB query. Two formats: nodes.json and nodes.parquet.

Three ways to access it

  1. HuggingFace datasets API - works directly in Python, no manual download required
    pip install datasets
    from datasets import load_dataset
    import json
    
    ds = load_dataset("automatelab/n8n-nodes-catalog", split="train")
    
    # nodes with an OAuth2-style credential (names look like "slackOAuth2Api")
    oauth = [r for r in ds
             if any("OAuth2" in c for c in json.loads(r["credentials_required"]))]
    print([r["display_name"] for r in oauth])
  2. Parquet + pandas - best for local analysis and filtering
    pip install pandas pyarrow
    import pandas as pd, json
    
    df = pd.read_parquet("nodes.parquet")
    df["ops"] = df["operations_supported"].apply(json.loads)
    
    # all Slack nodes and their operations
    slack = df[df["node_name"].str.contains("slack", case=False)]
    print(slack[["display_name", "ops"]])
  3. DuckDB SQL - count nodes by category with no setup beyond the Parquet file
    SELECT category, COUNT(*) AS node_count
    FROM (
        SELECT UNNEST(from_json(categories, '["VARCHAR"]')) AS category
        FROM read_parquet('nodes.parquet')
    )
    GROUP BY category
    ORDER BY node_count DESC;

Full schema, methodology, and sample queries: dataset card on HuggingFace. Extraction script and monthly update pipeline: source on GitHub. Deep dive on AI-agent use cases: the companion post.

FAQ

What is the license?
Our additions - the catalog format, extraction script, and dataset card - are CC-BY-4.0. The upstream node metadata is derived from n8n source (copyright n8n team, used under the n8n Sustainable Use License). This dataset is a community-maintained catalog/index, not a redistribution of n8n source.
How is this different from other n8n datasets on HuggingFace?
Existing n8n datasets (npv2k1/n8n-workflow, mbakgun/n8nbuilder-n8n-workflows-dataset) are collections of workflow examples - they train models to assemble workflows. This dataset fills the layer underneath: what each node is, what operations it supports, and what credentials it requires. Those datasets tell a model how to use nodes; this one tells it which nodes exist and what they do.
Why Parquet and not just CSV?
Parquet preserves the list-typed fields (categories, operations, credentials) in a way CSV cannot cleanly represent. It is also Snappy-compressed, which makes it ~10x smaller than the equivalent CSV. The HuggingFace dataset viewer renders Parquet files directly - no download needed to browse records.
Does it cover all n8n nodes?
It covers the 524 built-in nodes in packages/nodes-base (431 nodes) and packages/@n8n/nodes-langchain (93 nodes). Not included: credential definitions, utility modules, the core workflow engine, and EE-only nodes that don't follow the standard descriptor pattern.
Can I use this for LLM fine-tuning?
Yes, for non-commercial use. The catalog format itself is CC-BY-4.0; the underlying node metadata remains subject to the n8n Sustainable Use License, which restricts commercial use. Attribute AutomateLab as the catalog maintainer and preserve the n8n team copyright notice. Full attribution text is on the dataset card.
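
A minimal conversion sketch - the prompt/completion framing is our choice, not part of the dataset:

    import json
    from datasets import load_dataset

    ds = load_dataset("automatelab/n8n-nodes-catalog", split="train")

    with open("n8n_nodes_sft.jsonl", "w") as f:
        for r in ds:
            ops = json.loads(r["operations_supported"])
            ops_text = ", ".join(ops) if ops else "none"
            pair = {
                "prompt": f"What does the n8n node '{r['display_name']}' do, "
                          f"and which operations does it support?",
                "completion": f"{r['description']} Operations: {ops_text}.",
            }
            f.write(json.dumps(pair) + "\n")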

Need this wired into an agent pipeline?

We use this catalog to power n8n agent tooling at AutomateLab. If you want it integrated into your own workflow-building agent, retrieval pipeline, or fine-tuning run, we can help scope and build it.

Get in touch