protein-mcp-server

v0.1.2 pre-1.0

Federated access to protein structure and function data across experimental (PDB) and predicted (AlphaFold) models via MCP. Supports STDIO and Streamable HTTP transports.

protein.caseyjhand.com/mcp
claude mcp add --transport http protein-mcp-server https://protein.caseyjhand.com/mcp
codex mcp add protein-mcp-server --url https://protein.caseyjhand.com/mcp
{
  "mcpServers": {
    "protein-mcp-server": {
      "url": "https://protein.caseyjhand.com/mcp"
    }
  }
}
gemini mcp add --transport http protein-mcp-server https://protein.caseyjhand.com/mcp
{
  "mcpServers": {
    "protein-mcp-server": {
      "command": "bunx",
      "args": [
        "mcp-remote",
        "https://protein.caseyjhand.com/mcp"
      ]
    }
  }
}
{
  "mcpServers": {
    "protein-mcp-server": {
      "type": "http",
      "url": "https://protein.caseyjhand.com/mcp"
    }
  }
}
curl -X POST https://protein.caseyjhand.com/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -H "MCP-Protocol-Version: 2025-11-25" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-11-25","capabilities":{},"clientInfo":{"name":"curl","version":"1.0.0"}}}'
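The same handshake can be sketched in Python with only the standard library. This is a minimal sketch, not the server's reference client: the `clientInfo` values are placeholders, and the `Accept` header is an assumption based on Streamable HTTP servers being able to answer with either JSON or an event stream. The request is built but not sent.

```python
import json
import urllib.request

MCP_URL = "https://protein.caseyjhand.com/mcp"  # endpoint listed on this page
PROTOCOL_VERSION = "2025-11-25"                 # version used in the curl example


def build_initialize_request(request_id: int = 1) -> urllib.request.Request:
    """Build (but do not send) the JSON-RPC initialize request."""
    body = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": PROTOCOL_VERSION,
            "capabilities": {},
            # clientInfo values are placeholders chosen for this sketch.
            "clientInfo": {"name": "python-sketch", "version": "0.0.1"},
        },
    }
    return urllib.request.Request(
        MCP_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the server may reply with JSON or an SSE stream.
            "Accept": "application/json, text/event-stream",
            "MCP-Protocol-Version": PROTOCOL_VERSION,
        },
        method="POST",
    )


request = build_initialize_request()
print(request.get_method(), request.full_url)
print(json.loads(request.data)["method"])
# To actually send it: urllib.request.urlopen(request, timeout=30)
```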

Tools (7)

protein_search_structures

open-world

Search experimental (PDB) and predicted (computed-model) protein structures by free text, by protein sequence (which triggers an mmseqs2 similarity search), and/or by organism, method, and resolution filters. Returns ranked hits; experimental hits are enriched with title, method, resolution, and organism. Pass hit IDs to protein_get_structure to fetch the structures. Optionally returns a facet breakdown (counts by method, organism, release year, …) alongside the hits, without an extra call.

read
invocation
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "protein_search_structures",
    "arguments": {}
  }
}
schema
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "query": {
      "description": "Free-text query (protein name, gene, keyword, PDB title terms).",
      "type": "string"
    },
    "sequence": {
      "description": "One-letter amino-acid sequence; triggers an RCSB mmseqs2 sequence-similarity search.",
      "type": "string"
    },
    "organism": {
      "description": "Filter by source organism scientific name (e.g. \"Homo sapiens\").",
      "type": "string"
    },
    "method": {
      "description": "Filter by experimental method (e.g. \"X-RAY DIFFRACTION\", \"ELECTRON MICROSCOPY\").",
      "type": "string"
    },
    "max_resolution": {
      "description": "Maximum resolution in Å (lower is sharper); applies to experimental structures.",
      "type": "number",
      "exclusiveMinimum": 0
    },
    "min_identity": {
      "description": "Minimum sequence identity (0–1) for a sequence search. Default 0.",
      "type": "number",
      "minimum": 0,
      "maximum": 1
    },
    "max_evalue": {
      "description": "Maximum E-value for a sequence search. Default 1.",
      "type": "number",
      "exclusiveMinimum": 0
    },
    "content_type": {
      "default": "all",
      "description": "Which structure universe to search: experimental (PDB), predicted (computed models), or all.",
      "type": "string",
      "enum": [
        "experimental",
        "predicted",
        "all"
      ]
    },
    "facets": {
      "description": "Optional dimensions to summarize as a facet breakdown alongside the hits.",
      "type": "array",
      "items": {
        "type": "string",
        "enum": [
          "method",
          "organism",
          "polymer_type",
          "resolution",
          "release_year",
          "molecular_weight"
        ]
      }
    },
    "limit": {
      "default": 25,
      "description": "Maximum hits to return (1–100).",
      "type": "integer",
      "minimum": 1,
      "maximum": 100
    }
  },
  "required": [
    "content_type",
    "limit"
  ],
  "additionalProperties": false
}
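The empty `arguments` object in the invocation above relies on the schema defaults. A filtered search could be assembled as in the following sketch, which checks the constraints stated in the schema (enum values, `limit` range, positive thresholds) before building the `tools/call` payload. The helper name `build_search_call` and the query text are illustrative assumptions, not part of the server.

```python
import json

CONTENT_TYPES = {"experimental", "predicted", "all"}
FACETS = {"method", "organism", "polymer_type", "resolution",
          "release_year", "molecular_weight"}


def build_search_call(arguments: dict, request_id: int = 1) -> dict:
    """Hypothetical helper: validate arguments, then wrap them in tools/call."""
    # Apply the schema defaults for the two required fields.
    args = {"content_type": "all", "limit": 25, **arguments}
    if args["content_type"] not in CONTENT_TYPES:
        raise ValueError(f"unknown content_type: {args['content_type']!r}")
    if not 1 <= args["limit"] <= 100:
        raise ValueError("limit must be between 1 and 100")
    if "min_identity" in args and not 0 <= args["min_identity"] <= 1:
        raise ValueError("min_identity must be between 0 and 1")
    for key in ("max_resolution", "max_evalue"):
        if key in args and args[key] <= 0:
            raise ValueError(f"{key} must be greater than 0")
    unknown = set(args.get("facets", [])) - FACETS
    if unknown:
        raise ValueError(f"unknown facets: {sorted(unknown)}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "protein_search_structures", "arguments": args},
    }


search_call = build_search_call({
    "query": "hemoglobin",          # illustrative query text
    "organism": "Homo sapiens",
    "method": "X-RAY DIFFRACTION",
    "max_resolution": 2.0,
    "content_type": "experimental",
    "facets": ["method", "release_year"],
})
print(json.dumps(search_call, indent=2))
```

Validating on the client side is optional, since the server enforces the schema itself; it mainly gives earlier and more readable errors.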

protein_get_structure

open-world

Fetch structures with metadata and coordinate-file URLs. source "experimental" takes PDB entry IDs (batched in one call); "predicted" takes UniProt accessions (AlphaFold, with pLDDT/PAE confidence); "best_available" takes UniProt accessions and returns the top federated model (experimental if one exists, else the best prediction). Resolves up to the configured batch cap per call with per-ID partial success — missed IDs are listed in failed[]. Set include_coords to inline coordinate content; if that overflows, a section outline is returned — re-call with sections:[ids] to inline specific structures.

read
invocation
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "protein_get_structure",
    "arguments": {
      "ids": ["<id>"]
    }
  }
}
schema
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "ids": {
      "minItems": 1,
      "type": "array",
      "items": {
        "type": "string",
        "minLength": 1
      },
      "description": "PDB entry IDs (source experimental) or UniProt accessions (predicted / best_available)."
    },
    "source": {
      "default": "experimental",
      "description": "Where to fetch: experimental (PDB), predicted (AlphaFold), or best_available (federated pick).",
      "type": "string",
      "enum": [
        "experimental",
        "predicted",
        "best_available"
      ]
    },
    "include_coords": {
      "default": false,
      "description": "Inline coordinate-file content (cif). Off by default — URLs are always returned.",
      "type": "boolean"
    },
    "sections": {
      "description": "Structure IDs to inline coordinates for, from a prior overflow outline.",
      "type": "array",
      "items": {
        "type": "string"
      }
    }
  },
  "required": [
    "ids",
    "source",
    "include_coords"
  ],
  "additionalProperties": false
}
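Because each call resolves only up to a configured batch cap, a long ID list may need to be split across calls. The sketch below shows one way to do that, plus the follow-up call that inlines coordinates for specific structures after an overflow outline. The cap value of 10 is an assumption (the page does not state the configured cap), and both helper names are hypothetical.

```python
SOURCES = {"experimental", "predicted", "best_available"}


def chunked(ids: list, size: int):
    """Yield consecutive slices of at most `size` IDs."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def build_get_structure_calls(ids, source="experimental",
                              include_coords=False, batch_cap=10):
    """Hypothetical helper: one tools/call payload per batch of IDs.

    batch_cap=10 is an assumed value, not the server's documented cap.
    """
    if source not in SOURCES:
        raise ValueError(f"unknown source: {source!r}")
    if not ids:
        raise ValueError("ids must contain at least one entry")
    calls = []
    for request_id, batch in enumerate(chunked(list(ids), batch_cap), start=1):
        calls.append({
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {
                "name": "protein_get_structure",
                "arguments": {
                    "ids": batch,
                    "source": source,
                    "include_coords": include_coords,
                },
            },
        })
    return calls


def build_sections_recall(ids, sections, source="experimental", request_id=1):
    """Re-call after an overflow outline, inlining only the listed structures."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "protein_get_structure",
            "arguments": {
                "ids": list(ids),
                "source": source,
                "include_coords": True,
                "sections": list(sections),
            },
        },
    }


example_ids = [f"ID{n:02d}" for n in range(25)]  # placeholder IDs, not real entries
batches = build_get_structure_calls(example_ids)
print([len(c["params"]["arguments"]["ids"]) for c in batches])
```

IDs reported in `failed[]` can be collected across batches and retried or surfaced to the user; the exact response shape is not shown on this page, so that step is left out of the sketch.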

protein_find_similar

open-world

Find structurally or evolutionarily related proteins. by:"sequence" runs an RCSB mmseqs2 sequence-similarity search (synchronous) over a sequence — supplied directly, or pulled from a PDB ID or UniProt accession. by:"structure" runs a Foldseek fold-similarity search (asynchronous) against experimental and predicted databases; if the job is still computing when the poll budget elapses, the response reports status "computing" with a ticket — re-call to resume. Output names the engine and database each hit came from.

read
invocation
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "protein_find_similar",
    "arguments": {
      "by": "<by>"
    }
  }
}
schema
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "by": {
      "type": "string",
      "enum": [
        "sequence",
        "structure"
      ],
      "description": "Similarity axis: sequence (mmseqs2) or structure (Foldseek)."
    },
    "sequence": {
      "description": "One-letter amino-acid sequence to search from (by:sequence).",
      "type": "string"
    },
    "pdb_id": {
      "description": "PDB entry ID to derive the query from.",
      "type": "string"
    },
    "uniprot": {
      "description": "UniProt accession to derive the query from.",
      "type": "string"
    },
    "databases": {
      "description": "Foldseek target databases (by:structure). Default pdb100 + afdb50. e.g. afdb-swissprot, BFVD.",
      "type": "array",
      "items": {
        "type": "string"
      }
    },
    "max_evalue": {
      "description": "Maximum E-value (by:sequence). Default 1.",
      "type": "number",
      "exclusiveMinimum": 0
    },
    "min_identity": {
      "description": "Minimum sequence identity 0–1 (by:sequence). Default 0.",
      "type": "number",
      "minimum": 0,
      "maximum": 1
    },
    "limit": {
      "default": 25,
      "description": "Maximum hits to return (1–100).",
      "type": "integer",
      "minimum": 1,
      "maximum": 100
    }
  },
  "required": [
    "by",
    "limit"
  ],
  "additionalProperties": false
}
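For `by:"structure"` searches, the "re-call to resume" behavior suggests a small client-side loop. The following is a sketch under stated assumptions: the result is treated as a dict with a `status` field, re-sending the same arguments is assumed to resume the job, and `call_tool` is a hypothetical function standing in for whatever MCP client is in use. The demo uses a fake `call_tool`, and its final status value `"done"` is a stand-in, not a documented status.

```python
import time
from typing import Callable


def call_until_settled(call_tool: Callable[[str, dict], dict],
                       arguments: dict,
                       max_calls: int = 5,
                       delay_s: float = 2.0) -> dict:
    """Re-call protein_find_similar while the job reports status "computing".

    Assumption: sending the same arguments again resumes the pending job.
    """
    result: dict = {}
    for attempt in range(max_calls):
        result = call_tool("protein_find_similar", arguments)
        if result.get("status") != "computing":
            return result
        if attempt < max_calls - 1:
            time.sleep(delay_s)
    return result  # still computing after max_calls attempts


def make_fake_call_tool():
    """Fake client for the demo: two 'computing' replies, then a final one."""
    replies = iter([
        {"status": "computing", "ticket": "demo-ticket"},
        {"status": "computing", "ticket": "demo-ticket"},
        {"status": "done", "hits": []},  # "done" is a stand-in status
    ])
    seen = []

    def fake_call_tool(name: str, arguments: dict) -> dict:
        seen.append(name)
        return next(replies)

    return fake_call_tool, seen


fake_call_tool, seen_calls = make_fake_call_tool()
final = call_until_settled(
    fake_call_tool,
    {"by": "structure", "pdb_id": "<pdb_id>", "limit": 25},
    delay_s=0.0,
)
print(final["status"], len(seen_calls))
```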

protein_track_ligands

open-world

Ligand discovery and binding-site analysis across the PDB. mode "find_ligand" resolves a name or formula to chemical component IDs with metadata (formula, weight, SMILES). mode "structures_with_ligand" returns PDB entries containing a ligand (by exact component ID — get the ID from find_ligand first). mode "binding_site" returns the protein residues lining a ligand's pocket in a given structure, with contact distances. Binding sites are experimental-only (computed from deposited coordinates; predicted models carry no bound ligands).

read
invocation
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "protein_track_ligands",
    "arguments": {
      "mode": "<mode>"
    }
  }
}
schema
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "mode": {
      "type": "string",
      "enum": [
        "find_ligand",
        "structures_with_ligand",
        "binding_site"
      ],
      "description": "Operation: resolve a ligand, find structures containing it, or analyze its binding site."
    },
    "query": {
      "description": "Ligand name or formula (mode find_ligand).",
      "type": "string"
    },
    "comp_id": {
      "description": "Exact chemical component ID (modes structures_with_ligand and binding_site).",
      "type": "string"
    },
    "pdb_id": {
      "description": "PDB entry ID (mode binding_site).",
      "type": "string"
    },
    "limit": {
      "default": 25,
      "description": "Maximum results to return (1–100).",
      "type": "integer",
      "minimum": 1,
      "maximum": 100
    }
  },
  "required": [
    "mode",
    "limit"
  ],
  "additionalProperties": false
}
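The three modes form a natural pipeline: resolve a ligand name, list structures containing it, then inspect one binding site. The sketch below builds the three payloads and checks the fields each mode needs according to the schema descriptions. The helper name is hypothetical, and the example values (`heme`, `HEM`, `4HHB`) are illustrative; in practice the component ID would come from the `find_ligand` response.

```python
REQUIRED_BY_MODE = {
    "find_ligand": ("query",),
    "structures_with_ligand": ("comp_id",),
    "binding_site": ("comp_id", "pdb_id"),
}


def build_ligand_call(mode: str, request_id: int = 1, **fields) -> dict:
    """Hypothetical helper: check mode-specific fields, then build tools/call."""
    if mode not in REQUIRED_BY_MODE:
        raise ValueError(f"unknown mode: {mode!r}")
    missing = [name for name in REQUIRED_BY_MODE[mode] if not fields.get(name)]
    if missing:
        raise ValueError(f"mode {mode!r} needs: {', '.join(missing)}")
    arguments = {"mode": mode, "limit": 25, **fields}
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "protein_track_ligands", "arguments": arguments},
    }


# Illustrative three-step workflow; values are examples, not server output.
steps = [
    build_ligand_call("find_ligand", 1, query="heme"),
    build_ligand_call("structures_with_ligand", 2, comp_id="HEM", limit=10),
    build_ligand_call("binding_site", 3, comp_id="HEM", pdb_id="4HHB"),
]
for step in steps:
    print(step["params"]["arguments"])
```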

protein_compare_structures

open-world

Structurally align 2–10 structures via the RCSB Structural Comparison service (TM-align / jFATCAT). reference:"first" aligns every structure to the first; reference:"all_pairs" computes the full pairwise matrix. Each pair is an independent async alignment job, fanned out with a concurrency cap and per-pair partial success — a pair still computing when the budget elapses returns status "computing" with its job UUID (re-call to resume), and a failed pair degrades its row without sinking the others. Returns TM-score, RMSD, and aligned-residue count per pair.

read
invocation
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "protein_compare_structures",
    "arguments": {
      "structures": [
        {
          "pdb_id": "<pdb_id_1>"
        },
        {
          "pdb_id": "<pdb_id_2>"
        }
      ]
    }
  }
}
schema
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "structures": {
      "minItems": 2,
      "maxItems": 10,
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "pdb_id": {
            "type": "string",
            "minLength": 1,
            "description": "PDB entry ID."
          },
          "chain": {
            "description": "Chain (label_asym_id) to restrict the alignment to a single chain.",
            "type": "string"
          }
        },
        "required": [
          "pdb_id"
        ],
        "additionalProperties": false,
        "description": "A structure to align, by PDB entry ID with optional chain."
      },
      "description": "The 2–10 structures to compare."
    },
    "reference": {
      "default": "first",
      "description": "Align all to the first structure, or compute the full pairwise matrix.",
      "type": "string",
      "enum": [
        "first",
        "all_pairs"
      ]
    },
    "method": {
      "default": "tm-align",
      "description": "Alignment algorithm: tm-align, fatcat-rigid, or fatcat-flexible.",
      "type": "string",
      "enum": [
        "tm-align",
        "fatcat-rigid",
        "fatcat-flexible"
      ]
    },
    "timeout_s": {
      "description": "Poll budget per pair in seconds before returning \"computing\". Defaults to the server setting.",
      "type": "integer",
      "minimum": 5,
      "maximum": 120
    }
  },
  "required": [
    "structures",
    "reference",
    "method"
  ],
  "additionalProperties": false
}
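Since every pair is its own alignment job, the `reference` setting determines how many jobs a call fans out. A minimal sketch of the pair enumeration follows, assuming `all_pairs` means each unordered pair once (the page does not say whether the matrix is computed in both directions). The function name and the placeholder IDs are hypothetical.

```python
from itertools import combinations


def alignment_pairs(structures: list, reference: str = "first") -> list:
    """Enumerate the pairs a comparison call would align.

    Assumption: "all_pairs" yields each unordered pair once.
    """
    if not 2 <= len(structures) <= 10:
        raise ValueError("between 2 and 10 structures are required")
    if reference == "first":
        return [(structures[0], other) for other in structures[1:]]
    if reference == "all_pairs":
        return list(combinations(structures, 2))
    raise ValueError(f"unknown reference: {reference!r}")


demo = ["S1", "S2", "S3", "S4"]  # placeholder labels, not real PDB IDs
print(len(alignment_pairs(demo, "first")))
print(len(alignment_pairs(demo, "all_pairs")))
```

Under that assumption, n structures produce n − 1 jobs with `first` and n(n − 1)/2 jobs with `all_pairs`, so four structures give 3 and 6 pairs, and the maximum of ten gives 9 and 45. That growth is worth keeping in mind when choosing `timeout_s`.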

protein_analyze_collection

open-world

Profile the PDB into distributions and trends over an optional scoping query: counts by method, organism, or polymer composition; resolution and molecular-weight histograms; release-year timelines; and multidimensional cross-tabs (e.g. method × release_year). Aggregation runs server-side at RCSB — one call returns compact buckets, no row pull. Pass one group_by dimension for a single breakdown, or two for a cross-tab (the first nests the second).

read
invocation
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "protein_analyze_collection",
    "arguments": {
      "group_by": ["<dimension>"]
    }
  }
}
schema
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "group_by": {
      "minItems": 1,
      "maxItems": 2,
      "type": "array",
      "items": {
        "type": "string",
        "enum": [
          "method",
          "organism",
          "polymer_type",
          "resolution",
          "release_year",
          "molecular_weight"
        ]
      },
      "description": "1 dimension for a breakdown, or 2 for a cross-tab (the first nests the second)."
    },
    "query": {
      "description": "Optional free-text scope (e.g. \"kinase\"); omit to profile the whole PDB.",
      "type": "string"
    },
    "organism": {
      "description": "Optional source-organism scope.",
      "type": "string"
    },
    "method": {
      "description": "Optional experimental-method scope.",
      "type": "string"
    },
    "max_resolution": {
      "description": "Optional maximum-resolution scope (Å).",
      "type": "number",
      "exclusiveMinimum": 0
    },
    "content_type": {
      "default": "experimental",
      "description": "Which structure universe to profile. Default experimental.",
      "type": "string",
      "enum": [
        "experimental",
        "predicted",
        "all"
      ]
    },
    "interval": {
      "description": "Bin width for a histogram dimension (number) or period for a date histogram (year/month/quarter).",
      "anyOf": [
        {
          "type": "number",
          "exclusiveMinimum": 0,
          "description": "Numeric bin width for a value histogram (e.g. resolution Å)."
        },
        {
          "type": "string",
          "enum": [
            "year",
            "month",
            "quarter"
          ],
          "description": "Period granularity for a date histogram."
        }
      ]
    },
    "bucket_limit": {
      "description": "Max buckets per dimension. Defaults to the server PROTEIN_FACET_BUCKET_CAP.",
      "type": "integer",
      "minimum": 1,
      "maximum": 500
    }
  },
  "required": [
    "group_by",
    "content_type"
  ],
  "additionalProperties": false
}
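A cross-tab request differs from a single breakdown only in passing two `group_by` dimensions. The sketch below builds either form and checks the schema's limits on `group_by` and `interval`. The helper name is hypothetical; the `kinase` scope and the method × release_year pairing are taken from the examples in the text above.

```python
DIMENSIONS = ("method", "organism", "polymer_type", "resolution",
              "release_year", "molecular_weight")
PERIODS = {"year", "month", "quarter"}


def build_analyze_call(group_by, interval=None, request_id=1, **scope) -> dict:
    """Hypothetical helper: 1 dimension = breakdown, 2 = cross-tab."""
    group_by = list(group_by)
    if not 1 <= len(group_by) <= 2:
        raise ValueError("group_by takes 1 or 2 dimensions")
    unknown = [dim for dim in group_by if dim not in DIMENSIONS]
    if unknown:
        raise ValueError(f"unknown dimensions: {unknown}")
    # content_type defaults to "experimental" per the schema.
    arguments = {"group_by": group_by, "content_type": "experimental", **scope}
    if interval is not None:
        if isinstance(interval, str):
            if interval not in PERIODS:
                raise ValueError(f"unknown period: {interval!r}")
        elif isinstance(interval, bool) or interval <= 0:
            raise ValueError("numeric interval must be greater than 0")
        arguments["interval"] = interval
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "protein_analyze_collection", "arguments": arguments},
    }


# Cross-tab: method buckets, each nesting a release-year timeline.
crosstab = build_analyze_call(["method", "release_year"],
                              interval="year", query="kinase")
# Single breakdown: resolution histogram with 0.5 Å bins.
histogram = build_analyze_call(["resolution"], interval=0.5)
print(crosstab["params"]["arguments"])
print(histogram["params"]["arguments"])
```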

protein_get_annotations

open-world

Sequence and functional annotation for a protein: UniProt features (domains, binding sites, PTMs), natural variants, and InterPro domain/family memberships (Pfam, PROSITE, …) with GO terms. Provide a UniProt accession directly, or a PDB ID — it is resolved to its UniProt accession via the structure's sequence cross-reference. Use the "include" parameter to scope which annotation classes are fetched.

read
invocation
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "protein_get_annotations",
    "arguments": {}
  }
}
schema
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "uniprot": {
      "description": "UniProt accession (e.g. P69905). Takes precedence over pdb_id.",
      "type": "string"
    },
    "pdb_id": {
      "description": "PDB entry ID; resolved to a UniProt accession via cross-reference.",
      "type": "string"
    },
    "include": {
      "default": "all",
      "description": "Which annotation classes to fetch: features, domains (InterPro), variants, or all.",
      "type": "string",
      "enum": [
        "features",
        "domains",
        "variants",
        "all"
      ]
    }
  },
  "required": [
    "include"
  ],
  "additionalProperties": false
}
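The schema leaves both identifiers optional, but the description implies that one of them is needed, with `uniprot` winning when both are present. A small sketch that mirrors this on the client side follows; the helper name is hypothetical, and treating a call with neither identifier as an error is an assumption rather than documented server behavior.

```python
INCLUDE_CLASSES = {"features", "domains", "variants", "all"}


def build_annotations_call(uniprot=None, pdb_id=None,
                           include="all", request_id=1) -> dict:
    """Hypothetical helper: send one identifier, preferring the UniProt accession."""
    if include not in INCLUDE_CLASSES:
        raise ValueError(f"unknown include value: {include!r}")
    if not uniprot and not pdb_id:
        # Assumption: the tool cannot do anything useful without an identifier.
        raise ValueError("provide a UniProt accession or a PDB ID")
    arguments = {"include": include}
    if uniprot:
        arguments["uniprot"] = uniprot  # takes precedence over pdb_id
    else:
        arguments["pdb_id"] = pdb_id
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "protein_get_annotations", "arguments": arguments},
    }


# P69905 is the accession used as the example in the schema description.
annotations_call = build_annotations_call(uniprot="P69905", include="domains")
print(annotations_call["params"]["arguments"])
```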

Resources (2)

Experimental structure summary for a PDB entry: title, method, resolution, organism, chains, and bound ligands.

URI: pdb://{entry_id}, MIME type: application/json

Predicted-structure summary for a UniProt accession from AlphaFold DB: mean pLDDT, confidence-band fractions, model URLs, and version.

URI: af://{uniprot}, MIME type: application/json
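Both resources are read through the standard MCP `resources/read` method with a URI filled in from the templates above. The following sketch fills the templates and builds the request; the helper names are hypothetical, and the example identifiers (`4HHB`, `P69905`) are illustrative.

```python
def pdb_uri(entry_id: str) -> str:
    """Fill the pdb://{entry_id} template."""
    if not entry_id:
        raise ValueError("entry_id must not be empty")
    return f"pdb://{entry_id}"


def af_uri(uniprot: str) -> str:
    """Fill the af://{uniprot} template."""
    if not uniprot:
        raise ValueError("uniprot must not be empty")
    return f"af://{uniprot}"


def build_resource_read(uri: str, request_id: int = 1) -> dict:
    """Build a JSON-RPC resources/read request for the given URI."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "resources/read",
        "params": {"uri": uri},
    }


experimental_read = build_resource_read(pdb_uri("4HHB"), request_id=1)
predicted_read = build_resource_read(af_uri("P69905"), request_id=2)
print(experimental_read["params"]["uri"])
print(predicted_read["params"]["uri"])
```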