Search experimental (PDB) and predicted (computed-model) protein structures by free text, protein sequence (triggers an MMseqs2 similarity search), and/or organism, method, and resolution filters. Returns ranked hits; the page of experimental hits is enriched with title, method, resolution, and organism. Chain hit IDs into protein_get_structure. Optionally returns a facet breakdown (counts by method / organism / release year / …) alongside the hits, without an extra call.
Fetch structures with metadata and coordinate-file URLs. source "experimental" takes PDB entry IDs (batched in one call); "predicted" takes UniProt accessions (AlphaFold, with pLDDT/PAE confidence); "best_available" takes UniProt accessions and returns the top federated model (experimental if one exists, else the best prediction). Resolves up to the configured batch cap per call with per-ID partial success: IDs that fail to resolve are listed in failed[] and do not fail the call. Set include_coords to inline coordinate content; if the response would overflow, a section outline is returned instead, and a re-call with sections:[ids] inlines specific structures.
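The per-ID partial-success behaviour described above can be sketched as follows. This is a minimal illustration, not the tool's implementation: `resolve_batch`, `fetch_one`, the default `batch_cap` of 25, and the shape of each `failed` entry are all hypothetical names and values chosen for the example.

```python
from typing import Any, Callable, Dict, List


def resolve_batch(
    ids: List[str],
    fetch_one: Callable[[str], Dict[str, Any]],
    batch_cap: int = 25,  # hypothetical cap; the real value is configured server-side
) -> Dict[str, Any]:
    """Resolve each ID independently so one miss does not sink the batch."""
    if len(ids) > batch_cap:
        raise ValueError(f"at most {batch_cap} IDs per call, got {len(ids)}")

    structures: List[Dict[str, Any]] = []
    failed: List[Dict[str, str]] = []
    for entry_id in ids:
        try:
            structures.append(fetch_one(entry_id))
        except LookupError as exc:
            # A missed ID is recorded and the loop carries on.
            failed.append({"id": entry_id, "reason": str(exc)})
    return {"structures": structures, "failed": failed}
```

A caller would typically check `failed` after every call and retry or report only those IDs, rather than treating the whole batch as failed.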
Find structurally or evolutionarily related proteins. by:"sequence" runs a synchronous RCSB MMseqs2 sequence-similarity search on a sequence that is supplied directly or pulled from a PDB ID or UniProt accession. by:"structure" runs an asynchronous Foldseek fold-similarity search against experimental and predicted databases; if the job is still computing when the poll budget elapses, the response reports status "computing" with a ticket, and a re-call resumes it. Output names the engine and database each hit came from.
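The poll-budget and ticket-resume pattern for the asynchronous search might look like the sketch below. It is an illustration under stated assumptions: `poll_job`, `check_status`, the `"complete"` status string, and the `hits` field are hypothetical; only the `"computing"` status and the ticket come from the description above.

```python
import time
from typing import Any, Callable, Dict


def poll_job(
    check_status: Callable[[str], Dict[str, Any]],
    ticket: str,
    max_polls: int = 5,    # hypothetical poll budget
    delay_s: float = 0.0,  # pause between polls
) -> Dict[str, Any]:
    """Poll a job until it finishes or the poll budget is spent."""
    for _ in range(max_polls):
        reply = check_status(ticket)
        if reply.get("status") == "complete":  # assumed terminal status name
            return {"status": "complete", "hits": reply.get("hits", [])}
        time.sleep(delay_s)
    # Budget spent: hand the ticket back so the caller can resume later.
    return {"status": "computing", "ticket": ticket}
```

Because the ticket is returned unchanged, a second call with the same ticket picks up the same job instead of submitting a new search.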
Ligand discovery and binding-site analysis across the PDB. mode "find_ligand" resolves a name or formula to chemical component IDs with metadata (formula, weight, SMILES). mode "structures_with_ligand" returns PDB entries containing a ligand (by exact component ID — get the ID from find_ligand first). mode "binding_site" returns the protein residues lining a ligand's pocket in a given structure, with contact distances. Binding sites are experimental-only (computed from deposited coordinates; predicted models carry no bound ligands).
Structurally align 2–10 structures via the RCSB Structural Comparison service (TM-align / jFATCAT). reference:"first" aligns every structure to the first; reference:"all_pairs" computes the full pairwise matrix. Each pair is an independent async alignment job, fanned out with a concurrency cap and per-pair partial success. A pair still computing when the budget elapses returns status "computing" with its job UUID (re-call to resume), and a failed pair degrades its own row without sinking the others. Returns TM-score, RMSD, and aligned-residue count per pair.
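The difference between the two reference modes, and the capped fan-out with per-pair failure isolation, can be sketched as below. This is a hedged illustration: `plan_pairs`, `align_all`, `align_pair`, `max_workers`, and the result fields are hypothetical, and the sketch assumes "all_pairs" means each unordered pair is aligned once.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations
from typing import Any, Callable, Dict, List, Tuple

Pair = Tuple[str, str]


def plan_pairs(ids: List[str], reference: str = "first") -> List[Pair]:
    """List the alignment jobs implied by the chosen reference mode."""
    if not 2 <= len(ids) <= 10:
        raise ValueError("expected between 2 and 10 structures")
    if reference == "first":
        return [(ids[0], other) for other in ids[1:]]
    if reference == "all_pairs":
        return list(combinations(ids, 2))  # assumption: unordered pairs
    raise ValueError(f"unknown reference mode: {reference!r}")


def align_all(
    pairs: List[Pair],
    align_pair: Callable[[str, str], Dict[str, Any]],
    max_workers: int = 4,  # hypothetical concurrency cap
) -> List[Dict[str, Any]]:
    """Run every pair independently; a failure only marks its own row."""

    def safe(pair: Pair) -> Dict[str, Any]:
        try:
            return {"pair": pair, "status": "ok", **align_pair(*pair)}
        except Exception as exc:
            return {"pair": pair, "status": "failed", "error": str(exc)}

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(safe, pairs))  # map preserves input order
```

For n structures, "first" yields n − 1 jobs while "all_pairs" yields n(n − 1)/2 under this assumption, so the second mode grows much faster toward the upper limit of 10 structures.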
Profile the PDB into distributions and trends over an optional scoping query: counts by method, organism, or polymer composition; resolution and molecular-weight histograms; release-year timelines; and multidimensional cross-tabs (e.g. method × release_year). Aggregation runs server-side at RCSB, so one call returns compact buckets without pulling individual rows. Pass one group_by dimension for a single breakdown, or two for a cross-tab in which the first dimension nests the second.
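To make the nesting order concrete, the sketch below builds the same kind of buckets locally from a handful of made-up records. The real aggregation runs server-side; `cross_tab`, the record fields, and the toy values are hypothetical and exist only to show how one dimension gives flat counts and two dimensions give counts nested first-then-second.

```python
from collections import defaultdict
from typing import Any, Dict, List


def cross_tab(records: List[Dict[str, Any]], group_by: List[str]) -> Dict[Any, Any]:
    """Count records by one dimension, or nest a second inside the first."""
    if len(group_by) == 1:
        flat: Dict[Any, int] = defaultdict(int)
        for rec in records:
            flat[rec[group_by[0]]] += 1
        return dict(flat)
    if len(group_by) == 2:
        outer, inner = group_by
        nested: Dict[Any, Dict[Any, int]] = defaultdict(lambda: defaultdict(int))
        for rec in records:
            nested[rec[outer]][rec[inner]] += 1
        return {key: dict(buckets) for key, buckets in nested.items()}
    raise ValueError("group_by takes one or two dimensions")


# Toy records, invented purely for illustration.
toy = [
    {"method": "X-ray", "release_year": 2020},
    {"method": "X-ray", "release_year": 2021},
    {"method": "EM", "release_year": 2021},
]
by_method = cross_tab(toy, ["method"])
by_method_year = cross_tab(toy, ["method", "release_year"])
```

Swapping the order of the two dimensions changes which one becomes the outer key, so `["release_year", "method"]` would group by year first.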
Sequence and functional annotation for a protein: UniProt features (domains, binding sites, PTMs), natural variants, and InterPro domain/family memberships (Pfam, PROSITE, …) with GO terms. Provide a UniProt accession directly, or a PDB ID — it is resolved to its UniProt accession via the structure's sequence cross-reference. Use the "include" parameter to scope which annotation classes are fetched.