Proteome statistics

Interactive charts and tables for disorder, mutations, pathogenicity, and structure context over saved catalogs, GO terms, custom protein lists, or PPI neighbourhoods — colours match Browse and per-gene Summary.


About this page

Legends, exports, and how colours match Browse / Summary

These charts summarise disorder, somatic and germline variation, in silico pathogenicity, and structure-aware context over selectable protein sets (saved catalogs, a PPI neighbourhood, or a GO term).

Colours follow the same palettes as Browse Dynamic and the per-gene Summary views (mutation sources, ClinVar buckets, ELM / Pfam / phase-separation / disorder layers).

Legend — mutation databases:

ClinVar TCGA COSMIC cBioPortal OMIM

Annotation layers (mutation × region charts):

ELM Pfam PhasePro MFIB DIBS Experimental disorder

Split JSON exports for the human preset: Downloads → Proteome statistics. Same layout pattern as Downloads (section menu + in-column scroll).

Proteome statistics

Choose a data set below; all sections update together. Same navigation pattern as Downloads.

Precomputed statistics are not available on this deployment. Ask your administrator to run the site statistics build after data load.

Protein set (applies to every chart below)

Use either a preset catalog or the custom tab — not both. Presets load instantly; custom sets use a seed protein (PPI neighbourhood), a GO term, or your uploaded/pasted protein list.

Gene lists vs. annotation-filtered subsets (saved catalogs)

Gene-based data sets

Whole proteome or cancer driver census (main isoforms).

Annotation-based data sets

Proteins with at least one hit or region in the selected layer (main isoforms). Letters match the Summary Visual track column: G genome · D disorder · O ordered · M protein-level; C curated · P predicted.

PPI neighbourhood, GO term, or protein list (live merge)


PPI neighbourhood

Hub protein plus direct partners from curated interaction records (filterable by source).

Protein

Interaction sources: IntAct, HIPPIE, BioGRID
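As a rough illustration of what "hub protein plus direct partners, filterable by source" could mean, the sketch below builds a neighbourhood set from a flat list of interaction records. The `Interaction` record shape, the function name, and the toy records are assumptions made for this example; they are not the site's actual data model or real interaction data.

```python
from typing import Iterable, NamedTuple, Set


class Interaction(NamedTuple):
    """Hypothetical record shape: two protein accessions and the source database."""
    protein_a: str
    protein_b: str
    source: str  # e.g. "IntAct", "HIPPIE", "BioGRID"


def ppi_neighbourhood(hub: str,
                      interactions: Iterable[Interaction],
                      sources: Iterable[str]) -> Set[str]:
    """Return the hub plus every direct partner seen in an allowed source."""
    allowed = set(sources)
    members = {hub}
    for rec in interactions:
        if rec.source not in allowed:
            continue  # source filtered out by the user
        if rec.protein_a == hub:
            members.add(rec.protein_b)
        elif rec.protein_b == hub:
            members.add(rec.protein_a)
    return members


# Toy records for illustration only (not real interaction data).
records = [
    Interaction("P04049", "P01112", "IntAct"),
    Interaction("Q02750", "P04049", "BioGRID"),
    Interaction("P01112", "P15056", "HIPPIE"),  # does not involve the hub
]

print(sorted(ppi_neighbourhood("P04049", records, ["IntAct", "BioGRID"])))
```

Only first-degree partners are collected here; partners of partners are deliberately left out, matching the "direct partners" wording above.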

GO term

Uses the same GO autocomplete as Browse Dynamic.

GO term

Protein list upload / paste

Accepted identifiers (one per line or comma separated): UniProt accession (e.g. P04049), UniProt ID (e.g. RAF1_HUMAN), GENCODE protein ID (e.g. RAF1-201), Ensembl transcript (e.g. ENST00000469120 or ENST00000469120.1), and gene name (e.g. RAF1).

Protein identifiers

Tip: you can paste mixed identifier types in the same list.
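For readers preparing a list offline, the following sketch shows one way a mixed list could be split and each entry labelled by identifier type. It is a heuristic under stated assumptions: the UniProt accession pattern follows the format published by UniProt, while the other patterns and all function names are guesses for illustration, not the site's actual parser. Some gene names can also match an accession-like pattern, so a real resolver would need a database lookup.

```python
import re

# Ordered from most to least specific; anything unmatched is treated as a gene name.
PATTERNS = [
    ("ensembl_transcript", re.compile(r"ENST\d{11}(\.\d+)?")),
    ("uniprot_accession", re.compile(
        r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2}")),
    ("uniprot_id", re.compile(r"[A-Z0-9]+_[A-Z0-9]+")),   # assumed shape, e.g. RAF1_HUMAN
    ("gencode_protein_id", re.compile(r".+-\d{3}")),      # assumed shape, e.g. RAF1-201
]


def classify(token: str) -> str:
    """Guess the identifier type of a single token."""
    for label, pattern in PATTERNS:
        if pattern.fullmatch(token):
            return label
    return "gene_name"


def parse_protein_list(text: str):
    """Split on commas or line breaks and classify each non-empty token."""
    tokens = [t.strip() for t in re.split(r"[,\r\n]+", text)]
    return [(t, classify(t)) for t in tokens if t]


pasted = "P04049, RAF1_HUMAN\nRAF1-201\nENST00000469120.1,RAF1\n"
for token, kind in parse_protein_list(pasted):
    print(f"{token}\t{kind}")
```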

Upload file (.txt, .csv, .tsv)

Example set: ERK1/2 pathway.


Protein set — disorder & mutation context

Per-protein disorder, somatic mutations split by ordered/disordered site, pathogenic ClinVar by site

Per-protein disorder fraction (combined disorder ≥ 2.0), somatic/cancer mutation records split by ordered vs disordered site, and pathogenic ClinVar rows by site context. Sorted by mutation load; very large sets may be truncated in the precomputed export.
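The two per-protein quantities described above can be sketched as follows, assuming the combined disorder vector is simply one numeric score per residue and that mutation positions are 1-based. Function names and the toy numbers are illustrative only; how the combined score itself is derived is not stated on this page.

```python
from typing import Dict, List, Sequence

DISORDER_THRESHOLD = 2.0  # threshold quoted on this page


def disorder_fraction(scores: Sequence[float],
                      threshold: float = DISORDER_THRESHOLD) -> float:
    """Fraction of residues whose combined disorder score is >= threshold."""
    if not scores:
        return 0.0
    disordered = sum(1 for s in scores if s >= threshold)
    return disordered / len(scores)


def split_by_site(positions: Sequence[int],
                  scores: Sequence[float],
                  threshold: float = DISORDER_THRESHOLD) -> Dict[str, int]:
    """Count mutation records falling on ordered vs disordered residues.

    `positions` are assumed to be 1-based residue indices, one per record.
    """
    disordered = sum(1 for p in positions if scores[p - 1] >= threshold)
    return {"ordered": len(positions) - disordered, "disordered": disordered}


# Toy five-residue protein with four mutation records.
scores: List[float] = [0.5, 1.0, 2.0, 3.5, 1.9]
print(disorder_fraction(scores))            # 2 of 5 residues reach the threshold
print(split_by_site([1, 3, 4, 4], scores))  # repeated positions count per record
```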

Disorder — residue distribution

Combined disorder vector (threshold ≥ 2.0) across the active set

All residues from proteins in the active set that have a disorder score profile.

Somatic / cancer mutations & annotated regions

100% stacked bars by source; driver slice when available

Use Somatic mutations for data-source bars (ordered vs disordered site side by side) and the Census driver slice. Open Annotated regions for Pfam / ELM / binding / MobiDB bar charts (one layer at a time; loaded when you open the tab so large proteomes stay responsive).
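A 100% stacked bar rescales each bar so its segments sum to 100. The sketch below shows that normalisation for counts grouped by data source; the category names ("missense", "nonsense") and the counts are made-up placeholders, and the function is not taken from the site's code.

```python
from typing import Dict


def to_percent_stack(counts: Dict[str, Dict[str, int]]) -> Dict[str, Dict[str, float]]:
    """Convert per-source category counts into percentages of each source's total."""
    result: Dict[str, Dict[str, float]] = {}
    for source, by_category in counts.items():
        total = sum(by_category.values())
        result[source] = {
            # A source with no records keeps zero-height segments instead of dividing by zero.
            category: (100.0 * n / total if total else 0.0)
            for category, n in by_category.items()
        }
    return result


# Placeholder counts, not real data.
counts = {
    "COSMIC": {"missense": 30, "nonsense": 10},
    "TCGA": {"missense": 0, "nonsense": 0},
}
print(to_percent_stack(counts))
```

Because every bar is rescaled to the same height, sources with very different record counts can be compared by composition, at the cost of hiding absolute volume.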

Mutations by data source & variant class

Mutation hits overlapping annotation types

Top annotated regions per layer

Pick a layer; only one Plotly chart runs at a time. Click a bar or table row to open the protein page. Up to 20 regions per layer. Tables are sortable and filterable.

ClinVar

Significance buckets, site context, annotation overlap

Distributions: per significance bucket (pathogenic / benign / uncertain), site context pies and top disease labels. Annotated regions: same Pfam / ELM / binding / MobiDB overlap as somatic mutations, split by bucket. Tables support search and sorting.

Open this tab to load ClinVar × annotation charts (pathogenic / benign / uncertain). With very large protein sets, the first load can take longer while charts are prepared.

Pathogenicity scores — ordered vs disordered positions

dbNSFP-style predictors and AlphaMissense (subsampled)

Per-position predictor values (dbNSFP-style columns plus AlphaMissense where available). Values are subsampled for responsiveness; distributions match the summary Statistics tab logic.
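The page does not say how values are subsampled. One common approach, shown here purely as an assumption, is a seeded uniform random sample capped at a maximum number of points, so that repeated loads draw the same subset.

```python
import random
from typing import List, Sequence


def subsample(values: Sequence[float], max_points: int, seed: int = 0) -> List[float]:
    """Return at most `max_points` values, drawn uniformly without replacement."""
    if len(values) <= max_points:
        return list(values)  # small sets are kept whole
    rng = random.Random(seed)  # fixed seed keeps the plot stable between loads
    return rng.sample(list(values), max_points)


# Hypothetical per-position scores; only the sample size is capped.
scores = [i / 1000 for i in range(1000)]
sample = subsample(scores, max_points=100)
print(len(sample))
```

A uniform sample roughly preserves the shape of a distribution, but rare extreme values may be missed, which is worth keeping in mind when reading the tails of the plotted distributions.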

Means by gene and by annotated region

Structure context on disordered residues

Conservation, ANCHOR, AIUPred binding (disordered sites)

Mammalia conservation, ANCHOR, and AIUPred binding scores restricted to disordered residues — subsampled over a capped number of proteins for performance.


Cancer Gene Census slice (human proteome tab only)

By data source & variant class

Hits on annotation types

Top annotated regions (drivers only)


Means by gene and by annotated region (disordered sites only)