Proteome statistics

Interactive charts and tables for disorder, mutations, pathogenicity, and structure context over saved catalogs, GO terms, or PPI neighbourhoods — colours match Browse and per-gene Summary.

About this page

Legends, exports, and how colours match Browse / Summary

These charts summarise disorder, somatic and germline variation, in silico pathogenicity, and structure-aware context over selectable protein sets (saved catalogs, a PPI neighbourhood, or a GO term).

Colours follow the same palettes as Browse Dynamic and the per-gene Summary views (mutation sources, ClinVar buckets, ELM / Pfam / phase-separation / disorder layers).

Legend — mutation databases:

ClinVar · TCGA · COSMIC · cBioPortal · OMIM

Annotation layers (mutation × region charts):

ELM · Pfam · PhasePro · MFIB · DIBS · Experimental disorder

Split JSON exports for the human preset: Downloads → Proteome statistics. Same layout pattern as Downloads (section menu + in-column scroll).

Proteome statistics

Choose a data set below; all sections update together. Same navigation pattern as Downloads.

Precomputed statistics are not available on this deployment. Ask your administrator to run the site statistics build after data load.

Protein set (applies to every chart below)

Use either a preset catalog or the custom tab — not both. Presets load instantly; custom sets use a seed protein (PPI neighbourhood) or a GO term. Example shortcuts below fill a custom set in one click.

Saved catalogs: gene lists vs. annotation-filtered subsets
Gene-based data sets

Whole proteome or cancer driver census (main isoforms).

Annotation-based data sets

Proteins with at least one hit or region in the selected layer (main isoforms). Letters match the Summary Visual track column: G genome · D disorder · O ordered · M protein-level; C curated · P predicted.

Live merge: PPI neighbourhood or GO term


PPI neighbourhood

Hub protein plus direct partners from curated interaction records (filterable by source).
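As an illustration of what "hub plus direct partners" means, here is a minimal sketch that assembles such a set from a list of interaction records. The `Interaction` record shape, the `ppi_neighbourhood` function, and the identifiers and source labels in the example are all hypothetical; the site's actual data model is not described on this page.

```python
from typing import Iterable, NamedTuple, Optional, Set


class Interaction(NamedTuple):
    """Hypothetical record shape: two protein identifiers and a source label."""
    a: str
    b: str
    source: str


def ppi_neighbourhood(
    hub: str,
    interactions: Iterable[Interaction],
    sources: Optional[Set[str]] = None,
) -> Set[str]:
    """Return the hub plus its direct partners, optionally limited to some sources."""
    members = {hub}
    for rec in interactions:
        # Skip records from sources the user has filtered out.
        if sources is not None and rec.source not in sources:
            continue
        # Only first-degree partners are added; partners of partners are ignored.
        if rec.a == hub:
            members.add(rec.b)
        elif rec.b == hub:
            members.add(rec.a)
    return members


# Made-up identifiers and source labels, for illustration only.
records = [
    Interaction("HUB", "P2", "source_a"),
    Interaction("P3", "HUB", "source_b"),
    Interaction("P2", "P4", "source_a"),  # not a direct partner of HUB
]

all_sources = ppi_neighbourhood("HUB", records)
only_a = ppi_neighbourhood("HUB", records, sources={"source_a"})
```

With no source filter the set contains the hub and both direct partners; restricting to `source_a` drops the partner that is only supported by `source_b`.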

Interaction sources

GO term

Uses the same GO autocomplete as Browse Dynamic.


Protein set — disorder & mutation context

Per-protein disorder, somatic split ordered/disordered, pathogenic ClinVar by site

Per-protein disorder fraction (share of residues with a combined disorder score ≥ 2.0), somatic/cancer mutation records split by ordered vs. disordered site, and pathogenic ClinVar rows by site context. Proteins are sorted by mutation load; very large sets may be truncated in the precomputed export.
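The two per-protein quantities in this section can be sketched as follows, assuming each protein has a per-residue combined disorder score and that mutations are given as 1-based residue positions. Only the 2.0 threshold comes from this page; the function names, the input layout, and the example numbers are illustrative assumptions.

```python
from typing import Sequence, Tuple

DISORDER_THRESHOLD = 2.0  # from the page text: combined disorder >= 2.0


def disorder_fraction(
    scores: Sequence[float], threshold: float = DISORDER_THRESHOLD
) -> float:
    """Share of residues whose combined disorder score reaches the threshold."""
    if not scores:
        return 0.0
    return sum(s >= threshold for s in scores) / len(scores)


def split_mutations_by_site(
    scores: Sequence[float],
    positions: Sequence[int],
    threshold: float = DISORDER_THRESHOLD,
) -> Tuple[int, int]:
    """Count mutation records at ordered vs. disordered residues.

    Positions are assumed to be 1-based residue indices (an assumption,
    not something stated on the page).
    """
    ordered = disordered = 0
    for pos in positions:
        if scores[pos - 1] >= threshold:
            disordered += 1
        else:
            ordered += 1
    return ordered, disordered


# Made-up five-residue profile and four mutation records.
scores = [0.0, 1.0, 2.0, 3.0, 0.5]
fraction = disorder_fraction(scores)                      # 2 of 5 residues
counts = split_mutations_by_site(scores, [1, 3, 4, 4])    # (ordered, disordered)
```

Here two of five residues meet the threshold, and one mutation record falls on an ordered residue while three fall on disordered ones.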

Disorder — residue distribution

Combined disorder vector (threshold ≥ 2.0) across the active set

All residues from proteins in the active set that have a disorder score profile.

Somatic / cancer mutations & annotated regions

100% stacked bars by source; driver slice when available

Use the Somatic mutations tab for data-source bars (ordered vs. disordered site, side by side) and the Census driver slice. Open the Annotated regions tab for Pfam / ELM / binding / MobiDB bar charts (one layer at a time; charts load when you open the tab, so large proteomes stay responsive).
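A 100% stacked bar shows each category as a share of its bar's total rather than as a raw count. The sketch below shows that normalisation under assumed inputs: a nested mapping from data source to per-category counts. The function name, the source labels, and the counts are made up for illustration and do not reflect real data.

```python
from typing import Dict


def to_percent_shares(
    counts: Dict[str, Dict[str, int]]
) -> Dict[str, Dict[str, float]]:
    """Convert per-source category counts to percentages that sum to 100 per source."""
    shares: Dict[str, Dict[str, float]] = {}
    for source, by_category in counts.items():
        total = sum(by_category.values())
        shares[source] = {
            # A source with no records yields all-zero shares instead of dividing by zero.
            category: (100.0 * n / total if total else 0.0)
            for category, n in by_category.items()
        }
    return shares


# Made-up counts for two hypothetical sources.
example_counts = {
    "source_a": {"ordered": 30, "disordered": 10},
    "source_b": {"ordered": 5, "disordered": 15},
}
percent = to_percent_shares(example_counts)
```

Because every bar is rescaled to 100, sources with very different record counts can be compared on composition alone; the raw totals have to be read from a table or tooltip.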

Mutations by data source & variant class

Mutation hits overlapping annotation types

Cancer Gene Census slice (human proteome tab only)

Mutations restricted to Census driver proteins: 100% bars by data source and variant class, overlap across annotation layers, and top regions per layer (same interactions as above).

By data source & variant class

Hits on annotation types

Top annotated regions (drivers only)

Top annotated regions per layer

Pick a layer; only one Plotly chart runs at a time. Click a bar or table row to open the protein page. Up to 20 regions per layer. Tables are sortable and filterable.
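The per-layer cap can be sketched as a grouped top-N selection. The page does not say how regions are ranked, so ranking by a `hits` count is an assumption, as are the dictionary keys and the function name; only the limit of 20 comes from the text above.

```python
import heapq
from collections import defaultdict
from typing import Dict, Iterable, List

MAX_REGIONS_PER_LAYER = 20  # from the page text: up to 20 regions per layer


def top_regions_per_layer(
    regions: Iterable[dict], limit: int = MAX_REGIONS_PER_LAYER
) -> Dict[str, List[dict]]:
    """Group regions by layer and keep the `limit` regions with the most hits.

    Each region is assumed to be a dict with "layer" and "hits" keys
    (hypothetical field names).
    """
    by_layer: Dict[str, List[dict]] = defaultdict(list)
    for region in regions:
        by_layer[region["layer"]].append(region)
    return {
        layer: heapq.nlargest(limit, items, key=lambda r: r["hits"])
        for layer, items in by_layer.items()
    }


# Made-up input: 25 regions in one layer, 3 in another.
sample = [{"layer": "Pfam", "name": f"region_{i}", "hits": i} for i in range(25)]
sample += [{"layer": "ELM", "name": f"motif_{i}", "hits": i} for i in range(3)]
top = top_regions_per_layer(sample)
```

The layer with 25 regions is trimmed to its 20 highest-ranked entries, while the layer with only 3 regions is returned in full.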

ClinVar

Significance buckets, site context, annotation overlap

Distributions: per significance bucket (pathogenic / benign / uncertain), site context pies and top disease labels. Annotated regions: same Pfam / ELM / binding / MobiDB overlap as somatic mutations, split by bucket. Tables support search and sorting.

Open this tab to load ClinVar × annotation charts (pathogenic / benign / uncertain). With very large protein sets, the first load can take longer while charts are prepared.

Choose a significance bucket, then an annotation layer (one chart at a time). Click a bar or table row to open the protein page.

Pathogenicity scores — ordered vs disordered positions

dbNSFP-style predictors and AlphaMissense (subsampled)

Per-position predictor values (dbNSFP-style columns plus AlphaMissense where available). Values are subsampled for responsiveness; distributions match the summary Statistics tab logic.

Means by gene and by annotated region

Structure context on disordered residues

Conservation, ANCHOR, AIUPred binding (disordered sites)

Mammalia conservation, ANCHOR, and AIUPred binding scores restricted to disordered residues — subsampled over a capped number of proteins for performance.
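One way to read "subsampled over a capped number of proteins" is a two-stage random sample: first limit the number of proteins, then limit the pooled per-residue values. The sketch below follows that reading. The actual caps, sampling scheme, and seed handling used by the site are not stated here, so every parameter and name is an assumption.

```python
import random
from typing import Dict, List, Sequence


def subsample_scores(
    per_protein: Dict[str, Sequence[float]],
    max_proteins: int,
    max_values: int,
    seed: int = 0,
) -> List[float]:
    """Pool scores from at most `max_proteins` proteins, then keep at most `max_values`."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same sample
    protein_ids = sorted(per_protein)
    if len(protein_ids) > max_proteins:
        protein_ids = rng.sample(protein_ids, max_proteins)
    pooled = [value for pid in protein_ids for value in per_protein[pid]]
    if len(pooled) > max_values:
        pooled = rng.sample(pooled, max_values)
    return pooled


# Made-up data: 10 hypothetical proteins with 50 scores each.
data = {f"protein_{i}": [float(i)] * 50 for i in range(10)}
first = subsample_scores(data, max_proteins=4, max_values=100, seed=1)
second = subsample_scores(data, max_proteins=4, max_values=100, seed=1)
```

Sampling uniformly keeps the shape of the distribution approximately intact while bounding the number of points a chart has to draw; seeding the generator makes the displayed distribution reproducible between page loads.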

Means by gene and by annotated region (disordered sites only)