Download data

Genome-wide TSV / FASTA tables and pre-built mutation × region joins. For the REST interface, live try-it, Python examples, and JSON layer names, see the API page.


Annotation layers & methods

Aligned with Summary, Visual, and Browse (colour keys)

These tables are the same evidence you see as tracks and columns in Summary and Browse; colours are aligned across the portal.

Disorder & structure

IUPred / combined disorder, MobiDB experimental, AlphaFold, PDB

Domains, motifs & sites

Pfam, ELM, PEM core motifs, ScanSite, MFIB, DIBS, PhasePro

Variants (merged tables + disease)

TCGA, COSMIC, cBioPortal, ClinVar, OMIM

Pathogenicity & downloads

In silico pathogenicity scores use the same predictor table as the downloads; somatic mutation layers appear in the TCGA / COSMIC / cBioPortal tabs. For machine-readable field names and programmatic access, see API → Annotation keys.

Bulk downloads

Genome-wide TSV and FASTA tables

Choose a category to see its files and a sample of the real on-disk format. For single-protein tables (full annotation per protein), use the download control on the Summary page, or fetch the same slices via REST on the API page.

One sequence per protein (FASTA) or a wide protein table (TSV) with accessions, gene names, UniProt IDs, transcripts, and other core columns.

Sample from static/download/ (first lines)
— Proteins.tsv —
Protein ID	UniProt Accession	Transcript ID	Gene Name	Name	Chromosome	Cancer Driver
A1BG-201	P04217	ENST00000263100.8	A1BG	Alpha-1B-glycoprotein	chr19	Not Cancer Driver
A1CF-207	Q9NQ94	ENST00000414883.2	A1CF	APOBEC1 complementation factor	chr10	Census
A1CF-201	Q9NQ94	ENST00000373993.6	A1CF	APOBEC1 complementation factor	chr10	Census
A1CF-202	Q9NQ94	ENST00000373995.7	A1CF	APOBEC1 complementation factor	chr10	Census
A1CF-206	Q9NQ94	ENST00000395495.6	A1CF	APOBEC1 complementation factor	chr10	Census
A1CF-205	Q9NQ94	ENST00000395489.7	A1CF	APOBEC1 complementation factor	chr10	Census
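
Because Proteins.tsv lists one row per isoform, a common first step is to group Protein IDs by gene. The following is a minimal sketch using only the Python standard library; the inline `SAMPLE` string copies a few rows from the sample above, and the function name `isoforms_by_gene` is a hypothetical helper, not part of the portal.

```python
import csv
import io
from collections import defaultdict

# First lines of Proteins.tsv as shown in the sample above (tab-separated).
SAMPLE = (
    "Protein ID\tUniProt Accession\tTranscript ID\tGene Name\tName\tChromosome\tCancer Driver\n"
    "A1BG-201\tP04217\tENST00000263100.8\tA1BG\tAlpha-1B-glycoprotein\tchr19\tNot Cancer Driver\n"
    "A1CF-207\tQ9NQ94\tENST00000414883.2\tA1CF\tAPOBEC1 complementation factor\tchr10\tCensus\n"
    "A1CF-201\tQ9NQ94\tENST00000373993.6\tA1CF\tAPOBEC1 complementation factor\tchr10\tCensus\n"
)


def isoforms_by_gene(handle):
    """Map each gene name to the Protein IDs (isoforms) listed for it."""
    groups = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        groups[row["Gene Name"]].append(row["Protein ID"])
    return dict(groups)


groups = isoforms_by_gene(io.StringIO(SAMPLE))
print(groups)
```

To run this on a downloaded file, pass `open("Proteins.tsv", newline="", encoding="utf-8")` in place of the `io.StringIO` object (the local file name is an assumption).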

Exon boundaries and PhastCons-style conservation tracks aligned to protein coordinates.

Sample from static/download/
— Exonborder.tsv —
Protein ID	Exon borders
A1BG-201	"0 0 11
1 12 23
2 24 113
3 114 204
4 205 303
5 304 397

— Conservation_phastCons.tsv —
Protein ID	Conservation Scores
A1BG-201	0.0005 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.134 ,0.292 ,0.8165 ,0.9424999999999999 ,0.9795 ,0.9844999999999999 ,0.2355 ,0.0005 ,0.0 ,0.0 ,0.0945 ,0.0005 ,0.0 ,0.0 ,0.29100000000000004 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0005 ,0.0035 ,0.0 ,0.014 ,0.0 ,0.0195 ,0.0 ,0.004 ,0.5465 ,0.27149999999999996 ,0.0 ,0.001 ,0.0015 ,0.0 ,0.0 ,0.063 ,0.0 ,0.0015 ,0.0 ,0.0005 ,0.0 ,0.033 ,0.0 ,0.0025 ,0.0 ,0.0005 ,0.0 ,0.0 ,0.0 ,0.002 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.10899999999999999 ,0.502 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0055 ,0.0005 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.004 ,0.002 ,0.0 ,0.013000000000000001 ,0.001 ,0.064 ,0.159 ,0.0 ,0.002 ,0.008 ,0.002 ,0.0005 ,0.0 ,0.001 ,0.0 ,0.391 ,0.172 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.4315 ,0.319 ,0.0 ,0.0 ,0.0 ,0.0 ,0.008 ,0.0125 ,0.08149999999999999 ,0.001 ,0.0055 ,0.002 ,0.28 ,0.333 ,0.052000000000000005 ,0.1255 ,0.0 ,0.0 ,0.4205 ,0.0055 ,0.9924999999999999 ,0.9955 ,0.0 ,0.0 ,0.0045000000000000005 ,0.614 ,0.9884999999999999 ,0.993 ,0.0975 ,0.016 ,0.001 ,0.0 ,0.0005 ,0.0 ,0.0005 ,0.0 ,0.001 ,0.0 ,0.001 ,0.0005 ,0.0 ,0.0 ,0.0 ,0.0255 ,0.005 ,0.0 ,0.10400000000000001 ,0.0225 ,0.001 ,0.091 ,0.026 ,0.009 ,0.001 ,0.0015 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0005 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0135 ,0.023 ,0.0115 ,0.3385 ,0.0015 ,0.274 ,0.3655 ,0.001 ,0.0015 ,0.0 ,0.0615 ,0.133 ,0.9724999999999999 ,0.855 ,0.0105 ,0.0655 ,0.0 ,0.0 ,0.0005 ,0.0 ,0.0195 ,0.0015 ,0.0 ,0.020999999999999998 ,0.7215 ,0.011000000000000001 ,0.9585 ,0.9590000000000001 ,0.0 ,0.0525 ,0.22849999999999998 ,0.0005 ,0.0005 ,0.001 ,0.001 ,0.0015 ,0.07100000000000001 ,0.0 ,0.010499999999999999 ,0.001 ,0.002 ,0.0065 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.004 ,0.0 ,0.043 ,0.06 ,0.0 ,0.0 ,0.0045000000000000005 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0235 ,0.0015 ,0.003 ,0.044 ,0.0005 ,0.0 ,0.0 ,0.0005 ,0.0 ,0.965 ,0.0335 ,0.0 ,0.0 ,0.0005 ,0.0025 ,0.0005 ,0.0 ,0.0 ,0.228 ,0.23 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0005 ,0.042499999999999996 ,0.6214999999999999 ,0.02 ,0.0 ,0.0005 ,0.0 ,0.0 ,0.0 ,0.002 ,0.0 
,0.0 ,0.0 ,0.0 ,0.0005 ,0.0 ,0.0 ,0.0 ,0.0015 ,0.0 ,0.119 ,0.062 ,0.29700000000000004 ,0.991 ,0.994 ,0.3205 ,0.0045000000000000005 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.00
…
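
Both files above pack a whole protein into one field, so they need a second parsing step after splitting on tabs. The sketch below assumes that each line inside the quoted Exon borders field is `index start end`, and that conservation scores are comma-separated with stray spaces, as the samples suggest. The inline strings are shortened, hypothetical excerpts, and the helper names are illustrative only.

```python
import csv
import io

# Shortened, hypothetical excerpts in the layout of the samples above.
EXON_SAMPLE = 'Protein ID\tExon borders\nA1BG-201\t"0 0 11\n1 12 23\n2 24 113"\n'
CONS_LINE = "A1BG-201\t0.0005 ,0.0 ,0.134 ,0.292 \n"


def parse_exon_borders(handle):
    """Return {protein_id: [(exon_index, start, end), ...]}.

    Assumes each line inside the quoted field is 'index start end'.
    """
    reader = csv.reader(handle, delimiter="\t", quotechar='"')
    next(reader)  # skip the header row
    exons = {}
    for row in reader:
        if len(row) != 2:
            continue  # ignore blank or malformed rows
        protein_id, borders = row
        exons[protein_id] = [
            tuple(int(tok) for tok in line.split())
            for line in borders.splitlines()
            if line.strip()
        ]
    return exons


def parse_scores(line):
    """Split one data line into (protein_id, [float, ...])."""
    protein_id, raw = line.rstrip("\n").split("\t", 1)
    return protein_id, [float(tok) for tok in raw.split(",") if tok.strip()]


exons = parse_exon_borders(io.StringIO(EXON_SAMPLE))
protein_id, scores = parse_scores(CONS_LINE)
print(exons["A1BG-201"])
print(protein_id, scores)
```

The score line is split by hand rather than with the `csv` module because a long protein can produce a field larger than the module's default field size limit.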

Low-complexity and repeat annotations (SEG, DUST, TRF).

Sample from static/download/
— ComplexitySeg.tsv —
Protein ID	Start	End
AADACL2-202	2	14
AADACL2-201	2	14
ACE2-206	2	11
ACE2-201	2	11
ADGRA2-202	2	36
ARHGEF2-221	2	8

— ComplexityDust.tsv —
Protein ID	Start	End
UBE2J2-211	2	23
ACAP3-202	2	27
FNDC10-201	2	75
CAMTA1-201	2	22

— ComplexityTrf.tsv —
Protein ID	Start	End
MS4A7-208	1	1
MS4A7-201	1	1
IRF8-212	1	1
PLCB1-203	1	1

Germline polymorphism, disease tracks (OMIM, ClinVar), and dbNSFP-style pathogenicity scores per variant.

Sample from static/download/
— Polymorphism.tsv —
Protein ID	Mutation	Position	Type

— OMIM_Disease.tsv —
Protein ID	Mutation	Position	Disease	dbSNP	FTId

— ClinVar.tsv —
Protein ID	Position	Mutation	Disease	ClinicalSignificance	RCVaccession	dbSNP	MIMID
SCN1B-206	1	M1L	Brugada syndrome 5	Uncertain	RCV000578768|RCV001215961|RCV002420551	1375857363	612838
SCO2-203	1	M1I	-	Uncertain	RCV003033856		nan
SCO2-202	1	M1T	COX deficiency, infantile mitochondrial myopathy	Pathogenic	RCV000985024	1603441682	604377
SCN2A-214	1	M1L	developmental and epileptic encephalopathy 11	Pathogenic	RCV000677679	1553564139	613721

— PathogenicityPredictors.tsv —
Protein ID	Position	protein_variant	AlphaMissense	ClinPred	ESM1b	EVE	Polyphen2_HDIV	Polyphen2_HVAR	PrimateAI	SIFT	VARITY_ER_LOO	VARITY_R_LOO	gMVP
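
The ClinVar.tsv rows above mix several conventions for missing values (an empty dbSNP cell, `-` for Disease, `nan` for MIMID) and join multiple RCV accessions with `|`. A minimal normalisation sketch follows, assuming those three markers are the only placeholders in use. The `MISSING` set and the `clean_clinvar` helper are hypothetical names introduced for illustration.

```python
import csv
import io

# Two rows copied from the ClinVar.tsv sample above.
CLINVAR_SAMPLE = (
    "Protein ID\tPosition\tMutation\tDisease\tClinicalSignificance\tRCVaccession\tdbSNP\tMIMID\n"
    "SCN1B-206\t1\tM1L\tBrugada syndrome 5\tUncertain\t"
    "RCV000578768|RCV001215961|RCV002420551\t1375857363\t612838\n"
    "SCO2-203\t1\tM1I\t-\tUncertain\tRCV003033856\t\tnan\n"
)

# Assumption: these are the placeholders used for "no value".
MISSING = {"", "-", "nan"}


def or_none(value):
    """Return None for a placeholder, otherwise the value unchanged."""
    return None if value in MISSING else value


def clean_clinvar(handle):
    """Parse ClinVar rows into dicts with typed and normalised fields."""
    records = []
    for row in csv.DictReader(handle, delimiter="\t"):
        records.append({
            "protein_id": row["Protein ID"],
            "position": int(row["Position"]),
            "mutation": row["Mutation"],
            "disease": or_none(row["Disease"]),
            "significance": row["ClinicalSignificance"],
            "rcv": row["RCVaccession"].split("|"),
            "dbsnp": or_none(row["dbSNP"]),
            "mim": or_none(row["MIMID"]),
        })
    return records


records = clean_clinvar(io.StringIO(CLINVAR_SAMPLE))
for record in records:
    print(record)
```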

PDB links and Pfam domain intervals.

Sample from static/download/
— PDB.tsv —
Protein ID	PDBs

— Pfam.tsv —
Protein ID	alignment_start	alignment_end	envelope_start	envelope_end	hmm_acc	hmm_name	type	hmm_start	hmm_end	hmm_length	bit_score	e_value	significance	clan

— Disordered_PDB_regions.tsv —
accession	gene_name	chromosome	dataset	region_identifier	start	end	source_db	category	count_total	count_disordered	count_ordered	disorder_combined
A2M-201	A2M	chr12	Disordered+PDB	6TAVD (1–23)	1	23	TCGA	Cancer	2	2	0	1.0
A2M-201	A2M	chr12	Disordered+PDB	6TAVC (1–23)	1	23	TCGA	Cancer	2	2	0	1.0
A2M-201	A2M	chr12	Disordered+PDB	6TAVB (1–23)	1	23	TCGA	Cancer	2	2	0	1.0
A2M-201	A2M	chr12	Disordered+PDB	6TAVA (1–23)	1	23	TCGA	Cancer	2	2	0	1.0
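
The `region_identifier` values in Disordered_PDB_regions.tsv appear to combine a structure ID, a chain, and a residue range (for example `6TAVD (1–23)`). The sketch below splits such a string under the assumption that the first four characters are the PDB ID and the remainder before the space is the chain; this reading is inferred from the sample rows and is not documented on this page. The function name is hypothetical.

```python
import re

# Assumed layout: 4-character PDB ID + chain, then "(start–end)".
# The range separator may be an en dash (U+2013) or a plain hyphen.
REGION_ID = re.compile(
    r"^(?P<pdb>[0-9A-Za-z]{4})(?P<chain>\S+) "
    r"\((?P<start>\d+)[\u2013-](?P<end>\d+)\)$"
)


def split_region_identifier(identifier):
    """Return (pdb_id, chain, start, end), or None if the layout differs."""
    match = REGION_ID.match(identifier)
    if match is None:
        return None
    return (
        match["pdb"],
        match["chain"],
        int(match["start"]),
        int(match["end"]),
    )


parsed = split_region_identifier("6TAVD (1\u201323)")
print(parsed)
```

Returning `None` for unrecognised strings keeps the helper safe to apply to other datasets in the same file, whose identifiers may follow a different layout.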

Disorder (IUPred, Anchor, MobiDB), AIUPred binding propensity vectors, and AlphaFold-related fields.

Sample from static/download/
— IUPred.tsv —
Protein ID	IUPred scores

— Anchor.tsv —
Protein ID	Anchor scores

— AIUPred_Binding.tsv —
Protein ID	AIUPred binding scores

— MobiDB.tsv —
Protein ID	Regions	Content Fraction	Content Count

— Alphafold.tsv —
Protein ID	PLLDT scores

Per-position conservation scores.

Sample from static/download/
— Conservation_Scores.tsv —
Protein ID	Organism Level	Conservation Score

Somatic mutations: TCGA, legacy TCGA (COSMIC-named files), and cBioPortal.

Sample from static/download/
— TCGA_Missense.tsv —
Protein ID	Phenotype	Mutation	Position	Cancer Type	Cancer Name	Sample ID

— COSMIC_Missense.tsv —
Protein ID	Phenotype	Mutation	Position	Cancer Type	Cancer Name	Sample ID

— CBioportal_Missense.tsv —
Protein ID	Phenotype	Mutation	Position	Cancer Type	Cancer Name	Sample ID

— CBioportal_Frameshift.tsv —
Protein ID	Phenotype	Mutation	Position	Cancer Type	Cancer Name	Sample ID

— CBioportal_Indel.tsv —
Protein ID	Phenotype	Mutation	Position	Cancer Type	Cancer Name	Sample ID

ELM motifs, PEM core motifs, ELM switches, and PTM sites (see also disorder / binding tabs for ScanSite and related tracks).

Sample from static/download/
— ELM.tsv —
Protein ID	ELM_Accession	ELMType	ELMIdentifier	Start	End	References	Methods	InstanceLogic	PDB	Organism

— ELM_Switches.tsv —
Protein ID	Switch_ID	Status	Interaction_ID	Intramolecular	ID_A	Bindingsite_A_ID	Bindingsite_A_Start	Bindingsite_A_End	ID_B	Bindingsite_B_ID	Bindingsite_B_Start	Bindingsite_B_End	Affected_interactor	Switch_type	Switch_subtype	Switch_mechanism	Switch_direction	Switch_outcome_direction	Switch_outcome	Modification	Modification_sites	Modifying_enzymes	Effector	Cell_cycle_phase	Localisation	Pathway	PMID

— PTM.tsv —
Protein ID	Position	Type	Database

UniProt-derived regions and binding annotations.

Sample from static/download/
— ROI_UniProt.tsv —
Protein ID	Start	End	Note	Evidence

— Binding_UniProt.tsv —
Protein ID	Position	Note	Evidence

Interaction resources (DIBS, MFIB) and binding-domain summaries.

Sample from static/download/
— dibs.tsv —
Protein ID	DIBS_ID	start	end

— mfib.tsv —
Protein ID	MFIB_ID	start	end

— binding.tsv —
Protein ID	BINDING_ID	start	end

Phase separation calls from PhasePro.

Sample from static/download/
— phasepro.tsv —
Protein ID	PHASEPRO_ID	start	end

Significantly mutated regions (iSimpre).

Sample from static/download/
— ISimpre_sig_mutated.tsv —
Protein ID	Start	End	Sig Cancer Types	Cancer Types	Method

Per-protein and positional data

Download the full annotation table for one protein from the Summary page, or retrieve the same slices via REST from the API page. Positional exports (.txt / .json) are available from the sequence view on Summary.

Mutation × annotation region tables

Tab-separated joins of ClinVar variants (all clinical significance classes) and somatic cohort variants that overlap MobiDB, ELM, Pfam, MFIB, DIBS, or PhasePro regions.

Each file is tab-separated (UTF-8). One row = one variant whose position falls inside one annotated interval (the same variant may appear on multiple rows if it overlaps several regions). Rows from ClinVar include disease names, clinical significance, and identifiers where available. Rows from somatic cohorts include variant class (missense, frameshift, indel), data source, and tumour / sample context fields.
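
The row semantics described above (one row per variant and overlapping interval) can be illustrated with a small point-in-interval join. This is a sketch of the general technique, not the portal's actual pipeline; the `Variant` and `Region` types, the function name, and the example coordinates are all hypothetical.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    protein_id: str
    position: int      # 1-based residue position
    mutation_aa: str


class Region(NamedTuple):
    protein_id: str
    layer: str
    start: int         # 1-based, inclusive
    end: int           # 1-based, inclusive


def join_variants_to_regions(variants, regions):
    """Emit one (variant, region) pair per overlapping interval."""
    by_protein = {}
    for region in regions:
        by_protein.setdefault(region.protein_id, []).append(region)
    rows = []
    for variant in variants:
        for region in by_protein.get(variant.protein_id, []):
            if region.start <= variant.position <= region.end:
                rows.append((variant, region))
    return rows


# Hypothetical data: one variant inside two regions, one inside none.
variants = [Variant("P-201", 50, "A50T"), Variant("P-201", 200, "R200Q")]
regions = [Region("P-201", "pfam", 10, 100), Region("P-201", "elm", 48, 52)]

rows = join_variants_to_regions(variants, regions)
for variant, region in rows:
    print(variant.mutation_aa, region.layer, region.start, region.end)
```

The variant at position 50 yields two rows (one per region), while the variant at position 200 yields none, which matches the duplication behaviour described for the downloadable files.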

Columns (all files)

Column	Meaning
gencode_accession	GENCODE protein accession (DisCanVis primary key, links to summary URLs).
gene_name	HGNC gene symbol where available.
uniprot_accession	UniProt accession on the protein record.
position	1-based residue position of the variant on the canonical isoform.
mutation_aa	Amino-acid change or variant label as stored (e.g. missense notation).
variant_origin	clinvar = ClinVar disease rows; somatic = cohort somatic rows (Mutation* tables: TCGA, COSMIC, cBioPortal, …).
somatic_variant_class	missense | frameshift | indel for somatic rows; empty for ClinVar.
somatic_database	Source label from the somatic record; empty for ClinVar.
clinical_significance	ClinVar clinical significance (pathogenic, benign, uncertain, etc.); empty for somatic.
disease_or_cancer_label	ClinVar disease name or somatic cancer_name / cohort label.
sample_or_rcv_id	ClinVar RCV accession(s) or somatic matchable_sample_id.
db_snp	dbSNP rs id when present (ClinVar); empty for somatic in this export.
region_layer	experimental_disorder | elm | pfam | mfib | dibs | phasepro.
region_start	Start of the overlapping annotation interval (1-based, inclusive).
region_end	End of the overlapping annotation interval (1-based, inclusive).
region_feature_id	Stable id where applicable (ELM accession, Pfam hmm_acc, binding region name).
region_feature_label	Human-readable type or name (ELM class|id, Pfam domain name, binding layer tag).
extra_note	Somatic phenotype field when set; otherwise empty.
Preview (first lines)
— mutations_x_experimental_disorder.tsv —
gencode_accession	gene_name	uniprot_accession	position	mutation_aa	variant_origin	somatic_variant_class	somatic_database	clinical_significance	disease_or_cancer_label	sample_or_rcv_id	db_snp	region_layer	region_start	region_end	region_feature_id	region_feature_label	extra_note
ABCC9-201	ABCC9	O60706	665	A665T	clinvar			Benign	dilated cardiomyopathy 1O	RCV000640321|RCV003162879	200891785	experimental_disorder	665	665		MobiDB experimental segment	
ABCC9-202	ABCC9	O60706	665	A665T	clinvar			Benign	dilated cardiomyopathy 1O	RCV000640321|RCV003162879	200891785	experimental_disorder	665	665		MobiDB experimental segment	
ABCC9-215	ABCC9	O60706	665	A665T	clinvar			Benign	dilated cardiomyopathy 1O	RCV000640321|RCV003162879	200891785	experimental_disorder	665	665		MobiDB experimental segment	
ABL1-201	ABL1	P00519	1021	R1021Q	clinvar			Benign	-	RCV002720250		experimental_disorder	1021	1021		MobiDB experimental segment	
ABL1-202	ABL1	P00519	1020	A1020T	clinvar			Uncertain	-	RCV001992307		experimental_disorder	1020	1020		MobiDB experimental segment	
ABL1-202	ABL1	P00519	1020	A1020V	clinvar			Uncertain	-	RCV001768291		experimental_disorder	1020	1020		MobiDB experimental segment	
FANCM-201	FANCM	Q8IYD8	1814	E1814K	clinvar			Uncertain	Fanconi anemia complementation group A	RCV000989214|RCV001061433|RCV001593165	139074680	experimental_disorder	1814	1814		MobiDB experimental segment	
ABCC9-218	ABCC9	O60706	665	A665T	clinvar			Benign	dilated cardiomyopathy 1O	RCV000640321|RCV003162879	200891785	experimental_disorder	665	665		MobiDB experimental segment	
AFF1-201	AFF1	P51825	758	P758Q	clinvar			Uncertain	-	RCV002674519		experimental_disorder	758	758		MobiDB experimental segment	
AFF1-211	AFF1	P51825	758	P758Q	clinvar			Uncertain	-	RCV002674519		experimental_disorder	758	758		MobiDB experimental segment	
AKT1-206	AKT1	P31749	460	T460P	clinvar			Pathogenic	Cowden syndrome 6	RCV000033178	397514645	experimental_disorder	460	460		MobiDB experimental segment
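
In the preview above the same variant recurs once per isoform (for example ABCC9 A665T on several ABCC9 accessions), so counting rows overstates the number of distinct variants. The sketch below collapses rows by gene, amino-acid change, and origin. It rebuilds a few preview rows with only the columns it needs; `make_line` and `collapse_isoforms` are hypothetical helper names, and the grouping key is one reasonable choice rather than a rule defined by the portal.

```python
import csv
import io
from collections import defaultdict

HEADER = [
    "gencode_accession", "gene_name", "uniprot_accession", "position",
    "mutation_aa", "variant_origin", "somatic_variant_class",
    "somatic_database", "clinical_significance", "disease_or_cancer_label",
    "sample_or_rcv_id", "db_snp", "region_layer", "region_start",
    "region_end", "region_feature_id", "region_feature_label", "extra_note",
]


def make_line(**fields):
    """Build one tab-separated line; columns not given are left empty."""
    return "\t".join(fields.get(col, "") for col in HEADER)


def preview_row(accession, gene, position, mutation):
    """One ClinVar row in the experimental_disorder layer, as in the preview."""
    return make_line(
        gencode_accession=accession, gene_name=gene, position=position,
        mutation_aa=mutation, variant_origin="clinvar",
        region_layer="experimental_disorder",
    )


# Rows taken from the preview above, reduced to the columns used here.
SAMPLE = "\n".join([
    "\t".join(HEADER),
    preview_row("ABCC9-201", "ABCC9", "665", "A665T"),
    preview_row("ABCC9-202", "ABCC9", "665", "A665T"),
    preview_row("ABL1-202", "ABL1", "1020", "A1020T"),
    preview_row("ABL1-202", "ABL1", "1020", "A1020V"),
]) + "\n"


def collapse_isoforms(handle):
    """Group rows by (gene, change, origin); collect isoforms and layers."""
    groups = defaultdict(lambda: {"isoforms": set(), "layers": set()})
    for row in csv.DictReader(handle, delimiter="\t"):
        key = (row["gene_name"], row["mutation_aa"], row["variant_origin"])
        groups[key]["isoforms"].add(row["gencode_accession"])
        groups[key]["layers"].add(row["region_layer"])
    return dict(groups)


distinct = collapse_isoforms(io.StringIO(SAMPLE))
for key, info in sorted(distinct.items()):
    print(key, sorted(info["isoforms"]), sorted(info["layers"]))
```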