Data Sources

The BMEG is an expanding resource of interconnected data. Sources include:

Resource Description Contains Resource License References
TCGA The Cancer Genome Atlas (TCGA) profiles the DNA, RNA, protein, and epigenetic levels of over 10,000 individuals across 33 cancer types.
MC3 The Cancer Genome Atlas (TCGA) cancer genomics dataset includes over 10,000 tumor-normal exome pairs across 33 different cancer types, in total >400 TB of raw data files requiring analysis. https://www.ncbi.nlm.nih.gov/pubmed/29596782
GTEx The Genotype-Tissue Expression (GTEx) project. https://www.ncbi.nlm.nih.gov/pubmed/23715323
PharmacoDB An integrative database for mining drug sensitivity data for over 650K experiments involving 1600 cancer cell lines and 750 compounds. https://www.ncbi.nlm.nih.gov/pubmed/30053271
CCLE https://depmap.org/portal/ccle/terms_and_conditions https://www.ncbi.nlm.nih.gov/pubmed/22460905
CTRP https://ocg.cancer.gov/programs/ctd2/using-ctd2-data https://www.ncbi.nlm.nih.gov/pubmed/23993102
GDSC CC BY-NC-ND 2.5 https://www.ncbi.nlm.nih.gov/pubmed/23180760
CCLE The Cancer Cell Line Encyclopedia (CCLE): gene expression, chromosomal copy number and massively parallel sequencing data from 947 human cancer cell lines. https://depmap.org/portal/ccle/terms_and_conditions https://www.ncbi.nlm.nih.gov/pubmed/22460905
Cell Model Passports The Cell Model Passports provides manual and programmatic access to a cancer cell model database containing curated patient, sample and model relationship information as well as genomic and functional datasets. https://cellmodelpassports.sanger.ac.uk/documentation/cellmodelpassports/datasets
Ensembl The Ensembl project has been aggregating, processing, integrating and redistributing genomic datasets since the initial release of the draft human genome, with the aim of accelerating genomics research through rapid open distribution of public data. https://uswest.ensembl.org/info/about/legal/disclaimer.html https://www.ncbi.nlm.nih.gov/pubmed/21045057
GO The Gene Ontology (GO) is a controlled vocabulary describing knowledge of gene and protein roles in cells. CC BY 4.0 http://www.ncbi.nlm.nih.gov/pubmed/10802651
MonDO The Monarch Disease Ontology merges in multiple disease resources to yield a coherent ontology. CC BY 3.0 https://www.biorxiv.org/content/10.1101/048843v3
MSigDB The Molecular Signatures Database is a collection of annotated gene sets. CC BY 4.0 https://www.ncbi.nlm.nih.gov/pubmed/21546393
PFAM Pfam is a database of protein families and annotations. CC0 https://www.ncbi.nlm.nih.gov/pubmed/30357350
PubChem PubChem is a public repository containing information about chemical substances and their biological activities. https://www.ncbi.nlm.nih.gov/pubmed/26400175
PubMed PubMed is an archive of biomedical and life sciences journal literature. https://www.nlm.nih.gov/databases/download/terms_and_conditions.html
VICC G2P The VICC G2P is a framework for aggregating and harmonizing variant interpretations to produce a meta-knowledgebase of 12,856 aggregate interpretations covering 3,437 unique variants in 415 genes, 357 diseases, and 791 drugs. https://www.biorxiv.org/content/10.1101/366856v1
CGI CC0
CIViC CC0
JaxCKB CC BY-NC-SA 4.0
MolecularMatch https://www.molecularmatch.com/terms/
OncoKB https://oncokb.org/terms
PMKB CC BY 4.0
Pathway Commons Pathway Commons integrates public pathway and interaction databases. https://www.ncbi.nlm.nih.gov/pubmed/21071392
BIND Open Access https://www.ncbi.nlm.nih.gov/pubmed/12519993
BioGRID MIT https://www.ncbi.nlm.nih.gov/pubmed/16381927
CORUM https://www.helmholtz-muenchen.de/en/data-protection-statement/index.html https://www.ncbi.nlm.nih.gov/pubmed/30357367
CTD http://ctdbase.org/about/legal.jsp https://www.ncbi.nlm.nih.gov/pubmed/27651457
DIP CC BY-ND 3.0 https://www.ncbi.nlm.nih.gov/pubmed/14681454
HPRD http://hprd.org/download https://www.ncbi.nlm.nih.gov/pubmed/18988627
HumanCyc SRI https://www.ncbi.nlm.nih.gov/pubmed/15642094
INOH CC BY-SA https://www.ncbi.nlm.nih.gov/pubmed/22120663
IntAct Open Access https://www.ncbi.nlm.nih.gov/pubmed/24234451
KEGG Pathway https://www.ncbi.nlm.nih.gov/pubmed/23433509
NetPath CC0 https://www.ncbi.nlm.nih.gov/pubmed/20067622
Panther GNU GPL V2 https://www.ncbi.nlm.nih.gov/pubmed/27899595
PhosphoSitePlus CC BY-NC-SA 3.0 https://www.ncbi.nlm.nih.gov/pubmed/25514926
Protein Interaction Database https://www.ncbi.nlm.nih.gov/pubmed/18832364
Reactome CC0 https://www.ncbi.nlm.nih.gov/pubmed/29145629
Recon X CC BY-NC 2.0 https://www.ncbi.nlm.nih.gov/pubmed/23455439
WikiPathways CC0 https://www.ncbi.nlm.nih.gov/pubmed/18651794
DGIdb The Drug Gene Interaction Database integrates public interaction databases. https://www.ncbi.nlm.nih.gov/pubmed/29156001
CancerCommons http://www.ncbi.nlm.nih.gov/pubmed/21307913
ChEMBL Interactions CC BY-SA 3.0 https://www.ncbi.nlm.nih.gov/pubmed/24214965
Clearity Foundation: Biomarkers http://www.clearityfoundation.org/healthcare-pros/drugs-and-biomarkers.aspx
Clearity Foundation: Clinical Trials https://www.clearityfoundation.org/form/findtrials.aspx
DoCM CC BY 4.0 http://www.ncbi.nlm.nih.gov/pubmed/27684579
FDA Biomarkers https://www.fda.gov/drugs/science-research-drugs/table-pharmacogenomic-biomarkers-drug-labeling
Guide to Pharmacology: Interactions http://www.ncbi.nlm.nih.gov/pubmed/24234439
My Cancer Genome https://www.mycancergenome.org/content/page/legal-policies-licensing/ https://www.mycancergenome.org/
NCI Cancer Gene Index https://wiki.nci.nih.gov/display/cageneindex
TALC http://www.ncbi.nlm.nih.gov/pubmed/24377743
TDG Clinical Trials http://www.ncbi.nlm.nih.gov/pubmed/24016212
TEND http://www.ncbi.nlm.nih.gov/pubmed/21804595
TTD http://www.ncbi.nlm.nih.gov/pubmed/19933260