Proteogenomic characterization of human colon and rectal cancer

Zhang, B., et al., Nature (2014) doi:10.1038/nature13438 Published online

Proteomes of colon and rectal tumors previously characterized by the Cancer Genome Atlas (TCGA) were analyzed, and integrated proteogenomic analyses were performed. This study used 95 TCGA tumor samples from 90 patients, with 5 samples representing different portions of the same tumor. The samples originated from two TCGA cohorts: 64 from the Colon Adenocarcinoma (COAD) collection and 31 from the Rectum Adenocarcinoma (READ) collection. The primary data from the liquid chromatography-tandem mass spectrometry (LC-MS/MS) global proteomic profiling of each tumor sample are associated with a data set in the table below. Three protein assemblies are provided from different protein database searches.

The first assembly reflects the 95 TCGA samples from 90 tumors, searched with three search engines (MS-GF+, MyriMatch, and Pepitome) against a standard RefSeq database and a NIST spectral library.

The Custom assembly represents the same samples, but the sequence database has been augmented with nonsynonymous sequence variants detected by TCGA.

The third assembly combines the searches of the 95 TCGA samples from the first assembly with those of the 60 normal colon samples analyzed by Vanderbilt University as a control, using the same three search engines, a standard RefSeq database, and a NIST spectral library.

Peptide spectrum matches from the TCGA_VU_N95_Custom IDPicker3 protein assembly were converted to protein BAM (proBAM) format for visualization, available in the metadata folder below.

COAD tumor sample genomic data can be downloaded from here.
READ tumor sample genomic data can be downloaded from here.

Normal colon epithelium sample mass spectrometry data can be downloaded from here.
Mass spectrometry data for comparison and reference (CompRef) sample standards run with this study can be downloaded from here.
Peptide-Spectrum-Matches and Protein Reports from the CPTAC Common Data Analysis Pipeline (CDAP) can be downloaded from here.
Network analysis of this study can be viewed at the NetGestalt CRC Portal here.

This work was accomplished by the Proteome Characterization Center (PCC) at Vanderbilt University led by Dr. Daniel C. Liebler.

Please include this attribution in publications:
“Data used in this publication were generated by the Clinical Proteomic Tumor Analysis Consortium (NCI/NIH).”


Clinical Data for TCGA Cancer Proteome Study of Colorectal Tissue

Data Types Available for Download

(ALL): Selection of this box downloads all data in the row
(raw): The original mass spectrometry (MS) instrument files
(mzML): HUPO-PSI standard raw data files generated from the original MS instrument files
(PSM): Peptide-Spectrum Match data
(prot): Protein assembly data and protein relative abundance
(meta): Clinical data files, mapping of biospecimens to iTRAQ labels or TMT10 labels (where applicable), folder and file naming conventions
Checksum files are included in all downloads for verification.
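
The checksum files can be used to confirm that a download completed without corruption. The Python sketch below is one way to do this and is not an official tool. It makes two assumptions that this page does not state: that each checksum file lists one entry per line as a hexadecimal digest followed by a file path relative to the checksum file, and that the digests are MD5. The function names `file_digest` and `verify_manifest` are hypothetical, and the `algorithm` argument should be changed if the downloaded checksum files use a different hash.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm="md5", chunk_size=1 << 20):
    """Hash a file in chunks so large raw MS files are not read into memory at once."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_manifest(manifest_path, algorithm="md5"):
    """Check every file listed in a checksum file.

    Assumed line format (not confirmed by the portal): "<hex digest> <relative path>".
    Returns a dict mapping each listed path to 'ok', 'mismatch', or 'missing'.
    """
    manifest_path = Path(manifest_path)
    results = {}
    for line in manifest_path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # tolerate the md5sum-style binary-mode marker
        target = manifest_path.parent / name
        if not target.is_file():
            results[name] = "missing"
        elif file_digest(target, algorithm) == expected.lower():
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results
```

Calling `verify_manifest` on a checksum file inside a downloaded folder returns a status for each listed file. Any entry reported as `mismatch` or `missing` suggests that the file should be downloaded again.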

Data Sets


Data set name

VU Vanderbilt University Daniel C. Liebler, Ph.D.