Choosing a study/cohort

General recommendations

Unless you need a specific type of data or need to run one of the analyses listed below, we recommend the TCGA Pan-Cancer (PANCAN) study.

{% hint style="info" %} TCGA Pan-Cancer (PANCAN) study {% endhint %}

Why do we recommend this study?

We recommend it because it contains the data from The Cancer Genome Atlas (TCGA) Research Network, which generated the most comprehensive cross-cancer analysis to date: the Pan-Cancer Atlas. Xena displays the curated genomics and clinical data generated by the Pan-Cancer Atlas consortium working groups.

Note that if you use the TCGA Pan-Cancer (PANCAN) study to analyze a specific cancer type, you will need to filter the samples down to just that cancer type.
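For readers working with a downloaded sample table rather than in the browser, the filtering step can be sketched as follows. This is a minimal illustration only: the rows are made up, and the column names `sample` and `cancer_type` and the helper `filter_by_cancer_type` are assumptions for this example, not names taken from the study's files.

```python
import csv
import io

# Made-up rows standing in for a downloaded pan-cancer sample table.
# The column names "sample" and "cancer_type" are assumptions for this sketch.
SAMPLE_TSV = (
    "sample\tcancer_type\n"
    "SAMPLE-A\tBRCA\n"
    "SAMPLE-B\tLUAD\n"
    "SAMPLE-C\tBRCA\n"
)


def filter_by_cancer_type(tsv_text, cancer_type, column="cancer_type"):
    """Return the rows whose `column` value equals `cancer_type`."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row for row in reader if row[column] == cancer_type]


# Keep only the samples from one cancer type before running any analysis.
brca_rows = filter_by_cancer_type(SAMPLE_TSV, "BRCA")
brca_samples = [row["sample"] for row in brca_rows]
print(brca_samples)
```

Check the header of your own file and pass the actual column name through the `column` argument, since it may differ from the one assumed here.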

If you don't want to filter ...

Our second most recommended datasets are the cancer-specific GDC TCGA studies. These avoid the need to filter down to a single cancer type and contain harmonized data from the Genomic Data Commons.

{% hint style="info" %} GDC Data Hub {% endhint %}

Differences between the GDC and the legacy TCGA data

More information comparing the data in the GDC to the legacy TCGA data can be found here:

{% embed url="https://gdc.cancer.gov/about-data/publications/HG38QC" %}

Choosing a study by type of data

The table below assumes that you are interested in TCGA data. These data types may also appear in other studies, but the studies listed here are the ones we recommend.

| Data type | Study | Dataset name | Menu |
| --- | --- | --- | --- |
| Transcript expression | TCGA Pan-Cancer (PANCAN) | TOIL Transcript expression | Advanced |
| lncRNA expression | TCGA Pan-Cancer (PANCAN) | TOIL Gene expression | Advanced |
| Exon expression | legacy TCGA datasets (per cancer type) | Exon expression | Advanced |
| miRNA expression | TCGA Pan-Cancer (PANCAN) | Batch Effects normalized miRNA data | Advanced |
| DNA methylation | Any | DNA methylation | Advanced |
| ATAC-seq | GDC Pan-Cancer (PANCAN) | ATAC-seq | Advanced |
| Varied Survival endpoints | TCGA Pan-Cancer (PANCAN) | NA (run KM plot) | -- |

Choosing a study based on a specific analysis or sample type

| Analysis or sample type | Study |
| --- | --- |
| Compare Tumor vs Normal | TCGA, TARGET, GTEx |
| GRCh38 coordinates | Any GDC study |
| Cell Line | CCLE |
| Disease specific survival, disease free survival, progression free survival | TCGA Pan-Cancer (PANCAN) |