Nikolaus Schultz
Marie Josée and Henry R. Kravis
Center for Molecular Oncology
Memorial Sloan Kettering Cancer Center
October 13, 2015
Visualization and Analysis
of Cancer Genomics Data
Cost of DNA Sequencing is dropping rapidly
The Hallmarks of Cancer
Hanahan and Weinberg. Cell. March 4 2011.
Cancer is a class of diseases in which a group of cells display:
uncontrolled growth
invasion that intrudes upon and destroys adjacent tissues, and
sometimes metastasis (spreading to other locations in the body via lymph or blood)
Many of these mechanisms are
known, many not. Only some
are treatable.
All these properties are caused
by genetic or epigenetic
alterations.
Can we identify the responsible
alterations in the genomes of
cancer patients?
Tumor development / Drivers versus passengers
How does a cancer cell acquire all these
different alterations?
Sequential accumulation of genomic
alterations that confer a growth
advantage (like in evolution, but faster).
Certain early events can increase the rate
of accumulation, like mutations in DNA
damage repair genes or cell-cycle
checkpoint genes (or mutagens).
Over time, many alterations develop. The
ones that confer a growth advantage are
called “drivers”, all others are
“passengers”. Can we distinguish between
them?
Identification of functional alterations in genomic data
- per disease - per gene
- per patient - per pathway
Different, recurrent ways to alter the same pathway /
process?
Many events are rare, so we need hundreds of samples
of the same disease (sub-)type to find them based on
recurrence!
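The recurrence argument above can be made concrete with a small binomial calculation. The sketch below is not from the talk: the function names, the requirement of at least 3 recurrences, and the 90% detection probability are illustrative assumptions. It treats each sample as an independent draw in which the alteration is present with a fixed frequency.

```python
from math import comb


def prob_at_least(k: int, n: int, f: float) -> float:
    """Probability of seeing an alteration in at least k of n samples,
    assuming each sample carries it independently with frequency f."""
    p_fewer = sum(comb(n, i) * f**i * (1.0 - f) ** (n - i) for i in range(k))
    return 1.0 - p_fewer


def min_samples(k: int, f: float, power: float = 0.9, n_max: int = 100_000) -> int:
    """Smallest cohort size n for which P(at least k occurrences) >= power."""
    for n in range(k, n_max + 1):
        if prob_at_least(k, n, f) >= power:
            return n
    raise ValueError("no cohort size up to n_max reaches the requested power")


if __name__ == "__main__":
    # Hypothetical alteration frequencies, from common to rare.
    for freq in (0.10, 0.05, 0.02, 0.01):
        print(f"frequency {freq:.0%}: about {min_samples(3, freq)} samples")
```

Under these assumptions the required cohort grows quickly as the alteration gets rarer, which is the point of the slide: rare events only become recurrent, and therefore detectable, in cohorts of hundreds of samples.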
Clinical applications:
Development of new prognostic tools
Identification of new treatment options
Patient-specific treatment
Utility of cancer genomics data
bioinformaticians
biologists
clinicians
2010 2011 2012 2013
Kidney clear cell
Endometrial cancer
Thyroid cancer
Head & neck squamous
Lung squamous cell carcinoma
Colorectal cancer
Breast cancer
Low grade glioma
2014
GBM Phase II
Bladder cancer
Lung adenocarcinoma
Melanoma
Prostate cancer
Stomach adenocarcinoma
+ lobular breast cancer, chromophobe kidney, papillary kidney, pancreatic, rare tumors
…
500 samples per tumor type, 10,000 tumor / normal pairs total
The Cancer Genome Atlas Project History
2008 2009
Ovarian cancer
GBM
AML
Cervical
Liver
Sarcoma
2015
Cancer Cell Line Encyclopedia (CCLE)
Broad Institute, Sanger, Washington University, etc.
Tumor sequencing in hospitals (MSKCC 500 per month)
Sources of tumor sequencing data
10,000 tumors
6,000 tumors
1,000 cell lines
5,000 tumors
>15,000 tumors
Raw data (FASTQ / BAM files)
dbGaP, CGHub, ICGC Data Portal
Processed data (gene level data, mutation calls)
TCGA Data Portal, ICGC Data Portal, Supplementary Tables
Data slices (subsets of processed data)
Data visualization and analysis tools
Data availability
bioinformaticians
biologists, clinicians
Reduction of complexity!
Most mutations found in cancer are “passengers”
Pistoia Alliance US Conference 2015 - 1.5.4 New data - Nikolaus Schultz
Driver alteration frequencies per tumor type
Rec L domain Furin-like Rec L domain Kinase domain
ERBB2 mutation hotspots across cancer types
signal
noise
ERBB2 mutation hotspots across cancer types
S310F: Bladder 1, Breast 3, Cervical 1, Colorectal 2, Lung adeno 2, Ovarian 2, Stomach 1; CCLE 1 (bladder)
L755S/M/P/W: Breast 4, Colorectal 2, Endometrial 1, Kidney (pap) 1, Melanoma 1, Stomach 1; CCLE 3 (colorectal, stomach, brain)
V777L/A: Breast 1, Colorectal 2, GBM 2
V842I: Breast 1, Colorectal 4, Endometrial 2; CCLE 4 (lung, ovarian, endometrial)
R678Q: Breast 1, Colorectal 1, Endometrial 1, Stomach 2; CCLE 1 (colorectal)
774-776ins: Lung adeno 6; CCLE 1 (lung)
Greulich et al.
PNAS 2012.
Kancha et al.
PLoS ONE 2011.
Bose et al.
Cancer Discovery 2012.
Bose et al.
Cancer Discovery 2012.
Bose et al.
Cancer Discovery 2012.
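The ERBB2 hotspot counts listed above illustrate why pooling across cancer types helps: within any single tumor type a hotspot is seen only a handful of times, but the pooled totals stand out. The sketch below simply tallies the tumor counts transcribed from the slide. The variable and function names are hypothetical, and leaving out the CCLE cell-line counts is a choice made here, not something the talk prescribes.

```python
# Tumor-sample counts per ERBB2 hotspot, transcribed from the slide.
# CCLE cell-line counts are left out so that only patient tumors are pooled.
HOTSPOT_COUNTS = {
    "S310F": {"Bladder": 1, "Breast": 3, "Cervical": 1, "Colorectal": 2,
              "Lung adeno": 2, "Ovarian": 2, "Stomach": 1},
    "L755S/M/P/W": {"Breast": 4, "Colorectal": 2, "Endometrial": 1,
                    "Kidney (pap)": 1, "Melanoma": 1, "Stomach": 1},
    "V777L/A": {"Breast": 1, "Colorectal": 2, "GBM": 2},
    "V842I": {"Breast": 1, "Colorectal": 4, "Endometrial": 2},
    "R678Q": {"Breast": 1, "Colorectal": 1, "Endometrial": 1, "Stomach": 2},
    "774-776ins": {"Lung adeno": 6},
}


def summarize(counts):
    """Return (hotspot, pooled total, number of tumor types, largest
    single-type count) tuples, sorted by pooled total, highest first."""
    rows = [
        (hotspot, sum(by_type.values()), len(by_type), max(by_type.values()))
        for hotspot, by_type in counts.items()
    ]
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


if __name__ == "__main__":
    for hotspot, total, n_types, largest in summarize(HOTSPOT_COUNTS):
        print(f"{hotspot:12s} pooled={total:2d} tumor types={n_types} "
              f"largest single type={largest}")
```

Comparing the pooled total with the largest single-type count for each hotspot shows how much of the signal would be missed in a per-disease analysis.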
cBioPortal for Cancer Genomics: Data to knowledge
Tumor DNA → DNA sequencer, microarrays … → Tumor and normal sequences → Data
Intuitive interface, quick response time, reduction of complexity
Alteration types and thresholds can be customized for
each gene.
Reduction of complexity: Event calls
Which genes are altered in which samples?
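The idea of event calls, reducing several data types to a per-gene, per-sample answer using gene-specific settings, can be sketched as follows. This is not cBioPortal's implementation: the GeneRule class, the field names, the discrete copy-number levels (-2 to 2), and the z-score threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class GeneRule:
    """Hypothetical per-gene settings: which alteration types count."""
    call_mutation: bool = True
    call_amp: bool = True          # assumed discrete copy-number level >= 2
    call_homdel: bool = True       # assumed discrete copy-number level <= -2
    expr_z_threshold: Optional[float] = None  # None: ignore expression


def call_events(profile: dict, rule: GeneRule) -> List[str]:
    """Reduce one gene's measurements in one sample to a list of events."""
    events = []
    if rule.call_mutation and profile.get("mutated", False):
        events.append("MUT")
    cna = profile.get("cna", 0)
    if rule.call_amp and cna >= 2:
        events.append("AMP")
    if rule.call_homdel and cna <= -2:
        events.append("HOMDEL")
    z = profile.get("expr_z")
    if rule.expr_z_threshold is not None and z is not None:
        if z >= rule.expr_z_threshold:
            events.append("EXP_UP")
        elif z <= -rule.expr_z_threshold:
            events.append("EXP_DOWN")
    return events


def altered_matrix(samples: Dict[str, dict],
                   rules: Dict[str, GeneRule]) -> Dict[str, Dict[str, List[str]]]:
    """gene -> sample -> events; an empty list means 'not altered'."""
    return {
        gene: {sid: call_events(genes.get(gene, {}), rule)
               for sid, genes in samples.items()}
        for gene, rule in rules.items()
    }


# Made-up example data for three samples and two genes.
samples = {
    "S1": {"ERBB2": {"cna": 2, "mutated": False, "expr_z": 3.1},
           "TP53": {"cna": 0, "mutated": True}},
    "S2": {"ERBB2": {"cna": 0, "mutated": True, "expr_z": 0.2},
           "TP53": {"cna": -2, "mutated": False}},
    "S3": {"ERBB2": {"cna": 1, "mutated": False, "expr_z": 1.5}},
}
rules = {
    "ERBB2": GeneRule(expr_z_threshold=2.0),  # also count overexpression
    "TP53": GeneRule(call_amp=False),         # ignore amplification here
}
matrix = altered_matrix(samples, rules)
```

Each gene row of the resulting matrix answers the slide's question directly: a sample is altered for that gene if its event list is non-empty.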
Data visualization and exploration in cBioPortal
[Diagram: data sources feeding cBioPortal]
Genomic data: MSK-IMPACT (clinical), CMO (research)
Foundation Medicine
TCGA, ICGC, other public data
MSKCC clinical data
OncoKB: Annotation of variant effects, treatment
Clinical annotation
Step 1: Manual
Step 2: Automated via institutional databases
OncoKB: Knowledgebase of oncogenic mutations
Variant effect, NCCN guidelines, standard therapy, investigational therapy, clinical trials
Live demonstration of cBioPortal
http://cbioportal.org/
cBioPortal usage and interest
cbioportal.org
>5,000 unique users per week, doubling every year
Numerous academic installations of cBioPortal:
Dana-Farber, Princess Margaret, CHOP, Weill Cornell, Fred Hutchinson, UCSC, Columbia,
NYU, NY Genome Center, British Columbia, University of Michigan, SickKids, Vanderbilt,
Emory, UNC, University of Pittsburgh, CRUK, EMBL, Charité Berlin, institutions in Japan,
China, …
Interest from several people in modifying or customizing the code and in contributing new
features
Interest from pharmaceutical companies and others in using cBioPortal
● For internal data analysis (large pharma)
● In customer-facing applications (smaller service companies)
Switch to open source
cBioPortal source code is available via GitHub:
https://github.com/cBioPortal/cbioportal
AGPL license v3 (Affero GPL):
A GPL variant; the main difference is that letting users interact with a modified
version over a network, not only distributing it, triggers the copyleft requirements
Impact on cancer research, patient treatment, drug development through:
• More robust and flexible software
• Accelerated development of new features
• Wider user base, collaborative culture
Core cBioPortal Development group
Memorial Sloan Kettering Cancer Center
Nikolaus Schultz, Chris Sander, Benjamin Gross, JJ Gao
Dana-Farber Cancer Institute
Ethan Cerami
Princess Margaret Cancer Centre
Trevor Pugh, Stuart Watt
Re-uniting two cBioPortal founders
Coordination of architectural decisions, feature development, merges, etc.
The Hyve now offering commercial services around cBioPortal
Summary
Rapidly growing body of cancer genomics data (public and private)
Reduction of complexity can make these data accessible and interpretable
cBioPortal allows access to cancer genomics data sets:
cbioportal.org: public site
via GitHub: install local versions
cBioPortal is now fully open source
software
data pipelines and data sets coming soon
commercial support available
Still exploring pre-competitive funding options
Acknowledgements
CMO
Cyriac Kandoth
William Lee
Rajmohan Murali
Nicholas D. Socci
Barry Taylor
Michael Berger
Agnes Viale
David B. Solit
Michael Trapani
Ederlinda Paraiso
Molecular Diagnostics
Ahmet Zehir
Aijaz Syed
Donavan Cheng
Michael Berger
Maria Arcila
Marc Ladanyi
Information Systems
Mike Eubanks
Stu Gardos
cBioPortal
JianJiong Gao
Benjamin Gross
Yichao Sun
Hongxin Zhang
Fred Criscuolo
Dong Li
Adam Abeshouse
Ritika Kundra
Annice Chen
Chris Sander
Onur Sumer
Arman Aksoy
Ethan Cerami
Knowledgebase
Debyani Chakravarty
Sarah Phillips
Julia Rudolph
Bioinformatics Core
Joanne Edington
Demo slides
End of Live Demo
