Cnvs Dataset and Analysis: Prepared By: Mohammed Abdulghani Taha Supervised By: Assist. Prof. Gokmen Altay

This document summarizes a dataset and analysis of copy number variations (CNVs). It describes several techniques used to discover CNVs, focusing on array comparative genomic hybridization (aCGH). The key steps of an aCGH analysis are presented: normalization, segmentation to identify aberrant regions, and calling to categorize segments. The document then outlines analysis of a real breast cancer aCGH dataset to test associations between CNV regions and estrogen receptor status.

CNVs dataset and analysis

Prepared by: Mohammed Abdulghani Taha Supervised by: Assist. Prof. Gokmen Altay

What are CNVs?
Copy number variations (CNVs) are large regions of the genome that have been deleted or duplicated on certain chromosomes [4]. For example, a chromosome that normally has sections in the order A-B-C-D might instead have sections A-B-C-C-D (a duplication of "C") or A-B-D (a deletion of "C") [4].

Techniques to discover CNVs

Several techniques have been used to discover CNVs [3]:
ROMA (Representational Oligonucleotide Microarray Analysis)
Fosmid Paired-End Sequencing
SKY (Spectral Karyotyping) or FISH (Fluorescence In Situ Hybridization)
CGH (Comparative Genomic Hybridization)
RT/Q-PCR (Real Time Quantitative PCR)
aCGH (Array Comparative Genomic Hybridization)

aCGH (Array Comparative Genomic Hybridization)

aCGH has become one of the powerful new array-based methods [3]. In an aCGH experiment, a DNA sample of interest (the test sample) and a reference sample are mixed [6]. The combined sample is then hybridized to the microarray and imaged [6].

The normal microarray procedure is followed [3]. Based on the color of each dot, the intensity of that color, and a complicated algorithm, the amount of duplication or deletion can be estimated [7]. A higher fluorescence intensity ratio indicates that the target genome contains more copies.

Quantification of dataset
Array CGH consists of a number of probes, and each probe contains a small DNA fragment. Array CGH approaches provide a vector $V = (v_1, v_2, \ldots, v_n)$, where $v_i$ is the log2 ratio of the target genome to the reference genome for the $i$th probe, obtained by measuring the fluorescence intensity at each probe:
$v_i = \log_2\left(\frac{\text{fluorescence intensity in the target genome}}{\text{fluorescence intensity in the reference genome}}\right)$
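The quantification above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `log2_ratios` and the intensity values are made up for the example and do not come from the slides.

```python
import math

def log2_ratios(target, reference):
    """Return v_i = log2(target_i / reference_i) for each probe."""
    if len(target) != len(reference):
        raise ValueError("target and reference need one intensity per probe")
    return [math.log2(t / r) for t, r in zip(target, reference)]

# Hypothetical fluorescence intensities for three probes.
target_intensity = [1000.0, 2000.0, 500.0]
reference_intensity = [1000.0, 1000.0, 1000.0]

V = log2_ratios(target_intensity, reference_intensity)
print(V)  # [0.0, 1.0, -1.0]
```

A value of 0 means equal intensity in both genomes, a positive value points towards extra copies in the target genome, and a negative value points towards a loss.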

Pre-processing
The pre-processing steps are, in order: normalisation, segmentation, calling.

Normalisation
The aim of normalization is to make the log2 ratios from different hybridizations comparable [10]. Types of normalization [10]:
Median normalization
Mode normalization
Spatial normalization

Median Normalization
Median normalization must be used after either mean normalization or standard deviation normalization. The normalized data sets from each microarray must be compiled into a matrix. Each microarray provides 2 data sets for analysis. For this experiment, 2 microarrays were used, giving 4 different data sets.

X denotes a gene, N is the number of data sets used, and P is the number of genes in each microarray. $M_1$ is the median red intensity for genes $X_{11}, \ldots, X_{1N}$. $M_m$ is then calculated as the median of all the combined red medians $M_1, \ldots, M_P$. $A_1$ is the median of all the expression ratio values for Data Set #1; $A_2$, $A_3$, and $A_4$ are the corresponding medians for the other data sets.

Each gene's expression ratio was then multiplied by the ratio for its data set:
$\text{Ratio}_k = M_m / A_k$, for $k = 1, 2, 3, 4$.
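The scaling described above can be sketched as follows. This is one reading of the slide, not a reference implementation: the function name `median_scale`, the list-of-lists layout (one inner list per data set, one entry per gene), and the numbers are all assumptions made for illustration.

```python
from statistics import median

def median_scale(ratios, red):
    """Scale each data set's expression ratios by M_m / A_k.

    ratios[k][i] and red[k][i] hold the expression ratio and the red
    intensity of gene i in data set k (assumed layout).
    """
    n_genes = len(red[0])
    # M_i: median red intensity of gene i across the data sets.
    gene_medians = [median(ds[i] for ds in red) for i in range(n_genes)]
    # M_m: median of all the per-gene red medians.
    m_m = median(gene_medians)
    scaled = []
    for ds in ratios:
        a_k = median(ds)  # A_k: median expression ratio of data set k
        scaled.append([r * m_m / a_k for r in ds])
    return scaled

# Two hypothetical data sets with three genes each.
red = [[100, 200, 300], [120, 180, 330]]
ratios = [[0.5, 1.0, 2.0], [1.0, 2.0, 4.0]]
scaled = median_scale(ratios, red)
print(scaled)  # [[95.0, 190.0, 380.0], [95.0, 190.0, 380.0]]
```

After scaling, every data set has the same median ratio ($M_m$), which is what makes the data sets comparable.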

Segmentation

Segmentation divides the genome into contiguous segments. Clones that belong to the same segment have the same copy number. The purposes of segmentation are noise reduction, detection of aberrations (loss, normal, gain), and breakpoint analysis [10].
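To make the idea of a breakpoint concrete, here is a deliberately simple sketch that finds the single split of a log2-ratio profile that minimises the within-segment squared error, then replaces each clone by its segment mean. It is not the segmentation algorithm used by any particular package; the function names and the profile values are hypothetical.

```python
from statistics import mean

def sse(values):
    """Sum of squared deviations from the segment mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def best_breakpoint(log_ratios):
    """Index k that splits the profile into [0:k] and [k:] with minimal total SSE."""
    best_k, best_cost = None, float("inf")
    for k in range(1, len(log_ratios)):
        cost = sse(log_ratios[:k]) + sse(log_ratios[k:])
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k

# Hypothetical profile: four near-normal clones followed by four gained clones.
profile = [0.02, -0.05, 0.01, 0.03, 0.61, 0.55, 0.58, 0.60]
k = best_breakpoint(profile)

# Noise reduction: every clone takes the mean of its segment.
segment_means = [mean(profile[:k])] * k + [mean(profile[k:])] * (len(profile) - k)
print(k)  # 4
```

Real profiles contain several breakpoints, so practical methods apply this kind of search repeatedly and add a statistical test to decide when to stop splitting.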

Calling

Calling is the process of categorizing the different segmentation states as loss, normal, gain, or amplification [10].
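A minimal sketch of calling, assuming fixed cut-offs on the segment mean log2 ratio. The thresholds (-0.3, 0.3, 1.0) and the function name `call_segment` are illustrative assumptions only; CGHcall, used later in this document, relies on a mixture model rather than fixed thresholds.

```python
def call_segment(mean_log2, loss=-0.3, gain=0.3, amp=1.0):
    """Map a segment mean log2 ratio to a copy number state (assumed cut-offs)."""
    if mean_log2 >= amp:
        return "amplification"
    if mean_log2 >= gain:
        return "gain"
    if mean_log2 <= loss:
        return "loss"
    return "normal"

# Hypothetical segment means.
calls = [call_segment(m) for m in [-0.8, 0.05, 0.45, 1.4]]
print(calls)  # ['loss', 'normal', 'gain', 'amplification']
```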

The pre-processed data

Analysis (Clustering)
Similarity: The copy number of a clone in two samples is in agreement if the two values are equal. Two clones in two samples are in concordance if the samples agree on which clone has the larger copy number.
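The two notions of similarity can be turned into scores between a pair of samples, for example as the fraction of clones in agreement and the fraction of clone pairs in concordance. Summarising them as fractions, treating ties as a third outcome, and the names `agreement` and `concordance` are assumptions of this sketch, not something stated on the slide.

```python
from itertools import combinations

def agreement(sample_a, sample_b):
    """Fraction of clones whose copy number is equal in the two samples."""
    matches = sum(a == b for a, b in zip(sample_a, sample_b))
    return matches / len(sample_a)

def sign(x):
    """-1, 0 or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)

def concordance(sample_a, sample_b):
    """Fraction of clone pairs where both samples agree on which clone is larger."""
    pairs = list(combinations(range(len(sample_a)), 2))
    concordant = sum(
        sign(sample_a[i] - sample_a[j]) == sign(sample_b[i] - sample_b[j])
        for i, j in pairs
    )
    return concordant / len(pairs)

# Hypothetical copy numbers of four clones in two samples.
s1 = [2, 3, 1, 2]
s2 = [2, 4, 1, 3]
print(agreement(s1, s2))    # 0.5
print(concordance(s1, s2))  # 5 of the 6 clone pairs are concordant
```

Either score can then be converted into a distance (for example, one minus the score) and passed to a standard clustering method.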

Clustering

Real-life dataset

Analysis of multiple CNVs

This section gives an example of association tests involving several CNVs, in which data from a CGH array are analysed. Real data sets are used to illustrate how to analyze CNV data. Start by loading the package CNVassoc:
> library(CNVassoc)

and some required libraries:

> library(xtable)

Step 1. Use any aCGH calling procedure; here CGHcall is used.
Step 2. Build blocks/regions of consecutive probes with similar signatures (CGHregions).
Step 3. Use the signature that occurs most often in a block to perform the association test; here multiCNVassoc is used.
Step 4. Correct for multiple testing, considering the dependency among signatures; here getPvalBH is used.
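Step 3 can be illustrated with a small sketch that picks the most frequent signature within a block. Here a signature is assumed to be the tuple of calls of one probe across the samples, coded as -1 (loss), 0 (normal) and 1 (gain). This coding, the function name `representative_signature`, and the data are hypothetical and do not describe how the R packages store their results.

```python
from collections import Counter

def representative_signature(block):
    """Return the signature that occurs most often among the probes of a block."""
    return Counter(block).most_common(1)[0][0]

# Hypothetical block of four consecutive probes measured in three samples.
block = [
    (-1, 0, 1),
    (-1, 0, 1),
    (-1, 0, 0),
    (-1, 0, 1),
]
rep = representative_signature(block)
print(rep)  # (-1, 0, 1)
```

Testing one representative signature per block, rather than every probe, is what reduces the number of tests before the multiple-testing correction in Step 4.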

To illustrate, they apply these steps to the breast cancer data studied by Neve et al. The data consist of CGH arrays of 1 Mb resolution and are available from Bioconductor (http://www.bioconductor.org). The authors chose 50 samples. In this example, the association between estrogen receptor positivity (a dichotomous variable; 0: negative, 1: positive) and CNVs was tested.

The original data set contained 2621 probes. The data were reduced to 459 blocks after applying CGHcall and CGHregions. The data are saved in an object called NeveData, which is a list with two components. The data can be loaded as usual:
> data(NeveData)
> intensities <- NeveData$data
> pheno <- NeveData$pheno

The calling can be performed with the CGHcall package by using the following instructions: dontrun{} This process takes about 20 minutes. Alternatively, the authors saved the final object of class cghCall, which can be loaded as
> data(NeveCalled)

The CGHcall function does not estimate the underlying number of copies for each segment; instead it assigns the underlying status: loss, normal or gain. The probabilities of each status are extracted by

> probs <- getProbs(NeveCalled)

This is a data frame that looks like this:
> probs[1:5, 1:7]
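How status probabilities relate to a call can be sketched as below. The output of `probs[1:5, 1:7]` is not reproduced in the slides, so the layout used here (one triple of probabilities for loss, normal and gain per probe and sample) and the name `most_likely_status` are assumptions for illustration only.

```python
STATUSES = ("loss", "normal", "gain")

def most_likely_status(probabilities):
    """Return the status with the highest probability.

    probabilities: (P(loss), P(normal), P(gain)) for one probe in one sample.
    """
    best = max(range(len(STATUSES)), key=lambda k: probabilities[k])
    return STATUSES[best]

# Hypothetical probabilities for three probes of one sample.
example = [(0.90, 0.09, 0.01), (0.05, 0.85, 0.10), (0.02, 0.28, 0.70)]
statuses = [most_likely_status(p) for p in example]
print(statuses)  # ['loss', 'normal', 'gain']
```

Keeping the probabilities instead of only the most likely status lets the later association test take the uncertainty of each call into account.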

The next step is to determine the regions that are recurrent or common among samples. This is done with the CGHregions function, by executing:
dontrun{
library(CGHregions)
NeveRegions <- CGHregions(NeveCalled)
}

This process takes about 3 minutes. The result is stored in the object NeveRegions, which can be loaded as usual:
> data(NeveRegions)
Now we have to get the posterior probabilities for each block/region:
> probsRegions <- getProbsRegions(probs, NeveRegions, intensities)

Finally, the association between each region and estrogen receptor positivity can be analyzed with the multiCNVassoc function:
> pvals <- multiCNVassoc(probsRegions, formula = "pheno~CNV",
+     model = "mult", num.copies = 0:2, cnv.tol = 0.01)

The function getPvalBH produces the FDR-adjusted p-values:
> pvalsBH <- getPvalBH(pvals)
> head(pvalsBH)
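For readers unfamiliar with FDR adjustment, the standard Benjamini-Hochberg procedure is sketched below in Python. This is the textbook procedure under the assumption that it conveys the general idea; it may differ from what getPvalBH computes internally, and the function name and p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values for four regions.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
print([round(p, 4) for p in adj])  # [0.02, 0.04, 0.04, 0.02]
```

Regions whose adjusted p-value falls below the chosen FDR level (for example 0.05) are reported as associated with the phenotype.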
