CNVs Dataset and Analysis. Prepared by: Mohammed Abdulghani Taha. Supervised by: Assist. Prof. Gokmen Altay
What are CNVs?
Copy number variations (CNVs) are large regions of the genome that have been deleted or duplicated on certain chromosomes [4]. For example, a chromosome that normally has sections in the order A-B-C-D might instead have sections A-B-C-C-D (a duplication of "C") or A-B-D (a deletion of "C") [4].
Quantification of dataset
Array CGH uses a number of probes, each containing a small DNA fragment. Measuring the fluorescence intensity at each probe yields a vector $V = (v_1, v_2, \ldots, v_n)$, where $v_i$ is the log2 ratio of the target genome to the reference genome at the $i$-th probe: $v_i = \log_2\left(\frac{\text{fluorescence intensity in the target genome}}{\text{fluorescence intensity in the reference genome}}\right)$
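The log-ratio vector defined above can be sketched in a few lines of Python. The function name `log2_ratios` and the intensity values are hypothetical and used only for illustration; real array CGH pipelines typically also apply background correction first, which is not shown here.

```python
import math

def log2_ratios(target, reference):
    """Return the per-probe vector V of log2(target / reference)."""
    if len(target) != len(reference):
        raise ValueError("target and reference need one value per probe")
    return [math.log2(t / r) for t, r in zip(target, reference)]

# Hypothetical fluorescence intensities for four probes (assumed values).
target_intensity = [200.0, 100.0, 50.0, 400.0]
reference_intensity = [100.0, 100.0, 100.0, 100.0]

V = log2_ratios(target_intensity, reference_intensity)
print(V)  # [1.0, 0.0, -1.0, 2.0]
```

A value of 0 means the target and reference intensities match at that probe; positive values suggest extra copies in the target and negative values suggest fewer.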
Pre-processing
Normalization, segmentation, calling.
Normalization
The aim of normalization is to make the log2-ratios from different hybridizations comparable [10]. Types of normalization [10]:
Median normalization, mode normalization, spatial normalization.
Median Normalization
Median normalization must be applied after either mean normalization or standard deviation normalization. The normalized data sets from each microarray must be compiled into a matrix. Each microarray provides 2 data sets for analysis; this experiment used 2 microarrays, giving 4 data sets.
X denotes a gene, N is the number of data sets used, and P is the number of genes in each microarray. $M_1$ is the median red intensity over the values $X_{11}, \ldots, X_{1N}$. $M_m$ is then calculated as the median of all the combined red medians $M_1, \ldots, M_P$. $A_1$ is the median of all the expression ratio values in Data Set #1.
Each gene's expression ratio was then multiplied by a scaling ratio specific to its data set:
Data Set #1: Ratio = Mm / A1. Data Set #2: Ratio = Mm / A2. Data Set #3: Ratio = Mm / A3. Data Set #4: Ratio = Mm / A4.
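As a minimal sketch of the scaling step, the Python below follows the definitions given in these slides: per-gene red medians M1..MP, their overall median Mm, per-data-set ratio medians A1..AN, and a factor Mm / Ak applied to every ratio in data set k. The function name `median_normalize`, the row/column layout, and all numbers are assumptions made for illustration, not the data from the experiment.

```python
from statistics import median

def median_normalize(red, ratios):
    """Scale each data set's expression ratios by Mm / Ak.

    red[i][k]    : red intensity of gene i in data set k
    ratios[i][k] : expression ratio of gene i in data set k
    """
    gene_medians = [median(row) for row in red]   # M1 .. MP
    mm = median(gene_medians)                     # Mm
    n_sets = len(ratios[0])
    # A1 .. AN: median expression ratio of each data set (column).
    set_medians = [median(row[k] for row in ratios) for k in range(n_sets)]
    factors = [mm / a for a in set_medians]       # Mm / Ak
    normalized = [[row[k] * factors[k] for k in range(n_sets)]
                  for row in ratios]
    return normalized, factors, mm

# Hypothetical values: 3 genes (rows) in 4 data sets (columns).
red = [[100, 120, 80, 110],
       [200, 210, 190, 205],
       [50, 55, 45, 60]]
ratios = [[1.0, 2.0, 0.5, 1.5],
          [2.0, 4.0, 1.0, 3.0],
          [0.5, 1.0, 0.25, 0.75]]

normalized, factors, mm = median_normalize(red, ratios)
for k, factor in enumerate(factors, start=1):
    print(f"Data Set #{k}: Ratio = {factor}")
```

After scaling, every data set has the same median ratio (Mm), which is what makes the data sets comparable. The sketch assumes no data set has a median ratio of zero.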
Segmentation
Segmentation divides the genome into contiguous segments. Clones that belong to the same segment have the same copy number. The purposes of segmentation are noise reduction, detection of aberrations (loss, normal, gain), and breakpoint analysis [10].
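To make the idea of contiguous segments and breakpoints concrete, here is a deliberately naive sketch: it starts a new segment whenever a probe's log2-ratio differs from the running mean of the current segment by more than a threshold. This is not one of the established segmentation algorithms (such as circular binary segmentation); the function name `naive_segments` and the `jump` threshold are assumptions chosen for illustration.

```python
def naive_segments(values, jump=0.5):
    """Split log2-ratios into contiguous segments.

    Returns (start_index, end_index, segment_mean) tuples. A breakpoint
    is placed where a value departs from the current segment's running
    mean by more than `jump` (an assumed, illustrative threshold).
    """
    if not values:
        return []
    segments = []
    start, total = 0, values[0]
    for i in range(1, len(values)):
        mean = total / (i - start)
        if abs(values[i] - mean) > jump:
            segments.append((start, i - 1, mean))  # close current segment
            start, total = i, values[i]            # open a new one
        else:
            total += values[i]
    segments.append((start, len(values) - 1, total / (len(values) - start)))
    return segments

# Hypothetical noisy log2-ratios along one chromosome.
log_ratios = [0.0, 0.1, -0.1, 1.0, 1.1, 0.9, 0.0, 0.05]
segments = naive_segments(log_ratios)
for start, end, mean in segments:
    print(f"probes {start}-{end}: mean log2-ratio {mean:.3f}")
```

Replacing each probe value by its segment mean is the noise-reduction step, and the segment boundaries are the candidate breakpoints.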
Calling
Calling is the process of categorizing the different segmentation states as loss, normal, gain, or amplification [10].
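A simple way to picture calling is thresholding each segment's mean log2-ratio. The sketch below assumes fixed cut-offs (-0.3, 0.3, and 1.0); these numbers and the function name `call_segment` are hypothetical, and practical calling methods usually estimate the states from the data rather than using fixed thresholds.

```python
def call_segment(mean_log2, loss=-0.3, gain=0.3, amplification=1.0):
    """Map a segment's mean log2-ratio to a copy-number state.

    The three thresholds are illustrative assumptions, not values
    taken from the source.
    """
    if mean_log2 >= amplification:
        return "amplification"
    if mean_log2 >= gain:
        return "gain"
    if mean_log2 <= loss:
        return "loss"
    return "normal"

# Hypothetical segment means.
segment_means = [-0.8, 0.0, 0.5, 1.4]
calls = [call_segment(m) for m in segment_means]
print(calls)  # ['loss', 'normal', 'gain', 'amplification']
```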
Analysis (Clustering)
Similarity: The copy number of a clone in two samples is in agreement if the two values are equal. Two clones in two samples are in concordance if the samples agree on which clone has the larger copy number.
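The two similarity notions can be turned into scores between 0 and 1, which could then feed a clustering method. The sketch below is one possible reading of the definitions: agreement is the fraction of clones with equal copy number, and concordance is the fraction of clone pairs that both samples order the same way (ties counted as matching only when both samples tie). The function names and copy numbers are hypothetical.

```python
from itertools import combinations

def agreement(a, b):
    """Fraction of clones whose copy number is equal in both samples."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def _sign(x):
    return (x > 0) - (x < 0)

def concordance(a, b):
    """Fraction of clone pairs ordered the same way in both samples."""
    pairs = list(combinations(range(len(a)), 2))
    same = sum(_sign(a[i] - a[j]) == _sign(b[i] - b[j]) for i, j in pairs)
    return same / len(pairs)

# Hypothetical called copy numbers for four clones in two samples.
sample_a = [2, 3, 1, 2]
sample_b = [2, 3, 2, 2]

agree = agreement(sample_a, sample_b)
concord = concordance(sample_a, sample_b)
print(agree)  # 0.75
print(round(concord, 3))  # 0.667
```

Either score can be converted to a distance (for example, 1 minus the score) before clustering; that conversion is a common convention rather than something specified in these slides.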
Clustering
Real-life dataset