
R Tutorial

The document describes the process of analyzing microarray data including: 1) Normalizing the intensity data using the rma method from the affy package and writing the normalized data to a text file. 2) Filtering the normalized data using the expFilter function from the EMA package. 3) Clustering the filtered data using hierarchical clustering with Ward's method.

Uploaded by: Kriti Chopra
Copyright: © Attribution Non-Commercial (BY-NC)

> library(ggplot2)
> ggplot(kcd, aes(x=time)) + geom_line(aes(y=pro1), colour="red", na.rm=TRUE) + geom_line(aes(y=pro2), colour="green", na.rm=TRUE) + labs(x="time", y="expression")

Installing R and Bioconductor

1. Download R and RStudio, selecting any CRAN mirror (for example IITM, India).

2. To install Bioconductor packages, change the repositories in RStudio. Command in the R console:
> setRepositories(graphics = getOption("menu.graphics"), ind = NULL, addURLs = character())
Select options 1 to 6 to get the Bioconductor packages. To install a package, select the repository and enter the names of the packages required.

3. Analysing microarray data

a) Download the .tar file from the NCBI Gene Expression Omnibus (GEO). Extract the .tar file, then unzip the .cel.gz files and save them to a directory. Change the working directory in RStudio to the directory where you saved your .cel files.

To view the files in your directory:
> list.files()

To read the microarray data, load the affy package:
> library(affy)

To load the microarray intensity sets, define an object object_name and give the following command (ReadAffy() reads the .cel files in the working directory):
> object_name <- ReadAffy()

To display the files contained in the object:
> object_name

b) Data preprocessing: normalization

To normalize the data, any of three methods can be used: rma, gcrma, or mas5. Here we use rma:
> eset.rma <- rma(object_name)

To write the result to a text file (the file name must be quoted):
> write.exprs(eset.rma, file="filename.txt")

The normalized data should be converted to matrix form:
> object_name2 <- exprs(eset.rma)

** In case we wish to compare normalized and pre-normalized data, run the same background correction and summarization with normalization switched off:
> object_name3 <- expresso(object_name, bgcorrect.method="rma", normalize=FALSE, pmcorrect.method="pmonly", summary.method="medianpolish")

*** In case the file we have to work on is not a .cel file, we have to convert our tab-delimited file to a matrix with this command:
> exprs <- as.matrix(read.table("abc.txt", header=TRUE, sep="\t", row.names=1, as.is=TRUE))

c) Data filtering

Load the EMA package and filter the normalized matrix:
> library(EMA)
> object_name2.f <- expFilter(object_name2)

To check the dimensions of the matrix created after data filtering:
> dim(object_name2.f)

## To find the top overexpressed genes ##
These functions come from the limma package. Note that no design matrix is passed to lmFit here, so limma fits an intercept only and treats all arrays as replicates.
> library(limma)
> fit <- lmFit(object_name2)
> fit <- eBayes(fit)
> tt <- topTable(fit)

d) Clustering

Define a new object name and choose the clustering method:
> object_name4 <- clustering(data=object_name2.f, metric="pearson", method="ward")
> clustering.plot(tree=object_name4, title="RMA Data - filtered")

e) Heat maps

To select a minimum set of genes, pass either thres.num (here 100) or thres.diff (a value of your choice), not both in one argument:
> mvgenes <- genes.selection(object_name2.f, thres.num=100)
> mvgenes <- genes.selection(object_name2.f, thres.diff=value)

Clustering (R is case-sensitive, so the object names must match exactly in the plot call):
> ob_name1 <- clustering(data=object_name2.f[mvgenes,], metric="pearson", method="ward")
> ob_name2 <- clustering(data=object_name2.f[mvgenes,], metric="pearsonabs", method="ward")
> clustering.plot(tree=ob_name1, tree.sup=ob_name2, data=object_name2.f[mvgenes,], names.sup=FALSE, trim.heatmap=0.99)

4. Annotation of genes
5. Venn diagrams
6. Gene networks
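Looking back at step 3(d), the idea behind clustering samples with a Pearson metric and Ward's method can be sketched in base R alone. This is only an illustration under stated assumptions, not what EMA does internally: the matrix `toy` is simulated and stands in for object_name2.f, the distance is taken as 1 minus the Pearson correlation, and base R's "ward.D2" linkage is chosen here as one implementation of Ward's method.

```r
# Simulated toy matrix (NOT real data): 200 genes (rows) x 6 samples (columns).
# It stands in for the filtered matrix object_name2.f.
set.seed(1)
n_genes <- 200
baseline <- rnorm(n_genes, mean = 8, sd = 2)           # per-gene baseline level
toy <- matrix(baseline, nrow = n_genes, ncol = 6) +
  matrix(rnorm(n_genes * 6, sd = 0.5), nrow = n_genes) # per-array noise
dimnames(toy) <- list(paste0("gene", 1:n_genes), paste0("sample", 1:6))

# Raise the first 100 genes in samples 4-6 so that two sample groups exist.
toy[1:100, 4:6] <- toy[1:100, 4:6] + 4

# Pearson correlation distance between samples (columns): d = 1 - r.
sample_dist <- as.dist(1 - cor(toy, method = "pearson"))

# Hierarchical clustering with Ward's minimum-variance linkage.
tree <- hclust(sample_dist, method = "ward.D2")

# Cut the tree into two groups and draw the dendrogram.
groups <- cutree(tree, k = 2)
print(groups)
plot(tree, main = "Toy data: Pearson distance, Ward linkage")
```

To cluster genes instead of samples, the same calls can be applied to the transposed matrix, t(toy).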

## Additional maps for plotting ##

In the commands below, PQseries.data is the raw microarray object as read with ReadAffy(); PQseries.matrix.prenorm and PQseries.matrix.normal are the expression matrices before and after normalization.

1. To view the raw .cel files and store the images in a PDF file:
> pdf("filename_images.pdf")
> par(mfrow=c(2,3))
> image(PQseries.data)
> graphics.off()

2. Boxplots before and after normalization:
> pdf("PQseries_Normal.pdf")
> par(mfrow=c(2,1)) # show 2 plots on one page
> boxplot(PQseries.matrix.prenorm, col=1:6)
> title("Before Normalization")
> boxplot(PQseries.matrix.normal, col=1:6)
> title("After Normalization")
> graphics.off()

3. Histogram:
> pdf("PQseries_hist.pdf")
> hist(PQseries.data, col=1:6, main="Histogram of log2 intensities")
> legend(13, 1.5, sampleNames(PQseries.data), col=1:6, lty=1:6, lwd=2)
> graphics.off()

4. Relative log expression (RLE):
> library(affyPLM)
> Pset <- fitPLM(PQseries.data)
> pdf("PQseries_RLE.pdf")
> par(mfrow=c(1,1))
> RLE(Pset, las=2, col=1:6)
> title("RLE for PQseries data")
> graphics.off()

5. RNA degradation:
> pdf("PQseries_RNAdeg.pdf")
> rnaDeg <- AffyRNAdeg(PQseries.data)
> plotAffyRNAdeg(rnaDeg, col=1:6, lty=1:6)
> graphics.off()

6. NUSE (Normalized Unscaled Standard Errors):
> library(affyPLM)
> Pset <- fitPLM(PQseries.data)
> pdf("PQseries_NUSE.pdf")
> par(mfrow=c(1,1))
> NUSE(Pset)
> title("NUSE for PQseries dataset")
> graphics.off()

7. MA plot. Caution: this is memory intensive.
> pdf("PQseries_MAplot.pdf")
> par(mfrow=c(2,3))
> MAplot(eset.rma)
> title("MA Plot")
> graphics.off()

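To make clearer what the RLE and MA plots above display, here is a minimal base R sketch that computes the underlying quantities by hand. It rests on assumptions that are not part of the original notes: the matrix `log_expr` is simulated log2 data, and each array is compared against a reference built from the per-probeset median across arrays. The affyPLM functions work from a fitted probe-level model, so their numbers will differ from this simplified version.

```r
# Simulated toy log2 expression matrix (NOT real data): 100 probesets x 4 arrays.
set.seed(42)
log_expr <- matrix(rnorm(100 * 4, mean = 8, sd = 1.5), nrow = 100,
                   dimnames = list(paste0("ps", 1:100), paste0("array", 1:4)))

# Reference "pseudo-array": the median of each probeset across all arrays.
ref <- apply(log_expr, 1, median)

# RLE: each log2 value minus the median of its probeset.
# For arrays of comparable quality the boxes should centre near 0.
rle_vals <- sweep(log_expr, 1, ref, FUN = "-")
boxplot(rle_vals, las = 2, col = 1:4, main = "RLE (toy data)")
abline(h = 0, lty = 2)

# MA values for one array against the reference:
# M is the log2 difference, A is the average log2 intensity.
M <- log_expr[, "array1"] - ref
A <- (log_expr[, "array1"] + ref) / 2
plot(A, M, main = "MA plot: array1 vs. median reference (toy data)")
abline(h = 0, lty = 2)
```

With a median reference, the M values of an array are the same numbers as its column of RLE values, which is why the two plots tend to flag the same problematic arrays.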