Getting Started With HISAT, StringTie, and Ballgown
A popular toolset used for analysing RNA-seq data is the tuxedo suite, which consists of TopHat and Cufflinks. The
suite provided a start-to-finish pipeline that allowed users to map reads, assemble transcripts, and perform
differential expression analyses. A newer “tuxedo suite” has been developed and is made up of three tools: HISAT,
StringTie, and Ballgown. A Nature Protocols article provides a summary of the new suite as well as a tutorial; this
post was written while I was going through the tutorial.
I worked through the tutorial on a MacBook Pro, which means that I downloaded binaries for OS X. If you’re using
some flavour of Linux, download the Linux binaries instead. The data for the tutorial is available at
ftp://ftp.ccb.jhu.edu/pub/RNAseq_protocol; you can perform a recursive download using wget to download all
the files on the FTP server. You can use your own data, but you’ll have to index the relevant reference file and prepare
your own sample text file. For this post, I used the same data as the tutorial.
    # recursive download
    wget -c -r ftp://ftp.ccb.jhu.edu/pub/RNAseq_protocol

    # move the data tarball to directory root
    mv ftp.ccb.jhu.edu/pub/RNAseq_protocol/chrX_data.tar.gz .

    # extract
    tar xzf chrX_data.tar.gz

    # check out the directory structure
    tree --charset=ascii chrX_data
    chrX_data
    |-- genes
    |   `-- chrX.gtf
    |-- genome
    |   `-- chrX.fa
    |-- geuvadis_phenodata.csv
    |-- indexes
    |   |-- chrX_tran.1.ht2
    |   |-- chrX_tran.2.ht2
    |   |-- chrX_tran.3.ht2
    |   |-- chrX_tran.4.ht2
    |   |-- chrX_tran.5.ht2
    |   |-- chrX_tran.6.ht2
    |   |-- chrX_tran.7.ht2
    |   `-- chrX_tran.8.ht2
    |-- mergelist.txt
    `-- samples
        |-- ERR188044_chrX_1.fastq.gz
        |-- ERR188044_chrX_2.fastq.gz
        |-- ERR188104_chrX_1.fastq.gz
        |-- ERR188104_chrX_2.fastq.gz
        |-- ERR188234_chrX_1.fastq.gz
        |-- ERR188234_chrX_2.fastq.gz
        |-- ERR188245_chrX_1.fastq.gz
        |-- ERR188245_chrX_2.fastq.gz
        |-- ERR188257_chrX_1.fastq.gz
        |-- ERR188257_chrX_2.fastq.gz
        |-- ERR188273_chrX_1.fastq.gz
        |-- ERR188273_chrX_2.fastq.gz
        |-- ERR188337_chrX_1.fastq.gz
        |-- ERR188337_chrX_2.fastq.gz
        |-- ERR188383_chrX_1.fastq.gz
        |-- ERR188383_chrX_2.fastq.gz
        |-- ERR188401_chrX_1.fastq.gz
        |-- ERR188401_chrX_2.fastq.gz
        |-- ERR188428_chrX_1.fastq.gz
        |-- ERR188428_chrX_2.fastq.gz
        |-- ERR188454_chrX_1.fastq.gz
        |-- ERR188454_chrX_2.fastq.gz
        |-- ERR204916_chrX_1.fastq.gz
        `-- ERR204916_chrX_2.fastq.gz

    4 directories, 36 files
A description of the data set is provided by geuvadis_phenodata.csv. Normally, you will have to prepare this file
yourself; it will be used later in the Ballgown step.
    cat chrX_data/geuvadis_phenodata.csv
    "ids","sex","population"
    "ERR188044","male","YRI"
    "ERR188104","male","YRI"
    "ERR188234","female","YRI"
    "ERR188245","female","GBR"
    "ERR188257","male","GBR"
    "ERR188273","female","YRI"
    "ERR188337","female","GBR"
    "ERR188383","male","GBR"
    "ERR188401","male","GBR"
    "ERR188428","female","GBR"
    "ERR188454","male","YRI"
    "ERR204916","female","YRI"
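If you prepare your own phenotype file, it may be worth checking that every ID in the first column has a matching pair of FASTQ files before mapping. The following is a minimal sketch, assuming the tutorial's `<id>_chrX_1.fastq.gz` / `<id>_chrX_2.fastq.gz` naming; the function names `sample_ids` and `check_pairs` are made up for this example and are not part of any of the tools.

```shell
# sample_ids: print the first column of a phenotype CSV, skipping the header
# line and stripping the double quotes around each value
sample_ids() {
  tail -n +2 "$1" | cut -d, -f1 | tr -d '"'
}

# check_pairs: list every expected FASTQ file that is missing;
# returns non-zero if at least one file was not found
check_pairs() {
  csv=$1
  dir=$2
  missing=0
  for id in $(sample_ids "$csv"); do
    for mate in 1 2; do
      file="${dir}/${id}_chrX_${mate}.fastq.gz"
      if [ ! -e "$file" ]; then
        echo "missing: $file"
        missing=1
      fi
    done
  done
  return "$missing"
}

# usage: run from the directory that contains chrX_data
if [ -f chrX_data/geuvadis_phenodata.csv ]; then
  check_pairs chrX_data/geuvadis_phenodata.csv chrX_data/samples \
    && echo "all FASTQ pairs present"
fi
```

For your own data, adjust the file name pattern inside `check_pairs` to match your naming scheme.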
Now let’s download the programs; have a look at the HISAT2 page to find the appropriate binary to download. I
like to download programs in a src directory and link them to a bin directory, which is in my PATH.
    # for OS X
    cd ~/src
    wget -c ftp://ftp.ccb.jhu.edu/pub/infphilo/hisat2/downloads/hisat2-2.1.0-OSX_x86_64.zip
    unzip hisat2-2.1.0-OSX_x86_64.zip

    # provide link to binaries in my bin directory
    cd ~/bin/
    ln -s ~/src/hisat2-2.1.0/hisat2* .
    # some files were already linked
    ln -s ~/src/hisat2-2.1.0/*.py .
    ln: ./hisat2_extract_exons.py: File exists
    ln: ./hisat2_extract_snps_haplotypes_UCSC.py: File exists
    ln: ./hisat2_extract_snps_haplotypes_VCF.py: File exists
    ln: ./hisat2_extract_splice_sites.py: File exists
    ln: ./hisat2_simulate_reads.py: File exists
Again, take a look at the StringTie page to find the appropriate binary to download.
    # for OS X
    cd ~/src
    wget -c http://ccb.jhu.edu/software/stringtie/dl/stringtie-1.3.3b.OSX_x86_64.tar.gz
    tar xzf stringtie-1.3.3b.OSX_x86_64.tar.gz

    # provide link to binary in my bin directory
    cd ~/bin/
    ln -s ~/src/stringtie-1.3.3b.OSX_x86_64/stringtie
The gffcompare tool needs to be compiled.
    cd ~/src/
    git clone https://github.com/gpertea/gclib
    git clone https://github.com/gpertea/gffcompare
    cd gffcompare
    make release

    # link again
    cd ~/bin/
    ln -s ~/src/gffcompare/gffcompare
Ballgown is a Bioconductor package, so we need to install that using R. While we are at it, we will install various
dependencies too.
    install.packages("devtools")
    install.packages("dplyr")

    source("https://www.bioconductor.org/biocLite.R")
    biocLite(c("alyssafrazee/RSkittleBrewer", "ballgown", "genefilter"))
Now that we have downloaded and prepared all the required programs, we can start the analysis!
Mapping
Mapping is performed using HISAT2 and usually the first step, prior to mapping, is to create an index of the
reference genome. The indices are provided in the data folder but let’s create them again.
    mkdir my_index
    cd my_index

    # use the Python scripts to extract splice-site and exon information from a gene annotation
    extract_splice_sites.py ../chrX_data/genes/chrX.gtf > chrX.ss
    extract_exons.py ../chrX_data/genes/chrX.gtf > chrX.exon

    head -3 chrX.ss
    chrX 276393 281481 +
    chrX 281683 284166 +
    chrX 284313 288732 +

    head -3 chrX.exon
    chrX 276323 276393 +
    chrX 281393 281683 +
    chrX 284166 284313 +

    # now to build the index
    # the --ss and --exon options can be omitted if annotation data is not available
    time hisat2-build -p 8 --ss chrX.ss --exon chrX.exon ../chrX_data/genome/chrX.fa chrX_tran
    # screen output not shown to save space
    Total time for call to driver() for forward index: 00:03:34

    real 3m33.870s
    user 10m10.778s
    sys 1m9.074s

    ls -1
    chrX.exon
    chrX.fa
    chrX.ss
    chrX_tran.1.ht2
    chrX_tran.2.ht2
    chrX_tran.3.ht2
    chrX_tran.4.ht2
    chrX_tran.5.ht2
    chrX_tran.6.ht2
    chrX_tran.7.ht2
    chrX_tran.8.ht2
Despite creating our own indices, we’ll use the ones provided by the tutorial for reproducibility’s sake. From
geuvadis_phenodata.csv we saw that there are 12 samples; each sample has two FASTQ files since this is
paired-end data. Let’s start the mapping.
You should only ever store sorted BAM (or CRAM) files and delete the SAM files after conversion.
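The mapping step can be sketched as a loop over the 12 sample IDs. This is a sketch built from the `hisat2` and `samtools sort` invocations used in the Nature Protocols tutorial, not necessarily the exact commands run for this post; the `map/` output directory, the `run` helper, and the `DRY_RUN` switch are assumptions made for illustration. By default the commands are only printed; set `DRY_RUN=0` to execute them.

```shell
# sample IDs taken from geuvadis_phenodata.csv
SAMPLES="ERR188044 ERR188104 ERR188234 ERR188245 ERR188257 ERR188273
ERR188337 ERR188383 ERR188401 ERR188428 ERR188454 ERR204916"

# run: print a command; also execute it when DRY_RUN=0 (hypothetical helper)
run() {
  echo "$*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# map_sample: align one paired-end sample, sort the alignments into a BAM
# file, and delete the intermediate SAM file only if both steps succeeded
map_sample() {
  sample=$1
  # --dta reports alignments tailored for transcript assemblers such as StringTie
  run hisat2 -p 8 --dta -x chrX_data/indexes/chrX_tran \
      -1 "chrX_data/samples/${sample}_chrX_1.fastq.gz" \
      -2 "chrX_data/samples/${sample}_chrX_2.fastq.gz" \
      -S "map/${sample}_chrX.sam" \
    && run samtools sort -@ 8 -o "map/${sample}_chrX.bam" "map/${sample}_chrX.sam" \
    && run rm "map/${sample}_chrX.sam"
}

run mkdir -p map
for s in $SAMPLES; do
  map_sample "$s"
done
```

Chaining the three steps with `&&` means the SAM file is only removed once the sorted BAM has been written successfully.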
Assembly
Now we need to assemble the mapped reads into transcripts. StringTie can assemble transcripts with or without
annotation; as noted in the protocol, annotation can be helpful when the number of reads for a transcript is too
low for an accurate assembly.
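Assembly follows the same per-sample pattern, followed by a merge across samples. The sketch below prints the `stringtie` and `stringtie --merge` invocations described in the protocol rather than running them. The `map/` location of the BAM files and the helper names `assemble_cmd` and `merge_cmd` are assumptions for this example, and `chrX_data/mergelist.txt` from the tutorial data is expected to list the per-sample GTF files, so its paths may need adjusting if you write the GTFs elsewhere.

```shell
# assemble_cmd: print the StringTie command that assembles one sample,
# guided by the reference annotation (-G) and labelled with the sample ID (-l)
assemble_cmd() {
  echo "stringtie -p 8 -G chrX_data/genes/chrX.gtf -o ${1}_chrX.gtf -l ${1} map/${1}_chrX.bam"
}

# merge_cmd: print the command that merges all per-sample assemblies
# into a single, non-redundant set of transcripts
merge_cmd() {
  echo "stringtie --merge -p 8 -G chrX_data/genes/chrX.gtf -o stringtie_merged.gtf chrX_data/mergelist.txt"
}

# print the full set of commands; pipe the output to sh to run them
for s in ERR188044 ERR188104 ERR188234 ERR188245 ERR188257 ERR188273 \
         ERR188337 ERR188383 ERR188401 ERR188428 ERR188454 ERR204916; do
  assemble_cmd "$s"
done
merge_cmd
```

Dropping `-G` from the per-sample command would give an annotation-free assembly, which is the alternative mentioned above.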
    # compare the assembled transcripts to known transcripts
    gffcompare -r chrX_data/genes/chrX.gtf -G -o merged stringtie_merged.gtf

    cat merged.stats
    # gffcompare v0.10.1 | Command line was:
    #gffcompare -r chrX_data/genes/chrX.gtf -G -o merged stringtie_merged.gtf
    #

    #= Summary for dataset: stringtie_merged.gtf
    #     Query mRNAs :    3281 in    1521 loci  (2651 multi-exon transcripts)
    #            (535 multi-transcript loci, ~2.2 transcripts per locus)
    # Reference mRNAs :    2102 in    1086 loci  (1856 multi-exon)
    # Super-loci w/ reference transcripts:      998
    #-----------------| Sensitivity | Precision  |
            Base level:   100.0     |    77.6    |
            Exon level:   100.0     |    85.4    |
          Intron level:    99.8     |    91.0    |
    Intron chain level:    99.6     |    69.7    |
      Transcript level:    99.6     |    63.8    |
           Locus level:   100.0     |    70.9    |

         Matching intron chains:    1848
           Matching transcripts:    2094
                  Matching loci:    1086

              Missed exons:       0/8804   (  0.0%)
               Novel exons:     971/10608  (  9.2%)
            Missed introns:      14/7946   (  0.2%)
             Novel introns:     219/8714   (  2.5%)
               Missed loci:       0/1086   (  0.0%)
                Novel loci:     421/1521   ( 27.7%)

    Total union super-loci across all input datasets: 1521
    3281 out of 3281 consensus transcripts written in merged.annotated.gtf (0 discarded as red
The high sensitivity means that almost all of the known transcripts were recovered among the StringTie transcripts,
i.e. there are few false negatives. The precision is much lower, indicating that many of the StringTie transcripts are
not in the list of known transcripts; these are either false positives or truly de novo transcripts. The novel exons,
introns, and loci indicate how many of the assembled features were not found in the list of known transcripts.
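As a rough check on where these percentages come from, the transcript-level and intron-chain-level figures can be reproduced from the counts in the summary. This assumes that sensitivity is the number of matches divided by the number of reference features, and precision is the number of matches divided by the number of query features (using the multi-exon counts for intron chains); the `pct` helper is made up for this example.

```shell
# pct: print numerator/denominator as a percentage with one decimal place
pct() {
  awk -v n="$1" -v d="$2" 'BEGIN { printf "%.1f\n", 100 * n / d }'
}

# transcript level: 2094 matching transcripts
echo "transcript sensitivity:   $(pct 2094 2102)"   # matches / reference mRNAs
echo "transcript precision:     $(pct 2094 3281)"   # matches / query mRNAs

# intron chain level: 1848 matching intron chains (multi-exon transcripts only)
echo "intron chain sensitivity: $(pct 1848 1856)"   # matches / reference multi-exon
echo "intron chain precision:   $(pct 1848 2651)"   # matches / query multi-exon
```

These give 99.6 and 63.8 at the transcript level and 99.6 and 69.7 at the intron chain level, in agreement with merged.stats.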
Now that we have our assembled transcripts, we can estimate their abundances.
    stringtie -e -B -p 8 -G stringtie_merged.gtf -o ballgown/ERR188044/ERR188044_chrX.gtf map/
    stringtie -e -B -p 8 -G stringtie_merged.gtf -o ballgown/ERR188104/ERR188104_chrX.gtf map/
    stringtie -e -B -p 8 -G stringtie_merged.gtf -o ballgown/ERR188234/ERR188234_chrX.gtf map/
    stringtie -e -B -p 8 -G stringtie_merged.gtf -o ballgown/ERR188245/ERR188245_chrX.gtf map/
    stringtie -e -B -p 8 -G stringtie_merged.gtf -o ballgown/ERR188257/ERR188257_chrX.gtf map/
    stringtie -e -B -p 8 -G stringtie_merged.gtf -o ballgown/ERR188273/ERR188273_chrX.gtf map/
    stringtie -e -B -p 8 -G stringtie_merged.gtf -o ballgown/ERR188337/ERR188337_chrX.gtf map/
    stringtie -e -B -p 8 -G stringtie_merged.gtf -o ballgown/ERR188383/ERR188383_chrX.gtf map/
    stringtie -e -B -p 8 -G stringtie_merged.gtf -o ballgown/ERR188401/ERR188401_chrX.gtf map/
    stringtie -e -B -p 8 -G stringtie_merged.gtf -o ballgown/ERR188428/ERR188428_chrX.gtf map/
    stringtie -e -B -p 8 -G stringtie_merged.gtf -o ballgown/ERR188454/ERR188454_chrX.gtf map/
    stringtie -e -B -p 8 -G stringtie_merged.gtf -o ballgown/ERR204916/ERR204916_chrX.gtf map/

    # estimation took just over a minute and a half
    # real 1m39.661s
    # user 2m0.179s
    # sys 0m9.223s

    # check out the files
    ls -1 ballgown/ERR188044
    ERR188044_chrX.gtf
    e2t.ctab
    e_data.ctab
    i2t.ctab
    i_data.ctab
    t_data.ctab
Differential expression
To perform the expression analyses, we need to use R and Ballgown; I recommend using RStudio. To get started,
load the required libraries and the data.
    library(ballgown)
    library(RSkittleBrewer)
    library(genefilter)
    library(dplyr)
    library(devtools)

    # change this to the directory that contains all the StringTie results
    setwd("~/muse/tuxedo")

    # load the sample information
    pheno_data <- read.csv("chrX_data/geuvadis_phenodata.csv")

    # create a ballgown object
    bg_chrX <- ballgown(dataDir = "ballgown",
                        samplePattern = "ERR",
                        pData = pheno_data)

    class(bg_chrX)
    [1] "ballgown"
    attr(,"package")
    [1] "ballgown"

    bg_chrX
    ballgown instance with 3491 transcripts and 12 samples

    methods(class="ballgown")
    [1] dirs eexpr expr expr<- geneIDs geneN
    [8] iexpr indexes indexes<- mergedDate pData pData
    [15] seqnames show structure subset texpr trans
    see '?methods' for accessing help and source code

    # we can get the gene, transcript, exon, and intron expression levels using
    # gexpr(), texpr(), eexpr(), and iexpr()
    head(gexpr(bg_chrX), 2)
             FPKM.ERR188044 FPKM.ERR188104 FPKM.ERR188234 FPKM.ERR188245 FPKM.ERR188257 FPKM.E
    MSTRG.1        7.169349       10.42652       13.83639       1.050201       5.677819 1
    MSTRG.10      21.428192       13.13144       14.11443      18.454338      10.182308
             FPKM.ERR188383 FPKM.ERR188401 FPKM.ERR188428 FPKM.ERR188454 FPKM.ERR204916
    MSTRG.1        4.732841      11.424809       5.733899       6.688090       5.061143
    MSTRG.10      11.815677       8.196958       9.578302       9.961549      10.997639

    head(texpr(bg_chrX), 2)
      FPKM.ERR188044 FPKM.ERR188104 FPKM.ERR188234 FPKM.ERR188245 FPKM.ERR188257 FPKM.ERR18827
    1        23.9694       18.49576       39.70492       14.06822       25.51846 23.8477
    2         0.0000        0.00000       27.79636       13.96464       44.97094 0.0000
      FPKM.ERR188401 FPKM.ERR188428 FPKM.ERR188454 FPKM.ERR204916
    1       28.03131       24.97612        28.2617       20.24706
    2       25.81932        0.00000         0.0000        0.00000

    # note that this subset function is not the base R function but a ballgown one
    # to see the order in which R looks for functions in packages use search()
    # search()
    # [1] ".GlobalEnv" "package:bindrcpp" "package:devtools" "package
    # [5] "package:genefilter" "package:RSkittleBrewer" "package:ballgown" "tools:r
    # [9] "package:stats" "package:graphics" "package:grDevices" "package
    # [13] "package:datasets" "package:methods" "Autoloads" "package
    #
    # the rowVars is from the genefilter package and calculates the row variance
    bg_chrX_filt <- subset(bg_chrX, "rowVars(texpr(bg_chrX)) >1", genomesubset=TRUE)

    # 1,264 transcripts were filtered out
    bg_chrX_filt
    ballgown instance with 2227 transcripts and 12 samples
Perform the differential expression analysis using the stattest() function; confounders are specified using the adjustvars
parameter, whose values have to match column names in pheno_data. We are testing for transcripts and genes that are
differentially expressed between males and females, hence sex is our covariate of interest. In addition to testing
transcripts and genes, we can also test differential expression at exons and introns; just change the feature
parameter accordingly.
    head(pData(bg_chrX_filt), 3)
            ids    sex population
    1 ERR188044   male        YRI
    2 ERR188104   male        YRI
    3 ERR188234 female        YRI

    # test on transcripts
    results_transcripts <- stattest(bg_chrX_filt,
                                    feature="transcript",
                                    covariate="sex",
                                    adjustvars = c("population"),
                                    getFC=TRUE, meas="FPKM")

    # results are in a data frame
    class(results_transcripts)
    [1] "data.frame"

    dim(results_transcripts)
    [1] 2227 5

    head(results_transcripts)
         feature id        fc      pval      qval
    1 transcript  1 0.9386481 0.7208669 0.9454480
    2 transcript  2 1.2073309 0.8670656 0.9756579
    3 transcript  3 1.0058534 0.9964598 0.9997816
    4 transcript  4 0.3847566 0.5214029 0.9290666
    5 transcript  5 0.6089373 0.3247825 0.9278154
    6 transcript  6 0.6449469 0.3062408 0.9253708

    table(results_transcripts$qval < 0.05)

    FALSE  TRUE
     2215    12

    # test on genes
    results_genes <- stattest(bg_chrX_filt,
                              feature="gene",
                              covariate="sex",
                              adjustvars = c("population"),
                              getFC=TRUE, meas="FPKM")

    class(results_genes)
    [1] "data.frame"

    dim(results_genes)
    [1] 1013 5

    table(results_genes$qval<0.05)

    FALSE  TRUE
     1002    11
The results_transcripts data frame doesn’t contain any identifiers; we will create a new data frame with this
information.
    library(ggplot2)
    library(cowplot)

    results_transcripts$mean <- rowMeans(texpr(bg_chrX_filt))

    ggplot(results_transcripts, aes(log2(mean), log2(fc), colour = qval<0.05)) +
      scale_color_manual(values=c("#999999", "#FF0000")) +
      geom_point() +
      geom_hline(yintercept=0)
Summary
The new tuxedo suite is very fast. I realise that the tutorial only used a small subset of reads that were already
determined to map to chromosome X; even so, the mapping and assembly took mere minutes. A recent
benchmark of RNA-seq aligners did demonstrate that HISAT or HISAT2 was the fastest splice-aware mapper out of
14 algorithms. However, HISAT or HISAT2 had a low recall percentage when mapping reads with high complexity,
i.e. more polymorphic sites and higher error rates, on the default settings; mapping accuracy was vastly improved
after tuning the parameters.
I plan to set up a Snakemake pipeline for running the new tuxedo suite and will compare it with other pipelines,
such as this STAR and Cufflinks/RSEM pipeline.
Posted in bioinformatics
Tagged RNA-seq
11 COMMENTS
NANDITA
January 31, 2018 at 7:59 am
Hi Dave- I was wondering if you could comment on an observation we made when we ran this pipeline
as described here.
We did an experiment in mouse, knockout vs WT. For alignment we used hisat2, default parameters.
Followed by stringtie, and ballgown. We got a large number of significantly D.E. “transcripts”, but, when
we conducted a gene level analysis, we got barely any D.E. genes. The D.E. transcripts list mostly has the
same gene showing D.E. of different splice forms in each condition. Since we are dealing with the same
tissue, we really don’t expect such a huge splicing effect. I wonder if many of the splice variants could be
mapping artifacts, because, in some cases, I look at the aligned reads in a browser and it shows no
difference between the two samples in terms of # of reads mapped.
DAVO
February 1, 2018 at 12:56 am
Hi Nandita,
I recall that a former colleague had a similar problem to what you are describing, which is the
discrepancy in DE between genes and transcripts. Regarding your example, I guess the obvious
thing to do (which you may have already done) is to create an expression table of the gene and
another of the transcripts belonging to the same gene. Perhaps in the knockout, it has switched
to another splice variant, therefore there is DE on the transcript level. However, when you
collapse expression onto a gene level they are expressed similarly. I’m not so sure about what
you meant about mapping artifacts though. If there was a systematic artifact, it should affect
both samples equally and you shouldn’t have a discrepancy only in one sample.
Cheers,
Dave
NANDITA
February 1, 2018 at 7:37 am
However, when using the ucsc genome browser to view bigwig files generated from the aligned bams,
we cannot see any unique splice junction being covered in one condition versus the other. So we are not
sure why these reads have been assigned to different splice isoforms. Additionally, like I said, we really
don’t expect so many events where isoforms are switched in the system we are examining.
“If there was a systematic artifact, it should affect both samples equally and you shouldn’t have a
discrepancy only in one sample.”
Agreed. I am unable to explain it either. Short of eyeballing every such event in a genome browser, or
asking the lab to validate via qPCR, I’m not able to assign confidence in the differential transcripts
results, even though the fc, p-val and q-val look very good.
UPENDRA KUMAR DEVISETTY
June 9, 2018 at 4:13 am
Hi Dan,
Very nice blog. I have one quick questions. Is there a way one can logFC in addition to FC in ballgown
output?
Thanks,
Upendra
RAMAN SETHI
January 3, 2019 at 10:30 am
Nice blog. I want to ask how Ballgown compares with DESeq2? And which is the best tool to plot heat
maps, GO and Pathway Analysis, PCA Analysis? Thank you!
DAVO
JOSRUIROD5
January 9, 2019 at 10:25 am
Thank you!
DAVO
January 23, 2019 at 8:26 am
FAWZI YASSINE
March 16, 2019 at 3:45 pm
how to interpret fold change (fc) in ballgown results, a fake example calculation is appreciated.
regards,
JOSE BASILIO
March 26, 2019 at 2:49 pm
Thank you for your post. I would like to know if you have the possibility to get, and send to my email,
the paper which you have mentioned at the end of your post:
https://link.springer.com/protocol/10.1007%2F978-1-4939-4035-6_14
DAVID HUELS
April 4, 2019 at 11:46 am
Hi Dave,
thank you for your detailed blog post. Very helpful!
Which files did you load in the IGV to visualise the known as well as the novel transcripts?
Cheers
David
LICENSE
This work is licensed under a Creative Commons Attribution 4.0 International License.