Additional Note PDF

Uploaded by

LEE ZIWEI
Copyright
© All Rights Reserved

What is Bioinformatics?

Bioinformatics is an interdisciplinary scientific field that develops methods and software tools for storing, retrieving, organizing and analyzing biological data.

As an interdisciplinary field, bioinformatics combines computer science, statistics, mathematics and engineering to study biological data and processes.

P. Paulsharma Chakravarthy
The Central Dogma & Biological Data
• Original DNA sequences (genomes)
• Expressed DNA sequences (= mRNA sequences = cDNA sequences)
  - Expressed Sequence Tags (ESTs)
• Protein sequences
  - Inferred
  - Direct sequencing
• Protein structures
  - Experiments
  - Models (homologues)
• Literature information
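
The step from DNA sequence to expressed (mRNA) sequence to inferred protein sequence can be illustrated with a short Python sketch. It is not part of the original slides: the toy sequence is made up, the function names are hypothetical, and the code assumes the standard genetic code and an input that is already the coding strand in the correct reading frame.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna):
    """Return the mRNA matching the coding DNA strand (T replaced by U)."""
    return dna.upper().replace("T", "U")


def translate(dna):
    """Translate coding DNA codon by codon, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]  # KeyError on non-ACGT input
        if amino_acid == "*":  # "*" marks a stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


coding_dna = "ATGGCCTAA"  # made-up toy sequence: Met, Ala, stop
print(transcribe(coding_dna))  # AUGGCCUAA
print(translate(coding_dna))   # MA
```

A protein sequence obtained this way is an "inferred" sequence in the sense of the list above, as opposed to one determined by direct sequencing of the protein.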
Types of Biological Database

Biological databases fall into three types:

• Primary databases - contain original biological data
• Secondary databases - contain computationally processed or manually curated information, based on original information from primary databases
• Specialized databases - those that cater to a particular research interest

https://www.ebi.ac.uk/training/online/course/bioinformatics-terrified/what-database/relational-databases/primary-and-secondary-databases
What can be discovered about a gene by a database search?
A little or a lot, depending on the gene.
• Evolutionary information - homologous genes, taxonomic distributions, allele frequencies, synteny, etc.
• Genomic information - chromosomal location, introns, UTRs, regulatory regions, shared domains, etc.
• Structural information - associated protein structures, fold types, structural domains
• Expression information - expression specific to particular tissues, developmental stages, phenotypes, diseases, etc.
• Functional information - enzymatic/molecular function, pathway/cellular role, localization, role in diseases
Tour of Major Biological Databases

• A tremendous amount of information about biomolecules is held in publicly available databases
• We will look at a few of the main databases and the kind of information they contain, and practice browsing these databases
NCBI

https://www.ncbi.nlm.nih.gov/
Try searching for “gene expression potato diseases” and see the number of hits!
Try keying in “crop diseases” or “insect pests” in the Title words field and see the number of hits!
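
The same kind of hit count can also be requested programmatically through NCBI's E-utilities ESearch service. The sketch below is an illustration, not part of the original slides. It assumes the searches above are run against PubMed (`db=pubmed`) and that the "Title words" box corresponds to PubMed's `[Title]` field tag. The function names are hypothetical, and only the URL-building step runs without network access.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities ESearch endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(term, db="pubmed", title_only=False):
    """Build an ESearch URL that returns the hit count but no record IDs."""
    if title_only:
        term = f"{term}[Title]"  # restrict the search to title words
    params = {"db": db, "term": term, "retmax": 0, "retmode": "json"}
    return f"{ESEARCH}?{urlencode(params)}"


def hit_count(url):
    """Fetch an ESearch URL and return the number of hits (needs network)."""
    with urlopen(url) as response:
        return int(json.load(response)["esearchresult"]["count"])


print(esearch_url("gene expression potato diseases"))
print(esearch_url('"crop diseases"', title_only=True))
print(esearch_url('"insect pests"', title_only=True))
```

Calling `hit_count(...)` on one of these URLs should return the number shown on the web page for the equivalent search, although counts change as the database is updated.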
https://www.ncbi.nlm.nih.gov/books
Try to search
“crop diseases”
One database of particular importance to biologists is GenBank®, which encompasses all publicly available nucleotide sequences and their protein translations.
Let’s work on this…
• Go to https://www.ncbi.nlm.nih.gov/genbank/
• Identify loci (genes) associated with the sequence.
  → Input: Pi-b
• For each particular “hit”, we can look at that sequence and its alignment in more detail.
• See similar sequences, and the organisms in which they are found.
• But there’s much more that can be found on these genes, even just inside NCBI…
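
The exercise above can also be scripted against NCBI's E-utilities. The sketch below is an illustration, not part of the original slides. It assumes the GenBank nucleotide records are queried through the `nuccore` database, the function names are hypothetical, and `EXAMPLE_UID` is a placeholder rather than a real record identifier.

```python
from urllib.parse import urlencode

# Base URL shared by the NCBI E-utilities services.
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def nucleotide_search_url(term, retmax=5):
    """Build an ESearch URL listing up to `retmax` nucleotide record UIDs."""
    params = {"db": "nuccore", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"


def genbank_record_url(uid):
    """Build an EFetch URL returning one record as a GenBank flat file."""
    params = {"db": "nuccore", "id": uid, "rettype": "gb", "retmode": "text"}
    return f"{EUTILS}/efetch.fcgi?{urlencode(params)}"


# Step 1: search for the input used in the exercise.
print(nucleotide_search_url("Pi-b"))

# Step 2: fetch one hit in detail. EXAMPLE_UID is a placeholder, to be
# replaced by a UID returned in step 1.
print(genbank_record_url("EXAMPLE_UID"))
```

Opening the first URL in a browser returns a JSON list of record identifiers. Substituting one of them into the second URL returns the full GenBank entry for that hit, including the organism and annotated features.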
PRIMARY VS. DERIVATIVE SEQUENCE DATABASES
[Figure: Labs and sequencing centers submit sequences to GenBank, the primary database, which is updated ONLY by submitters. The derivative databases RefSeq (curators, genome assembly) and UniGene (algorithms) are updated continually by NCBI.]
Similar to NCBI…
https://www.uniprot.org
https://www.youtube.com/user/NCBINLM
Some take home messages
✓ There are a lot of molecular biology databases, containing a lot of valuable information
✓ Not even the best databases have everything (or the best of everything)
✓ These databases are moderately well cross-linked, and there are “linker” databases
