MG - L8 - Genomics & Proteomics

Uploaded by

minghouu215

GENOMICS & PROTEOMICS

OUTLINE

1. Genomics & Proteomics: An overview

2. Structural genomics
3. Functional genomics
4. Proteomics

GENOMICS

 Genome - the complete copy of the genetic

information or one complete set of chromosomes
of an organism.

 Genomics - mapping, sequencing, and analyzing

the functions of entire genomes.

GENOMICS

 Structural genomics: the study of genome structure

 Functional genomics: the study of genome function
 Transcriptome
 Proteome: the complete set of proteins encoded by a genome
 Proteomics: the determination of the structures and functions
of all of the proteins in an organism
 Comparative genomics: the study of genome evolution

WHAT IS TRANSCRIPTOMICS?
 The transcriptome is the set of all RNA molecules,
including mRNA, rRNA, tRNA, and non-coding RNA
produced in one or a population of cells
 Can mean the total set of transcripts in a given organism
OR
 A specific subset of transcripts present in a particular cell
type

WHAT IS TRANSCRIPTOMICS?
 Unlike the genome, which is roughly fixed for a given
cell line (excluding mutations), the transcriptome can
vary with external environmental conditions.

 It reflects the genes that are being actively expressed at
any given time, with the exception of mRNA
degradation phenomena such as transcriptional
attenuation.
WHAT IS TRANSCRIPTOMICS?

 Transcriptomics (expression profiling) examines the
expression level of mRNAs in a given cell population,
often using high-throughput techniques based on
DNA microarray technology.
 Can be used to compare gene expression for
production traits, e.g.
Growth rates
Feed conversion rates (FCR)
Disease susceptibility
WHAT IS TRANSCRIPTOMICS?

 BUT only possible if a large amount of mRNA data is
available
 Traditionally this mRNA data is generated using
cDNA libraries and EST characterization.
 Problems:
Expensive
Low throughput
Time consuming
Low coverage – misses the rare transcripts
Requires a large amount of RNA

WHAT IS TRANSCRIPTOMICS?

 NOW next-generation sequencing: high-throughput
parallel sequencing technologies can generate millions of
short reads from a library of nucleotide sequences (DNA,
RNA, or a mixture)
OUTLINE

1. Genomics & Proteomics: An overview

2. Structural genomics
3. Functional genomics
4. Proteomics

GENETIC MAPS
 A genetic (linkage) map gives the approximate
location of one gene relative to the locations of other
known genes. Unit: cM (centimorgan), map units
 Estimate recombination frequency between loci in the
progeny by testcross
 50% - loci on different chromosomes or far apart on the
same chromosome
 < 50% - loci close together on the same chromosome

 Need multiple two-point or three-point crosses to

construct genetic map for whole chromosome
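
The relationship between testcross progeny counts, recombination frequency and map distance can be sketched in Python. This is a minimal illustration only: the function names and the progeny counts in the usage note are hypothetical, and treating 1% recombination as 1 cM is an approximation that holds best for closely spaced loci.

```python
def recombination_frequency(parental: int, recombinant: int) -> float:
    """Fraction of testcross progeny that carry a recombinant combination."""
    total = parental + recombinant
    if total == 0:
        raise ValueError("no progeny counted")
    return recombinant / total


def map_distance_cm(rf: float) -> float:
    """Approximate map distance in centimorgans (1% recombination ~ 1 cM)."""
    return 100.0 * rf


def linkage_call(rf: float) -> str:
    """Interpret a recombination frequency as on the slide above."""
    if rf >= 0.5:
        return "different chromosomes or far apart on the same chromosome"
    return "close together on the same chromosome"


# Hypothetical counts: 830 parental-type and 170 recombinant-type offspring.
rf = recombination_frequency(parental=830, recombinant=170)
print(rf, map_distance_cm(rf), linkage_call(rf))
```

With these made-up counts the two loci would be placed roughly 17 map units apart on the same chromosome.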
LOCI OF SOME HUMAN GENES
LINKAGE GROUPS
 All genes on one chromosome are called a linkage
group
 The farther apart two genes are on a chromosome, the
more often crossing over occurs between them
 Linked genes are very close together; crossing over
rarely occurs between them
 The probability that a crossover will separate alleles of
two genes is proportional to the distance between those
genes
[Figure: crossing over between linked gene loci. The maternal chromosome carries A (locus 1) and B (locus 2), giving AB gametes; the paternal chromosome carries a and b, giving ab gametes.]
• no Ab or aB gametes
• genes are tightly linked
TESTCROSSES
 A testcross is a method of determining if an
individual is heterozygous or homozygous
dominant
 An individual with unknown genotype is crossed
with one that is homozygous recessive (PP x pp)
or (Pp x pp)
Genotypes: PP (homozygous for dominant allele P), pp (homozygous for recessive allele p), Pp (heterozygous at the P gene locus)

Recombinant gametes arise as a result of crossover during meiosis in the heterozygous parent.

Recombinant offspring are fewer than non-recombinant offspring.

Figure 5.9
GENETIC MAPS

 Testcrosses depend on the available single-locus traits
 Molecular markers – RFLP, microsatellite, SNP, PCR,
DNA sequencing – can be used to construct & refine
genetic maps
 Limitations:
 Lower resolution or detail
 Do not always correspond to physical distances between loci

PHYSICAL MAPS

 Physical map locates the genes in relation to their
distances measured in number of base pairs (bp),
kilobases (kb, 1000 bp), megabases (Mb, 1 million bp)
 Physical maps can be created by restriction mapping
 1 cM ≈ 1 Mb of DNA (a rough average for the human genome)
SEQUENCES OF GENES &
CHROMOSOMES

 Why sequence DNA?

 Understand how biological
processes work
 Find changes to explain genetic
disorders

A QUICK HISTORY OF
SEQUENCING
1953 – Structure of DNA solved
1977 – Sanger sequencing invented
– First genome sequenced - Phage Φ-X174 (5 kb)
1986 – First automated sequencing machine
1990 – Human genome project started

A QUICK HISTORY OF
SEQUENCING
1995 – First bacterial genome
Haemophilus influenzae (1.8 Mb)
1998 – First animal genome
Caenorhabditis elegans (97 Mb)
2003 – Completion of Human genome project
Homo sapiens (3 Gb), 13 years, $ 2.7 bil.
2005 – First ‘next-generation’ sequencing
instrument
2013 – > 10,000 genome sequences in NCBI
database
GENERATION OF SEQUENCING
TECHNOLOGIES

 First generation: Sanger & Maxam-Gilbert
technique
 Next (second) generation: Roche 454, Illumina
Solexa, ABI SOLiD
 Third generation: Pacific Bio, Helicos
 Other/4th generation: Oxford nanopore, Polonator
Ion Torrent??

SANGER SEQUENCING: DYE-TERMINATOR SEQUENCING
CHARACTERISTICS OF NGS
 There are four main sequencing methods
 Pyrosequencing (454)
 Reversible terminator sequencing (Illumina)
 Sequencing by ligation (SOLiD)
 Semiconductor sequencing (Ion Torrent)

 They differ from each other in terms of:
 Read length
 Data produced
 Data quality
 Bioinformatics required
CHARACTERISTICS OF NGS

 NGS platforms generate millions of reads and billions of
base calls each run
 NGS reads are typically short (<400 bp)
 Dramatic reduction in cost of sequencing
 GS-FLX provides > 100x decrease in costs compared to Sanger
sequencing
 HiSeq and SOLiD > 100x decrease in costs over GS-FLX

COMPARISON OF NGS

BIOINFORMATICS
 Bioinformatics = Molecular biology + Computer science
 Developing databases, computer-based algorithms,
gene-prediction software, analytical tools to “mine the
data” from sequencing projects
 Assembly – align multiple sequencing reads that
overlap one another to reconstruct a long
DNA fragment
 Annotation – link sequence information to its function
& expression, based on similar genes in other species
 BLAST (Basic Local Alignment Search Tool)
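
The assembly idea above (merging reads that overlap one another) can be illustrated with a toy greedy overlap assembler in Python. This is a teaching sketch under simplifying assumptions: reads are error-free, all come from the same strand, and no read is contained inside another. The function names and example reads are made up, and real assemblers use far more sophisticated graph-based methods.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that matches a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    while len(contigs) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# Three hypothetical short reads that tile a 12-base fragment.
print(greedy_assemble(["ATGCGT", "CGTACC", "ACCTGA"]))
```

Reads that share no sufficiently long overlap stay as separate contigs, which is why coverage matters in real sequencing projects.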
Basic scheme of a NGS
Diagram of de novo sequence assembly
APPLICATIONS OF NGS
 Whole genome sequencing
• De novo assembly
• Re-sequencing
• Comparative genomics

 Targeted gene sequencing

APPLICATIONS OF NGS
RNA-seq
 Gene expression
 Transcriptome assembly

APPLICATIONS OF NGS IN
ANIMAL BIOTECHNOLOGY

APPLICATIONS OF NGS IN
HUMAN HEALTH
 Cancer research
 Genetic disorders
 Personalized medicine
 Human microbiome
 Pre- & post-natal diagnosis
 Infectious diseases
OUTLINE

1. Genomics & Proteomics: An overview

2. Structural genomics
3. Functional genomics
4. Proteomics

EXPRESSED-SEQUENCE TAGS
 ESTs are markers associated with DNA
sequences that are expressed as RNA

Isolating RNA from cells → Reverse transcription → Set of cDNA fragments → Sequencing → Tag as a marker to find active genes
EXPRESSED SEQUENCES
 In eukaryote genomes only a small proportion of the
DNA encodes protein – about 1% in humans
 Analysing cDNA (DNA complementary to RNA) or
ESTs focuses on the protein-coding content of genomes
 Study of gene expression to monitor changes in total
genome expression over time
 Development
 In response to changes in the environment

DOT BLOT OR ARRAY HYBRIDIZATION
ANALYSIS OF GENE EXPRESSION

 Gene-specific nucleotide probes are applied to a membrane in a specific pattern.
 Labeled (fluorescent or radioactive) cDNA preparations are hybridized with the probes on the membrane.
MICROARRAYS (GENE CHIPS)
 A microarray contains thousands of hybridization
probes on a single membrane or silicon wafer to
simultaneously detect the expression of many genes
 A chip of 23,000 human genes
 Microarrays are produced in several ways
 Microsynthesis of oligonucleotides in situ
 Spotting prefabricated oligonucleotides on solid supports
 Spotting DNA fragments or cDNAs on solid supports
 Probes on microarrays are hybridized to fluorescent
cDNA samples

GENE CHIPS

Typical dual-colour
microarray experiment

Interpretation

 RED: more cDNA from disease cells hybridizes to
DNA probes – overexpression of the genes
in disease cells
 GREEN: more cDNA from normal tissue hybridizes
to DNA probes – underexpression of the genes in
disease cells
 YELLOW: both cDNAs hybridize equally to DNA
probes – equal expression in both types of cells
 NO COLOR: neither cDNA from the control nor
cDNA from the disease cells hybridizes to DNA
probes – no expression
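
The colour interpretation above can be expressed as a small Python sketch that classifies one spot from its red (disease) and green (normal) intensities. The function name, the two-fold cutoff and the background level are illustrative assumptions, not values from the lecture. Real analyses also normalise the two channels before comparing them.

```python
import math


def classify_spot(red: float, green: float,
                  fold: float = 2.0, background: float = 50.0) -> str:
    """Classify a dual-colour microarray spot (thresholds are assumed values)."""
    if red < background and green < background:
        return "no colour: no expression"
    # A pseudocount of 1 avoids taking the log of zero.
    log_ratio = math.log2((red + 1.0) / (green + 1.0))
    cutoff = math.log2(fold)
    if log_ratio >= cutoff:
        return "red: overexpressed in disease cells"
    if log_ratio <= -cutoff:
        return "green: underexpressed in disease cells"
    return "yellow: equal expression"


# Hypothetical intensity pairs (red, green).
for pair in [(4000, 500), (500, 4000), (1000, 1100), (10, 20)]:
    print(pair, classify_spot(*pair))
```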
RNA-seq
• Now have the technology to sequence the
mRNA (via cDNA) content of a
cell/organism
• Use Next Generation Sequencing
• Several types of NGS available
RNA-seq generates large numbers of short sequence reads

mRNA (poly-A tail: AAAAAAAAAAA) → small cDNAs (75 – 100 bp) → short overlapping sequence reads

Computationally intensive algorithms:
• align overlapping sequence reads with statistical certainty (helps to have a genome sequence as a scaffold)
• calculate the frequency with which reads appear for each full length transcript sequence
• determine the proportional representation of each transcript in the original RNA sample
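
The last two steps above can be sketched in Python: starting from the number of reads assigned to each transcript, estimate each transcript's share of the original RNA sample. Dividing by transcript length is an assumption added here (longer transcripts yield more short reads per molecule); the function name, transcript names and counts are hypothetical.

```python
def transcript_proportions(read_counts: dict, lengths_bp: dict) -> dict:
    """Estimate the relative abundance of each transcript from read counts."""
    # Reads per base: corrects for long transcripts producing more fragments.
    rate = {t: read_counts[t] / lengths_bp[t] for t in read_counts}
    total = sum(rate.values())
    return {t: r / total for t, r in rate.items()}


# Two made-up transcripts with equal read counts but different lengths.
counts = {"geneA": 100, "geneB": 100}
lengths = {"geneA": 1000, "geneB": 2000}
print(transcript_proportions(counts, lengths))
```

Here geneA accounts for about two thirds of the molecules even though both transcripts received the same number of reads.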
Advantages of RNA-seq

Technology | Microarray | cDNA/EST sequencing | RNA-seq

Technology specification
Principle | Hybridisation | Sanger sequencing | High-throughput sequencing
Resolution | Up to 100 bp | Single base | Single base
Throughput | High | Low | High
Reliance on genomic sequence | Yes | No | In some cases
Background noise | High | Low | Low

Application
Simultaneously map transcribed regions and gene expression | Yes | Limited for gene expression | Yes
Dynamic range to quantify gene expression level | Up to a few 100-fold | Not practical | > 8000-fold
Ability to distinguish gene isoforms | Limited | Yes | Yes
Ability to distinguish allelic expression | Limited | Yes | Yes

Practical issues
Required amount of RNA | High | High | Low
Cost of mapping transcriptome of large genomes | High | High | Low
Pathway Analysis

Muñoz Garcia et al. (2018) Pathway analysis of transcriptomic data shows immunometabolic effects of vitamin D. J. Mol. Endocrinology 60, pp. 95-108
Issues with RNA-seq technologies
• Involves amplification, usually by PCR - can sometimes get sequence amplification bias
• Some transcripts reverse transcribe less efficiently
• Sequencing of short reads sometimes makes RNA processing (e.g. alternative splicing) difficult to ascertain

It would be better to directly sequence full length RNAs without amplification
Direct nanopore sequencing of long RNAs
The future of RNA-seq

Direct RNA-sequencing

Single-cell transcriptomics

Parker et al. (2020) Nanopore direct RNA sequencing maps the complexity of Arabidopsis mRNA processing and m6A modification. eLife 9:e49658.
OUTLINE

1. Genomics & Proteomics: An overview

2. Structural genomics
3. Functional genomics
4. Proteomics

GOALS OF PROTEOMICS
 An mRNA will produce relatively little protein if it
is short-lived or poorly translated
 Many proteins are post-translationally modified in
ways that affect their activities, and transcription
profiling gives no data regarding this level of
regulation.
 The goal of proteomics is the identification of the full
set of proteins produced by a cell or tissue under a
particular set of conditions: their relative abundance,
their modifications, their interacting partner proteins.
WHAT DO WE WANT TO
KNOW?
 What proteins are there?
 How much protein is present?
 Does the level change under certain conditions?
 Is the protein active?
 What does the protein do?
 What other proteins does it interact with?
 Is the protein modified? Under what circumstances? What
are the consequences of modification?
2D-PAGE
Key principles of 2D-PAGE (Two-Dimensional
Polyacrylamide Gel Electrophoresis)
 Proteins differ from each other in terms of their mass
and charge
 Both these properties can be used to separate
proteins by gel electrophoresis
 The successive application of both techniques in
perpendicular directions (two dimensions) provides
maximum separation and allows thousands of
proteins to be resolved
2D-PAGE
Key principles of 2D-PAGE (Two-Dimensional
Polyacrylamide Gel Electrophoresis)
 Staining the gel reveals the positions of individual
proteins as spots or smudges. These can be picked
and analysed by mass spectrometry.
 There are tens of thousands of proteins in the cell,
differing in abundance over six orders of magnitude.
2D-PAGE is not sensitive enough to detect the rare
proteins and many proteins will not be resolved.
Splitting the sample into different fractions is often
necessary to reduce the complexity of protein
mixtures prior to 2D-PAGE.
WHY SDS?
 All proteins have different charge
 Determined by the net charge of all amino
acids
 SDS is a strongly negatively charged detergent
 Coats proteins, denatures them, gives them
a uniform negative charge
 Charge is then dependent on molecular weight
 Usually use a reducing agent as well (e.g.
beta-mercaptoethanol/DTT)
 Breaks disulphide bonds
 This is added before the protein is loaded
onto the gel
[Figure: a protein before SDS and after SDS]
HOW DOES 2D-PAGE WORK?

Two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) is used to
separate mixtures of proteins, and is particularly useful for comparing related
samples - such as healthy and diseased tissue. The first step is to load the
proteins onto the gel, which has a pH gradient from top to bottom
HOW DOES 2D-PAGE WORK?

When a voltage is applied across the gel, the proteins move through the
gel until they reach the pH at which their net charge is zero (their
isoelectric point). This separation is called isoelectric focusing
HOW DOES 2D-PAGE WORK?

Having separated the proteins by charge, the second step is to separate
them according to their mass in the perpendicular dimension. Individual
proteins can then be isolated and identified by mass spectrometry.
Comparing the two gels can reveal proteins that are expressed at
different levels. For example, the red protein is more abundant in the
diseased tissue, and could be a useful drug target
MASS SPECTROMETRY

BASIC PRINCIPLES OF MASS
SPECTROMETRY
 Mixture of proteins is ionised by
laser pulse
 Accelerated by electrostatic field
 Separated by time of flight
(determined by mass/charge ratio)
 Detected as a TOF spectrum
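
Why time of flight depends on the mass/charge ratio can be shown with a short Python sketch. An ion of charge z accelerated through a voltage V gains kinetic energy zeV = ½mv², so its flight time over a drift tube of length L is t = L·sqrt(m / (2zeV)). The accelerating voltage and tube length below are assumed illustrative values, not instrument settings from the lecture.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
DALTON = 1.66053906660e-27           # kilograms


def flight_time(mass_da: float, charge: int,
                voltage: float = 20_000.0, tube_length: float = 1.0) -> float:
    """Flight time in seconds for an ion in an idealised TOF analyser."""
    mass_kg = mass_da * DALTON
    # z*e*V = 1/2 * m * v^2  ->  v = sqrt(2*z*e*V / m)
    velocity = math.sqrt(2 * charge * ELEMENTARY_CHARGE * voltage / mass_kg)
    return tube_length / velocity


# Hypothetical singly charged peptides of 1000 Da and 4000 Da.
light = flight_time(1000, 1)
heavy = flight_time(4000, 1)
print(light, heavy, heavy / light)
```

Because time scales with the square root of m/z, a four-fold heavier ion of the same charge takes twice as long to reach the detector.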
PEPTIDE MASS FINGERPRINT IDENTIFICATION
 Why trypsin?
 It cleaves on the C-terminal side of lysine +
arginine residues
 Therefore a given protein sequence
should always produce the same
peptides
 Example: MEHTTSRYLLDEDDKIAQNFLLEWA gives MEHTTSR, YLLDEDDK and IAQNFLLEWA
 Peptides produced can be compared to a
database (obtained from whole genome
sequencing)
 Identification of protein!
[Figure: mass spectrum, abundance against m/z]
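
The trypsin rule can be sketched as an in-silico digest in Python, using the example sequence from the slide. The function name is made up for this illustration. The check that skips cleavage before proline is a commonly used refinement added here as an assumption; it does not affect this example.

```python
def tryptic_digest(protein: str) -> list:
    """Cut after every lysine (K) or arginine (R), except before proline (P)."""
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal peptide
    return peptides


print(tryptic_digest("MEHTTSRYLLDEDDKIAQNFLLEWA"))
```

Because the cut sites are fixed by the sequence, the same protein always yields the same peptide set, which is what makes matching observed peptide masses against a sequence database possible.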
BACKGROUND READING

Pierce, B.A., 2012. Genetics: A Conceptual Approach, 4th Ed., Chapter 20, pp. 557-590
