Proteomics & Mass Spectrometry
Nathan Edwards
Center for Bioinformatics and Computational Biology
Outline
• Proteomics
• Mass Spectrometry
• Protein Identification
• Peptide Mass Fingerprint
• Tandem Mass Spectrometry
Proteomics
• Proteins are the machines that drive much of biology
• Genes are merely the recipe
• The direct characterization of a sample’s proteins en masse.
  • What proteins are present?
  • How much of each protein is present?
Systems Biology
• Establish relationships by
  • Choosing related samples,
  • Global characterization, and
  • Comparison.
Gene / Transcript / Protein
Measurement      Predetermined     Unknown
Discrete (DNA)   Genotyping        Sequencing
Continuous       Gene Expression   Proteomics
Samples
• Healthy / Diseased
• Cancerous / Benign
• Drug resistant / Drug susceptible
• Bound / Unbound
• Tissue specific
• Cellular location specific
  • Mitochondria, Membrane
2D Gel-Electrophoresis
• Protein separation
  • Molecular weight (MW)
  • Isoelectric point (pI)
• Staining
• Bird’s-eye view of protein abundance
Mass Spectrometer
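[Figure: schematic of the mass spectrometer. Labels: sample (analyte/matrix) on a grounded backing plate; extraction grid at source voltage -Vs; detector grid at -Vs; regions of length s and length D, with Ed = 0 marked for the field-free region.]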
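The labels in the schematic (a source voltage Vs and a field-free region with Ed = 0 and length D) are consistent with a time-of-flight arrangement. As a rough sketch under that assumption: an ion of mass m and charge z accelerated through Vs gains kinetic energy z·e·Vs, so it crosses the drift region in t = D·sqrt(m / (2·z·e·Vs)). The 20 kV and 1 m settings below are illustrative assumptions, not values from the slide, and the time spent in the acceleration region of length s is ignored.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in coulombs
DALTON = 1.66053906660e-27   # kilograms per dalton

def drift_time(mass_da, charge, source_voltage, drift_length):
    """Seconds needed to cross a field-free region after acceleration through source_voltage."""
    mass_kg = mass_da * DALTON
    # Energy balance: z * e * Vs = (1/2) * m * v^2
    velocity = math.sqrt(2 * charge * E_CHARGE * source_voltage / mass_kg)
    return drift_length / velocity

# Hypothetical settings: 20 kV source voltage, 1 m drift region, singly charged ions.
for mass in (500.0, 1000.0, 2000.0):
    t = drift_time(mass, 1, 20_000.0, 1.0)
    print(f"{mass:7.1f} Da -> {t * 1e6:6.2f} microseconds")
```

Because the drift time grows with the square root of m/z, heavier ions arrive later, and quadrupling the mass doubles the flight time.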
Mass Spectrum
Mass is fundamental
Peptide Mass Fingerprint
• Cut out 2D-gel spot
• Trypsin digest
• MS
Protein Sequence
Peptide Masses
1811.90 GLSDGEWQQVLNVWGK
1606.85 VEADIAGHGQEVLIR
1271.66 LFTGHPETLEK
1378.83 HGTVVLTALGGILK
1982.05 KGHHEAELKPLAQSHATK
1853.95 GHHEAELKPLAQSHATK
1884.01 YLEFISDAIIHVLHSK
1502.66 HPGDFGADAQGAMTK
748.43 ALELFR
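A peptide mass list like this can be recomputed from the sequences. The sketch below assumes the listed values are singly protonated monoisotopic masses, [M+H]+ = sum of residue masses + water + one proton, and uses the residue masses tabulated later in the deck; the constants for water and the proton are standard values, not taken from the slides. The function names are illustrative.

```python
# Monoisotopic residue masses in Da (same values as the residue table later in the deck).
RESIDUE_MASS = {
    "A": 71.03712, "C": 103.00919, "D": 115.02695, "E": 129.04260,
    "F": 147.06842, "G": 57.02147, "H": 137.05891, "I": 113.08407,
    "K": 128.09497, "L": 113.08407, "M": 131.04049, "N": 114.04293,
    "P": 97.05277, "Q": 128.05858, "R": 156.10112, "S": 87.03203,
    "T": 101.04768, "V": 99.06842, "W": 186.07932, "Y": 163.06333,
}
WATER = 18.01056   # Da, added once per peptide
PROTON = 1.00728   # Da, added for the single positive charge

def mh_plus(peptide):
    """Singly protonated monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON

# Masses exactly as listed on the slide.
LISTED = [
    (1811.90, "GLSDGEWQQVLNVWGK"), (1606.85, "VEADIAGHGQEVLIR"),
    (1271.66, "LFTGHPETLEK"), (1378.83, "HGTVVLTALGGILK"),
    (1982.05, "KGHHEAELKPLAQSHATK"), (1853.95, "GHHEAELKPLAQSHATK"),
    (1884.01, "YLEFISDAIIHVLHSK"), (1502.66, "HPGDFGADAQGAMTK"),
    (748.43, "ALELFR"),
]

def outside_tolerance(listed, tol=0.05):
    """Peptides whose computed [M+H]+ differs from the listed mass by more than tol."""
    return [pep for mass, pep in listed if abs(mh_plus(pep) - mass) > tol]

for mass, pep in LISTED:
    print(f"{pep:20s} listed {mass:8.2f}  computed {mh_plus(pep):8.2f}")
print("outside tolerance:", outside_tolerance(LISTED))
```

Under these assumptions seven of the nine entries agree to within about 0.02 Da. GLSDGEWQQVLNVWGK and YLEFISDAIIHVLHSK come out near 1815.90 and 1885.02, so the listed 1811.90 and 1884.01 may be typos or may follow a different convention; the sketch flags them rather than correcting them.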
Peptide Mass Fingerprint
[Figure: mass spectrum with peaks labeled by peptide, in order of increasing m/z: ALELFR, LFTGHPETLEK, HGTVVLTALGGILK, HPGDFGADAQGAMTK, VEADIAGHGQEVLIR, GLSDGEWQQVLNVWGK, GHHEAELKPLAQSHATK, YLEFISDAIIHVLHSK, KGHHEAELKPLAQSHATK.]
Mass Spectrometry
• Strengths
  • Precise molecular weight
  • Fragmentation
  • Automated
• Weaknesses
  • Best for a few molecules at a time
  • Best for small molecules
  • Mass-to-charge ratio, not mass
  • Intensity ≠ Abundance
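The "mass-to-charge ratio, not mass" point can be made concrete with a small calculation. This is a minimal sketch assuming positive-mode ions that carry z protons, so m/z = (M + z × 1.00728) / z; the neutral mass used is a hypothetical example.

```python
PROTON = 1.00728  # Da

def mz(neutral_mass, charge):
    """m/z of an ion formed by adding `charge` protons to a neutral molecule."""
    return (neutral_mass + charge * PROTON) / charge

# Hypothetical peptide with a neutral monoisotopic mass of 1165.55 Da.
neutral = 1165.55
for z in (1, 2, 3):
    print(f"z = {z}: m/z = {mz(neutral, z):.2f}")
```

The same molecule therefore appears at a different position on the m/z axis for each charge state, and the charge has to be determined before a peak can be converted back to a mass.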
Sample Preparation for MS/MS
Enzymatic Digest and Fractionation
Single Stage MS
MS
Tandem Mass Spectrometry (MS/MS)
Precursor selection + collision-induced dissociation (CID)
MS/MS
Peptide Fragmentation
H…-HN-CH(R_{i-1})-CO-NH-CH(R_i)-CO-NH-CH(R_{i+1})-CO-…OH
(N-terminus at left, C-terminus at right)
Peptide Fragmentation
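[Figure: backbone segment -HN-CH(R_i)-CO-NH-CH(R_{i+1})-CO-NH- cleaved at the amide bonds. The N-terminal fragments are labeled b_i and b_{i+1}; the complementary C-terminal fragments are labeled y_{n-i} and y_{n-i-1}.]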
Peptide Fragmentation
Peptide: S-G-F-L-E-E-D-E-L-K
MW    ion   N-terminal   C-terminal   ion   MW
 88   b1    S            GFLEEDELK    y9    1080
145   b2    SG           FLEEDELK     y8    1022
292   b3    SGF          LEEDELK      y7     875
405   b4    SGFL         EEDELK       y6     762
534   b5    SGFLE        EDELK        y5     633
663   b6    SGFLEE       DELK         y4     504
778   b7    SGFLEED      ELK          y3     389
907   b8    SGFLEEDE     LK           y2     260
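The table can be regenerated from the sequence. The sketch below assumes singly charged fragments and monoisotopic masses: a b ion is the sum of its residue masses plus one proton, and a y ion is the sum of its residue masses plus water plus one proton. The function names are illustrative.

```python
# Monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "A": 71.03712, "C": 103.00919, "D": 115.02695, "E": 129.04260,
    "F": 147.06842, "G": 57.02147, "H": 137.05891, "I": 113.08407,
    "K": 128.09497, "L": 113.08407, "M": 131.04049, "N": 114.04293,
    "P": 97.05277, "Q": 128.05858, "R": 156.10112, "S": 87.03203,
    "T": 101.04768, "V": 99.06842, "W": 186.07932, "Y": 163.06333,
}
WATER, PROTON = 18.01056, 1.00728

def b_ions(peptide):
    """Singly charged b-ion m/z values, b1 through b(n-1)."""
    masses, total = [], 0.0
    for aa in peptide[:-1]:
        total += RESIDUE_MASS[aa]
        masses.append(total + PROTON)
    return masses

def y_ions(peptide):
    """Singly charged y-ion m/z values, y1 through y(n-1)."""
    masses, total = [], 0.0
    for aa in reversed(peptide[1:]):
        total += RESIDUE_MASS[aa]
        masses.append(total + WATER + PROTON)
    return masses

peptide = "SGFLEEDELK"
b, y = b_ions(peptide), y_ions(peptide)
n = len(peptide)
for i in range(1, n):
    # Cleavage after residue i gives b_i (prefix) and y_(n-i) (suffix).
    print(f"{b[i - 1]:8.2f} b{i} {peptide[:i]:<9s} {peptide[i:]:<9s} y{n - i} {y[n - i - 1]:8.2f}")
```

Rounded to integers, the computed b ions reproduce the slide's values; the y ions agree to within a dalton, with small differences that depend on the mass convention and rounding used for the slide.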
Peptide Fragmentation
b ions    88  145  292  405  534  663  778  907 1020 1166
           S    G    F    L    E    E    D    E    L    K
y ions  1166 1080 1022  875  762  633  504  389  260  147
[Figure: MS/MS spectrum, % intensity (0 to 100) against m/z (axis ticks at 250, 500, 750, 1000). In the annotated build the labeled peaks are y2 through y9 and b3 through b9.]
Peptide Identification
Given:
• The mass of the precursor ion, and
• The MS/MS spectrum
Output:
• The amino-acid sequence of the peptide
Two paradigms:
• De novo interpretation
• Sequence database search
De Novo Interpretation
[Figure: MS/MS spectrum, % intensity (0 to 100) against m/z (axis ticks at 250, 500, 750, 1000), shown in successive builds. The gaps between neighboring peaks are labeled with amino-acid letters, first E and L, then complete ladders reading S-G-F-L-E-E-D-E-L along one ion series and K-L-E-D-E-E-L-F-G along the other.]
De Novo Interpretation
Amino acid           Residue MW      Amino acid           Residue MW
A  Alanine             71.03712      M  Methionine         131.04049
C  Cysteine           103.00919      N  Asparagine         114.04293
D  Aspartic acid      115.02695      P  Proline             97.05277
E  Glutamic acid      129.04260      Q  Glutamine          128.05858
F  Phenylalanine      147.06842      R  Arginine           156.10112
G  Glycine             57.02147      S  Serine              87.03203
H  Histidine          137.05891      T  Threonine          101.04768
I  Isoleucine         113.08407      V  Valine              99.06842
K  Lysine             128.09497      W  Tryptophan         186.07932
L  Leucine            113.08407      Y  Tyrosine           163.06333
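De novo interpretation amounts to looking up the mass difference between neighboring peaks of one ion series in this table. The sketch below is a simplified illustration, assuming the peaks are already known to be singly charged b ions and using the integer b-ion values and precursor mass (1166) from the SGFLEEDELK slides; the tolerance and function names are assumptions.

```python
# Monoisotopic residue masses in Da, copied from the table.
RESIDUE_MASS = {
    "A": 71.03712, "C": 103.00919, "D": 115.02695, "E": 129.04260,
    "F": 147.06842, "G": 57.02147, "H": 137.05891, "I": 113.08407,
    "K": 128.09497, "L": 113.08407, "M": 131.04049, "N": 114.04293,
    "P": 97.05277, "Q": 128.05858, "R": 156.10112, "S": 87.03203,
    "T": 101.04768, "V": 99.06842, "W": 186.07932, "Y": 163.06333,
}
WATER, PROTON = 18.01056, 1.00728

def residues_for(delta, tol=0.2):
    """Residue letters whose mass lies within tol of the mass difference delta."""
    return sorted(aa for aa, m in RESIDUE_MASS.items() if abs(m - delta) <= tol)

def read_b_ladder(b_peaks, precursor_mh, tol=0.2):
    """Candidate residues at each position, read from a b-ion ladder."""
    peaks = sorted(b_peaks)
    first = residues_for(peaks[0] - PROTON, tol)                  # b1 = residue + proton
    middle = [residues_for(hi - lo, tol) for lo, hi in zip(peaks, peaks[1:])]
    last = residues_for(precursor_mh - peaks[-1] - WATER, tol)    # remainder = residue + water
    return [first] + middle + [last]

b_peaks = [88, 145, 292, 405, 534, 663, 778, 907, 1020]
ladder = read_b_ladder(b_peaks, precursor_mh=1166)
print(" ".join("/".join(options) or "?" for options in ladder))
```

Two ambiguities are visible in the table itself: I and L have identical masses, and K and Q differ by only about 0.036 Da, so a position may come back with more than one candidate residue.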
Sequence Database Search
• Compares peptides from a protein sequence database with spectra
• Filter peptide candidates by
  • Precursor mass
  • Digest motif
• Score each peptide against spectrum
  • Generate all possible peptide fragments
  • Match putative fragments with peaks
  • Score and rank
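The scoring step can be sketched as follows. This is a deliberately simple illustration that counts observed peaks matched by a theoretical b or y fragment (a shared peak count); it is not the scoring function of Mascot or of any other particular search engine. The peak list, the second candidate, and the 0.5 Da tolerance are hypothetical.

```python
# Monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "A": 71.03712, "C": 103.00919, "D": 115.02695, "E": 129.04260,
    "F": 147.06842, "G": 57.02147, "H": 137.05891, "I": 113.08407,
    "K": 128.09497, "L": 113.08407, "M": 131.04049, "N": 114.04293,
    "P": 97.05277, "Q": 128.05858, "R": 156.10112, "S": 87.03203,
    "T": 101.04768, "V": 99.06842, "W": 186.07932, "Y": 163.06333,
}
WATER, PROTON = 18.01056, 1.00728

def fragments(peptide):
    """Singly charged b- and y-ion m/z values for every backbone cleavage."""
    values = []
    for i in range(1, len(peptide)):
        b = sum(RESIDUE_MASS[aa] for aa in peptide[:i]) + PROTON
        y = sum(RESIDUE_MASS[aa] for aa in peptide[i:]) + WATER + PROTON
        values.extend([b, y])
    return values

def shared_peak_count(peptide, peaks, tol=0.5):
    """Number of observed peaks within tol of some theoretical fragment."""
    theoretical = fragments(peptide)
    return sum(any(abs(p - t) <= tol for t in theoretical) for p in peaks)

def rank(candidates, peaks, tol=0.5):
    """Candidates sorted from best to worst score."""
    return sorted(((shared_peak_count(c, peaks, tol), c) for c in candidates), reverse=True)

# Hypothetical observed peaks and two candidates with identical precursor mass.
peaks = [292.1, 405.2, 504.3, 633.3, 762.4, 875.4]
for score, peptide in rank(["SGFLEDEELK", "SGFLEEDELK"], peaks):
    print(score, peptide)
```

The two candidates differ only in the order of two residues, so the precursor mass cannot separate them; only the fragment at the swapped position does.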
Peptide Fragmentation
[Figure: the candidate peptide S G F L E E D E L K placed over the MS/MS spectrum (% intensity against m/z, axis ticks at 250, 500, 750, 1000). Successive builds add its b ions (88 145 292 405 534 663 778 907 1020 1166) and y ions (1166 1080 1022 875 762 633 504 389 260 147), then label the matching peaks: y2 through y9 and b3 through b9.]
Peptide Candidate Filtering
Digestion Enzyme: Trypsin
• Cuts just after K or R unless followed by a P.
• Must allow for “missed” cleavage sites
• “Average” peptide length is about 10-15 amino acids
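The cleavage rule and the allowance for missed cleavages can be sketched as an in-silico digest. The function names and the short test sequence are hypothetical; the rule implemented is the one stated above (cut after K or R unless the next residue is P).

```python
def cleavage_sites(sequence):
    """Positions where trypsin cuts: after K or R, unless the next residue is P."""
    return [i + 1 for i in range(len(sequence) - 1)
            if sequence[i] in "KR" and sequence[i + 1] != "P"]

def digest(sequence, max_missed=0):
    """Tryptic peptides with up to max_missed missed cleavage sites."""
    bounds = [0] + cleavage_sites(sequence) + [len(sequence)]
    peptides = []
    for start in range(len(bounds) - 1):
        for missed in range(max_missed + 1):
            end = start + missed + 1
            if end < len(bounds):
                peptides.append(sequence[bounds[start]:bounds[end]])
    return peptides

# Hypothetical sequence: the first K is followed by P, so it is not a cleavage site.
print(digest("AAKPGGRCCKDD"))
print(digest("AAKPGGRCCKDD", max_missed=1))
```

Allowing one missed cleavage adds every peptide that spans two adjacent fully cleaved peptides, which enlarges the candidate list but keeps incompletely digested peptides identifiable.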
Peptide Candidate Filtering
>ALBU_HUMAN
MKWVTFISLLFLFSSAYSRGVFRRDAHKSEVAHRFKDLGEENFKALVLIAFAQYLQQCPFE
DHVKLVNEVTEFAK…
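Candidate filtering on this entry can be sketched by digesting the sequence and keeping the peptides whose mass falls inside a precursor window. Only the residues shown above are used; the elided remainder of the protein is not reproduced. The precursor value (1149.6), the 0.5 Da window, the limit of one missed cleavage, and the function names are illustrative assumptions.

```python
# Monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "A": 71.03712, "C": 103.00919, "D": 115.02695, "E": 129.04260,
    "F": 147.06842, "G": 57.02147, "H": 137.05891, "I": 113.08407,
    "K": 128.09497, "L": 113.08407, "M": 131.04049, "N": 114.04293,
    "P": 97.05277, "Q": 128.05858, "R": 156.10112, "S": 87.03203,
    "T": 101.04768, "V": 99.06842, "W": 186.07932, "Y": 163.06333,
}
WATER, PROTON = 18.01056, 1.00728

# Only the portion of ALBU_HUMAN shown on the slide.
ALBU_PREFIX = ("MKWVTFISLLFLFSSAYSRGVFRRDAHKSEVAHRFKDLGEENFKALVLIAFAQYLQQCPFE"
               "DHVKLVNEVTEFAK")

def mh_plus(peptide):
    """Singly protonated monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON

def digest(sequence, max_missed=1):
    """Tryptic peptides (cut after K/R, not before P) with up to max_missed missed sites."""
    sites = [i + 1 for i in range(len(sequence) - 1)
             if sequence[i] in "KR" and sequence[i + 1] != "P"]
    bounds = [0] + sites + [len(sequence)]
    return [sequence[bounds[s]:bounds[s + m + 1]]
            for s in range(len(bounds) - 1)
            for m in range(max_missed + 1)
            if s + m + 1 < len(bounds)]

def candidates(sequence, precursor_mh, tol=0.5, max_missed=1):
    """Tryptic peptides whose [M+H]+ lies within tol of the observed precursor."""
    return [p for p in digest(sequence, max_missed)
            if abs(mh_plus(p) - precursor_mh) <= tol]

for peptide in candidates(ALBU_PREFIX, 1149.6):
    print(f"{peptide:12s} {mh_plus(peptide):9.3f}")
```

Under these assumptions two peptides fall inside the window, DAHKSEVAHR (one missed cleavage) and LVNEVTEFAK, about 0.04 Da apart. Precursor mass alone cannot separate them at this tolerance, which is why each surviving candidate is then scored against the MS/MS spectrum.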
Mascot Search Engine
Mascot MS/MS Ions Search
Mascot MS/MS Search Results
Summary