Chapter 5: Modern Sequencing - Sequencing by Mass Spectrometry

This document discusses modern protein sequencing techniques including mass spectrometry and database search methods. It describes how sequencing information can be used to determine evolutionary relationships by comparing conserved residues and calculating percent identity/similarity between proteins. A key example discussed is cytochrome c, a highly conserved protein that is nearly identical across many species and has evolved for approximately 1.5-2 billion years. The document also covers protein domains, gene duplication, and how protein families such as globin and fibronectin arise and evolve over time.

Uploaded by

Kyle Broflovski
Copyright
© Attribution Non-Commercial (BY-NC)

Chapter 5: Modern Sequencing

Sequencing by Mass Spectrometry
Mass Spectrum
Fragmentation
Sequencing
Applications of Sequence Information
Sequence Comparison
Evolutionary Relationships

Peptide fragmentation is predictable!

The mass difference between adjacent fragment ions identifies the amino acid residue!

Deriving Sequence Information


Database Search Methods
Compare the observed fragmentation pattern against in silico fragmentation patterns predicted for the protein sequences in a database.
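The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular search engine scores spectra: it predicts only singly charged b ions, uses standard monoisotopic residue masses, and simply counts matching peaks. The function names, the 0.02 Da tolerance, and the candidate peptides are assumptions made for the example.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.00728  # mass of a proton in Da


def b_ions(peptide):
    """Predict singly charged b-ion m/z values for a peptide (in silico fragmentation)."""
    total, ions = PROTON, []
    for aa in peptide[:-1]:  # the full-length peptide is not a b-ion fragment
        total += RESIDUE_MASS[aa]
        ions.append(total)
    return ions


def count_matches(observed, peptide, tol=0.02):
    """Count predicted b ions that have an observed peak within tol Da."""
    return sum(any(abs(p - o) <= tol for o in observed) for p in b_ions(peptide))


def best_match(observed, database, tol=0.02):
    """Return the database peptide whose predicted pattern matches the most peaks."""
    return max(database, key=lambda pep: count_matches(observed, pep, tol))


# Hypothetical example: the "observed" peaks are generated from GASVK itself.
observed_peaks = b_ions("GASVK")
print(best_match(observed_peaks, ["GSAVK", "KVSAG", "GASVK"]))  # GASVK
```

Counting shared peaks is the simplest possible score; it is used here only to make the idea of matching observed against predicted patterns concrete.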

De novo Sequencing
Determine the mass differences between successive fragment ions and relate them to the amino acid sequence.
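As a rough sketch of the de novo idea, the Python below reads residues off a ladder of fragment masses by matching each gap to a residue mass. It assumes an idealized, complete ladder from a single ion series; the function name, the 0.02 Da tolerance, and the example ladder are illustrative assumptions. Leu and Ile have the same mass, so the mass difference alone cannot tell them apart.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def residues_from_ladder(ladder, tol=0.02):
    """Assign each gap between consecutive fragment masses to candidate residues.

    ladder must be sorted from lightest to heaviest fragment.
    Ambiguous gaps are joined with "/", unassignable gaps become "?".
    """
    calls = []
    for lighter, heavier in zip(ladder, ladder[1:]):
        delta = heavier - lighter
        matches = [aa for aa, mass in RESIDUE_MASS.items() if abs(mass - delta) <= tol]
        calls.append("/".join(matches) if matches else "?")
    return calls


# Hypothetical b-ion ladder (m/z, singly charged).
print(residues_from_ladder([58.029, 129.066, 216.098, 315.166]))  # ['A', 'S', 'V']
```

With real spectra the ladder is usually incomplete and noisy, so gaps may correspond to two residues or to none; this sketch flags only the latter case.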

Total Protein Sequencing


Performed by either chemical or instrumental methods.
Requires overlapping fragmentation of the polypeptide sequence.
The protein sequence is assembled from the overlapping peptide sequences.
Mapping of PTMs and disulfide linkages requires further experimental methods.
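Assembly from overlapping peptides can be illustrated with a small greedy merge in Python. This is a sketch under simplifying assumptions: the peptides are error-free, overlaps of at least three residues are unique, and the sequence and both fragment sets are invented for the example. The helper names are not from the source.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that is also a prefix of b (0 if < min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def assemble(peptides, min_len=3):
    """Greedily merge peptides on their longest suffix/prefix overlaps."""
    # Drop duplicates and peptides fully contained in a longer peptide.
    unique = list(dict.fromkeys(peptides))
    frags = [p for p in unique if not any(p != q and p in q for q in unique)]
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps; return the unmerged pieces
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for n, f in enumerate(frags) if n not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Hypothetical fragments from two different cleavage methods of one polypeptide.
set_one = ["MK", "TAYIAK", "QR", "QISFVK"]
set_two = ["MKTAY", "IAKQRQISF", "VK"]
print(assemble(set_one + set_two))  # ['MKTAYIAKQRQISFVK']
```

Neither fragment set alone fixes the order of its peptides; the second set supplies the overlaps that bridge the cleavage points of the first.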

Applying Sequence Information: Homology and Evolutionarily Related Proteins

Conserved/invariant residues: these residues reflect chemical/structural/functional necessities.
Conservative substitutions: substitutions between residues with similar chemical properties, e.g., Asp for Glu, Lys for Arg, Ile for Val.
Variable regions/non-conservative substitutions: positions with no specific chemical, structural, or functional requirement.


Percent identity = 100*(# identical residues)/(# total)

Percent similarity = 100*(# identical+similar residues)/(# total)


Percent similarity is always greater than or equal to percent identity, because every identical residue also counts as similar.
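The two formulas can be checked on a toy alignment. The Python sketch below assumes the two sequences are already aligned, have equal length, and contain no gaps; the similarity groups are limited to the three conservative pairs named on the earlier slide (Asp/Glu, Lys/Arg, Ile/Val), which is a deliberate simplification. The sequences are hypothetical.

```python
# Conservative pairs named in these notes; real scoring schemes use larger groups.
SIMILAR_GROUPS = [set("DE"), set("KR"), set("IV")]


def is_similar(a, b):
    """True if two residues are identical or belong to the same conservative group."""
    return a == b or any(a in group and b in group for group in SIMILAR_GROUPS)


def percent_identity(seq1, seq2):
    """100 * (# identical residues) / (# total aligned positions)."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    return 100 * sum(a == b for a, b in zip(seq1, seq2)) / len(seq1)


def percent_similarity(seq1, seq2):
    """100 * (# identical + similar residues) / (# total aligned positions)."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    return 100 * sum(is_similar(a, b) for a, b in zip(seq1, seq2)) / len(seq1)


# Hypothetical aligned sequences, 10 positions each.
s1 = "DKIVGASTLP"
s2 = "EKVLGASTMP"
print(percent_identity(s1, s2))    # 60.0
print(percent_similarity(s1, s2))  # 80.0
```

Six positions are identical and two more (D/E and I/V) are conservative substitutions, giving 60% identity and 80% similarity.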

Some Web Resources

Cytochrome C: The Story of a Highly Conserved Protein


Mitochondrial electron transport
Critical for energy metabolism
~1.5-2 billion years old
Identical amino acids in 38 positions across all 38 organisms in the Table.
Most other organisms have similar amino acids in these positions.

The numbers at the bottom indicate the number of different residue types that are accommodated at each position in the sequence.
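The per-position counts described above are straightforward to compute from an alignment. The Python sketch below assumes gap-free aligned sequences of equal length; the four short sequences are invented for illustration and are not real cytochrome c data.

```python
def residue_types_per_position(alignment):
    """Number of different residue types found in each column of an alignment."""
    return [len(set(column)) for column in zip(*alignment)]


def invariant_positions(alignment):
    """1-based positions at which every sequence has the same residue."""
    counts = residue_types_per_position(alignment)
    return [i + 1 for i, n in enumerate(counts) if n == 1]


# Hypothetical aligned sequences.
aligned = ["GDVEKG", "GDVAKG", "GDIEKG", "GNVEKG"]
print(residue_types_per_position(aligned))  # [1, 2, 2, 2, 1, 1]
print(invariant_positions(aligned))         # [1, 5, 6]
```

A count of 1 marks an invariant position; larger counts mark positions that tolerate substitution.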

AA Difference Matrix, 26 Species of Cytochrome C


Species key: 1 Man, chimp; 2 Rhesus monkey; 3 Horse; 4 Donkey; 5 Cow, sheep; 6 Dog; 7 Gray whale; 8 Rabbit; 9 Kangaroo; 10 Chicken; 11 Penguin; 12 Duck; 13 Rattlesnake; 14 Turtle; 15 Bullfrog; 16 Tuna fish; 17 Worm fly; 18 Silk moth; 19 Wheat; 20 Bread mold; 21 Yeast; 22 Candida k.

    1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22
 1  0  1 12 11 10 11 10  9 10 13 13 11 14 15 18 21 27 31 43 48 45 51
 2     0 11 10  9 10  9  8 11 12 12 10 15 14 17 21 26 30 43 47 45 51
 3        0  1  3  6  5  6  7 11 12 10 22 11 14 19 22 29 46 46 46 51
 4           0  2  5  4  5  8 10 11  9 21 10 13 18 22 28 45 46 45 50
 5              0  3  2  4  6  9 10  8 20  9 11 17 22 27 45 46 45 50
 6                 0  3  5  7 10 10  8 21  9 12 18 21 25 44 46 45 49
 7                    0  2  6  9  9  7 19  8 11 17 22 27 44 46 45 50
 8                       0  6  8  8  6 18  9 11 17 21 26 44 46 45 50
 9                          0 12 10 10 21 11 13 18 24 28 47 49 46 51
10                             0  2  3 19  8 11 17 23 28 46 47 46 51
11                                0  3 20  8 12 18 24 27 46 48 45 50
12                                   0 17  7 11 17 22 27 46 46 46 51
13                                      0 22 24 26 29 31 46 47 47 51
14                                         0 10 18 24 28 46 49 49 53
15                                            0 15 22 29 48 49 47 51
16                                               0 24 32 49 48 47 48
17                                                  0 14 45 41 45 47
18                                                     0 45 47 47 47
19                                                        0 54 47 50
20                                                           0 41 42
21                                                              0 27
22                                                                 0

Average differences shown on the slide (group labels not preserved): 10.0, 5.1, 9.9, 14.3, 12.6, 18.5, 25.9, 47.0

Approx. 100-residue protein
Involved in electron transport
Established 1.5-2 billion yrs ago for aerobic respiration
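Each entry in a difference matrix of this kind is the number of aligned positions at which two sequences differ. The Python sketch below shows that calculation, assuming gap-free aligned sequences of equal length; the species labels and sequences are placeholders, not data from the table.

```python
from itertools import combinations


def count_differences(seq1, seq2):
    """Number of aligned positions at which two sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq1, seq2))


def difference_matrix(sequences):
    """Symmetric matrix of pairwise differences, keyed by (name, name)."""
    matrix = {(name, name): 0 for name in sequences}
    for a, b in combinations(sequences, 2):
        d = count_differences(sequences[a], sequences[b])
        matrix[(a, b)] = matrix[(b, a)] = d
    return matrix


# Hypothetical aligned sequences for three placeholder species.
toy = {
    "species_A": "GDVEKGKKIF",
    "species_B": "GDVEKGKKIF",
    "species_C": "GDAEKGAKIF",
}
m = difference_matrix(toy)
print(m[("species_A", "species_B")])  # 0
print(m[("species_A", "species_C")])  # 2
```

Only the upper triangle needs to be displayed because the matrix is symmetric and its diagonal is zero.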

Phylogenetic Tree of Cytochrome C

NOTE: All present-day organisms have evolved by about the same amount; "lower" organisms did not evolve first and "higher" organisms later.

number of amino acid differences per 100 residues

DNA is assumed to mutate at a constant rate.


However, mutations that kill the function of an important protein are not viable.

Highly conserved proteins have very few mutations and are unlikely to be dispensable!
e.g., histone H4!

Protein Domain Sequence-Function Relationships


Domains that are > ~40% identical usually have the same function.
Domains with similar functions have similar sequences.
Domains with < ~25% sequence identity usually perform different roles.

Gene Duplication, Evolution, Protein Families


Many proteins within an organism share sequence similarity with other proteins; these groups are called gene or protein families.
The relatedness among members of a family can vary greatly.
These families arise by gene duplication.
Once duplicated, individual genes can mutate and take on separate functions.
Duplicated genes may therefore come to differ in their chemical properties.
Example: the globin family.

The Globin Protein Family


Related Proteins

Hemoglobin: α2β2
Fetal hemoglobin: α2γ2
Embryonic hemoglobin: ζ2ε2
Hemoglobin (1%): α2δ2, unknown function; it may evolve one in the future
Proteins in a family are derived from a common progenitor, not from each other!

Progenitor Protein

Gene Duplication in Clotting Proteins

Duplication is not limited to whole genes!

Duplications of domains can also lead to new proteins!


Proteins can be modular!

Fibronectin Domains 1 and 2

A.R.Pickford et al. (2001). The hairpin structure of the (6)F1(1)F2(2)F2 fragment from human fibronectin enhances gelatin binding. EMBO J, 20, 1519-1529.

Fibronectin Domain 3

Courtesy of PDB

Multiple Fibronectin Domains

A.R.Pickford et al. (2001). The hairpin structure of the (6)F1(1)F2(2)F2 fragment from human fibronectin enhances gelatin binding. EMBO J, 20, 1519-1529.

For Next Time:


Finish reading Chapter 5
Finish problems in Textbook and Student Companion
Read Chapter 6
Start problems in Textbook and Student Companion
