BLAST Lab
Alveena Khan
SBI 3U0
Comparing DNA Sequences to Understand Evolutionary
Relationships with BLAST
How can bioinformatics be used as a tool to determine evolutionary
relationships and to better understand genetic diseases?
■■BACKGROUND
Between 1990 and 2003, scientists working on an international research project known as the Human Genome Project
were able to identify and map the 20,000–25,000 genes that define a human being. The project also successfully
mapped the genomes of other species, including the fruit fly, mouse, and Escherichia coli. The location and complete
sequence of the genes in each of these species are available for anyone in the world to access via the Internet.
Why is this information important? Being able to identify the precise location and sequence of human genes will
allow us to better understand genetic diseases. In addition, learning about the sequence of genes in other species
helps us understand evolutionary relationships among organisms. Many of our genes are identical or similar to those
found in other species.
Suppose you identify a single gene that is responsible for a particular disease in fruit flies. Is that same gene found in
humans? Does it cause a similar disease? It would take you nearly 10 years to read through the entire human
genome to try to locate the same sequence of bases as that in fruit flies. This definitely isn’t practical, so a
sophisticated technological method is needed.
Note that the cladogram is treelike, with the endpoints of each branch representing a specific species. The closer
two species are located to each other, the more recently they share a common ancestor. For example,
Selaginella (spikemoss) and Isoetes (quillwort) share a more recent common ancestor than the common ancestor that
is shared by all three organisms.
■■Learning Objectives
● To create cladograms that depict evolutionary relationships
● To analyze biological data with a sophisticated bioinformatics online tool
● To use cladograms and bioinformatics tools to ask other questions of your own and to test your ability to
apply concepts you know relating to genetics and evolution
2. GAPDH (glyceraldehyde 3-phosphate dehydrogenase) is an enzyme that catalyzes the sixth step in glycolysis, an
important reaction that produces molecules used in cellular respiration. The following data table shows the
percentage similarity of this gene and the protein it expresses in humans versus other species. For example,
according to the table, the GAPDH gene in chimpanzees is 99.6% identical to the gene found in humans, while
the protein is identical.
Table 2. Percentage Similarity Between the GAPDH Gene and Protein in Humans and Other Species

Species                                Gene Percentage Similarity    Protein Percentage Similarity
Chimpanzee (Pan troglodytes)           99.6%                         100%
Dog (Canis lupus familiaris)           91.3%                         95.2%
Fruit fly (Drosophila melanogaster)    72.4%                         76.7%
Roundworm (Caenorhabditis elegans)     68.2%                         74.3%
a. Why is the percentage similarity in the gene always lower than the percentage similarity in the
protein for each of the species? (Hint: Recall how a gene is expressed to produce a protein.)
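The genetic code is redundant: several different codons specify the same amino acid. Some base changes in the gene
(silent mutations) therefore leave the protein unchanged, so the gene sequences can differ more than the proteins
they encode.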
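The hint in part (a) points to how a gene is translated into a protein. The short sketch below illustrates the idea with two made-up DNA sequences (gene_a and gene_b are hypothetical and are not real GAPDH data) and a partial codon table that holds only the codons used here. It assumes the sequences are already aligned and of equal length.

```python
# Partial standard codon table: only the codons used in this illustration.
CODON_TABLE = {
    "ATG": "M",              # methionine
    "GGT": "G", "GGC": "G",  # glycine (two synonymous codons)
    "AAA": "K", "AAG": "K",  # lysine (two synonymous codons)
    "GTT": "V", "GTC": "V",  # valine (two synonymous codons)
    "ATT": "I",              # isoleucine
}

def translate(dna):
    """Translate a DNA coding sequence, codon by codon, into one-letter amino acids."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))

def percent_identity(a, b):
    """Percentage of positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)

# Hypothetical sequences, invented for this example.
gene_a = "ATGGGTAAAGTTATT"  # codons: ATG GGT AAA GTT ATT
gene_b = "ATGGGCAAGGTCGTT"  # codons: ATG GGC AAG GTC GTT

protein_a = translate(gene_a)
protein_b = translate(gene_b)

print("Gene identity:   ", round(percent_identity(gene_a, gene_b), 1))
print("Protein identity:", round(percent_identity(protein_a, protein_b), 1))
```

Three of the four base differences fall on synonymous codons, so only one amino acid changes. The protein identity comes out higher than the gene identity, which is the same pattern seen in Table 2.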
b. Draw a cladogram depicting the evolutionary relationships among all five species (including
humans) according to their percentage similarity in the GAPDH gene.
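One way to think about part (b) is to rank the species by their gene similarity to humans and join the most similar species first. The sketch below does this with the gene percentages from Table 2. The function name nested_cladogram is made up for this illustration, and the approach is a simplification: the table compares each species only to humans, so the result is a ladder-shaped grouping rather than a tree built from all pairwise comparisons.

```python
# Gene percentage similarity to the human GAPDH gene, taken from Table 2.
gene_similarity = {
    "Chimpanzee": 99.6,
    "Dog": 91.3,
    "Fruit fly": 72.4,
    "Roundworm": 68.2,
}

def nested_cladogram(similarity, reference="Human"):
    """Return a nested grouping in which the most similar species joins the reference first."""
    # Rank species from most to least similar to the reference species.
    ranked = sorted(similarity, key=similarity.get, reverse=True)
    tree = reference
    for species in ranked:
        # Each new species branches off outside the group built so far.
        tree = f"({tree}, {species})"
    return tree

print(nested_cladogram(gene_similarity))
# prints ((((Human, Chimpanzee), Dog), Fruit fly), Roundworm)
```

Each pair of parentheses marks a group that shares a common ancestor, so the innermost pair (Human, Chimpanzee) shares the most recent one.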
1. What species in the BLAST result has the most similar gene sequence to the gene of interest?
For gene 1, the species with the most similar gene sequence in the BLAST results is the chicken. For gene 2 it is the
fruit fly, for gene 3 it is the zebra finch, and for gene 4 it is the Chinese alligator.
Gene 1 is placed with the birds. Gene 2 is placed with the invertebrates, such as insects. Gene 3 is also placed with
the birds. Gene 4 is placed with the crocodilians. Gene 3 is the one that is most closely connected to the fossil.
3. How similar is that gene sequence?
Gene 1 is 99% identical to the chicken sequence, with a score of 10,091. Gene 2 is 100% identical to the fruit fly
sequence, with a score of 4,420. Gene 3 is 100% identical to the zebra finch sequence, with a score of 1,193. Gene 4
is 100% identical to the Chinese alligator sequence, with a score of 1,768.
4. What species has the next most similar gene sequence to the gene of interest?
The ring-necked pheasant has the next most similar gene sequence to gene 1. A fly has the next most similar
gene sequence to gene 2. The Bengalese finch has the next most similar gene sequence to gene 3. The
American alligator has the next most similar gene sequence to gene 4.
5. Based on what you have learned from the sequence analysis and what you know from the
structure, decide where the new fossil species belongs on the cladogram with the other
organisms. If necessary, redraw the cladogram in your lab notebook.
6. What other data could be collected from the fossil specimen to help properly identify its
evolutionary story?
To confirm that the fruit fly gene has nothing to do with the prehistoric species, more genes from the fossil could be
sequenced and added to the cladogram. Additional genes would provide more evidence of where the unknown species
falls among the other organisms.
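The reasoning about using several genes can be sketched as a simple tally. The example below counts how many of the four genes point to each group, using the top BLAST hits and group placements reported in the answers above. The names top_hits, group_of, and tally_groups are made up for this illustration, and a simple vote like this is only a rough check, not a real phylogenetic method.

```python
from collections import Counter

# Top BLAST hit for each gene, as reported in the answers above.
top_hits = {
    "gene 1": "chicken",
    "gene 2": "fruit fly",
    "gene 3": "zebra finch",
    "gene 4": "Chinese alligator",
}

# Broad group for each species, following the placements described above.
group_of = {
    "chicken": "birds",
    "fruit fly": "insects",
    "zebra finch": "birds",
    "Chinese alligator": "crocodilians",
}

def tally_groups(hits, groups):
    """Count how many genes have their top hit in each group."""
    return Counter(groups[species] for species in hits.values())

tally = tally_groups(top_hits, group_of)
for group, count in tally.most_common():
    print(f"{group}: {count} gene(s)")
```

With more genes in the tally, a single outlier such as the fruit fly hit carries less weight in the overall placement.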
Designing and Conducting Your Investigation: Myosins
What is the function in humans of the protein produced from that gene?
Myosins are involved in growth and tissue formation, metabolism, reproduction, communication, reshaping, and
movement of all 100 trillion cells in the human body. Further, myosins power the rapid entry of microbial pathogens
such as parasites, viruses, and bacteria into eukaryotic host cells.
Would you expect to find the same protein in other organisms? If so, which ones?
I would expect to find the same protein in species such as chimpanzees, gorillas, or gibbons, because they are so
closely related to humans.
Is it possible to find the same gene in two different kinds of organisms but not find the protein
that is produced from that gene?
Yes. Latent genes, or genes that are not expressed, are responsible for this phenomenon. Two separate species can
share the same gene, but one may have it in a dormant state while the other has it active and expressed, producing
the protein associated with the shared gene.
If you found the same gene in all organisms you test, what does this suggest about the evolution
of this gene in the history of life on earth?
This suggests that the gene is highly conserved: it has changed very little over the course of life on Earth and was
most likely already present in a common ancestor of all the organisms tested.
Does the use of DNA sequences in the study of evolutionary relationships mean that other
characteristics are unimportant in such studies? Explain your answer.
In the study of evolutionary relationships, DNA sequences provide accurate and reliable data that all scientists can
access, but this does not make other characteristics unimportant. Scientists also use fossils and other lines of
evidence to learn about the evolution of species.