Bio Primer
Genetic Material
• DNA (deoxyribonucleic acid) is the genetic
material
• Information stored in DNA
– the basis of inheritance
– distinguishes living things from nonliving
things
• Genes
– units that govern a living thing’s
characteristics at the genetic level
Nucleotides
• Genes themselves contain their information as a specific
sequence of nucleotides found in DNA molecules
• Only four different bases in DNA molecules
– Guanine (G)
– Adenine (A)
– Thymine (T)
– Cytosine (C)
[Figure: a nucleotide, made up of a phosphate (P), a sugar, and a base]
[Figure: purines (A, G) and pyrimidines (C, T); a nucleoside is a base attached to a sugar, without the phosphate]
Nucleotides
• Complicated genes can be many
thousands of nucleotides long
• All of an organism’s genetic instructions,
its genome, can be maintained in millions
or even billions of nucleotides
Orientation
• Strings of nucleotides can be attached to
each other to make long polynucleotide
chains
• 5’ (5 prime) end
– The end of a string of nucleotides with a 5'
carbon not attached to another nucleotide
• 3’ (3 prime) end
– The other end of the molecule with an
unattached 3' carbon
[Figure: numbering of the sugar carbons, 1’ to 5’]
Base Pairing
• Structure of DNA
– Double helix
– Seminal paper by Watson and Crick in 1953
– Rosalind Franklin’s contribution
• The information content of one strand is
essentially redundant with the information on the
other
– Not exactly the same—it is complementary
• Base pair
– G paired with C (G ≡ C, three hydrogen bonds)
– A paired with T (A = T, two hydrogen bonds)
• Reverse complements
– 5' end of one strand corresponding to the 3' end of its
complementary strand and vice versa
• Example
– one strand: 5'-GTATCC-3'
the other strand: 3'-CATAGG-5', that is, 5'-GGATAC-3'
• Upstream: Sequence features that are 5' to a
particular reference point
• Downstream: Sequence features that are 3' to a
particular reference point
[Figure: a strand drawn 5' to 3'; upstream lies toward the 5' end, downstream toward the 3' end]
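The reverse-complement rule above can be sketched in a few lines of Python. This is a minimal illustration, not a library routine: the names `COMPLEMENT`, `complement`, and `reverse_complement` are chosen here for the example, and the input is assumed to be an uppercase DNA string written 5' to 3'.

```python
# Watson-Crick pairing for DNA: G pairs with C, A pairs with T.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq):
    """Return the paired bases in the same left-to-right order.

    If seq is written 5'->3', the result reads 3'->5'.
    """
    return "".join(COMPLEMENT[base] for base in seq)


def reverse_complement(seq):
    """Return the complementary strand rewritten 5'->3'."""
    return complement(seq)[::-1]


strand = "GTATCC"                   # 5'-GTATCC-3'
print(complement(strand))           # CATAGG, read 3'->5'
print(reverse_complement(strand))   # GGATAC, read 5'->3'
```

Reversing after complementing is what keeps both strands written in the conventional 5' to 3' direction.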
DNA Structure
Chromosome
• Threadlike "packages" of genes and other
DNA in the nucleus of a cell
• Different kinds of organisms have different
numbers of chromosomes
• Humans
– 23 pairs
– 46 in all
Central Dogma of Molecular Biology
• DNA: information storage
• Protein: functional unit, such as an enzyme
• Gene: the instructions needed to make a protein
• Central dogma: DNA → RNA → protein
• Central dogma, additional flows
– replication: DNA → DNA (DNA polymerase)
– reverse transcription: RNA → DNA (reverse transcriptase)
• DNA obtained from reverse transcription is
called complementary DNA (cDNA)
– The difference between DNA and cDNA will be
discussed later
• RNA (ribonucleic acid)
– Single-stranded polynucleotide
– Bases
• A
• G
• C
• U (uracil), instead of T
• Transcription (simplified …)
– A → A, G → G, C → C, T → U
[Figure: DNA and RNA nucleotides side by side; the RNA sugar has an OH group where the DNA sugar has an H]
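The simplified transcription rule above (every base copied as is, except T becoming U) amounts to a single string replacement. The sketch below assumes the input is the coding strand written 5' to 3'; the function name `transcribe` is illustrative, not from the slides.

```python
def transcribe(coding_strand):
    """Simplified transcription: copy the coding strand, writing U for T."""
    return coding_strand.upper().replace("T", "U")


print(transcribe("GTATCC"))  # GUAUCC
```

In the cell, RNA polymerase actually reads the complementary template strand, which is why the slide labels this rule as simplified.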
DNA Replication (DNA → DNA)
DNA Replication Animation
Courtesy of Rob Rutherford, St. Olaf University
Transcription (DNA → RNA)
• Messenger RNA (mRNA)
– carries information to be
translated
• Ribosomal RNA (rRNA)
– the working “spine” of
the ribosome
• Transfer RNA (tRNA)
– the “decoder keys” that
will translate nucleic
acids to amino acids
Transcription Animation
Courtesy of Rob Rutherford, St. Olaf University
Peptides and Proteins
• mRNA → sequence of amino acids connected by
peptide bonds
• Amino acid sequence
– Peptide: fewer than about 30–50 amino acids
– Protein: a longer chain of amino acids
Genetic Code – Codon
• Codon: a 3-base RNA sequence
• Start codon: AUG
• Stop codons: UAA, UAG, UGA
List of Amino Acids
Letter  Amino acid     Symbol  Codons (* = any base)
A       Alanine        Ala     GC*
C       Cysteine       Cys     UGU, UGC
D       Aspartic Acid  Asp     GAU, GAC
E       Glutamic Acid  Glu     GAA, GAG
F       Phenylalanine  Phe     UUU, UUC
G       Glycine        Gly     GG*
H       Histidine      His     CAU, CAC
I       Isoleucine     Ile     AUU, AUC, AUA
K       Lysine         Lys     AAA, AAG
L       Leucine        Leu     UUA, UUG, CU*
M       Methionine     Met     AUG
N       Asparagine     Asn     AAU, AAC
P       Proline        Pro     CC*
Q       Glutamine      Gln     CAA, CAG
R       Arginine       Arg     CG*, AGA, AGG
S       Serine         Ser     UC*, AGU, AGC
T       Threonine      Thr     AC*
V       Valine         Val     GU*
W       Tryptophan     Trp     UGG
Y       Tyrosine       Tyr     UAU, UAC
20 letters; B, J, O, U, X, and Z are not used
Codon and Reading Frame
• 4 nucleotide letters → $4^3 = 64$ triplet possibilities
• Only 20 (< 64) amino acids are encoded
• Wobbling 3rd base
• Redundant → resistant to mutation
• Reading frame: linear sequence of codons in a
gene
• Open Reading Frame (ORF), definition varies:
– a reading frame that begins with a start codon and
ends at a stop codon
– a series of codons in a DNA sequence uninterrupted
by the presence of a stop codon
→ a potential protein-coding region of a DNA sequence
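The codon counts above can be checked by expanding the compact codon notation from the amino acid list. The following is a minimal sketch: `PATTERNS` is transcribed from that list, a trailing `*` is taken to mean any of the four bases, and the helper name `expand` is made up for this example.

```python
from itertools import product

BASES = "UCAG"

# Codon patterns per one-letter amino acid code; "*" stands for any base.
PATTERNS = {
    "A": "GC*", "C": "UGU UGC", "D": "GAU GAC", "E": "GAA GAG",
    "F": "UUU UUC", "G": "GG*", "H": "CAU CAC", "I": "AUU AUC AUA",
    "K": "AAA AAG", "L": "UUA UUG CU*", "M": "AUG", "N": "AAU AAC",
    "P": "CC*", "Q": "CAA CAG", "R": "CG* AGA AGG", "S": "UC* AGU AGC",
    "T": "AC*", "V": "GU*", "W": "UGG", "Y": "UAU UAC",
}


def expand(pattern):
    """Expand a pattern such as 'GC*' into its four concrete codons."""
    if pattern.endswith("*"):
        return [pattern[:2] + base for base in BASES]
    return [pattern]


# Map every concrete codon to its amino acid.
codon_to_aa = {}
for amino_acid, patterns in PATTERNS.items():
    for pattern in patterns.split():
        for codon in expand(pattern):
            codon_to_aa[codon] = amino_acid

# All 4^3 triplets; those without an amino acid are the stop codons.
all_codons = {"".join(triplet) for triplet in product(BASES, repeat=3)}
stop_codons = sorted(all_codons - set(codon_to_aa))

print(len(all_codons))    # 64
print(len(codon_to_aa))   # 61
print(stop_codons)        # ['UAA', 'UAG', 'UGA']
```

Most of the `*` patterns differ only in the third base, which is the redundancy the wobble bullet refers to.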
Open Reading Frame
• Given a nucleotide sequence
– How many reading frames? __
• __ forward and __ backward
• Example: Given a DNA sequence,
5’-ATGACCGTGGGCTCTTAA-3’
– Frame 1: ATG ACC GTG GGC TCT TAA → M T V G S *
– Frame 2: TGA CCG TGG GCT CTT AA → * P W A L
– Frame 3: GAC CGT GGG CTC TTA A → D R G L L
– Figure out the three backward reading frames
• In a random sequence, a stop codon is expected within
about 20 codons after a Met (3 of the 64 codons are stops,
so one appears every 64/3 ≈ 21 codons on average)
• Substantially longer ORFs are often genes or parts of
them
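Reading-frame translation can be sketched as below for readers who want to check the frames by machine. The sketch assumes the standard genetic code, written in the common compact form with bases ordered T, C, A, G; the names `CODON_TABLE`, `translate`, and `reading_frames` are illustrative. A trailing partial codon is ignored and `*` marks a stop.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, in TTT, TTC, TTA, TTG, TCT, ... order.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq):
    """Return the complementary strand written 5'->3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from the start of seq; '*' is a stop."""
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


def reading_frames(seq):
    """Translate every frame: offsets 0-2 on each strand ('+' and '-')."""
    frames = {}
    for label, strand in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{label}{offset + 1}"] = translate(strand[offset:])
    return frames


frames = reading_frames("ATGACCGTGGGCTCTTAA")
print(frames["+1"])  # MTVGS*
print(frames["+2"])  # *PWAL
print(frames["+3"])  # DRGLL
```

The `-1` to `-3` entries of `frames` hold the backward frames, which can be used to check a hand-worked answer.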
Translation (RNA → Protein)
Translation Animation
Courtesy of Rob Rutherford, St. Olaf University
Gene Expression
• Gene expression
– Process of using the information stored in
DNA to make an RNA molecule and then a
corresponding protein
• Cells control gene expression by
– reliably distinguishing between those parts of
an organism’s genome that correspond to the
beginnings of genes and those that do not
– determining which genes code for proteins
that are needed at any particular time.
Promoter
• The probability (P) that a string of nucleotides will occur
by chance alone, if all nucleotides are present at the same
frequency, is $P = (1/4)^n$, where n is the string’s length
• Promoter sequences
– Sequences recognized by RNA polymerases as being associated
with a gene
• Example
– Prokaryotic RNA polymerases scan along DNA looking for a
specific set of approximately 13 nucleotides marking the
beginning of genes
– 1 nucleotide that serves as a transcriptional start site
– 6 that are 10 nucleotides 5' to the start site, and
– 6 more that are 35 nucleotides 5' to the start site
– How frequently would this sequence occur by chance?
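The chance frequency asked about above follows directly from the formula with n = 1 + 6 + 6 = 13. The short calculation below is a sketch under the slide's own assumptions: all four nucleotides are equally frequent, and only the 13 specified positions matter. The function name `chance_probability` is illustrative.

```python
from fractions import Fraction


def chance_probability(n):
    """P = (1/4)^n for a specific string of n equally likely nucleotides."""
    return Fraction(1, 4) ** n


n = 1 + 6 + 6  # start site + the -10 block + the -35 block
p = chance_probability(n)
print(p)                    # 1/67108864
print(f"{float(p):.2e}")    # 1.49e-08
```

Under these assumptions, a given position matches the full 13-nucleotide pattern by chance about once in every 67 million positions.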
Gene Regulation
• Regulatory proteins
– Capable of binding to a cell’s DNA near the promoters
of genes
– Control gene expression in some circumstances but
not in others
• Positive regulation
– binding of regulatory proteins makes it easier for an
RNA polymerase to initiate transcription
• Negative regulation
– binding of the regulatory proteins prevents
transcription from occurring
Promoter and Regulatory Example
Exons and Introns
Exons and Introns Example
Protein Structure and Function
• Genes encode the recipes for proteins
• Proteins are amino acid polymers
Proteins: Molecular Machines
• Movement: proteins in your muscles, myosin and actin, allow you to move
• Digestion, catalysis (enzymes)
• Structure (collagen)
• Signaling (hormones, kinases)
• Transport (energy, oxygen)
Protein Structures
Information Flow in Nucleated Cell
Point Mutation Example:
Sickle-cell Disease
• Wild-type hemoglobin
– DNA: 3’----CTT----5’
– mRNA: 5’----GAA----3’
• Mutant hemoglobin
– DNA: 3’----CAT----5’
– mRNA: 5’----GUA----3’
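The effect of this point mutation can be traced with a short sketch that transcribes each template codon and looks up the resulting amino acid. The DNA strands shown are assumed to be template strands written 3' to 5', so pairing base by base yields the mRNA 5' to 3'. The lookup covers only the two codons involved, taken from the amino acid list (GAA is Glu; GUA matches GU*, Val). All names are illustrative.

```python
# RNA base that pairs with each DNA template base during transcription.
TEMPLATE_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

# Only the two codons needed for this example.
CODON_TO_AA = {"GAA": "Glu", "GUA": "Val"}


def transcribe_template(template_3_to_5):
    """Return the mRNA (5'->3') made from a template strand written 3'->5'."""
    return "".join(TEMPLATE_TO_RNA[base] for base in template_3_to_5)


for label, template in (("wild-type", "CTT"), ("mutant", "CAT")):
    codon = transcribe_template(template)
    print(label, template, "->", codon, "->", CODON_TO_AA[codon])
# wild-type CTT -> GAA -> Glu
# mutant CAT -> GUA -> Val
```

A single changed base in the DNA therefore replaces glutamic acid with valine at that position of the protein.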
Thinking about the Human Genome
~ $3 \times 10^9$ bp (3 billion base pairs)