Bio Primer

binds to operator, blocking RNA polymerase from binding to promoter

Uploaded by

Hyorin Kim
Copyright
© © All Rights Reserved

Introduction to Bioinformatics

Molecular Biology Primer

1
Genetic Material
• DNA (deoxyribonucleic acid) is the genetic
material
• Information stored in DNA
– the basis of inheritance
– distinguishes living things from nonliving
things
• Genes
– units that govern a living thing’s
characteristics at the genetic level

2
Nucleotides
• Genes themselves contain their information as a specific
sequence of nucleotides found in DNA molecules
• Only four different bases in DNA molecules
– Guanine (G)
– Adenine (A)
– Thymine (T)
– Cytosine (C)
[Figure: a nucleotide, drawn as a phosphate (P), a sugar, and a base]
• Each base is attached to a deoxyribose sugar, which in turn
carries a phosphate group; together they form a nucleotide.
• The only thing that makes one nucleotide different from
another is which nitrogenous base it contains

3
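Purine: adenine (A) and guanine (G), two-ring bases

Pyrimidine: cytosine (C) and thymine (T), one-ring bases (uracil, U, in RNA)

Nucleoside: a base attached to a sugar, without the phosphate group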

4
Nucleotides
• Complicated genes can be many
thousands of nucleotides long
• All of an organism’s genetic instructions,
its genome, can be maintained in millions
or even billions of nucleotides

5
Orientation
• Strings of nucleotides can be attached to
each other to make long polynucleotide
chains
• 5’ (5 prime) end
– The end of a string of nucleotides with a 5'
carbon not attached to another nucleotide
• 3’ (3 prime) end
– The other end of the molecule with an
unattached 3' carbon

6
[Figure: sugar ring with its carbons numbered 1’ through 5’]

7
Base Pairing
• Structure of DNA
– Double helix
– Seminal paper by Watson and Crick in 1953
– Rosalind Franklin’s contribution
• Information content on one of those strands is
essentially redundant with the information on the
other
– Not exactly the same: it is complementary
• Base pair
– G paired with C (G ≡ C, three hydrogen bonds)
– A paired with T (A = T, two hydrogen bonds)
8
9
Base Pairing
• Reverse complements
– 5' end of one strand corresponding to the 3' end of its
complementary strand and vice versa
• Example
– one strand: 5'-GTATCC-3'
– the other strand: 3'-CATAGG-5' = 5'-GGATAC-3'
• Upstream: Sequence features that are 5' to a
particular reference point
• Downstream: Sequence features that are 3' to a
particular reference point
[Figure: a strand drawn 5' → 3', with upstream toward the 5' end and downstream toward the 3' end]
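The reverse-complement example above can be checked with a few lines of Python. This is a minimal sketch; the function names `complement` and `reverse_complement` are illustrative and do not come from the slides.

```python
# Watson-Crick pairing used on the slide: A pairs with T, G pairs with C.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def complement(strand: str) -> str:
    """Base-by-base complement, kept in the same left-to-right order."""
    return strand.upper().translate(COMPLEMENT)

def reverse_complement(strand: str) -> str:
    """Complementary strand rewritten so that it reads 5' to 3'."""
    return complement(strand)[::-1]

print(complement("GTATCC"))          # CATAGG (the other strand, read 3' to 5')
print(reverse_complement("GTATCC"))  # GGATAC (the same strand, read 5' to 3')
```

Reversing the complement is what turns the 3'-to-5' reading of the partner strand into the conventional 5'-to-3' notation.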
10
DNA Structure

11
DNA Structure

12
Chromosome
• Threadlike "packages" of genes and other
DNA in the nucleus of a cell

13
14
Chromosome
• Different kinds of organisms have different
numbers of chromosomes
• Humans
– 23 pairs
– 46 in all

15
Central Dogma of Molecular
Biology
• DNA: information storage
• Protein: functional unit, such as an enzyme
• Gene: instructions needed to make protein
• Central dogma

16
Central Dogma of Molecular
Biology
• Central dogma
– Replication: DNA → DNA (DNA polymerase)
– Reverse transcription: RNA → DNA (reverse transcriptase)
• DNA obtained from reverse transcription is
called complementary DNA (cDNA)
→ Difference between DNA and cDNA will be
discussed later
17
Central Dogma of Molecular
Biology
• RNA (ribonucleic acid)
– Single-stranded polynucleotide
– Bases
• A
• G
• C
• U (uracil), instead of T
• Transcription (simplified …)
– A → A, G → G, C → C, T → U
[Figure: DNA nucleotide (H on the sugar) compared with RNA nucleotide (OH on the sugar)]
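The simplified transcription rule above (copy each base, writing U wherever the DNA has T) can be sketched as follows. The function name `transcribe` and the input sequence are illustrative assumptions. In a cell the polymerase actually reads the template strand, so this shortcut applies to the coding strand only.

```python
def transcribe(coding_strand: str) -> str:
    """Simplified transcription: A->A, G->G, C->C, T->U on the coding strand."""
    return coding_strand.upper().replace("T", "U")

print(transcribe("ATGACCGTG"))  # AUGACCGUG
```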

18
19
20
DNA Replication (DNA → DNA)

21
DNA Replication (DNA → DNA)

22
DNA Replication Animation

23
Courtesy of Rob Rutherford, St. Olaf University
Transcription (DNA → RNA)
• Messenger RNA (mRNA)
– carries information to be
translated
• Ribosomal RNA (rRNA)
– the working “spine” of
the ribosome
• Transfer RNA (tRNA)
– the “decoder keys” that
will translate nucleic
acids to amino acids

24
Transcription Animation

25
Courtesy of Rob Rutherford, St. Olaf University
Peptides and Proteins
• mRNA → sequence of amino
acids connected by peptide
bonds
• Amino acid sequence
– Peptide: fewer than about 30 to 50 amino acids
– Protein: a longer chain of amino acids

26
27
28
Genetic Code – Codon

[Figure: codon table]
• Codon: 3-base RNA sequence
• Start codon: AUG
• Stop codons: UAA, UAG, UGA
29
List of Amino Acids
Letter  Amino acid      Symbol  Codon
A       Alanine         Ala     GC*
C       Cysteine        Cys     UGU, UGC
D       Aspartic Acid   Asp     GAU, GAC
E       Glutamic Acid   Glu     GAA, GAG
F       Phenylalanine   Phe     UUU, UUC
G       Glycine         Gly     GG*
H       Histidine       His     CAU, CAC
I       Isoleucine      Ile     AUU, AUC, AUA
K       Lysine          Lys     AAA, AAG
L       Leucine         Leu     UUA, UUG, CU*
30
List of Amino Acids
Letter  Amino acid      Symbol  Codon
M       Methionine      Met     AUG
N       Asparagine      Asn     AAU, AAC
P       Proline         Pro     CC*
Q       Glutamine       Gln     CAA, CAG
R       Arginine        Arg     CG*, AGA, AGG
S       Serine          Ser     UC*, AGU, AGC
T       Threonine       Thr     AC*
V       Valine          Val     GU*
W       Tryptophan      Trp     UGG
Y       Tyrosine        Tyr     UAU, UAC

20 letters, no B J O U X Z (* = any of the four bases)
31
Codon and Reading Frame
• 4 nucleotide letters → 4^3 = 64 triplet possibilities
• 20 (< 64) standard amino acids
• Wobbling 3rd base
• Redundant → resistant to mutation
• Reading frame: linear sequence of codons in a
gene
• Open Reading Frame (ORF), definition varies:
– a reading frame that begins with a start codon and
ends at a stop codon
– a series of codons in a DNA sequence uninterrupted
by the presence of a stop codon
→ a potential protein-coding region of a DNA sequence
32
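The codon counts on the preceding slides can be cross-checked by expanding the wildcard patterns from the amino acid tables. This is a sketch under stated assumptions: the dictionary layout and the helper name `expand` are illustrative, and `*` is taken to mean any of the four bases.

```python
from itertools import product

BASES = "UCAG"

# Codon patterns copied from the amino acid tables; "*" means any base.
PATTERNS = {
    "A": ["GC*"], "C": ["UGU", "UGC"], "D": ["GAU", "GAC"],
    "E": ["GAA", "GAG"], "F": ["UUU", "UUC"], "G": ["GG*"],
    "H": ["CAU", "CAC"], "I": ["AUU", "AUC", "AUA"], "K": ["AAA", "AAG"],
    "L": ["UUA", "UUG", "CU*"], "M": ["AUG"], "N": ["AAU", "AAC"],
    "P": ["CC*"], "Q": ["CAA", "CAG"], "R": ["CG*", "AGA", "AGG"],
    "S": ["UC*", "AGU", "AGC"], "T": ["AC*"], "V": ["GU*"],
    "W": ["UGG"], "Y": ["UAU", "UAC"],
}

def expand(pattern: str) -> list[str]:
    """Turn a pattern such as 'GC*' into its four concrete codons."""
    if pattern.endswith("*"):
        return [pattern[:2] + base for base in BASES]
    return [pattern]

# Map every concrete codon to its one-letter amino acid.
CODON_TABLE = {
    codon: aa
    for aa, patterns in PATTERNS.items()
    for pattern in patterns
    for codon in expand(pattern)
}

all_codons = {"".join(triplet) for triplet in product(BASES, repeat=3)}
stop_codons = sorted(all_codons - set(CODON_TABLE))

print(len(all_codons))   # 64 possible triplets
print(len(CODON_TABLE))  # 61 codons assigned to the 20 amino acids
print(stop_codons)       # ['UAA', 'UAG', 'UGA']
```

The three triplets left unassigned by the tables are the stop codons, which accounts for all 4^3 = 64 possibilities.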
Open Reading Frame
• Given a nucleotide sequence
– How many reading frames? __
• __ forward and __ backward
• Example: Given a DNA sequence,
5’-ATGACCGTGGGCTCTTAA-3’
– ATG ACC GTG GGC TCT TAA → M T V G S *
– TGA CCG TGG GCT CTT AA → * P W A L
– GAC CGT GGG CTC TTA A → D R G L L
– Figure out the three backward reading frames
• In random sequence, a stop codon will follow a Met in
~20 AAs (3 of the 64 codons are stops, so one is expected
about every 64/3 ≈ 21 codons)
• Substantially longer ORFs are often genes or parts of
them
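Reading-frame translation of the example sequence can be sketched in Python as below. The names `translate`, `reverse_complement`, and `six_frames` are illustrative assumptions. The codon table is the standard genetic code written in the DNA alphabet, and leftover bases at the end of a frame are ignored.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; "*" marks a stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna: str) -> str:
    """Translate complete codons from the first base; drop a trailing partial codon."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)

def reverse_complement(dna: str) -> str:
    """Complementary strand, rewritten 5' to 3'."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def six_frames(dna: str) -> dict[str, str]:
    """Three frames on the given strand (+1..+3) and three on its reverse complement (-1..-3)."""
    frames = {}
    for sign, strand in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{sign}{offset + 1}"] = translate(strand[offset:])
    return frames

frames = six_frames("ATGACCGTGGGCTCTTAA")
print(frames["+1"])  # MTVGS*
print(frames["+2"])  # *PWAL
print(frames["+3"])  # DRGLL
```

The forward frames reproduce the translations on the slide; the entries keyed `-1` to `-3` hold the backward frames.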
33
Translation (RNA → Protein)

34
Translation Animation

35
Courtesy of Rob Rutherford, St. Olaf University
Gene Expression
• Gene expression
– Process of using the information stored in
DNA to make an RNA molecule and then a
corresponding protein
• Cells control gene expression by
– reliably distinguishing between those parts of
an organism’s genome that correspond to the
beginnings of genes and those that do not
– determining which genes code for proteins
that are needed at any particular time.
36
Promoter
• The probability (P) that a given string of nucleotides will occur
by chance alone, if all nucleotides are present at the same
frequency, is P = (1/4)^n, where n is the string’s length
• Promoter sequences
– Sequences recognized by RNA polymerases as being associated
with a gene
• Example
– Prokaryotic RNA polymerases scan along DNA looking for a
specific set of approximately 13 nucleotides marking the
beginning of genes
– 1 nucleotide that serves as a transcriptional start site
– 6 that are 10 nucleotides 5' to the start site, and
– 6 more that are 35 nucleotides 5' to the start site
– What is the frequency for the sequence to occur?
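The formula P = (1/4)^n can be evaluated for the 13 specified positions in the example (1 + 6 + 6). This is a minimal sketch that assumes the four nucleotides are equally likely and independent, as the slide states; the function name `chance_probability` is illustrative.

```python
def chance_probability(n: int) -> float:
    """Probability that one specific n-nucleotide string occurs by chance."""
    return 0.25 ** n

# Positions specified in the example: start site + the -10 block + the -35 block.
n = 1 + 6 + 6
p = chance_probability(n)

print(n)       # 13
print(p)       # about 1.49e-08
print(4 ** n)  # 67108864, i.e. roughly one chance match per 67 million positions
```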
37
Gene Regulation
• Regulatory proteins
– Capable of binding to a cell’s DNA near the promoter
of the genes
– Control gene expression in some circumstances but
not in others
• Positive regulation
– binding of regulatory proteins makes it easier for an
RNA polymerase to initiate transcription
• Negative regulation
– binding of the regulatory proteins prevents
transcription from occurring

38
Promoter and Regulatory Example

• Low tryptophan concentration
→ RNA polymerase binds to promoter
→ genes transcribed
• High tryptophan concentration
→ repressor protein becomes active and binds to operator
→ blocks the binding of RNA polymerase to the promoter
• Tryptophan concentration drops
→ repressor releases its tryptophan and is released from DNA
→ polymerase again transcribes genes
39
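The negative-regulation logic of this example can be written as a toy model. It is a deliberately simplified sketch that treats tryptophan level as a yes/no switch; the function name and variable names are illustrative and not taken from the slides.

```python
def genes_transcribed(tryptophan_high: bool) -> bool:
    """Toy model of the repressor example: returns True when the genes are transcribed."""
    # The repressor is active, and sits on the operator, only with tryptophan bound.
    repressor_on_operator = tryptophan_high
    # A repressor on the operator blocks RNA polymerase from the promoter.
    polymerase_binds_promoter = not repressor_on_operator
    return polymerase_binds_promoter

print(genes_transcribed(tryptophan_high=False))  # True
print(genes_transcribed(tryptophan_high=True))   # False
```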
Gene Structure

40
Exons and Introns

41
Exons and Introns Example

42
Protein Structure and Function
• Genes encode the recipes for proteins

43
Protein Structure and Function
• Proteins are amino acid polymers

44
Proteins: Molecular Machines
• Proteins in your muscles allow you to move: myosin and actin

45
Proteins: Molecular Machines
• Digestion, catalysis (enzymes)
• Structure (collagen)

46
Proteins: Molecular Machines
• Signaling (hormones, kinases)
• Transport (energy, oxygen)

47
Protein
Structures

48
Information Flow in Nucleated Cell

49
Point Mutation Example:
Sickle-cell Disease
           Wild-type hemoglobin    Mutant hemoglobin
DNA        3’----CTT----5’         3’----CAT----5’
mRNA       5’----GAA----3’         5’----GUA----3’
Protein    ------[Glu]------       ------[Val]------
           Normal hemoglobin       Mutant hemoglobin
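The point mutation can be traced from template DNA to amino acid in a short sketch. The assumptions are that the DNA shown is the template strand written 3' to 5', so the mRNA is its base-by-base complement read 5' to 3', and that only the two codons needed here are listed; the names `mrna_from_template` and `CODONS` are illustrative.

```python
# DNA template base -> RNA base (A pairs with U in RNA).
TEMPLATE_TO_RNA = str.maketrans("ACGT", "UGCA")

# Only the two codons needed for this example.
CODONS = {"GAA": "Glu", "GUA": "Val"}

def mrna_from_template(template_3_to_5: str) -> str:
    """mRNA (5' to 3') made from a template strand written 3' to 5'."""
    return template_3_to_5.upper().translate(TEMPLATE_TO_RNA)

for label, template in (("wild-type", "CTT"), ("mutant", "CAT")):
    codon = mrna_from_template(template)
    print(label, codon, CODONS[codon])
# wild-type GAA Glu
# mutant GUA Val
```

A single base change in the DNA (T to A in the middle position) changes one codon and therefore one amino acid.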
50
51
image credit: U.S. Department of Energy Human Genome Program, http://www.ornl.gov/hgmis.
Thinking about the Human
Genome
• 50% is high-copy-number repeats
• About 10% is transcribed (made into RNA)
• Only 1.5% actually codes for protein
• The other 98.5% is noncoding, often called "junk DNA"

52
Thinking about the Human
Genome
~3 × 10^9 bp
(3 billion base pairs)

If each base were one mm long…

• The genome would stretch about 2000 miles, across the center of Africa
• The average gene would be about 30 meters long
• Genes would occur about every 270 meters
• Once spliced, the message would only be ~1 meter long
53
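The scale analogy is plain unit conversion, sketched below. The constants are taken from the slide except for the meters-per-mile factor, and the variable names are illustrative.

```python
GENOME_BP = 3_000_000_000   # ~3 x 10^9 base pairs
MM_PER_BASE = 1             # the analogy: one base = one millimeter
METERS_PER_MILE = 1609.344

genome_m = GENOME_BP * MM_PER_BASE / 1000
genome_miles = genome_m / METERS_PER_MILE

# Converting the analogy's lengths back into bases (1 m = 1000 bases).
average_gene_bp = 30 * 1000
gene_spacing_bp = 270 * 1000
spliced_message_bp = 1 * 1000

print(round(genome_m / 1000))  # 3000 (kilometers)
print(round(genome_miles))     # 1864 (miles, which the slide rounds to 2000)
print(average_gene_bp, gene_spacing_bp, spliced_message_bp)  # 30000 270000 1000
```

On this scale an average gene spans about 30,000 bases, while its spliced message is only about 1,000 bases long.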
