Module_5_Reference Course content
Module05
MITBIO/MITADT University
Syllabus:
Module 5:
Gene Expression and Representation of patterns and
relationships
General introduction to gene expression in prokaryotes and
eukaryotes, transcription factor binding sites. SNP, EST, STS.
Introduction to Regular Expressions, Hierarchies, and Graphical models
(including Markov chains and Bayes nets). Genetic variability and
connections to clinical data.
Objective/Learning Outcome:
CO5 Discuss the basics of gene expression and understand the difference between pattern finding
and regular expressions.
CO6 Deduce the evolutionary relationships between the sequences by generating a phylogenetic tree.
General Introduction
What is a Genome?
Priyanka Nath
What is a Gene?
A gene is a portion of DNA that codes for a transcript
Start codon
Genome organization in prokaryotes
Plasmids (usually circular, double-stranded)
Genomic DNA (usually circular, double-stranded)
Molecular marker
A molecular marker is a DNA or gene sequence at a
known location on a chromosome that is used as an
identification tool.
Single nucleotide polymorphism (SNP)
The use of SNPs as genetic markers was proposed by Lander in 1996.
An SNP arises when a mutation alters a single nucleotide (A, T,
C, or G).
A point mutation such as a substitution, insertion, or deletion of a single
nucleotide represents an SNP.
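To make the idea of a single-nucleotide difference concrete, the following is a minimal sketch that compares a sample sequence against a reference and reports the positions where one base differs. It assumes the two sequences are already aligned and of equal length; the function name `find_snps` and the example sequences are hypothetical and only serve as illustration. Only substitutions are detected directly; a gap character in an aligned sequence would simply be reported as a mismatch.

```python
def find_snps(reference, sample):
    """Return (position, ref_base, sample_base) for each single-base difference.

    Assumption: both sequences are pre-aligned and have the same length.
    Positions are 1-based, as is usual when reporting variants.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, r, s)
        for i, (r, s) in enumerate(zip(reference.upper(), sample.upper()))
        if r != s
    ]


# Hypothetical example sequences differing at one position (G -> A).
reference = "ATGCGTACGTTAG"
sample = "ATGCGTACATTAG"
print(find_snps(reference, sample))
```

In practice, SNPs are called from many aligned reads with quality filtering; this sketch only shows the underlying comparison.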
Origin of SNPs
DNA replication errors
Spontaneous Mutations
Mutagen Exposure
Inherited SNPs
Expressed Sequence Tags (ESTs)
❖ Gene identification is difficult because
most of our genome is composed
of introns interspersed with
relatively few DNA coding
sequences, or genes.
A Sequence Tagged Site (STS)
Just as a person driving a car may need a map to find a destination,
scientists searching for genes also need genome maps to help them to
navigate through the billions of nucleotides that make up the human
genome.
The most powerful mapping technique, and one that has been used to
generate many genome maps, relies on STS mapping.
A Sequence Tagged Site (STS) is a short DNA sequence that is easily recognizable
and occurs only once in a genome (or chromosome).
The 3' ESTs serve as a common source of STSs due to their likelihood of being unique to a
particular species, and provide the additional feature of pointing directly to an expressed gene.
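The defining property of an STS, that it occurs only once in a genome, can be checked computationally. The sketch below counts exact occurrences of a candidate site on both strands of a sequence. The function names and the toy genome are hypothetical, and exact string matching is an assumption made for simplicity; real STS validation relies on PCR primers and tolerates no such shortcut.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (unknown bases become N)."""
    complement = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(complement.get(b, "N") for b in reversed(seq.upper()))


def count_occurrences(genome, site):
    """Count occurrences of site in genome on one strand, overlaps included."""
    genome, site = genome.upper(), site.upper()
    count, start = 0, genome.find(site)
    while start != -1:
        count += 1
        start = genome.find(site, start + 1)
    return count


def is_sts_candidate(genome, site):
    """True if site occurs exactly once when both strands are considered."""
    forward = count_occurrences(genome, site)
    rc = reverse_complement(site)
    # A palindromic site equals its own reverse complement; do not count it twice.
    reverse = 0 if rc == site.upper() else count_occurrences(genome, rc)
    return forward + reverse == 1


# Hypothetical toy genome: GATTACA occurs once, GCCGT occurs twice.
genome = "ATGCCGTAGGATTACAGGCCGT"
print(is_sts_candidate(genome, "GATTACA"))
print(is_sts_candidate(genome, "GCCGT"))
```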
Applications of STSs
Advantages of STSs
ESTs also have a number of practical advantages: their sequences can be generated rapidly and
inexpensively; only one sequencing experiment is needed per cDNA generated; and they do not have to be
checked for sequencing errors, as mistakes do not prevent identification of the gene from which the EST was
derived.
Gene Predictions
➢ Gene prediction can be done by predicting the open reading frames and describing the
structure of introns and exons.
Gene Prediction in Prokaryotes using Open
Reading Frames
❖ An open reading frame (ORF) is a reading frame that does not contain a stop codon
Sanket Bapat
❖ A stop codon occurs by chance about once in every twenty codons in a non-coding region
❖ Therefore a frame longer than 30 codons without interruption by a stop codon is suggestive of a
gene coding region
❖ The putative frame is further confirmed manually by the presence of other signals such as the
Shine-Dalgarno sequence and a start codon
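The ORF criterion described above can be sketched as a simple scan: read each of the three forward frames codon by codon, open a candidate at the first ATG, and close it at the next in-frame stop codon, keeping it only if it is long enough. This is a minimal illustration, not a full gene finder. The function name `find_orfs`, the choice of ATG as the only start codon, and the restriction to the forward strand are assumptions, and the Shine-Dalgarno check is left out.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=30):
    """Return (start, end, frame) for ORFs on the forward strand.

    start and end are 0-based with end exclusive; the span runs from the
    ATG through the stop codon. min_codons counts codons before the stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens a candidate
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next candidate in this frame
    return orfs


# Hypothetical test sequence: ATG, 35 alanine codons, then a stop codon.
dna = "CC" + "ATG" + "GCT" * 35 + "TAA" + "GG"
print(find_orfs(dna))
```

To cover both strands, the same function could be run on the reverse complement of the sequence. The 30-codon default follows the threshold given in the bullets above.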
Gene Prediction in Prokaryotes using GC Bias
and TESTCODE
❖ Genes can be predicted by examining the non-randomness of the nucleotide distribution
❖ In coding sequences, it has been observed that G and C are preferred over A and T at the
third codon position
❖ By plotting the GC composition at this position, regions with values significantly above the random level
can be identified, indicating the presence of ORFs
❖ TESTCODE exploits the fact that the third codon nucleotides in a coding region tend to repeat
themselves. By plotting the repeating patterns of the nucleotides at this position, coding and
non-coding regions can be differentiated
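The GC bias at the third codon position can be computed directly. The following minimal sketch calculates the fraction of G or C at third positions (often called GC3) and slides a window along the sequence so the values can be plotted. The function names, window size, and step are hypothetical choices; the window and step are required to be multiples of 3 so that every window stays in the same reading frame. This does not implement TESTCODE itself, only the simpler GC3 profile.

```python
def gc3_fraction(seq, frame=0):
    """Fraction of G or C at the third codon positions of seq read in frame."""
    thirds = seq.upper()[frame + 2::3]  # every third base, starting at codon position 3
    if not thirds:
        return 0.0
    return sum(base in "GC" for base in thirds) / len(thirds)


def gc3_profile(seq, window=120, step=30, frame=0):
    """Return (window_start, GC3) pairs for windows slid along seq."""
    if window % 3 or step % 3:
        raise ValueError("window and step must be multiples of 3 to keep the frame")
    return [
        (start, gc3_fraction(seq[start:start + window], frame))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Hypothetical toy sequence: G at every third position, then T at every third position.
seq = "AAG" * 4 + "AAT" * 4
print(gc3_profile(seq, window=12, step=12))
```

Windows whose GC3 value lies well above the genome-wide background would be the candidate coding regions; what counts as significantly above random depends on the organism and is not fixed here.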
Gene Prediction in Prokaryotes using Hidden
Markov Model
Disclaimer:
The content is intended for internal use only, and the ownership belongs to the coordinator. It
should not be uploaded on any platform without proper authorization.