
Module_5_Reference Course content


MIT School of Bioengineering Sciences and Research

(A Constituent unit of MIT ADT University)

Basic Concepts In Bioinformatics / BI301

Module05

Gene Expression and Representation of Patterns and Relationships

Course Coordinator: Dr. Priyanka Nath


Mail ID: [email protected] | [email protected]
Disclaimer:

The content delivered here should be considered of utmost importance. However, note that this material is not stand-alone material for the fulfilment of the course syllabus. The content in this presentation should only be used as an aid to learning.
Refer to the books and other resources provided for an exhaustive understanding.

MITBIO/MITADT University
Syllabus:

Module 5:
Gene Expression and Representation of Patterns and Relationships
General introduction to gene expression in prokaryotes and eukaryotes; transcription factor binding sites. SNP, EST, STS.
Introduction to regular expressions, hierarchies, and graphical models (including Markov chains and Bayes nets). Genetic variability and connections to clinical data.

Objective/Learning Outcome:

CO1 Understanding the basics of bioinformatics and its applications

CO2 Difference between databases and various biological databases

CO3 Performing data storage methods and various formats

CO4 Understanding sequence alignment and types of sequence alignment

CO5 Discuss the basics of gene expression and understand the difference between pattern finding and regular expression

CO6 Deduce the evolutionary relationships between the sequences by generating a phylogenetic tree

General Introduction

What is a Genome?

The term was coined in 1920 by Hans Winkler, Professor of Botany at the University of Hamburg, Germany, from the words gene and chromosome.

The genome of an organism is its whole hereditary information and is encoded in the DNA (or, for some viruses, RNA).

The genome of an organism is the complete DNA sequence of one set of chromosomes.

This includes both the genes and the non-coding sequences.

What is a Gene?
A gene is a portion of DNA that codes for a transcript.
[Figure: The simplest prokaryotic gene. Labels: regulatory region, promoter, Shine-Dalgarno sequence, start codon, translation stop, transcription terminator.]

Genome organization in prokaryotes

Genomic DNA (usually circular, double-stranded)

Plasmids (usually circular, double-stranded)

Molecular marker
A molecular marker is a DNA or gene sequence at a known location on a chromosome that is used as an identification tool.

Single nucleotide polymorphism (SNP)
The use of SNPs as molecular markers was proposed by Lander in 1996.
A SNP arises when a mutation alters a single nucleotide (A, T, C, or G).
A point mutation at a single nucleotide, most commonly a substitution (single-nucleotide insertions or deletions are sometimes also counted), represents a SNP.
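As a rough illustration of the definition above, the sketch below compares two already-aligned sequences of equal length and reports every position where a single nucleotide differs. The function name `find_snps` and the example sequences are assumptions made for illustration; real SNP calling works on many reads and population frequencies, which this toy example ignores.

```python
def find_snps(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference base, sample base) for each mismatch.

    Assumes the two sequences are already aligned and have the same length,
    so only single-nucleotide substitutions are detected.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, r, s)
        for i, (r, s) in enumerate(zip(reference.upper(), sample.upper()))
        if r != s
    ]


# Hypothetical example sequences (not taken from any real genome).
reference = "ATGCCGTA"
sample = "ATGTCGTA"
print(find_snps(reference, sample))  # [(4, 'C', 'T')]
```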
Origin of SNPs
DNA replication errors

Spontaneous Mutations

Mutagen Exposure

Inherited SNPs

Expressed Sequence Tags (ESTs)
❖ Gene identification is difficult because most of our genome is composed of introns interspersed with relatively few DNA coding sequences, or genes.

❖ mRNA can be used to identify the genes.

❖ However, mRNA is unstable.

❖ cDNA is a much more stable compound and, importantly, because it is generated from an mRNA, it represents only expressed sequence.
Expressed Sequence Tags (ESTs)

❖ Once cDNA representing an expressed gene has been isolated, scientists can sequence a few hundred nucleotides from either end of the molecule to create two different kinds of ESTs. Sequencing only the beginning portion of the cDNA produces what is called a 5' EST; sequencing only the ending portion produces a 3' EST.
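The idea of reading a short stretch from either end of a cDNA can be sketched in a few lines. This is a simplified illustration, not a model of a sequencing run: the function name `make_ests`, the default read length, and the example cDNA are assumptions, and the 3' EST is returned on the same strand as the cDNA, whereas a real 3' read comes from the opposite strand.

```python
def make_ests(cdna: str, read_length: int = 300) -> tuple[str, str]:
    """Return (5' EST, 3' EST) as the first and last read_length bases of a cDNA.

    Simplification: both tags are reported on the cDNA's own strand.
    """
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    cdna = cdna.upper()
    five_prime_est = cdna[:read_length]
    three_prime_est = cdna[-read_length:]
    return five_prime_est, three_prime_est


# Hypothetical short cDNA; a tiny read length keeps the output readable.
cdna = "ATGGCCAAGTTTGGGCCCTAA"
print(make_ests(cdna, read_length=6))  # ('ATGGCC', 'CCCTAA')
```

If the cDNA is shorter than the read length, both tags are simply the whole molecule.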

A Sequence Tagged Site (STS)
Just as a person driving a car may need a map to find a destination, scientists searching for genes also need genome maps to help them navigate through the billions of nucleotides that make up the human genome.

For a map to make navigational sense, it must include reliable landmarks or "markers".

The most powerful mapping technique, and one that has been used to generate many genome maps, relies on STS mapping.

A Sequence Tagged Site (STS) is a short DNA sequence that is easily recognizable and occurs only once in a genome (or chromosome).

The 3' ESTs serve as a common source of STSs due to their likelihood of being unique to a particular species, and provide the additional feature of pointing directly to an expressed gene.
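The "occurs only once" requirement can be checked with a simple count. The sketch below is a toy illustration under stated assumptions: the names `reverse_complement`, `count_occurrences`, and `is_sts` are hypothetical, the genome is a short made-up string, and only exact matches on the two strands are considered.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_occurrences(genome: str, candidate: str) -> int:
    """Count exact, possibly overlapping, occurrences of candidate in genome."""
    count = 0
    start = genome.find(candidate)
    while start != -1:
        count += 1
        start = genome.find(candidate, start + 1)
    return count


def is_sts(genome: str, candidate: str) -> bool:
    """True if candidate occurs exactly once, counting both strands."""
    if not candidate:
        raise ValueError("candidate must not be empty")
    genome, candidate = genome.upper(), candidate.upper()
    hits = count_occurrences(genome, candidate)
    rc = reverse_complement(candidate)
    if rc != candidate:  # avoid double-counting palindromic sequences
        hits += count_occurrences(genome, rc)
    return hits == 1


# Hypothetical mini "genome" for illustration only.
genome = "ATGCGTACGTTAGCATGCAA"
print(is_sts(genome, "GTTAGC"))  # True: found once
print(is_sts(genome, "ATGC"))    # False: found more than once
```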

Applications of ESTs

ESTs as Gene Discovery Resources

Because ESTs represent a copy of just the interesting part of a genome, that which is expressed, they have proven themselves again and again as powerful tools in the hunt for genes involved in hereditary diseases.

Advantages of ESTs

ESTs also have a number of practical advantages: their sequences can be generated rapidly and inexpensively; only one sequencing experiment is needed per cDNA generated; and they do not have to be checked for sequencing errors, as mistakes do not prevent identification of the gene from which the EST was derived.

Gene Predictions

➢ Because genomic information is accumulating rapidly, computational tools are required to predict gene structure.

➢ Computational tools are used for detailed annotation of genes and genomes.

➢ This can be done by predicting the open reading frames and describing the structure of introns and exons.

Gene Prediction in Prokaryotes using Open Reading Frames
❖ An open reading frame (ORF) is a reading frame that is not interrupted by stop codons.

❖ It is the region from a start codon to a stop codon.

Sanket Bapat
❖ In a non-coding region, a stop codon occurs about once in every twenty codons by chance.

❖ Therefore, a frame longer than 30 codons without interruption by a stop codon is suggestive of a gene coding region.

❖ The putative frame is further confirmed manually by the presence of other signals, such as a Shine-Dalgarno sequence and a start codon.
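The reasoning above can be sketched as a small ORF scanner. With 3 stop codons among the 64 possible codons, a random sequence is expected to hit a stop about once every 64/3 ≈ 21 codons, which is why a 30-codon threshold is used here. This is a minimal sketch, assuming ATG as the only start codon and scanning the forward strand only; the function name `find_orfs` and the example sequence are hypothetical, and Shine-Dalgarno detection is not attempted.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def find_orfs(seq: str, min_codons: int = 30) -> list[tuple[int, int, int]]:
    """Find ORFs in the three forward reading frames.

    Returns (frame, start, end) tuples with 0-based start and exclusive end;
    the span runs from the start codon through the stop codon. An ORF is kept
    only if it has at least min_codons codons before the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((frame, start, i + 3))
                start = None  # look for the next start codon
    return orfs


# Hypothetical sequence: a start codon, 35 alanine codons, and a stop codon,
# shifted into frame 2 by two leading bases.
seq = "CC" + "ATG" + "GCT" * 35 + "TAA" + "GG"
print(find_orfs(seq))  # [(2, 2, 113)]
```

In practice the threshold is often raised above 30 codons to reduce false positives, at the cost of missing short genes.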

Gene Prediction in Prokaryotes using GC Bias and TESTCODE
❖ Genes can be predicted by examining the non-randomness of the nucleotide distribution.

❖ In coding sequences, it has been observed that G and C are preferred over A and T at the third codon position.

❖ By plotting the GC composition at this position, regions with values significantly above the random level can be identified, which indicates the presence of ORFs.

❖ TESTCODE exploits the fact that the third codon nucleotides in a coding region tend to repeat themselves. By plotting the repeating patterns of the nucleotides at this position, coding and non-coding regions can be differentiated.
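The third-position GC measure described above (often called GC3) can be computed directly. The sketch below is an illustration only and is not the TESTCODE algorithm: the names `gc3_fraction` and `gc3_profile`, the window sizes, and the example sequences are assumptions. It returns the values one would plot along the sequence.

```python
def gc3_fraction(seq: str, frame: int = 0) -> float:
    """Fraction of G or C at the third position of each complete codon."""
    third_bases = seq.upper()[frame + 2::3]
    if not third_bases:
        return 0.0
    return sum(base in "GC" for base in third_bases) / len(third_bases)


def gc3_profile(seq: str, window_codons: int = 30, step_codons: int = 10,
                frame: int = 0) -> list[tuple[int, float]]:
    """Sliding-window GC3 values as (window start, GC3 fraction) pairs."""
    window = window_codons * 3
    step = step_codons * 3
    return [
        (start, gc3_fraction(seq[start:start + window]))
        for start in range(frame, len(seq) - window + 1, step)
    ]


# Hypothetical sequence: a GC3-rich stretch followed by a GC3-poor stretch.
seq = "GCC" * 30 + "GCA" * 30
print(gc3_profile(seq, window_codons=30, step_codons=30))
# [(0, 1.0), (90, 0.0)]
```

Windows whose GC3 value sits well above the level expected for random sequence would be the candidate coding regions.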

Gene Prediction in Prokaryotes using Hidden Markov Model

GeneMark is a software suite that predicts genes in prokaryotic organisms using hidden Markov models (HMMs).


Gene Expression in Eukaryotes: Central Dogma and Splicing

References:

Author | Book Name | Library
Jin Xiong | Essential Bioinformatics | Ebook / Present in Library
Rastogi, S. C. | Bioinformatics: Concepts, Skills and Applications | Present in Library
Bosu, Thukral | Bioinformatics: Databases, Tools and Algorithms | Present in Library
Neelam Yadav | Handbook to Bioinformatics | Present in Library

The content is intended for internal use only, and the ownership belongs to the coordinator. It
should not be uploaded on any platform without proper authorization.
