International Conference on Advances in Electrical, Electronics, Information, Communication and Bio-Informatics (AEEICB16)

A New Statistical Distance Useful for Analyzing Exons and Introns of Eukaryotic Genes

Uddalak Mitra and Balaram Bhattacharyya
Professor, Research Scholar, Department of Computer and System Sciences, Visva-Bharati University, Santiniketan - 731205, India.

Abstract: Segregation of exons and introns from a gene sequence is an important issue in simulating transcription. Although various computational efforts based on probabilistic approaches have been made to discriminate the regions, lack of accuracy remains the principal obstacle. We propose a statistical distance measure using a logarithm transformation of the dot product of two probability vectors. Experimental results show that the distance measure segregates genes with better certainty than other existing methods, even for genes having very few exons and introns.

Index Terms: Computational Biology, Gene prediction, Information Theory, Mathematical transformation, Statistical distance measure.

I. INTRODUCTION

Gene prediction, or gene finding, is the process of identifying specific regions of genomic DNA [1]. It includes the recognition of protein-coding genes and non-coding RNA and the segregation of exons and introns [2] in a gene, but may also include finding other functional elements such as regulatory regions and untranslated regions (UTR) [3]. In a eukaryotic cell, genomic DNA is the principal DNA. After a genome has been sequenced, gene finding is the first and most important step in understanding the structure of the genome. Identifying the correct genes and determining their functions still demands in vivo experimentation, although bioinformatics research is making it increasingly possible to isolate gene sequences and predict the functions of genes from the sequence alone.

Because gene sequences, being distributions of nucleotides, can be seen as a repository of the biological information necessary for activity in an organism, statistical and information theoretic approaches are likely to be relevant for their analysis [4].

It has been shown in [5], on the basis of Shannon entropy, that gene sequences have more randomness than human language and computer programming languages. Another important feature, long range correlation between nucleotide sequences, has been investigated in [6]. A recognized statistical feature, the non-uniform codon usage of coding regions, is examined in [7]. A segmentation method for exons and introns based on entropy is described in [8]. The authors of [9] have used an alternative definition of information that is found to be useful for analyzing coding and noncoding regions of genes. Another information theoretic approach, average mutual information [10], has been used to recognize the protein-coding regions of genes for which training sets are not available.

Features of genetic segments are diverse and are too difficult to match perfectly with the existing approaches. An appropriate statistical measure is thus required to capture the features of gene segments that distinguish them. Complex patterns of exons and introns in eukaryotic genes make it more difficult to predict regions of interest than in prokaryotic genes. Eukaryotic genes show great diversity in size and organization and have no typical structure. However, they contain several conserved features. Such conserved features can be used as discriminatory statistics between exons and introns of eukaryotic genes. This is the key concept for segregation of the regions.

We formulate a new statistical distance measure for determining the dependency between two statistical objects. A probability mass function (PMF) is used to represent the patterns of the objects. We apply the measure to extract the dependencies of a nucleotide base on other bases at k locations downstream and find that the measured dependencies are distinctly different for exon regions than for introns, thereby capturing the discriminatory feature. Although downstream is the natural direction in a nucleotide sequence, the measure is equally applicable upstream.

II. DERIVATION OF THE DISTANCE MEASURE

Measurement of the dependency between two statistical objects is important, with wide applicability in anthropology, biology, physics, chemistry, computer science, ecology, physiology and other fields. The objects may be two random variables, two sample spaces or two population spaces [11-13]. A distance or dependency measure is a quantitative indication of how far apart two statistical objects are. Statistical distances that satisfy non-negativity, symmetry and the triangle inequality are called metrics. Otherwise the measure is a statistical divergence.

978-1-4673-9745-2 ©2016 IEEE

A PMF can be regarded as a vector whose elements are points in Euclidean space and sum to 1, termed a probability vector. A probability vector P can be written as $P = [p_1, p_2, p_3, \dots, p_n]$ in standard notation, with the assumption $\sum_{i=1}^{n} p_i = 1$. Individual components lie between 0 and 1, $0 \le p_i \le 1$. The algebraic form of the dot product of $P = [p_1, p_2, p_3, \dots, p_n]$ and $Q = [q_1, q_2, q_3, \dots, q_n]$ can be defined as

$$P \cdot Q = \sum_{i=1}^{n} p_i q_i = p_1 q_1 + p_2 q_2 + \dots + p_n q_n \qquad (1)$$

The corresponding geometric definition of the dot product is given by

$$P \cdot Q = \|P\| \, \|Q\| \cos\Theta \qquad (2)$$

where $\Theta$ is the direction angle between the probability vectors and $\|P\|$ and $\|Q\|$ are the magnitudes of the vectors P and Q respectively. When the direction angle is 90 degrees the probability vectors are orthogonal, and when it is 0 degrees they are collinear. With P and Q being probability vectors, $P \cdot Q$ can be interpreted as a statistical distance between the probability distributions. But the simple dot product is not powerful enough to capture the discriminatory features of sequences containing complex inherent statistical patterns.

Statistical data transformation is a procedure that mathematically modifies the values of a variable using transformation operators. Statistical operators are based on the assumption that the variables are normally distributed. Minor violations of this assumption increase the chances of committing Type I or Type II errors. True normality is exceedingly rare, and data transformations are therefore needed to approach normality and so reduce errors. The task is thus to design an appropriate operator for data transformation. Among the various transformation operations, square root and logarithm are widely used. We observe that the square root of a value above 1.00 is smaller than the value, while the square root of a value between 0.00 and 0.99 is larger. The logarithm transformation, on the contrary, maintains the functional pattern of the variables to which it is applied [14]. A logarithm transformation of the dot product of two probability vectors produces the distance measure

$$D(P \| Q) = \sum_{i=1}^{n} \log\big(1 - \log(p_i q_i)\big) = \log(1 - \log(p_1 q_1)) + \log(1 - \log(p_2 q_2)) + \dots + \log(1 - \log(p_n q_n)) \qquad (3)$$

The base of the logarithm can be any standard base such as e, 2 or 10. As data transformation techniques do not change the intuitive meaning of the operands and the product values, the quantity $D(P \| Q)$ can be interpreted as the logarithm transformed dot product of the probability vectors P and Q.

A gene can be viewed as a sequence of nucleotide bases, and the patterns of the bases may be considered a manifestation of the joint PMF $f_k(x, y) = P_k(X=x, Y=y)$ and the product of the marginal PMFs, $f(x) f(y) = P(X=x) P(Y=y)$, of two random variables X and Y, where the subscript k on the joint PMF indicates that the nucleotides x and y occur k locations apart. Let $N_k(x, y)$ be the number of times two nucleotides k bases apart take the values x and y, where x and y can be A, C, G or T. The joint probability $P_k(X=x, Y=y)$ can be estimated by

$$P_k(X=x, Y=y) = \frac{N_k(x, y)}{\sum_{x \in \{A,C,G,T\}} \sum_{y \in \{A,C,G,T\}} N_k(x, y)} \qquad (4)$$

The marginal probability $P(X=x)$ can be estimated by dividing the total number of times nucleotide x occurs by the total number of bases in the sequence. The difference between the joint PMF and the product of the marginal PMFs determines the mutual dependency of the two random variables. The task thus boils down to measuring the difference between the joint PMF and the product of the two marginal PMFs. Application of Eq. (3) yields the measurement

$$D\big(P_k(X=x, Y=y) \,\|\, P(X=x)\,P(Y=y)\big) = \sum_{i=1}^{16} \log\Big(1 - \log\big(P_k(X=x, Y=y)\,P(X=x)\,P(Y=y)\big)\Big) \qquad (5)$$

where the sum runs over the 16 ordered nucleotide pairs (x, y).

III. PROPERTIES OF THE MEASURE

The measurement satisfies the following properties of a statistical distance measure.

A. Non-negativity: As $0 \le p_i \le 1$ and $0 \le q_i \le 1$, their product must satisfy $0 \le p_i q_i \le 1$. The logarithm of such a quantity is never positive, hence the term $1 - \log(p_i q_i)$ is at least 1 and the term $\log(1 - \log(p_i q_i))$ is never negative. A sum of non-negative terms is non-negative. Thus our statistical measurement is always non-negative.

B. Semi-definiteness of the measurement: When all the Pi's go to zero and the Qi's are non-negative real numbers, all the terms log(1-(PiQi)) become zero, and vice versa. Hence we can take the minimum value of the measurement to be zero. As the distance measure is a statistical distance that measures the dependence between two statistical objects, we can state that the maximum distance occurs in the case of statistical independence. Hence the measurement is semi-definite.

C. Symmetry: As multiplication of two real numbers is commutative, the logarithm of the product is symmetric, so

$$\sum_{i=1}^{n} \log\big(1 - \log(p_i q_i)\big) = \sum_{i=1}^{n} \log\big(1 - \log(q_i p_i)\big) \qquad (6)$$
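The measure of Eq. (3) and its application to a nucleotide sequence in Eqs. (4) and (5) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `log_dot_distance` and `lag_dependency` are hypothetical, the default base 2 is an assumption (the paper allows any standard base), and skipping terms whose product is zero is also an assumption, since the paper does not say how an undefined logarithm is handled.

```python
import math
from collections import Counter

BASES = "ACGT"


def log_dot_distance(p, q, base=2.0):
    """Eq. (3): sum over i of log(1 - log(p_i * q_i))."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        prod = pi * qi
        if prod <= 0.0:
            # Assumption: a zero product has no logarithm, so the term is skipped.
            continue
        total += math.log(1.0 - math.log(prod, base), base)
    return total


def lag_dependency(seq, k, base=2.0):
    """Eqs. (4)-(5): distance between the lag-k joint PMF and the product of marginals."""
    seq = seq.upper()
    # N_k(x, y): counts of nucleotide pairs that sit k positions apart.
    pairs = Counter(
        (seq[i], seq[i + k])
        for i in range(len(seq) - k)
        if seq[i] in BASES and seq[i + k] in BASES
    )
    n_pairs = sum(pairs.values())
    # Marginals: occurrences of each base divided by the number of bases.
    singles = Counter(b for b in seq if b in BASES)
    n_bases = sum(singles.values())
    if n_pairs == 0 or n_bases == 0:
        raise ValueError("sequence is too short for this lag or has no A/C/G/T")
    joint = [pairs[(x, y)] / n_pairs for x in BASES for y in BASES]
    indep = [
        (singles[x] / n_bases) * (singles[y] / n_bases)
        for x in BASES
        for y in BASES
    ]
    return log_dot_distance(joint, indep, base)
```

For a perfectly uniform composition every one of the 16 terms equals log2(9), so the base-2 value is 16 · log2(9) ≈ 50.72. That is on the same scale as the values the paper reports for the proposed measure, which suggests, though the paper does not state it, that base 2 was used.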

IV. RESULT AND DISCUSSION

The proposed distance measure is applied as a statistical tool to analyze the coding and non-coding regions of genes. Human chromosome data are taken from http://www.ensembl.org/Homo_sapiens for study. The gene HGNC:462 in human chromosome Y codes for the protein amelogenin, involved in amelogenesis, the development of enamel. Its first transcript, ENST00000215479, has 6 exons (coding regions) and 5 introns (non-coding regions). We have concatenated the 6 exon regions to form the contiguous protein coding sequence. The same is done for the 5 introns. The dependency measure is then calculated for each of the combined coding and non-coding sections. The results for values of k between 1 and 20 are presented in Figure 1, which shows a clear distinction between introns and exons.

Fig. 1. Patterns of statistical distance (y-axis) against nucleotide distance (x-axis) of the gene HGNC:462

To study the efficiency of the proposed distance measure in segregating introns and exons, we further consider all genes in chromosome Y. Thus for each gene we have 20 dependency values for its concatenated coding regions and the same for its non-coding regions. The average dependency value is computed for the concatenated exon region of each gene to construct a probability density function. The corresponding computation is done for the concatenated introns. The result is interesting: there is a clear distinction between the two distributions, without any overlap (Figure 2).

Fig. 2. Probability density function of exons and introns

TABLE I
COMPARISON OF RESULTS BETWEEN SUPERINFORMATION AND PROPOSED MEASURE
(For each gene, the first row gives the range for exons and the second row the range for introns.)

Gene ID      Superinformation measure    Proposed distance measure
HGNC1809     3.1536-3.8518               50.8851-51.0297
             1.5000-2.3219               44.2350-47.8641
HGNC1810     2.9140-3.6899               50.8746-50.0425
             0.8113-2.3219               44.1843-47.8080
HGNC11311    2.1556-2.9477               50.7741-50.8235
HGNC11882    3.1751-3.7464               50.7756-50.8662
             2.1556-3.0000               47.3689-47.4801
HGNC12668    1.9219-2.5850               50.9062-51.0074
             1.0000-1.0000               44.5161-48.7337
HGNC14024    1.5059-3.0062               50.7396-50.9839
             4.4108-4.6452               47.4614-47.6848
HGNC38732    2.4194-3.1699               50.8256-50.9436
             3.0531-3.6566               47.2825-47.4517
HGNC18568    2.6924-3.3750               51.0122-51.2085
             1.5000-2.3219               48.5335-48.7282
HGNC462      2.2500-3.1699               50.9292-51.2904
             4.3090-4.6640               47.6975-47.8937
HGNC37464    2.0588-2.8074               50.9581-51.1611
             2.7899-3.6250               47.4404-47.6399
HGNC37473    2.5503-3.4183               50.8754-50.9689
             2.0875-3.0761               47.3018-47.4325

A comparative study has been carried out to assess the efficiency of the proposed distance measure relative to Mutual Information and Bhattacharyya Distance (Figure 3 and Figure 4) and to Superinformation (Table I). Experiments are conducted on 20 randomly selected genes taken from each of human chromosomes 1, 10 and 19. It may be noted in the table that, for the proposed measure, the minimum distance for exons differs considerably from the maximum distance for introns, resulting in distinct segregation of introns and exons. This is an achievement over existing measures. It is worth mentioning that for all values of k ≤ L/2, where L is the length of a sequence, there is clear segregation of the dependency values for exon and intron regions calculated using the proposed distance measure.
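The procedure described in this section (concatenate the regions of one type, evaluate the dependency for k = 1 to 20, then average the 20 values per gene) can be sketched as below. This is an illustrative sketch only: `lag_dependency` and `dependency_profile` are hypothetical names, base 2 is an assumed choice of logarithm, pairs that never occur are skipped as an assumption, and the short sequences are made-up toy strings rather than Ensembl data.

```python
import math
from collections import Counter
from statistics import mean


def lag_dependency(seq, k):
    """Base-2 form of the dependency measure at lag k (input limited to A/C/G/T)."""
    pairs = Counter(zip(seq, seq[k:]))  # counts of (seq[i], seq[i + k])
    n_pairs = sum(pairs.values())
    singles = Counter(seq)
    n = len(seq)
    total = 0.0
    # Assumption: pairs that never occur contribute nothing to the sum.
    for (x, y), count in pairs.items():
        prod = (count / n_pairs) * (singles[x] / n) * (singles[y] / n)
        total += math.log2(1.0 - math.log2(prod))
    return total


def dependency_profile(regions, max_k=20):
    """Concatenate regions of one type and evaluate lags 1..max_k."""
    seq = "".join(regions).upper()
    if len(seq) <= max_k:
        raise ValueError("concatenated sequence is too short for the requested lags")
    return [lag_dependency(seq, k) for k in range(1, max_k + 1)]


# Toy, made-up regions standing in for the exons and introns of one transcript.
toy_exons = ["ATGGCCAAGT", "GGCTTACGAT", "CCGATTAGCA"]
toy_introns = ["GTAAGTTTTA", "TTTTCTTTAG", "GTATGTTTCA"]

exon_profile = dependency_profile(toy_exons)
intron_profile = dependency_profile(toy_introns)
print("mean exon dependency:  ", mean(exon_profile))
print("mean intron dependency:", mean(intron_profile))
```

Collecting the per-gene mean for many genes gives the two samples from which exon and intron density estimates such as those in Figure 2 could be drawn. With sequences this short most of the 16 pairs never occur, so the toy output is not comparable to the values in Table I.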

V. CONCLUSION
In this paper we have proposed a new statistical distance
measure from the logarithm transformation of the dot product
of two probability vectors. An important application of the
measure has been found for computation of probability density
function of parts of DNA sequences with a view to distinguish
the selected parts. Experiments are conducted on gene
sequences of several human chromosomes. The distance
measure is found to be efficient, superseding some other
existing distance measures, in segregating intron and exon
regions of human gene sequences.

Fig. 3. Patterns of statistical distance (y-axis) against nucleotide distance (x-axis) of combined coding and non-coding regions of the gene from Chromosome 1 and 10

Fig. 4. Patterns of statistical distance (y-axis) against nucleotide distance (x-axis) of combined coding and non-coding regions of the gene from Chromosome 19 and Y

REFERENCES

[1] R. D. Sleator, An overview of the current status of eukaryote gene prediction strategies. Gene. 2010 Aug 1;461(1-2):1-4. doi: 10.1016/j.gene.2010.04.008. Epub 2010 Apr 27.
[2] S. Ohno, Brookhaven Symp. Biol. 23, 366 (1972).
[3] J. Wang, J. Kudoh, A. Shintani, S. Minoshima, and N. Shimizu, Biochem. Biophys. Res. Commun. 250, 704 (1998).
[4] Ganna Leonenko, Sietse O. Los and Peter R. J. North, Statistical Distances and Their Applications to Biophysical Parameter Estimation: Information Measures, M-Estimates, and Minimum Contrast Methods, Remote Sens. 2013, 5, 1355-1388; doi:10.3390/rs5031355
[5] A. O. Schmitt and H. Herzel, J. Theor. Biol. 1888, 369 (1987).
[6] C. K. Peng, S. Buldyrev, A. Goldberger, S. Havlin, F. Sciortino, M. Simons, and H. E. Stanley, Nature (London) 356, 168 (1992).
[7] R. Grantham, C. Gautier, M. Gouy, M. Jacobzone, and R. Mercier, Nucleic Acids Res. 9, R43 (1981).
[8] P. Bernaola-Galvan, I. Grosse, P. Carpena, J. L. Oliver, R. Roman-Roldan, and H. E. Stanley, Phys. Rev. Lett. 85, 1342 (2000).
[9] Ranjan Bose and Sonali Chouhan, Physical Review E 83, 051918 (2011).
[10] I. Grosse, V.B. Sergey, S.H. Eugene (2000). Average Mutual Information of Coding and Noncoding DNA, Pacific Symposium on Biocomputing 5:611-620.
[11] A. Bhattacharyya (1943). "On a measure of divergence between two statistical populations defined by their probability distributions". Bulletin of the Calcutta Mathematical Society 35: 99-109. MR 0010358
[12] P. C. Mahalanobis (1936). On the generalised distance in statistics. Proceedings of the National Institute of Sciences of India 2 (1): 49-55. Retrieved 2012-05-03.
[13] Sung-Hyuk Cha (2007). Comprehensive Survey on Distance/Similarity Measures between Probability Density Functions. International Journal of Mathematical Models and Methods in Applied Sciences, Issue 4, Volume 1.
[14] Jason Osborne (2002). Notes on the use of data transformations. Practical Assessment, Research & Evaluation, 8(6).
