Position Weight Matrix

The position weight matrix (PWM) is a commonly used representation of motifs in biological sequences. PWMs are derived from aligned sequences thought to share a function and contain a weight for each sequence symbol at each pattern position. In the study that introduced them, scanning with a PWM was more sensitive and precise than the best consensus sequence at distinguishing true binding sites. Information content, calculated from the PWM, indicates how different the motif is from a uniform (or background) distribution. PWMs are used in computational tools to discover motifs and to scan sequences for motif matches.

Uploaded by

Anonymous E4Rbo2s

Position weight matrix

This article is about bioinformatics. For the disease in horses known by the acronym PSSM, see Equine polysaccharide storage myopathy.

A position weight matrix (PWM), also known as a position-specific weight matrix (PSWM) or position-specific scoring matrix (PSSM), is a commonly used representation of motifs (patterns) in biological sequences.

PWMs are often derived from a set of aligned sequences that are thought to be functionally related and have become an important part of many software tools for computational motif discovery.

Figure: PWMs are often represented graphically as sequence logos.

1 Background

Figure: PWMs were introduced by American geneticist Gary Stormo.

The position weight matrix was introduced by American geneticist Gary Stormo and colleagues in 1982[1] as an alternative to consensus sequences. Consensus sequences had previously been used to represent patterns in biological sequences, but had difficulties in the prediction of new occurrences of these patterns.[2] The first use of PWMs was in the discovery of RNA sites that function as translation initiation sites. The perceptron algorithm was suggested by Polish American mathematician Andrzej Ehrenfeucht in order to create a matrix of weights which could distinguish true binding sites from other non-functional sites with similar sequences. Training the perceptron on both sets of sites resulted in a matrix and a threshold to distinguish between the two sets.[1] Using the matrix to scan new sequences not included in the training set showed that this method was both more sensitive and more precise than the best consensus sequence.[2]

The advantages of PWMs over consensus sequences have made PWMs a popular method for representing patterns in biological sequences and an essential component in modern algorithms for motif discovery.[3][4]

2 From Sequences to PWM

A PWM has one row for each symbol of the alphabet: 4 rows for nucleotides in DNA sequences or 20 rows for amino acids in protein sequences. It also has one column for each position in the pattern. In the first step in constructing a PWM, a basic position frequency matrix (PFM) is created by counting the occurrences of each nucleotide at each position. From the PFM, a position probability matrix (PPM) can then be created by dividing each nucleotide count at each position by the number of sequences, thereby normalising the values. Formally, given a set $X$ of $N$ aligned sequences of length $l$, the elements of the PPM $M$ are calculated:

$$M_{k,j} = \frac{1}{N} \sum_{i=1}^{N} I(X_{i,j} = k),$$

where $i \in (1, \dots, N)$, $j \in (1, \dots, l)$, $k$ ranges over the set of symbols in the alphabet and $I(a = k)$ is an indicator function where $I(a = k)$ is 1 if $a = k$ and 0 otherwise.

For example, given the following DNA sequences:

GAGGTAAAC
TCCGTAAGT
CAGGTTGGA
ACAGTCAGT
TAGGTCATT
TAGGTACTG
ATGGTAACT
CAGGTATAC
TGTGTGAGT
AAGGTAAGT

the corresponding PFM is:

$$M = \begin{matrix} A \\ C \\ G \\ T \end{matrix} \begin{bmatrix} 3 & 6 & 1 & 0 & 0 & 6 & 7 & 2 & 1 \\ 2 & 2 & 1 & 0 & 0 & 2 & 1 & 1 & 2 \\ 1 & 1 & 7 & 10 & 0 & 1 & 1 & 5 & 1 \\ 4 & 1 & 1 & 0 & 10 & 1 & 1 & 2 & 6 \end{bmatrix}.$$
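The counting and normalisation steps described in this section can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names `position_frequency_matrix` and `position_probability_matrix` are hypothetical, and the ten sequences are assumed to be the aligned example set (their per-position counts reproduce the PFM of this section).

```python
ALPHABET = "ACGT"

# Ten aligned example sequences (an assumption of this sketch; their
# per-position counts reproduce the PFM of the worked example).
SEQUENCES = [
    "GAGGTAAAC", "TCCGTAAGT", "CAGGTTGGA", "ACAGTCAGT", "TAGGTCATT",
    "TAGGTACTG", "ATGGTAACT", "CAGGTATAC", "TGTGTGAGT", "AAGGTAAGT",
]


def position_frequency_matrix(sequences, alphabet=ALPHABET):
    """Count each symbol at each position: {symbol: [count per position]}."""
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must be aligned to the same length")
    pfm = {symbol: [0] * length for symbol in alphabet}
    for seq in sequences:
        for j, symbol in enumerate(seq):
            pfm[symbol][j] += 1  # this is the indicator I(X[i][j] == k)
    return pfm


def position_probability_matrix(pfm, n_sequences):
    """Divide every count by N, the number of aligned sequences."""
    return {
        symbol: [count / n_sequences for count in row]
        for symbol, row in pfm.items()
    }


pfm = position_frequency_matrix(SEQUENCES)
ppm = position_probability_matrix(pfm, len(SEQUENCES))

for symbol in ALPHABET:
    print(symbol, pfm[symbol], ppm[symbol])
```

Storing the matrix as one list per symbol mirrors the row-per-symbol layout used in the text; each column of `ppm` sums to 1 because every sequence contributes exactly one symbol per position.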

Dividing each count by the number of sequences (here N = 10) gives the resulting PPM:

$$M = \begin{matrix} A \\ C \\ G \\ T \end{matrix} \begin{bmatrix} 0.3 & 0.6 & 0.1 & 0.0 & 0.0 & 0.6 & 0.7 & 0.2 & 0.1 \\ 0.2 & 0.2 & 0.1 & 0.0 & 0.0 & 0.2 & 0.1 & 0.1 & 0.2 \\ 0.1 & 0.1 & 0.7 & 1.0 & 0.0 & 0.1 & 0.1 & 0.5 & 0.1 \\ 0.4 & 0.1 & 0.1 & 0.0 & 1.0 & 0.1 & 0.1 & 0.2 & 0.6 \end{bmatrix}.$$

Both PPMs and PWMs assume statistical independence between positions in the pattern, as the probabilities for each position are calculated independently of other positions. From the definition above, it follows that the sum of values for a particular position (that is, summing over all symbols) is 1. Each column can therefore be regarded as an independent multinomial distribution. This makes it easy to calculate the probability of a sequence given a PPM, by multiplying the relevant probabilities at each position. For example, the probability of the sequence S = GAGGTAAAC given the above PPM M can be calculated:

$$p(S \mid M) = 0.1 \times 0.6 \times 0.7 \times 1.0 \times 1.0 \times 0.6 \times 0.7 \times 0.2 \times 0.2 = 0.0007056.$$

Pseudocounts (or Laplace estimators) are often applied when calculating PPMs from a small dataset, in order to avoid matrix entries having a value of 0.[6] This is equivalent to combining each column of the PPM with a Dirichlet prior distribution and allows the probability to be calculated for new sequences (that is, sequences which were not part of the original dataset). In the example above, without pseudocounts, any sequence which did not have a G in the 4th position or a T in the 5th position would have a probability of 0, regardless of the other positions.

3 Creating the PWM

Most often the elements in PWMs are calculated as log likelihoods. That is, the elements of a PPM are transformed using a background model $b$ so that:

$$M_{k,j} = \log_2 (M_{k,j} / b_k).$$

This describes how an element of the PWM (left-hand side), $M_{k,j}$, can be calculated from the corresponding element of the PPM (right-hand side). The simplest background model assumes that each letter appears equally frequently in the dataset. That is, the value of $b_k = 1/|k|$ for all symbols in the alphabet (0.25 for nucleotides and 0.05 for amino acids). Applying this transformation to the PPM M from above (with no pseudocounts added) gives:

$$M = \begin{matrix} A \\ C \\ G \\ T \end{matrix} \begin{bmatrix} 0.26 & 1.26 & -1.32 & -\infty & -\infty & 1.26 & 1.49 & -0.32 & -1.32 \\ -0.32 & -0.32 & -1.32 & -\infty & -\infty & -0.32 & -1.32 & -1.32 & -0.32 \\ -1.32 & -1.32 & 1.49 & 2.0 & -\infty & -1.32 & -1.32 & 1.0 & -1.32 \\ 0.68 & -1.32 & -1.32 & -\infty & 2.0 & -1.32 & -1.32 & -0.32 & 1.26 \end{bmatrix}.$$

The $-\infty$ entries in the matrix make clear the advantage of adding pseudocounts, especially when using small datasets to construct M. The background model need not have equal values for each symbol: for example, when studying organisms with a high GC-content, the values for C and G may be increased with a corresponding decrease for the A and T values.

When the PWM elements are calculated using log likelihoods, the score of a sequence can be calculated by adding (rather than multiplying) the relevant values at each position in the PWM. The sequence score gives an indication of how different the sequence is from a random sequence. The score is 0 if the sequence has the same probability of being a functional site and of being a random site. The score is greater than 0 if it is more likely to be a functional site than a random site, and less than 0 if it is more likely to be a random site than a functional site.[5] The sequence score can also be interpreted in a physical framework as the binding energy for that sequence.

4 Information content of a PWM

The information content (IC) of a PWM is sometimes of interest, as it says something about how different a given PWM is from a uniform distribution.

The self-information of observing a particular symbol at a particular position of the motif is:

$$-\log(p_{i,j})$$

The expected (average) self-information of a particular element in the PWM is then:

$$-p_{i,j} \cdot \log(p_{i,j})$$

Finally, the IC of the PWM is then the sum of the expected self-information of every element:

$$-\sum_{i,j} p_{i,j} \cdot \log(p_{i,j})$$

Often, it is more useful to calculate the information content with the background letter frequencies of the sequences being studied rather than assuming equal probabilities of each letter (e.g., the GC-content of DNA of thermophilic bacteria ranges from 65.3% to 70.8%,[7] thus a motif of ATAT would contain much more information than a motif of CCGG). The equation for information content thus becomes (with no leading minus sign, because the quantity is a divergence from the background rather than an entropy):

$$\sum_{i,j} p_{i,j} \cdot \log(p_{i,j} / p_b)$$
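The information-content sums can be checked numerically. The sketch below is an illustration under stated assumptions, not a standard routine: the helper names are hypothetical, logarithms are taken to base 2 (so results are in bits), terms with p = 0 are treated as contributing 0, the background-corrected sum is computed as a relative entropy (the sum of p·log(p/b), with b the background frequency of the letter), and the GC-rich background frequencies are invented for the example.

```python
import math

# PPM of the worked example (rows A, C, G, T; columns are motif positions).
PPM = {
    "A": [0.3, 0.6, 0.1, 0.0, 0.0, 0.6, 0.7, 0.2, 0.1],
    "C": [0.2, 0.2, 0.1, 0.0, 0.0, 0.2, 0.1, 0.1, 0.2],
    "G": [0.1, 0.1, 0.7, 1.0, 0.0, 0.1, 0.1, 0.5, 0.1],
    "T": [0.4, 0.1, 0.1, 0.0, 1.0, 0.1, 0.1, 0.2, 0.6],
}


def expected_self_information(ppm):
    """Sum of -p * log2(p) over all elements, taking 0 * log(0) as 0."""
    return -sum(p * math.log2(p) for row in ppm.values() for p in row if p > 0)


def relative_entropy(ppm, background):
    """Sum of p * log2(p / b) over all elements (Kullback-Leibler divergence)."""
    return sum(
        p * math.log2(p / background[symbol])
        for symbol, row in ppm.items()
        for p in row
        if p > 0
    )


def consensus_ppm(motif, alphabet="ACGT"):
    """PPM of a motif with no variation: probability 1 for the given letter."""
    return {s: [1.0 if c == s else 0.0 for c in motif] for s in alphabet}


uniform = {symbol: 0.25 for symbol in PPM}
# Hypothetical background for a genome with about 70% GC-content.
gc_rich = {"A": 0.15, "C": 0.35, "G": 0.35, "T": 0.15}

print("expected self-information:", expected_self_information(PPM))
print("relative entropy, uniform background:", relative_entropy(PPM, uniform))
print("ATAT vs CCGG, GC-rich background:",
      relative_entropy(consensus_ppm("ATAT"), gc_rich),
      relative_entropy(consensus_ppm("CCGG"), gc_rich))
```

With the hypothetical GC-rich background, the fixed motif ATAT comes out with more information than CCGG, whereas under a uniform background both come to 8 bits, since each of the four fixed positions contributes log2(1/0.25) = 2 bits.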
Here, $p_b$ is the background frequency for that letter. This corresponds to the Kullback–Leibler divergence or relative entropy. However, it has been shown that when using PSSM to search genomic sequences (see below) this uniform correction can lead to overestimation of the importance of the different bases in a motif, due to the uneven distribution of n-mers in real genomes, leading to a significantly larger number of false positives.[8]

5 Using PWMs

There are various algorithms to scan for hits of PWMs in sequences. One example is the MATCH algorithm,[9] which has been implemented in the ModuleMaster.[10] More sophisticated algorithms for fast database searching with nucleotide as well as amino acid PWMs/PSSMs are implemented in the possumsearch software and are described by Beckstette, et al. (2006).[11]

6 References

[1] Stormo, Gary D.; Schneider, Thomas D.; Gold, Larry; Ehrenfeucht, Andrzej (1982). "Use of the 'Perceptron' algorithm to distinguish translational initiation sites in E. coli". Nucleic Acids Research. 10 (9): 2997–3011. doi:10.1093/nar/10.9.2997.
[2] Stormo, G. D. (1 January 2000). "DNA binding sites: representation and discovery". Bioinformatics. 16 (1): 16–23. doi:10.1093/bioinformatics/16.1.16. PMID 10812473.
[3] Sinha, S. (27 July 2006). "On counting position weight matrix matches in a sequence, with application to discriminative motif finding". Bioinformatics. 22 (14): e454–e463. doi:10.1093/bioinformatics/btl227.
[4] Xia, Xuhua (2012). "Position Weight Matrix, Gibbs Sampler, and the Associated Significance Tests in Motif Characterization and Prediction". Scientifica. 2012: 1–15. doi:10.6064/2012/917540.
[5] Guigo, Roderic. "An Introduction to Position Specific Scoring Matrices". http://bioinformatica.upf.edu. Retrieved 12 November 2013.
[6] Nishida, K.; Frith, M. C.; Nakai, K. (23 December 2008). "Pseudocounts for transcription factor binding sites". Nucleic Acids Research. 37 (3): 939–944. doi:10.1093/nar/gkn1019.
[7] Aleksandrushkina NI, Egorova LA (1978). "Nucleotide makeup of the DNA of thermophilic bacteria of the genus Thermus". Mikrobiologiia. 47 (2): 250–2. PMID 661633.
[8] Erill I, O'Neill MC (2009). "A reexamination of information theory-based methods for DNA-binding site identification". BMC Bioinformatics. 10: 57. doi:10.1186/1471-2105-10-57. PMC 2680408. PMID 19210776.
[9] Kel AE, et al. (2003). "MATCH™: a tool for searching transcription factor binding sites in DNA sequences". Nucleic Acids Research. 31 (13): 3576–3579. doi:10.1093/nar/gkg585. PMC 169193. PMID 12824369.
[10] Wrzodek, Clemens; Schröder, Adrian; Dräger, Andreas; Wanke, Dierk; Berendzen, Kenneth W.; Kronfeld, Marcel; Harter, Klaus; Zell, Andreas (9 October 2009). "ModuleMaster: A new tool to decipher transcriptional regulatory networks". Biosystems. Ireland: Elsevier. 99 (1): 79–81. doi:10.1016/j.biosystems.2009.09.005. ISSN 0303-2647. PMID 19819296.
[11] Beckstette, M.; et al. (2006). "Fast index based algorithms and software for matching position specific scoring matrices". BMC Bioinformatics. 7: 389. doi:10.1186/1471-2105-7-389. PMC 1635428. PMID 16930469.

7 External links

3PFDB: a database of Best Representative PSSM Profiles (BRPs) of Protein Families generated using a novel data mining approach.
UGENE: PSS matrices design, integrated interface to JASPAR, Uniprobe and SITECON databases.
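As a supplement to the section on using PWMs, scanning a sequence for hits can be sketched as a plain sliding window over a log-odds matrix. This is a naive illustration, not the MATCH algorithm and not the index-based methods of possumsearch: the function names, the pseudocount of 1 per cell, the uniform background, the test sequence and the threshold of 5.0 are all assumptions chosen for the example.

```python
import math

# PFM of the worked example (rows A, C, G, T), built from N = 10 sequences.
PFM = {
    "A": [3, 6, 1, 0, 0, 6, 7, 2, 1],
    "C": [2, 2, 1, 0, 0, 2, 1, 1, 2],
    "G": [1, 1, 7, 10, 0, 1, 1, 5, 1],
    "T": [4, 1, 1, 0, 10, 1, 1, 2, 6],
}


def log_odds_pwm(pfm, background, pseudocount=1.0):
    """Convert counts to log2(p / b) weights, adding a pseudocount to every
    cell so that no weight is minus infinity."""
    length = len(next(iter(pfm.values())))
    pwm = {symbol: [] for symbol in pfm}
    for j in range(length):
        total = sum(pfm[s][j] + pseudocount for s in pfm)
        for s in pfm:
            p = (pfm[s][j] + pseudocount) / total
            pwm[s].append(math.log2(p / background[s]))
    return pwm


def score(pwm, site):
    """Add (rather than multiply) the weight of each symbol at its position.
    Assumes the site contains only symbols present in the matrix."""
    return sum(pwm[symbol][j] for j, symbol in enumerate(site))


def scan(pwm, sequence, threshold):
    """Slide the matrix along the sequence; report windows scoring >= threshold."""
    width = len(next(iter(pwm.values())))
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        window_score = score(pwm, window)
        if window_score >= threshold:
            hits.append((start, window, window_score))
    return hits


uniform = {symbol: 0.25 for symbol in PFM}
pwm = log_odds_pwm(PFM, uniform)

# Hypothetical test sequence with the example site embedded at offset 4.
hits = scan(pwm, "TTTTGAGGTAAACTTTT", threshold=5.0)
for start, window, window_score in hits:
    print(start, window, round(window_score, 2))
```

The scan costs one matrix lookup per position per window, which is adequate for short sequences; the tools cited in the text exist precisely because this becomes slow on whole genomes or large matrix collections.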

8 Text and image sources, contributors, and licenses

8.1 Text

Position weight matrix Source: https://en.wikipedia.org/wiki/Position_weight_matrix?oldid=736312107 Contributors: Michael Hardy, BenFrantzDale, Ketil, Chowbok, Thorwald, Kbradnam, Rjwilmsi, SmackBot, Verne Equinox, RlyehRising, Colonies Chris, Sandve, Hsiaut, CmdrObot, Montanabw, Opabinia regalis, IradBG, Addbot, DOI bot, Non-dropframe, Troels.M, Gnomehacker, Citation bot 1, DrilBot, Tom.Reding, Trappist the monk, RoadTrain, Amkilpatrick, RjwilmsiBot, Romeokienzler, Helpful Pixie Bot, Hsp90, Unknomics, Fakharian.m, Monkbot, Harlba, Sbandya1 and Anonymous: 23

8.2 Images

File:Free-to-read_lock_75.svg Source: https://upload.wikimedia.org/wikipedia/commons/8/80/Free-to-read_lock_75.svg License: CC0 Contributors: Adapted from Open_Access_logo_PLoS_white_green.svg Original artist: This version: Trappist_the_monk
File:ISMBECCB13-039.jpg Source: https://upload.wikimedia.org/wikipedia/commons/f/fd/ISMBECCB13-039.jpg License: CC BY 2.0 Contributors: Flickr: ISMBECCB13-039 Original artist: ismb
File:LexA_gram_positive_bacteria_sequence_logo.png Source: https://upload.wikimedia.org/wikipedia/commons/8/85/LexA_gram_positive_bacteria_sequence_logo.png License: CC BY-SA 3.0 Contributors: Created using the Weblogo software (http://weblogo.berkeley.edu/), which is distributed on a MIT Open Source License (http://weblogo.berkeley.edu/LICENSE) Original artist: Gnomehacker

8.3 Content license

Creative Commons Attribution-Share Alike 3.0
