Bioinformatics (3)

The document discusses the structure and organization of the genome, highlighting the role of chromatin, histones, and non-histone proteins in DNA packaging and gene regulation. It details the human genome's composition, including protein-coding genes, non-coding RNA, and genetic variations, as well as the significance of CpG islands in gene expression. Additionally, it covers the concepts of gene evolution, regulatory elements like promoters and enhancers, and the field of epigenetics, emphasizing its implications in health and disease.

Uploaded by

theofix301
Genome structure and organization

• Genomic DNA in the nucleus exists in combination with histone proteins.
• DNA-protein complex ➔ chromatin, which is organized in repeating units
called nucleosomes; the nucleosome is the basic unit of chromatin structure.
• The nucleosome core particle is composed of a histone octamer and the DNA
that wraps around the octamer.
• Histone octamer consists of two copies of each of the four core histone
proteins: H2A, H2B, H3, and H4. Histone proteins are positively charged and
interact with the negatively charged phosphate groups of DNA.

Ref: Thomas Splettstoesser (www.scistyle.com), CC BY-SA 4.0 <https://creativecommons.org/licenses/by-sa/4.0>, via Wikimedia Commons

Choudhuri, Supratim. “Bioinformatics for beginners: genes, genomes,
molecular evolution, databases, and analytical tools.” Elsevier, 2014.

• Chromatin is composed of DNA and proteins, mainly histones and non-histones.
• One group of non-histones is the high mobility group (HMG) proteins, which have
low molecular weight and high mobility in the nucleus.
• HMG proteins can bind to DNA and alter its shape and compactness, making it
more accessible to other factors.
• HMG proteins can also modulate the interaction between transcription factors and
coregulators, which are proteins that enhance or repress transcription.


Human genome
• Consists of around 3.1 billion base pairs in one copy of the human genome, which
includes the 22 autosomes and the sex chromosomes (X or Y).
https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_000001405.40/
• The genomes of two humans are about 99.9% identical at the nucleotide level; the
remaining variation accounts for individual differences and traits.
• There are ~20,000 protein-coding genes in the entire genomic DNA, whose coding
sequences occupy only ~1-2% of the genome sequence.
• About two-thirds of the protein-coding genes have orthologs across mammals,
meaning that they descend from a common ancestral gene and usually have similar functions.
• ~3-3.5% of the genome sequence consists of regulatory sequences, such
as promoters, enhancers, and silencers, that control gene expression by interacting
with transcription factors and other proteins.
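
To give the percentages above a sense of scale, the following minimal sketch converts them into base-pair counts. It uses only the approximate figures quoted on this slide; the helper name share_of_genome is hypothetical, and the results are rough orders of magnitude, not measured values.

```python
# Back-of-the-envelope arithmetic with the approximate figures quoted above.
GENOME_BP = 3_100_000_000  # ~3.1 billion bp in one copy of the genome


def share_of_genome(per_mille):
    """Base pairs covered by a share given in tenths of a percent (per mille)."""
    return GENOME_BP * per_mille // 1000


# 99.9% identity means roughly 0.1% of positions differ between two people.
differing_bp = share_of_genome(1)

# Protein-coding sequence: ~1-2% of the genome.
coding_bp = (share_of_genome(10), share_of_genome(20))

# Regulatory sequence: ~3-3.5% of the genome.
regulatory_bp = (share_of_genome(30), share_of_genome(35))

print(f"Differing positions between two genomes: ~{differing_bp:,} bp")
print(f"Protein-coding sequence: ~{coding_bp[0]:,} to {coding_bp[1]:,} bp")
print(f"Regulatory sequence: ~{regulatory_bp[0]:,} to {regulatory_bp[1]:,} bp")
```

So a 0.1% difference still corresponds to about three million positions, and the coding fraction to a few tens of millions of base pairs.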
• The human genome also contains a large number of non-coding RNA genes, such as
microRNAs, long non-coding RNAs, and ribosomal RNAs, that have various roles in gene
regulation, RNA processing, translation, and cellular functions.
• ~50% of the human genome consists of repeat sequences, which are derived from
transposable elements or DNA replication errors. Examples include simple repeats
(e.g. (A)n, (CA)n, (CGG)n), tandem repeat blocks (e.g. centromeric repeats, telomeric
repeats, ribosomal gene clusters), and segmental duplications (copied from one region
and integrated into another region).
• Some examples of genetic variations are SNPs (single nucleotide polymorphisms),
which are single base changes in the DNA sequence, and copy number variations (CNVs),
which are structural variations covering more than 1 kb of DNA sequence that arise
from gains (duplications) or losses (deletions) of large segments of DNA.
• About 65% of all SNPs are transitions, i.e. the interchange of two purines (A-G) or
of two one-ring pyrimidines (C-T); the rest are transversions (purine-pyrimidine interchange).
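
A transition swaps a purine for a purine (A↔G) or a pyrimidine for a pyrimidine (C↔T); any other single-base change is a transversion. The sketch below classifies a SNP accordingly. The function name classify_snp is an illustrative choice, not part of any library.

```python
PURINES = {"A", "G"}       # two-ring bases
PYRIMIDINES = {"C", "T"}   # one-ring bases


def classify_snp(ref, alt):
    """Classify a single-base change as 'transition' or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("reference and alternate base are identical")
    # Same chemical class on both sides means a transition.
    same_class = ({ref, alt} <= PURINES) or ({ref, alt} <= PYRIMIDINES)
    return "transition" if same_class else "transversion"


print(classify_snp("C", "T"))  # transition
print(classify_snp("A", "C"))  # transversion
```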
CpG islands

• CpG islands (CG islands; cytosine-phosphate-guanine clusters) are
short regions of DNA (about 200-2000 bp) that have a high frequency of
CpG dinucleotides, which are otherwise rare in the genome due to
methylation and deamination.
• CpG islands make up about 0.8% of the human genome and are mostly
located near the promoters of genes.
• Cytosine in CpG dinucleotides is often methylated, and methylcytosine
(meC) tends to deaminate spontaneously to thymine, hence converting
CpG to TpG. This process results in a loss of CpG sites and a mutation in
the DNA sequence. CpG islands, which are usually unmethylated, largely
escape this loss.
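
One way to make "a high frequency of CpG dinucleotides" concrete is to compare the observed CpG count with the count expected from the C and G content alone. The sketch below does this for a single sequence. The thresholds (length of at least 200 bp, GC content above 50%, observed/expected ratio above 0.6) are commonly used criteria that do not come from these slides, so treat them as assumptions; the function names are hypothetical.

```python
def cpg_stats(seq):
    """Return (GC content, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_content = (c_count + g_count) / n
    # Expected CpG count if C and G were placed independently.
    expected = c_count * g_count / n
    ratio = cpg_count / expected if expected else 0.0
    return gc_content, ratio


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_ratio=0.6):
    """Apply the assumed thresholds to one candidate region."""
    gc_content, ratio = cpg_stats(seq)
    return len(seq) >= min_len and gc_content > min_gc and ratio > min_ratio


print(cpg_stats("CGCGCGCG"))              # (1.0, 2.0)
print(looks_like_cpg_island("CG" * 100))  # True
print(looks_like_cpg_island("AT" * 100))  # False
```

In practice a scan would slide a window along the genome and merge adjacent windows that pass; this sketch only evaluates one region.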
• CpG islands are known to regulate gene expression (transcriptional silencing or
activation) by affecting the binding of transcription factors and the accessibility of
chromatin.
• Methylation of CpG islands ➔ transcriptional silencing, as it prevents the binding of
transcription factors and recruits methyl-binding proteins that induce chromatin
condensation.
• Absence of methylation of CpG islands ➔ active transcription, as it allows the
binding of transcription factors and recruits histone-modifying enzymes that induce
chromatin relaxation.
Genome evolution

• Genes can die and be born through various mechanisms that alter the
genome structure and function.
• Gene birth: duplication of an existing gene is followed by mutation ➔
results in a new function or expression pattern for the duplicated copy.
• Gene death: inactivating mutations and loss of function (e.g.
pseudogenization: a functional gene ➔ becomes non-functional due to
mutations that disrupt its coding sequence or regulatory elements).
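
One simple way a mutation can disrupt a coding sequence is by creating a premature stop codon. The sketch below checks an in-frame coding sequence for a stop codon before its final codon, using the three stop codons of the standard genetic code (TAA, TAG, TGA). The function name and the toy sequences are made up for illustration; real pseudogene annotation relies on much more evidence than this single check.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def has_premature_stop(cds):
    """Return True if an in-frame stop codon occurs before the last codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    # The final codon is expected to be the normal stop, so skip it.
    return any(codon in STOP_CODONS for codon in codons[:-1])


functional = "ATGCAAAAATAA"  # ATG CAA AAA TAA (toy example)
mutated = "ATGTAAAAATAA"     # CAA -> TAA: a single C->T change adds a stop

print(has_premature_stop(functional))  # False
print(has_premature_stop(mutated))     # True
```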
Promoter

• A promoter is a DNA sequence located in the 5’-flanking region upstream
of the transcription start site of a gene.
• A promoter contains various transcription regulatory sequence elements
that bind to specific transcription factors and RNA polymerase.
• A promoter initiates and regulates the transcription of a gene by
determining its frequency, timing, and tissue specificity.
• A core (or basal) promoter (~35 bp long) contains the minimal elements
required for transcription initiation, such as the TATA box, initiator element,
and downstream promoter element.
• A proximal promoter (upstream of the core promoter, ~250 bp long)
contains additional elements that modulate transcription activity.
• A distal promoter (upstream of the proximal promoter) contains
enhancer or silencer elements that can further activate or repress
transcription by interacting with distant or nearby regulatory proteins.
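
As an illustration of how a core promoter element might be located, the sketch below searches the ~35 bp immediately upstream of a transcription start site (TSS) for a TATA-box-like motif. The consensus pattern TATA[AT]A[AT] is an assumption used here for illustration (real TATA boxes vary, and many promoters lack one); the function name, the window size default, and the toy sequence are likewise hypothetical.

```python
import re

# Assumed consensus for illustration only; real TATA boxes are more variable.
TATA_PATTERN = re.compile(r"TATA[AT]A[AT]")


def find_tata_boxes(sequence, tss_index, window=35):
    """Return motif start positions relative to the TSS (negative = upstream)."""
    seq = sequence.upper()
    start = max(0, tss_index - window)
    upstream = seq[start:tss_index]  # the core promoter window
    return [start + m.start() - tss_index
            for m in TATA_PATTERN.finditer(upstream)]


# Toy sequence: a TATA-like motif placed 30 bp upstream of the TSS at index 40.
toy = "G" * 10 + "TATAAAA" + "G" * 23 + "ACGT"
print(find_tata_boxes(toy, tss_index=40))  # [-30]
```

Positions are reported relative to the TSS so that they can be compared across genes regardless of where each gene sits in the sequence.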
Enhancers / Silencers
• Enhancers and silencers are DNA sequences located farther away from the
promoter (could be thousands of bp upstream or downstream of the gene) or
within introns or exons.
• Enhancers bind specific transcription activators and increase the rate of
transcription by enhancing the assembly or stability of the transcription initiation
complex.
• Silencers bind specific transcription repressors and decrease the rate of
transcription by interfering with the assembly or stability of the transcription
initiation complex.
Regulation

Thomas Shafee, CC BY 4.0 <https://creativecommons.org/licenses/by/4.0>, via Wikimedia Commons


This diagram illustrates how different types of regulatory elements (promoters, enhancers, silencers) and
factors (RNA polymerase II, general transcription factors, specific transcription factors) work together to
regulate gene expression at the transcriptional level.
Epigenetics
• Epigenetics is the study of changes in gene function that cannot be explained by changes in
DNA sequence, but rather by chemical modifications or interactions that affect how genes
are expressed.
• DNA methylation: the addition of a methyl group to a cytosine base (or its removal), which
can affect gene expression by altering the accessibility of DNA to transcription factors
and enzymes.
• Histone modification: the addition or removal of chemical groups, such as acetyl,
methyl, or phosphate, on the amino acid tails of histone proteins, which can affect gene
expression by altering the structure and dynamics of chromatin.
• Chromatin conformation change: the folding and unfolding of chromatin into different
levels of compaction and organization, which can affect gene expression by exposing certain
regions of DNA to the transcription machinery or hiding them from it.
• Regulation by non-coding RNAs: the production and function of various types of RNA
molecules that do not code for proteins but have roles in gene regulation, such as
microRNAs and long non-coding RNAs.
• Epigenetic modifications are essential for normal development and differentiation
of cells and tissues, as well as for adaptation and response to environmental
changes.
• However, epigenetic modifications can also be involved in various diseases and
disorders, such as cancer, diabetes, obesity, neurological disorders, and
autoimmune diseases.
• Epigenetic modifications are potentially reversible and modifiable by
pharmacological or nutritional interventions. Therefore, epigenetics offers new
opportunities for diagnosis, prevention, and treatment of diseases.
