Chapter 3

Genes and Genetic Code

Abstract
The genetic code is read in triplets of nucleotides (codons), and there are sixty-four
codons. Because the codons outnumber the amino acids, most amino acids are specified
by more than one codon; this feature is called the degeneracy of the genetic code.
Genes may be dispersed or present in clusters/groups on a chromosome. These gene
clusters may be operons or multigene families. Operons are clusters of more than one
open reading frame that lie together and whose expression is controlled by the same
regulatory sequences, so the mRNA transcribed from an operon is a polycistronic
transcript with multiple open reading frames. The genes of a multigene family, by
contrast, are identical or similar to one another and are not regulated in a
synchronized way. This chapter explains genes, the genetic code and the biosynthetic
pathways of the 21st and 22nd amino acids.
Keywords: Selenocysteine, Pyrrolysine, Amino acid biosynthesis, Genetic code,
Gene

3.1 Introduction
To maintain life, the synthesis of biological products is necessary. The information
and instructions to synthesize these biological products are inherited through genes.
Recombination, mutation, transformation and function are the genetic attributes of a
gene. Genes consist of promoter sequences, segments of DNA that code for functional
products (viz. rRNA, tRNA and proteins), and terminators, along with other regulatory
sequences. Promoters and terminators are the regulatory sequences of genes that
determine their expression. Genes are separated from each other by non-coding
intergenic DNA sequences. Genes may be dispersed or present in clusters/groups on
chromosomes; these gene clusters may be either operons or multigene families. A gene
can be similar to other genes in a genome, and the degree of similarity varies from
gene to gene. Similar genes are categorized as a gene family. Genes within a family
may be paralogous (related genes within one genome that arose by duplication) or
orthologous (corresponding genes in different species that descend from a common
ancestral gene). On the basis of how similar their members are, multigene families
are categorized as simple multigene families and complex multigene families.


3.2 Operons
Operons are clusters of genes that lie together and whose expression is controlled
by the same regulatory sequences. In an operon, a set of genes shares a single
promoter and terminator (Fig 3.1); therefore, expression of these genes is regulated
in a highly synchronized way. Such genes code for functionally related proteins that
serve the same purpose. However, the genes of an operon are not identical and code
for different proteins, although these proteins/enzymes take part in the same
biological pathway (Watson et al. 2004; Nelson and Cox 2012).

Fig.3.1. An operon (Redrawn after alterations from Watson et al. 2004)

The mRNA transcribed from an operon is a polycistronic transcript containing
multiple open reading frames (ORFs). Short untranslated regions separate each open
reading frame from the next one in the same transcript. Operons are commonly found
in prokaryotes and lower eukaryotes. The presence of operons in some higher
eukaryotes has also been reported (Brown 1998; Petter 2002).
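
The idea of several open reading frames on one transcript can be made concrete with a small scanner. The following is a minimal sketch, not a gene-prediction method: it assumes that an ORF runs from an AUG to the first in-frame stop codon, that ORFs do not overlap, and it uses a made-up transcript and an arbitrary `min_codons` cutoff.

```python
START = "AUG"
STOPS = {"UAA", "UAG", "UGA"}


def find_orfs(mrna, min_codons=2):
    """Return half-open (start, end) spans of non-overlapping ORFs.

    An ORF here runs from an AUG to the first in-frame stop codon,
    stop codon included. min_codons is an illustrative cutoff.
    """
    orfs = []
    i = 0
    while i + 3 <= len(mrna):
        if mrna[i:i + 3] != START:
            i += 1
            continue
        end = None
        # Walk codon by codon from the start codon to the first stop.
        for j in range(i + 3, len(mrna) - 2, 3):
            if mrna[j:j + 3] in STOPS:
                end = j + 3
                break
        if end is not None and (end - i) // 3 >= min_codons:
            orfs.append((i, end))
            i = end  # resume scanning after this ORF
        else:
            i += 1
    return orfs


# Hypothetical transcript: two ORFs separated by a short untranslated spacer.
transcript = "GG" + "AUGGCUUAA" + "CCGG" + "AUGAAAUGGUGA" + "AA"
for start, end in find_orfs(transcript):
    print(start, end, transcript[start:end])
```

The two spans printed correspond to the two ORFs, and the bases between them play the role of the short untranslated spacer described above.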

3.3 Multigene families


Like operons, multigene families are clusters of genes, except that their genes are
similar or identical and are not regulated in a synchronized way. The genes of a
multigene family perform the same biological tasks. As described earlier, multigene
families are of two types: simple multigene families and complex multigene families.
In a simple multigene family, all genes of the cluster are the same, for example the
genes for rRNA. They are all transcribed as a single transcriptional unit, and
multiple copies of this transcriptional unit are present as clusters on the same
chromosome or on separate chromosomes. This may reflect a high demand for the gene
product, which led to the formation of multiple copies by gene duplication during
evolution. The genes of complex multigene families, in contrast, are not identical
but very similar. These genes encode similar polypeptides with similar functions.
The globin gene family is an example of a complex multigene family (Walsh and
Wolfgang 2001; Lodish et al. 2012).
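
The distinction between identical and merely similar family members can be illustrated by comparing sequences position by position. This is only a sketch under stated assumptions: the sequences are invented, they are assumed to be pre-aligned and of equal length, and the 70% cutoff for "similar" is a hypothetical threshold, not a value from the literature.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percentage of matching positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def classify_family(seqs, similar_cutoff=70.0):
    """Label a gene cluster as 'simple', 'complex' or 'unrelated'.

    similar_cutoff is a hypothetical threshold chosen for illustration.
    """
    scores = [percent_identity(a, b) for a, b in combinations(seqs, 2)]
    if all(s == 100.0 for s in scores):
        return "simple"      # all copies identical, as with rRNA genes
    if all(s >= similar_cutoff for s in scores):
        return "complex"     # similar but not identical, as with globins
    return "unrelated"


print(classify_family(["ATGGCCAAAG", "ATGGCCAAAG"]))
print(classify_family(["ATGGCCAAAG", "ATGGCTAAAG"]))
```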

3.4 Regulatory elements


Gene expression is regulated and controlled by various regulatory sequences and
regulatory genes. Regulatory elements are the molecular switches of genes that
encode any functional product. Promoters are regulatory sequences that act as
recognition sites for RNA polymerases and are also involved in regulating gene
expression. The promoter region is responsible for the initiation of transcription.
Some genes have more than one promoter site within the gene sequence, and RNAs of
different types or lengths are transcribed from such genes. Like promoters,
terminators are regulatory sequences; they are present at the end of genes and are
responsible for the termination of transcription. Operators, enhancers and silencers
are regulatory sequences that control the degree of gene expression. Regulatory
genes code for proteins that are involved in the regulation of gene expression.
These regulatory proteins are called trans-acting proteins and are expressed from
trans-acting sequences (regulatory gene sequences). They control gene expression
through activation and repression by binding to cis-acting sequences such as
operators (Brown 1998; Petter 2002; Nelson and Cox 2012).

3.5 Structural genes


Structural genes are DNA sequences that code for particular proteins, enzymes and
other functional products. In prokaryotes, coding portions are continuous and not
interrupted by non-coding introns, so most structural genes have no intronic
portions after transcription into RNA. In eukaryotes, by contrast, genes are
discontinuous: coding portions are interrupted by introns, and the RNAs become
functional after these introns are excised. The coding portions of genes are
commonly referred to as exons. The term exon is usually used for the regions that
code for polypeptides, but it is worth mentioning that the 5′ UTR (5′ untranslated
region) and 3′ UTR (3′ untranslated region) also consist of exons (wholly or
partially) (Brown 1998; Petter 2002; Nelson and Cox 2012). Likewise, some
transcripts have exons but do not code for any protein. It is therefore not
appropriate to define an exon as the coding portion of a gene; the specific portion
of the exons that encodes a protein is more precisely called the coding sequence
(CDS).
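
The difference between exons and the protein-coding portion can be shown on a toy transcript. The sketch below assumes a hypothetical pre-mRNA with known exon coordinates; it joins the exons and then splits the spliced RNA into 5′ UTR, coding sequence and 3′ UTR, taking the first AUG as the start codon. Real splice-site recognition and start-codon selection are more involved.

```python
STOPS = {"UAA", "UAG", "UGA"}


def splice(pre_mrna, exons):
    """Join exon segments given as half-open (start, end) intervals."""
    return "".join(pre_mrna[s:e] for s, e in exons)


def split_utr_cds(mrna):
    """Split a spliced mRNA into (5' UTR, coding sequence, 3' UTR).

    Assumes translation starts at the first AUG and ends at the first
    in-frame stop codon; the stop codon is kept with the coding sequence.
    """
    start = mrna.find("AUG")
    if start == -1:
        return mrna, "", ""          # exons only, no protein-coding part
    for j in range(start, len(mrna) - 2, 3):
        if mrna[j:j + 3] in STOPS:
            return mrna[:start], mrna[start:j + 3], mrna[j + 3:]
    return mrna[:start], mrna[start:], ""


# Hypothetical gene: exon 1, one intron, exon 2.
exon1, intron, exon2 = "GGACAUGGC", "GUAAGUUUUCAG", "UUGGUAACCAAA"
pre_mrna = exon1 + intron + exon2
mature = splice(pre_mrna, [(0, 9), (21, 33)])
print(split_utr_cds(mature))
```

Both exons contribute untranslated bases here, which is the point made in the text: exon and coding region are not synonyms.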

3.6 Genetic code


A genetic code (codon) is a triplet of nucleotides that designates an amino acid.
There are sixty-four (64) codons, whereas twenty-two (22) proteinogenic amino acids
have been reported (Table 3.1). Hence most amino acids are coded by more than one
codon; only methionine (AUG) and tryptophan (UGG) have a single codon. This
characteristic is called the degeneracy of the genetic code. The codons of a gene
determine the sequence of amino acids in a polypeptide chain. The different codons
for the same amino acid (synonymous codons) usually vary at the third nucleotide or
base; this constitutes third-base degeneracy, and the position is called the wobble
position. Of the sixty-four codons, 61 code for the twenty standard amino acids
while the remaining three are stop codons/nonsense codons (UAA, UAG and UGA)
(Rothwell 1988; Petter 2002). There are a few exceptions, however: selenocysteine
and pyrrolysine (the 21st and 22nd amino acids respectively) are coded by stop
codons. Both are proteinogenic amino acids because they are incorporated into the
polypeptide chain during translation. Interestingly, the codon UGA, a stop codon
referred to as the umber/opal codon, is also the codon for selenocysteine. This
shows that some features trigger the translational machinery to continue and
incorporate selenocysteine into the polypeptide chain of a selenoprotein instead of
stopping translation. A stem-loop structure in the 3′ UTR is believed to be
important for the UGA codon to be read as the code for selenocysteine (SeC).
Similarly, the twenty-second amino acid, pyrrolysine, is coded by another stop
codon, UAG (amber) (Mueller 2009; Yuan et al. 2010; Quitterer et al. 2012).
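
The dual meaning of UGA and UAG can be sketched as a translation routine with an optional recoding map. This is a deliberate simplification: in the cell, recoding depends on the mRNA context (such as the stem-loop mentioned above) and on dedicated factors, whereas here it is reduced to a dictionary passed in by the caller. The one-letter symbols U (selenocysteine) and O (pyrrolysine) are the standard ones; the example sequences are invented.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna, recode=None):
    """Translate in frame from position 0 until the first stop codon.

    recode maps stop codons to amino acids, e.g. {"UGA": "U"}; it is a
    stand-in for the context-dependent recoding described in the text.
    """
    recode = recode or {}
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        aa = recode.get(codon, CODON_TABLE[codon])
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


message = "AUGUGAGGCUAA"
print(translate(message))                       # UGA read as stop
print(translate(message, recode={"UGA": "U"}))  # UGA read as selenocysteine
```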

Table 3.1 Genetic codes

First    Second base                                                  Third
base     U              C              A              G               base
U        UUU Phe        UCU Ser        UAU Tyr        UGU Cys         U
         UUC Phe        UCC Ser        UAC Tyr        UGC Cys         C
         UUA Leu        UCA Ser        UAA Stop       UGA Stop/SeC    A
         UUG Leu        UCG Ser        UAG Stop/Pyr   UGG Trp         G
C        CUU Leu        CCU Pro        CAU His        CGU Arg         U
         CUC Leu        CCC Pro        CAC His        CGC Arg         C
         CUA Leu        CCA Pro        CAA Gln        CGA Arg         A
         CUG Leu        CCG Pro        CAG Gln        CGG Arg         G
A        AUU Ile        ACU Thr        AAU Asn        AGU Ser         U
         AUC Ile        ACC Thr        AAC Asn        AGC Ser         C
         AUA Ile        ACA Thr        AAA Lys        AGA Arg         A
         AUG Met        ACG Thr        AAG Lys        AGG Arg         G
G        GUU Val        GCU Ala        GAU Asp        GGU Gly         U
         GUC Val        GCC Ala        GAC Asp        GGC Gly         C
         GUA Val        GCA Ala        GAA Glu        GGA Gly         A
         GUG Val        GCG Ala        GAG Glu        GGG Gly         G

Source: Watson et al. (2004)

Phe: Phenylalanine    Cys: Cysteine
Ser: Serine           Thr: Threonine
Tyr: Tyrosine         Ile: Isoleucine
Leu: Leucine          Met: Methionine
Pro: Proline          Gly: Glycine
His: Histidine        Lys: Lysine
Gln: Glutamine        Asn: Asparagine
Val: Valine           Asp: Aspartic acid
Ala: Alanine          Glu: Glutamic acid
Trp: Tryptophan       SeC: Selenocysteine
Arg: Arginine         Pyr: Pyrrolysine
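
The degeneracy visible in Table 3.1 can be counted directly. The sketch below rebuilds the standard code (stop codons treated as stops, without the SeC/Pyr readings) from a compact one-letter string, counts the codons per amino acid, and lists the two-base prefixes for which the third (wobble) base makes no difference. The variable names are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of codons assigned to each amino acid ('*' marks a stop codon).
codons_per_aa = Counter(CODON_TABLE.values())
sense_codons = len(CODON_TABLE) - codons_per_aa["*"]
single_codon = sorted(aa for aa, n in codons_per_aa.items() if n == 1)

# Two-base prefixes whose four codons all specify the same amino acid,
# i.e. the third base can vary freely.
fourfold = [
    b1 + b2
    for b1, b2 in product(BASES, repeat=2)
    if len({CODON_TABLE[b1 + b2 + b3] for b3 in BASES}) == 1
]

print(sense_codons, codons_per_aa["*"])
print(single_codon)
print(fourfold)
```

Only methionine (M) and tryptophan (W) come out with a single codon, while leucine, serine and arginine have six each.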

3.7 Amino acids and their biosynthetic pathways


Functional diversity among proteins arises from three facets of their structure:
the amino acid sequences of their polypeptide chains, their post-translational
modifications and their folding patterns. Of these three, only the amino acid
sequence of a polypeptide chain is specified directly by the genetic code. It was
commonly said that there are twenty (20) kinds of proteinogenic amino acids (amino
acids that are condensed into polypeptide chains during translation by the cellular
machinery), but two more amino acids, selenocysteine and pyrrolysine, have now been
reported (Ibba and Söll 2002; Aeby et al. 2009). The genetic code can thus
accommodate the recoding and incorporation of these 22 proteinogenic amino acids
from the natural set of protein building blocks during translation.
The biosynthetic pathways of these amino acids differ and are categorized into six
groups on the basis of their main metabolic precursors. The major precursors are
pyruvate, oxaloacetate, α-ketoglutarate, phosphoribosyl pyrophosphate,
phosphoenolpyruvate (PEP) and 3-phosphoglycerate. Among the aromatic amino acids,
tryptophan and phenylalanine are synthesized from phosphoenolpyruvate (PEP) +
erythrose-4-phosphate, while tyrosine is biosynthesized from phenylalanine (Fig 3.2).
Alanine and the branched-chain amino acids valine and leucine arise from pyruvate
(Fig 3.3).

Fig. 3.2 Biosynthesis of aromatic amino acids (Conceived from Miles 2003)

Fig. 3.3 Biosynthesis of branched amino acids (Conceived from Miles 2003)

Glutamic acid originates from α-ketoglutarate and in turn gives rise to glutamine,
arginine and proline (Fig 3.4), whereas phosphoribosyl pyrophosphate is the only
precursor of histidine (Fig 3.5).

Fig. 3.4 Biosynthesis of glutamic acid/glutamate, glutamine, arginine and proline
amino acids (Conceived from Miles 2003)

Fig. 3.5 Biosynthesis of histidine amino acids (Conceived from Miles 2003)

The metabolic precursor oxaloacetate is the source of aspartic acid (aspartate),
from which lysine, asparagine, threonine and methionine arise (Fig 3.6). Pyrrolysine
is derived from lysine, and isoleucine is biosynthesized from threonine. The
metabolic precursor 3-phosphoglycerate is involved in the biosynthesis of serine,
which serves as the precursor of cysteine, selenocysteine and glycine (Brown 1998;
Miles 2003) (Fig 3.7).
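
The six precursor families described in this section can be summarized as a small lookup structure. The sketch below records, for each amino acid, the compound it is derived from according to the text of this chapter, and traces any amino acid back to its metabolic precursor. It is a bookkeeping aid only: intermediates, enzymes and organism-specific differences are left out, and the names `PARENT`, `PRECURSORS` and `lineage` are illustrative.

```python
# Child -> immediate source, as stated in the chapter text.
PARENT = {
    "alanine": "pyruvate",
    "valine": "pyruvate",
    "leucine": "pyruvate",
    "aspartate": "oxaloacetate",
    "lysine": "aspartate",
    "asparagine": "aspartate",
    "threonine": "aspartate",
    "methionine": "aspartate",
    "pyrrolysine": "lysine",
    "isoleucine": "threonine",
    "glutamate": "alpha-ketoglutarate",
    "glutamine": "glutamate",
    "arginine": "glutamate",
    "proline": "glutamate",
    "histidine": "phosphoribosyl pyrophosphate",
    "tryptophan": "PEP + erythrose-4-phosphate",
    "phenylalanine": "PEP + erythrose-4-phosphate",
    "tyrosine": "phenylalanine",
    "serine": "3-phosphoglycerate",
    "cysteine": "serine",
    "selenocysteine": "serine",
    "glycine": "serine",
}

PRECURSORS = {
    "pyruvate",
    "oxaloacetate",
    "alpha-ketoglutarate",
    "phosphoribosyl pyrophosphate",
    "PEP + erythrose-4-phosphate",
    "3-phosphoglycerate",
}


def lineage(amino_acid):
    """Trace an amino acid back to its major metabolic precursor."""
    chain = [amino_acid]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


print(len(PARENT))                 # number of proteinogenic amino acids listed
print(lineage("pyrrolysine"))
print(lineage("selenocysteine"))
```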

Fig. 3.6 Biosynthesis of aspartic acid/aspartate, lysine, asparagine, threonine,
methionine, pyrrolysine and isoleucine amino acids (Conceived from Miles 2003)

Figure key: major precursors of all amino acids; amino acids that are precursors of
other amino acids; other amino acids

Fig. 3.7 Biosynthesis of cysteine, selenocysteine and glycine amino acids
(Conceived from Miles 2003)

3.7.1 Selenocysteine: a 21st amino acid


Selenocysteine is a cysteine analogue and is present in selenoproteins. In
selenocysteine, the sulfur present in cysteine is replaced by selenium (Fig 3.8).

Fig. 3.8 Structures of Cysteine and Selenocysteine



In prokaryotes, the Sel operon (Fig 3.9) is a prerequisite for the insertion of
selenocysteine (SeC) into a growing polypeptide chain. The Sel operon consists of
four genes (Sel A, Sel B, Sel C and Sel D). The products of these four genes are
essential for the synthesis of the tRNA and its charging with selenocysteine. The
Sel C gene codes for the tRNA specific for selenocysteine. Sel B encodes a
translation factor that is fundamentally similar to EF-Tu and is responsible for the
entry of selenocysteinyl-tRNASeC into the aminoacyl site (A site) of the ribosome
during polypeptide chain synthesis (Yuan et al. 2006; Aeby et al. 2009).

Fig. 3.9 Sel operon (Conceived from Yuan et al. 2010)

Selenocysteine is not attached directly to its tRNASeC, as other amino acids are to
their canonical tRNAs; instead it is synthesized through a tRNA-dependent conversion
of serine. First, tRNASeC is charged with serine by seryl-tRNA synthetase. The
product of Sel A converts the seryl moiety into dehydroalanine (a more activated
compound than serine), and the product of Sel D converts selenium into
selenophosphate (a high-energy compound, more activated than selenium). These two
compounds combine to form selenocysteinyl-tRNASeC. The conversion of Ser-tRNASeC into
SeC-tRNASeC is mediated by a specialized enzyme, selenocysteine synthase (SeC
synthase), which is the product of Sel A. This reaction takes place in the presence
of selenophosphate, the donor of selenium (Yuan et al. 2006; Kossinova 2011).
Similarly, eukaryotes have a stem-loop structure that causes UGA to be read as
selenocysteine and prevents translation termination when a UGA codon is encountered.
This cis-acting stem-loop structure is called the SeC insertion sequence (SECIS)
element. In E. coli, SECIS is located immediately downstream of the UGA codon in the
mRNA of the selenoprotein, whereas in eukaryotes this element is present in the
3′ UTR (untranslated region) of the selenoprotein mRNA. In eukaryotes, two factors
are important for selenocysteine incorporation: one is the SECIS-binding protein
known as SBP2, and the other is the translation factor that facilitates binding of
tRNASeC to the ribosome, called eEFSeC (Aeby et al. 2009; Mueller 2009).

3.7.2 Biosynthesis of Selenocysteine


Selenoproteins have selenocysteine residues in their catalytic centres. Unlike
other amino acids, selenocysteine (SeC) is biosynthesized on its cognate tRNA.
Thus a tRNA-dependent pathway is required for the synthesis of selenocysteine, along
with recoding machinery for its incorporation into the growing polypeptide chain.
The biological machinery involved in synthesizing selenocysteine and incorporating
it into selenoproteins differs among the domains of life. In E. coli, the
biosynthesis of selenocysteine begins when seryl-tRNA synthetase charges tRNASeC
with serine to form seryl-tRNASeC (Ser-tRNASeC). After this aminoacylation,
selenocysteine synthase (SeC synthase) interacts with Ser-tRNASeC and removes the
hydroxyl group from the seryl moiety (Aeby et al. 2009; Mueller 2009). The resulting
intermediate, dehydroalanyl-tRNASeC, accepts selenium from selenophosphate (the
selenium donor) to form selenocysteinyl-tRNASeC (Fig 3.10). In bacteria, the
transformation of Ser-tRNASeC into SeC-tRNASeC occurs in a pyridoxal-5′-phosphate
dependent reaction catalyzed by SeC synthase (the selA product). In eukaryotes there
is an extra step, in which Ser-tRNASeC is phosphorylated by O-phosphoseryl-tRNA
kinase (PSTK) in the presence of ATP and magnesium (Mg2+) to form
O-phosphoseryl-tRNASeC (Sep-tRNASeC). Sep-tRNASeC is subsequently transformed into
SeC-tRNASeC by the enzyme Sep-tRNA:SeC-tRNA synthase (SepSecS). Like SeC synthase,
SepSecS is pyridoxal-5′-phosphate dependent. For this reaction, selenophosphate
(the Se donor) is formed by selenophosphate synthase (SPS2). There is an important
distinction between the chemistry of the bacterial and eukaryotic synthases: the
tRNASeC-bound substrate of the eukaryotic synthase is phosphoserine (Sep), which
releases a phosphate group, whereas the substrate of the bacterial synthase in
E. coli is serine, which eliminates a water molecule (Yuan et al. 2006; Yuan et al.
2010).
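
The bacterial and eukaryotic routes just described can be laid side by side as ordered steps. The following is a bookkeeping sketch only: the step and enzyme labels are taken from the text above, cofactors are omitted, and the `Step` type and `run` helper are hypothetical names introduced for illustration. The helper simply checks that each step consumes the product of the previous one.

```python
from typing import NamedTuple


class Step(NamedTuple):
    substrate: str
    enzyme: str
    product: str


# Shared first step: charging of tRNASeC with serine.
CHARGING = Step("tRNASeC + Ser", "seryl-tRNA synthetase", "Ser-tRNASeC")

BACTERIAL = [
    CHARGING,
    Step("Ser-tRNASeC", "SeC synthase (SelA)", "dehydroalanyl-tRNASeC"),
    Step("dehydroalanyl-tRNASeC", "SeC synthase (SelA)", "SeC-tRNASeC"),
]

EUKARYOTIC = [
    CHARGING,
    Step("Ser-tRNASeC", "PSTK", "Sep-tRNASeC"),
    Step("Sep-tRNASeC", "SepSecS", "SeC-tRNASeC"),
]


def run(pathway):
    """Follow a pathway, checking each step consumes the previous product."""
    current = pathway[0].substrate
    for step in pathway:
        if step.substrate != current:
            raise ValueError(f"{step.enzyme} cannot act on {current}")
        current = step.product
    return current


for name, pathway in (("bacterial", BACTERIAL), ("eukaryotic", EUKARYOTIC)):
    enzymes = " -> ".join(step.enzyme for step in pathway)
    print(name, ":", enzymes, "=>", run(pathway))
```

Both routes start from the same charged tRNA and end at the same product; they differ in the intermediate (dehydroalanyl versus phosphoseryl) and in the enzymes used.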
Delivery of SeC-tRNASeC to the ribosome is mediated by a specialized translation
factor that shifts the reading of UGA from a stop codon to a SeC sense codon. A
characteristic stem-loop structure in the mRNA is required for this translation
factor to decode UGA as selenocysteine. The combination of a canonical codon with a
unique aminoacyl-tRNA synthetase:tRNA pair leaves room for incorporating a variety
of unusual amino acids in both prokaryotic and eukaryotic expression systems (Yuan
et al. 2010; Kossinova 2011).

3.7.3 Pyrrolysine: a 22nd amino acid


Pyrrolysine, the 22nd amino acid, is an integral component of pathways involved in
methane formation (methanogenesis). Its biosynthetic pathway is not yet fully clear,
and different pathways have been proposed. Recently, a biosynthetic pathway has been
proposed in which two molecules of lysine are involved. In contrast to
selenocysteine, pyrrolysine biosynthesis is tRNA-independent and involves
biosynthetic machinery encoded by the Pyl operon (PylTSBCD). The amber (UAG)
suppressor tRNA is encoded by PylT, whereas pyrrolysyl-tRNA synthetase (PylRS) is
the product of PylS. In bacteria the PylS gene is split: it encodes the C-terminal
domain of PylS, whereas a separate gene, PylSn, encodes the N-terminal domain.
Pyrrolysine is aminoacylated directly onto its cognate tRNAPyl in a reaction
catalyzed by PylRS and is incorporated into protein without the aid of complex
biochemical machinery (Ibba and Söll 2002; Fekner and Chan 2011). Like
selenocysteine, the incorporation of pyrrolysine into a growing polypeptide chain
also requires a pyrrolysine insertion sequence (PYLIS) that forms a hairpin-loop-like
structure in the respective mRNAs.

Fig. 3.10 Biosynthesis of selenocysteine (Redrawn after modification from Yuan et
al. 2010)

Fig. 3.11 Proposed scheme of pyrrolysine biosynthetic pathway (Conceived from
Gaston et al. 2011)

References
Aeby, E., P. Sotiria, P. Mascha, M. Janine, L. Allyson, U. Elisabetta, S. Dieter and
S. André (2009). The canonical pathway for selenocysteine insertion is
dispensable in Trypanosomes. Proc Natl Acad Sci. 106:5088–5092.
Brown, T.A. (1998). Genetics: A Molecular Approach. 3rd Edition. Chapman and
Hall, London, UK.
Fekner, T. and M.K. Chan (2011). The pyrrolysine translational machinery as a
genetic-code expansion tool. Curr Opin Chem Biol. 15:387–391.
Gaston, A.M., L. Zhang, K.B. Green-Church and J.A. Krzycki (2011). Proposed
pathway of pyrrolysine biosynthesis from two molecules of lysine by the
products of pylB, pylC and pylD. Nature 471:647-650.
Ibba, M. and D. Söll (2002). Genetic code: introducing pyrrolysine. Curr Biol.
12:464-466.
Kossinova, O. (2011). Insights into the Selenocysteine Incorporation Mechanism in
Mammals. PhD Dissertation, The University of Strasbourg, Strasbourg, France.
Lodish, H.F., A. Berk, C. Kaiser, M. Krieger, A. Bretscher, H. Ploegh, A. Amon and
M. Scott (2012). Molecular Cell Biology. 7th Edition. W.H. Freeman and
Company, N.Y., USA.
Marsha, A.G., Z. Liwen, B.G. Kari and A.K. Joseph (2011). The complete
biosynthesis of the genetically encoded amino acid pyrrolysine from lysine.
Nature 471:647–650.
Miles, B. (2003). Biosynthesis of amino acids.
http://www.tamu.edu/faculty/bmiles/lectures/biosynaa.pdf. Accessed on 22 April 2016.
Mueller, E.G. (2009). Se-ing into selenocysteine biosynthesis. Nat Chem Biol.
5:611-612.
Nelson, D.L. and M.M. Cox (2012). Lehninger Principles of Biochemistry. 6th
Edition. W.H. Freeman and Company, N.Y., USA.
Petter, P. (2002). Historical development of the concept of the gene. J Med Philos.
27:257-86.
Quitterer, F., A. List, P. Beck, A. Bacher and M. Groll (2012). Biosynthesis of the
22nd genetically encoded amino acid pyrrolysine: structure and reaction
mechanism of PylC at 1.5Å resolution. J Mol Biol. 424:270-82.
Rothwell, N.V. (1988). Understanding Genetics. 4th Edition. Oxford University
Press, N.Y., USA.
Walsh, J.B. and S. Wolfgang (2001). Multigene Families: Evolution. In:
Encyclopedia of Life Sciences. John Wiley and Sons, New Jersey, USA.
Watson, J.D., T.A. Baker, S.P. Bell, A. Gann, M. Levine and R. Losick (2004).
Molecular Biology of the Gene. 5th Edition. Pearson Education, CA., USA.
Yuan, J., P. Sotiria, C.S. Juan, S. Dan, O.D. Patrick, J.H. Michael, M.C. Alexander,
B.W. William and S. Dieter (2006). RNA-dependent conversion of
phosphoserine forms selenocysteine in eukaryotes and archaea. Proc Natl Acad
Sci. 103:18923–18927.
Yuan, J., O’D. Patrick, A. Alex, G. Sarath, S. Lynn, P. Sotiria, S. Miljan and S. Dieter
(2010). Distinct genetic code expansion strategies for selenocysteine and
pyrrolysine are reflected in different aminoacyl-tRNA formation systems. FEBS
Lett. 584:342–349.
