
MAM 5108: Microbial Bioinformatics

Dr. Pasan Fernando

Practical 1: Biological Databases


In this practical class, you will learn how to search popular biological databases to retrieve
information. Today you will work with the delta-endotoxin genes of the bacterium Bacillus
thuringiensis (Bt). Bt is widely used as a biological control agent in agriculture because it
produces toxins that act as natural insecticides. For the same reason, Bt delta-endotoxin genes
are expressed in transgenic crop plants to confer insect resistance. These Bt transgenic crops
are now cultivated on over 32 million hectares around the world. For instance, the cry1Ab gene,
which you will use in this practical class, is widely used to produce transgenic corn, rice,
brinjal, tomato, potato, cotton, and sugarcane that resist several harmful insects.

1. Using the National Center for Biotechnology Information (NCBI) GenBank to find
information on nucleotide sequences (15 marks)

Go to the NCBI GenBank by searching Google or using the following link:

https://www.ncbi.nlm.nih.gov/genbank/

Then, search for “5.3 class delta-endotoxin gene” using the search bar. This will result in
multiple hits. Filter the results by selecting Bacillus thuringiensis as the organism in the
“Top organisms” list on the right side. Finally, select the best hit for your search query
based on the name (usually the top hit, though this can vary).
Answer the following questions based on the best hit record.
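
The same search can also be expressed programmatically. The sketch below only builds a query URL for the NCBI E-utilities ESearch endpoint and does not contact NCBI. It assumes that the GenBank search bar corresponds to the Nucleotide database (db="nuccore") and that the organism filter can be written with the [Organism] field tag; the function name build_esearch_url is hypothetical and not part of the practical.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(term, organism=None, db="nuccore", retmax=5):
    """Return an ESearch URL for the given query (no network access).

    Assumption: db="nuccore" stands in for the GenBank nucleotide search.
    """
    query = term
    if organism:
        # Restrict hits to one organism, like the "Top organisms" filter.
        query = f'({term}) AND "{organism}"[Organism]'
    params = {"db": db, "term": query, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


url = build_esearch_url(
    "5.3 class delta-endotoxin gene",
    organism="Bacillus thuringiensis",
)
print(url)
```

Opening the printed URL in a browser should return a list of matching record identifiers, which you can compare with the hits shown on the website.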

a. What are the GenBank accession and the version for the record? (2 marks)

Accession: M37263; version: M37263.1
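
A versioned identifier such as M37263.1 combines the accession (the part before the dot) with a version number (the part after it). As a minimal sketch, assuming the identifier always has the form accession.version, the two parts can be separated like this; split_accession_version is a hypothetical helper name.

```python
def split_accession_version(identifier):
    """Split 'ACCESSION.VERSION' into (accession, version number)."""
    accession, sep, version = identifier.strip().partition(".")
    if not sep or not accession or not version.isdigit():
        raise ValueError(f"not a versioned accession: {identifier!r}")
    return accession, int(version)


# Identifiers taken from this practical.
print(split_accession_version("M37263.1"))
print(split_accession_version("NZ_CM000753.1"))
```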

b. What are the length and the type of the nucleotide sequence? (2 marks)

3550 bp; linear DNA

c. Give the title of the main reference for this record. (1 mark)

Sequence of a lepidopteran toxin gene of Bacillus thuringiensis subsp kurstaki NRD-12

d. What is the taxon identifier (ID) for Bacillus thuringiensis? (1 mark)

1428


e. What are the sequence coordinates for the coding sequence? (1 mark)

f. What is the NCBI protein ID of the protein encoded by the nucleotide sequence?
(1 mark)

AAA22420.1

g. Access the corresponding protein record by clicking on the protein ID. What is
the length of the amino acid sequence according to the protein record? (1 mark)
1155

h. Give the names and the amino acid sequence coordinates of three distinct
regions of the protein. (6 marks)

48..251
/region_name="Endotoxin_N"
259..460
/region_name="Endotoxin_M"

463..606
/region_name="delta_endotoxin_C"
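
The region coordinates in 1(h) use the GenBank start..end notation, where both positions are inclusive, so a region's length is end - start + 1. The sketch below parses the three region entries listed above and computes their lengths; the parser is a simplified illustration for this layout only (a coordinate line followed by a /region_name qualifier), not a general GenBank feature-table parser.

```python
import re

# Region entries copied from the answer to question 1(h).
FEATURES = """\
48..251
/region_name="Endotoxin_N"
259..460
/region_name="Endotoxin_M"
463..606
/region_name="delta_endotoxin_C"
"""


def parse_regions(text):
    """Return a list of (name, start, end, length) tuples."""
    regions = []
    span = None
    for line in text.splitlines():
        line = line.strip()
        coords = re.fullmatch(r"(\d+)\.\.(\d+)", line)
        if coords:
            # Remember the coordinates until the matching name arrives.
            span = (int(coords.group(1)), int(coords.group(2)))
            continue
        name = re.fullmatch(r'/region_name="([^"]+)"', line)
        if name and span:
            start, end = span
            # Both ends are inclusive, hence the + 1.
            regions.append((name.group(1), start, end, end - start + 1))
            span = None
    return regions


for name, start, end, length in parse_regions(FEATURES):
    print(f"{name}: {start}..{end} ({length} aa)")
```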

2. Using the NCBI Gene database to find information about genes (15 marks)

Go to the NCBI Gene database by performing a Google search or clicking on the
following link: https://www.ncbi.nlm.nih.gov/gene

Now, search for “endotoxin” in the search bar. This will result in multiple hits. Filter the
results by selecting Bacillus thuringiensis as the organism in the “Top organisms” list on
the right side. Then, click on the “BTHUR0008_RS29305” gene and access the gene
record.


a. What are the Gene ID and gene symbol for this record? (2 marks)

Gene ID: 67470296
Gene symbol: BTHUR0008_RS29305

b. What is the gene type of this particular gene? (1 mark)

protein coding

c. What is the bacterial strain of the gene record? (1 mark)

Bacillus thuringiensis serovar berliner ATCC 10792 (strain: ATCC 10792, serovar:
berliner, culture-collection: ATCC:10792, type-material: type strain of Bacillus
thuringiensis)

d. Give the gene symbols of two adjacent genes to this particular gene in its
genome. (2 marks)
BTHUR0008_RS3434
BTHUR0008_RS29310

e. Give the titles of two publications related to this gene. (2 marks)

• Hetero-oligomerization of Bacillus thuringiensis Cry1A proteins enhance
  binding to the ABCC2 transporter of Spodoptera exigua
• Bacillus thuringiensis and its pesticidal crystal proteins

f. Access the RefSeq genomic sequence for the gene. What is the GenBank
accession number of this RefSeq genomic sequence (with the sequence region)?
What is the original sequence record corresponding to this genomic sequence?
(4 marks)

NCBI Reference Sequence: NZ_CM000753.1


NZ_CM000753

g. List two reasons for preferring the NCBI Gene database over the NCBI GenBank
when retrieving gene sequences. (2 marks)

h. What is the UniProtKB/Swiss-Prot ID for the gene? (1 mark)


P0A373


3. Using the Universal Protein Resource (UniProt) KnowledgeBase (KB) to find
information about proteins (20 marks)

Go to the UniProtKB by performing a Google search or clicking on the following link:

https://www.uniprot.org/

Now, use the UniProtKB ID you found in question 2(h) to find the corresponding protein
record and answer the following questions based on the record.
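
A UniProtKB record can also be addressed directly by its accession. The sketch below validates an accession against the pattern UniProt documents for accession numbers and builds the address of the entry on the UniProt REST service (rest.uniprot.org); it does not download anything. The function name uniprot_entry_url and the default "json" format are illustrative assumptions.

```python
import re

UNIPROT_REST = "https://rest.uniprot.org/uniprotkb"

# Pattern for UniProtKB accession numbers as documented by UniProt.
ACCESSION_RE = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def uniprot_entry_url(accession, fmt="json"):
    """Return the REST URL of one UniProtKB entry (no network access)."""
    accession = accession.strip().upper()
    if not ACCESSION_RE.fullmatch(accession):
        raise ValueError(f"not a UniProtKB accession: {accession!r}")
    return f"{UNIPROT_REST}/{accession}.{fmt}"


# Accession found in question 2(h).
print(uniprot_entry_url("P0A373"))
print(uniprot_entry_url("P0A373", fmt="fasta"))
```

Requesting the "fasta" form of the address should return the amino acid sequence, which is convenient when the sequence is needed for later analysis.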

a. What are the protein and gene names as given in the record? (2 marks)

Protein name: Pesticidal crystal protein Cry1Ba
Gene name: cry1Ba

b. What is the function of this particular protein? (1 mark)


Promotes colloidosmotic lysis by binding to the midgut epithelial cells of
insects.

c. Give one molecular function Gene Ontology (GO) term and one biological
process GO term associated with this protein. (2 marks)

Molecular function: signaling receptor binding
Biological process: killing by symbiont of host cells

d. Give two alternative names for this protein. (2 marks)

140 kDa crystal protein


Crystaline entomocidal protoxin

e. Is a 3-dimensional structure available for this protein? If it is, what is the source
of the structure? (2 marks)
AlphaFoldDB

f. What is the protein family that contains this particular protein? (1 mark)


delta endotoxin family

g. During which developmental stage of the organism is this protein produced? (1 mark)

The crystal protein is produced during sporulation and is accumulated both as
an inclusion and as part of the spore coat.

h. Give sequence coordinates of a region where polar amino acid residues are
overrepresented in this protein sequence. (1 mark)

423-439

i. Give the accession numbers of the European Nucleotide Archive (ENA) and the
Protein Information Resource (PIR) databases for this record. (2 marks)
ENA: CP004134
PIR: S00873

j. Give the UniProt IDs and the organism names for two similar proteins which
have 100% identity with this protein sequence. (2 marks)
M1QWV7_BACTU
Bacillus thuringiensis serovar thuringiensis str. IS5056

A3RLZ7_BACTU
Bacillus thuringiensis

k. Access the Pfam entry for this particular protein using the cross-references
section. According to the Pfam record, list four distinct Pfam domains found in
this protein. (4 marks)

Cry1Ac_D5
Endotoxin_C
Endotoxin_C2
Endotoxin_M
