
Bioinformatic Practice

This document contains instructions for a series of exercises exploring various genomes and genes using the Ensembl genome browser. The exercises involve finding information about specific genes and genomes, such as the assembly, size, location, transcripts and associated phenotypes. The student is asked to provide short answers to questions about the genomes of the giant panda, zebrafish, mosquitoes and a bacterial species, and about the human MYH9 and PAH genes. Additional exercises explore human population genetics and phenotypes, different human genome assemblies, and a region of the Coprinopsis cinerea genome.

Uploaded by

yungiang157

Name: …………………..

Class: …………………..

Student-ID: …………….

Introduction to Ensembl

https://www.youtube.com/watch?time_continue=1442&v=lA2xq3YkWko

Exercise 1 – Panda

(a) Go to the species homepage for Giant Panda. What is the name of
the genome assembly for Panda?

Put your answer here

(b) Click on More information and statistics. How long is the Panda
genome (in bp)? How many coding genes have been annotated?

Put your answer here
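
The answers to Exercise 1 can be cross-checked outside the browser. The sketch below assumes the public Ensembl REST server at https://rest.ensembl.org and its `info/assembly` endpoint, and it assumes that `ailuropoda_melanoleuca` is the species name used for Giant Panda. The helper names (`assembly_info_url`, `fetch_json`, `total_top_level_length`) are made up for this illustration, and the summed length may not match the figure shown on the statistics page exactly.

```python
import json
import urllib.request

REST_SERVER = "https://rest.ensembl.org"  # assumed public REST server


def assembly_info_url(species):
    """URL of the REST info/assembly endpoint for one species."""
    return f"{REST_SERVER}/info/assembly/{species}"


def fetch_json(url):
    """Download a JSON document; needs network access."""
    request = urllib.request.Request(url, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def total_top_level_length(info):
    """Sum the lengths of all top-level sequences in an info/assembly reply."""
    return sum(region["length"] for region in info.get("top_level_region", []))


def main():
    # Assumption: this is the species name Ensembl uses for Giant Panda.
    info = fetch_json(assembly_info_url("ailuropoda_melanoleuca"))
    print("assembly:", info.get("assembly_name"))
    print("top-level bp:", total_top_level_length(info))
```

Calling `main()` with a network connection prints the assembly name and the summed sequence length; the number of coding genes is still easiest to read from the More information and statistics page.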

Exercise 2 – Zebrafish

What previous assemblies are available for zebrafish?

Exercise 3 – Mosquitoes

(a) Go to Ensembl Metazoa. How many species of the genus Anopheles
are represented in Ensembl Metazoa?

(b) When was the current Anopheles gambiae genome assembly last
revised?

Exercise 4 – Bacteria
Go to Ensembl Bacteria and find the species Belliella baltica. How
many coding and non-coding genes does it have?

Exercise 5 – Exploring the human MYH9 gene

(a) Find the human MYH9 (myosin, heavy chain 9, non-muscle) gene,
and go to the Gene tab.

• On which chromosome and which strand of the genome is this gene
located?
• How many transcripts (splice variants) are there and how many are
protein coding?
• What is the longest transcript that codes for protein, and how long is
the protein it encodes?

(b) Click on Phenotype at the left side of the page. Are there any
diseases associated with this gene, according to OMIM (Online
Mendelian Inheritance in Man)?

(c) In the transcript table, click on the transcript ID for MYH9-201, and
go to the Transcript tab.

• How many exons does it have?
• Are any of the exons completely or partially untranslated?

Exercise 6 – Finding a gene associated with a phenotype

Phenylketonuria is a genetic disorder caused by an inability to
metabolise phenylalanine in any body tissue. This results in an
accumulation of phenylalanine causing seizures and mental
retardation.

(a) Search for phenylketonuria from the Ensembl homepage and
narrow down your search to only genes. What gene is associated with
this disorder?

(b) How many protein coding transcripts does this gene have? View all
of these in the transcript comparison view.
(c) What is the OMIM gene identifier for this gene?
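
The gene-level questions in Exercises 5 and 6 (chromosome, strand, number of transcripts, exons per transcript) can also be answered from a single record. The sketch below assumes the Ensembl REST `lookup/symbol` endpoint with `expand=1`, and it assumes the reply contains the fields `seq_region_name`, `strand`, `Transcript`, `biotype` and `Exon`. The function names are hypothetical, chosen only for this example.

```python
import json
import urllib.request

REST_SERVER = "https://rest.ensembl.org"  # assumed public REST server


def gene_lookup_url(species, symbol):
    """URL that looks up a gene by symbol, with transcripts and exons expanded."""
    return f"{REST_SERVER}/lookup/symbol/{species}/{symbol}?expand=1"


def fetch_json(url):
    """Download a JSON document; needs network access."""
    request = urllib.request.Request(url, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def summarize_gene(gene):
    """Collect location, transcript counts and exon counts from a lookup reply."""
    transcripts = gene.get("Transcript", [])
    coding = [t for t in transcripts if t.get("biotype") == "protein_coding"]
    return {
        "chromosome": gene.get("seq_region_name"),
        "strand": gene.get("strand"),  # 1 = forward, -1 = reverse
        "transcripts": len(transcripts),
        "protein_coding": len(coding),
        "exons": {t.get("id"): len(t.get("Exon", [])) for t in transcripts},
    }


def main():
    gene = fetch_json(gene_lookup_url("homo_sapiens", "MYH9"))
    for key, value in summarize_gene(gene).items():
        print(key, value)
```

Calling `main()` prints the summary for MYH9; passing another gene symbol to `gene_lookup_url` gives the same summary for that gene. Disease associations and OMIM identifiers are not part of this record, so those questions still need the Phenotype and external reference pages.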

Exercise 7 – Exploring the human genome

(a) What is the current human genome assembly? What are the
previous assemblies?
(b) What is the size of the human genome? How many coding genes
does it have (primary assembly)?
(c) Search for the gene BRCA2. What is the position of the gene?
How many protein coding transcripts does this gene have?
(d) Answer the same questions as in (c), but using the GRCh37 human
genome assembly version.
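
Parts (c) and (d) compare the same gene on two assemblies. As a rough sketch, the comparison can be scripted against the two REST servers that Ensembl runs, assuming https://rest.ensembl.org serves GRCh38 and https://grch37.rest.ensembl.org serves GRCh37. The `SERVERS` table and the function names are illustrative, not part of any Ensembl library.

```python
import json
import urllib.request

# Assumption: one REST server per human assembly.
SERVERS = {
    "GRCh38": "https://rest.ensembl.org",
    "GRCh37": "https://grch37.rest.ensembl.org",
}


def gene_lookup_url(symbol, assembly="GRCh38", species="homo_sapiens"):
    """URL that looks up a gene by symbol on the server for one assembly."""
    return f"{SERVERS[assembly]}/lookup/symbol/{species}/{symbol}?expand=1"


def fetch_json(url):
    """Download a JSON document; needs network access."""
    request = urllib.request.Request(url, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def format_location(gene):
    """Render a lookup reply as chromosome:start-end."""
    return f"{gene['seq_region_name']}:{gene['start']}-{gene['end']}"


def count_protein_coding(gene):
    """Count transcripts whose biotype is protein_coding."""
    return sum(1 for t in gene.get("Transcript", []) if t.get("biotype") == "protein_coding")


def main():
    for assembly in SERVERS:
        gene = fetch_json(gene_lookup_url("BRCA2", assembly))
        print(assembly, format_location(gene), count_protein_coding(gene))
```

Calling `main()` prints one line per assembly, which makes any shift in coordinates between GRCh37 and GRCh38 easy to see.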

Exercise 8 – Human population genetics and phenotype data

The SNP rs1738074 in the 5’ UTR of the human TAGAP gene has been
identified as a genetic risk factor for a few diseases.

(a) In which transcripts is this SNP found?

(b) What is the least frequent genotype for this SNP in the Yoruba (YRI)
population from the 1000 Genomes phase 3?

(c) What is the ancestral allele? Is it conserved in the 90 eutherian
mammals EPO-Extended?

(d) With which diseases is this SNP associated? Are there any known
risk (or associated) alleles?
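
For part (b), genotype frequencies can also be read programmatically. The sketch below assumes the Ensembl REST `variation` endpoint with the optional `population_genotypes=1` parameter, and it assumes each returned row carries `population`, `genotype` and `frequency` fields and that the Yoruba population is named `1000GENOMES:phase_3:YRI`. The function names are hypothetical. A genotype that was never observed may be missing from the reply entirely, so compare the result with the browser table.

```python
import json
import urllib.request

REST_SERVER = "https://rest.ensembl.org"  # assumed public REST server
YRI = "1000GENOMES:phase_3:YRI"           # assumed population name


def variant_url(rsid, species="homo_sapiens"):
    """URL for one variant, asking for per-population genotype frequencies."""
    return f"{REST_SERVER}/variation/{species}/{rsid}?population_genotypes=1"


def fetch_json(url):
    """Download a JSON document; needs network access."""
    request = urllib.request.Request(url, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def least_frequent_genotype(variant, population):
    """Return the genotype with the lowest frequency in one population, or None."""
    rows = [
        row for row in variant.get("population_genotypes", [])
        if row.get("population") == population
    ]
    if not rows:
        return None
    return min(rows, key=lambda row: row["frequency"])["genotype"]


def main():
    variant = fetch_json(variant_url("rs1738074"))
    print("least frequent in YRI:", least_frequent_genotype(variant, YRI))
```

Calling `main()` prints the rarest observed genotype in the chosen population; the transcript, conservation and disease questions are better answered from the variant pages in the browser.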

Exercise 9 – Exploring a Coprinopsis cinerea okayama region

(a) Go to the region 7:1400000-1425000 in Coprinopsis cinerea
okayama in Ensembl Fungi.
(b) How many complete genes are found in this region? How many on
the forward and how many on the reverse strand?

(c) Zoom in on the largest gene EFI27358. How many exons does this
gene have?

(d) Export the cDNA sequence of the transcript variant EFI27358.
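
Parts (b) and (d) can be approximated with two REST calls. This is a sketch under assumptions: it uses the `overlap/region` endpoint with `feature=gene` and the `sequence/id` endpoint with `type=cdna`, and it assumes the REST server also covers Ensembl Fungi species. The species name is deliberately left as a parameter because its exact spelling has to be taken from the species page, and the function names are illustrative only.

```python
import urllib.request

REST_SERVER = "https://rest.ensembl.org"  # assumed public REST server


def genes_in_region_url(species, region):
    """URL listing the genes that overlap a region such as 7:1400000-1425000."""
    return f"{REST_SERVER}/overlap/region/{species}/{region}?feature=gene"


def cdna_url(stable_id):
    """URL of the cDNA sequence for a transcript stable ID."""
    return f"{REST_SERVER}/sequence/id/{stable_id}?type=cdna"


def fetch_text(url, content_type):
    """Download a document as text (JSON or FASTA); needs network access."""
    request = urllib.request.Request(url, headers={"Content-Type": content_type})
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")


def count_complete_genes(features, start, end):
    """Count genes lying entirely inside [start, end], split by strand."""
    counts = {"forward": 0, "reverse": 0}
    for feature in features:
        if feature["start"] >= start and feature["end"] <= end:
            counts["forward" if feature["strand"] == 1 else "reverse"] += 1
    return counts
```

To use it, decode the output of `fetch_text(genes_in_region_url(species, "7:1400000-1425000"), "application/json")` with `json.loads` and pass the resulting list to `count_complete_genes(features, 1400000, 1425000)`. Requesting `cdna_url(...)` with the content type `text/x-fasta` returns the sequence in FASTA format, provided the ID passed in is one the server recognises as a transcript.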
