
List of Biological Databases

bio data

Uploaded by

ArifSheriff
Copyright
© © All Rights Reserved
List of biological databases

From Wikipedia, the free encyclopedia


Biological databases are stores of biological information.[1] The journal Nucleic Acids
Research regularly publishes special issues on biological databases and has a list of such
databases. The 2018 issue has a list of about 180 such databases and updates to previously
described databases.[2]

Contents

 1 Meta databases
 2 Model organism databases
 3 Nucleic acid databases
o 3.1 DNA databases
o 3.2 Gene expression databases (mostly microarray data)
o 3.3 Phenotype databases
o 3.4 RNA databases
 4 Amino acid / protein databases
o 4.1 Protein sequence databases
o 4.2 Protein structure databases
o 4.3 Protein model databases
o 4.4 Protein-protein and other molecular interactions
o 4.5 Protein expression databases
 5 Signal transduction pathway databases
 6 Metabolic pathway and protein function databases
 7 Additional databases
o 7.1 Exosomal databases
o 7.2 Mathematical model databases
o 7.3 Taxonomic databases
o 7.4 Radiologic databases
o 7.5 Antimicrobial resistance databases
 8 Wiki-style databases
 9 Specialized databases
 10 References
 11 External links

Meta databases
Meta databases are databases of databases that collect data about data to generate new data. They
are capable of merging information from different sources and making it available in a new and more
convenient form, or with an emphasis on a particular disease or organism.

 ConsensusPathDB: a molecular functional interaction database,
integrating information from 12 other databases
 Entrez (National Center for Biotechnology Information)
 Neuroscience Information Framework (University of California, San
Diego): integrates hundreds of neuroscience relevant resources;
many are listed below
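
As a rough illustration of what "merging information from different sources" can mean in practice, the sketch below combines per-source records into one record per identifier. The function name `merge_sources`, the source names, and the toy records are all hypothetical and are not taken from any of the databases listed here; real meta databases use far richer identifier mapping and conflict handling.

```python
from collections import defaultdict


def merge_sources(sources):
    """Merge per-source records into one record per identifier.

    `sources` maps a source name to a dict of {identifier: {field: value}}.
    This is an illustrative sketch, not the method of any listed database.
    """
    merged = defaultdict(lambda: {"sources": []})
    for source_name, records in sources.items():
        for identifier, fields in records.items():
            entry = merged[identifier]
            # Remember which source contributed to this identifier.
            entry["sources"].append(source_name)
            for field, value in fields.items():
                # Keep the first value seen; later sources only fill gaps.
                entry.setdefault(field, value)
    return dict(merged)


# Hypothetical toy data: two sources describing overlapping identifiers.
sources = {
    "source_a": {"GENE_A": {"name": "example gene A"}},
    "source_b": {
        "GENE_A": {"pathway": "example pathway"},
        "GENE_B": {"name": "example gene B"},
    },
}
merged = merge_sources(sources)
```

Here `merged["GENE_A"]` carries fields from both sources plus a list of the sources it came from, which is the "new and more convenient form" in miniature.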

Model organism databases

Model organism databases provide in-depth biological data for intensively studied organisms.

 PomBase: the knowledgebase for the fission yeast Schizosaccharomyces pombe[3]

Nucleic acid databases

DNA databases
Primary databases
International Nucleotide Sequence Database (INSD) consists of the following databases.

 DNA Data Bank of Japan (National Institute of Genetics)


 EMBL (European Bioinformatics Institute)
 GenBank (National Center for Biotechnology Information)
DDBJ (Japan), GenBank (USA) and European Nucleotide Archive (Europe) are repositories for
nucleotide sequence data from all organisms. All three accept nucleotide sequence submissions,
and then exchange new and updated data on a daily basis to achieve optimal synchronisation
between them. These three databases are primary databases, as they house original sequence
data. They collaborate with Sequence Read Archive (SRA), which archives raw reads from high-
throughput sequencing instruments.
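
Records in these primary databases are typically retrieved by accession number. As a minimal sketch, assuming the NCBI E-utilities `efetch` endpoint and its `db`, `id`, `rettype` and `retmode` parameters, the helper below only builds a request URL for a nucleotide record; the helper name `build_efetch_url` and the example accession are illustrative choices, and no network request is made.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities efetch service.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession, rettype="fasta"):
    """Return an efetch URL for one nucleotide accession (hypothetical helper)."""
    params = {
        "db": "nuccore",      # nucleotide collection
        "id": accession,      # accession number of the record
        "rettype": rettype,   # record format, e.g. "fasta" or "gb"
        "retmode": "text",    # plain-text response
    }
    return f"{EFETCH_URL}?{urlencode(params)}"


# Example accession used purely for illustration.
url = build_efetch_url("U00096")
```

The resulting URL can be passed to any HTTP client (for example `urllib.request.urlopen`); anyone scripting many requests should check the service's current usage guidelines first.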
Secondary databases

 23andMe's database
 HapMap
 OMIM (Online Mendelian Inheritance in Man): inherited diseases
 RefSeq
 1000 Genomes Project: launched in January 2008. The genomes of
more than a thousand anonymous participants from a number of
different ethnic groups were analyzed and made publicly available.
Gene expression databases (mostly microarray data)
Main article: Microarray databases
Genome databases
These databases collect genome sequences, annotate and analyze them, and provide public
access. Some add curation of experimental literature to improve computed annotations. These
databases may hold the genomes of many species, or the genome of a single model organism.

 ArrayExpress:[4] archive of functional genomics data; stores data
from high-throughput functional genomics experiments from EMBL
 Bioinformatic Harvester
 Ensembl: provides automatic annotation databases for human,
mouse, other vertebrate and eukaryote genomes
 Ensembl Genomes: provides genome-scale data for bacteria,
protists, fungi, plants and invertebrate metazoa, through a unified
set of interactive and programmatic interfaces (using the Ensembl
software platform)
 FlyBase: genome of the model organism Drosophila melanogaster
 Gene Disease Database
 Gene Expression Omnibus (GEO[5]): a public functional genomics
data repository from the U.S. National Center for Biotechnology
Information (NCBI), which supports array- and sequence-based data.
Tools for querying and downloading gene expression profiles are provided.
 Human Protein Atlas (HPA[6]): a public database with expression
profiles of human protein coding genes both on mRNA and protein
level in tissues, cells, subcellular compartments, and cancer tumors.
 Legume Information System (LIS): genomic database for the
legume family[7]
 Personal Genome Project: human genomes of 100,000 volunteers
from around the world
 RGD (Rat Genome Database): genomic and phenotype data for
Rattus norvegicus
 Saccharomyces Genome Database:[8] genome of the yeast model
organism
 SNPedia
 SoyBase Database[9] (SoyBase): USDA soybean genetics and
genomic database (Soybean)
 UCSC Malaria Genome Browser: genome of malaria causing
species (Plasmodium falciparum and others)
 WormBase: genome of the model organism Caenorhabditis
elegans, with WormBase ParaSite for parasitic species
 Xenbase: genome of the model organism Xenopus
tropicalis and Xenopus laevis
 Zebrafish Information Network: genome of this fish model organism
Phenotype databases

 PHI-base: pathogen-host interaction database. It links gene
information to phenotypic information from microbial pathogens on
their hosts. Information is manually curated from peer-reviewed
literature.
 RGD Rat Genome Database: genomic and phenotype data
for Rattus norvegicus
 PomBase database: manually curated phenotypic data for the
yeast Schizosaccharomyces pombe
RNA databases

 miRBase: the microRNA database
 Rfam: a database of RNA families

Amino acid / protein databases

Protein sequence databases
 Database of Interacting Proteins (Univ. of California)
 DisProt: database of experimental evidence of disorder in proteins
(Indiana University School of Medicine, Temple
University, University of Padua)
 InterPro: classifies proteins into families and predicts the presence
of domains and sites
 MobiDB: database of intrinsic protein disorder annotation
(University of Padua)
 neXtProt: a human protein-centric knowledge resource
 Pfam: protein families database of alignments and HMMs (Sanger
Institute)
 PRINTS: a compendium of protein fingerprints (Manchester
University)
 PROSITE: database of protein families and domains
 Protein Information Resource (Georgetown University Medical
Center [GUMC])
 SUPERFAMILY: library of HMMs representing superfamilies and
database of (superfamily and family) annotations for all completely
sequenced organisms
 Swiss-Prot: protein knowledgebase (Swiss Institute of
Bioinformatics)
 NCBI: protein sequence and knowledgebase (National Center for
Biotechnology Information)
Protein structure databases

 Protein Data Bank (PDB), comprising:
o Protein Data Bank in Europe (PDBe)[10]
o Protein Data Bank Japan (PDBj)[11]
o Research Collaboratory for Structural Bioinformatics (RCSB)[12]
 Structural Classification of Proteins (SCOP)
For more protein structure databases, see also Protein structure database.
Protein model databases

 ModBase: database of comparative protein structure models
(Sali Lab, UCSF)
 Similarity Matrix of Proteins (SIMAP): database of protein
similarities computed using FASTA
 Swiss-model: server and repository for protein structure models
 AAindex: database of amino acid indices, amino acid mutation
matrices, and pair-wise contact potentials
Protein-protein and other molecular interactions

 BioGRID: general repository for interaction datasets (Samuel
Lunenfeld Research Institute)
 RNA-binding protein database
Protein expression databases
 Human Protein Atlas: aims at mapping all the human proteins in
cells, tissues and organs

Signal transduction pathway databases

 NCI-Nature Pathway Interaction Database
 Netpath: curated resource of signal transduction pathways in
humans
 Reactome: navigable map of human biological pathways, ranging
from metabolic processes to hormonal signalling (Ontario Institute
for Cancer Research, European Bioinformatics Institute, NYU
Langone Medical Center, Cold Spring Harbor Laboratory)
 WikiPathways

Metabolic pathway and protein function databases

 BioCyc Database Collection: includes EcoCyc and MetaCyc
 BRENDA: the comprehensive enzyme information system, including
FRENDA, AMENDA, DRENDA, and KENDA
 HMDB: contains detailed information about small molecule
metabolites found in the human body
 KEGG PATHWAY Database (Univ. of Kyoto)
 MANET database (University of Illinois)
 Reactome: navigable map of human biological pathways, ranging
from metabolic processes to hormonal signalling (Ontario Institute
for Cancer Research, European Bioinformatics Institute, NYU
Langone Medical Center, Cold Spring Harbor Laboratory)
 SABIO-RK: database for biochemical reactions and their kinetic
properties
 WikiPathways

Additional databases
Exosomal databases

 ExoCarta
Mathematical model databases

 Biomodels Database: published mathematical models describing
biological processes
Taxonomic databases
Main article: List of biodiversity databases

 BacDive: bacterial metadatabase that provides strain-linked
information about bacterial and archaeal biodiversity, including
taxonomy information
 EzTaxon-e: database for the identification of prokaryotes based on
16S ribosomal RNA gene sequences
Radiologic databases

 The Cancer Imaging Archive (TCIA)


 Neuroimaging Informatics Tools and Resources Clearinghouse
Antimicrobial resistance databases

 AMRFinderPlus
 Antimicrobial Drug Database (AMDD)
 ARDB
 ARGminer
 BacMet
 Beta-Lactamase Database (BLAD)
 Beta-Lactamase Database (BLDB)
 CBMAR
 The Comprehensive Antibiotic Resistance Database
 EARS-Net
 FARME
 INTEGRALL
 LacED
 MEGARes
 MUBII-TB-DB
 Mustard Database
 MvirDB
 PathoPhenoDB
 PATRIC database
 RAC: Repository of Antibiotic resistance Cassettes
 ResFinder
 TBDReaMDB
 u-CARE
 VFDB

Wiki-style databases
 Gene Wiki
 WikiProfessional

Specialized databases
 Barcode of Life Data Systems: database of DNA barcodes
 The Cancer Genome Atlas (TCGA): provides data from hundreds of
cancer samples obtained using high-throughput techniques such as
gene expression profiling, copy number variation profiling, SNP
genotyping, genome-wide DNA methylation profiling, microRNA
profiling, and exon sequencing of at least 1,200 genes
 Cellosaurus: a knowledge resource on cell lines
 CTD (Comparative Toxicogenomics Database): describes chemical-
gene-disease interactions
 DiProDB: a database to collect and analyse thermodynamic,
structural and other dinucleotide properties
 Dryad: repository of data underlying scientific publications in the
basic and applied biosciences
 Edinburgh Mouse Atlas
 EPD Eukaryotic Promoter Database
 FINDbase (the Frequency of INherited Disorders database)
 GigaDB: repository of large-scale datasets underlying scientific
publications in biological and biomedical research
 HGNC (HUGO Gene Nomenclature Committee): a resource for
approved human gene nomenclature
 International Human Epigenome Consortium:[13] integrates
epigenomic reference data from well-known national endeavors
such as the Canadian CEEHRC,[14] European Blueprint,[15] European
Genome-phenome Archive (EGA[16]), US ENCODE and
NIH Roadmap, German DEEP,[17] Japanese CREST,[18] Korean
KNIH, Singapore's GIS and China's EpiHK[19]
 MethBase: database of DNA methylation data visualized on
the UCSC Genome Browser
 Minimotif Miner: database of short contiguous functional peptide
motifs
 Oncogenomic databases: a compilation of databases that serve for
cancer research
 PubMed: references and abstracts on life sciences and biomedical
topics
 RIKEN integrated database of mammals
 TDR Targets: a chemogenomics database focused on drug
discovery in tropical diseases
 TRANSFAC: a database about eukaryotic transcription factors, their
genomic binding sites and DNA-binding profiles
 JASPAR: a database of manually curated, non-redundant
transcription factor binding profiles.
 MetOSite: a database about methionine sulfoxidation sites and their
functional roles in proteins[20]
 Healthcare Cost and Utilization Project (HCUP) is the largest
collection of hospital care data in the United States. It includes
hundreds of millions of inpatient, outpatient, and emergency
records.

References
1. Wren JD, Bateman A (October 2008). "Databases, data tombs and
dust in the wind". Bioinformatics. 24 (19): 2127–8.
doi:10.1093/bioinformatics/btn464. PMID 18819940.
2. "Volume 46 Issue D1 | Nucleic Acids Research | Oxford
Academic". academic.oup.com. Retrieved 2018-09-04.
3. Lock, A; Rutherford, K; Harris, MA; Hayles, J; Oliver, SG; Bähler, J;
Wood, V (13 October 2018). "PomBase 2018: user-driven
reimplementation of the fission yeast database provides rapid and
intuitive access to diverse, interconnected information". Nucleic Acids
Research. 47 (D1): D821–D827. doi:10.1093/nar/gky961.
PMC 6324063. PMID 30321395.
4. ArrayExpress
5. GEO
6. "The Human Protein Atlas". www.proteinatlas.org. Retrieved 2019-05-27.
7. Dash S, Campbell JD, Cannon EK, Cleary AM, Huang W, Kalberer
SR, Karingula V, Rice AG, Singh J, Umale PE, Weeks NT, Wilkey AP,
Farmer AD, Cannon SB (January 2016). "Legume information system
(LegumeInfo.org): a key component of a set of federated data
resources for the legume family". Nucleic Acids Research. 44 (D1):
D1181-8. doi:10.1093/nar/gkv1159. PMC 4702835. PMID 26546515.
8. "Saccharomyces Genome Database | SGD". www.yeastgenome.org.
Retrieved 2018-09-04.
9. Grant, David; Nelson, Rex T.; Cannon, Steven B.; Shoemaker,
Randy C. (2010). "SoyBase, the USDA-ARS soybean genetics and
genomics database". Nucleic Acids Research. 38 (Suppl 1) (Database
issue): D843–D846. doi:10.1093/nar/gkp798. PMC 2808871.
PMID 20008513.
10. Mir S, Alhroub Y, Anyango S, Armstrong DR, Berrisford JM, Clark
AR, Conroy MJ, Dana JM, Deshpande M, Gupta D, Gutmanas A,
Haslam P, Mak L, Mukhopadhyay A, Nadzirin N, Paysan-Lafosse T,
Sehnal D, Sen S, Smart OS, Varadi M, Kleywegt GJ, Velankar S
(January 2018). "PDBe: towards reusable data delivery infrastructure
at protein data bank in Europe". Nucleic Acids Research. 46 (D1):
D486–D492. doi:10.1093/nar/gkx1070. PMC 5753225. PMID 29126160.
11. Kinjo AR, Bekker GJ, Suzuki H, Tsuchiya Y, Kawabata T, Ikegawa
Y, Nakamura H (January 2017). "Protein Data Bank Japan (PDBj):
updated user interfaces, resource description framework, analysis
tools for large structures". Nucleic Acids Research. 45 (D1): D282–D288.
doi:10.1093/nar/gkw962. PMC 5210648. PMID 27789697.
12. Rose PW, Prlić A, Altunkaya A, Bi C, Bradley AR, Christie CH, et al.
(January 2017). "The RCSB protein data bank: integrative view of
protein, gene and 3D structural information". Nucleic Acids
Research. 45 (D1): D271–D281. doi:10.1093/nar/gkw1000.
PMC 5210513. PMID 27794042.
13. (IHEC) data portal
14. CEEHRC
15. Blueprint
16. EGA
17. DEEP
18. CREST
19. "Sharing epigenomes globally". Nature Methods. 15 (3): 151.
2018. doi:10.1038/nmeth.4630. ISSN 1548-7105.
20. Valverde, Héctor; Cantón, Francisco R.; Aledo, Juan Carlos (2019).
"MetOSite: an integrated resource for the study of methionine residues
sulfoxidation". Bioinformatics. 35 (22): 4849–4850.
doi:10.1093/bioinformatics/btz462. PMID 31197322.
External links
 Nucleic Acids Research Molecular Biology Database Collection:
over 1,600 databases

 This page was last edited on 17 February 2020, at 07:39 (UTC).
 Text is available under the Creative Commons Attribution-ShareAlike License; additional
terms may apply. By using this site, you agree to the Terms of Use and Privacy Policy.
Wikipedia® is a registered trademark of the Wikimedia Foundation, Inc., a non-profit
organization.