Nucleic Acids Res. 2002 Jan 1;30(1):38-41.
doi: 10.1093/nar/30.1.38.

The Ensembl genome database project

T Hubbard et al. Nucleic Acids Res.

Abstract

The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of the human genome sequence, with confirmed gene predictions that have been integrated with external data sources, and is available as either an interactive web site or as flat files. It is also an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements from sequence analysis to data storage and visualisation. The Ensembl site is one of the leading sources of human genome sequence annotation and provided much of the analysis for publication by the international human genome project of the draft genome. The Ensembl system is being installed around the world in both companies and academic sites on machines ranging from supercomputers to laptops.

Figures

Figure 1
Screenshot of Ensembl contigview, showing the region of human chromosome 11 around genome sequence accession AP000869. The region is shown at three resolutions, and clicking in any of the three panels re-centres the view on that point. In the top ‘Chromosome’ panel a red box shows the region being viewed in q23.3 with respect to the cytogenetic banding pattern of the entire chromosome. In the middle ‘Overview’ panel a second red box similarly shows the region being viewed in detail below. The middle panel shows the location of markers and genes, by default >1 Mb. Genes are coloured brown and labelled with either HUGO identifiers or SPTREMBL IDs if they are known. Novel ‘Ensembl genes’ (see text for definition) are labelled as such and shown in black. Annotated genes from EMBL/GenBank sequence files, where present, are shown in green. The lower ‘Detailed View’ panel shows genomic sequence features in detail, by default >100 Kb. The gene colour scheme is as for the ‘Overview’ panel, with the addition of sequence-contig-based genscan ab initio predictions shown in cyan. Matches to SPTREMBL entries are shown in yellow, with boxes linking a series of matches to the same entry. Matches to the WGS mouse genome are shown in purple. The region being viewed can be zoomed and re-centred with the mouse or specified precisely in chromosomal coordinates. The pull-down menu shown is one of several and allows the user to select the features being displayed. The second pull-down menu allows the addition of annotation from third-party DAS sources. Floating menus (not shown) appear as the mouse is moved over any feature, allowing access to pages with additional information.
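The caption describes a viewing window that can be re-centred, zoomed, or set directly from chromosomal coordinates. The Python sketch below illustrates that arithmetic only; it is not Ensembl code. The `Region` class, the `parse_region` helper, the `chromosome:start-end` string format and the 1-based inclusive coordinate convention are all assumptions made for illustration, and the window is clamped only at position 1 because the chromosome length is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical viewing window: 1-based, inclusive chromosomal coordinates."""
    chromosome: str
    start: int
    end: int

    @property
    def width(self) -> int:
        """Number of bases covered by the window."""
        return self.end - self.start + 1

    def recentre(self, position: int) -> "Region":
        """Return a window of the same width, centred near `position`."""
        start = max(1, position - self.width // 2)  # clamp at the chromosome start
        return Region(self.chromosome, start, start + self.width - 1)

    def zoom(self, factor: float) -> "Region":
        """Scale the width by `factor` (<1 zooms in) around the current centre."""
        centre = (self.start + self.end) // 2
        new_width = max(1, round(self.width * factor))
        start = max(1, centre - new_width // 2)
        return Region(self.chromosome, start, start + new_width - 1)


def parse_region(text: str) -> Region:
    """Parse an illustrative 'chromosome:start-end' string (commas allowed)."""
    chromosome, span = text.split(":")
    start, end = (int(part.replace(",", "")) for part in span.split("-"))
    if start < 1 or end < start:
        raise ValueError(f"invalid region: {text!r}")
    return Region(chromosome, start, end)
```

For example, `parse_region("11:1,000,001-1,100,000")` yields a 100,000-base window (the coordinates are made up, not taken from the figure); calling `.zoom(0.5)` on it halves the width around the same centre, and `.recentre(2_000_000)` moves it while keeping the width.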

Similar articles

  • Ensembl 2002: accommodating comparative genomics.
    Clamp M, Andrews D, Barker D, Bevan P, Cameron G, Chen Y, Clark L, Cox T, Cuff J, Curwen V, Down T, Durbin R, Eyras E, Gilbert J, Hammond M, Hubbard T, Kasprzyk A, Keefe D, Lehvaslaiho H, Iyer V, Melsopp C, Mongin E, Pettett R, Potter S, Rust A, Schmidt E, Searle S, Slater G, Smith J, Spooner W, Stabenau A, Stalker J, Stupka E, Ureta-Vidal A, Vastrik I, Birney E. Nucleic Acids Res. 2003 Jan 1;31(1):38-42. doi: 10.1093/nar/gkg083. PMID: 12519943 Free PMC article.
  • Ensembl 2004.
    Birney E, Andrews D, Bevan P, Caccamo M, Cameron G, Chen Y, Clarke L, Coates G, Cox T, Cuff J, Curwen V, Cutts T, Down T, Durbin R, Eyras E, Fernandez-Suarez XM, Gane P, Gibbins B, Gilbert J, Hammond M, Hotz H, Iyer V, Kahari A, Jekosch K, Kasprzyk A, Keefe D, Keenan S, Lehvaslaiho H, McVicker G, Melsopp C, Meidl P, Mongin E, Pettett R, Potter S, Proctor G, Rae M, Searle S, Slater G, Smedley D, Smith J, Spooner W, Stabenau A, Stalker J, Storey R, Ureta-Vidal A, Woodwark C, Clamp M, Hubbard T. Nucleic Acids Res. 2004 Jan 1;32(Database issue):D468-70. doi: 10.1093/nar/gkh038. PMID: 14681459 Free PMC article.
  • Using the Ensembl genome server to browse genomic sequence data.
    Fernández-Suárez XM, Schuster MK. Curr Protoc Bioinformatics. 2007 Jan;Chapter 1:Unit 1.15. doi: 10.1002/0471250953.bi0115s16. PMID: 18428779
  • An overview of Ensembl.
    Birney E, Andrews TD, Bevan P, Caccamo M, Chen Y, Clarke L, Coates G, Cuff J, Curwen V, Cutts T, Down T, Eyras E, Fernandez-Suarez XM, Gane P, Gibbins B, Gilbert J, Hammond M, Hotz HR, Iyer V, Jekosch K, Kahari A, Kasprzyk A, Keefe D, Keenan S, Lehvaslaiho H, McVicker G, Melsopp C, Meidl P, Mongin E, Pettett R, Potter S, Proctor G, Rae M, Searle S, Slater G, Smedley D, Smith J, Spooner W, Stabenau A, Stalker J, Storey R, Ureta-Vidal A, Woodwark KC, Cameron G, Durbin R, Cox A, Hubbard T, Clamp M. Genome Res. 2004 May;14(5):925-8. doi: 10.1101/gr.1860604. Epub 2004 Apr 12. PMID: 15078858 Free PMC article. Review.
  • Genome information resources - developments at Ensembl.
    Hammond MP, Birney E. Trends Genet. 2004 Jun;20(6):268-72. doi: 10.1016/j.tig.2004.04.002. PMID: 15145580 Review.
