Human Genome Databases

Welcome to the GenomeWeb
Human Genome Databases

The following is a collection of human genome databases.

Genome Central
OMIM (Online Mendelian Inheritance in Man)
GDB (Genome DataBase)
GeneCards - integrated biomedical genetic information
The Unified Database for Human Genome Mapping
OMIM gene map
EnsEMBL - annotated human sequences
The HuGeMap database
GenAtlas
Genome Navigator: Whitehead/MIT STS-based Map of the Human Genome
The Genome Channel
Whitehead/MIT STS-Based Map of the Human Genome
Transcript Map of the Human Genome
Human Telomere Information
The Genetic Location Database (LDB)
The dysmorphic human and mouse homology database
BodyMap - Anatomical Expression Database of Human Genes
CEPH-Genethon integrated map
CEPH Genotype database
Cooperative Human Linkage Center (CHLC)
GeneMap '98 - The International RH Mapping Consortium Map
Radiation Hybrid Mapping data (RHdb)
dbEST Expressed Sequence Tag Database
UniGene - Unique Human Gene Sequence Collection
dbSTS Sequence Tagged Site Database
Whitehead Institute/MIT Genome Center
V BASE: A Directory of Human Immunoglobulin V Genes
Human CpG Island database
Human population genetics database (Genography)
Anthony Nolan Research Institute (ANRI)
GDB Nomenclature Committee
Atlas of Genetics and Cytogenetics in Oncology and Haematology
Human Genome - The Third Millennium
Human BAC Fingerprint Database

Detailed information on the above options

Genome Central
In the context of the completion of the "Working Draft" of the Human Genome Sequence by the public Human Genome Project, here are some URLs that are good starting points for working with the data.

The vast majority of the human genome is now publicly available. About 25% of the genome is in finished form, while the great majority of the remainder is in draft form. A publication on the working draft will be submitted later this year.

As with other organisms, the full primary source data is available in each of three public databases: Genbank, EMBL and DDBJ. The information is being searched and analyzed by tens of thousands of scientists in academia and industry.

Still, it is a daunting and time-consuming task for users to directly analyze the primary source data. Many users would like access to ancillary information and tools that provide an ongoing picture of the genome that is both comprehensive and comprehensible. Such information includes the overlaps between clones; the correct genomic location of each clone; an integrated genomic sequence that merges the individual clones; and annotation of gene content.

In fact, such resources have been developed and are freely available - but they are not widely known.

For ease of access, we have created a master web site called "Human Genome Central" containing a brief listing of links to some of the most useful public resources; further links to additional sites can be found within them. The web sites will be regularly updated with new information.

OMIM (Online Mendelian Inheritance in Man)
This database is a catalog of human genes and genetic disorders authored and edited by Dr. Victor A. McKusick and colleagues at NCBI, Bethesda, Maryland.

GDB (Genome DataBase)
GDB holds data on Human gene loci, polymorphisms, mutations, probes, genetic maps, GenBank, citations and contacts.

GeneCards - integrated biomedical genetic information
Although it will take some years until the human genome is completely sequenced, and much longer to learn the functions of the products of those genes, the complex organization and vast amount of biomedical information already accessible often cause problems connected to "information overflow" and to the often very time-consuming process of information retrieval or mining. Thus, many scientists feel that new approaches to organizing scientific information are urgently needed.

GeneCards is a database that intends to address some of these problems by integrating biomedical information taken from several sources (GDB, MGD, OMIM, SWISS-PROT, HGMD, Doctor's Guide to the Internet), and by presenting them in a way facilitating a quick.

The Unified Database for Human Genome Mapping
The Unified Database (UDB) integrates information on the human genome, with emphasis on mapping information. Mapped DNA segments, classified by categories (such as genes, EST clusters and STSs mapped by various methods) are presented on a Megabase-scale integrated map, with further information and links to relevant databases.

UDB includes data from numerous resources, including the Genome Database, Whitehead Institute/MIT Center for Genome Research, Genethon, GeneMap'99 and others. Integrated map locations were calculated from separate method-specific chromosome maps (e.g. genetic linkage, radiation hybrid, and content-contig maps) by a simple scaling algorithm.
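
The source does not describe the scaling algorithm in detail, so the following is only a minimal sketch of one plausible reading: each position on a method-specific map is linearly rescaled onto the chromosome length in megabases, and the scaled positions of a marker are then averaged. The function names, map lengths, marker names and values are hypothetical and are not taken from UDB.

```python
from collections import defaultdict


def scale_to_mb(position, map_length, chrom_length_mb):
    """Linearly rescale a position on a method-specific map onto megabases."""
    if map_length <= 0:
        raise ValueError("map_length must be positive")
    return position / map_length * chrom_length_mb


def integrate_maps(maps, chrom_length_mb):
    """Average the scaled positions of each marker across all maps.

    maps: dict of map name -> (map_length, {marker: position}), where
    positions are in that map's own units (cM, cR, ...).
    """
    scaled = defaultdict(list)
    for map_length, positions in maps.values():
        for marker, pos in positions.items():
            scaled[marker].append(scale_to_mb(pos, map_length, chrom_length_mb))
    return {marker: sum(vals) / len(vals) for marker, vals in scaled.items()}


# Hypothetical example data (assumption, not real UDB content).
example_maps = {
    "genetic_linkage": (200.0, {"D1S1": 50.0, "D1S2": 100.0}),
    "radiation_hybrid": (1000.0, {"D1S1": 300.0}),
}
integrated = integrate_maps(example_maps, chrom_length_mb=250.0)
```

A plain average treats every method as equally reliable; a real integration would probably weight the maps or resolve conflicts in marker order, which this sketch does not attempt.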

OMIM gene map
The OMIM gene map presents the cytogenetic map location of disease genes and other expressed genes described in OMIM.

You enter a position, say '17q11', and you get all the OMIM records in that region, even records that map to a wider range such as '17cen-q12'.
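
How such a region query might match records can be sketched as an interval-overlap test on band names. This is an illustration under stated assumptions, not OMIM's actual implementation: the function names are hypothetical, only `cen` and simple `p`/`q` bands are handled, and band numbers are treated as ordinal positions rather than physical distances.

```python
import re
from decimal import Decimal

REGION = re.compile(r"^(\d{1,2}|X|Y)(.+)$")
BAND = re.compile(r"^([pq])(\d+(?:\.\d+)?)$")


def band_interval(token):
    """Return (lo, hi) for one band as a signed offset from the centromere."""
    if token == "cen":
        return Decimal(0), Decimal(0)
    match = BAND.match(token)
    if match is None:
        raise ValueError(f"unsupported band: {token!r}")
    arm, number = match.groups()
    lo = Decimal(number)
    # A band spans one unit of its last digit: q11 -> [11, 12), q11.2 -> [11.2, 11.3).
    width = Decimal(1) / Decimal(10) ** len(number.partition(".")[2])
    hi = lo + width
    # The p arm lies on the other side of the centromere, so mirror it.
    return (-hi, -lo) if arm == "p" else (lo, hi)


def parse_region(region):
    """Parse '17q11' or '17cen-q12' into (chromosome, lo, hi)."""
    match = REGION.match(region)
    if match is None:
        raise ValueError(f"unsupported region: {region!r}")
    chrom, rest = match.groups()
    bounds = [band_interval(token) for token in rest.split("-")]
    return chrom, min(b[0] for b in bounds), max(b[1] for b in bounds)


def overlaps(query, record):
    """True if both regions are on the same chromosome and their spans overlap."""
    q_chrom, q_lo, q_hi = parse_region(query)
    r_chrom, r_lo, r_hi = parse_region(record)
    return q_chrom == r_chrom and q_lo < r_hi and r_lo < q_hi
```

With this test, a query for `17q11` matches a record mapped to `17cen-q12` because the range runs from the centromere through band q12, while a record at `17q21` is not returned.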

EnsEMBL - annotated human sequences
Ensembl is a joint project between EMBL-EBI and the Sanger Centre to develop a software system for automating analysis of genomic data. It is being applied to the publicly released human genome data stream.

The HuGeMap database
HuGeMap is a database that contains:

the genetic maps from Genethon,
the genetic maps from Cooperative Human Linkage Center,
the physical maps from CEPH/Genethon,
the physical maps from the Whitehead Institute-MIT

HuGeMap is interconnected to the radiation hybrid gene map database RHdb, maintained at EBI. This interconnection is based on CORBA servers that have been implemented at Infobiogen and EBI, and that share the same IDL (see the Object Management Group for an introduction to CORBA).

GenAtlas
Compiles the information relevant to the mapping efforts of the Human Genome Project.

GENATLAS/GEN is a repertory of three types of objects: genes, diseases, and markers.

Genome Navigator: Whitehead/MIT STS-based Map of the Human Genome
Genome Navigator is an attempt to provide a visual interactive gateway to major databases containing physical and genetic mapping information about the human genome.

Genomic maps of these organisms are displayed using DerBrowser, a Java applet, designed as a universal tool to display and navigate various types of maps. Among other features, it allows a user to query external databases about any map object.

The Genome Channel
This system is a prototype graphical browser for querying the annotated reference genome.

The Java interface relies on a number of underlying resources, analysis tools and data-retrieval agents to provide an up-to-date view of genomic sequences as well as computational and experimental annotation. Designed to be simple enough for a layperson, the channel also offers sophisticated capabilities for hypothesis testing.

Whitehead/MIT STS-Based Map of the Human Genome
This contains YAC screening data for several thousand STSs. For each STS, information is held on the following types of raw data (where available):

Chromosomal assignments
YAC library screening results
Radiation hybrid panel screening results

In addition, information is available on the following preliminary analyses:

Doubly-linked YAC contigs
Singly-linked YAC contigs
Radiation hybrid maps
Integrations between genetic, radiation hybrid, and YAC contig maps.

Transcript Map of the Human Genome
A small portion of each cDNA sequence is all that is needed to develop unique gene markers, known as sequence tagged sites or STSs, which can be detected in chromosomal DNA by assays based on the polymerase chain reaction (PCR). To construct a transcript map, cDNA sequences from a master catalog of human genes were distributed to mapping laboratories in North America, Europe, and Japan. These cDNAs were converted to STSs and their physical locations on chromosomes determined on one of two radiation hybrid (RH) panels or a yeast artificial chromosome (YAC) library containing human genomic DNA. This mapping data was integrated relative to the human genetic map and then cross-referenced to cytogenetic band maps of the chromosomes. (Further details are available in the accompanying article in the 25 October issue of SCIENCE).

The histograms reflect the distributions and densities of genes along the chromosomes. Because the individual genes (>16,000) are too numerous to represent, images have been chosen to illustrate the myriad aspects of human biology, pathology, and relationships with other organisms that can be revealed by analysis of genes and their protein products.

Human Telomere Information
This is a section of GenLink's Teldb giving literature citations and other information on human telomeric regions.

The Genetic Location Database (LDB)
Ldb is an analytical database for constructing fully integrated genetic and physical maps. The ldb program generates an integrated map (known as the summary map) from partial maps of physical, genetic, regional, somatic hybrid, mouse homology and cytogenetic data.

The summary maps and the data used to build them are available from this site. The files for each chromosome are stored in a single directory, which includes the summary map, partial maps, lod files and the parameter files. As this server is experimental, many of the chromosome directories are incomplete; the most complete map sets are those for chromosomes 1, 9, 21 and X.

The dysmorphic human and mouse homology database
This consists of three separate databases of human and mouse malformation syndromes together with a database of mouse/human syntenic regions. The mouse and human malformation databases are linked together through the chromosome synteny database. The purpose of the system is to allow retrieval of syndromes according to detailed phenotypic descriptions and to be able to carry out homology searches for the purpose of gene mapping. Thus the database can be used to search for human or mouse malformation syndromes in different ways:

By specifying specific malformations or clinical features, or chromosome locations.
By Homology.
By asking for human syndromes located at a chromosome region syntenic with a specific mouse chromosome region (and vice versa from human to mouse).

BodyMap - Anatomical Expression Database of Human Genes
BodyMap is a data bank of expression information of human genes, novel or known, in various tissues or cell types. It is created by random sequencing of clones in 3'-directed cDNA libraries. Since these clones were not amplified, redundancy of the same sequence reflects the quantitative aspect of gene expression in various tissues.
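
The idea that clone redundancy reflects expression level can be sketched as a simple count: the more often a gene's sequence turns up among randomly sequenced clones, the larger its estimated share of the library. The function name and the example library below are hypothetical, and the clone counts are illustrative rather than BodyMap data.

```python
from collections import Counter


def expression_profile(clone_genes):
    """Turn per-clone gene assignments into relative frequencies."""
    counts = Counter(clone_genes)
    total = sum(counts.values())
    return {gene: n / total for gene, n in counts.items()}


# Hypothetical library: each entry is the gene that one sequenced clone matched.
liver_clones = ["ALB", "ALB", "ALB", "APOA1", "TF"]
profile = expression_profile(liver_clones)
```

Here `profile` assigns the largest share to the gene seen most often. Such estimates are only meaningful when the clones were sampled at random and not amplified, as the description above notes.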

You can enter your sequence and it will be matched using fasta to the cDNA sequences in this database.

CEPH-Genethon integrated map
This page allows you to search the CEPH-Genethon mapping data used to build the first generation physical map of the human genome. It gives information on the CEPH YAC library and the QUICKMAP database, and offers the infoclone program for retrieving information about an STS or a YAC.

CEPH Genotype database
The Centre d'Etude du Polymorphisme Humain (CEPH) maintains a database of genotypes for all genetic markers that have been tested in the reference families for linkage mapping of the human chromosomes.

Browse data by chromosome, probe, D-number, gene name and heterozygote frequency.
Output data for several markers on the same chromosome in CEPH, LINKAGE and CRIMAP formats.
CEPH collaborators can now submit data for new markers directly over the Internet using a browser that supports Java applets.

All genotypes contributed to the CEPH database are also available from an anonymous FTP server. Genotypes, marker descriptions and pairwise lod scores may be downloaded from the FTP server. In addition, the server contains databases for published CEPH consortium maps and for breakpoint maps.

Cooperative Human Linkage Center (CHLC)
The goal of the Cooperative Human Linkage Center is to develop statistically rigorous, high heterozygosity genetic maps of the human genome that are greatly enriched for the presence of easy-to-use PCR-formatted microsatellite markers.

Genetic maps showing the positions of genetic markers
- Integrated Maps showing the position of genetic markers constructed using genotype data from the CEPH reference panel.
- CHLC Marker Maps showing the positions of CHLC generated markers in various reference maps.
Search by name for information on markers.
- Search by name for information on markers, including map location, primer and PCR conditions, and sequence templates.
- Likely locations of current CHLC markers, in Version 2.0 Skeletal Maps
- Tables of CHLC markers characterized by linkage analysis.
- List of full information on markers generated by CHLC
  - In Current Linkage Map
  - Candidate Linkage Markers
  - Somatic Cell Hybrid Assigned
- Prior versions of the CHLC generated markers.
- Marshfield CA-repeat Markers
  - Table of initial typing data
  - Table of sequence data
  - Table of PCR Primers
CHLC publications.
- Science Maps and Data
- Copies of the CHLC Newsletter
CHLC project information.

GeneMap '98 - The International RH Mapping Consortium Map
This is the latest Radiation Hybrid Consortium Human map.

Radiation Hybrid Mapping data (RHdb)
Radiation hybrid maps are an indispensable alternative to genetic maps as they can include non-polymorphic markers and are also powerful enough to order unresolved genetic clusters of polymorphic STSs. An international collaborative project has been started which will produce a large number of these hybrids for the human genome. This in turn will allow the generation of a very precise STS map that will be indispensable in the study of multifactorial diseases.

RHdb, the radiation hybrid database, is an archive of raw data with links to other related databases. The main data is stored in a relational database. Submissions to this database are made using a standard format. Various export formats will be supported, as well as different ways of accessing the data.

dbEST Expressed Sequence Tag Database
The dbEST Database holds many Human ESTs.

UniGene - Unique Human Gene Sequence Collection
This holds clusters of human EST sequences that represent the transcription products of distinct genes.

These sequences are being used for transcript mapping in collaboration with several genome mapping centers. Some of the clusters have already been localized to chromosomes, but more detailed map information is not available at this time.

dbSTS Sequence Tagged Site Database
The dbSTS Database holds many Human STSs.

Whitehead Institute/MIT Genome Center

Human YAC screening data for sequence tagged sites (STSs) screened on the CEPH mega-YAC library with over 1100 contigs assembled using double linkage between STSs.
For each STS, they report addresses for the YACs found to contain the STS.
Human, Rat and Mouse marker map data files.

V BASE: A Directory of Human Immunoglobulin V Genes
A directory of human immunoglobulin germline variable region sequences compiled from over a thousand published sequences (including those in the current releases of the Genbank and EMBL data libraries). There are seven directories: D, JH, JK, JL, VH, VK and VL. Each directory consists of a folder or file containing the germline sequences and a file containing the corresponding reference list.

Human CpG Island database
Look at the Human CpG Island database. This is a flat file containing a description of genes and their associated CpG islands.

Human population genetics database (Genography)
A database of human genetic and cultural diversity, to act as a comprehensive community repository supporting work in human population genetics and quantitative anthropology.

The currently available version of the database contains 100,000 gene frequencies from almost 2,000 populations, collected from the literature on classical polymorphisms (essentially protein data) published up to 1986. These data were used for calculations on which the book History and Geography of Human Genes, by Cavalli-Sforza, Menozzi, and Piazza, is based.

Future versions of the database will include an update of the classical polymorphism data, a collection of published and unpublished DNA data by individual and by population (including RFLPs, microsatellites, and SNPs), and the future CEPH database, to be collected in collaboration with the Human Genome Diversity Project. In addition, information about geography, regional ecology, linguistics, mythology, musicology, and physical anthropology will be included. Finally, various analysis and visualization tools will be provided.

Anthony Nolan Research Institute (ANRI)
The WHO Nomenclature and HLA Sequence alignments are available from this site, together with monthly updates.

GDB Nomenclature Committee

Activities of the Nomenclature Committee
Nomenclature mailing list
Committee members
Guidelines for choosing a gene symbol
Submit a proposed gene symbol
Browse approved gene symbols
Gene Family Nomenclature
Checking for existing symbols

Atlas of Genetics and Cytogenetics in Oncology and Haematology
The Atlas of Genetics and Cytogenetics in Oncology and Haematology is a cooperative process of reviewing and updating on somatic genetics, clinical entities in cancer, and on cancer-prone diseases; it is made for and by: cytogeneticists, molecular biologists, and geneticists in general, clinicians in oncology and in haematology, and pathologists.

Human Genome - The Third Millennium
This site aims to help researchers find their way in Web-accessible databases containing Human Genome information, and to find answers to problems related to human genomic clones, contigs, sequences and maps.

It contains:

Links to relevant databases and tools.
Search strategies to aid users with little or no experience.
Practical examples with tips for searching and experimental follow-up.

Human BAC Fingerprint Database
The Human fingerprint database consists of fingerprints generated at the GSC from various Human libraries. Approximately 268K fingerprints were generated from the RPCI-11 BAC library and are designated with an "N" in the on-line search and in the FPC database. We are currently fingerprinting other BAC libraries for the Human fingerprint database. BACs from the Caltech libraries B, C and D1 are designated with an "M", and RPCI-13 BACs are designated with an "F". All other fingerprints that are available from the database were generated in collaboration with other labs.

Any Comments, Questions? Support@hgmp.mrc.ac.uk
