CDRH: A Database of Complex Disease-related Haplotypes in Human
be displayed in a new page (Figure 1c). The detailed
information consists of three sections: disease,
literature, and haplotype. The disease section focuses
on a brief summary of the pathogenesis and clinical
characteristics of colorectal cancers. Users who want
more comprehensive knowledge of the disease and its
effects can follow the included hyperlinks to the
Patient UK or Wikipedia web sites. The literature
section lists all documents concerning susceptible (or
protective) haplotypes for colorectal cancers, giving
the PubMed ID, publication date, title, and abstract
of each. This information provides a preliminary
insight into progress in the detection and treatment of
colorectal cancers based on haplotype analysis. The
haplotype section presents all colorectal cancer-related
haplotypes, together with their frequencies, the related
chromosome number and gene symbol, the SNPs (or
microsatellites) that comprise each haplotype, the risk
status of the haplotypes, the p-values of the statistical
tests, and the study populations (Figure 1f). For more detailed
information about genes or haplotypes, users can
click on the relevant links to open a new page,
as shown in Figure 1e and Figure 1g. An image
showing the haplotype location on the chromosome bands
is displayed on the left, giving users a visual
indication of where the haplotype lies. In addition to
disease-related haplotypes, we provide all the other
haplotypes defined by the same SNPs (or
microsatellites) in the same study populations, together
with their frequencies (Figure 1h). Users can also
query CDRH by combining disease names
and chromosome numbers (Figure 1b). The results are
the same as those of a search by disease name alone.
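
As an illustration only, the fields of the haplotype section and the combined disease/chromosome query can be sketched in Python. The record layout, the field names, and the `search` function are hypothetical and do not reflect the actual CDRH schema or implementation. The sample records are placeholders, not database entries.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class HaplotypeRecord:
    """Hypothetical record mirroring the fields of the haplotype section."""
    disease: str
    chromosome: str
    gene_symbol: str
    markers: Tuple[str, ...]  # SNPs (or microsatellites) comprising the haplotype
    haplotype: str            # alleles observed at those markers
    frequency: float
    risk_status: str
    p_value: float
    population: str


def search(records: List[HaplotypeRecord], disease: str,
           chromosome: Optional[str] = None) -> List[HaplotypeRecord]:
    """Filter by disease name, optionally restricted to one chromosome."""
    wanted = disease.strip().lower()
    hits = [r for r in records if r.disease.lower() == wanted]
    if chromosome is not None:
        hits = [r for r in hits if r.chromosome == chromosome]
    return hits


# Placeholder records for illustration only (not real CDRH data).
RECORDS = [
    HaplotypeRecord("disease A", "1", "GENE1", ("rs0001", "rs0002"),
                    "AG", 0.12, "risk", 0.01, "population X"),
    HaplotypeRecord("disease A", "2", "GENE2", ("rs0003", "rs0004"),
                    "CT", 0.30, "protection", 0.03, "population Y"),
    HaplotypeRecord("disease B", "1", "GENE3", ("rs0005",),
                    "T", 0.05, "risk", 0.04, "population X"),
]
```

Calling `search(RECORDS, "disease A")` returns every record for that disease, while `search(RECORDS, "disease A", "2")` narrows the same result set to chromosome 2.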
Figure 1c shows the row called ‘risk status’ in the
query results. It takes one of four values: ‘risk’ and
‘protection’ denote haplotypes that increase or
decrease, respectively, the disease risk as described in
the literature; ‘statistical inference risk’ and ‘statistical
inference protection’ denote haplotypes that increase
or decrease the disease risk, respectively, but that were
reported only in the results table of an association test.
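
The four ‘risk status’ values amount to two binary distinctions: the direction of the effect, and whether the effect is described in the literature or appears only in the results table of an association test. A minimal Python sketch of that mapping follows. The `RiskStatus` and `classify` names are assumptions made for illustration and are not part of CDRH.

```python
from enum import Enum


class RiskStatus(Enum):
    """The four 'risk status' values described in the text."""
    RISK = "risk"
    PROTECTION = "protection"
    STATISTICAL_INFERENCE_RISK = "statistical inference risk"
    STATISTICAL_INFERENCE_PROTECTION = "statistical inference protection"


def classify(increases_risk: bool, described_in_text: bool) -> RiskStatus:
    """Map the two distinctions onto one of the four values.

    increases_risk: True if the haplotype raises disease risk,
        False if it lowers it.
    described_in_text: True if the effect is described in the literature,
        False if it appears only in an association-test results table.
    """
    if described_in_text:
        return RiskStatus.RISK if increases_risk else RiskStatus.PROTECTION
    if increases_risk:
        return RiskStatus.STATISTICAL_INFERENCE_RISK
    return RiskStatus.STATISTICAL_INFERENCE_PROTECTION
```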
Similar to the search by disease name, users can
search the database by gene name (Entrez Gene ID and
Gene Symbol are currently supported). This is effective
in helping users directly identify haplotypes related to
a gene of interest. Users can also search the database
by chromosome number. The complex disease-related
haplotype information is then listed in order
of the online publication date of the articles, so users can
track developments in the design and analysis of
haplotype studies for complex human diseases on that
chromosome. In addition, users can retrieve
information by SNP ID (rs#). If the query SNP has
been identified as being part of a haplotype in our
database, the search result will be returned on a new
page. The basic SNP information and a concise
description of the relevant references help users
better understand genetic susceptibility to complex
diseases. Users can view the details of items of
interest by clicking on the hypertext links. Our database also
keeps a search history for each query mode,
which allows users to recall previous search results.
The query results obtained in any of these ways can be
downloaded directly as an Excel file via the download
link at the top of the view page (Figure 1d). Furthermore,
all data for complex disease-related haplotypes, as
well as the corresponding analysis software, are freely
available on the download page.
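
The search by SNP ID described above returns a result only when the queried SNP is part of a stored haplotype. One way to picture this is an inverted index from SNP IDs to haplotypes, sketched below in Python. The function names, the haplotype labels, and the rs numbers are hypothetical placeholders, and the sketch makes no claim about how CDRH implements the query.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# An rs# identifier: the prefix "rs" followed by digits.
RS_PATTERN = re.compile(r"^rs\d+$")

# (haplotype label, markers) pairs; placeholders, not real CDRH data.
HAPLOTYPES = [
    ("hap_1", ("rs0001", "rs0002")),
    ("hap_2", ("rs0002", "rs0003")),
]


def build_snp_index(
    haplotypes: Iterable[Tuple[str, Tuple[str, ...]]]
) -> Dict[str, List[str]]:
    """Map each SNP ID to the labels of the haplotypes that contain it."""
    index = defaultdict(list)
    for label, markers in haplotypes:
        for marker in markers:
            index[marker].append(label)
    return dict(index)


def search_by_snp(index: Dict[str, List[str]], query: str) -> List[str]:
    """Return the haplotypes containing the queried SNP, or an empty list."""
    snp_id = query.strip().lower()
    if not RS_PATTERN.match(snp_id):
        raise ValueError(f"not an rs# identifier: {query!r}")
    return index.get(snp_id, [])
```

A query for a SNP that belongs to no stored haplotype yields an empty list, which corresponds to the case in which no result page is returned.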
1.4 Submit page
We encourage users to submit information concerning
complex disease-related haplotypes that are not yet
documented in CDRH. Data can be submitted directly
via the Submit Web page. Required submission
information includes disease name, population,
chromosome number, gene symbol, haplotype,
PubMed ID, and the contact details of the
submitter. All submissions receive a systematic
quality assurance review.
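
A first automated step in such a review could be a completeness check on the required fields listed above. The following Python sketch is an assumption about what that step might look like, not a description of the actual CDRH review. The field keys and function names are hypothetical, and the only format rule applied is that a PubMed ID consists of digits.

```python
from typing import Dict, List

# Hypothetical keys for the required submission information.
REQUIRED_FIELDS = (
    "disease_name",
    "population",
    "chromosome",
    "gene_symbol",
    "haplotype",
    "pubmed_id",
    "submitter_contact",
)


def missing_fields(submission: Dict[str, str]) -> List[str]:
    """Return the required fields that are absent or blank."""
    return [name for name in REQUIRED_FIELDS
            if not str(submission.get(name, "")).strip()]


def check_submission(submission: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the check passed."""
    problems = [f"missing required field: {name}"
                for name in missing_fields(submission)]
    pubmed_id = str(submission.get("pubmed_id", "")).strip()
    # PubMed IDs are plain integers, so anything else is flagged.
    if pubmed_id and not (pubmed_id.isascii() and pubmed_id.isdigit()):
        problems.append("pubmed_id must be numeric")
    return problems
```

A submission that passes this check would still need the manual review of its content against the cited publication.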
The submitted records, and other essential information,
will be added to CDRH as soon as possible if the
submissions pass this review. The data
contained in CDRH are updated regularly by manual
extraction of relevant information from publications
retrieved from the PubMed literature database. The
collection of new and improved items is displayed at
the top of the browse page after each update.
2 Discussion
Understanding the relationship between genetic
variation and heritable risk for complex human
Computational
Molecular Biology