Computational Molecular Biology
diseases is a formidable challenge for modern human
genetics. This is also an important step towards the
discovery of genes that influence complex human
diseases. To provide a central resource for molecular
biologists and geneticists who study complex
disease-related haplotypes, we have collected a
considerable amount of information, which was
scattered in existing studies, and have developed a
database of complex disease-related haplotypes,
CDRH. It not only offers an easy-to-use interface to
query the valuable reference information concerning
haplotypes and diseases extracted from the literature,
but also integrates vast quantities of complementary
biological annotations from external databases. The
CDRH database clearly reflects the relationships
between haplotypes and complex diseases. Thus, it
facilitates the gathering of more comprehensive
information on complex disease-related haplotypes,
and at the same time, saves researchers the trouble of
searching multiple databases and a large body of
literature.
Currently, 1 125 haplotypes are documented in the
CDRH database, referring to the 22 autosomes,
chromosome X, chromosome Y, and the
mitochondrion. Figure 2a shows a histogram of the
number of complex disease-related haplotypes on each
chromosome, and Figure 2b shows the corresponding
histogram for complex disease-related genes. As is
evident from Figure 2, chromosome 6 carries the
largest number of haplotypes (431) and genes (39). In
particular, these haplotypes and genes are mainly
concentrated in the 6p21.3 region (74.36%). Some
previous studies indicated that this region is associated
with many complex immune diseases, such as type 1
diabetes (Noble et al., 1996; Hermann et al., 2003),
rheumatoid arthritis (Newton et al., 2004), rheumatic
heart disease (Hernandez-Pacheco et al., 2003), and
systemic lupus erythematosus (Vargas-Alarcon et al.,
2001). These results imply that certain complex
diseases share common biomarkers and that their
predisposing genes might interact functionally.
Further studies should deepen our understanding of
the 6p21.3 region.
Figure 2a also indicates that there are no complex
disease-related haplotypes located on chromosome 21.
This is because the literature contains no exact
haplotype information for chromosome 21.
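The per-chromosome tally behind a histogram such as Figure 2a can be sketched as follows. This is a minimal illustration, not the CDRH implementation: the record layout, the toy rows, and the function names haplotypes_per_chromosome and share are assumptions made for the example. Chromosomes with no records are kept with a count of zero, as is the case for chromosome 21 in Figure 2a.

```python
from collections import Counter

# Hypothetical toy records of the form (haplotype_id, chromosome).
# These are NOT CDRH data; they only illustrate the tallying step.
records = [
    ("hap1", "6"),
    ("hap2", "6"),
    ("hap3", "1"),
    ("hap4", "X"),
    ("hap5", "MT"),
]

# Chromosome labels in plotting order: 22 autosomes, X, Y, mitochondrion (MT).
CHROMOSOMES = [str(n) for n in range(1, 23)] + ["X", "Y", "MT"]


def haplotypes_per_chromosome(rows):
    """Count haplotypes per chromosome, keeping zero-count chromosomes."""
    counts = Counter(chrom for _, chrom in rows)
    return {chrom: counts.get(chrom, 0) for chrom in CHROMOSOMES}


def share(part, total):
    """Return part/total as a percentage rounded to two decimals."""
    return round(100.0 * part / total, 2)


per_chrom = haplotypes_per_chromosome(records)
```

Assuming the 431 haplotypes on chromosome 6 are counted out of all 1 125 documented haplotypes, share(431, 1125) puts chromosome 6 at roughly 38% of the total.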
Figure 2 The chromosomal distribution of complex
disease-related haplotypes and genes in the CDRH database
Note: MT, mitochondrion. (a) Histogram of the number of
complex disease-related haplotypes on each chromosome. (b)
Histogram of the number of complex disease-related genes on
each chromosome
To date, the CDRH database has records of 114
complex diseases. Table 2 shows summary statistics
for the top six complex diseases, ranked by number
of haplotypes. Each of these diseases involves at
least two populations and more than one chromosome
and gene, which implies that they are more common
than the others and may be caused by multiple genes.
Multiple sclerosis (Rosati, 2001) and rheumatoid
arthritis (Harris, 1990) each have at least two
studies from the literature in our database, which
might imply that researchers should pay more
attention to these diseases.
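A ranking of the kind reported in Table 2 could be derived along the following lines. This is only a sketch under stated assumptions: the row fields (disease, haplotype, population, chromosome, gene), the placeholder values, and the function name top_diseases are hypothetical and do not reflect the actual CDRH schema.

```python
from collections import defaultdict


def top_diseases(rows, top_n=6):
    """Rank diseases by number of distinct haplotypes and summarize each."""
    # disease -> field name -> set of distinct values seen for that field
    stats = defaultdict(lambda: defaultdict(set))
    for row in rows:
        for field in ("haplotype", "population", "chromosome", "gene"):
            stats[row["disease"]][field].add(row[field])

    table = [
        {
            "disease": disease,
            "haplotypes": len(fields["haplotype"]),
            "populations": len(fields["population"]),
            "chromosomes": len(fields["chromosome"]),
            "genes": len(fields["gene"]),
        }
        for disease, fields in stats.items()
    ]
    # Most haplotypes first; ties are broken alphabetically for a stable order.
    table.sort(key=lambda entry: (-entry["haplotypes"], entry["disease"]))
    return table[:top_n]


# Hypothetical toy rows with placeholder names; NOT taken from CDRH.
rows = [
    {"disease": "disease A", "haplotype": "h1", "population": "pop1",
     "chromosome": "6", "gene": "g1"},
    {"disease": "disease A", "haplotype": "h2", "population": "pop2",
     "chromosome": "1", "gene": "g2"},
    {"disease": "disease B", "haplotype": "h3", "population": "pop1",
     "chromosome": "6", "gene": "g1"},
]
summary = top_diseases(rows)
```

Counting distinct values per field makes it straightforward to check, for each listed disease, whether it involves at least two populations and more than one chromosome and gene.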