Computational Molecular Biology 2014, Vol. 4, No. 10, 1-17
http://cmb.biopublisher.ca
Then, co-localized and single-localized peaks overlapping
with gene annotations were identified. The results are
summarized as box plots in Figure 1 (A). Exon regions
were not overrepresented in co-localized peaks compared
with single-localized (control) peaks. For intron regions,
the me1me2 group showed a significantly higher percentage
than the control group, suggesting that me1me2 was
probably a housekeeping mark for gene bodies and was
involved in transcriptional elongation.
For every combination involving me1, the percentage
of overlapping introns was higher than for the others. In
addition, 5′ UTRs were depleted in co-localized signals
compared with controls. For 3′ UTR elements and the
TES (0k, 10k) region, all marks followed similar
distributions. Compared with single-localized peaks,
me1me2 and me2me3 co-localized peaks were significantly
more prominent within 1k upstream of TSSs, while me1me2
and me1me3 co-localized peaks were significantly more
prominent within (-10k, -1k) upstream of TSSs. The me1-
and me2-related co-localized groups were distributed more
than the single-localized group within (1k, 10k) downstream
of the TES, consistent with the previous observation that
me1 and me2 signals tended to distribute towards the 3′
regions of genes. The
landscape of intergenic and intragenic distributions,
re-summarized from Figure 1 (A), is shown in Figure 1
(B). We found that me1me2 was located more often in
intragenic regions and overlapped most with intron
elements. Previous studies suggested that the
first intron may harbor functional elements that control
gene expression, which
highlighted the potential regulatory role of
me1me2. We noted that co-localized peaks were
overrepresented in introns. To obtain an unbiased measure of the
enrichment of introns in co-localized and
single-localized peaks, the intron/exon fold was
calculated. The folds for me1me2, me1me2me3 and
me1me3 were 13.74 ± 0.51, 9.53 ± 0.57 and 6.09 ± 0.21,
respectively (all p < 1.8E-4), which were
significantly larger than the 5.33 ± 0.05 obtained for
single-localized peaks. For me2me3, the fold was 5.19 ±
0.15 (p = 0.0539). From this result, me3-related marks
were considered independent of intron localization.
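
The intron/exon fold described above can be illustrated with a minimal Python sketch. The text does not give the exact formula, so the sketch assumes that the fold is the percentage of peaks overlapping introns divided by the percentage overlapping exons, summarized as mean and sample standard deviation over repeated measurements. The function names and the input numbers are hypothetical and are not the paper's data.

```python
from statistics import mean, stdev

def intron_exon_fold(intron_pct, exon_pct):
    """Ratio of intron-overlapping to exon-overlapping peak percentages.

    This definition is an assumption; the source only names the quantity.
    """
    if exon_pct <= 0:
        raise ValueError("exon percentage must be positive")
    return intron_pct / exon_pct

def summarize_folds(pairs):
    """Return (mean, sample standard deviation) of the fold.

    `pairs` is a sequence of (intron_pct, exon_pct) tuples, one per
    repeated measurement; at least two are needed for the deviation.
    """
    folds = [intron_exon_fold(i, e) for i, e in pairs]
    return mean(folds), stdev(folds)

# Hypothetical percentages, for illustration only.
example_pairs = [(60.0, 5.0), (55.0, 5.0), (65.0, 5.0)]
fold_mean, fold_sd = summarize_folds(example_pairs)
print(f"fold = {fold_mean:.2f} ± {fold_sd:.2f}")
```

Normalizing by the exon percentage removes the dependence on the overall number of genic peaks in each group, which is one plausible reading of why the fold gives a less biased comparison than the raw intron percentage.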
2.2 Co-localized peaks except me1me2 are more
phylogenetically conserved than single-localized peaks
We next explored whether co-localized
peaks were more conserved than single-localized
peaks. To characterize the conservation of the
identified single-localized and co-localized peaks,
two phastCons (pC) cutoffs were chosen. For each
peak, the pC scores were averaged over the genomic
positions whose pC score exceeded the pC cutoff, and
the peak’s conservation was represented by this
average pC score. Above all,
only peaks overlapping with annotated TPRs were
taken into consideration. The high cutoff of 0.6
focuses on more strongly conserved peaks, while the
cutoff of 0.2 requires only weak conservation. Table 1
shows the percentages of conserved peaks passing each
pC cutoff for the different co-localization types. To
assess the overlap among peaks, an overlap rate (OR)
was introduced; OR = 1 indicates that the two given
peaks overlap completely. Suppose the OR and
pC cutoffs are 1.0 and 0.6, respectively. As the cutoff
became looser, the percentage for all localization
types decreased. The trend in percentage was
straightforward: the range of variation with respect to
the pC cutoff under the same OR cutoff was within
[0.05, 0.08], indicating that the conservation level was
robust. Except for me1me2, the other types of co-localized
peaks were more conserved than single-localized
peaks, in accord with the fact that me1
and me2 were not as stable as me3 groups, whereas me3
marks were generally stable. Notably, the H3K4me
triplet, in which all three marks co-localize at
the same positions, was the most conserved. As the OR
cutoff became more stringent, the percentage for
the H3K4me triplet type became smaller. Even under
the most stringent conservation cutoff, when the pC
cutoff was set to 1.0, the conclusion still held
(for details, see Supplementary Table 2 and
Supplementary Table 3). H3K4me3 and H3K27me3
co-localization was previously reported to be even
more conserved than either K4 or K27
single-localization in human embryonic stem cells,
which was consistent with our results.
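
The conservation scoring used in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. Per-base phastCons scores are assumed to be available as a list for each peak. A peak is assumed to count as conserved when at least one position exceeds the pC cutoff. The overlap rate is assumed to be the intersection length divided by the length of the shorter peak, since the text only states that OR = 1 means complete overlap. All function names and input values are hypothetical.

```python
def peak_conservation(pc_scores, pc_cutoff):
    """Average the phastCons scores of positions scoring above pc_cutoff.

    Returns None when no position in the peak passes the cutoff.
    """
    passing = [s for s in pc_scores if s > pc_cutoff]
    if not passing:
        return None
    return sum(passing) / len(passing)

def conserved_fraction(peaks, pc_cutoff):
    """Fraction of peaks with at least one position above pc_cutoff.

    The "at least one position" criterion is an assumption.
    """
    if not peaks:
        return 0.0
    conserved = sum(
        1 for scores in peaks
        if peak_conservation(scores, pc_cutoff) is not None
    )
    return conserved / len(peaks)

def overlap_rate(a, b):
    """Overlap of two half-open intervals (start, end).

    Assumed definition: intersection length over the shorter interval,
    so that full containment gives 1.0 and disjoint intervals give 0.0.
    """
    inter = min(a[1], b[1]) - max(a[0], b[0])
    shorter = min(a[1] - a[0], b[1] - b[0])
    if inter <= 0 or shorter <= 0:
        return 0.0
    return inter / shorter

# Hypothetical per-base scores for three peaks, for illustration only.
peaks = [[0.1, 0.7, 0.9], [0.05, 0.1, 0.15], [0.3, 0.25, 0.65]]
for cutoff in (0.2, 0.6):
    print(cutoff, conserved_fraction(peaks, cutoff))
print(overlap_rate((100, 200), (150, 300)))
```

A stricter overlap definition, such as intersection over union, would give OR = 1 only for identical peaks; which of the two the original analysis used is not stated in the text.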