Computational Molecular Biology 2014, Vol. 4, No. 10, 1-17

http://cmb.biopublisher.ca

Then, co-localized and single-localized peaks overlapping with gene annotations were identified. The summary box plot is shown in Figure 1 (A). We noted that exon regions were not overrepresented in co-localized peaks compared with single-localized (control) peaks. For intron regions, the me1me2 group showed a significantly higher percentage than the control group, suggesting that me1me2 is probably a housekeeping mark for the gene body and is involved in transcriptional elongation.

For every combination involving me1, the percentage of peaks overlapping introns was higher than for the other groups. In addition, the 5′ UTR was depleted in co-localized signals compared with controls. For the 3′ UTR and TES (0k, 10k), all marks followed similar distributions. Compared with single-localized peaks, me1me2 and me2me3 co-localized peaks were significantly enriched within 1k upstream of TSSs, while me1me2 and me1me3 co-localized peaks were significantly enriched within (-10k, -1k) upstream of TSSs. We found that the me1- and me2-related co-localized groups were more frequent than the single-localized group within (1k, 10k) downstream of the TES, consistent with the previous observation that me1 and me2 signals tend to distribute towards the 3′ regions of genes (Zhang et al., 2009). The
inter- and intragenic distributions summarized from Figure 1 (A) are shown in Figure 1 (B). We found that me1me2 was located more often in intragenic regions and overlapped most with introns. Previous studies suggested that the first intron may harbor functional elements that control gene expression (Bradnam and Korf, 2008), which highlights the potential regulatory role of me1me2. We noted that co-localized peaks were overrepresented in introns. To measure the enrichment of introns in co-localized and single-localized peaks without bias, the intron/exon fold was calculated. The folds for me1me2, me1me2me3 and me1me3 were 13.74 ± 0.51, 9.53 ± 0.57 and 6.09 ± 0.21, respectively (all p < 1.8E-4), which were significantly larger than the 5.33 ± 0.05 for single-localized peaks. For me2me3, the fold was 5.19 ± 0.15 (p = 0.0539). From this result, me3-related marks were considered independent of intron localization.
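
The intron/exon fold described above can be illustrated with a minimal sketch. It assumes the fold is the fraction of peaks overlapping introns divided by the fraction overlapping exons, and that the reported ± values are a sample standard deviation over repeated samples; the text states neither explicitly. The function names and the input fractions are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def intron_exon_fold(intron_fraction, exon_fraction):
    """Fold of intron overlap over exon overlap for one sample of peaks."""
    if exon_fraction <= 0:
        raise ValueError("exon fraction must be positive")
    return intron_fraction / exon_fraction


def summarize_folds(pairs):
    """Return (mean, sample standard deviation) of the fold over samples.

    Each element of `pairs` is an (intron_fraction, exon_fraction) tuple.
    Using the standard deviation as the spread is an assumption.
    """
    folds = [intron_exon_fold(intron, exon) for intron, exon in pairs]
    return mean(folds), stdev(folds)


if __name__ == "__main__":
    # Hypothetical overlap fractions, for illustration only.
    example_pairs = [(0.60, 0.05), (0.66, 0.05), (0.63, 0.05)]
    fold_mean, fold_sd = summarize_folds(example_pairs)
    print(f"intron/exon fold = {fold_mean:.2f} ± {fold_sd:.2f}")
```

Normalizing intron overlap by exon overlap within the same peak group is what makes groups of different sizes comparable, which is presumably the bias the fold is meant to remove.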

2.2 Co-localized peaks except me1me2 are more phylogenetically conserved than single-localized peaks

It was of interest to explore whether co-localized peaks are more conserved than single-localized peaks. To characterize the conservation of the identified single-localized and co-localized peaks, two phastCons (pC) cutoffs were chosen. For each peak, the pC scores were averaged over the genomic positions with a pC score larger than the pC cutoff, and the peak’s conservation was represented by this average pC score. Importantly,
only peaks overlapping with annotated TPRs were taken into consideration. The high cutoff of 0.6 selects more strongly conserved peaks, while the cutoff of 0.2 indicates only weak conservation. Table 1 shows the percentages of conserved peaks passing each pC cutoff for the different co-localization types. To assess the overlap among peaks, an overlap rate (OR) was introduced; OR = 1 indicates that the two given peaks overlap completely. Suppose the OR and pC cutoffs are 1.0 and 0.6, respectively. As the cutoff
becomes looser, the percentage for all localization types decreases. The trend was straightforward: the variation with respect to the pC cutoff under the same OR cutoff was within [0.05, 0.08], indicating that the conservation level was robust. Except for me1me2, all types of co-localized peaks were more conserved than single-localized peaks, in accord with the fact that me1 and me2 marks are not as stable as me3 marks, which are generally stable. Notably, the H3K4me
triplet, in which all three marks co-localize at the same positions, was the most conserved. As the OR cutoff became more stringent, the percentage for the H3K4me triplet type became smaller. Even under the most stringent conservation cutoff, a pC cutoff of 1.0, the conclusion still held (for details see Supplementary Table 2 and Supplementary Table 3). H3K4me3 and H3K27me3 co-localization was previously reported to be even more conserved than either K4 or K27 single-localization in human embryonic stem cells (Zhao et al., 2007), which is consistent with our results.
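
The per-peak conservation score described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each peak is represented as a list of per-position pC scores, and a peak is counted as conserved when at least one of its positions exceeds the cutoff. The second criterion is an assumption, since the text does not say how a peak passes the cutoff. The function names and the toy scores are hypothetical.

```python
from statistics import mean


def peak_conservation(pc_scores, pc_cutoff):
    """Mean pC score over the positions scoring above pc_cutoff.

    Returns None when no position in the peak exceeds the cutoff.
    """
    passing = [score for score in pc_scores if score > pc_cutoff]
    return mean(passing) if passing else None


def percent_conserved(peaks, pc_cutoff):
    """Percentage of peaks with a defined conservation score.

    Assumed criterion: a peak passes if any position exceeds pc_cutoff.
    """
    if not peaks:
        return 0.0
    conserved = sum(
        1 for scores in peaks
        if peak_conservation(scores, pc_cutoff) is not None
    )
    return 100.0 * conserved / len(peaks)


if __name__ == "__main__":
    # Hypothetical per-position pC scores for three peaks.
    toy_peaks = [[0.1, 0.7, 0.9], [0.1, 0.15], [0.3, 0.5]]
    for cutoff in (0.2, 0.6):
        print(cutoff, round(percent_conserved(toy_peaks, cutoff), 1))
```

Averaging only over positions above the cutoff means that a peak's score reflects its conserved core rather than being diluted by unconserved flanking positions.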