Cancer Genetics and Epigenetics 2018, Vol.6, No.5, 33-39
On the other hand, the bisulfite-conversion-based techniques for DNA methylation detection described above
measure the methylation status of a cell population as a whole. Although the cells of an organism share a single
genome, the epigenome differs from cell to cell. Recently, the development of single-cell sequencing technology
has made it possible to observe dynamic changes in DNA methylation at the single-cell level. For example, in
2013, Guo et al. implemented RRBS at the single-cell level (Down et al., 2008). In 2014, Smallwood et al.
developed the scBS-seq technique (Stelzer et al., 2015), which can detect more than half of the methylome in a
single cell.
Data from sequencing of other cytosine modifications and from single-cell methylation sequencing are still very
scarce, whereas WGBS data are accumulating as sequencing costs fall and the technology matures. Although RRBS
sample sizes are also sufficient, the methylation sites detected by RRBS are poorly consistent between samples,
so RRBS is not recommended for the joint analysis of multiple methylomes. Therefore, in this review, we focus on
the strategies and challenges of WGBS data analysis.
2 From Reads to Methylation Levels: Read Alignment
Deriving single-base methylation levels from the raw reads generated by the sequencing platform requires three
steps: (i) quality control of the raw reads; (ii) alignment of the clean reads to the reference genome; (iii)
calculation of the methylation level of each cytosine from the reads covering it. For a given cytosine, the
methylation level is the ratio of the number of methylated reads to the total number of reads covering it. The
detailed alignment procedures and the factors affecting the accuracy of methylation level calculation have been
reviewed by Felix Krueger et al. It is recommended to store genome-wide methylation levels in the variable-step
wiggle format or its binary form, bigWig (as in the Roadmap and ENCODE projects), to facilitate subsequent
visualization of methylation in web genome browsers (such as UCSC) or local genome browsers (e.g. IGV).
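Step (iii) and the recommended storage format can be illustrated with a minimal sketch. It assumes that the
per-read methylation calls have already been extracted from the alignments; the input tuples, the function names
and the toy data are hypothetical and are not taken from any particular tool.

```python
from collections import defaultdict


def methylation_levels(calls):
    """Compute per-cytosine methylation levels.

    calls: iterable of (chrom, pos, is_methylated), one entry per read
    covering a cytosine (hypothetical input layout, 1-based positions).
    Returns {chrom: {pos: (level, depth)}}.
    """
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))  # [methylated, total]
    for chrom, pos, is_methylated in calls:
        site = counts[chrom][pos]
        site[0] += int(is_methylated)
        site[1] += 1
    return {
        chrom: {pos: (m / n, n) for pos, (m, n) in sites.items()}
        for chrom, sites in counts.items()
    }


def to_variable_step_wiggle(levels):
    """Serialize levels as variable-step wiggle text (1-based positions)."""
    lines = []
    for chrom in sorted(levels):
        lines.append(f"variableStep chrom={chrom}")
        for pos in sorted(levels[chrom]):
            level, _depth = levels[chrom][pos]
            lines.append(f"{pos} {level:.3f}")
    return "\n".join(lines) + "\n"


# Toy data: four reads cover position 100 (three methylated), two cover 250.
toy_calls = [
    ("chr1", 100, True), ("chr1", 100, True),
    ("chr1", 100, False), ("chr1", 100, True),
    ("chr1", 250, False), ("chr1", 250, False),
]
toy_levels = methylation_levels(toy_calls)
wig_text = to_variable_step_wiggle(toy_levels)
```

The wiggle text stores only the level, so the read depth would have to be kept in a separate track or table if it
is needed for later filtering.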
Because DNMT1-mediated maintenance of DNA methylation keeps the methylation status of most CpG sites in the
genome symmetric between the positive and negative strands, reads from the two strands are usually combined to
increase read depth. For diploid mammals, when a CpG site is methylated on one allele and unmethylated on the
other, the methylation level is expected to be 0.5.
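Combining the two strands can be sketched as follows. The sketch assumes that counts are keyed by the 1-based
position of the cytosine on each strand, so that the negative-strand cytosine of a CpG lies one base downstream
of the positive-strand cytosine; the function name and the data layout are hypothetical.

```python
def merge_cpg_strands(plus_counts, minus_counts):
    """Pool read counts from both strands of each CpG.

    plus_counts:  {pos: (methylated, total)} keyed by the C on the + strand.
    minus_counts: {pos: (methylated, total)} keyed by the C on the - strand,
                  which sits at (+ strand position + 1) for the same CpG.
    Returns {plus_pos: (methylated, total, level)}.
    """
    merged = {pos: [m, n] for pos, (m, n) in plus_counts.items()}
    for pos, (m, n) in minus_counts.items():
        site = merged.setdefault(pos - 1, [0, 0])  # shift to + strand coordinate
        site[0] += m
        site[1] += n
    # Sites without any covering read have no defined level and are skipped.
    return {pos: (m, n, m / n) for pos, (m, n) in merged.items() if n > 0}


# Toy counts: the CpG at 100/101 is covered on both strands,
# the CpG at 200/201 only on the negative strand.
plus_strand = {100: (3, 4)}
minus_strand = {101: (4, 6), 201: (1, 2)}
merged_sites = merge_cpg_strands(plus_strand, minus_strand)
```

In this toy example the pooled site at position 100 has depth 10 instead of 4 or 6. Pooling is only appropriate
where strand symmetry can be assumed, which is not the case for non-CpG contexts.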
Because bisulfite conversion cannot distinguish 5hmC from 5mC, or 5fC/5caC from unmodified cytosine, the measured
methylation level is affected by the three cytosine modifications involved in demethylation. Compared with 5mC,
5hmC and 5fC/5caC account for a very small proportion: in mouse embryonic stem cells, 5hmC accounts for about 5%
and 5fC/5caC for about 2% (Figure 2B). The methylation level distributions of 5hmC and 5fC/5caC peak below 0.5
(Figure 2A), whereas that of 5mC peaks close to 1, indicating that cells of the same type are not demethylated
synchronously: a given cytosine may be 5mC or 5hmC in some cells and 5fC, 5caC or 5mC in others (Figure 2C).
Furthermore, 5hmC is not recognized by DNMT1 and therefore cannot be maintained during DNA replication. Strand
asymmetry of 5hmC and 5fC has been observed in mouse embryonic stem cells, suggesting that the strand
differences previously observed at CpG sites may reflect DNA demethylation in progress at the time of detection.
The methylation level in bulk WGBS is also affected by cell-to-cell differences. Cellular heterogeneity in DNA
methylation arises mainly from two sources: (i) epigenomic differences between cell types. For example, blood
contains multiple cell types, and DNA methylation is cell-type specific at certain loci. (ii) Ongoing DNA
methylation and demethylation processes, of which WGBS captures only a static snapshot.
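The effect of cell-type composition on a bulk measurement can be illustrated with a small calculation. It
assumes that every cell contributes reads in proportion to its abundance, so that the expected bulk level at a
site is the abundance-weighted mean of the cell-type-specific levels; the cell-type names and numbers are
invented for illustration.

```python
def expected_bulk_level(fractions, levels_by_type):
    """Abundance-weighted mean methylation level at one site.

    fractions:      {cell_type: relative abundance} (need not sum to exactly 1).
    levels_by_type: {cell_type: methylation level at the site, in [0, 1]}.
    """
    total = sum(fractions.values())
    weighted = sum(frac * levels_by_type[ct] for ct, frac in fractions.items())
    return weighted / total


# Hypothetical mixture of three cell types at a cell-type-specific site.
mixture = {"type_A": 0.6, "type_B": 0.3, "type_C": 0.1}
site_levels = {"type_A": 1.0, "type_B": 0.0, "type_C": 0.5}
bulk_level = expected_bulk_level(mixture, site_levels)  # approximately 0.65
```

An intermediate bulk level such as this one cannot by itself be attributed to a single cause: it may reflect a
mixture of fully methylated and unmethylated cell types, allele-specific methylation, or ongoing demethylation.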
Before downstream analysis, CpG sites with low read coverage need to be removed, because a larger number of
reads quantifies the overall methylation status of the cell population more accurately and is less affected by
sequencing and alignment errors. Previous studies have shown that, except for brain tissue and embryonic stem
cells, non-CpG methylation (CHH and CHG methylation, where H denotes A, C or T)