Cancer Genetics and Epigenetics 2018, Vol.6, No.5, 33-39
Table 1 The list of methylation pattern regions
Regions | Methylation Level | Representative Genomic Location | Identification Strategy | Length | Year
CGI | normally unmethylated | promoter | sequence feature | ~300bp | 1986
CpG shores | normally tissue-specific | adjacent to CGI | sequence feature | ~2kb | 2009
PMD | ≤70% | span gene | sliding window | ~153kb | 2009
HMR | low | promoter | HMM | ~1.8kb | 2011
UMR | <13% | promoter | HMM | ~2.1kb | 2011
LMR | 13.9%-50% | enhancer | HMM | ~700bp | 2011
FMR | >50% | gene body, intergenic | HMM | - | 2011
DMV | ≤15% | span gene | sliding window | ≥5kb | 2013
Canyon | ≤10% | span gene | HMM | ≥3.5kb | 2014
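As a rough illustration of how the methylation-level column of Table 1 separates UMR, LMR and FMR, the following minimal sketch labels an average methylation level using only the cut-offs printed in the table. The function name is hypothetical, and fixed cut-offs are a simplification: the table lists an HMM as the identification strategy for these regions, not thresholding.

```python
def label_by_level(level_percent):
    """Label an average methylation level (in percent) with the Table 1 cut-offs.

    Illustration only: the cited methods segment the genome with an HMM;
    this function merely applies the printed level ranges.
    """
    if not 0 <= level_percent <= 100:
        raise ValueError("methylation level must be between 0 and 100")
    if level_percent < 13:
        return "UMR"
    if 13.9 <= level_percent <= 50:
        return "LMR"
    if level_percent > 50:
        return "FMR"
    # Values from 13 up to 13.9 are not covered by the ranges as printed.
    return "unassigned"


for level in (5, 30, 80):
    print(level, label_by_level(level))
```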
When the cell population profiled by WGBS contains more than one cell type, as in blood, cell type-specific
methylation at a CpG site means that the measured methylation level no longer shows the usual pattern of
ultra-high or ultra-low values. Instead, the measured level depends on the proportion of each cell type. In
addition, the presence of other cytosine modifications such as 5hmC makes the methylation status of a region
difficult to assess. Cellular heterogeneity and demethylation-related cytosine modifications mainly affect
regions with intermediate methylation levels. Allele-specific methylation regions are another class of region
with intermediate methylation levels.
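The dependence on cell-type proportions can be illustrated with a minimal sketch. It assumes that the bulk methylation level at a CpG site is simply the proportion-weighted average of the levels in each cell type; the function name and the example numbers are hypothetical.

```python
def mixed_methylation(proportions, levels):
    """Expected bulk methylation level at one CpG site.

    proportions: fraction of each cell type in the sample (must sum to 1)
    levels: methylation level (0..1) of the site in each cell type
    Assumption: reads are sampled from cell types in proportion to abundance.
    """
    if len(proportions) != len(levels):
        raise ValueError("proportions and levels must have the same length")
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    return sum(p * m for p, m in zip(proportions, levels))


# Hypothetical site: fully methylated in one cell type, unmethylated in two others.
bulk = mixed_methylation([0.6, 0.3, 0.1], [1.0, 0.0, 0.0])
print(round(bulk, 3))  # an intermediate level, although each cell type is 0 or 1
```

Under this assumption, a site that is strictly methylated or unmethylated in every individual cell type can still appear intermediate in the bulk measurement.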
The effect of methylation on DNA sequence-protein interactions depends on the number of CpG sites and on the
type of cytosine modification they carry. It is therefore recommended that identified methylation pattern
regions be further classified by properties such as CpG density and length.
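A further classification by CpG density and length could look like the following minimal sketch. The function names and both cut-off values are illustrative assumptions, not values taken from the text.

```python
def cpg_density(seq):
    """Number of CpG dinucleotides per base of sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return seq.count("CG") / len(seq)


def classify_region(seq, density_cutoff=0.05, length_cutoff=1000):
    """Label a region by CpG density and by length.

    density_cutoff and length_cutoff are hypothetical defaults chosen
    only for illustration.
    """
    density_label = "CpG-rich" if cpg_density(seq) >= density_cutoff else "CpG-poor"
    length_label = "long" if len(seq) >= length_cutoff else "short"
    return density_label, length_label


print(classify_region("CGCGCGATAT"))
```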
4 Multi-sample Methylation Spectrum
WGBS datasets usually contain few samples but a very large number of sites, so the methods commonly used to
identify differentially methylated sites in microarray methylation data cannot be applied directly to
WGBS data.
For methylome comparisons of paired samples, such as normal and cancer samples, the conventional approach to
identifying differentially methylated regions is to segment the genome into small windows. Although this
approach recognizes differential methylation only to a limited extent genome-wide, merging differentially
methylated sites into differentially methylated regions is expected to be feasible.
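The window-based comparison of a sample pair can be sketched as follows. This is a simplified illustration, assuming per-site methylation levels between 0 and 1, a fixed window size, and a plain difference cut-off; the function names, the window size, and the cut-off are hypothetical, and a practical analysis would also use read counts and a statistical test.

```python
def window_means(sites, window):
    """Average methylation per fixed-size window.

    sites: list of (position, level) tuples for one sample
    Returns {window_index: mean level} for windows that contain sites.
    """
    sums = {}
    for pos, level in sites:
        idx = pos // window
        total, n = sums.get(idx, (0.0, 0))
        sums[idx] = (total + level, n + 1)
    return {idx: total / n for idx, (total, n) in sums.items()}


def differential_regions(normal, tumor, window=1000, min_diff=0.3):
    """Find windows whose mean levels differ, then merge adjacent windows."""
    a = window_means(normal, window)
    b = window_means(tumor, window)
    # Only windows covered in both samples can be compared.
    hits = sorted(i for i in a.keys() & b.keys() if abs(a[i] - b[i]) >= min_diff)
    regions = []
    for i in hits:
        start, end = i * window, (i + 1) * window
        if regions and regions[-1][1] == start:
            regions[-1] = (regions[-1][0], end)  # extend the previous region
        else:
            regions.append((start, end))
    return regions


normal = [(100, 0.9), (600, 0.8), (1200, 0.9), (2500, 0.1), (5100, 0.9)]
tumor = [(100, 0.2), (600, 0.1), (1200, 0.3), (2500, 0.1), (5100, 0.2)]
print(differential_regions(normal, tumor))
```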
It is worth noting that identifying differentially methylated regions requires paired samples or two groups of
samples and does not apply to comparisons among the methylomes of multiple samples. When comparing the
methylomes of multiple samples, it is recommended to use predefined methylation regions as a baseline.
The boundaries of genomic features such as promoters and exons depend on the transcription start site and
the splice sites, so these features are not suitable as reference regions for methylation. By comparison, CpG
islands, which are identified from sequence features, are more suitable. However, CpG islands contain only a
small proportion of CpG sites, and most CpG islands are unmethylated in most tissues and cell types, so they
cannot fully reflect the methylome.
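Using predefined regions as a common baseline for several samples can be sketched as below. The sketch assumes half-open region coordinates and per-site levels between 0 and 1; the function name, sample names, and data are hypothetical.

```python
def region_means(regions, samples):
    """Mean methylation of each predefined region in each sample.

    regions: list of (start, end) tuples, half-open intervals
    samples: {sample_name: [(position, level), ...]}
    Returns {sample_name: [mean level or None per region]};
    None marks a region with no covered site in that sample.
    """
    table = {}
    for name, sites in samples.items():
        row = []
        for start, end in regions:
            values = [level for pos, level in sites if start <= pos < end]
            row.append(sum(values) / len(values) if values else None)
        table[name] = row
    return table


regions = [(0, 1000), (1000, 2000)]
samples = {
    "s1": [(10, 0.2), (20, 0.4), (1500, 1.0)],
    "s2": [(10, 0.8)],
}
print(region_means(regions, samples))
```

Because every sample is summarized over the same regions, the resulting rows are directly comparable across any number of samples.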
5 Conclusion
The development of next-generation sequencing technology has greatly advanced research on DNA
methylation. However, the methods for analyzing and processing high-throughput methylation data are still
imperfect. Bisulfite conversion-based next-generation sequencing can detect methylated cytosines and,
combined with bioinformatics methods, yields the location and methylation level of methylated sites across
the whole genome. Algorithms then connect CpG sites with similar characteristics into methylation pattern
regions with regulatory effects, such as CpG islands. Comparing the methylomes of multiple samples can
identify differentially methylated regions and thereby genomic elements with regulatory functions. It is
worth noting that the rational use of DNA methylation data is still a challenge due to the