Cancer Genetics and Epigenetics 2015, Vol.3, No.13, 1-6
IGFBP-2 mRNA is already expressed in the preimplantation
embryo (Prelle et al., 2001), and expression continues
at high levels in many tissues during embryonic and
fetal development (Schuller et al., 1993; van Kleffens
et al., 1998). In the postnatal period, IGFBP-2 is the
second most abundant IGFBP in the circulation and is
present in various other biological fluids and tissues of
many vertebrate species (Blum et al., 1993; Hwa et al.,
1999).
Clinical studies have found that cancer patients with
high IGFBP2 expression often have a shorter
survival time (Busund et al., 2005; Lin et al., 2009;
Fukushima and Kataoka, 2007; Hsieh et al., 2010).
This study searched for the epigenetic regulatory
elements of IGFBP2 and then used them in a survival
analysis. We identified epigenetic regulatory elements
that could separate patients with a long survival time
from patients with a short survival time. These
elements could be used in clinical studies of drug
development for the treatment of cancer.
1 Method
1.1 Datasets
We obtained colon cancer data from TCGA, including
450k DNA methylation data, gene expression data and
clinical information. The gene expression data
comprise 273 cancer samples and 41 normal samples
from UNC IlluminaHiSeq RNASeqV2 level 3 data.
The 450k DNA methylation data comprise 339 samples
(301 cancer samples and 38 normal samples) from JHU-USC
HumanMethylation450 level 3 data. We retained the
samples that had both DNA methylation data and gene
expression data: 266 cancer samples and 19
normal samples.
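The matching step above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes samples are matched on the first 15 characters of the TCGA barcode (project, tissue source site, participant and sample-type code), and the barcodes in the usage note are made up. The function names are hypothetical.

```python
def sample_key(barcode):
    """Sample-level prefix of a TCGA barcode, e.g. 'TCGA-AA-0001-01'."""
    return barcode[:15]


def sample_type(barcode):
    """Two-digit sample-type code: 01-09 are tumor, 10-19 are normal."""
    return int(barcode[13:15])


def match_samples(expr_barcodes, meth_barcodes):
    """Return (tumor_keys, normal_keys) present in both data types."""
    expr_keys = {sample_key(b) for b in expr_barcodes}
    meth_keys = {sample_key(b) for b in meth_barcodes}
    shared = sorted(expr_keys & meth_keys)
    tumor = [k for k in shared if 1 <= sample_type(k) <= 9]
    normal = [k for k in shared if 10 <= sample_type(k) <= 19]
    return tumor, normal
```

For example, with the invented barcodes `TCGA-AA-0001-01A-01R-0000-07` (expression) and `TCGA-AA-0001-01A-01D-0000-05` (methylation), the shared key `TCGA-AA-0001-01` is reported as a tumor sample.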
1.2 Differential methylation analysis
IGFBP2 gene expression data were extracted, and
all the cancer samples were divided into two groups: a high
expression group and a low expression group. SAM
(Significance Analysis of Microarrays) is a statistical tool
for finding significant genes in microarray data sets. It
was used to identify differentially methylated sites in
both the high and the low IGFBP2 expression group.
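As a rough illustration of the statistic behind SAM, the sketch below computes the relative difference d for a single methylation site: the difference in group means divided by the pooled standard error plus a small constant s0. It is a simplified sketch under stated assumptions. The full SAM procedure also chooses s0 from the data and estimates a false discovery rate by permuting sample labels, neither of which is shown, and the source does not report the settings used. The function name and the default s0 are hypothetical.

```python
import math


def sam_statistic(group_a, group_b, s0=0.0):
    """SAM-style relative difference d for one site.

    group_a, group_b: methylation values of the two sample groups.
    s0: small positive constant that damps sites with tiny variance
        (assumed here to be supplied by the caller).
    """
    n_a, n_b = len(group_a), len(group_b)
    mean_a = sum(group_a) / n_a
    mean_b = sum(group_b) / n_b
    # Pooled sum of squared deviations around each group mean.
    ss = sum((x - mean_a) ** 2 for x in group_a)
    ss += sum((x - mean_b) ** 2 for x in group_b)
    # Standard error of the difference in means.
    s = math.sqrt((1 / n_a + 1 / n_b) * ss / (n_a + n_b - 2))
    return (mean_a - mean_b) / (s + s0)
```

A positive d indicates higher methylation in the first group and a negative d lower methylation. Sites where both groups have zero variance need a positive s0 to avoid division by zero.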
1.3 The screening of methylation sites and survival
analysis
The relationship between IGFBP2 gene expression
and the differentially methylated sites was measured with
the Pearson correlation coefficient. Using 1000
permutations of the Pearson correlation coefficient, we
obtained the DNA methylation sites that were highly
correlated with IGFBP2 gene expression. The chromosome
coordinates of these highly correlated DNA methylation sites
were then used to select the DNA methylation sites in the cis
regulation region of IGFBP2.
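A permutation test on the Pearson correlation coefficient can be sketched as below. The source states only that 1000 permutations were used, so the details here are assumptions: expression values are shuffled across samples, the correlation is recomputed each time, and the empirical p-value is the fraction of shuffles whose absolute correlation is at least as large as the observed one. The function names and the fixed seed are illustrative.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def permutation_pvalue(methylation, expression, n_perm=1000, seed=0):
    """Return (observed r, two-sided empirical p-value) for one site."""
    rng = random.Random(seed)
    observed = pearson(methylation, expression)
    shuffled = list(expression)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the sample pairing
        if abs(pearson(methylation, shuffled)) >= abs(observed):
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)
```

With 1000 permutations the smallest reportable p-value is 1/1001, so a stricter threshold would need more permutations.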
Cox regression analysis was used to identify the
DNA methylation sites that correlated with survival
time. We then used these DNA methylation sites to
plot survival curves.
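The source does not say how the survival curves were computed. A common choice is the Kaplan-Meier estimator, so the sketch below assumes it. It covers only the curve estimate for one patient group. The Cox regression fit itself is not shown, and the function name is hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times: follow-up time per patient.
    events: 1 if death was observed at that time, 0 if censored.
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve
```

To compare groups, patients could be split by the methylation level of a selected site (for instance at its median, which is an assumption) and one curve computed per group.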
2 Results
2.1 Data processing and analysis
To filter the epigenetic regulatory elements, we obtained
colon adenocarcinoma data from TCGA (The Cancer
Genome Atlas), comprising 450k DNA methylation data,
RNA-seq data and clinical information. All the
data were level 3 data that had been preprocessed by
TCGA. The RNA-seq data contained 314 samples,
comprising 273 cancer samples and 41 normal samples,
whereas the DNA methylation data contained 301 cancer
samples and 38 normal samples. We retained 285
samples with matching RNA-seq and DNA
methylation data.
We extracted the IGFBP2 expression values from the RNA-seq
data and split the cancer samples into a high expression
group and a low expression group at the mean
value of IGFBP2 expression. There were 100 cancer
samples in the high expression group and 166 cancer
samples in the low expression group, in addition to 19
normal samples.
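The mean-based grouping can be sketched as below. This is a minimal sketch: the function name is hypothetical, and placing samples exactly at the mean in the low group is an assumption, since the source does not say how ties were handled.

```python
def split_by_mean(expression):
    """Split samples at the mean expression value.

    expression: dict mapping sample ID to IGFBP2 expression value.
    Returns (mean, high_ids, low_ids). Values equal to the mean go to
    the low group (an assumption).
    """
    mean_value = sum(expression.values()) / len(expression)
    high = [s for s, v in expression.items() if v > mean_value]
    low = [s for s, v in expression.items() if v <= mean_value]
    return mean_value, high, low
```

A mean split need not give equal group sizes. The 100 versus 166 split reported above is consistent with a right-skewed expression distribution, in which a minority of samples with high values pulls the mean above the median.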
2.2 Differentially methylated sites
We used SAM to identify sites that were differentially
methylated between each expression group (high or low) and
the normal sample group (Table 1). In the
high expression group, we identified 78942 high
differentially methylated sites and 103567 low
differentially methylated sites. In the low expression
group, we identified 18504 high differentially methylated
sites and 73443 low differentially methylated sites.
2.3 Permutation
To find the highly correlated DNA methylation sites,
Pearson's correlation was calculated between the
differentially methylated sites and IGFBP2
expression. Permutation was also performed to get the