
Cancer Genetics and Epigenetics 2016, Vol.4, No.2, 1-9
Research Article Open Access
Identification of Differentially Expressed Genes and Prognostic Biomarkers of
Breast Cancer Based on RNA-Seq and KEGG Pathway Network
S.M. Zhang 1, Y. Gu 1, S.Y. Wu 2, Y. Kang 3, S. Liu 3, D. Zhang 3
1. College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
2. Software College, East China University of Technology, Nanchang, 330013, China
3. The 2nd Affiliated Hospital, Harbin Medical University, Harbin, 150081, China
Corresponding author email:
Cancer Genetics and Epigenetics 2016, Vol.4, No.2 doi:
Received: 25 Jul., 2016
Accepted: 26 Jul., 2016
Published: 18 Oct., 2016
Copyright © 2016. This is an open access article published under the terms of the Creative Commons Attribution License, which permits unrestricted use,
distribution, and reproduction in any medium, provided the original work is properly cited.
Preferred citation for this article:
Zhang S.M., Gu Y., Wu S.Y., Kang Y., Liu S. and Zhang D., 2016, Identification of Differentially Expressed Genes and Prognostic Biomarkers of Breast Cancer
Based on RNA-Seq and KEGG Pathway Network, Cancer Genetics and Epigenetics, 4(2): 1-9 (doi: )
Abstract
The occurrence of breast cancer is a complex biological process regulated by multiple genes. Differences in tumor-cell gene
expression between patients determine differences in treatment and prognosis. Investigating the characteristics of breast cancer
at the genetic level, including the identification of differentially expressed genes and prognostic markers, will therefore
facilitate the development of appropriate and effective treatment.
This study obtained RNA-Seq Level 3 gene expression data from the TCGA database and used the SAM algorithm to find
differentially expressed genes. Next, the DAVID bioinformatics tool was employed to analyze the function of these genes and to
obtain their significantly enriched pathways. Gene interaction information was then extracted from the pathways and integrated to
build a KEGG pathway network, whose topology was analyzed. The hub nodes extracted from the network were taken as candidate
genes. Genes with a significant impact on survival were identified using a Cox proportional hazards regression model. These
genes were entered into a multivariate analysis to calculate a risk score for each sample, according to which samples were divided
into a high-risk group and a low-risk group. The survival difference between the two groups was analyzed using the Kaplan-Meier
method, and the log-rank test was used to assess statistical significance. Analysis of the TCGA gene expression dataset yielded a
total of 5880 differentially expressed genes. Enrichment analysis identified eight significant pathways. The gene interaction
information extracted from these pathways was used to build a KEGG pathway network, from which 32 candidate genes were
obtained. Three genes (AARS, ADK, and ADORA2A) with a significant impact on the prognosis of breast cancer were identified by
Cox proportional hazards regression. These three genes can be used as new prognostic biomarkers in breast cancer and may provide
guidance for its treatment; of these, AARS has already been shown to be associated with breast cancer risk. By multivariate
analysis, this study divided breast cancer samples into a high-risk group and a low-risk group, and there was a significant
difference between them.
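The risk-score grouping and survival comparison described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The coefficients, expression values, and follow-up data are made up for illustration. The median cut-off is an assumption, since the abstract does not state how the two groups were separated. The function names (`risk_scores`, `split_by_median`, `kaplan_meier`, `logrank_chi2`) are hypothetical.

```python
import math
from statistics import median

def risk_scores(expr, coefs):
    """Risk score per sample: sum over genes of coefficient * expression."""
    return [sum(coefs[g] * row[g] for g in coefs) for row in expr]

def split_by_median(scores):
    """Label samples 'high' if above the median score, else 'low' (assumed cut-off)."""
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]

def kaplan_meier(times, events):
    """Kaplan-Meier estimate: list of (time, survival) at each distinct event time."""
    surv, curve = 1.0, []
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e)
        surv *= 1 - deaths / at_risk
        curve.append((t, surv))
    return curve

def logrank_chi2(times, events, groups):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    obs_minus_exp, var = 0.0, 0.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        n = sum(1 for x in times if x >= t)                      # at risk overall
        n1 = sum(1 for x, g in zip(times, groups) if x >= t and g == "high")
        d = sum(1 for x, e in zip(times, events) if x == t and e)  # events at t
        d1 = sum(1 for x, e, g in zip(times, events, groups)
                 if x == t and e and g == "high")
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / var

def chi2_1df_pvalue(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

# Hypothetical coefficients for the three genes named in the abstract.
coefs = {"AARS": 0.8, "ADK": -0.5, "ADORA2A": 0.3}
# Made-up expression values, follow-up times (months) and event flags (1 = death).
expr = [
    {"AARS": 2.0, "ADK": 0.5, "ADORA2A": 1.0},
    {"AARS": 1.8, "ADK": 0.7, "ADORA2A": 1.2},
    {"AARS": 1.9, "ADK": 0.4, "ADORA2A": 0.9},
    {"AARS": 0.6, "ADK": 1.5, "ADORA2A": 0.4},
    {"AARS": 0.5, "ADK": 1.8, "ADORA2A": 0.3},
    {"AARS": 0.7, "ADK": 1.6, "ADORA2A": 0.5},
]
times = [10, 14, 20, 35, 48, 60]
events = [1, 1, 1, 1, 0, 1]

groups = split_by_median(risk_scores(expr, coefs))
chi2 = logrank_chi2(times, events, groups)
print(groups, round(chi2, 3), round(chi2_1df_pvalue(chi2), 4))
```

In practice the coefficients would come from the fitted multivariate Cox model, and a survival analysis library would normally replace the hand-written estimators; the pure-Python versions are used here only to keep the sketch self-contained.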
Keywords
Breast Cancer; Differentially Expressed; KEGG Pathway Network; Gene; Prognosis
Background
Breast cancer is a heterogeneous disease originating in breast tissue. It is the most common cancer in
women, accounting for up to 25 percent of cases. The main risk factors for breast cancer are female gender and age.
Other potential risk factors include genetic factors, infertility or lack of breastfeeding, high hormone levels, diet and
obesity. Recent studies have shown that exposure to environmental pollution is also a risk factor for breast cancer.
Genetic susceptibility plays a minor role in most cases; overall, however, genetic factors are thought to be
the main cause in 5-10% of patients. In less than 5% of patients, genetic factors play a
significant role by causing hereditary breast-ovarian cancer syndrome. This group includes patients who carry
BRCA1 and BRCA2 gene mutations. These mutations account for 90% of genetic factors, and affected patients
have a 60-80% risk of breast cancer. Other notable mutations include P53, PTEN,
ATM, PALB2 and CHEK2. Mutations that cause breast cancer have been confirmed by experiments association