Genomics and Applied Biology 2017, Vol.8, No.5, 30-48
Table 9 The main clusters and top terms in the reference co-citation network (1985-2009)
Cluster  Size  Silhouette  Top terms (log-likelihood ratio, p-level)  Average year of publication
0        30    0.793       YAC contig map                             2002
1        17    0.936       Yeast artificial chromosome (YAC)          1988
2        14    0.748       Thermophile                                1995
3        10    1           Cofactor biosynthesis                      1985
4        9     1           Severe combined immunodeficiency           1988
5        8     0.899       Gene expression pattern                    1996
Table 10 The main clusters and top terms in the reference co-citation network (2010-2014)
Cluster  Size  Silhouette  Top terms (log-likelihood ratio, p-level)  Average year of publication
0        16    0.658       Single molecule                            2006
1        15    0.851       Quantitative Trait Locus (QTL) analysis    2010
2        13    0.732       Teleost                                    2006
3        11    0.912       Copy number variation                      2007
4        6     0.961       Genome duplication                         2007
Table 11 The main clusters and top terms in the reference co-citation network (2015-2016)
Cluster  Size  Silhouette  Top terms (log-likelihood ratio, p-level)  Average year of publication
0        24    0.790       Field gel electrophoresis                  2010
1        17    0.937       Adaptation                                 2011
2        16    0.761       Alcolapia grahami                          2011
3        16    0.829       Next generation sequencing                 2010
4        13    0.927       Nonapoptotic cell death                    2012
5        8     0.972       Crangon crangon                            2011
6        6     0.918       Zinc finger nuclease                       2013
7        6     0.910       Venous thrombosis                          2012
8        5     1           Draft genome                               2009
Table 12 The top 10 references with the highest centrality
Rank  Centrality  Author       Year  Source                            Volume  Page   Cluster
1     0.38        Yang ZH      2007  Molecular Biology and Evolution   V24     P1586  2b, 0c
2     0.37        Dunham I     2012  Nature                            V489    P57    3b, 6c
3     0.36        Cong L       2013  Science                           V339    P819   6c
4     0.33        Li H         2009  Bioinformatics                    V25     P2078  1b, 3c
5     0.29        Wang T       2014  Science                           V343    P80    6c
6     0.28        Altshuler D  2010  Nature                            V467    P1061  3b
7     0.27        Barretina J  2012  Nature                            V483    P603   4c
8     0.26        Langmead B   2009  Genome Biology                    V10     R25.1  1b, 2c
9     0.24        Zerbino DR   2008  Genome Research                   V18     P821   0b, 0c
10    0.2         Overbeek R   2014  Nucleic Acids Research            V42     D206   0c
10    0.2         Altschul SF  1990  Journal of Molecular Biology      V215    P403   2a
Note: a refers to the cluster in the reference co-citation network (1985-2009); b refers to the cluster in the reference co-citation
network (2010-2014); c refers to the cluster in the reference co-citation network (2015-2016). Two references tie at rank 10.
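The centrality column of Table 12 is reported on this page without a formula. Assuming it denotes normalized betweenness centrality (the measure commonly reported by co-citation tools such as CiteSpace; this is an assumption, not something stated here), the following minimal sketch shows how such a score could be computed with Brandes' algorithm. The toy network and its node labels are hypothetical and are not taken from the paper's data.

```python
from collections import deque


def betweenness_centrality(adj):
    """Normalized betweenness centrality for an undirected, unweighted graph.

    adj maps each node to the set of its neighbours (Brandes' algorithm).
    """
    nodes = list(adj)
    raw = {v: 0.0 for v in nodes}
    for s in nodes:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in nodes}
        sigma = {v: 0 for v in nodes}
        dist = {v: -1 for v in nodes}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate pair dependencies from the farthest nodes inward.
        delta = {v: 0.0 for v in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                raw[w] += delta[w]
    n = len(nodes)
    # Each unordered pair is visited twice, so dividing by (n-1)(n-2)
    # yields the fraction of node pairs whose shortest paths pass through v.
    scale = 1.0 / ((n - 1) * (n - 2)) if n > 2 else 1.0
    return {v: raw[v] * scale for v in nodes}


# Hypothetical toy network: two triangles joined through a bridge node "X".
toy = {
    "A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "X"},
    "X": {"C", "D"},
    "D": {"X", "E", "F"}, "E": {"D", "F"}, "F": {"D", "E"},
}
scores = betweenness_centrality(toy)
for node, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(node, round(score, 3))
```

In this toy network the bridge node X scores highest because every shortest path between the two triangles passes through it. Under the stated assumption, a high-centrality reference in Table 12 would play the same role of linking otherwise separate clusters, which fits the entries listed under more than one cluster.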
The reference “Yang ZH-2007” ranks first, with a centrality of 0.38. Its title is “PAML 4: Phylogenetic analysis
by maximum likelihood”. PAML is a software package for phylogenetic analysis of DNA or protein sequences
using maximum likelihood methods. It was developed by Ziheng Yang, the author of the paper, and is free for
academic use. Yang is a prominent scientist of Chinese origin, a Fellow of the Royal Society and professor of
statistical genetics at University College London. Running PAML with a multi-core parallel algorithm markedly
speeds up the analysis of DNA and protein sequence data sets. The reference
“Dunham I-2012” ranks second, with a centrality of 0.37. Its title is “An integrated encyclopedia of DNA