Genomics and Applied Biology 2017, Vol.8, No.5, 30-48
elements in the human genome”. The integrated Encyclopedia of DNA Elements (ENCODE) systematically maps regions
of transcription, transcription factor association, chromatin structure, and histone modification. These data
enable biochemical functions to be assigned to 80% of the genome. Many of the candidate regulatory elements are
found to be associated with one another and with expressed genes, providing new insights into the
mechanisms of gene regulation. The newly identified elements also show a statistical correspondence with
sequence variants linked to human disease, which helps to interpret this variation. The
paper “Cong L-2013” ranks third, with a centrality of 0.36. Its title is “Multiplex Genome Engineering
Using CRISPR/Cas Systems”. The gene editing technology CRISPR/Cas9 was listed by Science as one of the ten
major advances in science and technology of 2013. In this paper, two different type II CRISPR (clustered
regularly interspaced short palindromic repeats)/Cas (CRISPR-associated protein) systems were engineered, and
it was demonstrated that Cas9 nucleases can be directed by short RNAs to induce precise cleavage at endogenous
genomic loci in human and mouse cells. Cas9 can also be converted into a nicking enzyme that promotes
homology-directed repair with minimal mutagenic activity. Multiple guide sequences can be encoded into a single
CRISPR array, enabling simultaneous editing of several sites within the mammalian genome. This shows that
RNA-guided nuclease technology is convenient, programmable and widely applicable. In addition, the most
recently published high-centrality paper
from 2014 also concerns the CRISPR/Cas9 system and is entitled “Genetic Screens in Human Cells Using the
CRISPR-Cas9 System”. In the CRISPR/Cas9 system, the enzyme Cas9 cuts DNA at the target site. The target is
determined as follows: an RNA molecule called CRISPR RNA (crRNA) uses part of its sequence to base-pair with
another RNA molecule called trans-activating crRNA (tracrRNA). The resulting tracrRNA/crRNA duplex then
pairs with the target DNA site through another portion of the crRNA sequence. In this way, the duplex
guides Cas9 to the target site, where Cas9 cuts. In practical applications, tracrRNA and crRNA can be supplied
as two separate guide RNAs (gRNAs) or fused into a single guide RNA (sgRNA), which guides Cas9 to bind the
target DNA sequence and cut it. Cas9 together with
sgRNA is called the Cas9-sgRNA system. To construct animal models with modified nuclear genes
using CRISPR/Cas9, a Cas9/sgRNA software package was compiled to assist in the rapid design and screening
of highly active and specific sgRNAs, the construction of sgRNA expression libraries, and the large-scale
construction of genetically modified animal models. Another high-centrality paper published in 2014 is “The SEED and the
Rapid Annotation of microbial genomes using Subsystems Technology (RAST)”. Genome annotation belongs to
the field of functional genomics. RAST is a rapid annotation tool based on subsystems technology, designed for
complete or nearly complete bacterial and archaeal genomes. Its accuracy, consistency and completeness
rest on two resources: the manually curated library of subsystems and the FIGfams library of protein
families. RAST can be used to predict ORFs, rRNAs, tRNAs and the corresponding functional genes, and this
information can be used to build metabolic networks.
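The sgRNA targeting step described earlier in this paragraph can be illustrated with a minimal sketch. It is not the Cas9/sgRNA software package mentioned above. It assumes the widely used Streptococcus pyogenes Cas9, which requires an NGG protospacer-adjacent motif (PAM) immediately 3' of a 20-nt target; the PAM requirement is not stated in the text above, and the function names `reverse_complement` and `find_targets` are hypothetical. No activity or specificity scoring is attempted.

```python
import re


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def find_targets(dna, guide_len=20):
    """List candidate sgRNA target sites on both strands.

    Assumption: an SpCas9-style NGG PAM directly 3' of the protospacer.
    Each hit is (strand, start, protospacer, pam); for the "-" strand,
    start is an offset into the reverse-complemented sequence.
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    hits = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for match in pattern.finditer(seq):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits


if __name__ == "__main__":
    # Toy sequence: one 20-nt site followed by a TGG PAM.
    for hit in find_targets("GATTACAGATTACAGATTACTGGAAAA"):
        print(hit)
```

In a real design workflow, each candidate returned by such a scan would still need to be screened for off-target matches elsewhere in the genome, which is the part that dedicated sgRNA design software addresses.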
Integrating the co-cited literature of the three periods gives the top 10 most cited references, listed in Table 13.
Highly cited literature generally consists of important documents that play a foundational role. The most cited is
“Lander ES-2001”, with 851 citations. Its title is “Initial sequencing and analysis of the human genome”. This paper
reports the results of an international collaboration to produce a freely available draft sequence of the human
genome and presents a preliminary analysis of the sequencing data. The second-ranked paper is “Altschul
SF-1997”, with 797 citations. Its title is “Gapped BLAST and PSI-BLAST: a new generation of protein
database search programs”. Gapped BLAST is the gapped Basic Local Alignment Search Tool, while PSI-BLAST is
Position-Specific Iterated BLAST. BLAST is a search tool widely used to retrieve, from protein or DNA
databases, sequences similar to a query sequence. The improved BLAST allows gaps to be inserted into
alignments (gapped BLAST) and runs at about three times the speed of the original.
PSI-BLAST introduces a position-specific score matrix into BLAST and searches the database with this
matrix, refining the results over successive iterations. Each PSI-BLAST iteration runs at about the same
speed as gapped BLAST, but PSI-BLAST is more sensitive to weak but biologically relevant sequence
similarities. It has thus been used to uncover new and interesting members of the BRCT protein
superfamily. The third-ranked paper is “Li H-2009” with 659 times of