In silico Proteomic Functional Re-annotation of Escherichia coli K-12 using Dynamic Biological Data Fusion Strategy
proteome sequences of E. coli K-12 organism.
Here, in silico functional proteomic re-annotation of E. coli K-12
was carried out using a well-organized and proficient annotation
procedure that involves dynamic fusion of biological data from
various databases (Figure 2). The complete genome of E. coli K-12
was downloaded from the EcoCyc database, and an initial analysis
was carried out to count the protein sequences with clear functions
and those without clear functions. All the sequences were then
submitted to AIM-BLAST for BLAST analyses, to FGT for ProDom,
ScanProsite and COG analyses, and to standalone Pfam for
family-based analyses. The results for each sequence from all five
tools were stored in a local database for further analysis. Once
results were available for all the sequences, a statistical
analysis was undertaken to determine the confidence level of the
predicted functions.
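The storage and scoring steps described above can be illustrated with a minimal sketch. The text does not state the table layout or the statistic that was used, so everything here is an assumption made for illustration: the `prediction` table, the `build_store` and `consensus` helpers, the protein identifiers and the simple agreement fraction (the share of the five tools that report the same function) are all hypothetical.

```python
import sqlite3
from collections import Counter

# The five analyses named in the text; each contributes at most one prediction.
TOOLS = ("BLAST", "ProDom", "ScanProsite", "COG", "Pfam")

# Hypothetical example rows: (protein_id, tool, predicted_function).
# None means the tool returned no usable hit for that sequence.
EXAMPLE_ROWS = [
    ("prot_A", "BLAST", "ABC transporter"),
    ("prot_A", "ProDom", "ABC transporter"),
    ("prot_A", "ScanProsite", "ABC transporter"),
    ("prot_A", "COG", "ABC transporter"),
    ("prot_A", "Pfam", None),
    ("prot_B", "BLAST", "kinase"),
    ("prot_B", "Pfam", "kinase"),
]


def build_store(rows):
    """Load per-tool predictions into an in-memory SQLite table (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE prediction ("
        " protein_id TEXT NOT NULL,"
        " tool TEXT NOT NULL,"
        " predicted_function TEXT,"
        " PRIMARY KEY (protein_id, tool))"
    )
    conn.executemany("INSERT INTO prediction VALUES (?, ?, ?)", rows)
    return conn


def consensus(conn, protein_id):
    """Return (function, support) for one protein.

    support is the fraction of the five tools that agree on the most
    frequently predicted function; this is an assumed measure, not the
    statistic used in the study.
    """
    cursor = conn.execute(
        "SELECT predicted_function FROM prediction"
        " WHERE protein_id = ? AND predicted_function IS NOT NULL",
        (protein_id,),
    )
    votes = Counter(function for (function,) in cursor)
    if not votes:
        return None, 0.0
    function, count = votes.most_common(1)[0]
    return function, count / len(TOOLS)


if __name__ == "__main__":
    store = build_store(EXAMPLE_ROWS)
    for pid in ("prot_A", "prot_B", "prot_C"):
        print(pid, consensus(store, pid))
```

In this sketch a sequence supported by four of the five tools receives a support of 0.8, while a sequence with no stored hits receives no function and a support of 0.0. A real analysis would need to reconcile differing function vocabularies across the tools before counting agreement.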
Authors’ contributions
GRK was responsible for overall data management, software
development and writing the paper. TKS carried out
software development and writing of the paper. CPR performed
data analysis and interpretation. KPK was involved in
software development and data analysis. The authors
have read and approved the manuscript.
Acknowledgements
The authors would like to thank Aravindan Ganesan,
Kalyanamoorthy Subha and R Sathish Kumar for their
constructive comments, which helped in the preparation of
this manuscript.
Computational Molecular Biology