Computational Molecular Biology
The term secretome was first introduced by Tjalsma et
al. (2000) to denote the complete set of proteins in
Bacillus subtilis processed by the secretory pathway,
which included proteins secreted to the extracellular
space and also proteins involved in the pathway.
More recently, however, the term has more often been
restricted, as in this work, to only the secreted,
extracellular portion of the proteome, including cell
wall proteins
(e.g., Greenbaum et al., 2001; Hathout, 2007; Bouws
et al., 2008; Agrawal et al., 2010; Lum and Min,
2011b). A plant secretome consists primarily of cell
wall proteins, proteins involved in cell wall
metabolism, and extracellular enzymes and signal
molecules involved in defense against pathogens (Isaacson
and Rose, 2006; Kamoun, 2009; Lum and Min, 2011a).
Secreted enzymes, particularly hydrolases such as
α-amylase and α-glucosidases, have been well studied
using germinating barley seeds as a model system.
These hydrolases are synthesized in the aleurone
layer and secreted into the endosperm to break down
starch and other storage reserves (Ranki and Sopanen,
1984; Jones and Robinson, 1989; see Finnie et al., 2011
for review). Recently, advances in proteomic analytical
techniques, along with the complete sequencing of the
Arabidopsis thaliana and Oryza sativa genomes, have
led to the identification of many secreted proteins,
including the cell wall proteome (Boudart et al., 2007;
Agrawal et al., 2010; Lum and Min, 2011a). In
Arabidopsis, the identified secreted proteins consist
mainly of cell wall proteins (see Jamet et al., 2008 for
review), along with some enzymes involved in pathogen
defense, such as GLP1 (Oh et al., 2005). Using leaf or
seed cell suspension cultures, secreted proteins have
been identified in rice, Medicago, and sorghum by 2D
gel electrophoresis coupled with liquid
chromatography-mass spectrometry (Jung et al., 2008;
Kusumawati et al., 2008; Cho et al., 2009; Ngara and
Ndimba, 2011). A large number of secreted proteins
were also identified from root exudates using
aseptically grown seedlings of rice and Arabidopsis
(Shinano et al., 2011; De-la-Pena et al., 2010).
Experimental systems, analytical techniques, and
related bioinformatics tools used in plant secretome
studies have recently been comprehensively reviewed
(Agrawal et al., 2010; Meinken and Min, 2012;
Alexandersson et al., 2013; Kraus et al., 2013; Caccia
et al., 2013).
Classical eukaryotic secreted proteins contain a
secretory signal peptide at the N-terminus that directs
the nascent protein to the rough ER, where protein
synthesis is completed; the protein is then transported
to the Golgi complex for targeting (von Heijne, 1990).
The signal peptide, typically 15-30 amino acids long, is
often cleaved off during translocation across the
endomembrane system. Classical secreted proteins
can be computationally predicted relatively accurately
(Min, 2010). Recently we analyzed all manually
curated and annotated secreted plant proteins in the
UniProtKB/Swiss-Prot dataset and found 87% of them
could be predicted to have a signal peptide by all three
predictors used (Lum and Min, 2011a). The accuracy
of secretome prediction could be further improved by
using a new version of SignalP (SignalP 4.0)
combined with other tools including TMHMM for
identifying transmembrane proteins and PS-Scan for
identifying ER luminal proteins (Min, 2010; Melhem
et al., 2013).
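The filtering logic described above (signal peptide present, no
transmembrane domain, no ER retention signal) can be sketched in a
few lines of Python. This is only an illustrative sketch, not the
pipeline used in the cited studies: the Prediction record, its field
names, and the example sequences are hypothetical, the predictor
outputs are assumed to have been computed beforehand by tools such as
SignalP and TMHMM, and the C-terminal ER retention motif is an
assumption modeled on the KDEL/HDEL-like pattern that PS-Scan reports.

```python
import re
from dataclasses import dataclass

# Assumed C-terminal ER-retention motif (KDEL/HDEL-like). The exact
# residue classes are an assumption, not taken from the text above.
ER_RETENTION = re.compile(r"[KRHQSA][DENQ]EL$")


@dataclass
class Prediction:
    """Pre-computed predictor calls for one protein (hypothetical record)."""
    protein_id: str
    sequence: str
    signal_peptide_calls: tuple  # one boolean per signal-peptide predictor
    tm_helices: int              # helices assumed to lie outside the signal peptide


def is_classical_secreted(p: Prediction) -> bool:
    """True if every predictor calls a signal peptide and the protein is
    neither a predicted membrane protein nor ER-retained."""
    has_signal = len(p.signal_peptide_calls) > 0 and all(p.signal_peptide_calls)
    is_membrane = p.tm_helices > 0
    is_er_retained = ER_RETENTION.search(p.sequence.upper()) is not None
    return has_signal and not is_membrane and not is_er_retained


# Toy inputs: made-up sequences and predictor calls, for illustration only.
candidates = [
    Prediction("P1", "MKTLLLAVAVSAGSE", (True, True, True), 0),   # passes all filters
    Prediction("P2", "MKTLLLAVAVSAKDEL", (True, True, True), 0),  # ends in KDEL
    Prediction("P3", "MKTLLLAVAVSAGSE", (True, False, True), 0),  # predictors disagree
    Prediction("P4", "MKTLLLAVAVSAGSE", (True, True, True), 2),   # membrane protein
]
secretome = [p.protein_id for p in candidates if is_classical_secreted(p)]
print(secretome)  # ['P1']
```

Requiring agreement among all predictors favors specificity over
sensitivity; a majority vote would be a reasonable alternative when
more candidates are wanted.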
With improvements in sequencing technology and the
reduced cost of sequencing, the genomes of more and
more plant species are being completely sequenced.
Currently there are 32 land plants with complete or
draft genome sequences available and 73 land plant
species with genome sequencing in progress
(http://www.ncbi.nlm.nih.gov/genomes/static/gpstat.html).
Assembled expressed sequence tag (EST) data, which can
be used to identify potential genes encoding secreted
proteins, are also available for more than 200 plant
species (PlantGDB, http://www.plantgdb.org/prj/ESTCluster/)
(Duvick et al., 2008). As a result of genome
sequencing, the number of protein sequences available
is increasing rapidly.
In addition to the classical secreted proteins, a large
number of leaderless, non-classical secreted proteins
(LSP), i.e., proteins lacking a secretory signal peptide,
have been identified in plants (Jung et al., 2008;
Agrawal et al., 2010; see Ding et al., 2012 for review).
These proteins have not been curated in UniProtKB. Therefore
there is a need to have a central knowledgebase
providing plant protein subcellular locations for the