PlantSecKB: the Plant Secretome and Subcellular Proteome KnowledgeBase
UniProt/Swiss-Prot dataset (curated and reviewed)
and 1 355 593 from UniProt-TrEMBL (unreviewed)
with an additional 26 685 proteins predicted from the
newly sequenced genome of sacred lotus (Ming et al.,
2013; Lum et al., 2013). The main categories of
subcellular proteomes for species having more than 7
000 entries are summarized in Table 1. Curated
secreted proteins, ER proteins and lysosome proteins
are not listed in Table 1. Only 7 lysosome proteins were identified in A. thaliana, and no lysosome
proteins were predicted in other species. There are a
total of 2 774 curated secreted proteins, which are
mainly obtained from A. thaliana and O. sativa subsp.
japonica with 1 247 and 559 entries, respectively. It
should be noted that the total number of protein
entries for a species is the number collected in
UniProtKB, which can exceed the size of a complete
or reference proteome because some protein entries
are redundant or duplicated.
For example, O. sativa subsp. japonica has 99
984 entries in PlantSecKB but only 63 544 entries
in its complete proteome set, and A. thaliana has
53 847 entries in PlantSecKB but only 31 908
entries in the complete proteome set in UniProtKB
(http://www.uniprot.org/taxonomy/complete-proteomes).
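To make the size of this redundancy concrete, the following minimal sketch computes the number of extra entries and the ratio of PlantSecKB entries to complete-proteome entries for the two species quoted above. The function name `excess_entries` and the dictionary layout are illustrative assumptions; only the four counts come from the text.

```python
# Entry counts quoted in the text:
# (entries in PlantSecKB, entries in the UniProtKB complete proteome set).
ENTRY_COUNTS = {
    "O. sativa subsp. japonica": (99_984, 63_544),
    "A. thaliana": (53_847, 31_908),
}


def excess_entries(total, complete):
    """Return (extra entries, ratio of total entries to complete proteome size).

    Hypothetical helper for illustration; not part of PlantSecKB.
    """
    return total - complete, total / complete


for species, (total, complete) in ENTRY_COUNTS.items():
    extra, ratio = excess_entries(total, complete)
    print(f"{species}: {extra} extra entries, {ratio:.2f}x the complete proteome")
```

In both species the collected entries exceed the complete proteome set by more than half, which is why per-species totals should not be read as gene or protein counts.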
An overall trend is that plants with relatively
small proteomes, such as single-celled green algae,
have both fewer secreted proteins and a lower
proportion of them. For example,
Ostreococcus species have fewer than 100 secreted
proteins predicted (1.2%), and moss (Physcomitrella
patens) has 781 secreted proteins predicted (2.9%)
(Table 2). On average, the secretome accounts for
about 4.0% to 7.5% of the proteome in monocot and
dicot plants based on our predictions. The
secretome percentages reported in this study are
slightly lower than those we reported previously because
our previous study used SignalP 3.0,
whereas this study used SignalP 4.0, which has a
higher specificity (Lum et al., 2013; Petersen et al.,
2011).
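As a back-of-envelope illustration of how a count and a percentage relate, the sketch below estimates the range of total proteome sizes consistent with the moss figures quoted above (781 secreted proteins, 2.9%). The helper `implied_proteome_range` is hypothetical, and it assumes the percentage was rounded to one decimal place; the result is an estimate derived here, not a value reported in Table 2.

```python
def implied_proteome_range(count, percent, decimals=1):
    """Bounds on the total proteome size implied by a count and a rounded percentage.

    Assumes `percent` was rounded to `decimals` places, so the true value
    lies within half a rounding step on either side.
    """
    half_step = 0.5 * 10 ** (-decimals)
    low = count / ((percent + half_step) / 100)
    high = count / ((percent - half_step) / 100)
    return low, high


# Figures quoted in the text for Physcomitrella patens.
low, high = implied_proteome_range(781, 2.9)
print(f"P. patens: roughly {low:.0f} to {high:.0f} predicted proteins in total")
```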
The average predicted proteome sizes and
distributions of subcellular proteomes are summarized
in Table 3 using 9 species or subspecies in each
category of green algae, monocot and dicot plants
listed in Table 2.
Lotus japonicus, a dicot, was the
only species not used for this analysis due to
incompleteness of its proteome. The average predicted
proteome size is much smaller in green algae, thus
each subcellular proteome consists of a smaller
number of proteins (Table 3). Comparing monocots
and dicots, the distribution percentages of secreted
proteins, chloroplast membrane proteins, vacuolar
proteins, and plasma membrane proteins were not
significantly different. However, monocots had a
significantly higher proportion of proteins predicted as
mitochondrial (both membrane and non-membrane)
and chloroplast non-membrane, and dicots had
significantly more proteins predicted as cytosolic and
nuclear (Table 3). Whether these observed differences
in subcellular proteome distributions between
monocots and dicots are artifacts of the computational
tools or reflect real biological or evolutionary
significance needs further investigation.
Table 3 Comparison of subcellular proteome distribution in green algae, monocot and dicot plants

| Category | Proteome | Secretome (%) | Mitochondrial membrane (%) | Mitochondrial non-membrane (%) | Chloroplast membrane (%) | Chloroplast non-membrane (%) | Cytosol (%) | Vacuole (%) | Plasma membrane (%) | Nucleus (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Green algae | 10371 | 284 (2.7) | 286 (2.8) | 1975 (19.0) | 201 (1.9) | 1284 (12.4) | 1933 (18.8) | 83 (0.8) | 341 (3.3) | 1567 (14.5) |
| Monocot | 43653 | 2667 (6.1) | 834 (1.9) | 7140 (16.4) | 702 (1.6) | 6304 (14.4) | 6822 (15.6) | 381 (0.9) | 1699 (3.9) | 7947 (18.2) |
| Dicot | 45715 | 2645 (5.8) | 562 (1.2) | 5098 (11.2) | 712 (1.6) | 5122 (11.2) | 8600 (18.8) | 459 (1.0) | 2180 (4.8) | 10342 (22.6) |
| T-test | ns | ns | *** | *** | ns | *** | *** | ns | ns | *** |

Note: T-test was used to compare the subcellular proteome (%) distribution in monocots and dicots. ns: not significant; ***: highly
significant (p < 0.001)
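The monocot and dicot columns of Table 3 can be compared directly with a short script. The sketch below transcribes the table and computes two things: the monocot minus dicot difference in reported percentage points, and the percentage obtained by dividing each mean count by the mean proteome size. How the reported percentages were averaged is not stated here, so the second calculation is an assumption used only as a consistency check; a mean of per-species percentages need not equal the ratio of mean counts. All function and variable names are illustrative.

```python
# Compartment order follows the columns of Table 3.
COMPARTMENTS = [
    "Secretome", "Mitochondrial membrane", "Mitochondrial non-membrane",
    "Chloroplast membrane", "Chloroplast non-membrane", "Cytosol",
    "Vacuole", "Plasma membrane", "Nucleus",
]

# group -> (mean proteome size, [(mean count, reported mean %), ...])
TABLE3 = {
    "Green algae": (10371, [(284, 2.7), (286, 2.8), (1975, 19.0), (201, 1.9),
                            (1284, 12.4), (1933, 18.8), (83, 0.8), (341, 3.3),
                            (1567, 14.5)]),
    "Monocot": (43653, [(2667, 6.1), (834, 1.9), (7140, 16.4), (702, 1.6),
                        (6304, 14.4), (6822, 15.6), (381, 0.9), (1699, 3.9),
                        (7947, 18.2)]),
    "Dicot": (45715, [(2645, 5.8), (562, 1.2), (5098, 11.2), (712, 1.6),
                      (5122, 11.2), (8600, 18.8), (459, 1.0), (2180, 4.8),
                      (10342, 22.6)]),
}


def reported_difference(group_a, group_b):
    """Difference in reported percentages (group_a minus group_b), in points."""
    rows_a, rows_b = TABLE3[group_a][1], TABLE3[group_b][1]
    return {name: round(a[1] - b[1], 1)
            for name, a, b in zip(COMPARTMENTS, rows_a, rows_b)}


def ratio_of_means(group):
    """Percentage computed as mean count / mean proteome size (an assumption)."""
    proteome, rows = TABLE3[group]
    return {name: 100 * count / proteome
            for name, (count, _) in zip(COMPARTMENTS, rows)}


diff = reported_difference("Monocot", "Dicot")
for name in COMPARTMENTS:
    print(f"{name}: monocot - dicot = {diff[name]:+.1f} points")

algae = ratio_of_means("Green algae")
print(f"Green algae nucleus, ratio of means: {algae['Nucleus']:.1f}% (reported 14.5%)")
```

The largest differences in reported percentages fall in the columns marked *** in the table, while the small gap for the green algae nucleus column shows why the ratio of means should be treated only as an approximation of the reported averages.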
Computational
Molecular Biology