Computational Molecular Biology 2017, Vol.7, No.1, 1-11
Table 4 Classification of maize gene products based on Gene Ontology molecular functions

GO ID        Total    AS       %      Molecular function
GO:0005488   4 541    3 570    78.6   binding
GO:0000166   2 103    1 769    84.1   nucleotide binding
GO:0016740   2 048    1 619    79.1   transferase activity
GO:0016787   1 782    1 443    81.0   hydrolase activity
GO:0003824   1 767    1 364    77.2   catalytic activity
GO:0003677   1 345    1 005    74.7   DNA binding
GO:0003674   1 320    1 006    76.2   molecular_function
GO:0003723   761      625      82.1   RNA binding
GO:0005215   748      581      77.7   transporter activity
GO:0016301   714      588      82.4   kinase activity
GO:0005515   648      532      82.1   protein binding
GO:0003700   627      423      67.5   transcription factor activity, sequence-specific DNA binding
GO:0005198   328      245      74.7   structural molecule activity
GO:0003676   212      176      83.0   nucleic acid binding
GO:0030234   178      133      74.7   enzyme regulator activity
GO:0004518   168      133      79.2   nuclease activity
GO:0008289   157      132      84.1   lipid binding
GO:0030246   134      106      79.1   carbohydrate binding
GO:0004871   129      105      81.4   signal transducer activity
GO:0008135   83       69       83.1   translation factor activity, RNA binding
GO:0004872   77       60       77.9   receptor activity
GO:0003682   64       44       68.8   chromatin binding
GO:0003774   47       39       83.0   motor activity
GO:0005102   42       34       81.0   receptor binding
GO:0019825   4        2        50.0   oxygen binding
GO:0045182   4        3        75.0   translation regulator activity
Total        20 031   15 806   78.9
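
The % column in Table 4 is the AS count expressed as a share of the Total count for each GO term. The following is a minimal sketch of that arithmetic, using rows copied from the table; the function name `as_percentage` is hypothetical and is not part of the authors' pipeline.

```python
def as_percentage(total: int, as_count: int) -> float:
    """Return as_count as a percentage of total, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * as_count / total, 1)


# Rows copied from Table 4: (GO ID or label, Total, AS).
rows = [
    ("GO:0005488", 4541, 3570),   # binding
    ("GO:0003700", 627, 423),     # transcription factor activity
    ("GO:0019825", 4, 2),         # oxygen binding
    ("Total", 20031, 15806),      # summary row
]

for label, total, as_count in rows:
    print(f"{label}\t{total}\t{as_count}\t{as_percentage(total, as_count)}")
```

Categories with very small counts, such as oxygen binding (4 entries), give percentages that change sharply with a single entry, so they are best read alongside the raw counts.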
2.4 Impact of AS on gene product function
The PUTs were annotated for putative protein-coding regions by a BLASTX search against the
UniProt/Swiss-Prot database. ORFs were identified using the OrfPredictor web server (Min et al., 2005a), and
the completeness of the ORFs was examined using TargetIdentifier (Min et al., 2005b). The protein families of the
ORFs were predicted by an rpsBLAST search against the Pfam database. Isoforms generated by AS can be either
functional or non-functional. Non-functional AS isoforms often carry a premature stop codon caused by insertions or
deletions within the ORF region whose length is not a multiple of three nucleotides. These isoforms are often degraded
through “regulated unproductive splicing and translation” (RUST) or by the nonsense-mediated mRNA decay (NMD)
surveillance machinery (Morello and Breviario, 2008). It was estimated that ~43% of Arabidopsis AS events and
~36% of rice AS events produce NMD candidates (Wang and Brendel, 2006). In our dataset of 192 624 AS isoform pairs,
there were 12 146 (6.3%) pairs in which one isoform harbored a complete ORF and the other had no ORF.
The absence of an ORF in a transcript could be due to an incomplete PUT sequence, loss of the start
codon, or a premature stop codon. A further 54 388 (28.2%) pairs had complete ORFs in both isoforms.
Within this set, we compared the two isoforms of each pair to determine whether their protein domains had changed.
Of the 54 388 AS isoform pairs with complete ORFs in both isoforms, 10 994 (20.2%) pairs had no Pfam hit, 32 768
(60.2%) pairs had identical Pfam hits, and the remaining 10 626 (19.6%) either had one isoform having a Pfam hit and