Computational Molecular Biology 2014, Vol. 4, No. 9, 1-6
http://cmb.biopublisher.ca
Many genes were annotated to pathways in the KEGG database (http://www.genome.jp/kegg/pathway.html). A comparison across the five species is shown in Table 7. The annotated transcripts map to a variety of pathways, including metabolic pathways, plant-pathogen interaction, fatty acid metabolism and fatty acid biosynthesis.
Table 7 KEGG Result
Species                         Genes   KEGG Pathway
Arachis hypogaea L.             568     109
Cicer arietinum L.              786     78
Phaseolus vulgaris L.           629     89
Trigonella foenum-graecum L.    192     87
Vicia sativa L.                 500     122
2.5 SSR mining
Microsatellite markers (SSR markers) are among the most successful molecular markers for constructing peanut genetic maps and for diversity analysis (Zhang et al). To identify SSRs, all transcripts were searched with the Perl script MISA. Table 8 summarizes the SSR mining result for each species. Overall, mono-nucleotide SSRs represented the largest fraction of the SSRs identified, followed by tri-nucleotide and di-nucleotide SSRs, although tri-nucleotide repeats outnumbered mono-nucleotide repeats in Arachis hypogaea L. and Vicia sativa L. Only a small fraction of the SSRs were tetra-, penta- or hexa-nucleotide repeats, but all three classes were detected in every species.
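The kind of search MISA performs can be illustrated with a minimal Python sketch. It is not the MISA script itself: the minimum repeat counts below (10 for mono-, 6 for di-, 5 for tri- to hexa-nucleotide motifs) are an assumption based on values commonly used with MISA, since the thresholds applied in this study are not stated, and the function names are hypothetical.

```python
import re
from collections import Counter

# Assumed thresholds: motif length -> minimum number of repeats.
# The paper does not report the values it used.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit, min_rep in MIN_REPEATS.items():
        # A motif of `unit` bases followed by at least min_rep - 1 copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT"),
            # so a poly-A run is not also reported as a di-nucleotide SSR.
            if any(motif == motif[:k] * (unit // k)
                   for k in range(1, unit) if unit % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit))
    return sorted(hits)


def count_by_class(hits):
    """Tally SSRs by motif length (1 = mono-nucleotide, 2 = di-nucleotide, ...)."""
    return Counter(len(motif) for _, motif, _ in hits)


if __name__ == "__main__":
    # Toy sequence, not data from the study.
    example = "GC" + "A" * 12 + "GC" + "AG" * 7 + "C" + "CTT" * 5 + "G"
    for start, motif, count in find_ssrs(example):
        print(start, motif, count)
```

Start positions are 0-based. The sketch reports perfect repeats only; compound SSRs (two repeats separated by a short interruption), which Table 8 counts separately, are not modelled.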
Table 8 Statistics of SSRs identified in transcripts
                                                  Arachis      Cicer         Phaseolus    Trigonella         Vicia
SSR mining                                        hypogaea L.  arietinum L.  vulgaris L.  foenum-graecum L.  sativa L.
Total number of sequences examined                10824        34678         6999         7256               22748
Total size of examined sequences (bp)             4605095      27932177      2110290      3226271            11444673
Total number of identified SSRs                   742          5228          1405         3107               1150
Number of SSR-containing sequences                649          4391          1304         2191               1055
Number of sequences containing more than one SSR  74           681           86           747                92
Number of SSRs present in compound formation      48           337           64           747                48
Distribution to different repeat type classes:
  Mono-nucleotide                                 265          2019          1218         2589               362
  Di-nucleotide                                   164          1271          87           235                243
  Tri-nucleotide                                  299          1818          90           243                529
  Tetra-nucleotide                                10           78            7            28                 10
  Penta-nucleotide                                2            17            2            10                 3
  Hexa-nucleotide                                 2            25            1            2                  3
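The repeat-class distribution in Table 8 can be checked with a few lines of Python. The counts below are copied from Table 8; the helper names are hypothetical and the script is only a convenience for reading the table, not part of the original analysis.

```python
# Repeat-class counts from Table 8, in the order
# mono, di, tri, tetra, penta, hexa.
CLASSES = ("mono", "di", "tri", "tetra", "penta", "hexa")
TABLE8 = {
    "Arachis hypogaea L.": (265, 164, 299, 10, 2, 2),
    "Cicer arietinum L.": (2019, 1271, 1818, 78, 17, 25),
    "Phaseolus vulgaris L.": (1218, 87, 90, 7, 2, 1),
    "Trigonella foenum-graecum L.": (2589, 235, 243, 28, 10, 2),
    "Vicia sativa L.": (362, 243, 529, 10, 3, 3),
}


def class_shares(counts):
    """Percentage of each repeat class among all SSRs of one species."""
    total = sum(counts)
    return {name: 100.0 * n / total for name, n in zip(CLASSES, counts)}


def dominant_class(counts):
    """Name of the most frequent repeat class."""
    return CLASSES[counts.index(max(counts))]


def pooled_counts(table):
    """Per-class counts summed over all species."""
    return tuple(sum(col) for col in zip(*table.values()))


if __name__ == "__main__":
    for species, counts in TABLE8.items():
        shares = class_shares(counts)
        print(species, sum(counts), dominant_class(counts),
              {k: round(v, 1) for k, v in shares.items()})
    print("pooled:", dict(zip(CLASSES, pooled_counts(TABLE8))))
```

For each species the six class counts add up to the "Total number of identified SSRs" row, which confirms the column order used when reading the table.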
2.6 Plant Transcription Factor
Transcription factor encoding transcripts were then identified by sequence comparison against known transcription factor gene families. Table 9 gives the number of different transcription factor families (at least) identified in each species, and Figure 2 shows the plant transcription factor result for Trigonella foenum-graecum L. The overall distribution of transcription factor encoding transcripts among the known protein families is very similar to that of other legumes, as predicted earlier (Libault et al., 2009).
3 Conclusion
This study focuses on five legume species, with data taken from the NCBI database, for de novo sequence assembly and analysis of RNA-seq data generated by next-generation Illumina and 454 sequencing. Transcriptome sequencing enables various functional genomics studies for an organism. Although several high-throughput technologies have been developed for
Table 9 Plant Transcription Factor Result
Species                         Number of different families (at least)
Arachis hypogaea L.             70
Cicer arietinum L.              97
Phaseolus vulgaris L.           43
Trigonella foenum-graecum L.    45
Vicia sativa L.                 82