Computational Molecular Biology 2014, Vol. 4, No. 9, 1-6
            
            
              http://cmb.biopublisher.ca
            
            
            
            
Many genes were annotated with pathways in the KEGG database (http://www.genome.jp/kegg/pathway.html). A comparison across the five species is shown in Table 7. The annotated transcripts map to a range of pathways, including metabolic pathways, plant-pathogen interaction, fatty acid metabolism and fatty acid biosynthesis.
            
            
Table 7 KEGG Result

Species                        Genes   KEGG Pathway
Arachis hypogaea L.            568     109
Cicer arietinum L.             786     78
Phaseolus vulgaris L.          629     89
Trigonella foenum-graecum L.   192     87
Vicia sativa L.                500     122
            
            
              2.5 SSR mining
            
            
Microsatellite (SSR) markers are among the most successful molecular markers for constructing peanut genetic maps and for diversity analysis (Zhang et al). To identify SSRs, all transcripts were searched with the Perl script MISA. The results are given in Table 8, which details the SSRs found in each species. Mono-nucleotide SSRs formed the largest class in Cicer arietinum L., Phaseolus vulgaris L. and Trigonella foenum-graecum L., followed by tri-nucleotide and di-nucleotide SSRs, whereas tri-nucleotide SSRs were the most abundant class in Arachis hypogaea L. and Vicia sativa L. Although tetra-, penta- and hexa-nucleotide SSRs made up only a small fraction of the SSRs identified in the transcripts, their numbers are still notable in most species.
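The kind of search performed in this step can be illustrated with a minimal Python sketch. It is not MISA itself: the function names `find_ssrs` and `is_primitive` are hypothetical, only perfect repeats are reported, and the minimum repeat counts per motif length are an assumption made for illustration, since the thresholds used in this study are not stated here.

```python
import re

# Minimum number of repeats per motif length (1 = mono ... 6 = hexa).
# ASSUMPTION: illustrative thresholds only; the study's settings are not given.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """Return False if the motif is itself a repeat of a shorter unit (e.g. 'AA', 'ATAT')."""
    n = len(motif)
    for d in range(1, n):
        if n % d == 0 and motif[:d] * (n // d) == motif:
            return False
    return True


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return a sorted list of (start, motif, repeat_count) for perfect SSRs.

    Positions are 0-based offsets into the sequence.
    """
    seq = seq.upper()
    hits = []
    for k, n_min in sorted(min_repeats.items()):
        # The lookahead lets every start position be tested; group 1 is the
        # full repeat run and group 2 is the repeated motif of length k.
        pattern = re.compile(r"(?=(([ACGT]{%d})\2{%d,}))" % (k, n_min - 1))
        last_end = 0
        for m in pattern.finditer(seq):
            start = m.start()
            if start < last_end:
                continue  # still inside a run already reported for this k
            motif = m.group(2)
            if not is_primitive(motif):
                continue  # e.g. 'AA' runs are counted as mono-nucleotide SSRs
            run = m.group(1)
            hits.append((start, motif, len(run) // k))
            last_end = start + len(run)
    return sorted(hits)


if __name__ == "__main__":
    example = "GGC" + "A" * 12 + "CGT" + "AG" * 7 + "TTC" + "CTT" * 5 + "G"
    for start, motif, count in find_ssrs(example):
        print(start, motif, count)
```

Counting the hits by motif length per species would give a class breakdown of the same form as the repeat-type rows in Table 8, although the exact numbers depend on the thresholds chosen.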
            
            
Table 8 Statistics of SSRs identified in transcripts

Species                                           Arachis       Cicer          Phaseolus     Trigonella          Vicia
SSR Mining:                                       hypogaea L.   arietinum L.   vulgaris L.   foenum-graecum L.   sativa L.
Total number of sequences examined                10824         34678          6999          7256                22748
Total size of examined sequences (bp)             4605095       27932177       2110290       3226271             11444673
Total number of identified SSRs                   742           5228           1405          3107                1150
Number of SSR-containing sequences                649           4391           1304          2191                1055
Number of sequences containing more than one SSR  74            681            86            747                 92
Number of SSRs present in compound formation      48            337            64            747                 48
Distribution to different repeat type classes:
Mono-nucleotide                                   265           2019           1218          2589                362
Di-nucleotide                                     164           1271           87            235                 243
Tri-nucleotide                                    299           1818           90            243                 529
Tetra-nucleotide                                  10            78             7             28                  10
Penta-nucleotide                                  2             17             2             10                  3
Hexa-nucleotide                                   2             25             1             2                   3
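The repeat-class counts in Table 8 can be checked against the reported totals and converted into percentage shares with a short Python sketch. All numbers are copied from Table 8; the helper names `class_shares` and `dominant_class` are hypothetical and introduced only for this illustration.

```python
# Repeat-class counts and totals copied from Table 8; no new data is introduced.
REPEAT_CLASSES = ("mono", "di", "tri", "tetra", "penta", "hexa")

SSR_BY_CLASS = {
    "Arachis hypogaea L.": (265, 164, 299, 10, 2, 2),
    "Cicer arietinum L.": (2019, 1271, 1818, 78, 17, 25),
    "Phaseolus vulgaris L.": (1218, 87, 90, 7, 2, 1),
    "Trigonella foenum-graecum L.": (2589, 235, 243, 28, 10, 2),
    "Vicia sativa L.": (362, 243, 529, 10, 3, 3),
}

TOTAL_SSRS = {
    "Arachis hypogaea L.": 742,
    "Cicer arietinum L.": 5228,
    "Phaseolus vulgaris L.": 1405,
    "Trigonella foenum-graecum L.": 3107,
    "Vicia sativa L.": 1150,
}


def class_shares(counts):
    """Return the percentage share of each repeat class."""
    total = sum(counts)
    return {name: 100.0 * n / total for name, n in zip(REPEAT_CLASSES, counts)}


def dominant_class(counts):
    """Return the name of the repeat class with the highest count."""
    return REPEAT_CLASSES[counts.index(max(counts))]


if __name__ == "__main__":
    for species, counts in SSR_BY_CLASS.items():
        # The six class counts should add up to the reported total.
        assert sum(counts) == TOTAL_SSRS[species]
        shares = ", ".join(f"{k}={v:.1f}%" for k, v in class_shares(counts).items())
        print(f"{species}: dominant={dominant_class(counts)}; {shares}")
```

For every species the six class counts sum to the reported total number of identified SSRs, so the class rows and the total row of the table are mutually consistent.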
            
            
              2.6 Plant Transcription Factor
            
            
Transcription factor-encoding transcripts were then identified by sequence comparison with known transcription factor gene families. Table 9 lists the number of transcription factor families identified in each species, and Figure 2 shows the plant transcription factor result for Trigonella foenum-graecum L. The overall distribution of transcription factor-encoding transcripts among the known protein families is very similar to that of other legumes, as reported earlier (Libault et al., 2009).
            
            
              3 Conclusion
            
            
This study focuses on five legume species, using data from the NCBI database, for de novo sequence assembly and analysis of RNA-seq data generated by next-generation Illumina and 454 sequencing. Transcriptome sequencing enables various functional genomics studies of an organism. Although several high-throughput technologies have been developed for
            
            
Table 9 Plant Transcription Factor Result

Species                        Different families (at least)
Arachis hypogaea L.            70
Cicer arietinum L.             97
Phaseolus vulgaris L.          43
Trigonella foenum-graecum L.   45
Vicia sativa L.                82