Genomics and Applied Biology 2014, Vol. 5, No. 5, 1-6
http://gab.biopublisher.ca
Research Report Open Access
De Novo RNA Seq Assembly and Annotation of Phaseolus vulgaris L. (SRR1283084)
Sagar S. Patel 1, Dipti B. Shah 1, Hetalkumar J. Panchal 2
1. G. H. Patel Post Graduate Department of Computer Science and Technology, Sardar Patel University, Vallabh Vidyanagar, Gujarat-388120, India
2. Gujarat Agricultural Biotechnology Institute, Navsari Agricultural University, Surat, Gujarat-395007, India
Corresponding author email
Genomics and Applied Biology, 2014, Vol.5, No.5 doi: 10.5376/gab.2014.05.0005
Received: 21 Aug., 2014
Accepted: 24 Sep., 2014
Published: 22 Oct., 2014
© 2014 Patel et al. This is an open access article published under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Preferred citation for this article:
Patel et al., 2014, De Novo RNA Seq Assembly and Annotation of Phaseolus vulgaris L. (SRR1283084), Genomics and Applied Biology, Vol.5, No.5, 1-6
(doi:
Abstract
Phaseolus vulgaris L., also known as common bean, is produced in the tropics on small-scale farms where unfavorable factors limit its yield potential. Recently, next-generation sequencing of RNA, termed RNA-seq, has provided a powerful approach for analyzing the transcriptome. This study focuses on the RNA-seq data of Phaseolus vulgaris L. (SRR1283084) from the NCBI database for de novo transcriptome analysis. A total of 20.4 million single reads were generated, with an N50 of 293 bp. The sequence assembly contained a total of 6999 contigs, which were further searched against known proteins; a total of 1679 genes were identified. Among these, only 629 unigenes were annotated with 3724 gene ontology (GO) functional categories, and sequences were mapped to 89 pathways by searching against the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database. These data will be useful for gene discovery and functional studies, and the large number of transcripts reported in the current study will serve as a valuable genetic resource for Phaseolus vulgaris L.
Keywords
Transcriptome; Bioinformatics; Phaseolus vulgaris L.
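The abstract quotes an N50 value without defining it. Under the usual definition, N50 is the length L such that sequences of length L or longer account for at least half of the total assembled bases. The following is a minimal sketch of that calculation, assuming this standard definition; the function name `n50` and the toy lengths are illustrative and do not come from the paper, and the assembler used by the authors may report the statistic slightly differently.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (standard definition)."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    # Walk from the longest sequence downwards, accumulating bases.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum covers half the bases is N50.
        if running >= half_total:
            return length


# Hypothetical contig lengths in bp, for illustration only.
toy_lengths = [100, 200, 300, 400, 500]
print(n50(toy_lengths))
```

With real data, the list of lengths would be read from the assembled contig file rather than typed in by hand.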
Introduction
Next-generation sequencing methods for high-throughput RNA sequencing (transcriptome sequencing) are becoming increasingly utilized as the technology of choice to detect and quantify known and novel transcripts in plants. This transcriptome analysis method is fast and simple because it does not require cloning of the cDNAs. Direct sequencing of these cDNAs can generate short reads at an extraordinary depth. After sequencing, the resulting reads can be assembled into a genome-scale transcription profile. It is a more comprehensive and efficient way to measure transcriptome composition, obtain RNA expression patterns, and discover new exons and genes (Mortazavi et al., 2008; Wang et al., 2009). In this study, transcriptome sequencing data were assembled using various assembly tools, and functional annotation of genes and pathway analysis were carried out with various bioinformatics tools. The large number of transcripts reported in the current study will serve as a valuable genetic resource for Phaseolus vulgaris L.
High-throughput short-read sequencing is one of the latest sequencing technologies to be released to the genomics community. For example, a single run on the Illumina Genome Analyser can yield, on average, upwards of 30 to 40 million single-end (~35 nt) sequences. However, the resulting output can easily overwhelm genomic analysis systems designed for the read lengths of traditional Sanger sequencing, or even for the smaller volumes of data resulting from 454 (Roche) sequencing technology. Initially, the use of short-read sequencing was typically confined to matching data from genomes that were nearly identical to the reference genome. Transcriptome analysis at the global gene expression level is an ideal application of short-read sequencing. Traditionally, such analysis involved complementary DNA (cDNA) library construction, Sanger sequencing of ESTs, and microarray analysis. Next-generation sequencing has become a feasible method for increasing sequencing depth and coverage while reducing time and cost compared to the traditional Sanger method (L J Collins et al.).