Tree Genetics and Molecular Breeding 2012, Vol.2, No.1, 1-7
http://tgmb.sophiapublisher.com
in different species.
Many bioinformatic analyses of microsatellites have been reported. Notably, Dieringer and Schlotterer (2003) found two significantly different patterns of microsatellite variation through a bioinformatic analysis of a large number of microsatellites in nine species. However, bioinformatic analyses of microsatellites have mainly focused on fungi, humans and model plants (Brinkmann et al., 1998; Lothe, 1997; Toth et al., 2000). For forest tree species, most microsatellite analyses have been limited to experimental work on a small number of microsatellite loci (Wyman et al., 2003), whereas large-scale bioinformatic analyses of microsatellites have been limited to a single species, such as poplar, whose whole genome has been sequenced (Tuskan et al., 2004; Li et al., 2009). So far, no comparative study of microsatellite characteristics across more than one tree species has been reported, which might be due to the lack of genomic sequence resources. For forest tree species, EST sequencing has been much more common than whole genome sequencing (Sterky et al., 2004; Allona et al., 1998; Keller et al., 2009). The large number of EST sequences of pine, poplar and eucalyptus in public databases facilitated this research.
Overall, the dominant microsatellites in gene regions showed similar trends in repeat unit length and in the frequency of repeat unit loss and gain in this study, but the abundance of microsatellites and the number of microsatellites with high-frequency variation differed significantly between pine and either poplar or eucalyptus. Pine is a conifer, whereas eucalyptus and poplar are broadleaf species. Whether our findings reflect a general difference between the genomes of coniferous and broadleaf species undoubtedly needs to be analyzed in more tree species. However, the publicly released EST sequences of other coniferous and broadleaf species are still very limited and insufficient for the relevant bioinformatic analysis. In recent years, with the rapid development of next-generation high-throughput sequencing technology, transcriptome sequencing has been conducted on more and more tree species, which should make it possible to answer this question clearly.
3 Materials and methods
3.1 The sequence resources and microsatellite sequence finding
EST sequences of pine, poplar and eucalyptus were downloaded from the NCBI dbEST database (http://www.ncbi.nlm.nih.gov/dbEST/index.html). Because the number of EST sequences available differed among these species, 30,000 sequences were randomly selected for each species so that the results would be comparable. Microsatellite sequences were searched with the program Sputnik, developed by C. Abajian of the University of Washington. The search followed the default thresholds, with the minimum score set to nine, and covered all microsatellites with repeat units from 2 to 5 bp in length.
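As an illustration of the two steps above, the following Python sketch draws a fixed-size random subset of sequences and then scans each one for tandem repeats with 2 to 5 bp units. It is not Sputnik and does not reproduce Sputnik's scoring: it only finds perfect repeats with a regular expression. The names `subsample`, `find_ssrs` and `MIN_REPEATS`, the seed, and the minimum of four copies are assumptions made for the example, not values from this study.

```python
import random
import re

MIN_REPEATS = 4  # hypothetical cutoff for this sketch; not Sputnik's minimum score

# A 2-5 bp unit (shortest unit tried first) followed by further copies of itself.
SSR_PATTERN = re.compile(r"([ACGT]{2,5}?)\1{%d,}" % (MIN_REPEATS - 1))


def subsample(records, n, seed=0):
    """Draw n records without replacement; return all of them if fewer than n exist."""
    if len(records) <= n:
        return list(records)
    return random.Random(seed).sample(records, n)


def find_ssrs(seq):
    """Return (unit, copies, start) for each perfect 2-5 bp tandem repeat in seq."""
    hits = []
    for match in SSR_PATTERN.finditer(seq.upper()):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # skip runs of a single base such as AAAAAAAA
        copies = len(match.group(0)) // len(unit)
        hits.append((unit, copies, match.start()))
    return hits


if __name__ == "__main__":
    ests = ["GGACACACACACTT", "TTTTGGGCCC", "AAGAAGAAGAAGAAG"]
    for est in subsample(ests, 2):
        print(est, find_ssrs(est))
```

With real data, `records` would hold the EST sequences of one species and `n` would be 30,000, applied separately to each species.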
3.2 Analysis on microsatellite length variation and nucleotide composition of dominant repeating units
We used Excel to classify the nucleotide compositions of microsatellite repeat units. For each type of microsatellite, we identified the repeat units sharing the same base composition that made up the highest proportion, and took this as the dominant base composition for that type. Microsatellite length variation was analyzed with the charting functions of Excel. For each type of repeat unit, a pie chart was drawn in which each sector corresponds to microsatellites of a given length; microsatellites with a frequency of 1% or less were merged, and each sector was labeled by a leader line with the microsatellite length and its proportion. The number of sectors represents the variation in microsatellite length: the more sectors a chart has, the faster the rate of repeat unit loss and gain in the corresponding type of microsatellite, and thus the higher its polymorphism generally is.
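The tallies behind these charts can also be sketched in Python rather than Excel. The example below groups repeat units by base composition to find the dominant one for each unit length, and computes the pie-chart sectors for a list of microsatellite lengths, merging every length with a share of 1% or less into a single sector. The function names are hypothetical, and treating units with the same bases in a different order (for example AG and GA) as one composition is an assumption about how "same base composition" was applied.

```python
from collections import Counter, defaultdict


def composition_key(unit):
    """Assumed definition: units with the same bases in any order share a key."""
    return "".join(sorted(unit.upper()))


def dominant_composition(hits):
    """Map unit length -> (most frequent composition, its proportion).

    hits is an iterable of (unit, copies) pairs.
    """
    by_type = defaultdict(Counter)
    for unit, _copies in hits:
        by_type[len(unit)][composition_key(unit)] += 1
    result = {}
    for unit_len, counts in by_type.items():
        key, n = counts.most_common(1)[0]
        result[unit_len] = (key, n / sum(counts.values()))
    return result


def pie_sectors(lengths, merge_at=0.01):
    """Map microsatellite length -> proportion, pooling shares <= merge_at as 'other'."""
    counts = Counter(lengths)
    total = sum(counts.values())
    sectors = {}
    other = 0.0
    for length, n in sorted(counts.items()):
        share = n / total
        if share <= merge_at:
            other += share  # rare lengths are pooled into one sector
        else:
            sectors[length] = share
    if other > 0:
        sectors["other"] = other
    return sectors


if __name__ == "__main__":
    hits = [("AG", 5), ("GA", 6), ("AG", 5), ("AT", 4), ("AAG", 4)]
    print(dominant_composition(hits))
    print(pie_sectors([10] * 60 + [12] * 39 + [14] * 1))
```

The number of keys returned by `pie_sectors` corresponds to the number of sectors that the text uses as an indicator of length variation.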
Author’s contributions
MMY, DXG and SXL carried out this study. TMY conceived the project, designed the analysis procedures, and wrote and revised the manuscript. All authors have read and approved the final text.