Computational Molecular Biology
Figure 1 Structure of matK gene
(http://www.faculty.biol.vt.edu/hilu/Hilu_Lab_Website/Pictures/Additional%20Photos/matK2.JPG)
1.2 NCBI (The National Center for Biotechnology Information)
The National Center for Biotechnology Information
(NCBI) is part of the United States National Library
of Medicine (NLM), a branch of the National
Institutes of Health. The NCBI houses a series of
databases relevant to biotechnology and biomedicine.
Major databases include GenBank for DNA sequences, as
well as Protein, Genome and EST, among others. All of
these databases are available online through the
Entrez search engine (http://www.ncbi.nlm.nih.gov).
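As an illustration of how Entrez can be queried programmatically, the sketch below builds a search URL for the NCBI E-utilities esearch service. The function name build_esearch_url, the example search term and the retmax value are assumptions made for illustration; they are not taken from this study, and the code only constructs the URL without sending a request.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities esearch service.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(db, term, retmax=20):
    """Return an esearch URL for an Entrez database and query term.

    build_esearch_url is a hypothetical helper name; db, term and
    retmax are standard esearch query parameters.
    """
    params = {"db": db, "term": term, "retmax": retmax}
    return ESEARCH + "?" + urlencode(params)


# Example query (an assumption, not the query used by the authors):
# matK records for the family Fabaceae in the nucleotide database.
url = build_esearch_url("nuccore", "matK[Gene] AND Fabaceae[Organism]")
print(url)
```

Opening the printed URL returns an XML list of matching record identifiers, which can then be retrieved with the efetch service.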
1.3 DNA (Deoxyribonucleic acid)/Nucleotide
Deoxyribonucleic acid (DNA) is a molecule that
encodes the genetic instructions used in the
development and functioning of all known living
organisms and many viruses (http://en.wikipedia.org).
Genetic information is encoded as a sequence of
nucleotides (guanine, adenine, thymine, and cytosine)
recorded using the letters G, A, T, and C. Most DNA
molecules are double-stranded helices, consisting of
two long polymers of simple units called nucleotides,
molecules with backbones made of alternating sugars
(deoxyribose) and phosphate groups (related to
phosphoric acid), with the nucleobases (G, A, T, C) attached
to the sugars (http://www.ncbi.nlm.nih.gov/nuccore/).
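Because a DNA sequence is recorded as a string over the letters G, A, T and C, basic properties can be computed directly from the text. The following is a minimal sketch; the helper names is_dna, reverse_complement and gc_fraction, and the example sequence, are illustrative assumptions and not part of this study.

```python
# Watson-Crick pairing between the two strands of the double helix.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def is_dna(seq):
    """True if seq is non-empty and uses only the letters G, A, T, C."""
    return len(seq) > 0 and set(seq.upper()) <= set(COMPLEMENT)


def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def gc_fraction(seq):
    """Return the fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


example = "ATGGCC"  # made-up sequence, for illustration only
print(is_dna(example), reverse_complement(example), gc_fraction(example))
```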
1.4 Protein
Proteins are large biological molecules consisting of
one or more chains of amino acids. Proteins perform a
vast array of functions within living organisms,
including catalyzing metabolic reactions, replicating
DNA, responding to stimuli, and transporting
molecules from one location to another
(http://en.wikipedia.org). Proteins differ from one
another primarily in their sequence of amino acids,
which is dictated by the nucleotide sequence of their
genes, and which usually results in folding of the
protein into a specific three-dimensional structure that
determines its activity (http://en.wikipedia.org;
http://www.ncbi.nlm.nih.gov/protein/).
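The dependence of the amino acid sequence on the nucleotide sequence can be sketched as a codon-by-codon translation. The example below assumes the standard genetic code and a coding sequence that starts in frame; the function name translate and the short test sequence are illustrative assumptions.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame coding sequence up to the first stop codon.

    Incomplete trailing codons are ignored.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends the chain
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCCTAA"))  # ATG -> M, GCC -> A, TAA -> stop
```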
2 Materials and Methods
In this paper we considered about 266 species found
in Gujarat state, India (Sagar Patel et al., 2013). We
searched for each species in the NCBI databases and
found information, such as DNA and protein sequences
and other useful data, for about 149 species of the
Leguminosae family (Sagar Patel et al., 2014). From
these data we considered only the matK gene, using
both its DNA and protein sequences. Evolutionary
analysis was carried out in MEGA software using the
Maximum Likelihood method (bootstrap method)
(Tamura et al., 2011), as shown in Figure 2.
Figure 2 Flow chart of method
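The bootstrap step of the method described above can be illustrated by its core operation: resampling the columns of a sequence alignment with replacement, so that a tree can be inferred from each pseudo-replicate. The sketch below shows only this resampling step and is not MEGA's implementation; the function name bootstrap_alignment, the toy alignment, the seed and the number of replicates are all assumptions made for illustration.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap pseudo-replicate of an alignment.

    alignment maps a species name to its aligned sequence. Columns are
    drawn with replacement, and the same draw is applied to every
    sequence so that each resampled column stays intact.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[name]) != length for name in names):
        raise ValueError("all aligned sequences must have the same length")
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(alignment[name][c] for c in columns)
        for name in names
    }


# Toy alignment with made-up sequences (not data from this study).
toy = {"sp1": "ATGGCA", "sp2": "ATGACA", "sp3": "ATAGCT"}
rng = random.Random(0)  # fixed seed so the run is repeatable
replicates = [bootstrap_alignment(toy, rng) for _ in range(100)]
print(replicates[0])
```

In a full analysis, a Maximum Likelihood tree would be estimated from every replicate, and the support for a branch would be reported as the percentage of replicate trees that contain it.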
3 Results
3.1 Result of DNA matK gene sequences
Figure 3 shows the result for the DNA matK sequences
obtained by the Maximum Likelihood method (bootstrap
method). Reading from the top, the species are
arranged subfamily-wise: first Fabaceae
(Papilionaceae), then Mimosaceae, followed by
Caesalpiniaceae. However, both the first and the last
species belong to the Fabaceae (Papilionaceae)
subfamily, so the species of the Mimosaceae and
Caesalpiniaceae subfamilies are nested within Fabaceae
(Papilionaceae). At the top are species of the
Fabaceae (Papilionaceae) subfamily, among which the
species of the genera Medicago, Crotalaria, Sesbania,
Vigna, Tephrosia, Butea and Trigonella are related as
expected from morphological characters or botanical
classification, except
Medicago lupulina, Vigna radiata, Vigna