GC2 Biology Dictates Gene Expressivity in
Camellia sinensis
$$N_a = Z_a^{-1}$$
The value of Na ranges from 1 to the number of
synonymous codons ka (the codon degeneracy). With
equal codon usage, homozygosity is minimal and the
value of Na is the number of synonymous codons. The
overall number of effective codons for a gene (Nc) is a
sum of average homozygosities Za for different
redundancy classes k (in set K of all redundancy
classes):
Where for each redundancy class:
When the codon usage pattern is more uniform than
expected, it is possible to obtain Nc > 61, in which
case it is readjusted to 61. If an amino acid is not
observed, or is very rare, its value is replaced by
the average homozygosity of the amino acids in the
same redundancy class. If isoleucine (Ile), the only
member of the redundancy class with three synonymous
codons, is missing, the corresponding Z is estimated
from the average homozygosity of the other
redundancy classes.
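The class-wise summation and the two adjustments described above (the cap at 61 and the replacement for unobserved amino acids) can be sketched in Python as follows. This is a minimal illustration, not the authors' Perl program: it assumes the homozygosities Za have already been computed, that the class average is taken over the Na = 1/Za values of the observed amino acids, and that the standard genetic code applies. The function name, the `DEGENERACY` table and the error handling are illustrative choices, and the isoleucine interpolation is deliberately left out.

```python
from collections import defaultdict

MAX_NC = 61.0  # number of sense codons in the standard genetic code

# Codon degeneracy k_a per amino acid (standard genetic code; an assumption).
DEGENERACY = {
    "M": 1, "W": 1,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2, "K": 2, "N": 2, "Q": 2, "Y": 2,
    "I": 3,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "L": 6, "R": 6, "S": 6,
}


def effective_number_of_codons(homozygosity, degeneracy=DEGENERACY):
    """Return Nc from per-amino-acid homozygosities Z_a.

    `homozygosity` maps one-letter amino acid codes to Z_a and may omit
    amino acids that are absent from (or too rare in) the gene.
    """
    observed = defaultdict(list)   # k -> N_a values of observed amino acids
    class_size = defaultdict(int)  # k -> n_k, amino acids in the class
    for aa, k in degeneracy.items():
        class_size[k] += 1
        if k > 1 and aa in homozygosity:
            z = homozygosity[aa]
            if z <= 0:
                raise ValueError(f"homozygosity for {aa} must be positive")
            observed[k].append(1.0 / z)  # N_a = 1 / Z_a

    nc = 0.0
    for k, n_k in class_size.items():
        if k == 1:
            nc += n_k  # a single codon always gives N_a = 1
        elif observed[k]:
            # Unobserved amino acids implicitly take the class average.
            nc += n_k * sum(observed[k]) / len(observed[k])
        else:
            # E.g. isoleucine missing: interpolation is not attempted here.
            raise ValueError(f"no amino acid observed in redundancy class {k}")
    return min(nc, MAX_NC)  # readjust values above 61
```

With perfectly equal codon usage (Za = 1/ka for every amino acid) the function returns 61, and with a single codon used per amino acid (Za = 1) it returns 20, which are the two extremes of Nc.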
For example, in the case of isoleucine:
When there is a large discrepancy among the amino
acids for a gene, the sum of Nc for all individual
amino acids can be used instead of taking the sum of
the averages of each redundancy class:
GC3s is the frequency of G+C, and A3s, T3s, G3s and
C3s are the frequencies of A, T, G and C, at the
synonymous third positions of codons (Gupta and
Ghosh, 2001). GC skew and AT skew are defined as
the ratios (G - C)/(G + C) and (A - T)/(A + T),
respectively, along the DNA sequence (Wright, 1990).
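These compositional measures can be sketched in Python as shown below. This is a simplified illustration rather than the authors' Perl program: it assumes the standard genetic code, treats every codon other than ATG, TGG and the three stop codons as having a synonymous third position, and uses one common denominator for A3s, T3s, G3s and C3s. The function names are hypothetical.

```python
from collections import Counter

# Codons without synonymous alternatives (Met, Trp) and stop codons,
# assuming the standard genetic code.
EXCLUDED_CODONS = {"ATG", "TGG", "TAA", "TAG", "TGA"}


def third_position_composition(cds):
    """Return A3s, T3s, G3s, C3s and GC3s for an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    thirds = Counter(
        codon[2]
        for codon in codons
        if codon not in EXCLUDED_CODONS and set(codon) <= set("ACGT")
    )
    total = sum(thirds.values())
    if total == 0:
        raise ValueError("no codons with a synonymous third position")
    result = {base + "3s": thirds[base] / total for base in "ATGC"}
    result["GC3s"] = result["G3s"] + result["C3s"]
    return result


def skew(seq, first, second):
    """Return (first - second) / (first + second) from base counts in seq."""
    seq = seq.upper()
    n_first, n_second = seq.count(first), seq.count(second)
    if n_first + n_second == 0:
        raise ValueError(f"sequence contains neither {first} nor {second}")
    return (n_first - n_second) / (n_first + n_second)
```

Calling `skew(seq, "G", "C")` gives the GC skew and `skew(seq, "A", "T")` the AT skew of a sequence.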
3.3 Analysis
All of the above parameters were calculated using a
Perl program developed by us. We then measured the
correlations between these parameters and gene
expressivity in Camellia sinensis.
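As an illustration of the correlation step, the sketch below computes a Pearson correlation coefficient between one parameter and an expressivity measure across genes. The text does not state which correlation coefficient was used, so the choice of Pearson's r is an assumption, as is the function name; this is not the authors' Perl program.

```python
import math


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / math.sqrt(var_x * var_y)
```

Each list would hold one value per gene, for example the GC3s values in `xs` and the expressivity values in `ys`.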
Authors’ contributions
S.C. conceived the idea and prepared the software for
analysis. P.P. analyzed the data set and prepared the
manuscript, figures and tables. All authors read and
approved the final manuscript.
Acknowledgment
We are thankful to Assam University, Silchar, Assam,
India for providing the necessary facilities in carrying
out this research work. We sincerely acknowledge the
help rendered by Dr. A. Sen, Director and other staff
members, Computer Centre, Assam University,
Silchar for their kind support in providing internet
access for this research work.
References
Bains W., 1987, Codon distribution in vertebrate genes may be
used to predict gene length, J. Mol. Biol., 197(3): 379-388
http://dx.doi.org/10.1016/0022-2836(87)90551-1
Bernardi G., 1993, The vertebrate genome: isochores and
evolution, Mol. Biol. Evol., 10: 186-204
D'Onofrio G., Ghosh T.C., and Bernardi G., 2002, The base
composition of the genes is correlated with the secondary
structures of the encoded proteins, Gene, 300(1-2):
179-187
http://dx.doi.org/10.1016/S0378-1119(02)01045-4
Eyre-Walker A., 1996, Synonymous codon bias is related to
gene length in Escherichia coli: selection for translational
accuracy? Mol. Biol. Evol., 13(6): 864-872
http://dx.doi.org/10.1093/oxfordjournals.molbev.a025646
Gerton J.L., DeRisi J., Shroff R., Lichten M., Brown P.O., and
Petes T.D., 2000, Global mapping of meiotic
recombination hotspots and coldspots in the yeast
Saccharomyces cerevisiae, Proc. Natl. Acad. Sci. USA,
97(21): 11383-11390
http://dx.doi.org/10.1073/pnas.97.21.11383
Gouy M., and Gautier C., 1982, Codon usage in bacteria:
correlation with gene expressivity, Nucleic Acids Res., 10:
7055-7074
http://dx.doi.org/10.1093/nar/10.22.7055
Gupta S.K., and Ghosh T.C., 2001, Gene expressivity is the
main factor in dictating the codon usage variation among
the genes in Pseudomonas aeruginosa, Gene, 273: 63-70
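Effective number of codons summed over the redundancy classes, with $n_k$ the number of amino acids in redundancy class $k$:
$$N_c = \sum_{k \in K} n_k \, \bar{N}_k$$
where, for each redundancy class:
$$\bar{N}_k = \frac{1}{n_k} \sum_{a \,:\, k_a = k} N_a$$
Isoleucine case: [equation expressing $Z$ for the three-codon redundancy class in terms of the $Z$ values of the other redundancy classes; not recoverable from the source]
Effective number of codons summed over the individual amino acids, with $A$ the set of amino acids:
$$N_c = \sum_{a \in A} N_a$$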
Computational
Molecular Biology