GC2 Biology Dictates Gene Expressivity in Camellia sinensis
            
            
$$N_a = Z_a^{-1}$$
            
            
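The value of $N_a$ ranges from 1 to the number of synonymous codons $k_a$ (the codon degeneracy). With equal codon usage, homozygosity is minimal and $N_a$ equals the number of synonymous codons. The overall effective number of codons for a gene ($N_c$) is obtained from the average homozygosities $\bar{Z}_k$ of the different redundancy classes $k$ (in the set $K$ of all redundancy classes):

$$N_c = \sum_{k \in K} n_k \bar{N}_k$$

where $n_k$ is the number of amino acids in redundancy class $k$ and, for each redundancy class:

$$\bar{N}_k = \bar{Z}_k^{-1}, \qquad \bar{Z}_k = \frac{1}{n_k} \sum_{a:\, k_a = k} Z_a$$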
            
            
When the codon usage pattern is more uniform than expected, it is possible to obtain $N_c > 61$, in which case the value is reset to 61. If an amino acid is not observed, or is very rare, its homozygosity is replaced by the average homozygosity of the amino acids in the same redundancy class. If isoleucine (Ile), the only member of the redundancy class with three synonymous codons, is missing, the corresponding $Z$ is estimated from the average homozygosities of the other redundancy classes.
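The calculation described above can be sketched in Python (not the authors' Perl program, which is not shown). The sketch assumes Wright's (1990) finite-sample homozygosity estimator, since the definition of the homozygosity lies outside this section. The names `SYNONYMOUS`, `homozygosity` and `effective_number_of_codons` are hypothetical. Amino acids with fewer than two observed codons are treated as missing and take the class average; the isoleucine fallback is not implemented and raises an error instead.

```python
from collections import Counter

# Standard genetic code, sense codons only (stop codons are excluded).
SYNONYMOUS = {
    "Phe": "TTT TTC", "Leu": "TTA TTG CTT CTC CTA CTG", "Ile": "ATT ATC ATA",
    "Met": "ATG", "Val": "GTT GTC GTA GTG", "Ser": "TCT TCC TCA TCG AGT AGC",
    "Pro": "CCT CCC CCA CCG", "Thr": "ACT ACC ACA ACG", "Ala": "GCT GCC GCA GCG",
    "Tyr": "TAT TAC", "His": "CAT CAC", "Gln": "CAA CAG", "Asn": "AAT AAC",
    "Lys": "AAA AAG", "Asp": "GAT GAC", "Glu": "GAA GAG", "Cys": "TGT TGC",
    "Trp": "TGG", "Arg": "CGT CGC CGA CGG AGA AGG", "Gly": "GGT GGC GGA GGG",
}
SYNONYMOUS = {aa: codons.split() for aa, codons in SYNONYMOUS.items()}


def homozygosity(counts):
    """Assumed estimator (Wright, 1990): (n * sum(p_i^2) - 1) / (n - 1).

    Returns None when the amino acid is unobserved or too rare to estimate.
    """
    n = sum(counts)
    if n < 2:
        return None
    sum_sq = sum((c / n) ** 2 for c in counts)
    z = (n * sum_sq - 1) / (n - 1)
    return z if z > 0 else None


def effective_number_of_codons(cds):
    """Class-averaged Nc for an in-frame coding sequence, capped at 61."""
    cds = cds.upper()
    usage = Counter(cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3))

    # Group the homozygosities Z_a by redundancy class k (codon degeneracy).
    by_class = {}
    for codons in SYNONYMOUS.values():
        k = len(codons)
        z = 1.0 if k == 1 else homozygosity([usage[c] for c in codons])
        by_class.setdefault(k, []).append(z)

    nc = 0.0
    for k, zs in by_class.items():
        observed = [z for z in zs if z is not None]
        if not observed:
            # E.g. isoleucine absent: the text estimates Z from other classes.
            raise ValueError(f"no usable amino acid in redundancy class {k}")
        z_bar = sum(observed) / len(observed)  # missing values take the average
        nc += len(zs) / z_bar                  # n_k / Z_bar_k
    return min(nc, 61.0)
```

Under these assumptions, a sequence that uses a single codon for every amino acid (each at least twice) gives 20, and a sequence that uses all 61 sense codons equally often (each at least twice) gives the capped value of 61.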
            
            
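For example, in the case of isoleucine:

$$\bar{Z}_3 = \frac{1}{3}\left[\left(\frac{2}{\bar{Z}_2} - 1\right)^{-1} + \left(\frac{2}{3\bar{Z}_4} + \frac{1}{3}\right)^{-1} + \left(\frac{2}{5\bar{Z}_6} + \frac{3}{5}\right)^{-1}\right]$$

where $\bar{Z}_k$ denotes the average homozygosity of the redundancy class with $k$ synonymous codons.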
            
            
When codon usage differs greatly among the amino acids of a gene, the effective numbers of codons $N_a$ of the individual amino acids can be summed directly, instead of summing over the averages of each redundancy class:

$$N_c = \sum_{a \in A} N_a$$

where $A$ is the set of all amino acids.
            
            
GC3s is the frequency of (G+C), and A3s, T3s, G3s, and C3s are the frequencies of A, T, G, and C, at the synonymous third positions of codons (Gupta and Ghosh, 2001). GC skew and AT skew are defined as the ratios (G - C)/(G + C) and (A - T)/(A + T), respectively, along the DNA sequence (Wright, 1990).
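As an illustration of these definitions, the following Python sketch computes the third-position frequencies and the two skews. It makes a simplifying assumption: "synonymous third positions" are taken to be the third positions of all codons except ATG, TGG and the stop codons, with frequencies normalised over those positions; other tools may normalise A3s, T3s, G3s and C3s differently. The function names are hypothetical.

```python
from collections import Counter

# Codons without synonymous alternatives (Met, Trp) and the stop codons.
NON_SYNONYMOUS = {"ATG", "TGG", "TAA", "TAG", "TGA"}


def third_position_frequencies(cds):
    """Return A3s, T3s, G3s, C3s and GC3s for an in-frame coding sequence."""
    cds = cds.upper()
    thirds = Counter(
        cds[i + 2]
        for i in range(0, len(cds) - 2, 3)
        if cds[i:i + 3] not in NON_SYNONYMOUS
    )
    total = sum(thirds[base] for base in "ATGC")
    if total == 0:
        raise ValueError("no synonymous codons found")
    freqs = {base + "3s": thirds[base] / total for base in "ATGC"}
    freqs["GC3s"] = freqs["G3s"] + freqs["C3s"]
    return freqs


def skew(seq, x, y):
    """Return (x - y) / (x + y) for the counts of bases x and y in seq."""
    seq = seq.upper()
    nx, ny = seq.count(x), seq.count(y)
    return (nx - ny) / (nx + ny) if nx + ny else 0.0


def gc_skew(seq):
    return skew(seq, "G", "C")


def at_skew(seq):
    return skew(seq, "A", "T")
```

The skews are computed here over the whole sequence passed in; a sliding-window version would apply the same `skew` function to successive slices.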
            
            
              
                3.3 Analysis
              
            
            
All of the above parameters were calculated using a Perl program developed by us. We then measured the correlation of each of these parameters with gene expressivity in Camellia sinensis.
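The correlation step might look like the Python sketch below. The source does not state which correlation coefficient was used, so Pearson's r is an assumption here, and `pearson_r` is a hypothetical helper rather than part of the authors' program. Each input list would hold one value per gene, for example a compositional parameter and the corresponding expressivity measure.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)
```

A rank-based coefficient such as Spearman's could be substituted by ranking both lists before calling the same function.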
            
            
              
                Authors’ contributions
              
            
            
S.C. conceived the idea and prepared the software for analysis. P.P. analyzed the data set and prepared the manuscript, figures and tables. All authors read and approved the final manuscript.
            
            
              
                Acknowledgment
              
            
            
We are thankful to Assam University, Silchar, Assam, India, for providing the facilities needed to carry out this research work. We sincerely acknowledge the help rendered by Dr. A. Sen, Director, and other staff members of the Computer Centre, Assam University, Silchar, in providing internet access for this research work.
            
            
              
                References
              
            
            
Bains W., 1987, Codon distribution in vertebrate genes may be used to predict gene length, J. Mol. Biol., 197(3): 379-388
http://dx.doi.org/10.1016/0022-2836(87)90551-1
            
            
              Bernardi G., 1993, The vertebrate genome: isochores and
            
            
              evolution, Mol. Biol. Evol., 10: 186-204
            
            
              D'Onofrio G., Ghosh T.C., and Bernardi G., 2002, The base
            
            
              composition of the genes is correlated with the secondary
            
            
              structures of the encoded proteins, Gene, 300(1-2):
            
            
              179-187
            
            
              http://dx.doi.org/10.1016/S0378-1119(02)01045-4
            
            
Eyre-Walker A., 1996, Synonymous codon bias is related to gene length in Escherichia coli: selection for translational accuracy? Mol. Biol. Evol., 13(6): 864-872
http://dx.doi.org/10.1093/oxfordjournals.molbev.a025646
            
            
Gerton J.L., DeRisi J., Shroff R., Lichten M., Brown P.O., and Petes T.D., 2000, Global mapping of meiotic recombination hotspots and coldspots in the yeast Saccharomyces cerevisiae, Proc. Natl. Acad. Sci. USA, 97(21): 11383-11390
http://dx.doi.org/10.1073/pnas.97.21.11383
            
            
              Gouy M., and Gautier C., 1982, Codon usage in bacteria:
            
            
              correlation with gene expressivity, Nucleic Acids Res., 10:
            
            
              7055-7074
            
            
              http://dx.doi.org/10.1093/nar/10.22.7055
            
            
Gupta S.K., and Ghosh T.C., 2001, Gene expressivity is the main factor in dictating the codon usage variation among the genes in Pseudomonas aeruginosa, Gene, 273: 63-70
            
            
              
            
            
              
            
            
              
            
            
              
            
            
              
              
            
            
              Computational
            
            
              Molecular Biology