Computational Molecular Biology (Online Edition), 2012, Vol. 1, No. 2, pp. 7-15
Jisuan Fenzi Shengwuxue (Online), 2012, Vol.1, No.2, 7-15
http://cmb.5th.sophiapublisher.com
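The Euclidean distance between the genomic oligonucleotide transition probability matrices of two species is computed as:

$$D = \sqrt{\sum_{i,j=1}^{N} \left(x_{i,j} - y_{i,j}\right)^{2}} \qquad (3)$$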
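where $N$ is the total number of rows (equivalently, columns) of the transition probability matrix, and $x_{i,j}$ and $y_{i,j}$ are the elements in row $i$, column $j$ of the transition probability matrices of the two species.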
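As a minimal sketch of this distance, the following Python function sums the squared element-wise differences of two $N \times N$ matrices and takes the square root, as the standard Euclidean distance does. The function name `euclidean_distance` and the 2 x 2 toy matrices are illustrative assumptions, not values or code from the study; real trinucleotide matrices would be much larger.

```python
import math


def euclidean_distance(x, y):
    """Euclidean distance between two N x N matrices given as nested lists."""
    n = len(x)
    # Both matrices must be square and share the same size N.
    if len(y) != n or any(len(row) != n for row in x) or any(len(row) != n for row in y):
        raise ValueError("both matrices must be square with the same size N")
    squared_sum = sum(
        (x[i][j] - y[i][j]) ** 2
        for i in range(n)
        for j in range(n)
    )
    return math.sqrt(squared_sum)


# Toy row-stochastic matrices (made-up numbers, for illustration only).
species_a = [[0.5, 0.5], [0.2, 0.8]]
species_b = [[0.4, 0.6], [0.3, 0.7]]

d = euclidean_distance(species_a, species_b)
print(round(d, 6))
```

Each of the four elements differs by 0.1 in this toy case, so the summed squared difference is 0.04 and the distance is 0.2.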
Figure 5 Phylogenetic trees of 5 strains of Streptococcus pyogenes
Note: A: Phylogenetic tree built from 16S rRNA gene nucleotide sequences by the neighbor-joining method; B: Phylogenetic tree built from the joint histogram divergence of genomic trinucleotide transition probability matrices
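In addition, phylogenetic analysis based on a single gene has limited resolving power: it discriminates well among taxa above the species level but has difficulty separating very closely related organisms below the species level (Bohlin et al., 2008). For organisms that single-gene sequence alignment cannot distinguish, we carried out cluster analysis with the joint histogram divergence in order to assess its advantages for phylogenetic analysis. We first used CLUSTAL X to produce a multiple sequence alignment of the 16S rRNA genes of 11 species and then built a single-gene phylogenetic tree with the neighbor-joining method. In parallel, we constructed the joint histogram divergence matrix for the same species and clustered it with the PHYLIP package to infer their phylogenetic relationships. Finally, we evaluated the new method by comparing the trees produced by the two approaches.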
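The hand-off from a pairwise divergence matrix to PHYLIP can be sketched as follows. The snippet writes a symmetric matrix in PHYLIP's square distance-matrix layout (taxon count on the first line, then one row per taxon with the name padded to 10 characters), which PHYLIP's distance-based programs such as `neighbor` read. The function name `to_phylip`, the taxon labels, and the distance values are hypothetical placeholders; the text does not state which PHYLIP program or input layout the authors used.

```python
def to_phylip(names, dist):
    """Format a symmetric distance matrix as a PHYLIP square distance matrix."""
    n = len(names)
    if len(dist) != n or any(len(row) != n for row in dist):
        raise ValueError("dist must be an n x n matrix matching names")
    for i in range(n):
        for j in range(n):
            if abs(dist[i][j] - dist[j][i]) > 1e-12:
                raise ValueError("distance matrix must be symmetric")
    lines = [f"{n:5d}"]  # first line: number of taxa
    for name, row in zip(names, dist):
        label = name[:10].ljust(10)  # names occupy exactly 10 characters
        lines.append(label + " ".join(f"{value:.6f}" for value in row))
    return "\n".join(lines) + "\n"


# Hypothetical labels and made-up divergence values, for illustration only.
names = ["strain_A", "strain_B", "strain_C"]
dist = [
    [0.00, 0.12, 0.30],
    [0.12, 0.00, 0.25],
    [0.30, 0.25, 0.00],
]

text = to_phylip(names, dist)
print(text, end="")
```

The resulting text can be saved to a file and supplied as the input file of a PHYLIP distance program.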
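Author contributions
Yan Cuiting (严翠婷) carried out most of the study, including data collection, analysis, and drafting of the manuscript; Huang Qingsheng (黄庆生) wrote the programs and took part in some of the data analysis and discussion; Zhang Fen (章芬) took part in some of the data analysis; Fang Ying (方颖) is the project leader and supervised the overall experimental design, data analysis, and the writing and revision of the paper.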
References
Amoutzias G.D., Robertson D.L., Oliver S.G., and Bornberg-
Bauer E., 2004, Convergent evolution of gene networks by
single-gene duplications in higher eukaryotes, EMBO
Reports, 5(3): 274-279
Bohlin J., Skjerve E., and Ussery D.W., 2008a, Reliability and
applications of statistical methods based on oligonucleotide
frequencies in bacterial and archaeal genomes, BMC
Genomics, 9(1): 104
Bohlin J., Skjerve E., and Ussery D.W., 2008, Investigations of
oligonucleotide usage variance within and between prokaryotes,
PLoS Computational Biology, 4(4): 1-9
Gogarten J.P., and Townsend J.P., 2005, Horizontal gene transfer,
genome innovation and evolution, Nature Reviews Microbiology,
3(9): 679-687
Mei Y.S., Yang S.X., and Mo B., 2007, Automatic image
registration algorithm based on a novel similarity measurement,
Yiqi Yibiao Xuebao (Chinese Journal of Scientific Instrument),
28(4): 336-339 (梅跃松, 杨树兴, 莫波, 2007, 一种基于新的相似性测度的自动图像配准算法, 仪器仪表学报, 28(4): 336-339)
Pass G., and Zabih R., 1999, Comparing images using joint
histograms, Multimedia Systems, 7(3): 234-240
Phillips G.J., Arnold J., and Ivarie R., 1987, Mono- through
hexanucleotide composition of the Escherichia coli genome: A Markov
chain analysis, Nucleic Acids Research, 15(6): 2611-2626
Qi J., Luo H., and Hao B.L., 2004, CVTree: A phylogenetic
tree reconstruction tool based on whole genomes, Nucleic
Acids Research, 32: W45-W47
Sun J.D., Xu Z., and Hao B.L., 2010, Whole-genome based
archaea phylogeny and taxonomy: A composition vector
approach, Chinese Science Bulletin, 55(22): 2323-2328
Takahashi M., Kryukov K., and Saitou N., 2009, Estimation of
bacterial species phylogeny through oligonucleotide frequency
distances, Genomics, 93(6): 525-533
Tyagi A., Bag S.K., Shukla V., Roy S., and Tuli R., 2010,
Oligonucleotide frequencies of barcoding loci can discriminate
species across kingdoms, PLoS ONE, 5(8): 1-9
Wu M., and Eisen J.A., 2008, A simple, fast, and accurate method
of phylogenomic inference, Genome Biology, 9: R151
Xiong Y.Y., Wang J.P., Lan Y.J., Wen M., and Zhang S.H., 2008,
Evolutionary information of the diversity of oligonucleotide
frequency of genomes, Zhongshan Daxue Xuebao (Zirankexue
Ban) (Acta Scientiarum Naturalium Universitatis Sunyatseni),
(2): 84-88 (熊远妍, 王军鹏, 蓝一杰, 文明, 张尚宏, 2008, 基因组寡聚核苷酸频率组分差异的进化信息, 中山大学学报: 自然科学版, (2): 84-88)