Cite this article: | LIU Hong-mei, LIU Guo-qing. A method for constructing phylogenetic tree based on k-mer information[J]. Chinese Journal of Bioinformatics, 2013, 11(2): 100-104. |
|
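Abstract: |
As more and more genomes are fully sequenced, alignment-free phylogenetic analysis based on whole genomes has become a research hotspot. The nucleotide composition of genomes is not identical across species or individuals. The information in the genetic language, the DNA sequence, is largely reflected in its k-mer frequencies. A phylogenetic tree based on the k-mer frequencies of genome sequences therefore offers a new perspective on the relationships among species. This paper defines an information parameter based on k-mer frequencies, uses it to characterize genome sequences, and calculates the distances between the information parameters of different genomes. A phylogenetic tree of 84 viruses was constructed with the neighbor-joining method, and it largely agrees with existing phylogenetic trees. |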
Keywords: phylogenetic tree; k-mer frequency; distance matrix |
DOI:10.3969/j.issn.1672-5565.2013-02.20130204 |
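Classification number: |
Funding: Supported by the National Natural Science Foundation of China (61102162), the Scientific Research Project of Higher Education Institutions of Inner Mongolia Autonomous Region (NJ10098), and the Innovation Fund of Inner Mongolia University of Science and Technology (2009NC005). |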
|
A method for constructing phylogenetic tree based on k-mer information |
LIU Hong-mei¹, LIU Guo-qing²,³
|
(1. Department of Respiration, People’s Hospital, Wuhai 016000, China; 2. School of Mathematics, Physics and Biological Engineering, Inner Mongolia University of Science and Technology, Baotou 014010, China; 3. The Institute of Bioengineering and Technology, Inner Mongolia University of Science and Technology, Baotou 014010, China)
|
Abstract: |
With the completion of sequencing for a growing number of genomes, alignment-free phylogenetic analysis based on complete genomes has become a hot topic. Nucleotide composition differs across species and populations. The information in the genetic language, the DNA sequence, is largely reflected in its k-mer frequencies. A phylogenetic tree based on k-mer frequencies therefore offers a novel perspective on the evolutionary relationships among organisms. In this study, the genomes of 84 large viruses are characterized by an information parameter defined from the k-mer frequencies of their sequences; the distances among the virus genomes are then calculated, and a phylogenetic tree is constructed with the neighbor-joining method. The resulting tree is largely in agreement with previously published trees. |
Key words: Phylogenetic Tree; k-mer Frequency; Distance Matrix |
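
The pipeline summarized in the abstract (k-mer frequencies, an information parameter per genome, pairwise distances) can be illustrated with a minimal Python sketch. The abstract does not give the definition of the information parameter or the distance measure, so the choices below are assumptions made only for illustration: the per-k-mer term -p·log2(p) stands in for the information parameter, and Euclidean distance stands in for the distance between genomes. The function names and the toy sequences are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import product
from math import log2, sqrt

ALPHABET = "ACGT"


def kmer_frequencies(seq, k):
    """Relative frequency of every possible k-mer, counted over overlapping windows."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    # Windows containing symbols other than A, C, G, T are ignored.
    total = sum(counts[m] for m in kmers)
    if total == 0:
        raise ValueError("sequence contains no valid k-mers")
    return [counts[m] / total for m in kmers]


def information_vector(freqs):
    """ASSUMPTION: -p*log2(p) per k-mer as a stand-in information parameter."""
    return [-p * log2(p) if p > 0 else 0.0 for p in freqs]


def euclidean(u, v):
    """ASSUMPTION: Euclidean distance; the paper's distance may differ."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def distance_matrix(genomes, k=3):
    """Return genome names and the symmetric pairwise distance matrix."""
    names = list(genomes)
    vecs = {n: information_vector(kmer_frequencies(genomes[n], k)) for n in names}
    matrix = [[euclidean(vecs[a], vecs[b]) for b in names] for a in names]
    return names, matrix


# Toy input (hypothetical sequences, not the 84 viral genomes of the study).
toy = {
    "A": "ACGTACGTACGT",
    "B": "ACGTACGTACGA",
    "C": "GGGGCCCCGGGG",
}
names, D = distance_matrix(toy, k=2)
for name, row in zip(names, D):
    print(name, [round(x, 3) for x in row])
```

In this toy run, sequences A and B differ by a single base and end up closer to each other than either is to C. A matrix of this kind is the input that a neighbor-joining implementation would take to build the tree; that step is left out here to keep the sketch short.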