A fast alignment-free method for protein sequence similarity and evolution analysis

AI Liang; FENG Jie

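Supervised by: Ministry of Industry and Information Technology; Sponsored by: Harbin Institute of Technology; Editor-in-chief: REN Nanqi; ISSN 1672-5565; CN 23-1513/Q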

Cite this article: AI Liang, FENG Jie. A fast alignment-free method for protein sequence similarity and evolution analysis[J]. Chinese Journal of Bioinformatics, 2023, 21(3): 179-186.

(School of Science, Minzu University of China, Beijing 100081)

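Abstract:

This paper proposes a new fast alignment-free method for protein sequence similarity and evolution analysis. To characterize a protein sequence, 10 physicochemical properties of the amino acids are first condensed into 6 principal components by principal component analysis, and the principal component scores are averaged using the amino acid counts in each protein sequence as weights. Amino acid position information is then fused in to form a 26-dimensional feature vector for the sequence. Finally, the Euclidean distance is used to measure similarity and evolutionary relationships between protein sequences. Tests on 3 protein sequence datasets show that the proposed method clusters every protein sequence accurately and is simple and fast, which demonstrates its effectiveness.

Keywords: protein sequence; principal component analysis; similarity; phylogenetic tree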

DOI: 10.12113/202209010

CLC number: Q516

Document code: A

A fast alignment-free method for protein sequence similarity and evolution analysis

AI Liang,FENG Jie

(School of Science, Minzu University of China, Beijing 100081, China)

Abstract:

In this paper, we propose a new fast alignment-free method for protein sequence similarity and evolution analysis. First, 10 physicochemical properties of amino acids are reduced to 6 principal components using principal component analysis, and the amino acid counts in each protein sequence are used as weights to compute a weighted average of the principal component scores. Then, amino acid position information is fused in to form a 26-dimensional feature vector for each protein sequence. Finally, the Euclidean distance is used to measure the similarity and evolutionary distance between protein sequences. Tests on three protein sequence datasets show that the method clusters every protein sequence accurately and is simple and fast, which demonstrates its effectiveness.
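
The pipeline described in the abstract can be illustrated with a minimal sketch. Several points here are assumptions and not taken from the paper: the abstract does not say how the 26 dimensions are composed, so the sketch assumes 6 composition-weighted principal component means plus 20 per-residue mean relative positions; it also reads "amino acid counts" as the number of occurrences of each of the 20 residue types. The function names (`pca_scores`, `feature_vector`, `distance_matrix`) are hypothetical, and the property table is a random placeholder standing in for the 10 real physicochemical properties.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def pca_scores(properties, n_components=6):
    """Return (20, n_components) PCA scores for a (20, n_props) property table."""
    # Standardize each property column, then project onto the leading principal axes.
    z = (properties - properties.mean(axis=0)) / properties.std(axis=0)
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    return z @ vt[:n_components].T


def feature_vector(seq, scores):
    """Build a 26-dim vector: 6 count-weighted PC means + 20 position terms."""
    # Non-standard residues are skipped; positions refer to the filtered sequence.
    residues = [AA_INDEX[aa] for aa in seq if aa in AA_INDEX]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    idx = np.array(residues)
    counts = np.bincount(idx, minlength=20).astype(float)
    # Weighted average of the PC scores, using residue counts as weights.
    weighted = counts @ scores / counts.sum()
    # ASSUMPTION: position information = mean relative position of each residue
    # type (0 when the residue is absent); the paper's definition may differ.
    positions = (np.arange(len(idx)) + 1) / len(idx)
    pos_sum = np.bincount(idx, weights=positions, minlength=20)
    mean_pos = np.divide(pos_sum, counts, out=np.zeros(20), where=counts > 0)
    return np.concatenate([weighted, mean_pos])


def distance_matrix(seqs, scores):
    """Pairwise Euclidean distances between the feature vectors of `seqs`."""
    feats = np.array([feature_vector(s, scores) for s in seqs])
    diff = feats[:, None, :] - feats[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


# Placeholder input: random numbers, NOT real physicochemical property values.
rng = np.random.default_rng(0)
placeholder_props = rng.normal(size=(20, 10))
scores = pca_scores(placeholder_props)

# Toy sequences for illustration only.
seqs = ["MKTAYIAKQR", "MKTAYIAKQK", "GGGSSSPPPW"]
D = distance_matrix(seqs, scores)
```

The resulting matrix `D` could then be passed to a hierarchical clustering or tree-building routine to obtain a phylogenetic tree; which routine the authors used is not stated in the abstract.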

Key words: Protein sequences; Principal component analysis; Similarity; Phylogenetic trees
