Cite this article: | GU Ti, CAI Leixin, WANG Shuai, LÜ Qiang. Prediction of RNA secondary structure including pseudoknots based on multi-objective genetic algorithm[J]. Chinese Journal of Bioinformatics, 2017, 15(3): 142-148. |
|
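Abstract: |
Pseudoknots are an important structure in RNA, and they are harder to predict because they are difficult to model. The ProbKnot algorithm, which predicts pseudoknotted RNA secondary structure from base-pairing probabilities, achieves high accuracy. However, it uses the pairing probability as its only predictive feature, so many negative pairs are predicted and the specificity component of its accuracy is low. This work combines the base-pairing probability model of ProbKnot with a multi-objective genetic algorithm to raise the specificity of pseudoknotted RNA secondary structure prediction and thereby improve overall accuracy. The method first computes the probability that each base is single-stranded, which serves as an additional predictive feature. It then applies a genetic algorithm to the RNA secondary structures through crossover, mutation, and iteration. Finally, it obtains the Pareto-optimal solutions and, from them, the highest maximum expected accuracy. Experimental results show that, on the RNA cases used, the method improves accuracy by about 4% on average over existing methods. |
Keywords: RNA secondary structure; pseudoknot; multi-objective optimization; genetic algorithm; maximum expected accuracy |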
DOI:10.3969/j.issn.1672-5565.201701006 |
CLC number: TP391 |
Document code: A |
Funding: National Natural Science Foundation of China (61170125). |
|
Prediction of RNA secondary structure including pseudoknots based on multi-objective genetic algorithm |
GU Ti¹, CAI Leixin¹, WANG Shuai¹, LÜ Qiang²
|
(1. School of Computer Science & Technology, Soochow University, Suzhou 215006, China; 2. Provincial Key Laboratory for Computer Information Processing Technology of Jiangsu(Soochow University), Suzhou 215006, China)
|
Abstract: |
Pseudoknots are an important structure in RNA, and they are harder to predict because they are difficult to model. The ProbKnot algorithm predicts the secondary structure of pseudoknotted RNA from base-pair probabilities with high accuracy. However, it uses the base-pair probabilities as its only predictive feature, which leads to a large number of negative pairs and low specificity. This work combines the base-pairing probability model of ProbKnot with a multi-objective genetic algorithm to improve specificity and, through it, overall accuracy. The method first calculates the single-stranded probability of each base as a new predictive feature, then uses a genetic algorithm to apply crossover, mutation, and iteration to RNA secondary structures, and finally obtains the Pareto-optimal solutions and the highest maximum expected accuracy. Experimental results show that, on the RNA cases used, the method improves accuracy by about 4% on average over existing methods. |
Key words: RNA secondary structure; pseudoknot; multi-objective optimization; genetic algorithm; maximum expected accuracy |
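The scoring and selection steps named in the abstract (single-stranded probability, Pareto-optimal solutions, maximum expected accuracy) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state the exact objectives, so the two objectives below (total pairing probability and total single-stranded probability) are assumptions, as are all function names, the toy probabilities, and the weight `gamma`. The genetic operators (crossover, mutation) are omitted. A structure is a list of index pairs, which allows crossing pairs such as pseudoknots.

```python
from typing import Dict, List, Tuple

Pair = Tuple[int, int]  # (i, j) with i < j, 0-based base indices


def unpaired_probabilities(n: int, pair_prob: Dict[Pair, float]) -> List[float]:
    """Single-stranded probability of each base: 1 minus the sum of its pairing probabilities."""
    q = [1.0] * n
    for (i, j), p in pair_prob.items():
        q[i] -= p
        q[j] -= p
    return [max(0.0, x) for x in q]  # clamp small negative rounding errors


def objectives(structure: List[Pair], n: int,
               pair_prob: Dict[Pair, float], q: List[float]) -> Tuple[float, float]:
    """Assumed objectives: (sum of pair probabilities, sum of unpaired probabilities)."""
    paired = {k for pair in structure for k in pair}
    pair_score = sum(pair_prob.get((min(i, j), max(i, j)), 0.0) for i, j in structure)
    single_score = sum(q[k] for k in range(n) if k not in paired)
    return pair_score, single_score


def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    """a dominates b if it is no worse in every objective and better in at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: List[Tuple[float, float]]) -> List[int]:
    """Indices of candidates that no other candidate dominates."""
    return [i for i, s in enumerate(scores)
            if not any(dominates(t, s) for k, t in enumerate(scores) if k != i)]


def expected_accuracy(score: Tuple[float, float], gamma: float = 1.0) -> float:
    """Weighted sum in the style of maximum expected accuracy scoring (gamma is hypothetical)."""
    pair_score, single_score = score
    return 2.0 * gamma * pair_score + single_score


# Toy example with made-up probabilities for a 4-base sequence.
n = 4
pair_prob = {(0, 3): 0.8, (1, 2): 0.6}
q = unpaired_probabilities(n, pair_prob)
candidates = [[], [(0, 3)], [(0, 3), (1, 2)], [(1, 2)]]
scores = [objectives(s, n, pair_prob, q) for s in candidates]
front = pareto_front(scores)
best = max(front, key=lambda i: expected_accuracy(scores[i]))
```

In a genetic algorithm, the candidate list would be the population of one generation, and the final pick from the Pareto front would use whatever expected-accuracy definition the method adopts.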