Cite this article: | LI Dazhou, LU Ying, GAO Wei, CHEN Sisi. MicroRNA-targeted gene prediction based on TransformerMGI[J]. Chinese Journal of Bioinformatics, 2024, 22(3): 225-238. |
|
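Abstract (translated from the Chinese): |
MicroRNAs are small biological molecules that can regulate gene expression positively or negatively, so studying the relationship between microRNAs and genes matters both for maintaining homeostasis and for treating disease. This work uses deep learning to predict microRNA-gene targeting relationships and proposes the TransformerMGI model. In the feature engineering stage, to address the difficulty of accurately extracting latent information from biological sequences, TransformerMGI applies the GP-GCN method, which is based on graph convolutional neural networks, to the microRNA data and the DNA2Vec model to the gene data, yielding a representation embedding matrix for each. On the modeling side, TransformerMGI introduces power normalization to improve the classical deep learning model. The two representation matrices obtained from feature extraction are fed separately into TransformerMGI, whose internal attention mechanism aggregates and correlates the features within each input and between the two inputs, and the model finally predicts the probability that a microRNA regulates a gene. The area under the ROC curve (AUC) and the area under the precision-recall curve (AUPRC) serve as performance metrics for comparing TransformerMGI with other existing models. The experimental results show that the AUC and AUPRC of TransformerMGI both reach above 0.91, outperforming the other existing models. Without considering biological principles or genomic context, and relying only on the base sequences of microRNAs and genes, TransformerMGI can predict microRNA target genes, offering a deep learning approach that later work on microRNA target gene prediction can draw on. |
Keywords: microRNA; target gene prediction; deep learning; graph convolutional network; multi-head attention |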
DOI:10.12113/202303008 |
Classification number: Q819 |
Document code: A |
Funding: |
|
MicroRNA-targeted gene prediction based on TransformerMGI |
LI Dazhou,LU Ying,GAO Wei,CHEN Sisi
|
(School of Computer Science and Technology, Shenyang University of Chemical Technology, Shenyang 110142, China)
|
Abstract: |
MicroRNAs are small biological molecules that can positively or negatively regulate gene expression. Studying the relationship between microRNAs and genes is therefore of great significance for maintaining homeostasis and treating disease. This paper uses deep learning to predict microRNA-gene targeting relationships and proposes the TransformerMGI model. In the feature engineering stage, to address the difficulty of accurately extracting latent information from biological sequences, the GP-GCN method, which is based on graph convolutional neural networks, and the DNA2Vec model are used to extract latent information from the microRNA and gene data, respectively, producing a representation embedding matrix for each. In terms of modeling, TransformerMGI introduces power normalization to improve the classical deep learning model. The two representation matrices obtained from feature extraction are fed separately into TransformerMGI, whose internal attention mechanism aggregates and correlates the feature information within each input and between the two inputs. Finally, the model predicts the probability that a microRNA regulates a gene. The area under the ROC curve (AUC) and the area under the precision-recall curve (AUPRC) are used as performance metrics to compare the proposed model with other existing models. The experimental results show that the AUC and AUPRC scores of TransformerMGI both reach above 0.91, better than the other existing models. TransformerMGI relies only on the base sequence information of microRNAs and genes, without considering biological principles or genomic background, to predict microRNA target genes, which provides a useful deep learning method for subsequent research on microRNA target gene prediction. |
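To illustrate how an attention mechanism can relate the two representation matrices, the following is a minimal sketch of scaled dot-product cross-attention in plain Python. It is not the TransformerMGI implementation: the function names, the toy 2-dimensional embeddings, and the absence of learned projections, multiple heads, and power normalization are all simplifying assumptions made for illustration.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of attention scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries: Matrix, keys: Matrix, values: Matrix) -> Tuple[Matrix, Matrix]:
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.

    Hypothetical helper: queries would come from one sequence's embeddings
    (e.g. microRNA) and keys/values from the other's (e.g. gene).
    """
    d = len(keys[0])
    weights = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights.append(softmax(scores))
    outputs = [
        [sum(w * v[j] for w, v in zip(row, values)) for j in range(len(values[0]))]
        for row in weights
    ]
    return outputs, weights


# Toy embeddings (assumed values): 2 microRNA positions, 3 gene positions.
mirna_emb = [[1.0, 0.0], [0.0, 1.0]]
gene_emb = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended, attn = cross_attention(mirna_emb, gene_emb, gene_emb)
print(len(attended), len(attn[0]))  # one output row per microRNA position
```

Passing the same matrix as queries, keys, and values gives self-attention over a single sequence, which corresponds to the within-input aggregation mentioned in the abstract.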
Key words: microRNA; target gene prediction; deep learning; graph convolutional networks; multi-head attention |
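The two evaluation metrics named in the abstract, AUC and AUPRC, can be computed from predicted probabilities and binary labels. The sketch below is a plain-Python illustration, not the authors' evaluation code: the function names and the toy labels and scores are assumptions, AUC uses the pairwise ranking identity, and AUPRC is estimated by average precision, which is one common estimator among several.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve as the fraction of correctly ranked pos/neg pairs."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # A tie between a positive and a negative score counts as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision: mean of precision values at the rank of each positive."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive example")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


# Toy data (assumed): 1 = microRNA targets the gene, 0 = it does not.
y_true = [1, 0, 1, 1, 0, 0]
y_prob = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(roc_auc(y_true, y_prob), average_precision(y_true, y_prob))
```

The pairwise AUC is quadratic in the number of examples, which is fine for a small illustration; a sort-based rank formulation would be the usual choice for large test sets.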