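Abstract:
This paper first introduces the molecular biology underlying sequence alignment, namely nucleotides, the basic units of nucleic acid sequences, and amino acids, the basic units of protein sequences. Carefully designed figures and tables list the names, properties, and classification of the four nucleotides and twenty amino acids. Section 2 outlines the fundamentals of sequence alignment, including the basic concepts of similarity and homology, global and local alignment, the dot plot method, dynamic programming and heuristic algorithms, scoring matrices and gap penalties, and commonly used software and analysis platforms. Section 3 introduces DNAfull, a scoring matrix commonly used for nucleic acid sequence alignment, and BLOSUM62 and PAM250, scoring matrices commonly used for protein sequence alignment. Sections 4-8 use hemoglobin, peptide toxins, plant transcription factors, carcinoembryonic antigen, and sialidase as examples of practical applications of pairwise sequence alignment. These examples show how to choose an analysis platform and alignment program, how to set the scoring matrix and gap penalties, and how to analyze alignment results and their biological significance. The paper closes with a brief summary.
Keywords: pairwise sequence alignment; similarity and homology; global and local alignment; dot plot; scoring matrix; gap penalty; hemoglobin; peptide toxin; plant transcription factor; carcinoembryonic antigen; sialidase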
DOI: 10.12113/202202002
CLC number: Q51
Document code: A
Funding:
|
Basics of pairwise sequence alignment and some application examples
LUO Jingchu
|
(College of Life Sciences and Center for Bioinformatics, Peking University, Beijing 100871, China)
|
Abstract:
This paper first presents a brief introduction to the molecular biology underlying sequence alignment, focusing on nucleotides and amino acids, the basic units of nucleic acid and protein sequences. Well-designed figures and tables display the names, classification, and properties of the four nucleotides and 20 amino acids. The second section describes general concepts of sequence alignment, including similarity and homology, global and local alignment, the dot plot method, dynamic programming and heuristic algorithms, scoring matrices and gap penalties, and alignment tools and platforms. The third section introduces the scoring matrices, which play a critical role in sequence alignment: DNAfull for DNA sequences, and BLOSUM62 and PAM250 for protein sequences, are described in detail. Taking hemoglobin, peptide toxin, plant transcription factor, carcinoembryonic antigen, and cytosolic sialidase as examples, sections 4-8 illustrate how to choose an alignment method and platform, how to select a scoring matrix, how to set gap penalties, and how to analyze alignment results and their biological significance. The paper closes with a brief summary.
Key words: Pairwise sequence alignment; Similarity and homology; Global and local alignment; Dot plot; Scoring matrix; Gap penalty; Hemoglobin; Peptide toxin; Plant transcription factor; Carcinoembryonic antigen; Cytosolic sialidase