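Abstract: |
BLAST, a database search system based on local sequence similarity alignment, is one of the most commonly used tools in bioinformatics. This article first introduces the basic concepts of database similarity searching, including scoring matrices, gap penalties, and sensitivity and specificity. Using the alpha and beta subunits of hemoglobin as an example, it then explains the basic BLAST search strategy: splitting the query into seed words, determining neighborhood words, searching for high-scoring pairs, extending high-scoring pairs, and calculating the expected value. The effects of seed word size, scoring matrix and gap penalties on search results are discussed. The four general-purpose BLAST programs blastp, blastx, blastn and tblastn are introduced, along with specialized programs such as SmartBlast, Primer-Blast and Global Align. The article closes with a brief account of the main uses of BLAST, a list of several international and Chinese BLAST web sites, and an introduction to other database search programs such as FASTA, BLAT and HMMER. |
Key words: Sequence similarity Database search BLAST search strategy Scoring matrix Gap penalty BLAST general-purpose programs BLAST specialized programs |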
DOI:10.12113/202411002 |
Classification code: TP392 |
Document code: A |
Funding: |
|
A brief introduction to the sequence database search system BLAST |
LUO Jingchu
|
(College of Life Sciences and Center for Bioinformatics, Peking University, Beijing 100871, China)
|
Abstract: |
The Basic Local Alignment Search Tool (BLAST) is a sequence database search system based on local sequence alignment and one of the most commonly used sequence analysis tools in bioinformatics. After introducing the general concepts of sequence database searching, including scoring matrices, gap penalties, sensitivity and specificity, we use the alpha and beta subunits of hemoglobin as an example to describe the main strategy of a BLAST search: ① split the query into seed words; ② find the neighborhood words; ③ search for high-scoring pairs; ④ extend the high-scoring pairs; ⑤ calculate the expected value E. The effects of major parameters such as the seed word size, the scoring matrix and the gap penalties are discussed. In addition to the routine programs blastp, blastn, blastx and tblastn, special programs such as SmartBlast, Primer-Blast and Global Align are briefly described. Finally, we list the main uses of BLAST, several international and domestic BLAST web sites, and other database search tools such as FASTA, BLAT and HMMER. |
Key words: Sequence similarity Database search BLAST search strategy Scoring matrix Gap penalty BLAST routine programs BLAST special programs |
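
The five-step search strategy summarized in the abstract can be illustrated with a minimal Python sketch. It is not the BLAST implementation. The match/mismatch scores stand in for a real scoring matrix such as BLOSUM62, the extension is ungapped only, and the function names, the threshold, the X-drop cutoff and the default values of K and lambda are illustrative assumptions. The expected value is computed with the Karlin-Altschul formula E = K·m·n·exp(−λS).

```python
import math
from itertools import product

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def score(a, b, match=5, mismatch=-4):
    """Toy substitution score (assumption); real searches use a scoring matrix."""
    return match if a == b else mismatch


def split_words(query, w=3):
    """Step 1: cut the query into overlapping seed words of length w."""
    return [(i, query[i:i + w]) for i in range(len(query) - w + 1)]


def neighborhood(word, threshold):
    """Step 2: all words of the same length scoring >= threshold against `word`."""
    return {
        "".join(cand)
        for cand in product(ALPHABET, repeat=len(word))
        if sum(score(a, b) for a, b in zip(word, cand)) >= threshold
    }


def find_hits(query, subject, w=3, threshold=6):
    """Step 3: (query_pos, subject_pos) pairs where the subject contains a neighborhood word."""
    hits = []
    for q_pos, word in split_words(query, w):
        neighbors = neighborhood(word, threshold)
        for s_pos in range(len(subject) - w + 1):
            if subject[s_pos:s_pos + w] in neighbors:
                hits.append((q_pos, s_pos))
    return hits


def extend(query, subject, q_pos, s_pos, w=3, x_drop=10):
    """Step 4: ungapped extension of a hit in both directions with an X-drop cutoff.

    Returns (best_score, q_start, q_end), with q_end exclusive.
    """
    best = current = sum(score(query[q_pos + k], subject[s_pos + k]) for k in range(w))
    q_start, q_end = q_pos, q_pos + w

    # Extend to the right until the score falls more than x_drop below the best.
    i, j = q_pos + w, s_pos + w
    while i < len(query) and j < len(subject) and best - current <= x_drop:
        current += score(query[i], subject[j])
        i += 1
        j += 1
        if current > best:
            best, q_end = current, i

    # Extend to the left, starting from the best right-extended segment.
    current = best
    i, j = q_pos - 1, s_pos - 1
    while i >= 0 and j >= 0 and best - current <= x_drop:
        current += score(query[i], subject[j])
        if current > best:
            best, q_start = current, i
        i -= 1
        j -= 1

    return best, q_start, q_end


def e_value(s, m, n, k=0.13, lam=0.32):
    """Step 5: E = K * m * n * exp(-lambda * S); k and lam are placeholder values."""
    return k * m * n * math.exp(-lam * s)


if __name__ == "__main__":
    query, subject = "MVLSPADKT", "GGMVLSAADKTGG"  # made-up toy sequences
    for q_pos, s_pos in find_hits(query, subject):
        s, q_start, q_end = extend(query, subject, q_pos, s_pos)
        print(q_pos, s_pos, s, query[q_start:q_end], e_value(s, len(query), len(subject)))
```

Lowering the threshold enlarges the neighborhood and produces more seed hits, which raises sensitivity at the cost of speed. Because K and lambda depend on the scoring system, the E values printed by this sketch are only indicative.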