Cite this article: | ZHU Rangfei, LIU Hongbo, SU Jianzhong, WANG Fang, CUI Ying, ZHANG Yan. Identification of regions differentially modified by histone modification based on the ChIP-seq data[J]. Chinese Journal of Bioinformatics, 2014, 12(02): 151-156. |
|
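Abstract (translated from the Chinese original): |
Histone modifications are epigenetic marks that play an important regulatory role at the genome level. The widespread use of ChIP-Seq and the accumulation of high-throughput data have laid the foundation for studying histone modification patterns across the whole genome. However, methods for screening disease-related regulatory regions across multiple samples are still lacking, so we developed a multi-cell-line differential screening algorithm to identify differentially histone-modified regions. We estimate histone modification levels with a sliding-window approach and quantify the differences among cell lines using information entropy theory. The significance threshold for a difference is determined from a random background. We used the algorithm to screen the human genome for regions with differential H3K4me3 across nine cell lines. The results show that these regions are significantly enriched at gene promoters and in other important chromatin states, and that they overlap significantly with the previously reported active-promoter chromatin state. Literature mining further confirmed genomic markers associated with leukemia. These results indicate that the entropy-based strategy can effectively mine differential histone modifications among multiple cell lines, including those associated with disease. |
Keywords: differential histone modification; high-throughput data processing; epigenetic modification |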
DOI:10.3969/j.issn.1672-5565.2014-02.20140212 |
Classification number: Q81 |
Funding: |
|
Identification of regions differentially modified by histone modification based on the ChIP-seq data |
ZHU Rangfei, LIU Hongbo, SU Jianzhong, WANG Fang, CUI Ying, ZHANG Yan
|
(School of Biological Information and Technology, Harbin Medical University, Harbin 150081, China)
|
Abstract: |
Histone modifications play an important regulatory role in the genome. With the wide use of ChIP-Seq, high-throughput data have accumulated from genome-wide studies of epigenetic gene regulation. However, the lack of effective methods for processing and analyzing these data hinders the screening of disease-related regulatory regions. In this study, we developed a novel algorithm to identify regions that are differentially modified by histone modification across multiple cell lines. With this method, we estimated the level of histone modification in each cell line and quantified the difference in histone modification among cell lines according to information entropy theory. A statistical significance threshold derived from a random background is then used to identify the differentially modified regions. We applied the algorithm to a genome-wide screen for regions differentially modified by H3K4me3 across nine cell lines. These regions were significantly enriched at gene promoters and in other important chromatin states, and they overlapped significantly with the chromatin state associated with active promoters. Further literature mining confirmed the specifically high H3K4me3 level of the gene RHCE in the K562 cell line. These results show that the proposed entropy-based strategy is effective for identifying histone modification differences among multiple cell lines and for mining epigenetic abnormalities in diseases. |
Key words: histone modification; high-throughput data processing; epigenetic modification |
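
The abstract names three steps: window-based estimation of modification levels, an entropy measure of the difference among cell lines, and a threshold taken from a random background. The following is a minimal Python sketch of how such a pipeline might look, not the authors' implementation. The function names, the window and step sizes, the pseudocount, the column-shuffling background, and the choice of a low-entropy cutoff at the `alpha` quantile are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math
import random
from bisect import bisect_left


def window_levels(read_starts, chrom_length, window=200, step=100):
    """Count reads in overlapping windows (assumed proxy for modification level)."""
    pos = sorted(read_starts)
    last_start = max(chrom_length - window, 0)
    return [
        bisect_left(pos, start + window) - bisect_left(pos, start)
        for start in range(0, last_start + 1, step)
    ]


def shannon_entropy(levels, pseudocount=1.0):
    """Shannon entropy (bits) of one window's levels across cell lines."""
    vals = [v + pseudocount for v in levels]  # pseudocount is an assumption
    total = sum(vals)
    probs = [v / total for v in vals if v > 0]
    return -sum(p * math.log2(p) for p in probs)


def background_threshold(matrix, alpha=0.05, seed=0):
    """Lower alpha-quantile of entropies after shuffling each cell line's
    windows independently (one hypothetical form of a random background)."""
    rng = random.Random(seed)
    columns = [list(col) for col in zip(*matrix)]
    for col in columns:
        rng.shuffle(col)
    background = sorted(shannon_entropy(row) for row in zip(*columns))
    return background[int(alpha * len(background))]


def differential_windows(matrix, threshold):
    """Indices of windows whose entropy falls below the threshold."""
    return [i for i, row in enumerate(matrix) if shannon_entropy(row) < threshold]


if __name__ == "__main__":
    # Rows are windows, columns are cell lines (toy numbers, not real data).
    matrix = [[5, 5, 5, 5], [100, 0, 0, 0], [3, 4, 5, 6]]
    print(differential_windows(matrix, threshold=1.0))
```

In this sketch a window whose signal is spread evenly over the cell lines has entropy close to log2 of the number of cell lines, while a window dominated by one cell line has entropy close to zero, so low-entropy windows are the candidates for cell-line-specific modification.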