Research on Information Fusion Methods for Identifying Disease-Related Ion Channel Targets
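Chinese Abstract

With the implementation and steady progress of the Human Genome Project, biological data such as nucleic acid and protein sequences and their expression profiles have grown explosively. This provides rich resources for research in biology and pharmacology and accelerates the development of functional genomics, but it also poses new challenges for data mining and knowledge discovery techniques. At the same time, life science research is trending toward being "molecular, systematic, and comprehensive", and our understanding of disease has deepened step by step, from the whole person to organs, tissues, cells, and now the molecular level. In the post-genomic era, screening for disease-related targets therefore needs to be studied at the levels of DNA sequence, expression profile, and protein, so that the best disease-related targets can be found efficiently.

Ion channels are a special class of proteins in the plasma membrane with important biological and pharmacological significance. At present, data on ion channels are scattered, and there is no systematic, comprehensive platform for integrating ion channel data, which makes ion channel research very inconvenient. To meet this need, this project fuses data from different levels, such as sequence and expression, and integrates software developed by our group with existing classical data analysis software and functional database resources to build an integrated analysis platform for ion channel data, with the aim of giving ion channel researchers a more convenient analysis platform.

Because ion channel diseases are polygenic, traditional identification and analysis methods for single-gene diseases can no longer mine disease-related genes thoroughly and reliably. This project therefore applies novel data mining techniques to analyze ion channel gene expression profile data systematically, and then uses biological functional annotation databases to study the pathogenic mechanisms of complex diseases in depth. First, based on the characteristics of ion channels, the expression profile data of ion channel genes and other transmembrane protein genes are selected from existing whole-genome expression profile data. This makes full use of existing bioinformatics data resources, offers a new data extraction and analysis perspective for the study of typical ion channel diseases, and reduces the cost of fabricating expensive ion channel chips. Next, for typical channelopathies such as heart disease, an ensemble decision method guided by tissue sample classes is used to identify disease-related genes. The results produced by different cross-validation runs are cross-analyzed, and several other classification methods are used to confirm them. The results show that the statistically significant disease-related ion channel genes identified by this method agree with existing biological knowledge, which demonstrates the effectiveness of the method. This study adopts a pattern recognition method for mining ion channel and transmembrane protein genes, namely the decision forest method. It takes full account of the influence of two important factors, frequency and depth, on the selection of disease-causing genes, and constructs a new index from them. Four classical classification methods are used to confirm the results, and multiple bioinformatics databases are combined to give a thorough and reasonable biological interpretation of the experimental results. The discovery of these highly reliable disease-causing genes is of great significance for drug target discovery and new drug development. In addition, we propose CCTWC, an ion channel data analysis technique based on coupled two-way clustering, which clusters ion channel expression profile data in two directions, samples and features. For the resulting ion channel gene clusters, the biological pathway network construction software PathwayStudio is used to build the interaction relationships among genes. We analyze the distribution of traditional ion channel classes among the ion channel subtypes defined at the level of disease genetic mechanism, as well as the interactions among ion channels within the ion channel gene clusters, and thereby reveal the association between ion channel subtypes defined by expression similarity and disease subtypes.

In summary, in response to the scattered distribution of current ion channel data, this study builds an integrated analysis platform for ion channel data at several levels. Starting from ion channel gene expression profile data, it uses an ensemble decision method to mine disease-related genes and uses the CCTWC method to study the intrinsic association between ion channel subtypes and disease subtypes. This work offers a new perspective for research on the pathogenesis of complex ion channel diseases.

Keywords: ion channel, integrated platform, decision forest, coupled two-way clustering

Abstract

With the steady progress of the Human Genome Project, biological information on nucleic acid and protein sequences and on expression profiles has grown explosively. This has provided abundant information for biological and pharmacological study and has accelerated the development of functional genomics, but it also poses new challenges for data mining and knowledge discovery technology. Meanwhile, life science research is becoming increasingly molecular, systematic, and comprehensive. The understanding of disease has deepened gradually, from the body to organs, tissues, and cells, and now to the molecular level. In the post-genome era, the selection of disease-related targets therefore requires study at several levels, from DNA sequence to expression profile to protein, in order to find the best disease-related targets.

Ion channels are a special kind of membrane protein with important physiological functions, and research on ion channels has critical biological and pharmacological significance. At present, data on ion channels are scattered across different databases, and there is no systematic, full-scale platform for integrating ion channel data, which greatly inconveniences ion channel research. To meet this need, our study constructs an integrated analysis platform for ion channel data from the sequence data and expression profile datasets of ion channel genes and proteins. It also integrates some of our own software with other classical data analysis software, with the aim of providing a more convenient analysis platform for ion channel researchers.

Channelopathies are often complex polygenic diseases, and traditional discriminant analysis methods for single-gene disorders can no longer mine disease-related genes thoroughly and reliably. To study the pathogenic mechanisms of complex channelopathies in depth, we apply novel data mining technology to analyze ion channel expression profiles systematically, together with their sequence datasets and other biological annotation information from functional databases. First, based on the characteristics of ion channels and on the growing number of high-throughput, whole-genome gene expression profile datasets, we select ion channel gene expression profiles from existing gene expression datasets. This makes full use of existing biological data and provides a novel perspective on data extraction for channelopathy research. It also reduces the high fabrication cost of ion channel microarrays. Next, for representative channelopathies such as cardiomyopathy, and guided by the classes of the tissue samples, we apply an ensemble decision approach to mine disease-related ion channel genes. We also cross-analyze the results obtained from different cross-validation runs and use other classification approaches to validate them. The results show that this ensemble decision approach can mine disease-related ion channel genes that are statistically significant and that agree with existing biological knowledge, which supports the validity of the approach. In our research, we adopt a pattern recognition method for ion channel data analysis, the decision forest. We also consider the impact of two important factors, the frequency and the depth of the features in the trees, and construct a new index that accounts for both. Meanwhile, we apply four classical classification methods to validate the results and fuse various bioinformatics databases to find a reasonable biological explanation for the experimental results. The discovery of disease-related genes is of great significance for target discovery and new drug development. In addition, we propose CCTWC, an ion channel data analysis technique based on coupled two-way clustering, which clusters ion channel expression profiles in two directions, samples and features. This yields different ion channel clusters according to gene expression level; that is, genes in one cluster have similar expression levels. We then use the pathway construction software PathwayStudio to obtain the interactions for each ion channel cluster. We analyze the distribution of traditional ion channel classes among the ion channel subtypes defined by disease genetic mechanism, as well as the interactions among the ion channels of the different ion channel gene clusters. From this we uncover the relationship between the ion channel subtypes defined by expression similarity and the disease subtypes.

On the whole, in response to the scattered distribution of ion channel data, this study constructs an integrated analysis platform for ion channel data from different data layers. Starting from ion channel gene expression profiles, we use the ensemble decision approach to mine disease-related ion channel genes, and we use the coupled two-way clustering approach to analyze the intrinsic relationship between ion channel subtypes and disease subtypes. This work provides a new perspective for studying the pathogenesis of complex channelopathies.

Keywords: Ion Channel; Integrated Platform; Decision Forest; Coupled Two-Way Clustering

Chapter 1 Introduction

1.1 Purpose and Significance of the Research

Ion channels are a special kind of protein in the plasma membrane, found in the cell membrane and in the membranes of organelles such as mitochondria and the endoplasmic reticulum. They allow ions of suitable size and charge to pass through the membrane by passive transport, producing membrane currents. They are the basic elements that generate excitation and transmit signals in the cell membranes of nerve, muscle, gland, and many other tissues. They take part in many biological processes, including action potential propagation, neurotransmitter release, muscle contraction, hormone secretion, the cell cycle, and ion distribution, and they are essential for maintaining the body's normal physiological functions.

Studies have shown that when the genes encoding ion channel subunits mutate, or when endogenous substances that target the channels appear in the body, channel function is enhanced or weakened to some degree. This ultimately disrupts the body's physiological functions and leads to congenital or acquired ion channel diseases (channelopathies). More than 30 channelopathies have been identified so far, mainly including cardiac diseases such as arrhythmia; central nervous system diseases such as epileptic seizures, ataxia, and migraine; arterial vascular diseases such as hypertension; and diseases of the pulmonary system …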