
Time: 2023-04-26 10:13:56  Natural science papers


Selections of data preprocessing methods and similarity metrics for gene cluster analysis

Clustering is one of the major exploratory techniques for gene expression data analysis. High-quality clustering results can be obtained only when suitable similarity metrics are used and the datasets are properly preprocessed. In this study, gene expression datasets with external evaluation criteria were preprocessed by normalization by line, normalization by column, or base-2 logarithm transformation, and were subsequently clustered by hierarchical clustering, k-means clustering, and self-organizing maps (SOMs), with either the Pearson correlation coefficient or Euclidean distance as the similarity metric. Finally, the quality of the clusters was evaluated by the adjusted Rand index. The results show that k-means clustering and SOMs have distinct advantages over hierarchical clustering in gene clustering, and that SOMs perform slightly better than k-means when randomly initialized. They also show that hierarchical clustering performs best with the Pearson correlation coefficient as the similarity metric and with datasets normalized by line, whereas k-means clustering and SOMs produce better clusters with Euclidean distance and logarithm-transformed datasets. These results provide a valuable reference for the implementation of gene expression cluster analysis.
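The building blocks named in the abstract can be sketched in plain Python. This is an illustrative sketch only, not the authors' implementation: the abstract does not give the normalization formula, so zero-mean, unit-variance scaling is assumed here, and "line" is assumed to mean a row of the expression matrix (one gene per row, one sample per column). All function names are hypothetical.

```python
import math
from collections import Counter


def normalize_rows(matrix):
    """Scale each row to zero mean and unit variance (ASSUMED formula;
    the abstract only says 'normalization by line')."""
    out = []
    for row in matrix:
        mean = sum(row) / len(row)
        std = math.sqrt(sum((x - mean) ** 2 for x in row) / len(row))
        # A constant row has zero variance; map it to all zeros.
        out.append([(x - mean) / std if std > 0 else 0.0 for x in row])
    return out


def normalize_columns(matrix):
    """Apply the same assumed scaling to each column via a transpose."""
    transposed = [list(col) for col in zip(*matrix)]
    return [list(row) for row in zip(*normalize_rows(transposed))]


def log2_transform(matrix):
    """Base-2 logarithm of every entry; entries must be strictly positive."""
    return [[math.log2(x) for x in row] for row in matrix]


def euclidean_distance(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pearson_correlation(a, b):
    """Pearson correlation coefficient; undefined for constant profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between a reference labeling and a clustering.
    Requires at least two items."""
    def pairs(k):
        return k * (k - 1) // 2

    n = len(labels_true)
    # Pairs of items that are grouped together in both labelings.
    sum_cells = sum(pairs(c) for c in Counter(zip(labels_true, labels_pred)).values())
    sum_true = sum(pairs(c) for c in Counter(labels_true).values())
    sum_pred = sum(pairs(c) for c in Counter(labels_pred).values())
    expected = sum_true * sum_pred / pairs(n)  # agreement expected by chance
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:  # degenerate case, e.g. one cluster in both
        return 1.0
    return (sum_cells - expected) / (max_index - expected)
```

In a comparison like the one described, each preprocessed matrix would be passed to a clustering routine with one of the two metrics, and the resulting labels scored against the external criterion with `adjusted_rand_index`. A score of 1 means the clustering matches the reference partition exactly, and a score near 0 means agreement no better than chance.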

Authors: YANG Chunmei, WAN Baikun, GAO Xiaofeng

Affiliations: YANG Chunmei, WAN Baikun (Department of Biomedical Engineering and Scientific Instrumentations, Tianjin University, Tianjin 300072, China); GAO Xiaofeng (Motorola (China) Electronics Ltd., Tianjin 300457, China)

Journal: 自然科學進展（英文版） (PROGRESS IN NATURAL SCIENCE), SCI
Year, volume(issue): 2006, 16(6)
Classification number: N1
Keywords: gene expression; cluster analysis; data preprocessing; similarity metrics; Rand index
