Reconstructing Disease Phenome-genome Association by Bi-Random Walk

Date of Submission: August 16, 2013
Promising results were recently reported on using network information from phenotype-similarity and gene-interaction networks, together with graph-based learning, to derive new disease phenotype-gene associations. However, a more fundamental understanding of how this network information relates to disease phenotype-gene associations is lacking. In this paper, we analyze the circular bigraphs (CBGs) in OMIM phenotype-gene association networks and introduce a bi-random walk (BiRW) algorithm that captures the CBG patterns in the networks to unveil the associations between the complete collection of disease phenotypes (phenome) and genes. BiRW performs random walks simultaneously on the gene interaction network and the phenotype similarity network to explore gene paths and phenotype paths in CBGs of different sizes. In the analysis of OMIM associations, we found that 81% of the associations are covered by CBG patterns of path length up to 3, with variability across 21 disease classes, and that there is a clear correlation between CBG coverage and the predictability of the phenotype-gene associations. Some prominent examples are cancers, nutritional diseases, dermatological diseases, bone diseases, cardiovascular diseases and respiratory diseases. Experiments on recovering known associations in cross-validation and on predicting new associations in a test set showed that BiRW improves prediction performance over existing methods by ranking more known associations in the top 100 out of more than 12,000 candidate genes. The investigation of the global disease phenome-genome association map also revealed interesting new predictions and phenotype-gene modules by disease class. Availability:
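The abstract describes BiRW only at a high level, so the following is a minimal sketch of the general idea rather than the authors' implementation. It assumes symmetric degree normalization of both networks, a restart-style update that blends the walked scores with the known associations, and averaging of the phenotype-side and gene-side steps. The function names (`normalize`, `matmul`, `bi_random_walk`), the parameters (`alpha`, `left_steps`, `right_steps`) and the toy matrices are all hypothetical and chosen for illustration.

```python
def normalize(W):
    """Symmetric normalization D^(-1/2) W D^(-1/2); isolated nodes stay zero."""
    inv = [sum(row) ** -0.5 if sum(row) > 0 else 0.0 for row in W]
    n = len(W)
    return [[inv[i] * W[i][j] * inv[j] for j in range(n)] for i in range(n)]


def matmul(A, B):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def bi_random_walk(P, G, A, alpha=0.8, left_steps=3, right_steps=3):
    """Score phenotype-gene pairs by walking on both networks.

    P: phenotype similarity matrix (n_p x n_p)
    G: gene interaction matrix (n_g x n_g)
    A: known associations (n_p x n_g), 1 where an association is known
    alpha, left_steps, right_steps: assumed parameters, not from the source.
    """
    total = sum(map(sum, A))
    if total == 0:
        raise ValueError("A must contain at least one known association")
    Pn, Gn = normalize(P), normalize(G)
    A0 = [[a / total for a in row] for row in A]  # normalized seed scores
    R = A0
    for t in range(1, max(left_steps, right_steps) + 1):
        walked = []
        if t <= left_steps:
            walked.append(matmul(Pn, R))  # one step on the phenotype network
        if t <= right_steps:
            walked.append(matmul(R, Gn))  # one step on the gene network
        # Average the available steps, then blend with the seed associations.
        R = [[alpha * sum(w[i][j] for w in walked) / len(walked)
              + (1 - alpha) * A0[i][j]
              for j in range(len(A0[0]))]
             for i in range(len(A0))]
    return R


# Toy example: 2 phenotypes, 3 genes, one known association (phenotype 0, gene 0).
P = [[0, 1], [1, 0]]
G = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
A = [[1, 0, 0], [0, 0, 0]]
scores = bi_random_walk(P, G, A)
```

In this toy setting, the known association spreads to gene 1 (a neighbor of gene 0) for phenotype 0 and to gene 0 for phenotype 1 (a neighbor of phenotype 0). Bounding the left and right step counts separately is one way to limit the phenotype and gene path lengths explored, in the spirit of the CBG path-length analysis above.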