Data Mining Based Predictive Models for Overall Health Indices

Date of Submission: April 14, 2010
In this study, we infer health care indices of individuals using their pharmacy medical and prescription claims. Specifically, we focus on the widely used Charlson Index (CI). We formulate prediction of the CI as a classification problem and use data mining techniques to build models that predict an individual's health index score. First, we present comparative analyses of several classification algorithms. Second, we show that certain ensemble algorithms achieve higher prediction accuracy than the base algorithms. Third, we introduce cost-sensitive learning to the classification algorithms and show that its inclusion improves prediction accuracy. The resulting predictive models can be used to allocate health care resources to individuals. They are expected to help in three ways: by reducing the cost of health care resource allocation and provisioning, thereby allowing countries and communities that cannot afford high health care costs to provide health indices (coverage); by providing individuals with a health index that takes their overall health into consideration, thereby improving the quality of individual health assessment (quality); and by improving the reliability of decision making through a set of objective criteria applied to all individuals (reliability).
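
The abstract does not say how cost-sensitive learning was applied to the classifiers. As an illustration only, one common formulation is the minimum expected cost decision rule: rather than predicting the most probable class, predict the class whose expected misclassification cost is lowest. The sketch below assumes three hypothetical CI bands (low, medium, high) and an invented cost matrix; neither the function name nor any of the numbers come from the study.

```python
def min_expected_cost_class(probs, cost):
    """Return the class index with the lowest expected misclassification cost.

    probs[i]   : estimated probability that the true class is i
    cost[i][j] : cost of predicting class j when the true class is i
    """
    n_classes = len(probs)
    # Expected cost of predicting j, averaged over the possible true classes i.
    expected = [
        sum(probs[i] * cost[i][j] for i in range(n_classes))
        for j in range(n_classes)
    ]
    return min(range(n_classes), key=expected.__getitem__)


# Hypothetical class probabilities for one individual: low, medium, high CI band.
probs = [0.5, 0.3, 0.2]

# Hypothetical cost matrix (illustrative values only): under-predicting a
# high-CI individual is penalised far more than over-predicting a low-CI one.
cost = [
    [0, 1, 2],  # true class: low
    [2, 0, 1],  # true class: medium
    [8, 3, 0],  # true class: high
]

# A uniform 0/1 cost matrix reduces the rule to picking the most probable class.
zero_one = [
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
]

cost_sensitive_choice = min_expected_cost_class(probs, cost)
plain_choice = min_expected_cost_class(probs, zero_one)
```

With these made-up numbers, the plain rule selects the low band (the most probable class), while the cost-sensitive rule shifts to the medium band because missing a high-CI individual is expensive. Note that this rule minimises expected cost; whether it also raises accuracy, as the study reports for its own setup, depends on the data and on the costs chosen.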