HiDRA: Statistical Multi-dimensional Resource Discovery for Large-scale Systems

Date of Submission: December 5, 2008
Resource discovery enables applications deployed in heterogeneous large-scale distributed systems to find resources that meet their execution requirements. In particular, most applications need resource requirements to be satisfied simultaneously for multiple resources (such as CPU, memory and network bandwidth). Due to the inherent dynamism in many large-scale systems caused by factors such as load variations, network congestion, and churn, providing statistical guarantees on such resource requirements is important to avoid application failures and overheads. However, existing resource discovery techniques either provide statistical guarantees only for individual resources, or take a static or memoryless approach to meeting resource requirements along multiple dimensions. In this paper, we present HiDRA, a resource discovery technique providing statistical guarantees for resource requirements spanning multiple dimensions simultaneously. Our technique takes advantage of the multivariate normal distribution for the probabilistic modeling of resource capacity over multiple dimensions. Through analysis of PlanetLab traces, we show that HiDRA performs nearly as well as a fully-informed algorithm, showing better precision and having recall within 3% of such an algorithm. We have also deployed HiDRA on a 307-machine PlanetLab testbed, and our live experiments on this testbed demonstrate that HiDRA is a feasible, low-overhead approach to statistical resource discovery in a distributed system.
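
As an illustration of the multivariate normal modeling mentioned above, the following is a minimal sketch of how a node's observed resource history could be turned into a probability that all requirements are met at once. It is not HiDRA's implementation: the abstract does not say how the joint probability is computed, so the Monte Carlo estimate here is a substitute chosen for simplicity. The function names, the two resource dimensions, and the sample history values are all hypothetical.

```python
import math
import random

def mean_and_cov(samples):
    """Sample mean vector and covariance matrix of equal-length observation rows."""
    n, d = len(samples), len(samples[0])
    mean = [sum(row[j] for row in samples) / n for j in range(d)]
    cov = [[sum((row[i] - mean[i]) * (row[j] - mean[j]) for row in samples) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mean, cov

def cholesky(cov, jitter=1e-9):
    """Lower-triangular L with L * L^T approximately equal to cov.
    A small diagonal jitter keeps the factorization stable (an assumption of this sketch)."""
    d = len(cov)
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(max(cov[i][i] + jitter - s, jitter))
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L

def prob_meets_all(mean, cov, required, draws=20000, seed=0):
    """Monte Carlo estimate of P(X_1 >= r_1, ..., X_d >= r_d) for X ~ N(mean, cov)."""
    rng = random.Random(seed)
    L = cholesky(cov)
    d = len(mean)
    hits = 0
    for _ in range(draws):
        # Draw independent standard normals, then correlate them via L.
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        x = [mean[i] + sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(d)]
        if all(x[i] >= required[i] for i in range(d)):
            hits += 1
    return hits / draws

# Hypothetical history for one node: (free CPU fraction, free memory in GB).
history = [(0.62, 1.9), (0.55, 1.7), (0.70, 2.1), (0.48, 1.5),
           (0.66, 2.0), (0.59, 1.8), (0.73, 2.2), (0.51, 1.6)]
mean, cov = mean_and_cov(history)

# Probability that the node offers at least 50% free CPU and 1.6 GB free memory together.
p = prob_meets_all(mean, cov, required=[0.5, 1.6])
print(f"estimated joint probability: {p:.3f}")
```

Modeling the dimensions jointly matters because resources on a node are often correlated. Multiplying per-resource probabilities treats them as independent and can misestimate the chance that every requirement holds simultaneously. A discovery query could then keep only nodes whose estimated probability exceeds a requested confidence level.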