HiDRA: Statistical Multi-dimensional Resource Discovery for Large-scale Systems

Resource discovery enables applications deployed in heterogeneous large-scale distributed systems to find resources that meet QoS requirements. In particular, most applications need resource requirements to be satisfied simultaneously for multiple resources (such as CPU, memory and network bandwidth). Due to dynamism in many large-scale systems, providing statistical guarantees on such requirements is important to avoid application failures and overheads. However, existing techniques either provide guarantees only for individual resources, or take a static or memoryless approach along multiple dimensions. We present HiDRA, a scalable resource discovery technique providing statistical guarantees for resource requirements spanning multiple dimensions simultaneously. Through trace analysis and a 307-node PlanetLab implementation, we show that HiDRA, while using over 1,400 times less data, performs nearly as well as a fully-informed algorithm, showing better precision and having recall within 3%. We demonstrate that HiDRA is a feasible, low-overhead approach to statistical resource discovery in a distributed system.