Effectiveness of Dynamic Resource Allocation for Handling Internet Flash Crowds

Internet data centers host multiple Web applications on shared hardware resources. These data centers are typically provisioned to meet the expected peak demands of the hosted applications based on normal time-of-day effects. Such static over-provisioning is not robust to flash crowd scenarios, where the load on some hosted applications rises far above their expected peak loads. In such scenarios, data centers can utilize their resources better by employing dynamic resource allocation. In this paper, we present a prototype data center implementation that we use to study the effectiveness of dynamic resource allocation for handling flash crowds with different characteristics. This prototype implements a multi-tiered server architecture along with mechanisms for monitoring, load detection, load balancing, and dynamic allocation. Our experiments with this prototype show that a carefully designed dynamic allocation scheme can be effective for handling flash crowds. We show that in order to handle very sharp growth in load, a dynamic allocation scheme must either be extremely responsive or employ low-overhead mechanisms such as hot spare servers. On the other hand, gradually increasing flash crowds can be handled equally well with larger overheads and slower reaction times. We also show that even in the presence of large allocation overhead, it is possible to achieve the same application performance either by allocating multiple servers simultaneously or by allocating a few servers at a time but more frequently. Using our results, we conclude that even without large-scale over-provisioning, it is possible to handle flash crowd conditions effectively using a dynamic allocation scheme that responds quickly to workload changes and that can mask large allocation overheads either by deploying a few ready servers or by allocating multiple servers simultaneously.
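
The trade-off between responsiveness, allocation overhead, and the number of servers allocated per decision can be illustrated with a toy discrete-time simulation. The sketch below is not the prototype described in this paper: the `Policy` fields, the `simulate` function, the linear load ramp, the 80% utilization threshold, and every numeric value are assumptions chosen only to make the qualitative behavior visible.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    """Hypothetical allocation policy; all fields are illustrative."""
    check_interval: int  # seconds between controller decisions
    overhead: int        # seconds from a decision until a server can serve
    batch: int           # servers requested per decision


def simulate(policy, growth_per_sec, duration=600, base_load=100.0,
             peak_load=1000.0, server_capacity=100.0, initial_servers=2,
             threshold=0.8):
    """Return the total unserved demand (request-seconds) over the run.

    Load ramps linearly from base_load to peak_load (an assumed flash-crowd
    shape). The controller requests `batch` servers whenever load exceeds
    `threshold` times the capacity that is active or already on order.
    """
    active = initial_servers
    pending = []  # ready times of servers still being prepared
    unserved = 0.0
    for t in range(duration):
        load = min(peak_load, base_load + growth_per_sec * t)

        # Controller decision, taken only every check_interval seconds.
        if t % policy.check_interval == 0:
            provisioned = (active + len(pending)) * server_capacity
            if load > threshold * provisioned:
                pending.extend([t + policy.overhead] * policy.batch)

        # Servers whose preparation has finished start serving now.
        active += sum(1 for ready in pending if ready <= t)
        pending = [ready for ready in pending if ready > t]

        # Demand beyond the active capacity goes unserved in this second.
        unserved += max(0.0, load - active * server_capacity)
    return unserved


# Assumed policies, named after the cases discussed in the abstract.
hot_spares = Policy(check_interval=5, overhead=0, batch=1)
slow_single = Policy(check_interval=30, overhead=120, batch=1)
slow_batched = Policy(check_interval=30, overhead=120, batch=4)
slow_frequent = Policy(check_interval=5, overhead=120, batch=1)

if __name__ == "__main__":
    sharp, gradual = 20.0, 0.2  # load growth in requests/s per second
    for name, policy in [("hot spares", hot_spares),
                         ("slow, one server", slow_single),
                         ("slow, batched", slow_batched),
                         ("slow, frequent", slow_frequent)]:
        print(f"sharp ramp, {name}: {simulate(policy, sharp):.0f}")
    print(f"gradual ramp, slow, one server: "
          f"{simulate(slow_single, gradual, duration=5000):.0f}")
```

Under these assumed parameters, the slow single-server policy leaves demand unserved on the sharp ramp but keeps up with the gradual one, while batching or more frequent decisions reduce the shortfall on the sharp ramp without reducing the per-server overhead. The model ignores queueing, load balancing, and multi-tier effects, so it indicates direction only and does not reproduce the experimental results.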