Scalable Transaction Management with Serializable Snapshot Isolation on HBase

Date of Submission: 
February 16, 2012
Report Number: 
12-004
Report PDF: 
Abstract: 
Key-value based data storage systems such as HBase and Bigtable provide high scalability and availability compared to traditional relational databases. However, unlike relational databases, the existing key-value stores provide only limited transactional functionality, such as single-row transactions. In this paper, we address the problem of building scalable transaction management mechanisms for multi-row ACID transactions on data storage systems such as HBase. To support scalability and availability, the transaction management functions are decoupled from the data storage system. Furthermore, our design does not depend on any central transaction management layer, instead these functions and the related recovery actions are performed in a decentralized and cooperative manner by the application level processes executing the transactions. Our protocol uses snapshot-isolation model as it provides more concurrency. Since the basic snapshot isolation model does not guarantee serializability, our protocol uses a technique based on identifying dependency cycles amongst transactions to avoid serialization anomalies. The protocol for supporting serializability is also performed in a decentralized manner. We demonstrate the scalability and robustness of our approach as well as the correctness of the protocol in ensuring serializable executions of transactions on HBase. We also present and evaluate an alternative approach based on a hybrid model where certain functions, such as conflict detection, are performed by a dedicated service.