Storing Dynamic Graphs: Speed vs. Storage Trade-offs
Date of Submission:
July 30, 2014
With the ever increasing size and availability of web and social graph comes the need for compact and efficient data structures in which to store them. The problem of compact representations for sparse static graphs is a well studied problem in the domain of web and online social graphs, namely the WebGraph Framework and the Layered Label Propagation ordering for extending WebGraph to online social networks, among others. While these techniques do a satisfactory job in the context of static sparse graphs, there is little literature on how these techniques can be extended to dynamic sparse graphs. In this paper, we present algorithms and experimental analysis for five data structures for representing dynamic sparse graphs: Linked-List (LL), Batch Compressed Sparse Row (BCSR), Dynamic Adjacency Array (DAA), Dynamic Intervalized Adjacency Array (DIAA), and Dynamic Compressed Adjacency Array (DCAA). The goal of the presented data structures is two fold. First, the data structures must be compact, as the size of the graphs being operated on continues to grow to less manageable sizes. Second, the cost of operating on the data structures must be within a small factor of the cost of operating on the static graph, else these data structures will not be useful. Of these five algorithms, LL, BCSR, and DAA are baseline approaches, DCAA is semi-compact, but suited for fast operation, and DIAA is focused on compactness and is a dynamic extension of the WebGraph Framework. Our results show that for well intervalized graphs, like web graphs, DCAA is superior to all other data structures in terms of memory and access time. Furthermore, we show that in terms of memory, the compact data structure DIAA outperforms all other data structures at the cost of a modest increase in update and access time.