A Parallel Hill-Climbing Refinement Algorithm for Graph Partitioning
Graph partitioning is an important step in distributing workloads on parallel compute systems, sparse matrix re-ordering, and VLSI circuit design. Producing high-quality graph partitionings while effectively utilizing available CPU power is becoming increasingly challenging due to the rising number of cores per processor. This increases not only the amount of parallelism required of the partitioner, but also the degree of the partitionings it must generate. In this work we present a new shared-memory parallel k-way method for refining an existing partitioning that can break out of local minima. Our method matches the quality of current high-quality serial refinement methods and achieves speedups of 5.7-16.7x using 24 threads, while exhibiting only 0.52% higher edgecuts than when run serially. It is also 6.3x faster than other parallel refinement methods.
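To make the quality measure concrete, the following minimal Python sketch computes the edgecut of a k-way partitioning and the gain of moving a single vertex to another part. It is an illustration only, not the method presented in this work: the function names `edge_cut` and `move_gain`, the unweighted edge-list representation, and the small example graph are all assumptions introduced here.

```python
# Illustrative sketch only; names and data layout are assumptions,
# not taken from the refinement method summarized above.

def edge_cut(edges, part):
    """Count edges whose two endpoints are assigned to different parts."""
    return sum(1 for u, v in edges if part[u] != part[v])


def move_gain(adj, part, v, dest):
    """Reduction in edgecut obtained by moving vertex v to part dest."""
    to_dest = sum(1 for u in adj[v] if part[u] == dest)
    to_own = sum(1 for u in adj[v] if part[u] == part[v])
    return to_dest - to_own


# Two triangles (0,1,2) and (3,4,5) joined by the single edge (2,3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

# Build adjacency lists from the undirected edge list.
adj = {v: [] for v in range(6)}
for u, v in edges:
    adj[u].append(v)
    adj[v].append(u)

# Vertex 2 starts in the "wrong" part, so two edges are cut.
part = [0, 0, 1, 1, 1, 1]
cut_before = edge_cut(edges, part)

# Moving vertex 2 to part 0 has positive gain, so a refiner would accept it.
gain = move_gain(adj, part, 2, 0)
part[2] = 0
cut_after = edge_cut(edges, part)
```

A purely greedy refiner applies only positive-gain moves like this one and stops when none remain, which is a local minimum. The sketch does not attempt the step of moving past such a point.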