Exploring Speculative Parallelism in SPEC2006
Date of Submission: November 4, 2008
The computer industry adopted multi-threaded and multi-core architectures as clock-rate increases stalled in the early 2000s. It was hoped that these architectures would sustain the continuous improvement of single-program performance. However, traditional parallelizing compilers often fail to parallelize general-purpose applications effectively, since such applications typically have complex control flow and heavy pointer usage. Recently, hardware techniques such as Transactional Memory (TM) and Thread-Level Speculation (TLS) have been proposed to simplify parallelization through speculative threads. The potential of speculative parallelism in general-purpose applications such as SPEC CPU 2000 has been well studied and shown to be moderately successful. Preliminary work on the potential parallelism in SPEC2006 deployed parallel threads under a restrictive TLS execution model with limited compiler support, and thus showed only limited performance potential. In this paper, we first analyze the cross-iteration dependence behavior of the SPEC2006 benchmarks and show that they offer more potential parallelism than SPEC2000. We then use a state-of-the-art profile-driven TLS compiler to identify loops that can be speculatively parallelized. Overall, we found an average speedup of 60% on four cores over what a traditional parallelizing compiler, such as Intel's ICC compiler, could achieve on these benchmarks. We also found that an additional 11% improvement could be obtained on selected benchmarks using 8 cores when TLS is extended to multiple loop levels rather than restricted to a single loop level.