Synchronization Transformations for Parallel Computing

Pedro Diniz and Martin Rinard


Department of Computer Science,
University of California at Santa Barbara
Santa Barbara, CA 93106-5110

Abstract

As parallel computing becomes part of the mainstream computing environment, compilers will need to apply the synchronization optimizations required to deliver efficient parallel software. This paper describes a new framework for synchronization optimizations and a new set of transformations for programs that implement critical sections using mutual exclusion locks. These transformations allow the compiler to move constructs that acquire and release locks both within and between procedures and to eliminate acquire and release constructs. The paper also presents a new synchronization algorithm, lock elimination, for reducing synchronization overhead. This optimization locates computations that repeatedly acquire and release the same lock, then uses the transformations to obtain equivalent computations that acquire and release the lock only once. Experimental results from a parallelizing compiler for object-based programs illustrate the practical utility of this optimization. For three benchmark programs the optimization dramatically reduces the number of times the computations acquire and release locks, which significantly reduces the amount of time processors spend acquiring and releasing locks. For one of the three benchmarks, the optimization always significantly improves the overall performance. Depending on the number of processors executing the computation, the optimized version runs between 2.11 and 1.83 times faster than the unoptimized version. For one of the other benchmarks, the optimized version runs between 1.13 and 0.96 times faster than the unoptimized version, with a mean 1.08 times faster. For the final benchmark, the optimization reduces the overall performance.
Postscript Document