Synchronization Transformations for Parallel Computing
Pedro Diniz and Martin Rinard
Department of Computer Science,
University of California at Santa Barbara
Santa Barbara, CA 93106-5110
Abstract
As parallel computing becomes part of the mainstream computing environment,
compilers will need to apply the synchronization optimizations required to
deliver efficient parallel software. This paper describes a new framework
for synchronization optimizations and a new set of transformations for
programs that implement critical sections using mutual exclusion locks.
These transformations allow the compiler to move constructs that acquire and
release locks both within and between procedures and to eliminate acquire
and release constructs.
The paper also presents a new synchronization algorithm, lock elimination,
for reducing synchronization overhead. This optimization locates computations
that repeatedly acquire and release the same lock, then uses the
transformations to obtain equivalent computations that acquire and release
the lock only once. Experimental results from a parallelizing compiler for
object-based programs illustrate the practical utility of this optimization.
For three benchmark programs the optimization dramatically reduces the number
of times the computations acquire and release locks, which significantly
reduces the amount of time processors spend acquiring and releasing locks.
For one of the three benchmarks, the optimization always significantly
improves the overall performance. Depending on the number of processors
executing the computation, the optimized version runs between 2.11 and 1.83
times faster than the unoptimized version. For one of the other benchmarks,
the optimized version runs between 1.13 and 0.96 times faster than the
unoptimized version, with a mean 1.08 times faster. For the final benchmark,
the optimization reduces the overall performance.