A Compiler Approach to Fast Design Space Exploration in FPGA-based Systems
Byoungro So
Mary Hall
Pedro Diniz
University of Southern California
Information Sciences Institute
bso@isi.edu
mhall@isi.edu
pedro@isi.edu
Abstract
This paper describes an automated approach to hardware design
space exploration, through a collaboration between parallelizing
compiler technology and high-level synthesis tools. The current
practice of mapping computations to custom hardware implementations
requires programmers to assume the role of hardware designers.
In tuning the performance of their hardware implementation,
designers manually apply loop transformations. For example,
loop unrolling is used to expose instruction-level parallelism
at the expense of more hardware resources for concurrent operator
evaluation. Because unrolling also increases the amount of data
a computation requires, too much unrolling can lead to a
memory-bound implementation in which computational resources
sit idle.
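
As an illustration of the unrolling trade-off described above, here
is a minimal sketch in C. It is not taken from the paper: the dot
product kernel, the function names dot_rolled and dot_unrolled4, and
the unroll factor of 4 are all assumptions chosen for the example.
The unrolled body exposes four independent multiply-accumulates per
iteration, which a hardware mapping could evaluate concurrently, but
it also needs eight operand reads per iteration instead of two.

```c
#include <assert.h>
#include <stddef.h>

/* Original loop: one multiply-accumulate and two operand reads
 * per iteration. */
int dot_rolled(const int *a, const int *b, size_t n) {
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Unrolled by a factor of 4 (illustrative choice). The four partial
 * sums are independent of one another, so the four multiplies can be
 * evaluated in parallel at the cost of four multipliers and eight
 * operand reads per iteration. */
int dot_unrolled4(const int *a, const int *b, size_t n) {
    assert(n % 4 == 0); /* this sketch omits the remainder loop */
    int s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    for (size_t i = 0; i < n; i += 4) {
        s0 += a[i]     * b[i];
        s1 += a[i + 1] * b[i + 1];
        s2 += a[i + 2] * b[i + 2];
        s3 += a[i + 3] * b[i + 3];
    }
    return s0 + s1 + s2 + s3;
}
```

If the memory interface cannot deliver eight operands per cycle, the
extra multipliers in the unrolled version wait on data, which is the
memory-bound situation the abstract warns about.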
To negotiate the inherent hardware implementation space-time
trade-offs, designers must engage in an iterative refinement
cycle, at each step manually applying transformations and
evaluating their impact. This design refinement is not
only error-prone and tedious but also prohibitively expensive,
given the potentially large search spaces and the currently long
synthesis times.
We developed a compiler algorithm that enables designers
to effectively explore the large design spaces resulting
from the application of a wide variety of program transformations.
Our approach uses synthesis estimation techniques to quantitatively
evaluate the application of several loop transformations and derive
an optimized and feasible hardware implementation for the computation
in a loop nest. The result is a fast but sound approach to hardware
design space exploration. We have implemented this design space
exploration algorithm in the context of a compilation and synthesis
system. Our preliminary experimental results indicate that the
design space exploration algorithm is effective in determining
which of the enabled loop transformations to apply and by how
much.
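
The general idea of estimate-driven exploration can be sketched in C
as follows. This is not the paper's algorithm: the Estimate struct,
the estimate and explore functions, and every constant in the cost
model are hypothetical stand-ins for a real synthesis estimator. The
sketch only shows how cheap estimates, rather than full synthesis
runs, can be used to pick an unroll factor that is both feasible
(fits the area budget) and fast.

```c
/* Hypothetical summary of one candidate design. */
typedef struct {
    unsigned unroll;      /* unroll factor; 0 means "no feasible design" */
    unsigned long area;   /* estimated area in made-up units */
    unsigned long cycles; /* estimated execution time in cycles */
} Estimate;

/* Toy estimator (assumption): area grows linearly with the unroll
 * factor, and execution time is the larger of the compute time and
 * the time needed to fetch all operands from memory. */
static Estimate estimate(unsigned unroll, unsigned long iters) {
    const unsigned long area_per_copy = 100;     /* made-up constant */
    const unsigned long words_per_iter = 2;      /* operands per iteration */
    const unsigned long mem_words_per_cycle = 8; /* made-up bandwidth */
    unsigned long compute = (iters + unroll - 1) / unroll;
    unsigned long memory = (iters * words_per_iter + mem_words_per_cycle - 1)
                           / mem_words_per_cycle;
    Estimate e = { unroll, area_per_copy * unroll,
                   compute > memory ? compute : memory };
    return e;
}

/* Try power-of-two unroll factors and keep the fastest design that
 * fits in `capacity`. On a tie in cycles the smaller design wins,
 * because a later candidate must be strictly faster to replace it. */
Estimate explore(unsigned long iters, unsigned max_unroll,
                 unsigned long capacity) {
    Estimate best = { 0, 0, 0 };
    for (unsigned u = 1; u <= max_unroll; u *= 2) {
        Estimate e = estimate(u, iters);
        if (e.area > capacity)
            break; /* area grows with u, so larger factors cannot fit */
        if (best.unroll == 0 || e.cycles < best.cycles)
            best = e;
    }
    return best;
}
```

Under this toy model, unrolling past the point where the design
becomes memory bound adds area without reducing the estimated cycle
count, so the search settles on the smallest factor that reaches the
memory limit.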