University of Southern California

Publications

Selected publications

1. VLSI Group – Jeffrey Draper et al

Journal Papers

Michael Bajura, et al, Models and Algorithmic Limits for an ECC-Based Approach to Hardening Sub-100nm SRAMs, IEEE Transactions on Nuclear Science, Vol 54, Number 4, August 2007, pp. 935 - 945

Sumit Mediratta, Jeffrey Draper, Achieving On-chip Fault-tolerance Utilizing BIST Resources, WSEAS Transactions on Circuits and Systems, Issue 12, Volume 5, December 2006, pp. 1726 - 1733

Jeffrey Draper, et al, A Prototype Processing-in-Memory (PIM) Chip for the Data-Intensive Architecture (DIVA) System, Journal of VLSI Signal Processing, Vol 40, Number 1, May 2005, pp. 73 - 84

Joong-Seok Moon, et al, Voltage-Pulse Driven Harmonic Resonant Rail Drivers for Low-Power Applications, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol 11, Number 5, October 2003, pp. 762 - 777

Herming Chiueh, Jeffrey Draper, John Choma, Jr., A Dynamic Thermal Management Circuit for System-On-Chip Designs, Analog Integrated Circuits and Signal Processing, Vol 36, Issue 1-2, July - August 2003, pp. 175 - 181

Herming Chiueh, Jeffrey Draper, John Choma, Jr., A Programmable Thermal Management Interface Circuit for PowerPC Systems, Microelectronics Journal, Vol 32/10-11, 2001, pp. 875 - 881

Louis Luh, John Choma, Jr., and Jeffrey Draper, A High-Speed Fully Differential Current Switch, IEEE Transactions on Circuits and Systems, April 2000, pp. 358 - 63

Louis Luh, John Choma, Jr., and Jeffrey Draper, A Continuous-Time Common-Mode Feedback Circuit (CMFB) for High-Impedance Current-Mode Applications, IEEE Transactions on Circuits and Systems, April 2000, pp. 363 - 369

Jeffrey T. Draper and Joydeep Ghosh, A Comprehensive Analytical Model for Wormhole Routing in Multicomputer Systems, Journal of Parallel and Distributed Computing, November 1994, pp. 202 - 214

Jeffrey T. Draper and Joydeep Ghosh, The M-Cache: A Message-Handling Mechanism for Multicomputer Systems, Parallel Computing, September 1994, pp. 1269 - 1288

Joydeep Ghosh, Kelvin Goveas, and Jeffrey T. Draper, Performance Evaluation of a Parallel I/O Subsystem for a Hypercube Multicomputer, Journal of Parallel and Distributed Computing, January 1993, pp. 190 - 206

________________________________________

Conference Papers/Presentations

Taek-Jun Kwon, Jeff Sondeen, Jeffrey Draper, Floating-Point Division and Square Root using a Taylor-Series Expansion Algorithm, Proceedings of the 50th IEEE International Midwest Symposium on Circuits and Systems, August 2007, pp. 305 - 308

Young Hoon Kang, Jeffrey Draper, Design Trade-offs for Load/Store Buffers in Embedded Processing Environments, Proceedings of the 50th IEEE International Midwest Symposium on Circuits and Systems, August 2007, pp. 1461 - 1464

Rashed Zafar Bhatti, Monty Denneau, Jeff Draper, Data Strobe Timing of DDR2 using a Statistical Random Sampling Technique, Proceedings of the 50th IEEE International Midwest Symposium on Circuits and Systems, August 2007, pp. 1114 - 1117

Rashed Zafar Bhatti, Keith Chugg, Jeff Draper, Standard Cell based Pseudo-Random Clock Generator for Statistical Random Sampling of Digital Signals, Proceedings of the 50th IEEE International Midwest Symposium on Circuits and Systems, August 2007, pp. 1110 - 1113

Sumit Mediratta, Jeffrey Draper, Performance Evaluation of Probe-Send Fault-tolerant Network-on-Chip Router, Proceedings of the IEEE International Conference on Application-Specific Systems, Architectures, and Processors, July 2007, pp. 69 - 75

Riaz Naseer, et al, Critical Charge Characterization for Soft Error Rate Modeling in 90nm SRAM, Proceedings of the IEEE International Symposium on Circuits and Systems, May 2007, pp. 1879 - 1882

Sumit Mediratta, Jeffrey Draper, Characterization of a Fault-tolerant NoC Router, Proceedings of the IEEE International Symposium on Circuits and Systems, May 2007, pp. 381 - 384

Riaz Naseer, et al, Critical Charge and SET Pulse Widths for Combinational Logic in Commercial 90nm CMOS Technology, Proceedings of the 2007 ACM Great Lakes Symposium on VLSI, March 2007, pp. 227 - 230

Sumit Mediratta, Jeffrey Draper, Effective Realization of On-chip Fault-tolerance Utilizing BIST Resources, Proceedings of the 5th WSEAS Int. Conf. on Circuits, Systems, Electronics, Control & Signal Processing, November 2006

Riaz Naseer, Rashed Zafar Bhatti, Jeff Draper, Analysis of Soft Error Mitigation Techniques for Register Files in IBM Cu-08 90nm Technology, Proceedings of the 49th IEEE International Midwest Symposium on Circuits and Systems, August 2006

Sumit Mediratta, Jeffrey Draper, On-chip Fault-tolerance Utilizing BIST Resources, Proceedings of the 49th IEEE International Midwest Symposium on Circuits and Systems, August 2006

Rashed Zafar Bhatti, Craig Steele, Jeff Draper, PBuf: An On-Chip Packet Transfer Engine for MONARCH, Proceedings of the 49th IEEE International Midwest Symposium on Circuits and Systems, August 2006

Riaz Naseer, Jeff Draper, DF-DICE: A Scalable Solution for Soft Error Tolerant Circuit Design, Proceedings of the IEEE International Symposium on Circuits and Systems, May 2006, pp. 3890 - 3893

Rashed Bhatti, Monty Denneau, Jeff Draper, Phase Measurement and Adjustment of Digital Signals Using Random Sampling Technique, Proceedings of the IEEE International Symposium on Circuits and Systems, May 2006, pp. 3886 - 3889

Tim Barrett, et al, A Double-Data Rate (DDR) Processing-in-Memory (PIM) Device with WideWord Floating-Point Capability, Proceedings of the IEEE International Symposium on Circuits and Systems, May 2006, pp. 1933 - 1936

Rashed Zafar Bhatti, Monty Denneau, Jeff Draper, 2 Gbps SerDes Design Based on IBM Cu-11 (130nm) Standard Cell Technology, Proceedings of the 2006 ACM Great Lakes Symposium on VLSI, May 2006, pp. 198 - 203

Sumit Mediratta, Jeffrey Draper, Performance Analysis of User-Level PIM Communication in the Data-Intensive Architecture (DIVA) System, Proceedings of the 12th International Conference on High Performance Computing (HiPC 2005), December 2005, pp. 407 - 419

Rashed Zafar Bhatti, Monty Denneau, Jeff Draper, Duty Cycle Measurement and Correction Using a Random Sampling Technique, Proceedings of the 48th IEEE International Midwest Symposium on Circuits and Systems, August 2005

Riaz Naseer, Jeff Draper, The DF-DICE Storage Element for Immunity to Soft Errors, Proceedings of the 48th IEEE International Midwest Symposium on Circuits and Systems, August 2005

Taek-Jun Kwon, Jeff Sondeen, Jeff Draper, Design Trade-Offs in Floating-Point Unit Implementation for Embedded and Processing-In-Memory Systems, Proceedings of the IEEE International Symposium on Circuits and Systems, May 2005, pp. 3331-3334

Sumit Mediratta, Craig Steele, Jeff Sondeen, Jeffrey Draper, An Area-Efficient and Protected Network Interface for Processing-In-Memory Systems, Proceedings of the IEEE International Symposium on Circuits and Systems, May 2005, pp. 2951-2954

Sumit Mediratta, Craig Steele, Ravinder Singh, Jeff Sondeen, Jeffrey Draper, A 0.18um CMOS Implementation of an Area Efficient Precise Exception Handling Unit for Processing-In-Memory Systems, Proceedings of the 47th IEEE International Midwest Symposium on Circuits and Systems, July 2004, Vol. III, pp. 455-458

Taek-Jun Kwon, Joong-Seok Moon, Jeff Sondeen, Jeff Draper, A 0.18um Implementation of a Floating-Point Unit for a Processing-In-Memory System, Proceedings of the IEEE International Symposium on Circuits and Systems, May 2004, Vol. II, pp. 453-456

Sumit Mediratta, Jeff Sondeen, Jeffrey Draper, An Area-Efficient Router for the Data-Intensive Architecture (DIVA) System, Proceedings of the 17th International Conference on VLSI Design, January 2004

Joong-Seok Moon, Taek-Jun Kwon, Jeff Sondeen, Jeff Draper, An Area-Efficient Standard-Cell Floating-Point Unit Design for a Processing-In-Memory System, Proceedings of the 29th European Solid-State Circuit Conference, September 2003

Jeffrey Draper, Jeff Sondeen, Chang Woo Kang, Implementation of a 256-bit WideWord Processor for the Data-Intensive Architecture (DIVA) Processing-In-Memory (PIM) Chip, Proceedings of the 28th European Solid-State Circuit Conference, September 2002

Herming Chiueh, Jeffrey Draper, Sumit Mediratta, Jeff Sondeen, The Address Translation Unit of the Data-Intensive Architecture (DIVA) System, Proceedings of the 28th European Solid-State Circuit Conference, September 2002

Jeffrey Draper, Jeff Sondeen, Sumit Mediratta, Ihn Kim, Implementation of a 32-bit RISC Processor for the Data-Intensive Architecture Processing-In-Memory Chip, Proceedings of the IEEE International Conference on Application-Specific Systems, Architectures, and Processors, July 2002

Jeff Draper, et al, The Architecture of the DIVA Processing-In-Memory Chip, Proceedings of the International Conference on Supercomputing, June 2002

Joong-Seok Moon, William C. Athas, Peter A. Beerel, Jeffrey T. Draper, Low-Power Sequential Access Memory Design, Proceedings of the IEEE Custom Integrated Circuits Conference, May 2002

Herming Chiueh, Jeffrey Draper, John Choma, Jr., A Dynamic Thermal Management Circuit for System-on-Chip Designs, Proceedings of the International IEEE Conference on Electronics, Circuits, and Systems, September 2001

Herming Chiueh, Jeffrey Draper, John Choma, Jr., A Thermal Management System and Prototyping for System-on-Chip Designs, Proceedings of the 2001 Southwest Symposium on Mixed-Signal Design, February 2001

Herming Chiueh, Jeffrey Draper, John Choma, Jr., A Programmable Thermal Management Interface Circuit for PowerPC Systems, Proceedings of the 6th International Workshop on Thermal Investigation of ICs and Systems, September 2000

Louis Luh, John Choma, Jr., Jeffrey Draper, A 400MHz 5th-Order CMOS Continuous-Time Switched-Current Sigma-Delta Modulator, Proceedings of the European Solid-State Circuits Conference, September 2000

Herming Chiueh, Jeffrey Draper, and John Choma, Jr., Implementation of a Temperature Monitoring Interface Circuit for PowerPC Systems, Proceedings of the IEEE Midwest Symposium on Circuits and Systems, August 2000

Chang Woo Kang and Jeffrey Draper, A Fast, Simple Router for the Data-Intensive Architecture (DIVA) System, Proceedings of the IEEE Midwest Symposium on Circuits and Systems, August 2000

Louis Luh, Jeffrey Draper, and John Choma, Jr., Performance Optimization for High-Order Continuous-Time Sigma-Delta Modulators with Extra Loop Delay, Proceedings of the IEEE International Symposium on Circuits and Systems, May 2000

Louis Luh, Jeffrey Draper, and John Choma, Jr., A Zener-Diode-Activated ESD Protection Circuit for Sub-Micron CMOS Processes, Proceedings of the IEEE International Symposium on Circuits and Systems, May 2000

Herming Chiueh, Jeffrey Draper, and John Choma, Jr., A Novel Fully Integrated Fan Controller for Advanced Computer Systems, Proceedings of the Southwest Symposium on Mixed-Signal Design, February 2000

Mary Hall, et al, Mapping Irregular Applications to DIVA, a PIM-based Data-Intensive Architecture, Proceedings of Supercomputing, November 1999

Louis Luh, John Choma, Jr., Jeffrey Draper, Herming Chiueh, A High-Speed CMOS On-Chip Temperature Sensor, Proceedings of the European Solid-State Circuits Conference, September 1999, pp 290-3

Louis Luh, John Choma, Jr., Jeffrey Draper, Herming Chiueh, A High-Speed Digital Comb Filter for Sigma-Delta Analog-to-Digital Conversion, Proceedings of the IEEE Midwest Symposium on Circuits and Systems, August 1999

Louis Luh, John Choma, Jr., and Jeffrey Draper, Circuit Design Challenges for High-Speed CMOS Continuous-Time Switched-Current Sigma-Delta Modulators, Proceedings of the IEEE Midwest Symposium on Circuits and Systems, August 1999

Louis Luh, Jeffrey Draper, and John Choma, Jr., A Self-Sensing Tristate Pad Driver for Control Signals of Multiple Bus Controllers, Proceedings of the IEEE International Symposium on Circuits and Systems, May 1999

Louis Luh, John Choma, and Jeffrey Draper, Area-Efficient Area Pad Design for High Pin-Count Chips, Proceedings of the Great Lakes Symposium on VLSI, March 1999, pp. 78-81

Herming Chiueh, Jeffrey Draper, Louis Luh, and John Choma, A Novel Model for On-Chip Heat Dissipation, Proceedings of the 1998 IEEE Asia-Pacific Conference on Circuits and Systems, November 1998, pp. 779-82

Louis Luh, John Choma, and Jeffrey Draper, A Continuous-Time Common-Mode Feedback Circuit (CMFB) for High-Impedance Current-Mode Application, Proceedings of the 5th IEEE International Conference on Electronics, Circuits, and Systems, September 1998, Vol. 3, pp. 347-50

Louis Luh, John Choma, and Jeffrey Draper, A High-Speed Fully Differential Current Switch, Proceedings of the 5th IEEE International Conference on Electronics, Circuits, and Systems, September 1998, Vol. 3, pp. 343-6

Herming Chiueh, Jeffrey Draper, Louis Luh, and John Choma, A Thermal Evaluation of Integrated Circuits: On-Chip Offset Temperature Measurement and Modeling, Proceedings of the Second International Workshop on Design of Mixed-Mode Integrated Circuits and Applications, July 1998, pp. 109-13

Louis Luh, John Choma, and Jeff Draper, A 50MHz Continuous-Time Switched-Current Sigma Delta Modulator, Proceedings of the 1998 IEEE International Symposium on Circuits and Systems, June 1998, Vol. I, pp. 579-82

Craig S. Steele, Jeff Draper, and Jeff Koller, SafetyNet: Secure Communications for Embedded High-Performance Computing, Lecture Notes in Computer Science 1388 (IPPS/SPDP'98 Workshops Proceedings), April 1998, pp. 908-12

Jeff Draper, Jay Block, Jeff Koller, and Craig Steele, Thermal Management in Embedded Systems Using MEMS, Lecture Notes in Computer Science 1388 (IPPS/SPDP'98 Workshops Proceedings), April 1998, pp. 900-1

Louis Luh, John Choma, and Jeff Draper, A Continuous-Time Switched-Current Sigma Delta Modulator with Reduced Loop Delay, Proceedings of the Great Lakes Symposium on VLSI, February 1998, pp. 286-91

Craig S. Steele, Jeff Draper, Jeff Koller, and Claire LaCour, A Bus-Efficient Low-Latency Network Interface for the PDSS Multicomputer, Proceedings of the International Symposium on High Performance Distributed Computing, August 1997, pp. 213-22

Jeff Draper and Fabrizio Petrini, Routing in Bidirectional k-ary n-cubes with the Red Rover Algorithm, Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, June 1997, pp. 1184-93

J. Koller, J. Block, J. Draper, C. Lacour, and C. Steele, Lessons from Three Generations of Embeddable Supercomputers, Presented at the Second International Workshop on Embedded HPC Systems and Applications, Geneva, Switzerland, April 1997

Jeff Draper, The Red Rover Algorithm for Deadlock-Free Routing on Bidirectional Rings, Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, August 1996, pp. 345-54

Jeffrey T. Draper and Joydeep Ghosh, Multipath E-Cube Algorithms (MECA) for Adaptive Wormhole Routing and Broadcasting in k-ary n-cubes, Proceedings of the Sixth International Parallel Processing Symposium, March 1992, pp. 407-10

Jeffrey T. Draper, Joydeep Ghosh, and William C. Athas, The M-Cache: A Message-Retrieving Mechanism for Multicomputer Systems, Proceedings of the IEEE Symposium on Parallel and Distributed Processing, December 1991, pp. 258-65

 

2. Compiler group – Mary Hall et al

2008

Model-Guided Performance Tuning of Parameter Values: A Case Study with Molecular Dynamics Visualization
Yoonju Lee Nelson, Bhupesh Bansal, Mary Hall, Aiichiro Nakano, and Kristina Lerman
In Proc. of the Workshop on High-Level Parallel Programming Models and Supportive Environments (HIPS), held in conjunction with IPDPS '08, April, 2008

The Potential of Computation Reuse in High-Level Optimization of a Signal Recognition System
Melina Demertzi, Pedro C. Diniz, Mary W. Hall, Anna C. Gilbert and Yi Wang
In Proc. of the Workshop on Next Generation Software, held in conjunction with IPDPS '08, April, 2008

Designing and Parameterizing a Workflow for Optimization: A Case Study in Biomedical Imaging
Vijay Kumar, Mary Hall, Jihie Kim, Yolanda Gil, Tahsin Kurc, Ewa Deelman and Joel Saltz
In Proc. of the Workshop on Next Generation Software, held in conjunction with IPDPS '08, April, 2008

2007

Compiler-assisted performance tuning
Chun Chen, Jacqueline Chame, Yoonju Lee Nelson, Pedro Diniz, Mary Hall, and Robert Lucas
Journal of Physics: Conference Series, Volume 78, 012024 (10pp), 2007.

A Combined Hardware/Software Optimization Framework for Signal Representation and Recognition
Melina Demertzi, Pedro C. Diniz, Mary W. Hall, Anna C. Gilbert and Yi Wang
In Proc. of the 2007 Data-Driven Dynamic Application Systems (DDDAS) Workshop held in conjunction with the Intl. Conference on Computational Science (ICCS'07), Beijing, China, May 2007.

Partial Data Reuse for Windowing Computations: Performance Modeling for FPGA Implementations
Joonseok Park and Pedro C. Diniz
In Proc. of the 2007 IEEE Intl. Workshop on Applied Reconfigurable Computing (ARC'07), Published in Lecture Notes in Computer Science (LNCS), Vol. 4419, pp. 97-109, March 2007.

Model-guided empirical optimization for multimedia extension architectures: A case study.
Chun Chen, Jaewook Shin, Shiva Kintali, Jacqueline Chame, and Mary Hall.
In Proceedings of the Workshop on Performance Optimization for High-Level Languages and Libraries (POHLL'07), held in conjunction with IPDPS'07, March 2007

Intelligent Optimization of Parallel and Distributed Applications
Bhupesh Bansal, Umit Catalyurek, Jacqueline Chame, Chun Chen, Ewa Deelman, Yolanda Gil, Mary Hall, Vijay Kumar, Tahsin Kurc, Kristina Lerman, Aiichiro Nakano, Yoon-ju Lee Nelson, Joel Saltz, Ashish Sharma, Priya Vashishta
In Proc. of the Workshop on Next Generation Software, held in conjunction with IPDPS '07, March, 2007

2006

Memory Parallelism using Custom Array Mapping to Heterogeneous Storage Structures
Nastaran Baradaran and Pedro C. Diniz
To appear in the Intl. Conference on Field-Programmable Logic and its Applications (FPL'06), Aug 2006.

An Overview of the ECO Project
Jacqueline Chame, Chun Chen, Pedro Diniz, Mary Hall, Yoon-Ju Lee, and Robert Lucas
In Proc. of the Workshop on Next Generation Software, held in conjunction with IPDPS '06, April, 2006

2005

Compiler-Directed Design Space Exploration for Caching and Prefetching Data in High-level Synthesis
Nastaran Baradaran and Pedro C. Diniz
In Proc. of the IEEE Intl. Conference on Field-Programmable Technology (FPT'05), Dec. 2005.

Array Replication to Increase Parallelism in Applications Mapped to Configurable Architectures
Heidi Ziegler, Priyadarshini Malusare and Pedro C. Diniz
In Proc. of the 18th Workshop on Languages and Compilers for Parallel Computing (LCPC'05) Oct. 2005.

A Systematic Approach to Model-Guided Empirical Search for Memory Hierarchy Optimization
C. Chen, J. Chame, M. W. Hall, and K. Lerman
In Proceedings of the 18th International Workshop on Languages and Compilers for Parallel Computing (LCPC'05), Oct. 2005.

Automatic mapping of C to FPGAs with the DEFACTO compilation and synthesis systems
Pedro C. Diniz, Mary Hall, Joonseok Park, Byoungro So and Heidi Ziegler
In Elsevier Journal on Microprocessors and Microsystems, Volume 29, Issues 2-3, 1 April 2005, Pages 51-62

Empirical Optimization for a Sparse Linear Solver: A Case Study
Yoon-Ju Lee, Pedro Diniz, Mary Hall, and Robert Lucas
International Journal of Parallel Programming (IJPP), Vol 33, Num 2/3, pp. 165-181, 2005

Superword-Level Parallelism in the Presence of Control Flow
Jaewook Shin, Mary W. Hall, Jacqueline Chame
International Symposium on Code Generation and Optimization (CGO), Mar. 2005

Combining Models and Guided Empirical Search to Optimize for Multiple Levels of the Memory Hierarchy
Chun Chen, Jacqueline Chame, Mary Hall
International Symposium on Code Generation and Optimization (CGO), Mar. 2005

A Register Allocation Algorithm in the Presence of Scalar Replacement for Fine-Grain Configurable Architectures
Nastaran Baradaran and Pedro C. Diniz
Proc. of Design Automation and Test in Europe (DATE'05), Mar 2005.

Evaluating Heuristics in Automatically Mapping Multi-Loop Applications to FPGAs
Heidi Ziegler and Mary Hall
International Symposium on Field-Programmable Gate Arrays (FPGA 2005), Feb. 2005

Exploiting Data Reuse in Modern FPGAs: Opportunities and Challenges for Compilers
Nastaran Baradaran and Pedro C. Diniz
Invited Paper. In Proc. of International Workshop on Applied Reconfigurable Computing (ARC 05), Feb 2005.

2004

Evaluating Compiler Technology for Control-Flow Optimizations for Multimedia Extension
Jaewook Shin, Mary W. Hall, Jacqueline Chame
6th Workshop on Media and Streaming Processors (MSP6), Dec. 2004
Held in conjunction with the 37th International Symposium on Microarchitecture

Compiler Reuse Analysis for the Mapping of Data in FPGAs with RAM Blocks
Nastaran Baradaran, Pedro Diniz and Joonseok Park
Proc. of the Intl. Conference on Field-Programmable Technology (FPT'04), Dec. 2004.

Performance and Area Modeling of Complete FPGA Designs in the Presence of Loop Transformations
Joonseok Park, Pedro Diniz and K. R. Shesha Ragunhatan
IEEE Transactions on Computers (Special Issue on Field-Programmable Logic), Nov. 2004.

A Code Isolator: Isolating Code Fragments from Large Programs
Yoon-Ju Lee and Mary Hall
Proceedings of the 17th Workshop on Languages and Compilers for Parallel Computing (LCPC'04) Sep. 2004

Extending the Applicability of Scalar Replacement for Multiple Induction Variables
Nastaran Baradaran, Pedro Diniz and Joonseok Park
Proceedings of the 17th Workshop on Languages and Compilers for Parallel Computing (LCPC'04) Sep. 2004

Data Reuse in Configurable Architectures with RAM Blocks: Extended Abstract
Nastaran Baradaran, Joonseok Park, and Pedro C. Diniz
In Proc. of the Intl. Conference on Field-Programmable Logic and its Applications (FPL'04), Aug 2004.

Increasing the applicability of scalar replacement
Byoungro So and Mary Hall
International Conference on Compiler Construction (CC), April 2004

A Case Study Using Empirical Optimization for a Large, Engineering Application
Pedro Diniz, Yoon-Ju Lee, Mary Hall, and Robert Lucas
In Proc. of the Workshop on Next Generation Software, held in conjunction with IPDPS '04, April, 2004

Custom Data Layout for Memory Parallelism
Byoungro So, Mary Hall, and Heidi Ziegler
To Appear in the Proceedings of the 2004 International Symposium on Code Generation and Optimization (CGO'04), Palo Alto, CA, March 20-24, 2004

2003

Exploiting Superword-Level Locality in Multimedia Extension Architectures
Jaewook Shin, Jacqueline Chame, and Mary W. Hall
Journal of Instruction Level Parallelism (JILP), vol. 5 (2003), pp. 1-28

Using Estimates from Behavioral Synthesis Tools in Compiler-Directed Design Space Exploration
Byoungro So, Pedro Diniz, and Mary Hall
In Proc. of the 40th Design Automation Conference (DAC)

Compiler-Generated Communication for Pipelined FPGA Applications
Heidi Ziegler, Mary Hall, and Pedro Diniz
In Proc. of the 40th Design Automation Conference (DAC)

Performance and Area Modeling of Complete FPGA Designs in the presence of Loop Transformations
K.R. Shayee and Joonseok Park and Pedro Diniz
In Proc. of the 13th International Conference on Field Programmable Logic and Applications (FPL 2003), published as Lecture Notes in Computer Science (LNCS 2778), 2003

ECO: an Empirical-based Compilation and Optimization System
Nastaran Baradaran, Jacqueline Chame, Chun Chen, Pedro Diniz, Mary Hall, Yoon-Ju Lee, Bing Liu, and Robert Lucas
In Proc. of the Workshop on Next Generation Software, held in conjunction with IPDPS '03, April, 2003

Data Search and Reorganization using FPGAs: Application to Spatial Pointer-based Data Structures
Pedro Diniz and Joonseok Park
In Proc. of the 2003 Symp. on FPGAs for Custom Computing Machines (FCCM'03) IEEE Computer Society Press, Los Alamitos, California, April 2003, pages 207-217.

Search Space Properties for Mapping Pipelined FPGA Applications
Heidi Ziegler, Mary Hall, and Byoungro So
Proceedings of the 16th annual workshop on Languages and Compilers for Parallel Computing (LCPC), 2003

Synthesis and Estimation of Memory Interfaces for FPGA-based Reconfigurable Computing Engines (short paper)
Joonseok Park and Pedro Diniz
In the Proc. of the 2003 IEEE Symposium on FPGAs for Custom Computing Machines (FCCM'03) IEEE Computer Society Press, Los Alamitos, California, April 2003, pages 297-299.

2002

A Compiler Algorithm for Exploiting Page-Mode Memory Access in Embedded-DRAM Devices
Jaewook Shin, Jacqueline Chame, and Mary W. Hall
4th Workshop on Media and Streaming Processors, November. 2002.

Coarse-Grain Pipelining for Multiple FPGA Architectures
Heidi Ziegler, Byoungro So, Mary Hall, and Pedro Diniz
In the Proceedings of the IEEE Symp. on FPGAs for Custom Computing Machines (FCCM'02), IEEE Computer Society Press, Los Alamitos, Calif., Oct. 2002, pp. 77-86.

Compiler-Controlled Caching in Superword Register Files for Multimedia Extension Architectures
J. Shin, J. Chame, and M. Hall
In Proceedings of the Parallel Architectures and Compilation Techniques Conference, September. 2002.

The Architecture of the DIVA Processing-In-Memory Chip
J. Draper, J. Chame, M. Hall, C. Steele, T. Barrett, J. LaCoss, J. Granacki, J. Shin, C. Chen, C. W. Kang, I. Kim, and G. Daglikoca
In Proceedings of the International Conference on Supercomputing, June, 2002.

A Compiler Approach to Fast Design Space Exploration in FPGA-based Systems
Byoungro So, Mary Hall and Pedro Diniz
In Proc. of the ACM Conference on Programming Language Design and Implementation (PLDI'2002), ACM Press, New York, June. 2002, pp. 00-00.

Data Reorganization Engines for the Next Generation of System-On-A-Chip FPGAs
Pedro Diniz and Joonseok Park
In Proceedings of the Tenth ACM International Symposium on FPGAs (FPGA'2002), ACM Press, New York, Feb. 2002, pp. 237-244.

Compiler Support for Custom Data Layout
Byoungro So, Heidi Ziegler, and Mary Hall
Proceedings of the 15th annual workshop on Languages and Compilers for Parallel Computing (LCPC), 2002

2001

Synthesis of Memory Access Controller for Streamed Data Applications for FPGA-based Computing Engines
Joonseok Park and Pedro Diniz
In Proceedings of the 14th Int. Symp. on System Synthesis (ISSS'2001), Oct. 2001, pp. 221-226.

Bridging the Gap between Compilation and Synthesis in the DEFACTO System
Pedro Diniz, Mary Hall, Joonseok Park, Byoungro So and Heidi Ziegler
In Proceedings of the 14th Workshop on Languages and Compilers for Parallel Computing (LCPC'2001), Published as Lecture Notes in Computer Science (LNCS), Springer-Verlag, Berlin, 2001, pp. 570-578.

An External Memory Interface for FPGA-based Computing Engines
Joonseok Park and Pedro Diniz
To appear in the Proceedings of the IEEE Symp. on FPGAs for Custom Computing Machines (FCCM'01), IEEE Computer Society Press, Los Alamitos, Calif., Oct. 2001

Matching and Searching Analysis for Parallel Hardware Implementation on FPGAs
Pablo Moisset, Pedro Diniz and Joonseok Park
In the Proceedings of the Ninth ACM Symposium on FPGAs (FPGA'2001), ACM Press, New York, Feb. 2001, pp. 125-133.

2000

Evaluating Automatic Parallelization in SUIF
Sungdo Moon, Byoungro So, and Mary W. Hall
IEEE Transactions on Parallel and Distributed Systems, Vol. 11, No. 1, pp. 36-49, IEEE Computer Society, January 2000

Automatic Synthesis of Data Storage and Control Structures for FPGA-based Computing Machines
Pedro Diniz and Joonseok Park
In the Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines (FCCM'2000), IEEE Computer Society Press, Los Alamitos, Oct. 2000, pp. 91-100.

Compiler Transformations for Exploiting Bandwidth in PIM-Based Systems
Jacqueline Chame, Jaewook Shin, and Mary Hall
In The 27th Annual International Symposium on Computer Architecture, Workshop on Solving the Memory Wall Problem, June 11, 2000, Vancouver, British Columbia, Canada

1999

Combining Compile-Time and Run-Time Parallelization
Sungdo Moon, Byoungro So, and Mary W. Hall
Scientific Programming, Vol. 7, pp. 247-260, IOS Press, 1999

Mapping Irregular Applications to DIVA, A PIM-based Data-Intensive Architecture
Mary Hall, Peter Kogge, Jeff Koller, Pedro Diniz, Jacqueline Chame, Jeff Draper, Jeff LaCoss, John Granacki, Apoorv Srivastava, William Athas, Jay Brockman, Vincent Freeh, Joonseok Park, and Jaewook Shin
In Proceedings of the Supercomputing Conference (SC'99), November 1999.

Parallelization and Locality Analysis for Adaptive Computing Systems
Byoungro So, Heidi Ziegler, and Mary W. Hall
PACT'99 Workshop on Reconfigurable Computing (WoRC99), Newport Beach, California, October 16, 1999

Very High-Level Synthesis of Control and Datapath Structure for Reconfigurable Logic Devices
Pablo Moisset, Joonseok Park and Pedro Diniz
Proceedings of the Second International Workshop on Compiler and Architecture Support for Embedded Systems (CASES'99), Washington, D.C., October, 1999

A Tile Selection Algorithm for Data Locality and Cache Interference
Jacqueline Chame and Sungdo Moon
Proceedings of the 13th ACM International Conference on Supercomputing (ICS'99), pp. 492-499, Rhodes, Greece, June 1999

Evaluation of Predicated Array Data-Flow Analysis for Automatic Parallelization
Sungdo Moon and Mary W. Hall
Proceedings of the 7th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPOPP'99), pp. 84-95, Atlanta, Georgia, May 1999

DEFACTO: A Design Environment for Adaptive Computing Technology
Kiran Bondalapati, Pedro Diniz, Phillip Duncan, John Granacki, Mary Hall, Rajeev Jain, and Heidi Ziegler
Proceedings of the 6th Reconfigurable Architectures Workshop (RAW '99), San Juan, Puerto Rico, April 12, 1999

1998

Predicated Array Data-Flow Analysis for Run-Time Parallelization
Sungdo Moon, Mary W. Hall, and Brian R. Murphy
Proceedings of the 12th ACM International Conference on Supercomputing (ICS'98), pp. 204-211, Melbourne, Australia, July 1998

Measuring the Effectiveness of Automatic Parallelization in SUIF
Byoungro So, Sungdo Moon, and Mary W. Hall
Proceedings of the 12th ACM International Conference on Supercomputing (ICS'98), pp. 212-219, Melbourne, Australia, July 1998

A Case for Combining Compile-Time and Run-Time Parallelization
Sungdo Moon, Byoungro So, and Mary W. Hall
Proceedings of the Fourth Workshop on Languages, Compilers, and Run-time Systems for Scalable Computers (LCR'98), Pittsburgh, Pennsylvania, May 1998, Lecture Notes in Computer Science, Vol. 1511, pp. 91-106, Springer-Verlag


3. Forces Modeling group – Dan Davis et al

2007

IITSEC 2007-2 (GPU in Sims)
Implementing a GPU-Enhanced Cluster for Large-Scale Simulations
WinterSim 2007
High-Performance Computing Enables Simulations to Transform Education

HPCMP User Group Conference 2007
A GPU-Enhanced Cluster for Accelerated FMS
IITSEC 2007-1 (Education)
Implementing New Educational Technology for 21st Century DoD Leadership Development

2006

IITSEC 2006 (Aggreg-DeAgr)
Application of Proven Parallel Programming Algorithmic Design to the Aggregation/De-aggregation Problem

Analysis
2006 ITEA Journal of Test and Evaluation
Supercomputing’s Role in Data Problems and Its Contribution to Solutions
SIW 2006
Petascale Computing for Military Operations
IITSEC 2006 (Education)
Educational Extensions of Large-Scale Simulations Enabled by High Performance Computing

2005

2005 ITEA Journal of Test and Evaluation
Joint Experimentation on Scalable Parallel Processors (JESPP)
Journal of Defense Modeling and Simulation
Advanced Message Routing for Scalable Distributed Simulations
WinterSim 2005
Enabling 1,000,000-Entity Simulations on Distributed Linux Clusters

2004

IITSEC 2004
21st Century Simulation: Exploiting High Performance Computing and Data Analysis

2003

IITSEC 2003
Joint Experimentation on Scalable Parallel Processors