MPI performance analysis and optimization on Tile64/Maestro

Abstract

- Stalls are caused by data and instruction cache misses.
- Message data is loaded into the cache when it is prepared for the send operation, before the MPI call is made.
- As the MPI call passes through several subroutines, part or all of that data is evicted because of cache conflicts.
- As the message size increases, more and more cycles are lost to cache misses.
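
The eviction effect described above can be illustrated with a toy cache model. The following is a minimal sketch, not the authors' measurement method: it assumes a hypothetical 8 KiB direct-mapped cache with 64-byte lines (not claimed to match the Tile64 cache configuration), and the names `DirectMappedCache`, `misses_during_send`, `msg_base`, and `call_base` are invented for illustration. It counts how many message lines miss when the send path re-reads a buffer after the intervening call path has touched conflicting addresses.

```python
LINE_BYTES = 64   # assumed cache line size (hypothetical)
NUM_SETS = 128    # assumed 8 KiB direct-mapped cache (hypothetical)


class DirectMappedCache:
    """Toy direct-mapped cache that only tracks which tag each set holds."""

    def __init__(self, num_sets=NUM_SETS, line_bytes=LINE_BYTES):
        self.num_sets = num_sets
        self.line_bytes = line_bytes
        self.tags = [None] * num_sets

    def access(self, addr):
        """Access one address; return True on a miss, False on a hit."""
        line = addr // self.line_bytes
        index = line % self.num_sets
        tag = line // self.num_sets
        if self.tags[index] == tag:
            return False
        self.tags[index] = tag  # a miss replaces whatever the set held
        return True

    def touch(self, base, size):
        """Access every line in [base, base + size); return the miss count."""
        return sum(self.access(a) for a in range(base, base + size, self.line_bytes))


def misses_during_send(msg_bytes, call_path_bytes,
                       msg_base=0x10000, call_base=0x80000):
    """Count message-buffer misses seen when the send finally reads the data.

    msg_base and call_base are illustrative addresses chosen so that both
    regions map onto the same cache sets and therefore conflict.
    """
    cache = DirectMappedCache()
    cache.touch(msg_base, msg_bytes)          # 1. prepare the message (loads it)
    cache.touch(call_base, call_path_bytes)   # 2. subroutine working set evicts lines
    return cache.touch(msg_base, msg_bytes)   # 3. the send re-reads the message


if __name__ == "__main__":
    for msg in (1024, 2048, 4096):
        print(msg, misses_during_send(msg, 4096))
```

In this model, a call path that touches nothing leaves the message fully cached, a small call-path footprint evicts only part of the message, and with a fixed 4 KiB footprint the miss count grows with message size until the whole buffer must be refetched.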

Date: July 19, 2009
Authors: Mikyung Kang, Eunhui Park, Minkyoung Cho, Jinwoo Suh, Dong-In Kang, Stephen P Crago
Journal: Proceedings of Workshop on Multi-core Processors for Space—Opportunities and Challenges Held in conjunction with SMC-IT
Pages: 19-23

Information Sciences Institute
