John D. McCalpin
Advanced Systems Division
Silicon Graphics, Inc.
mccalpin@sgi.com
Revised to October 12, 1995
As the ratio of CPU speed to memory speed continues to increase in high-performance computers, the sustained memory bandwidth obtainable by user codes is an obvious candidate for a figure of merit in the performance evaluation of such systems. Sustainable memory bandwidth has a straightforward and intuitive interpretation, and is likely to be well correlated with application performance for vector-style codes with low computational density and limited cache re-use.
Despite this apparent simplicity, the architectural factors that determine sustainable memory bandwidth are many and complex, with a number of interesting subtleties. Vendors very seldom make enough hardware details available to make accurate estimates of sustainable memory bandwidth possible. Therefore, I present the results of a broad survey of memory bandwidth for a large variety of current computers, including uniprocessors, vector processors, shared-memory systems, and distributed-memory systems.
The results are analyzed in terms of the sustainable data transfer rates for uncached unit-stride vector operations for each machine, and for each class. Some trends in the ratio of floating-point performance to memory bandwidth are also presented and discussed.
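As a rough illustration of the two quantities discussed above, the following sketch times a unit-stride vector kernel and converts the result into a transfer rate and a flops-to-bandwidth ratio. It is a minimal sketch under stated assumptions, not the survey's measurement code: the kernel (a[i] = b[i] + s*c[i]), the 8-byte word size, the function names, and the definition of the ratio as peak MFLOPS per sustained megaword per second are all choices made here for illustration. Interpreter overhead dominates in Python, so the rate it reports shows only the bookkeeping and says nothing about the hardware; a real measurement would use a compiled language and arrays much larger than any cache.

```python
import time
from array import array

WORD_BYTES = 8  # assumption: 64-bit floating-point operands


def triad(a, b, c, scalar):
    """Unit-stride kernel a[i] = b[i] + scalar * c[i]; returns bytes moved."""
    n = len(a)
    for i in range(n):
        a[i] = b[i] + scalar * c[i]
    # Two loads and one store of one word each per iteration.
    return 3 * n * WORD_BYTES


def sustained_mb_per_s(n=200_000, trials=3):
    """Best-of-`trials` transfer rate for the kernel, in 10^6 bytes per second."""
    a = array("d", [0.0]) * n
    b = array("d", [1.0]) * n
    c = array("d", [2.0]) * n
    best = float("inf")
    moved = 0
    for _ in range(trials):
        start = time.perf_counter()
        moved = triad(a, b, c, 3.0)
        best = min(best, time.perf_counter() - start)
    return moved / best / 1.0e6


def balance(peak_mflops, bandwidth_mb_s):
    """Hypothetical ratio: peak floating-point operations per sustained memory word."""
    mwords_per_s = bandwidth_mb_s / WORD_BYTES
    return peak_mflops / mwords_per_s
```

Under these assumptions, a machine with a peak of 100 MFLOPS that sustains 800 MB/s would have a ratio of 1.0, meaning one floating-point operation per memory word delivered; larger values indicate a machine whose arithmetic capability outruns its memory system.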