In comp.arch you write:
| ==========================================================
| Memory Transfer Rates in MB/s for Standard Fortran Kernels
| ==========================================================
| John D. McCalpin September 24, 1991
| mccalpin@perelandra.cms.udel.edu DELOCN::MCCALPIN
|
| The following table presents the results of a simple test of the
| memory bandwidth of a computer running some simple vector kernels
| coded in Fortran. In each case, only the memory transfer rate in
| Millions of Bytes per second (MB/s) is shown.
|
| Machine             Copy   Scale     Sum   SAXPY
| ------------------------------------------------
| IBM RS/6000-950    190.5   163.3   184.6   187.5
| IBM RS/6000-530    145.5    88.9   109.1   120.0
| IBM RS/6000-320     58.2    55.7    60.0    60.0
| SGI 4D/240          35.6    22.1    34.9    25.6
| DEC 5000            25.9    25.6    23.1    21.8
| Sun 4/490           25.0    16.7    24.0    19.4
| Sun SS1+            28.6    12.3    25.5    13.3
| SGI 4D/25           23.7    11.6    17.8    13.7
| Sun SS1             24.6     9.3    24.0     9.6
Here are the results for an Indigo and a 4D/35, both running the
released IRIX 4.0. I compiled the test with f77 -O. I'm not quite sure
that I believe the accuracy of the timing, since it runs so fast.
Both machines were multiuser, but idle. The Indigo should be somewhat
slower (slower clock, same memory system, smaller cache), but
Assignment doesn't seem to follow this model.
% uname -a
IRIX oceana 4.0 08281003 IP12
% hinv
1 33 MHZ IP12 Processor
FPU: MIPS R2010A/R3010 VLSI Floating Point Chip Revision: 4.0
CPU: MIPS R2000A/R3000 Processor Chip Revision: 3.0
On-board serial ports: 2
Data cache size: 32 Kbytes
Instruction cache size: 32 Kbytes
Main memory size: 24 Mbytes
Integral Ethernet: ec0, version 0
Disk drive: unit 1 on SCSI controller 0
Integral SCSI controller 0: Version WD33C93A
Iris Audio Processor, rev 2
Graphics board: LG1
% repeat 3 /tmp/X
Timing calibration ; t = 29.00000 clicks
Assignment: Rate = 55.17242 MB/s MFLOPS = 0.0000000E+00
Scaling: Rate = 29.09091 MB/s MFLOPS = 1.818182
Summing: Rate = 66.66664 MB/s MFLOPS = 2.777777
SAXPYing: Rate = 47.05882 MB/s MFLOPS = 3.921569
Timing calibration ; t = 16.00000 clicks
Assignment: Rate = 100.0000 MB/s MFLOPS = 0.0000000E+00
Scaling: Rate = 32.65306 MB/s MFLOPS = 2.040816
Summing: Rate = 74.99998 MB/s MFLOPS = 3.125000
SAXPYing: Rate = 42.85715 MB/s MFLOPS = 3.571429
Timing calibration ; t = 16.00000 clicks
Assignment: Rate = 100.0000 MB/s MFLOPS = 0.0000000E+00
Scaling: Rate = 33.33333 MB/s MFLOPS = 2.083333
Summing: Rate = 57.14286 MB/s MFLOPS = 2.380953
SAXPYing: Rate = 47.05882 MB/s MFLOPS = 3.921569
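As a quick consistency check on the Indigo output above, dividing each reported MB/s figure by the MFLOPS figure on the same line gives the number of bytes moved per floating-point operation. The sketch below does only that arithmetic on the first run; the dictionary and function names are made up for illustration, and nothing here is taken from the benchmark source.

```python
# (MB/s, MFLOPS) pairs copied from the first Indigo run above.
# Assignment is left out because it reports zero MFLOPS.
first_indigo_run = {
    "Scaling": (29.09091, 1.818182),
    "Summing": (66.66664, 2.777777),
    "SAXPYing": (47.05882, 3.921569),
}


def bytes_per_flop(mb_per_s, mflops):
    """Bytes moved per floating-point operation implied by one output line."""
    return mb_per_s / mflops


ratios = {name: bytes_per_flop(*pair) for name, pair in first_indigo_run.items()}

for name, ratio in ratios.items():
    print(f"{name:9s} {ratio:6.2f} bytes per flop")
```

The ratios come out at 16, 24 and 12 bytes per flop, to within rounding. That would fit 8-byte elements with two arrays and one flop per iteration for Scaling, three arrays and one flop for Summing, and three arrays and two flops for SAXPYing, but that reading is an inference from the numbers, not something the output states.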
Here are the results for a 4D/35:
% uname -a
IRIX mears 4.0 08281003 IP12
% hinv
1 36 MHZ IP12 Processor
FPU: MIPS R2010A/R3010 VLSI Floating Point Chip Revision: 4.0
CPU: MIPS R2000A/R3000 Processor Chip Revision: 3.0
On-board serial ports: 4
Data cache size: 64 Kbytes
Instruction cache size: 64 Kbytes
Main memory size: 16 Mbytes
Integral Ethernet: ec0, version 0
Disk drive: unit 2 on SCSI controller 0
Disk drive: unit 1 on SCSI controller 0
Integral SCSI controller 0: Version WD33C93A
Graphics board: GR1.2 Bit-plane, Z-buffer, Turbo options installed
% repeat 3 /tmp/X
Assignment: Rate = 48.48484 MB/s MFLOPS = 0.0000000E+00
Scaling: Rate = 30.76923 MB/s MFLOPS = 1.923077
Summing: Rate = 68.57143 MB/s MFLOPS = 2.857143
SAXPYing: Rate = 47.05882 MB/s MFLOPS = 3.921569
Timing calibration ; t = 21.00000 clicks
Assignment: Rate = 76.19046 MB/s MFLOPS = 0.0000000E+00
Scaling: Rate = 31.37255 MB/s MFLOPS = 1.960784
Summing: Rate = 68.57145 MB/s MFLOPS = 2.857144
SAXPYing: Rate = 46.15383 MB/s MFLOPS = 3.846152
Timing calibration ; t = 16.00000 clicks
Assignment: Rate = 100.0000 MB/s MFLOPS = 0.0000000E+00
Scaling: Rate = 33.33333 MB/s MFLOPS = 2.083333
Summing: Rate = 74.99998 MB/s MFLOPS = 3.125000
SAXPYing: Rate = 46.15385 MB/s MFLOPS = 3.846154
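On the question of timing accuracy: in every run above that prints a calibration line, the Assignment rate multiplied by the calibration click count comes to 1600 (for example 100.0 x 16 and 55.17 x 29). The sketch below takes that observation at face value and assumes the measured time is a whole number of clicks, then shows how far the reported rate moves if the count is off by one click. The constant and the function names are illustrative only; the benchmark source is not shown here, so how it actually uses the calibration is an assumption.

```python
# Observed in the output above: Assignment rate * calibration clicks = 1600.
# Treating 1600 as a fixed property of the benchmark is an assumption.
ASSIGNMENT_CONSTANT = 1600.0


def rate(clicks):
    """Assignment rate in MB/s for a given whole number of timer clicks."""
    return ASSIGNMENT_CONSTANT / clicks


def one_click_band(clicks):
    """Rates reported if the count were one click longer or one click shorter."""
    return rate(clicks + 1), rate(clicks - 1)


for clicks in (16, 21, 29):
    low, high = one_click_band(clicks)
    print(f"{clicks:2d} clicks: {rate(clicks):6.2f} MB/s "
          f"(one click either way: {low:6.2f} to {high:6.2f} MB/s)")
```

Under these assumptions, a single click at 16 clicks moves the Assignment figure by roughly 6 to 7 percent, which is not much smaller than the 33 MHz versus 36 MHz clock difference between the two machines. That may be why both can report exactly 100.0 MB/s.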