John,
We make a machine with multiple Intel i860 nodes, each with up to 32 Mbytes
of memory. These talk to each other via a reconfigurable network of transputers.
Two transputers (8 links) share memory with each i860 and support a
through-routing message-passing system that does not impact the i860 memory
bandwidth. I think we probably have one of the fastest i860 memory systems there is.
I suspect your test program is also a measure of how smart the
compiler is at optimising the vector loops. The RS6000 has some smart stuff
in there and might well just whack in a single instruction for the copy. If the
SPARC compiles this as a loop, it is not really a fair test of the memory system.
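To make the concern concrete, here is a rough sketch of the four loops the result labels suggest, with the memory traffic and flops each would count per iteration. The loop bodies, the names (`a`, `b`, `c`, `q`, `rates`), the 8-byte word and MB = 10^6 bytes are all assumptions, not taken from the test program itself. The word and flop counts were chosen to agree with the MB/s-to-MFLOPS ratios in the results below.

```python
# Sketch only: loop shapes and names are assumed, not copied from the test program.
WORD = 8  # bytes per element, assuming 64-bit floating point


def assignment(a, b):
    # Pure copy: the loop a compiler could replace with a block move.
    for i in range(len(a)):
        a[i] = b[i]


def scaling(a, b, q):
    for i in range(len(a)):
        a[i] = q * b[i]


def summing(a, b, c):
    for i in range(len(a)):
        a[i] = b[i] + c[i]


def saxpying(a, b, c, q):
    for i in range(len(a)):
        a[i] = b[i] + q * c[i]


# (words moved per iteration, flops per iteration) for each kernel
KERNELS = {
    "Assignment": (2, 0),
    "Scaling": (2, 1),
    "Summing": (3, 1),
    "SAXPYing": (3, 2),
}


def rates(name, n, seconds):
    """Return (MB/s, MFLOPS) for n iterations taking the given time."""
    words, flops = KERNELS[name]
    mb_per_s = words * WORD * n / seconds / 1e6
    mflops = flops * n / seconds / 1e6
    return mb_per_s, mflops
```

Only the Assignment loop has no arithmetic at all, so it is the one most exposed to a compiler swapping the whole loop for a copy primitive.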
Anyway, the results I get do correspond to the real-world times on other
benchmarks. If the road is downhill and the wind is behind you, each i860
runs at about the speed of the RS6000/530. But look at the super times we get
with the VAST vectoriser!
Meiko MK096 dual i860 board.
One i860 node (16 Mbytes of memory)
Portland Group Compiler
pgf77 (no optimisation)
Timing calibration ; t = 38.28125 clicks
Assignment: Rate = 41.79592 MB/s MFLOPS = 0.0000000E+00
Scaling: Rate = 29.68116 MB/s MFLOPS = 1.855072
Summing: Rate = 47.62790 MB/s MFLOPS = 1.984496
SAXPYing: Rate = 28.71028 MB/s MFLOPS = 2.392524
pgf77 -O4
Timing calibration ; t = 11.32813 clicks
Assignment: Rate = 141.2414 MB/s MFLOPS = 0.0000000E+00
Scaling: Rate = 51.84810 MB/s MFLOPS = 3.240506
Summing: Rate = 120.4706 MB/s MFLOPS = 5.019608
SAXPYing: Rate = 39.13375 MB/s MFLOPS = 3.261146
pgf77 -O4 -Mvect
Timing calibration ; t = 11.32813 clicks
Assignment: Rate = 141.2414 MB/s MFLOPS = 0.0000000E+00
Scaling: Rate = 48.18823 MB/s MFLOPS = 3.011765
Summing: Rate = 149.8537 MB/s MFLOPS = 6.243903
SAXPYing: Rate = 37.69325 MB/s MFLOPS = 3.141104
Greenhills compiler plus Pacific Sierra VAST vectoriser
f77apx -vast -OLMA (vectorise, optimise, unroll loops, no arithmetic checks)
Timing calibration ; t = 5.024004 clicks
Assignment: Rate = 318.4711 MB/s MFLOPS = 0.0000000E+00
Scaling: Rate = 99.52228 MB/s MFLOPS = 6.220142
Summing: Rate = 238.8533 MB/s MFLOPS = 9.952222
SAXPYing: Rate = 103.8493 MB/s MFLOPS = 8.654112
Wow, this one really makes it scream!
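For anyone wanting to compare the four builds, the figures above can be tabulated and checked with a few lines of Python. This is only a convenience sketch: the dictionary layout and the helper names `speedup` and `bytes_per_flop` are made up for illustration, and the numbers are copied from the output above.

```python
# (MB/s, MFLOPS) per kernel, copied from the results listed above.
RESULTS = {
    "pgf77": {
        "Assignment": (41.79592, 0.0),
        "Scaling": (29.68116, 1.855072),
        "Summing": (47.62790, 1.984496),
        "SAXPYing": (28.71028, 2.392524),
    },
    "pgf77 -O4": {
        "Assignment": (141.2414, 0.0),
        "Scaling": (51.84810, 3.240506),
        "Summing": (120.4706, 5.019608),
        "SAXPYing": (39.13375, 3.261146),
    },
    "pgf77 -O4 -Mvect": {
        "Assignment": (141.2414, 0.0),
        "Scaling": (48.18823, 3.011765),
        "Summing": (149.8537, 6.243903),
        "SAXPYing": (37.69325, 3.141104),
    },
    "f77apx -vast -OLMA": {
        "Assignment": (318.4711, 0.0),
        "Scaling": (99.52228, 6.220142),
        "Summing": (238.8533, 9.952222),
        "SAXPYing": (103.8493, 8.654112),
    },
}


def speedup(kernel, base, other):
    """Ratio of the MB/s rate of `other` to that of `base` for one kernel."""
    return RESULTS[other][kernel][0] / RESULTS[base][kernel][0]


def bytes_per_flop(build, kernel):
    """MB/s divided by MFLOPS, i.e. bytes moved per floating-point operation."""
    mb_per_s, mflops = RESULTS[build][kernel]
    return mb_per_s / mflops


if __name__ == "__main__":
    for kernel in RESULTS["pgf77"]:
        ratio = speedup(kernel, "pgf77", "f77apx -vast -OLMA")
        print(f"{kernel:<11} VAST vs unoptimised pgf77: {ratio:.2f}x")
```

In every build the MB/s to MFLOPS ratio comes out at about 16 for Scaling, 24 for Summing and 12 for SAXPYing, which is what you would expect if the program counts 8-byte words.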
Cheers Boris