From: Jonas August (jonas@cs.cmu.edu)
Date: Sun Jan 16 2005 - 14:20:58 UTC
Hi,
I just ran STREAM on a dual 2.5GHz Power Mac G5 (PPC 970FX) with 512MB RAM
and 512kB L2 cache. Processor speed was set to "Automatic"; switching it
to "Highest" gave no improvement. The OS is Mac OS X 10.3.7
with gcc 3.3 (Apple build 1640). For a single-processor test I compiled
with:
    gcc -fast -o stream_d second_wall.c stream_d.c
which is supposed to optimize for the G5. The result is:
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 0
Total memory required = 45.8 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 12201 microseconds.
(= 12201 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function      Rate (MB/s)   RMS time     Min time     Max time
Copy:         1978.5036       0.0169       0.0162       0.0186
Scale:        1972.8907       0.0168       0.0162       0.0192
Add:          2208.1095       0.0222       0.0217       0.0229
Triad:        2215.4477       0.0222       0.0217       0.0229
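As a rough cross-check of the figures above, here is a minimal sketch that recomputes the memory footprint and the rates from the printed minimum times. It assumes the usual STREAM accounting: Copy and Scale touch two arrays per iteration, Add and Triad touch three, rates are in units of 1e6 bytes per second, and the "Total memory" line uses 2**20 bytes per MB. The function and variable names are made up for this illustration and are not taken from stream_d.c.

```python
# Cross-check of the STREAM output above (illustrative names, not from stream_d.c).
N = 2_000_000          # array size reported by the run
BYTES_PER_WORD = 8     # bytes per double precision word, as reported

# Arrays touched per element by each kernel (assumed STREAM convention).
ARRAYS_TOUCHED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}

# Minimum times in seconds, as printed (already rounded to four decimals).
MIN_TIME = {"Copy": 0.0162, "Scale": 0.0162, "Add": 0.0217, "Triad": 0.0217}


def total_memory_mb(n=N, word=BYTES_PER_WORD):
    """Footprint of the three arrays, in units of 2**20 bytes."""
    return 3 * word * n / 2**20


def rate_mb_per_s(kernel):
    """Bandwidth in 1e6 bytes per second, from the printed minimum time."""
    bytes_moved = ARRAYS_TOUCHED[kernel] * BYTES_PER_WORD * N
    return 1e-6 * bytes_moved / MIN_TIME[kernel]


if __name__ == "__main__":
    print(f"Total memory: {total_memory_mb():.1f} MB")
    for kernel in ARRAYS_TOUCHED:
        print(f"{kernel:6s} {rate_mb_per_s(kernel):8.1f} MB/s")
```

This gives 45.8 MB for the three arrays, about 1975 MB/s for Copy and Scale, and about 2212 MB/s for Add and Triad. The small differences from the reported rates are consistent with the minimum times being printed to only four decimal places.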
I didn't run a dual-processor test because I don't have an OpenMP compiler.
Thanks.
--
Jonas August
jonas*cs.cmu.edu (*=@)    www.cs.cmu.edu/~jonas
tel(412)268-1314    fax(412)268-6436
Robotics Institute, Carnegie Mellon University
A409 Newell-Simon Hall, 5000 Forbes Avenue, Pittsburgh, PA 15213