G4 Stream results (partially experimental)

From: Craig Armour (Craig.Armour@ausbit.com.au)
Date: Tue Sep 17 2002 - 22:50:04 CDT

Next message: Wei Lin and Norbert Juffa: "STREAM results for Pentium4 @ 2.8 GHz with PC1066 RDRAM"

Previous message: Mikhail Kuzminsky: "new STREAM results"

Hi,

I have some STREAM results for a dual-processor 450 MHz G4 system
(standard Apple hardware). The uname output for the box is as follows:

Linux formaldehyde 2.4.19 #11 SMP Sun Sep 8 09:14:19 EST 2002 ppc unknown

stats:
2 x G4 @ 450 MHz
128 MB RAM

The results I have are better than the ones currently listed, but not
necessarily mind-blowing. I've also attached the source of the pthread
code used to obtain the MP (multi-threaded) results.

My code isn't the best, but I'm not sure what explains the crappy
performance compared to the other results on your page. Perhaps the
unified L2 cache doesn't go well with the multi-threading and you get a
bit more cache thrashing than you'd like *shrug*. Also, gcc probably
isn't the best compiler, but I'd be interested in comparing the code the
other guys got from their compilers to the code gcc produces.

Cheers
Craig

text/x-csrc attachment: stream.c
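
The attached stream.c is not reproduced in this archive, so the following
is only a sketch of how a pthread version of one STREAM kernel (Triad)
might split the arrays between the two CPUs. The names triad_slice,
triad_worker, triad_threads and MAX_THREADS are hypothetical and are not
taken from the attachment; the real code may be organised quite
differently.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 16   /* hypothetical upper bound for this sketch */

/* One contiguous slice of the Triad kernel, a[i] = b[i] + q * c[i]. */
struct triad_slice {
    double *a;
    const double *b;
    const double *c;
    double q;
    size_t begin, end;   /* half-open index range [begin, end) */
};

/* Thread body: run Triad over this thread's slice only. */
static void *triad_worker(void *arg)
{
    const struct triad_slice *s = arg;
    for (size_t i = s->begin; i < s->end; i++)
        s->a[i] = s->b[i] + s->q * s->c[i];
    return NULL;
}

/* Split n elements into nthreads near-equal slices and run them in
 * parallel. Returns 0 on success, or the pthread_create error code. */
int triad_threads(double *a, const double *b, const double *c,
                  size_t n, double q, int nthreads)
{
    pthread_t tid[MAX_THREADS];
    struct triad_slice slice[MAX_THREADS];
    int started = 0, rc = 0;

    assert(nthreads >= 1 && nthreads <= MAX_THREADS);

    for (int t = 0; t < nthreads; t++) {
        size_t begin = n * (size_t)t / (size_t)nthreads;
        size_t end = n * (size_t)(t + 1) / (size_t)nthreads;
        slice[t] = (struct triad_slice){ a, b, c, q, begin, end };
        rc = pthread_create(&tid[t], NULL, triad_worker, &slice[t]);
        if (rc != 0)
            break;
        started++;
    }
    /* Wait for every thread that was actually started. */
    for (int t = 0; t < started; t++)
        pthread_join(tid[t], NULL);
    return rc;
}
```

On a two-CPU box this would be called with nthreads = 2 inside the timed
region. Creating and joining threads on every repetition adds overhead to
each timing, so a benchmark version would more likely keep the threads
alive and synchronise them between kernels.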

craig@formaldehyde:~/src$ ./stream
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 400000, Offset = 0
Total memory required = 9.2 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 11773 microseconds.
   (= 11773 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function      Rate (MB/s)   RMS time     Min time     Max time
Copy:         298.4938       0.0217       0.0214       0.0226
Scale:        297.9653       0.0218       0.0215       0.0219
Add:          321.1885       0.0299       0.0299       0.0301
Triad:        327.1542       0.0297       0.0293       0.0298
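
For readers checking the table: STREAM derives each rate from the best
(minimum) time and the number of bytes the kernel moves, counting 2
arrays for Copy and Scale and 3 for Add and Triad, with 1 MB = 10^6
bytes. The helper below is a sketch of that arithmetic; stream_rate_mbs
and the WORDS_* constants are names made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Arrays touched per loop iteration, counted the way STREAM does. */
enum { WORDS_COPY = 2, WORDS_SCALE = 2, WORDS_ADD = 3, WORDS_TRIAD = 3 };

/* Rate in MB/s (1 MB = 10^6 bytes) for arrays of n doubles,
 * given the best time in seconds for one pass of the kernel. */
double stream_rate_mbs(int words_per_iter, size_t n, double min_seconds)
{
    assert(min_seconds > 0.0);
    double bytes = (double)words_per_iter * (double)sizeof(double) * (double)n;
    return 1.0e-6 * bytes / min_seconds;
}
```

With n = 400000 and the printed Copy minimum of 0.0214 s this gives about
299.1 MB/s, against 298.4938 in the table. The small gap is what you
would expect from the Min time column being rounded to four decimals.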

craig@formaldehyde:~/src$ /export/local/bin/gcc -O3 -funroll-loops -fprefetch-loop-arrays -mcpu=604 -lm -o stream second_wall.c stream_d.c

craig@formaldehyde:~/src$ /export/local/bin/gcc -O3 -funroll-loops -fprefetch-loop-arrays -mcpu=604 -pthread -lm -o stream_mp second_wall.c stream.c

craig@formaldehyde:~/src$ /usr/bin/time ./stream_mp
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 400000, Offset = 0
Total memory required = 9.2 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 2 microseconds.
Each test below will take on the order of 11355 microseconds.
   (= 5677 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function      Rate (MB/s)   RMS time     Min time     Max time
Copy:         338.8759       0.0194       0.0189       0.0212
Scale:        325.3191       0.0201       0.0197       0.0212
Add:          341.7224       0.0284       0.0281       0.0286
Triad:        340.4137       0.0283       0.0282       0.0285

1.89user 0.09system 0:01.05elapsed 188%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (179major+2438minor)pagefaults 0swaps


This archive was generated by hypermail 2.1.4 : Fri Nov 08 2002 - 13:37:15 CST