After reading Paul Hsieh's note (Dec 17, 1999), I decided to compile
stream_d.c with Intel's C/C++ 4.5. I use Intel's compiler for my codes, and
find v4.5 offers real improvements for memory-intensive apps, with
consistent performance for runs that last days. I used Paul Hsieh's
??-p55clock.c, since the clock() function in the win32*.c timer has very
poor resolution.
In short, relative to the Lahey F90 results, there was a significant
improvement in everything but triad.
In case it wasn't obvious: I didn't tweak the source code in any way; it
was the default stream_d.c.
(rates in MB/s, as reported by STREAM)
            copy     scale    add      triad
Avg_1st_7:  533.08   502.9    641.2    582.0
Avg_all10:  531.2    501.4    636.5    577.9
System: Athlon 800 in Asus K7V motherboard, 768 MB PC133 SDRAM with 4:3
RAM:FSB and CAS..RAS 3-2-2.
WDC 7200 RPM 20 GB EIDE drive.
OS: win98; I did 10 runs, since I've noticed significant performance
variations. I killed all but the basic background processes; however, I
think the very act of writing the redirected results to file, and
concatenating the files, may be responsible for the little "dip" shown in
the pdf attached.
Attached:
(1) stream.exe, win32 executable
(2) Athlon800_stream_c.txt, raw batch file output of ten runs
(3) Athlon800_stream_c.xls, Excel binary with results summary and plot
(4) plot_athlon800_stream_c.pdf, plot extracted from Excel file
Compilation switches were:
/G6 /ML /W2 /GX /O2 /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /FA /Fa"Release/"
/Fp".\Release\stream.pch" /YX /Fo".\Release/" /Fd".\Release/" /FD -Qrestrict
/c
...most of those are just directory junk; NOTE there were no restrict
keywords in the source, so the -Qrestrict switch should have no effect.