Attached are STREAM results for an rp8400 with:
1 cell (in a single partition)
4 cpus (fully populated with 4 cpus/cell)
8GB of memory (fully populated with 512MB DIMMs)
750MHz PA-8700 cpus
running HP-UX 11i, September patch bundle
compiled with Fortran 90 in 64-bit mode as follows:
f90 -o stream_d.mp +extend_source +autodbl4 +DA2.0W +noppu +DS2.0 +O3 -Wl,+pd,L -Wl,-aarchive +Oparallel stream_d.f second_wall.o
and source modified to adjust array size and put arrays in COMMON:
for 16 and 32 cpu runs:
63c63
< PARAMETER (n=2000000,offset=0,ndim=n+offset,ntimes=10)
---
> PARAMETER (n=53477800,offset=0,ndim=n+offset,ntimes=10)
88c88
< * COMMON a,b,c
---
> COMMON a,b,c

----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 53477800
Offset = 0
The total memory requirement is 1224 MB
You are running each test 10 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity/precision appears to be 2 microseconds
----------------------------------------------------
Function     Rate (MB/s)   Avg time   Min time   Max time
Copy:         1556.2300     0.5500     0.5498     0.5502
Scale:        1541.2218     0.5555     0.5552     0.5559
Add:          1676.7441     0.7658     0.7655     0.7663
Triad:        1649.2237     0.7784     0.7782     0.7787
----------------------------------------------------
Solution Validates!
----------------------------------------------------
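For anyone who wants to cross-check the figures, here is a rough Python sketch (not part of the original run) that recomputes the memory requirement and the rates from the array size and the minimum times reported above. It assumes the usual STREAM accounting: Copy and Scale touch two arrays per iteration, Add and Triad touch three, the memory line is reported in units of 2^20 bytes, and the rates in units of 10^6 bytes/s. The names memory_mb and rate_mb_s are made up for this illustration.

```python
# Sanity check of the reported STREAM figures (illustrative sketch only).
N = 53477800          # array size from the modified PARAMETER line
BYTES_PER_WORD = 8    # "Assuming 8 bytes per DOUBLE PRECISION word"

# Arrays read or written per iteration (assumed standard STREAM counting).
ARRAYS_TOUCHED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}

# (min time in seconds, reported rate in MB/s), copied from the output above.
REPORTED = {
    "Copy": (0.5498, 1556.2300),
    "Scale": (0.5552, 1541.2218),
    "Add": (0.7655, 1676.7441),
    "Triad": (0.7782, 1649.2237),
}


def memory_mb(n=N):
    """Total size of the three arrays, in units of 2**20 bytes (assumed)."""
    return 3 * BYTES_PER_WORD * n // (1024 * 1024)


def rate_mb_s(kernel, min_time, n=N):
    """Bandwidth in units of 10**6 bytes/s (assumed), from the best time."""
    return ARRAYS_TOUCHED[kernel] * BYTES_PER_WORD * n / min_time / 1.0e6


if __name__ == "__main__":
    print("memory requirement (MB):", memory_mb())
    for kernel, (min_time, reported) in REPORTED.items():
        computed = rate_mb_s(kernel, min_time)
        print(f"{kernel:6s} computed {computed:10.4f}  reported {reported:10.4f}")
```

The recomputed rates should differ from the reported ones only slightly, because the minimum times in the output are rounded to four decimal places.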
This archive was generated by hypermail 2b29 : Wed Oct 31 2001 - 11:26:47 CST