Dear Dr. McCalpin,
We have measured the STREAM benchmark on the "Fujitsu SPARC M10-4S".
Please update the STREAM Web site.
System Name: Fujitsu SPARC M10-4S
CPU Name: SPARC64 X
CPU MHz: 3000
CPU(s) enabled: 256 cores, 16 chips, 16 cores/chip, 2 threads/core
Primary Cache: 64 KB I + 64 KB D on chip per core
Secondary Cache: 24 MB I+D on chip per chip
L3 Cache: None
Other Cache: None
Memory: 2 TB (128 x 16 GB 2Rx4 PC3L-12800R-CL11, ECC,
        running at 1600 MHz)
Operating System: Oracle Solaris 11.1
Compiler: C/C++: Version 12.3 of Oracle Solaris Studio,
          1/13 Platform Specific Enhancement
Compilation Flags: -fast -m64 -xopenmp -xtarget=sparc64x
                   -fma=fused -xipo=2 -xpagesize=4M -xlinkopt
                   -xvector -xprefetch_level=3 -xprefetch=latx:8.0
                   -Qoption cg -Qlp-dl=1,-Qms_pipe-prefdl=1
                   -xtypemap=integer:64
STREAM Source Code: Fortran version (v5.6) with format changes
                    for large arrays.
OS Settings: (/etc/system parameters) lpg_alloc_prefer=1
Shell Environment: OMP_NUM_THREADS=512
                   SUNW_MP_PROCBIND="0-511"
                   SUNW_MP_THR_IDLE=spin
                   LD_PRELOAD=madv.so.1
                   LD_PRELOAD_64=madv.so.1
                   MADV=access_lwp
Run: ppgsz -o heap=256M,stack=256M,anon=256M <stream>
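
For reference, the thread settings above can be cross-checked against the CPU topology with a few lines of Python. This is only a sketch: the variable names are illustrative, and it assumes the SUNW_MP_PROCBIND value is an inclusive range of virtual processor IDs, which is how the "0-511" setting reads.

```python
# Topology taken from the "CPU(s) enabled" line of this report.
chips = 16
cores_per_chip = 16
threads_per_core = 2

cores = chips * cores_per_chip
hw_threads = cores * threads_per_core

# Assumption: SUNW_MP_PROCBIND="0-511" denotes an inclusive ID range.
procbind = "0-511"
first, last = (int(x) for x in procbind.split("-"))
bound_cpus = last - first + 1

omp_num_threads = 512  # value of OMP_NUM_THREADS in this report

print("cores:", cores)
print("hardware threads:", hw_threads)
print("processors in bind range:", bound_cpus)
print("one OpenMP thread per hardware thread:",
      omp_num_threads == hw_threads == bound_cpus)
```

With these settings the run places one OpenMP thread on each hardware thread of the system.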
Outputs:
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
----------------------------------------------
STREAM Version $Revision: 5.6 $
----------------------------------------------
Array size = 2560000000
Offset = 1024
The total memory requirement is 58593 MB
You are running each test 10 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------
Number of Threads = 512
----------------------------------------------
Printing one line per active thread....
Printing one line per active thread....
(snip)
Printing one line per active thread....
Printing one line per active thread....
----------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function      Rate (MB/s)   Avg time   Min time   Max time
Copy:        890337.8015     0.2480     0.0460     1.8610
Scale:       904156.0541     0.4489     0.0453     1.8919
Add:        1020663.0905     0.2840     0.0602     2.0707
Triad:      1023643.0281     0.2146     0.0600     1.4465
----------------------------------------------------
Solution Validates!
----------------------------------------------------
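
The reported figures can be reproduced approximately from the array size and the minimum times. The sketch below assumes the usual STREAM accounting: Copy and Scale move two arrays per iteration, Add and Triad move three, the rate is bytes moved divided by the best time with 1 MB = 10^6 bytes, and the total memory figure counts three arrays in units of 2^20 bytes. Because the printed minimum times are rounded to four decimals, the recomputed rates agree with the reported ones only to within about 0.1%.

```python
# Values copied from the output above.
N = 2_560_000_000          # array size (elements)
BYTES_PER_WORD = 8         # DOUBLE PRECISION

# Assumed STREAM accounting: number of arrays read or written per kernel.
arrays_moved = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}

min_time = {"Copy": 0.0460, "Scale": 0.0453, "Add": 0.0602, "Triad": 0.0600}
reported = {"Copy": 890337.8015, "Scale": 904156.0541,
            "Add": 1020663.0905, "Triad": 1023643.0281}

# Three arrays in total, expressed in units of 2**20 bytes.
total_mb = 3 * N * BYTES_PER_WORD / 2**20
print(f"total memory: {int(total_mb)} MB")

rel_err = {}
for kernel, count in arrays_moved.items():
    bytes_moved = count * BYTES_PER_WORD * N
    rate = 1.0e-6 * bytes_moved / min_time[kernel]   # MB/s, 1 MB = 10**6 bytes
    rel_err[kernel] = abs(rate - reported[kernel]) / reported[kernel]
    print(f"{kernel:6s} recomputed {rate:14.1f} MB/s, "
          f"reported {reported[kernel]:14.1f} MB/s, "
          f"relative difference {rel_err[kernel]:.1e}")
```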
--
Akihiro SENOO
PA PROJECT
NEXT GENERATION TECHNICAL COMPUTING UNIT
FUJITSU Limited
senoo.akihiro@jp.fujitsu.com
Received on Tue Mar 26 19:59:08 2013