STREAM results (Fujitsu SPARC M12-2S, 96 cores)
Dear Dr. McCalpin,
We have measured the STREAM benchmark on the Fujitsu SPARC M12-2S.
Please publish this score on the STREAM Web site on April 4, 2017 or later.
System Name: Fujitsu SPARC M12-2S
CPU Name: SPARC64 XII
CPU Characteristics: High Speed Mode enabled, up to 4.35 GHz
CPU MHz: 4250
CPU(s) enabled: 96 cores, 8 chips, 12 cores/chip, 8 threads/core
Primary Cache: 64 KB I + 64 KB D on chip per core
Secondary Cache: 512 KB I+D on chip per core
L3 Cache: 32 MB I+D on chip per chip
Other Cache: None
Memory: 4 TB (128 x 32 GB 2Rx4 PC4-2400T-R, ECC)
Operating System: Oracle Solaris 11.3 (an upcoming SRU)
Compiler: Version 12.6 of Oracle Developer Studio
Compilation Flags: -fast -m64 -xopenmp -xtarget=sparc64xplus -xipo=2
                   -xpagesize=4M -xlinkopt -xvector -xprefetch_level=3
                   -xprefetch=latx:8.0
STREAM Source Code: Fortran version (v5.6) with format changes
for large arrays.
OS Settings: (/etc/system parameters)
autoup=86400 doiflush=0 dopageflush=0
zfs:zfs_arc_max=1073741824
(change processor status) psradm -i 1-767
Shell Environment: OMP_NUM_THREADS=192
SUNW_MP_PROCBIND="1 5 9 13 17 21 25 29 33 37 41 45 49
53 57 61 65 69 73 77 81 85 89 93 97 101 105 109 113
117 121 125 129 133 137 141 145 149 153 157 161 165
169 173 177 181 185 189 193 197 201 205 209 213 217
221 225 229 233 237 241 245 249 253 257 261 265 269
273 277 281 285 289 293 297 301 305 309 313 317 321
325 329 333 337 341 345 349 353 357 361 365 369 373
377 381 385 389 393 397 401 405 409 413 417 421 425
429 433 437 441 445 449 453 457 461 465 469 473 477
481 485 489 493 497 501 505 509 513 517 521 525 529
533 537 541 545 549 553 557 561 565 569 573 577 581
585 589 593 597 601 605 609 613 617 621 625 629 633
637 641 645 649 653 657 661 665 669 673 677 681 685
689 693 697 701 705 709 713 717 721 725 729 733 737
741 745 749 753 757 761 765"
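The processor IDs in SUNW_MP_PROCBIND follow a regular pattern: every fourth ID from 1 through 765, which gives 192 entries, one per OpenMP thread. If hardware threads are numbered consecutively within each core (an assumption, not stated in this message), that corresponds to two bound threads on each of the 96 cores. A minimal sketch for regenerating and checking the list is shown here; Python is used only for illustration and is not part of the submission, and the variable names are hypothetical.

```python
# Rebuild the processor list used for SUNW_MP_PROCBIND: 1, 5, 9, ..., 765.
# FIRST_ID, LAST_ID and STRIDE are read off the list in the message;
# the names themselves are illustrative.
FIRST_ID = 1
LAST_ID = 765
STRIDE = 4

cpu_ids = list(range(FIRST_ID, LAST_ID + 1, STRIDE))
procbind = " ".join(str(cpu_id) for cpu_id in cpu_ids)

# The entry count should equal OMP_NUM_THREADS (192 in this run).
print(len(cpu_ids))
print(f'SUNW_MP_PROCBIND="{procbind}"')
```

The printed string can be compared with the value above to confirm that no ID was dropped or duplicated when the list was typed.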
Run: <stream>
Outputs:
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
----------------------------------------------
STREAM Version $Revision: 5.6 $
----------------------------------------------
Array size = 16000000000
Offset = 48
The total memory requirement is 366210 MB
You are running each test 10 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------
Number of Threads = 192
----------------------------------------------
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
(snip)
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
----------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function     Rate (MB/s)   Avg time   Min time   Max time
Copy:        675848.7271     0.3841     0.3788     0.3921
Scale:       673287.3440     0.3857     0.3802     0.3957
Add:         774468.5782     0.5052     0.4958     0.5152
Triad:       777986.2151     0.5050     0.4936     0.5135
----------------------------------------------------
Solution Validates!
----------------------------------------------------
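The reported figures can be cross-checked with simple arithmetic. The sketch below is not part of the submission. It assumes STREAM's usual counting convention (Copy and Scale move two arrays per iteration, Add and Triad move three), that rates use 1 MB = 10^6 bytes, and that the memory requirement is printed in units of 2^20 bytes. The dictionary and variable names are illustrative only.

```python
# Cross-check the STREAM output using the values printed above.
N = 16_000_000_000        # "Array size"
BYTES_PER_WORD = 8        # "8 bytes per DOUBLE PRECISION word"

# Arrays read or written per iteration (assumed STREAM counting convention).
arrays_moved = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}
# "Min time" column, in seconds.
min_time = {"Copy": 0.3788, "Scale": 0.3802, "Add": 0.4958, "Triad": 0.4936}
# "Rate (MB/s)" column.
reported = {"Copy": 675848.7271, "Scale": 673287.3440,
            "Add": 774468.5782, "Triad": 777986.2151}

# Three arrays of N words each, expressed in units of 2**20 bytes.
total_mb = 3 * N * BYTES_PER_WORD / 2**20
print(f"memory requirement: {int(total_mb)} MB")

# Best rate = bytes moved per iteration / minimum time, in 10**6 bytes/s.
rates = {}
for kernel, count in arrays_moved.items():
    rates[kernel] = count * N * BYTES_PER_WORD / min_time[kernel] / 1e6
    rel_diff = abs(rates[kernel] - reported[kernel]) / reported[kernel]
    print(f"{kernel:6s} recomputed {rates[kernel]:12.1f} MB/s, "
          f"relative difference {rel_diff:.1e}")
```

The recomputed rates agree with the reported ones to within the rounding of the minimum times, which are printed to only four decimal places.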
Received on Sat Apr 01 2017 - 16:15:34 CDT