Tuned STREAM on IBM System p5 550Q (8 CPUs, 1.65 GHz)

From: Ly Vu <lyvu@us.ibm.com>
Date: Sun Jul 23 2006 - 22:54:08 CST

These are tuned STREAM results on an IBM System p5 550Q
with eight 1.65 GHz CPUs. This is a POWER5+ SMP machine.
Large pages were used in all cases.

Function    Rate (MB/s)   RMS time   Min time   Max time
Copy:         16459.65       .07        .07        .08
Scale:        15891.33       .07        .07        .07
Add:          18843.11       .09        .09        .09
Triad:        19509.41       .08        .08        .08
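
As a rough cross-check of the table, the reported rates can be turned back into per-test times. The sketch below assumes the usual STREAM accounting (Copy and Scale move 2 words per iteration, Add and Triad move 3) and that MB/s means 10**6 bytes per second; the array length and word size are taken from the run output. The names used here are illustrative only and are not part of the benchmark.

```python
# Array length reported in the run output ("Array size = 66976768").
N = 66_976_768
# Word size reported in the run output ("8 bytes per DOUBLE PRECISION word").
BYTES_PER_WORD = 8

# Assumption: standard STREAM counting of words moved per loop iteration.
WORDS_PER_ITER = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}

# Rates from the table, in MB/s (assumed to mean 10**6 bytes per second).
RATE_MB_S = {
    "Copy": 16459.65,
    "Scale": 15891.33,
    "Add": 18843.11,
    "Triad": 19509.41,
}


def implied_time(kernel):
    """Seconds needed to move the kernel's bytes at its reported rate."""
    bytes_moved = WORDS_PER_ITER[kernel] * BYTES_PER_WORD * N
    return bytes_moved / (RATE_MB_S[kernel] * 1e6)


implied = {kernel: implied_time(kernel) for kernel in RATE_MB_S}

for kernel, seconds in implied.items():
    print(f"{kernel:6s} {seconds:.4f} s")
```

Under those assumptions the implied times round to the same two-digit values as the Min time column.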

Here is the full output file:
--------------------------------------------------
 Requesting Large Pages
 Setting up for 8 CPUs per module
 Number of segments per array = 2
 CPU binding list : 0 8
 Shared Segment Pointer = 504403158265495552
 Shared Segment Pointer = 504403158802366464
 Shared Segment Pointer = 504403159339237376
 Segment Size (B) = 268435456 (MB = 256 )
 Array Size (B) = 536870912 (MB = 512 )
 Array Size (DW) = 67108864
 Num_threads = 16
 Num_threads = 16
 Num_threads = 16
 Num_threads = 16
 Num_threads = 16
 Num_threads = 16
 Num_threads = 16
 Num_threads = 16
 Num_threads = 16
 Num_threads = 16
 Num_threads = 16
 Num_threads = 16
 Num_threads = 16
 Num_threads = 16
 Num_threads = 16
 Num_threads = 16
 rebind: num_parthds is 16
 Starting Initialization
 Done With Initialization
 a(1) 1.00000000000000000
 b(M) 1.00000000000000000
 c(M) 1.00000000000000000
 Incremental Offset = 512
 Number of Threads = 16
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size = 66976768
 Offset = 0
 The total memory requirement is 1532 MB
 You are running each test 5 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity appears to be less than one microsecond
 Your clock granularity/precision appears to be 1 microseconds
 The tests below will each take a time on the order
 of 67861 microseconds
    (= 67861 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function    Rate (MB/s)   RMS time   Min time   Max time
Copy:         16459.65       .07        .07        .08
Scale:        15891.33       .07        .07        .07
Add:          18843.11       .09        .09        .09
Triad:        19509.41       .08        .08        .08
 Sum of a is = 101720966400000.000
 Sum of b is = 20344193280000.0000
 Sum of c is = 27125591040000.0000
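
The memory requirement and the three sums in the output can be sanity-checked with a short calculation. This is only a sketch under stated assumptions, none of which are confirmed by the tuned source: the kernels run in the usual STREAM order (Copy, Scale, Add, Triad) with a scalar of 3, every element of a holds 2.0 when the five timed passes begin (for example, if a calibration pass doubles the initial 1.0), and the memory figure is reported in units of 2**20 bytes and truncated.

```python
# Values taken from the run output.
N = 66_976_768          # "Array size"
BYTES_PER_WORD = 8      # "8 bytes per DOUBLE PRECISION word"
NTIMES = 5              # "You are running each test 5 times"

# Assumptions (not stated in the output): scalar and starting value of a.
SCALAR = 3.0
A_START = 2.0

# Three arrays of N words; assumed to be reported in truncated MiB.
total_mb = (3 * BYTES_PER_WORD * N) // 2**20

# Every element holds the same value, so one element per array is enough.
a, b, c = A_START, 1.0, 1.0
for _ in range(NTIMES):
    c = a                # Copy
    b = SCALAR * c       # Scale
    c = a + b            # Add
    a = b + SCALAR * c   # Triad

sum_a, sum_b, sum_c = a * N, b * N, c * N

print("total MB :", total_mb)
print("sum of a :", sum_a)
print("sum of b :", sum_b)
print("sum of c :", sum_c)
```

With these assumptions the calculation gives 1532 MB and reproduces the three sums printed in the output, which suggests the timed kernels produced the expected values.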
______________________________________________
Ly Vu
IBM Corp. - Austin, Texas.
AIX/pSeries Performance
Phone : (512) 838-8228
Email : lyvu@us.ibm.com
Received on Mon Jul 24 18:32:59 2006
