From: Frank Johnston (fjohn@us.ibm.com)
Date: Tue Nov 02 2004 - 09:58:07 CST
These are standard STREAM results on an IBM eServer p5 595
with sixty-four 1.9 GHz CPUs (36 MB L3 cache). This is a POWER5 SMP machine.
Large pages were used in all cases.
Function    Rate (MB/s)   Avg time   Min time   Max time
Copy:        157559.59       .03        .03        .03
Scale:       152770.74       .03        .03        .03
Add:         168973.99       .04        .04        .04
Triad:       173564.19       .04        .04        .04
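
As a rough cross-check, the rates above can be converted back into per-iteration times. This is only a sketch: it assumes the usual STREAM byte counting (Copy and Scale touch two 8-byte words per element, Add and Triad touch three) and MB = 10^6 bytes, and it uses the array size of 267912192 reported in the full output for the run whose rates match this summary. The constant and function names are illustrative and do not come from the benchmark source.

```python
# Assumption: standard STREAM accounting, which this IBM-modified run may or may not follow exactly.
WORD_BYTES = 8             # "Assuming 8 bytes per DOUBLE PRECISION word" in the log
ARRAY_SIZE = 267_912_192   # "Array size" printed for the run matching the summary rates

# Words read plus written per array element for each kernel (assumed, standard STREAM).
WORDS_PER_ELEMENT = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}

# Rates copied from the summary table, in MB/s with MB = 10**6 bytes (assumed).
REPORTED_MB_PER_S = {
    "Copy": 157559.59,
    "Scale": 152770.74,
    "Add": 168973.99,
    "Triad": 173564.19,
}


def implied_seconds(kernel: str) -> float:
    """Time for one pass of the kernel implied by its reported rate."""
    bytes_moved = WORDS_PER_ELEMENT[kernel] * WORD_BYTES * ARRAY_SIZE
    return bytes_moved / (REPORTED_MB_PER_S[kernel] * 1.0e6)


for kernel in REPORTED_MB_PER_S:
    print(f"{kernel:6s} {implied_seconds(kernel):.4f} s")
```

Under these assumptions the implied times round to .03 s for Copy and Scale and .04 s for Add and Triad, which agrees with the two-decimal times in the table.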
Here is the full output file:
--------------------------------------------------
Requesting Large Pages
Setting up for 8 CPUs per module
Number of segments per array = 8
CPU binding list : 0 8 16 24 32 40 48 56
Shared Segment Pointer = 504403158265495552
Shared Segment Pointer = 504403160412979200
Shared Segment Pointer = 504403162560462848
Segment Size (B) = 268435456 (MB = 256 )
Array Size (B) = 2147483648 (MB = 2048 )
Array Size (DW) = 268435456
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
Num_threads = 64
rebind: num_parthds is 64
GETSHRSEG: requesting large pages
GETSHRSEG ENTRY: shmgetflag -2147481216
bindprocessor successful: thread_self() 2437303 cpu_id 0
bindprocessor successful: thread_self() 2437303 cpu_id 8
bindprocessor successful: thread_self() 2437303 cpu_id 16
bindprocessor successful: thread_self() 2437303 cpu_id 24
bindprocessor successful: thread_self() 2437303 cpu_id 32
bindprocessor successful: thread_self() 2437303 cpu_id 40
bindprocessor successful: thread_self() 2437303 cpu_id 48
bindprocessor successful: thread_self() 2437303 cpu_id 56
GETSHRSEG: requesting large pages
GETSHRSEG ENTRY: shmgetflag -2147481216
bindprocessor successful: thread_self() 2437303 cpu_id 0
bindprocessor successful: thread_self() 2437303 cpu_id 8
bindprocessor successful: thread_self() 2437303 cpu_id 16
bindprocessor successful: thread_self() 2437303 cpu_id 24
bindprocessor successful: thread_self() 2437303 cpu_id 32
bindprocessor successful: thread_self() 2437303 cpu_id 40
bindprocessor successful: thread_self() 2437303 cpu_id 48
bindprocessor successful: thread_self() 2437303 cpu_id 56
GETSHRSEG: requesting large pages
GETSHRSEG ENTRY: shmgetflag -2147481216
bindprocessor successful: thread_self() 2437303 cpu_id 0
bindprocessor successful: thread_self() 2437303 cpu_id 8
bindprocessor successful: thread_self() 2437303 cpu_id 16
bindprocessor successful: thread_self() 2437303 cpu_id 24
bindprocessor successful: thread_self() 2437303 cpu_id 32
bindprocessor successful: thread_self() 2437303 cpu_id 40
bindprocessor successful: thread_self() 2437303 cpu_id 48
bindprocessor successful: thread_self() 2437303 cpu_id 56
bindprocessor successful: thread_self() 2453683 cpu_id 1
bindprocessor successful: thread_self() 2552031 cpu_id 25
bindprocessor successful: thread_self() 2621441 cpu_id 42
bindprocessor successful: thread_self() 2625539 cpu_id 43
bindprocessor successful: thread_self() 2588913 cpu_id 34
bindprocessor successful: thread_self() 2474169 cpu_id 6
bindprocessor successful: thread_self() 2478267 cpu_id 7
bindprocessor successful: thread_self() 2543835 cpu_id 23
bindprocessor successful: thread_self() 2539737 cpu_id 22
bindprocessor successful: thread_self() 2494659 cpu_id 11
bindprocessor successful: thread_self() 2490561 cpu_id 10
bindprocessor successful: thread_self() 2617599 cpu_id 41
bindprocessor successful: thread_self() 2666519 cpu_id 53
bindprocessor successful: thread_self() 2605305 cpu_id 38
bindprocessor successful: thread_self() 2646029 cpu_id 48
bindprocessor successful: thread_self() 2650127 cpu_id 49
bindprocessor successful: thread_self() 2531541 cpu_id 20
bindprocessor successful: thread_self() 2564325 cpu_id 28
bindprocessor successful: thread_self() 2506953 cpu_id 14
bindprocessor successful: thread_self() 2629637 cpu_id 44
bindprocessor successful: thread_self() 2633735 cpu_id 45
bindprocessor successful: thread_self() 2482365 cpu_id 8
bindprocessor successful: thread_self() 2691107 cpu_id 59
bindprocessor successful: thread_self() 2597109 cpu_id 36
bindprocessor successful: thread_self() 2593011 cpu_id 35
bindprocessor successful: thread_self() 2511051 cpu_id 15
bindprocessor successful: thread_self() 2470071 cpu_id 5
bindprocessor successful: thread_self() 2461875 cpu_id 3
bindprocessor successful: thread_self() 2355365 cpu_id 2
bindprocessor successful: thread_self() 2601207 cpu_id 37
bindprocessor successful: thread_self() 2674715 cpu_id 55
bindprocessor successful: thread_self() 2670617 cpu_id 54
bindprocessor successful: thread_self() 2658323 cpu_id 51
bindprocessor successful: thread_self() 2654225 cpu_id 50
bindprocessor successful: thread_self() 2707499 cpu_id 63
bindprocessor successful: thread_self() 2703401 cpu_id 62
bindprocessor successful: thread_self() 2547933 cpu_id 24
bindprocessor successful: thread_self() 2560227 cpu_id 27
bindprocessor successful: thread_self() 2556129 cpu_id 26
bindprocessor successful: thread_self() 2486463 cpu_id 9
bindprocessor successful: thread_self() 2662421 cpu_id 52
bindprocessor successful: thread_self() 2613501 cpu_id 40
bindprocessor successful: thread_self() 2568423 cpu_id 29
bi Starting Initialization
Done With Initialization
a(1) 1.00000000000000000
b(M) 1.00000000000000000
c(M) 1.00000000000000000
Incremental Offset = 512
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 267914240
The total memory requirement is 6132 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function Rate (MB/s) Avg time Min time Max time
Copy: 157643.69 .03 .03 .03
Scale: 152756.33 .03 .03 .03
Add: 169093.94 .04 .04 .04
Triad: 173423.77 .04 .04 .04
----------------------------------------------------
Solution Validates!
----------------------------------------------------
Incremental Offset = 1536
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 267914240
The total memory requirement is 6132 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function Rate (MB/s) Avg time Min time Max time
Copy: 157490.41 .03 .03 .03
Scale: 152957.76 .03 .03 .03
Add: 168821.90 .04 .04 .04
Triad: 173614.68 .04 .04 .04
----------------------------------------------------
Solution Validates!
----------------------------------------------------
Incremental Offset = 2560
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 267914240
The total memory requirement is 6132 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function Rate (MB/s) Avg time Min time Max time
Copy: 157279.62 .03 .03 .03
Scale: 152946.05 .03 .03 .03
Add: 169304.12 .04 .04 .04
Triad: 173239.96 .04 .04 .04
----------------------------------------------------
Solution Validates!
----------------------------------------------------
Incremental Offset = 512
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 267912192
The total memory requirement is 6132 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function Rate (MB/s) Avg time Min time Max time
Copy: 157863.95 .03 .03 .03
Scale: 153422.56 .03 .03 .03
Add: 169881.92 .04 .04 .04
Triad: 173065.21 .04 .04 .04
----------------------------------------------------
Solution Validates!
----------------------------------------------------
Incremental Offset = 1536
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 267912192
The total memory requirement is 6132 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function Rate (MB/s) Avg time Min time Max time
Copy: 157475.41 .03 .03 .03
Scale: 152927.97 .03 .03 .03
Add: 168642.21 .04 .04 .04
Triad: 173036.34 .04 .04 .04
----------------------------------------------------
Solution Validates!
----------------------------------------------------
Incremental Offset = 2560
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 267912192
The total memory requirement is 6132 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function Rate (MB/s) Avg time Min time Max time
Copy: 157559.59 .03 .03 .03
Scale: 152770.74 .03 .03 .03
Add: 168973.99 .04 .04 .04
Triad: 173564.19 .04 .04 .04
----------------------------------------------------
Solution Validates!
----------------------------------------------------
Incremental Offset = 512
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 267910144
The total memory requirement is 6131 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function Rate (MB/s) Avg time Min time Max time
Copy: 158240.66 .03 .03 .03
Scale: 153067.41 .03 .03 .03
Add: 168911.31 .04 .04 .04
Triad: 173457.93 .04 .04 .04
----------------------------------------------------
Solution Validates!
----------------------------------------------------
Incremental Offset = 1536
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 267910144
The total memory requirement is 6131 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function Rate (MB/s) Avg time Min time Max time
Copy: 157879.38 .03 .03 .03
Scale: 152660.61 .03 .03 .03
Add: 168630.37 .04 .04 .04
Triad: 172790.00 .04 .04 .04
----------------------------------------------------
Solution Validates!
----------------------------------------------------
Incremental Offset = 2560
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 267910144
The total memory requirement is 6131 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function Rate (MB/s) Avg time Min time Max time
Copy: 157017.62 .03 .03 .03
Scale: 152586.76 .03 .03 .03
Add: 168704.21 .04 .04 .04
Triad: 173050.56 .04 .04 .04
----------------------------------------------------
Solution Validates!
----------------------------------------------------
Incremental Offset = 512
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 267908096
The total memory requirement is 6131 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function Rate (MB/s) Avg time Min time Max time
Copy: 157561.33 .03 .03 .03
Scale: 153213.64 .03 .03 .03
Add: 168684.98 .04 .04 .04
Triad: 173024.81 .04 .04 .04
----------------------------------------------------
Solution Validates!
----------------------------------------------------
Incremental Offset = 1536
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 267908096
The total memory requirement is 6131 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function Rate (MB/s) Avg time Min time Max time
Copy: 157237.50 .03 .03 .03
Scale: 152722.99 .03 .03 .03
Add: 168850.80 .04 .04 .04
Triad: 173624.11 .04 .04 .04
----------------------------------------------------
Solution Validates!
----------------------------------------------------
Incremental Offset = 2560
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 267908096
The total memory requirement is 6131 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function Rate (MB/s) Avg time Min time Max time
Copy: 156589.74 .03 .03 .03
Scale: 152212.26 .03 .03 .03
Add: 168943.88 .04 .04 .04
Triad: 173346.22 .04 .04 .04
----------------------------------------------------
Solution Validates!
----------------------------------------------------
ndprocessor successful: thread_self() 2687009 cpu_id 58
bindprocessor successful: thread_self() 2527443 cpu_id 19
bindprocessor successful: thread_self() 2523345 cpu_id 18
bindprocessor successful: thread_self() 2535639 cpu_id 21
bindprocessor successful: thread_self() 2437303 cpu_id 0
bindprocessor successful: thread_self() 2609403 cpu_id 39
bindprocessor successful: thread_self() 2465973 cpu_id 4
bindprocessor successful: thread_self() 2498757 cpu_id 12
bindprocessor successful: thread_self() 2502855 cpu_id 13
bindprocessor successful: thread_self() 2584815 cpu_id 33
bindprocessor successful: thread_self() 2580717 cpu_id 32
bindprocessor successful: thread_self() 2695205 cpu_id 60
bindprocessor successful: thread_self() 2699303 cpu_id 61
bindprocessor successful: thread_self() 2515149 cpu_id 16
bindprocessor successful: thread_self() 2519247 cpu_id 17
bindprocessor successful: thread_self() 2682911 cpu_id 57
bindprocessor successful: thread_self() 2678813 cpu_id 56
bindprocessor successful: thread_self() 2572521 cpu_id 30
bindprocessor successful: thread_self() 2576619 cpu_id 31
bindprocessor successful: thread_self() 2641931 cpu_id 47
bindprocessor successful: thread_self() 2637833 cpu_id 46
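--------------------------------------------------

The setup figures at the top of the log are consistent with one another, as the sketch below checks. Two points are inferences rather than facts from the benchmark source: that the binding list is the first CPU of each 8-CPU module, and that the "total memory requirement" is three arrays of 8-byte words truncated to whole MB of 2^20 bytes. The names used are illustrative.

```python
# Values copied from the log; the relationships between them are inferred, not documented.
SEGMENT_BYTES = 268_435_456   # "Segment Size (B)"
SEGMENTS_PER_ARRAY = 8        # "Number of segments per array"
WORD_BYTES = 8                # bytes per DOUBLE PRECISION word
NUM_CPUS = 64
CPUS_PER_MODULE = 8           # "Setting up for 8 CPUs per module"

array_bytes = SEGMENT_BYTES * SEGMENTS_PER_ARRAY   # expected "Array Size (B)"
array_words = array_bytes // WORD_BYTES            # expected "Array Size (DW)"

# Assumed reading of the "CPU binding list": first CPU of each module.
binding_list = list(range(0, NUM_CPUS, CPUS_PER_MODULE))


def total_mb(array_size: int, arrays: int = 3) -> int:
    """Memory for the three STREAM arrays, truncated to whole MB (2**20 bytes)."""
    return (arrays * WORD_BYTES * array_size) // 2**20


print(array_bytes, array_words, binding_list)
for n in (267_914_240, 267_912_192, 267_910_144, 267_908_096):
    print(n, total_mb(n), "MB")
```

With these assumptions the four array sizes used in the log give 6132, 6132, 6131 and 6131 MB, matching the values printed for each group of runs.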