From: Frank Johnston (fjohn@us.ibm.com)
Date: Mon Oct 04 2004 - 16:14:55 CDT
These are standard STREAM results on an IBM eServer p5 550 Express
with four 1500 MHz CPUs (36 MB L3 cache). This is a POWER5 SMP machine.
Large pages were used in all cases.
Function     Rate (MB/s)   Avg time   Min time   Max time
Copy:          6279.8475      .1709      .1709      .1710
Scale:         6187.7380      .1735      .1734      .1735
Add:           8225.0360      .1958      .1957      .1959
Triad:         8414.6062      .1915      .1913      .1916
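
As a sanity check on the figures above, the rates can be approximately reproduced from the minimum times. This is a minimal sketch in Python, assuming the usual STREAM accounting (8-byte words, two arrays touched per element for Copy and Scale, three for Add and Triad, and 1 MB = 10^6 bytes) and assuming the reported figures come from the run with Array size = 67075072, whose output lines match them. The names used here are illustrative and not part of the benchmark.

```python
# Recompute STREAM rates from array length and minimum time.
# Assumption: standard STREAM byte counting. The times are the rounded
# "Min time" values from the table, so the match is only approximate.

BYTES_PER_WORD = 8          # "Assuming 8 bytes per DOUBLE PRECISION word"
ARRAY_SIZE = 67075072       # elements; run whose output matches the summary

# Number of arrays read or written per element for each kernel.
ARRAYS_TOUCHED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}

# kernel -> (reported rate in MB/s, reported min time in seconds)
REPORTED = {
    "Copy": (6279.8475, 0.1709),
    "Scale": (6187.7380, 0.1734),
    "Add": (8225.0360, 0.1957),
    "Triad": (8414.6062, 0.1913),
}


def stream_rate_mbs(kernel, n, min_time):
    """Return the rate in MB/s (1 MB = 1e6 bytes) for one kernel."""
    bytes_moved = ARRAYS_TOUCHED[kernel] * BYTES_PER_WORD * n
    return 1.0e-6 * bytes_moved / min_time


if __name__ == "__main__":
    for kernel, (rate, tmin) in REPORTED.items():
        est = stream_rate_mbs(kernel, ARRAY_SIZE, tmin)
        print(f"{kernel:6s} reported {rate:10.4f}  recomputed {est:10.4f}")
```

The recomputed values differ slightly from the reported ones because the times are printed to only four decimal places.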
Here is the full output file:
---------------------------------------------------
Requesting Large Pages
Setting up for 2 CPUs per module
Number of segments per array = 2
CPU binding list : 0 2
Shared Segment Pointer = 504403158265495552
Shared Segment Pointer = 504403158802366464
Shared Segment Pointer = 504403159339237376
Segment Size (B) = 268435456 (MB = 256 )
Array Size (B) = 536870912 (MB = 512 )
Array Size (DW) = 67108864
Num_threads = 4
Num_threads = 4
Num_threads = 4
Num_threads = 4
rebind: num_parthds is 4
Starting Initialization
Done With Initialization
a(1) 1.00000000000000000
b(M) 1.00000000000000000
c(M) 1.00000000000000000
Incremental Offset = 512
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 67079168
The total memory requirement is 1535 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function Rate (MB/s) Avg time Min time Max time
Copy: 6280.5902 .1709 .1709 .1710
Scale: 6188.5582 .1735 .1734 .1735
Add: 8231.0229 .1956 .1956 .1957
Triad: 8388.4727 .1920 .1919 .1921
----------------------------------------------------
Solution Validates!
----------------------------------------------------
Incremental Offset = 1536
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 67079168
The total memory requirement is 1535 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function Rate (MB/s) Avg time Min time Max time
Copy: 6279.5652 .1710 .1709 .1711
Scale: 6182.1416 .1737 .1736 .1737
Add: 8242.3662 .1954 .1953 .1956
Triad: 8356.1780 .1928 .1927 .1930
----------------------------------------------------
Solution Validates!
----------------------------------------------------
Incremental Offset = 2560
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 67079168
The total memory requirement is 1535 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function Rate (MB/s) Avg time Min time Max time
Copy: 6276.6233 .1710 .1710 .1711
Scale: 6184.8766 .1736 .1735 .1737
Add: 8249.3344 .1953 .1952 .1954
Triad: 8371.8011 .1923 .1923 .1924
----------------------------------------------------
Solution Validates!
----------------------------------------------------
Incremental Offset = 512
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 67077120
The total memory requirement is 1535 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function Rate (MB/s) Avg time Min time Max time
Copy: 6278.9180 .1709 .1709 .1709
Scale: 6184.4583 .1736 .1735 .1736
Add: 8230.9923 .1957 .1956 .1958
Triad: 8409.5182 .1918 .1914 .1923
----------------------------------------------------
Solution Validates!
----------------------------------------------------
Incremental Offset = 1536
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 67077120
The total memory requirement is 1535 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function Rate (MB/s) Avg time Min time Max time
Copy: 6277.8759 .1710 .1710 .1711
Scale: 6183.7957 .1736 .1736 .1737
Add: 8253.0551 .1952 .1951 .1954
Triad: 8369.7399 .1924 .1923 .1924
----------------------------------------------------
Solution Validates!
----------------------------------------------------
Incremental Offset = 2560
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 67077120
The total memory requirement is 1535 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function Rate (MB/s) Avg time Min time Max time
Copy: 6276.0991 .1710 .1710 .1711
Scale: 6183.1756 .1736 .1736 .1736
Add: 8251.3506 .1952 .1951 .1954
Triad: 8364.2967 .1925 .1925 .1926
----------------------------------------------------
Solution Validates!
----------------------------------------------------
Incremental Offset = 512
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 67075072
The total memory requirement is 1535 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function Rate (MB/s) Avg time Min time Max time
Copy: 6279.8475 .1709 .1709 .1710
Scale: 6187.7380 .1735 .1734 .1735
Add: 8225.0360 .1958 .1957 .1959
Triad: 8414.6062 .1915 .1913 .1916
----------------------------------------------------
Solution Validates!
----------------------------------------------------
Incremental Offset = 1536
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 67075072
The total memory requirement is 1535 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function Rate (MB/s) Avg time Min time Max time
Copy: 6283.0469 .1711 .1708 .1712
Scale: 6183.4285 .1736 .1736 .1736
Add: 8242.7483 .1955 .1953 .1957
Triad: 8367.6589 .1927 .1924 .1928
----------------------------------------------------
Solution Validates!
----------------------------------------------------
Incremental Offset = 2560
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 67075072
The total memory requirement is 1535 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function Rate (MB/s) Avg time Min time Max time
Copy: 6277.7105 .1711 .1710 .1712
Scale: 6183.2501 .1736 .1736 .1737
Add: 8249.2841 .1953 .1951 .1955
Triad: 8375.0384 .1925 .1922 .1926
----------------------------------------------------
Solution Validates!
----------------------------------------------------
Incremental Offset = 512
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 67073024
The total memory requirement is 1535 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function Rate (MB/s) Avg time Min time Max time
Copy: 6277.5626 .1710 .1710 .1710
Scale: 6185.5424 .1735 .1735 .1735
Add: 8225.8370 .1957 .1957 .1958
Triad: 8407.9575 .1920 .1915 .1923
----------------------------------------------------
Solution Validates!
----------------------------------------------------
Incremental Offset = 1536
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 67073024
The total memory requirement is 1535 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function Rate (MB/s) Avg time Min time Max time
Copy: 6275.2784 .1711 .1710 .1711
Scale: 6184.2676 .1736 .1735 .1736
Add: 8245.6675 .1953 .1952 .1953
Triad: 8374.3152 .1923 .1922 .1924
----------------------------------------------------
Solution Validates!
----------------------------------------------------
Incremental Offset = 2560
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 67073024
The total memory requirement is 1535 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function Rate (MB/s) Avg time Min time Max time
Copy: 6276.7135 .1710 .1710 .1711
Scale: 6179.9713 .1737 .1737 .1737
Add: 8246.3926 .1953 .1952 .1954
Triad: 8367.2271 .1925 .1924 .1926
----------------------------------------------------
Solution Validates!
----------------------------------------------------
GETSHRSEG: requesting large pages
GETSHRSEG ENTRY: shmgetflag -2147481216
bindprocessor successful: thread_self() 794813 cpu_id 0
bindprocessor successful: thread_self() 794813 cpu_id 2
GETSHRSEG: requesting large pages
GETSHRSEG ENTRY: shmgetflag -2147481216
bindprocessor successful: thread_self() 794813 cpu_id 0
bindprocessor successful: thread_self() 794813 cpu_id 2
GETSHRSEG: requesting large pages
GETSHRSEG ENTRY: shmgetflag -2147481216
bindprocessor successful: thread_self() 794813 cpu_id 0
bindprocessor successful: thread_self() 794813 cpu_id 2
bindprocessor successful: thread_self() 786593 cpu_id 2
bindprocessor successful: thread_self() 802987 cpu_id 3
bindprocessor successful: thread_self() 794813 cpu_id 0
bindprocessor successful: thread_self() 798889 cpu_id 1
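
The size figures printed in the output above are mutually consistent, as the short Python check below illustrates. It assumes that the "total memory requirement" covers three arrays (a, b and c) at 8 bytes per element, with 1 MB = 2^20 bytes and the result truncated to an integer. That reading is inferred from the numbers, not stated in the output.

```python
# Cross-check the sizes reported in the STREAM output header.
# Assumption: three arrays of 8-byte words, MB = 2**20 bytes, truncated.

SEGMENT_BYTES = 268435456    # "Segment Size (B)"
SEGMENTS_PER_ARRAY = 2       # "Number of segments per array"
BYTES_PER_WORD = 8
MIB = 2**20

# Allocated size of one array, in bytes and in 8-byte words (DW).
array_bytes = SEGMENT_BYTES * SEGMENTS_PER_ARRAY
array_words = array_bytes // BYTES_PER_WORD


def total_memory_mb(n, num_arrays=3):
    """Memory used by num_arrays arrays of n words, in whole MB."""
    return num_arrays * n * BYTES_PER_WORD // MIB


# The four "Array size" values used across the runs.
ARRAY_SIZES = [67079168, 67077120, 67075072, 67073024]

if __name__ == "__main__":
    print("Array Size (B)  =", array_bytes)
    print("Array Size (DW) =", array_words)
    for n in ARRAY_SIZES:
        print(n, "elements ->", total_memory_mb(n), "MB")
```

Each tested array size is a little smaller than the 67108864 words allocated per array, so the working arrays fit within the shared segments.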