From: Frank Johnston (fjohn@us.ibm.com)
Date: Mon Oct 04 2004 - 16:11:08 CDT
These are standard STREAM results on an IBM eServer p5 520 Express
with two 1500 MHz CPUs (36 MB L3). This is a POWER5 SMP machine.
Large pages were used in all cases.
Function    Rate (MB/s)   Avg time   Min time   Max time
Copy:         3608.3480      .1489      .1487      .1491
Scale:        3617.6983      .1484      .1483      .1485
Add:          4807.8167      .1675      .1674      .1676
Triad:        4864.0613      .1657      .1655      .1659
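For readers who want to check the figures: the rates above can be reproduced from the array size and the minimum time, since STREAM counts the bytes each kernel reads and writes and divides by the best time. The sketch below is only an illustration, not part of the benchmark. It assumes 8-byte words (as the output file states), 1 MB = 10^6 bytes, two arrays touched by Copy and Scale and three by Add and Triad, and the array size 33534976 taken from the result block in the output file whose numbers match this summary. The function and variable names are made up for this example.

```python
# Sketch: recompute STREAM rates from array size and minimum time.
# Assumptions (not stated in the summary itself): 8-byte words,
# 1 MB = 10**6 bytes, and the usual STREAM byte counts per kernel.
WORD_BYTES = 8
ARRAYS_TOUCHED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}


def stream_rate_mb_s(kernel, n_elements, min_time_s):
    """Return the rate in MB/s for one kernel, based on its best time."""
    bytes_moved = ARRAYS_TOUCHED[kernel] * WORD_BYTES * n_elements
    return bytes_moved / min_time_s / 1.0e6


# "Array size" of the result block that matches the summary table.
N = 33534976

# Minimum times copied from the summary table (rounded to 4 digits).
MIN_TIMES = {"Copy": 0.1487, "Scale": 0.1483, "Add": 0.1674, "Triad": 0.1655}

for kernel, min_time in MIN_TIMES.items():
    rate = stream_rate_mb_s(kernel, N, min_time)
    print(f"{kernel:6s} {rate:10.1f} MB/s")
```

Because the printed times are rounded to four digits, the recomputed rates agree with the table only to within about 1 MB/s.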
Here is the full output file:
--------------------------------------------------
Requesting Large Pages
Setting up for 2 CPUs per module
Number of segments per array = 1
CPU binding list : 0
Shared Segment Pointer = 504403158265495552
Shared Segment Pointer = 504403158533931008
Shared Segment Pointer = 504403158802366464
Segment Size (B) = 268435456 (MB = 256 )
Array Size (B) = 268435456 (MB = 256 )
Array Size (DW) = 33554432
Num_threads = 2
Num_threads = 2
rebind: num_parthds is 2
Starting Initialization
Done With Initialization
a(1) 1.00000000000000000
b(M) 1.00000000000000000
c(M) 1.00000000000000000
Incremental Offset = 512
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 33541120
The total memory requirement is 767 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function    Rate (MB/s)   Avg time   Min time   Max time
Copy:         3609.0958      .1488      .1487      .1490
Scale:        3625.9852      .1483      .1480      .1485
Add:          4813.5582      .1674      .1672      .1677
Triad:        4863.5088      .1659      .1655      .1661
----------------------------------------------------
Solution Validates!
----------------------------------------------------
Incremental Offset = 1536
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 33541120
The total memory requirement is 767 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function    Rate (MB/s)   Avg time   Min time   Max time
Copy:         3600.9318      .1491      .1490      .1492
Scale:        3615.4784      .1485      .1484      .1486
Add:          4798.5692      .1678      .1678      .1679
Triad:        4856.5761      .1659      .1658      .1661
----------------------------------------------------
Solution Validates!
----------------------------------------------------
Incremental Offset = 2560
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 33541120
The total memory requirement is 767 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function    Rate (MB/s)   Avg time   Min time   Max time
Copy:         3606.4994      .1489      .1488      .1490
Scale:        3612.8786      .1486      .1485      .1486
Add:          4801.1963      .1677      .1677      .1678
Triad:        4857.7709      .1658      .1657      .1659
----------------------------------------------------
Solution Validates!
----------------------------------------------------
Incremental Offset = 512
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 33539072
The total memory requirement is 767 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function    Rate (MB/s)   Avg time   Min time   Max time
Copy:         3598.3978      .1493      .1491      .1496
Scale:        3625.0689      .1481      .1480      .1482
Add:          4804.7635      .1682      .1675      .1689
Triad:        4861.1812      .1662      .1656      .1669
----------------------------------------------------
Solution Validates!
----------------------------------------------------
Incremental Offset = 1536
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 33539072
The total memory requirement is 767 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function    Rate (MB/s)   Avg time   Min time   Max time
Copy:         3598.9789      .1492      .1491      .1494
Scale:        3611.8348      .1487      .1486      .1488
Add:          4805.5910      .1678      .1675      .1679
Triad:        4862.7636      .1658      .1655      .1660
----------------------------------------------------
Solution Validates!
----------------------------------------------------
Incremental Offset = 2560
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 33539072
The total memory requirement is 767 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function    Rate (MB/s)   Avg time   Min time   Max time
Copy:         3606.8398      .1490      .1488      .1493
Scale:        3614.2359      .1485      .1485      .1486
Add:          4811.2271      .1675      .1673      .1677
Triad:        4851.8200      .1661      .1659      .1662
----------------------------------------------------
Solution Validates!
----------------------------------------------------
Incremental Offset = 512
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 33537024
The total memory requirement is 767 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function    Rate (MB/s)   Avg time   Min time   Max time
Copy:         3602.3131      .1490      .1490      .1491
Scale:        3615.3156      .1484      .1484      .1485
Add:          4809.9531      .1674      .1673      .1675
Triad:        4861.0803      .1658      .1656      .1660
----------------------------------------------------
Solution Validates!
----------------------------------------------------
Incremental Offset = 1536
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 33537024
The total memory requirement is 767 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function    Rate (MB/s)   Avg time   Min time   Max time
Copy:         3612.0721      .1489      .1486      .1491
Scale:        3615.6292      .1484      .1484      .1485
Add:          4804.3402      .1677      .1675      .1679
Triad:        4854.2584      .1660      .1658      .1661
----------------------------------------------------
Solution Validates!
----------------------------------------------------
Incremental Offset = 2560
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 33537024
The total memory requirement is 767 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function    Rate (MB/s)   Avg time   Min time   Max time
Copy:         3603.4147      .1490      .1489      .1490
Scale:        3615.3969      .1485      .1484      .1486
Add:          4812.2020      .1675      .1673      .1676
Triad:        4856.5209      .1659      .1657      .1660
----------------------------------------------------
Solution Validates!
----------------------------------------------------
Incremental Offset = 512
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 33534976
The total memory requirement is 767 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function    Rate (MB/s)   Avg time   Min time   Max time
Copy:         3608.3480      .1489      .1487      .1491
Scale:        3617.6983      .1484      .1483      .1485
Add:          4807.8167      .1675      .1674      .1676
Triad:        4864.0613      .1657      .1655      .1659
----------------------------------------------------
Solution Validates!
----------------------------------------------------
Incremental Offset = 1536
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 33534976
The total memory requirement is 767 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function    Rate (MB/s)   Avg time   Min time   Max time
Copy:         3606.0757      .1489      .1488      .1490
Scale:        3615.0658      .1484      .1484      .1485
Add:          4807.9058      .1676      .1674      .1677
Triad:        4854.1155      .1659      .1658      .1660
----------------------------------------------------
Solution Validates!
----------------------------------------------------
Incremental Offset = 2560
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 33534976
The total memory requirement is 767 MB
You are running each test 5 times
--
The *best* time for each test is used
*EXCLUDING* the first and last iterations
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
----------------------------------------------------
Function    Rate (MB/s)   Avg time   Min time   Max time
Copy:         3563.6884      .1511      .1506      .1521
Scale:        3624.6145      .1496      .1480      .1508
Add:          4792.8063      .1698      .1679      .1711
Triad:        4844.0017      .1681      .1662      .1693
----------------------------------------------------
Solution Validates!
----------------------------------------------------
GETSHRSEG: requesting large pages
GETSHRSEG ENTRY: shmgetflag -2147481216
bindprocessor successful: thread_self() 450673 cpu_id 0
GETSHRSEG: requesting large pages
GETSHRSEG ENTRY: shmgetflag -2147481216
bindprocessor successful: thread_self() 450673 cpu_id 0
GETSHRSEG: requesting large pages
GETSHRSEG ENTRY: shmgetflag -2147481216
bindprocessor successful: thread_self() 450673 cpu_id 0
bindprocessor successful: thread_self() 450673 cpu_id 0
bindprocessor successful: thread_self() 508075 cpu_id 1
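
The output file contains twelve result blocks (four array sizes, three incremental offsets each). The summary at the top of this message is identical to the block with array size 33534976 and incremental offset 512, which also has the highest Triad rate of the twelve. Whether "highest Triad" was the actual selection rule is an assumption; the sketch below only shows that the numbers are consistent with it. The Triad values are copied from the blocks above, and the variable names are made up for this example.

```python
# Sketch: find the result block with the highest Triad rate and the
# run-to-run spread. Selecting by Triad is an assumption, not something
# the original message states.
# Each entry: (array size, incremental offset, Triad rate in MB/s).
TRIAD_RUNS = [
    (33541120, 512, 4863.5088),
    (33541120, 1536, 4856.5761),
    (33541120, 2560, 4857.7709),
    (33539072, 512, 4861.1812),
    (33539072, 1536, 4862.7636),
    (33539072, 2560, 4851.8200),
    (33537024, 512, 4861.0803),
    (33537024, 1536, 4854.2584),
    (33537024, 2560, 4856.5209),
    (33534976, 512, 4864.0613),
    (33534976, 1536, 4854.1155),
    (33534976, 2560, 4844.0017),
]

best = max(TRIAD_RUNS, key=lambda run: run[2])
worst = min(TRIAD_RUNS, key=lambda run: run[2])
spread_pct = (best[2] - worst[2]) / best[2] * 100.0

print(f"best Triad:  size={best[0]} offset={best[1]} rate={best[2]:.4f} MB/s")
print(f"worst Triad: size={worst[0]} offset={worst[1]} rate={worst[2]:.4f} MB/s")
print(f"spread: {spread_pct:.2f}% of the best rate")
```

The spread between the best and worst Triad rate is under half a percent, so the choice of array size and offset has little effect on this machine.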