standard STREAM on IBM eServer p5 520 Express (1500 MHz, 2 cpu)

From: Frank Johnston (fjohn@us.ibm.com)
Date: Mon Oct 04 2004 - 16:11:08 CDT


    These are standard STREAM results on an IBM eServer p5 520 Express
    with two 1500 MHz CPUs (36 MB L3). This is a POWER5 SMP machine.
    Large pages were used in all cases.

    Function    Rate (MB/s)   Avg time   Min time   Max time
    Copy:         3608.3480      .1489      .1487      .1491
    Scale:        3617.6983      .1484      .1483      .1485
    Add:          4807.8167      .1675      .1674      .1676
    Triad:        4864.0613      .1657      .1655      .1659
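
As a rough cross-check, the rates in this table can be recomputed from the array size and the minimum times. The sketch below assumes the usual STREAM byte counting for 8-byte words (16 bytes per element for Copy and Scale, 24 for Add and Triad) and 1 MB = 10^6 bytes. The array size 33534976 comes from the run in the output file whose numbers match this table; the function and variable names are illustrative only and are not part of the benchmark.

```python
# Bytes moved per loop iteration, as STREAM counts them for 8-byte words:
# Copy and Scale touch two arrays, Add and Triad touch three.
BYTES_PER_ELEMENT = {"Copy": 16, "Scale": 16, "Add": 24, "Triad": 24}


def stream_rate_mb_s(kernel: str, array_size: int, min_time_s: float) -> float:
    """Return the bandwidth in MB/s (1 MB = 10**6 bytes) for one kernel."""
    return BYTES_PER_ELEMENT[kernel] * array_size / min_time_s / 1.0e6


# Array size and minimum times copied from the reported run.
ARRAY_SIZE = 33534976
MIN_TIMES = {"Copy": 0.1487, "Scale": 0.1483, "Add": 0.1674, "Triad": 0.1655}

if __name__ == "__main__":
    for kernel, min_time in MIN_TIMES.items():
        rate = stream_rate_mb_s(kernel, ARRAY_SIZE, min_time)
        print(f"{kernel:6s} {rate:10.1f} MB/s")
```

The recomputed values agree with the table to within about 1 MB/s, which is consistent with the printed times being rounded to four decimal places.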

    Here is the full output file:
    --------------------------------------------------

     Requesting Large Pages
     Setting up for 2 CPUs per module
     Number of segments per array = 1
     CPU binding list : 0
     Shared Segment Pointer = 504403158265495552
     Shared Segment Pointer = 504403158533931008
     Shared Segment Pointer = 504403158802366464
     Segment Size (B) = 268435456 (MB = 256 )
     Array Size (B) = 268435456 (MB = 256 )
     Array Size (DW) = 33554432
     Num_threads = 2
     Num_threads = 2
     rebind: num_parthds is 2
     Starting Initialization
     Done With Initialization
     a(1) 1.00000000000000000
     b(M) 1.00000000000000000
     c(M) 1.00000000000000000
     Incremental Offset = 512
    ----------------------------------------------
     Double precision appears to have 16 digits of accuracy
     Assuming 8 bytes per DOUBLE PRECISION word
    ----------------------------------------------
     Array size = 33541120
     The total memory requirement is 767 MB
     You are running each test 5 times
     --
     The *best* time for each test is used
     *EXCLUDING* the first and last iterations
     ----------------------------------------------------
     Your clock granularity appears to be less than one microsecond
     Your clock granularity/precision appears to be 1 microseconds
     ----------------------------------------------------
    Function Rate (MB/s) Avg time Min time Max time
    Copy: 3609.0958 .1488 .1487 .1490
    Scale: 3625.9852 .1483 .1480 .1485
    Add: 4813.5582 .1674 .1672 .1677
    Triad: 4863.5088 .1659 .1655 .1661
     ----------------------------------------------------
     Solution Validates!
     ----------------------------------------------------
     Incremental Offset = 1536
    ----------------------------------------------
     Double precision appears to have 16 digits of accuracy
     Assuming 8 bytes per DOUBLE PRECISION word
    ----------------------------------------------
     Array size = 33541120
     The total memory requirement is 767 MB
     You are running each test 5 times
     --
     The *best* time for each test is used
     *EXCLUDING* the first and last iterations
     ----------------------------------------------------
     Your clock granularity appears to be less than one microsecond
     Your clock granularity/precision appears to be 1 microseconds
     ----------------------------------------------------
    Function Rate (MB/s) Avg time Min time Max time
    Copy: 3600.9318 .1491 .1490 .1492
    Scale: 3615.4784 .1485 .1484 .1486
    Add: 4798.5692 .1678 .1678 .1679
    Triad: 4856.5761 .1659 .1658 .1661
     ----------------------------------------------------
     Solution Validates!
     ----------------------------------------------------
     Incremental Offset = 2560
    ----------------------------------------------
     Double precision appears to have 16 digits of accuracy
     Assuming 8 bytes per DOUBLE PRECISION word
    ----------------------------------------------
     Array size = 33541120
     The total memory requirement is 767 MB
     You are running each test 5 times
     --
     The *best* time for each test is used
     *EXCLUDING* the first and last iterations
     ----------------------------------------------------
     Your clock granularity appears to be less than one microsecond
     Your clock granularity/precision appears to be 1 microseconds
     ----------------------------------------------------
    Function Rate (MB/s) Avg time Min time Max time
    Copy: 3606.4994 .1489 .1488 .1490
    Scale: 3612.8786 .1486 .1485 .1486
    Add: 4801.1963 .1677 .1677 .1678
    Triad: 4857.7709 .1658 .1657 .1659
     ----------------------------------------------------
     Solution Validates!
     ----------------------------------------------------
     Incremental Offset = 512
    ----------------------------------------------
     Double precision appears to have 16 digits of accuracy
     Assuming 8 bytes per DOUBLE PRECISION word
    ----------------------------------------------
     Array size = 33539072
     The total memory requirement is 767 MB
     You are running each test 5 times
     --
     The *best* time for each test is used
     *EXCLUDING* the first and last iterations
     ----------------------------------------------------
     Your clock granularity appears to be less than one microsecond
     Your clock granularity/precision appears to be 1 microseconds
     ----------------------------------------------------
    Function Rate (MB/s) Avg time Min time Max time
    Copy: 3598.3978 .1493 .1491 .1496
    Scale: 3625.0689 .1481 .1480 .1482
    Add: 4804.7635 .1682 .1675 .1689
    Triad: 4861.1812 .1662 .1656 .1669
     ----------------------------------------------------
     Solution Validates!
     ----------------------------------------------------
     Incremental Offset = 1536
    ----------------------------------------------
     Double precision appears to have 16 digits of accuracy
     Assuming 8 bytes per DOUBLE PRECISION word
    ----------------------------------------------
     Array size = 33539072
     The total memory requirement is 767 MB
     You are running each test 5 times
     --
     The *best* time for each test is used
     *EXCLUDING* the first and last iterations
     ----------------------------------------------------
     Your clock granularity appears to be less than one microsecond
     Your clock granularity/precision appears to be 1 microseconds
     ----------------------------------------------------
    Function Rate (MB/s) Avg time Min time Max time
    Copy: 3598.9789 .1492 .1491 .1494
    Scale: 3611.8348 .1487 .1486 .1488
    Add: 4805.5910 .1678 .1675 .1679
    Triad: 4862.7636 .1658 .1655 .1660
     ----------------------------------------------------
     Solution Validates!
     ----------------------------------------------------
     Incremental Offset = 2560
    ----------------------------------------------
     Double precision appears to have 16 digits of accuracy
     Assuming 8 bytes per DOUBLE PRECISION word
    ----------------------------------------------
     Array size = 33539072
     The total memory requirement is 767 MB
     You are running each test 5 times
     --
     The *best* time for each test is used
     *EXCLUDING* the first and last iterations
     ----------------------------------------------------
     Your clock granularity appears to be less than one microsecond
     Your clock granularity/precision appears to be 1 microseconds
     ----------------------------------------------------
    Function Rate (MB/s) Avg time Min time Max time
    Copy: 3606.8398 .1490 .1488 .1493
    Scale: 3614.2359 .1485 .1485 .1486
    Add: 4811.2271 .1675 .1673 .1677
    Triad: 4851.8200 .1661 .1659 .1662
     ----------------------------------------------------
     Solution Validates!
     ----------------------------------------------------
     Incremental Offset = 512
    ----------------------------------------------
     Double precision appears to have 16 digits of accuracy
     Assuming 8 bytes per DOUBLE PRECISION word
    ----------------------------------------------
     Array size = 33537024
     The total memory requirement is 767 MB
     You are running each test 5 times
     --
     The *best* time for each test is used
     *EXCLUDING* the first and last iterations
     ----------------------------------------------------
     Your clock granularity appears to be less than one microsecond
     Your clock granularity/precision appears to be 1 microseconds
     ----------------------------------------------------
    Function Rate (MB/s) Avg time Min time Max time
    Copy: 3602.3131 .1490 .1490 .1491
    Scale: 3615.3156 .1484 .1484 .1485
    Add: 4809.9531 .1674 .1673 .1675
    Triad: 4861.0803 .1658 .1656 .1660
     ----------------------------------------------------
     Solution Validates!
     ----------------------------------------------------
     Incremental Offset = 1536
    ----------------------------------------------
     Double precision appears to have 16 digits of accuracy
     Assuming 8 bytes per DOUBLE PRECISION word
    ----------------------------------------------
     Array size = 33537024
     The total memory requirement is 767 MB
     You are running each test 5 times
     --
     The *best* time for each test is used
     *EXCLUDING* the first and last iterations
     ----------------------------------------------------
     Your clock granularity appears to be less than one microsecond
     Your clock granularity/precision appears to be 1 microseconds
     ----------------------------------------------------
    Function Rate (MB/s) Avg time Min time Max time
    Copy: 3612.0721 .1489 .1486 .1491
    Scale: 3615.6292 .1484 .1484 .1485
    Add: 4804.3402 .1677 .1675 .1679
    Triad: 4854.2584 .1660 .1658 .1661
     ----------------------------------------------------
     Solution Validates!
     ----------------------------------------------------
     Incremental Offset = 2560
    ----------------------------------------------
     Double precision appears to have 16 digits of accuracy
     Assuming 8 bytes per DOUBLE PRECISION word
    ----------------------------------------------
     Array size = 33537024
     The total memory requirement is 767 MB
     You are running each test 5 times
     --
     The *best* time for each test is used
     *EXCLUDING* the first and last iterations
     ----------------------------------------------------
     Your clock granularity appears to be less than one microsecond
     Your clock granularity/precision appears to be 1 microseconds
     ----------------------------------------------------
    Function Rate (MB/s) Avg time Min time Max time
    Copy: 3603.4147 .1490 .1489 .1490
    Scale: 3615.3969 .1485 .1484 .1486
    Add: 4812.2020 .1675 .1673 .1676
    Triad: 4856.5209 .1659 .1657 .1660
     ----------------------------------------------------
     Solution Validates!
     ----------------------------------------------------
     Incremental Offset = 512
    ----------------------------------------------
     Double precision appears to have 16 digits of accuracy
     Assuming 8 bytes per DOUBLE PRECISION word
    ----------------------------------------------
     Array size = 33534976
     The total memory requirement is 767 MB
     You are running each test 5 times
     --
     The *best* time for each test is used
     *EXCLUDING* the first and last iterations
     ----------------------------------------------------
     Your clock granularity appears to be less than one microsecond
     Your clock granularity/precision appears to be 1 microseconds
     ----------------------------------------------------
    Function Rate (MB/s) Avg time Min time Max time
    Copy: 3608.3480 .1489 .1487 .1491
    Scale: 3617.6983 .1484 .1483 .1485
    Add: 4807.8167 .1675 .1674 .1676
    Triad: 4864.0613 .1657 .1655 .1659
     ----------------------------------------------------
     Solution Validates!
     ----------------------------------------------------
     Incremental Offset = 1536
    ----------------------------------------------
     Double precision appears to have 16 digits of accuracy
     Assuming 8 bytes per DOUBLE PRECISION word
    ----------------------------------------------
     Array size = 33534976
     The total memory requirement is 767 MB
     You are running each test 5 times
     --
     The *best* time for each test is used
     *EXCLUDING* the first and last iterations
     ----------------------------------------------------
     Your clock granularity appears to be less than one microsecond
     Your clock granularity/precision appears to be 1 microseconds
     ----------------------------------------------------
    Function Rate (MB/s) Avg time Min time Max time
    Copy: 3606.0757 .1489 .1488 .1490
    Scale: 3615.0658 .1484 .1484 .1485
    Add: 4807.9058 .1676 .1674 .1677
    Triad: 4854.1155 .1659 .1658 .1660
     ----------------------------------------------------
     Solution Validates!
     ----------------------------------------------------
     Incremental Offset = 2560
    ----------------------------------------------
     Double precision appears to have 16 digits of accuracy
     Assuming 8 bytes per DOUBLE PRECISION word
    ----------------------------------------------
     Array size = 33534976
     The total memory requirement is 767 MB
     You are running each test 5 times
     --
     The *best* time for each test is used
     *EXCLUDING* the first and last iterations
     ----------------------------------------------------
     Your clock granularity appears to be less than one microsecond
     Your clock granularity/precision appears to be 1 microseconds
     ----------------------------------------------------
    Function Rate (MB/s) Avg time Min time Max time
    Copy: 3563.6884 .1511 .1506 .1521
    Scale: 3624.6145 .1496 .1480 .1508
    Add: 4792.8063 .1698 .1679 .1711
    Triad: 4844.0017 .1681 .1662 .1693
     ----------------------------------------------------
     Solution Validates!
     ----------------------------------------------------
    GETSHRSEG: requesting large pages
    GETSHRSEG ENTRY: shmgetflag -2147481216
    bindprocessor successful: thread_self() 450673 cpu_id 0
    GETSHRSEG: requesting large pages
    GETSHRSEG ENTRY: shmgetflag -2147481216
    bindprocessor successful: thread_self() 450673 cpu_id 0
    GETSHRSEG: requesting large pages
    GETSHRSEG ENTRY: shmgetflag -2147481216
    bindprocessor successful: thread_self() 450673 cpu_id 0
    bindprocessor successful: thread_self() 450673 cpu_id 0
    bindprocessor successful: thread_self() 508075 cpu_id 1
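
The output file above contains twelve runs (four array sizes, each at three incremental offsets), and the summary at the top of this message matches the run with array size 33534976 and offset 512, which is also the run with the highest Triad rate. A minimal sketch of how such a log could be split into runs and the best one selected is shown below. The selection rule (highest Triad rate) is an assumption inferred from the numbers, not something stated in the message, and the function names and the abbreviated `SAMPLE` excerpt are hypothetical.

```python
import re

# Patterns for the three kinds of lines of interest in the log.
OFFSET_RE = re.compile(r"^\s*Incremental Offset =\s*(\d+)")
SIZE_RE = re.compile(r"^\s*Array size =\s*(\d+)")
RESULT_RE = re.compile(
    r"^\s*(Copy|Scale|Add|Triad):\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s*$"
)


def parse_stream_log(text):
    """Return one dict per run with its offset, array size, and rates in MB/s."""
    runs = []
    current = None
    for line in text.splitlines():
        match = OFFSET_RE.match(line)
        if match:
            # Each "Incremental Offset" line starts a new run.
            current = {"offset": int(match.group(1)), "array_size": None, "rates": {}}
            runs.append(current)
            continue
        if current is None:
            continue  # skip anything before the first run
        match = SIZE_RE.match(line)
        if match:
            current["array_size"] = int(match.group(1))
            continue
        match = RESULT_RE.match(line)
        if match:
            current["rates"][match.group(1)] = float(match.group(2))
    return runs


def best_run(runs, kernel="Triad"):
    """Return the run with the highest rate for the given kernel."""
    candidates = [run for run in runs if kernel in run["rates"]]
    return max(candidates, key=lambda run: run["rates"][kernel])


# Abbreviated excerpt of two runs from the log, for illustration only.
SAMPLE = """\
 Incremental Offset = 512
 Array size = 33534976
Function Rate (MB/s) Avg time Min time Max time
Copy: 3608.3480 .1489 .1487 .1491
Triad: 4864.0613 .1657 .1655 .1659
 Incremental Offset = 1536
 Array size = 33534976
Function Rate (MB/s) Avg time Min time Max time
Copy: 3606.0757 .1489 .1488 .1490
Triad: 4854.1155 .1659 .1658 .1660
"""

if __name__ == "__main__":
    best = best_run(parse_stream_log(SAMPLE))
    print(best["offset"], best["array_size"], best["rates"])
```

Applied to the full output file, the same functions would list all twelve runs, so the spread between offsets and array sizes can be inspected alongside the single best result.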


