SX-5 STREAM

From: Philip Tannenbaum (philt@ieee.org)
Date: Mon Dec 13 1999 - 10:49:03 CST


Dear John,

Per your request at SC99, I am enclosing formal STREAM results
for the SX-5. The formal machine name is SX-5/16A.

In the names below, the number after the "/" is the number of processors
used in each run, so runs on 1, 2, 4, 8, and 16 CPUs are represented. Remember
that the SX-5 uses SDRAM memory, so it is not directly comparable with the
original SX-4, which used SSRAM. It is comparable to the SX-4A, which also
used SDRAM but for which results were never reported.

BTW, I will put this into an HPCWire article in the next few days
unless you would rather release it yourself as the keeper of the STREAM ;)

Best Regards

Philip Tannenbaum
HNSX Supercomputers Inc.
---------------------------------------------
PA24-260C N9312P
----- Original Message -----
Subject: Re: STREAM

>
> Machine SX5#4 16A 128GB
> Date Dec 7, 1999
>
> [1] SX-5/1A
> ----------------------------------------------
> Double precision appears to have 16 digits of accuracy
> Assuming 8 bytes per DOUBLE PRECISION word
> ----------------------------------------------
> Array size = 80000000
> Offset = 0
> The total memory requirement is 1831 MB
> You are running each test 10 times
> The *best* time for each test is used
> ----------------------------------------------------
> Your clock granularity/precision appears to be 44 microseconds
> The tests below will each take a time on the order
> of 30108 microseconds
> (= 684 clock ticks)
> Increase the size of the arrays if this shows that
> you are not getting at least 20 clock ticks per test.
> ----------------------------------------------------
> WARNING -- The above is only a rough guideline.
> For best results, please be sure you know the
> precision of your system timer.
> ----------------------------------------------------
> Function     Rate (MB/s)    RMS time    Min time    Max time
> Copy:         42544.6479      0.0301      0.0301      0.0302
> Scale:        42546.1651      0.0301      0.0301      0.0301
> Add:          47780.3278      0.0402      0.0402      0.0402
> Triad:        47779.0521      0.0402      0.0402      0.0402
> Sum of a is = 9.226406249984801D+19
> Sum of b is = 1.845281249988019D+19
> Sum of c is = 2.460375000016015D+19
>
>
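
As a sanity check on the SX-5/1A figures above, the sketch below recomputes the memory requirement and the four rates from the array size and the printed minimum times. It assumes the usual STREAM accounting (Copy and Scale move two words per iteration, Add and Triad move three), with memory reported in units of 2**20 bytes and rates in units of 10**6 bytes/s. The function and variable names are illustrative and are not part of the benchmark.

```python
# Sanity check of the SX-5/1A run, using only numbers printed in the log.
N = 80_000_000   # "Array size = 80000000"
WORD = 8         # "Assuming 8 bytes per DOUBLE PRECISION word"

# Words moved per loop iteration by each kernel (loads plus stores).
# Assumption: the standard STREAM byte-counting convention.
WORDS_MOVED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}

# kernel -> (reported rate in MB/s, reported min time in seconds)
REPORTED_1CPU = {
    "Copy": (42544.6479, 0.0301),
    "Scale": (42546.1651, 0.0301),
    "Add": (47780.3278, 0.0402),
    "Triad": (47779.0521, 0.0402),
}


def memory_mb(n=N, word=WORD):
    """Size of the three arrays, in units of 2**20 bytes."""
    return 3 * word * n / 2**20


def rate_mb_s(kernel, min_time, n=N, word=WORD):
    """Bytes moved divided by the best time, in units of 10**6 bytes/s."""
    return WORDS_MOVED[kernel] * word * n / min_time / 1e6


def relative_errors(reported=REPORTED_1CPU):
    """Relative gap between each recomputed rate and the reported rate."""
    return {
        kernel: abs(rate_mb_s(kernel, t) - rate) / rate
        for kernel, (rate, t) in reported.items()
    }


if __name__ == "__main__":
    print(f"memory requirement: {int(memory_mb())} MB")
    for kernel, err in relative_errors().items():
        print(f"{kernel:6s} relative difference: {err:.3%}")
```

Because the log prints times to only four decimal places, the recomputed rates match the reported ones only to within a fraction of a percent. The program itself works from unrounded times.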
>
> [2] SX-5/2A
> ----------------------------------------------
> Double precision appears to have 16 digits of accuracy
> Assuming 8 bytes per DOUBLE PRECISION word
> ----------------------------------------------
> Array size = 80000000
> Offset = 0
> The total memory requirement is 1831 MB
> You are running each test 10 times
> The *best* time for each test is used
> ----------------------------------------------------
> Your clock granularity/precision appears to be 44 microseconds
> The tests below will each take a time on the order
> of 15122 microseconds
> (= 344 clock ticks)
> Increase the size of the arrays if this shows that
> you are not getting at least 20 clock ticks per test.
> ----------------------------------------------------
> WARNING -- The above is only a rough guideline.
> For best results, please be sure you know the
> precision of your system timer.
> ----------------------------------------------------
> Function     Rate (MB/s)    RMS time    Min time    Max time
> Copy:         84852.6042      0.0151      0.0151      0.0151
> Scale:        84852.6042      0.0151      0.0151      0.0151
> Add:          95351.6151      0.0201      0.0201      0.0201
> Triad:        95327.9119      0.0201      0.0201      0.0202
> Sum of a is = 9.226406249985602D+19
> Sum of b is = 1.845281249991997D+19
> Sum of c is = 2.460375000000000D+19
>
>
>
> [4] SX-5/4A
> ----------------------------------------------
> Double precision appears to have 16 digits of accuracy
> Assuming 8 bytes per DOUBLE PRECISION word
> ----------------------------------------------
> Array size = 80000000
> Offset = 0
> The total memory requirement is 1831 MB
> You are running each test 10 times
> The *best* time for each test is used
> ----------------------------------------------------
> Your clock granularity/precision appears to be 44 microseconds
> The tests below will each take a time on the order
> of 7620 microseconds
> (= 173 clock ticks)
> Increase the size of the arrays if this shows that
> you are not getting at least 20 clock ticks per test.
> ----------------------------------------------------
> WARNING -- The above is only a rough guideline.
> For best results, please be sure you know the
> precision of your system timer.
> ----------------------------------------------------
> Function     Rate (MB/s)    RMS time    Min time    Max time
> Copy:        168485.5912      0.0076      0.0076      0.0076
> Scale:       168509.3886      0.0076      0.0076      0.0076
> Add:         189555.2133      0.0101      0.0101      0.0101
> Triad:       189517.2955      0.0101      0.0101      0.0101
> Sum of a is = 9.226406249987201D+19
> Sum of b is = 1.845281249999994D+19
> Sum of c is = 2.460375000000000D+19
>
>
>
> [8] SX-5/8A
> ----------------------------------------------
> Double precision appears to have 16 digits of accuracy
> Assuming 8 bytes per DOUBLE PRECISION word
> ----------------------------------------------
> Array size = 80000000
> Offset = 0
> The total memory requirement is 1831 MB
> You are running each test 10 times
> The *best* time for each test is used
> ----------------------------------------------------
> Your clock granularity/precision appears to be 44 microseconds
> The tests below will each take a time on the order
> of 3869 microseconds
> (= 88 clock ticks)
> Increase the size of the arrays if this shows that
> you are not getting at least 20 clock ticks per test.
> ----------------------------------------------------
> WARNING -- The above is only a rough guideline.
> For best results, please be sure you know the
> precision of your system timer.
> ----------------------------------------------------
> Function     Rate (MB/s)    RMS time    Min time    Max time
> Copy:        332551.3578      0.0039      0.0038      0.0039
> Scale:       332551.3578      0.0039      0.0038      0.0039
> Add:         371160.2378      0.0053      0.0052      0.0054
> Triad:       366690.0567      0.0053      0.0052      0.0054
> Sum of a is = 9.226406249990401D+19
> Sum of b is = 1.845281249999999D+19
> Sum of c is = 2.460375000000000D+19
>
>
>
> [16] SX-5/16A
> ----------------------------------------------
> Double precision appears to have 16 digits of accuracy
> Assuming 8 bytes per DOUBLE PRECISION word
> ----------------------------------------------
> Array size = 80000000
> Offset = 0
> The total memory requirement is 1831 MB
> You are running each test 10 times
> The *best* time for each test is used
> ----------------------------------------------------
> Your clock granularity/precision appears to be 45 microseconds
> The tests below will each take a time on the order
> of 2157 microseconds
> (= 48 clock ticks)
> Increase the size of the arrays if this shows that
> you are not getting at least 20 clock ticks per test.
> ----------------------------------------------------
> WARNING -- The above is only a rough guideline.
> For best results, please be sure you know the
> precision of your system timer.
> ----------------------------------------------------
> Function     Rate (MB/s)    RMS time    Min time    Max time
> Copy:        607491.8382      0.0021      0.0021      0.0022
> Scale:       590389.7421      0.0022      0.0022      0.0023
> Add:         607411.6518      0.0032      0.0032      0.0033
> Triad:       583069.4479      0.0034      0.0033      0.0034
> Sum of a is = 9.226406249996797D+19
> Sum of b is = 1.845281249999999D+19
> Sum of c is = 2.460374999999999D+19
>
>
>
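
To put the five runs side by side, the sketch below computes speedup and parallel efficiency from the Triad rates reported above, taking the 1-CPU run as the baseline. Triad is chosen only as an example; the same function works on the Copy, Scale, or Add rates. The names `TRIAD_MB_S` and `scaling` are illustrative and do not come from the benchmark.

```python
# Triad rates in MB/s, keyed by CPU count, copied from the runs above.
TRIAD_MB_S = {
    1: 47779.0521,
    2: 95327.9119,
    4: 189517.2955,
    8: 366690.0567,
    16: 583069.4479,
}


def scaling(rates):
    """Map CPU count -> (speedup over 1 CPU, parallel efficiency)."""
    base = rates[1]
    result = {}
    for cpus, rate in sorted(rates.items()):
        speedup = rate / base
        result[cpus] = (speedup, speedup / cpus)
    return result


if __name__ == "__main__":
    for cpus, (speedup, eff) in scaling(TRIAD_MB_S).items():
        print(f"{cpus:2d} CPUs: speedup {speedup:6.2f}, efficiency {eff:6.1%}")
```

On these figures Triad scales almost linearly through 8 CPUs and reaches a speedup of about 12.2 on 16 CPUs, or roughly 76% efficiency.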



This archive was generated by hypermail 2b29 : Tue Apr 18 2000 - 05:23:08 CDT