From: h-takahara@bc.jp.nec.com
Date: Wed Oct 16 2002 - 03:51:19 CDT
Dear Dr. McCalpin,
Please find attached the STREAM runs we obtained on the NEC SX-6.
The machine used is the SX-6/8A, which consists of 8 processors.
The performance figures are reported for 1-, 2-, 4-, and 8-processor
configurations of this system.
The digit after the "/" in the attached table gives the number
of processors actually used for each run, e.g., SX-6/2A for
a 2-processor configuration.
We would appreciate it if you could update the STREAM Web site
with these data.
Thank you.
Best regards,
Hiroshi Takahara
Senior Manager, Scientific Software Department
HPC Marketing Promotion Division, NEC Corporation
1-10, Nisshin-cho, Fuchu, Tokyo 183-8501, Japan
Tel/Fax: +81-42-333-6389 /6382
E-mail: h-takahara@bc.jp.nec.com
-------
1. Summary
[1cpu]
Function      Rate (MB/s)   RMS time   Min time   Max time
Copy:          31959.2652     0.0401     0.0401     0.0401
Scale:         31920.2167     0.0401     0.0401     0.0401
Add:           31983.0006     0.0600     0.0600     0.0600
Triad:         31982.9371     0.0600     0.0600     0.0600
[2cpus]
Function      Rate (MB/s)   RMS time   Min time   Max time
Copy:          63770.4794     0.0201     0.0201     0.0202
Scale:         63665.7411     0.0201     0.0201     0.0202
Add:           63908.6389     0.0301     0.0300     0.0301
Triad:         63908.3853     0.0301     0.0300     0.0301
[4cpus]
Function      Rate (MB/s)   RMS time   Min time   Max time
Copy:         126895.8381     0.0101     0.0101     0.0101
Scale:        126620.4981     0.0101     0.0101     0.0102
Add:          127643.0474     0.0150     0.0150     0.0150
Triad:        127633.9437     0.0150     0.0150     0.0150
[8cpus]
Function      Rate (MB/s)   RMS time   Min time   Max time
Copy:         202627.2054     0.0064     0.0063     0.0065
Scale:        192306.2280     0.0067     0.0067     0.0068
Add:          190231.3486     0.0102     0.0101     0.0104
Triad:        213024.2882     0.0092     0.0090     0.0093
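
The summary rates above can be turned into speedup and parallel efficiency relative to the 1-CPU run. The short Python sketch below does this for Triad only. The dictionary and variable names are illustrative and not part of STREAM; the rates are copied from the summary tables.

```python
# Triad rates in MB/s, copied from the summary tables above.
triad_rate = {1: 31982.9371, 2: 63908.3853, 4: 127633.9437, 8: 213024.2882}

base = triad_rate[1]
# Speedup is the rate relative to the 1-CPU run.
speedup = {n: rate / base for n, rate in triad_rate.items()}
# Efficiency is the speedup divided by the CPU count.
efficiency = {n: s / n for n, s in speedup.items()}

for n in sorted(triad_rate):
    print(f"{n} CPU(s): speedup {speedup[n]:.2f}, efficiency {efficiency[n]:.1%}")
```

On these figures Triad scales almost linearly up to 4 CPUs and drops to roughly 83% efficiency at 8 CPUs.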
2. Details
Machine: SX-6/8A 64GB
[SX-6/1A]
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 80000000
Offset = 0
The total memory requirement is 1831 MB
You are running each test 10 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 28 microseconds
The tests below will each take a time on the order
of 40113 microseconds
(= 1433 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function      Rate (MB/s)   RMS time   Min time   Max time
Copy:          31959.2652     0.0401     0.0401     0.0401
Scale:         31920.2167     0.0401     0.0401     0.0401
Add:           31983.0006     0.0600     0.0600     0.0600
Triad:         31982.9371     0.0600     0.0600     0.0600
Sum of a is = 9.226406249984801D+19
Sum of b is = 1.845281249988019D+19
Sum of c is = 2.460375000016015D+19
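
The figures in the [SX-6/1A] output above can be cross-checked by hand. The sketch below assumes the usual STREAM byte accounting (Copy and Scale move two 8-byte words per element, Add and Triad move three), that rates are in units of 10^6 bytes per second, and that the memory figure is in units of 2^20 bytes. These conventions are assumptions that happen to reproduce the numbers above, and the function names are illustrative. Because the printed times are rounded to four decimals, the recomputed rates agree with the reported ones only to within a fraction of a percent.

```python
N = 80_000_000          # array size from the output above
BYTES_PER_WORD = 8      # "8 bytes per DOUBLE PRECISION word"

# Arrays touched per element by each kernel (assumed STREAM convention).
ARRAYS_TOUCHED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}

def rate_mb_per_s(kernel, min_time_s):
    """Bandwidth in 10^6 bytes/s from the best (minimum) time in seconds."""
    bytes_moved = ARRAYS_TOUCHED[kernel] * BYTES_PER_WORD * N
    return 1.0e-6 * bytes_moved / min_time_s

def total_memory_mib(n_arrays=3):
    """Footprint of the work arrays in units of 2^20 bytes."""
    return n_arrays * BYTES_PER_WORD * N / 2**20

# Min times as printed in the 1-CPU table (rounded to 4 decimals).
min_time = {"Copy": 0.0401, "Scale": 0.0401, "Add": 0.0600, "Triad": 0.0600}
for kernel, t in min_time.items():
    print(f"{kernel}: {rate_mb_per_s(kernel, t):.1f} MB/s")
print(f"memory: {int(total_memory_mib())} MB")
```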
[SX-6/2A]
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 80000000
Offset = 0
The total memory requirement is 1831 MB
You are running each test 10 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 28 microseconds
The tests below will each take a time on the order
of 20146 microseconds
(= 720 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function      Rate (MB/s)   RMS time   Min time   Max time
Copy:          63770.4794     0.0201     0.0201     0.0202
Scale:         63665.7411     0.0201     0.0201     0.0202
Add:           63908.6389     0.0301     0.0300     0.0301
Triad:         63908.3853     0.0301     0.0300     0.0301
Sum of a is = 9.226406249985600D+19
Sum of b is = 1.845281249991997D+19
Sum of c is = 2.460375000000000D+19
[SX-6/4A]
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 80000000
Offset = 0
The total memory requirement is 1831 MB
You are running each test 10 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 28 microseconds
The tests below will each take a time on the order
of 10132 microseconds
(= 362 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function      Rate (MB/s)   RMS time   Min time   Max time
Copy:         126895.8381     0.0101     0.0101     0.0101
Scale:        126620.4981     0.0101     0.0101     0.0102
Add:          127643.0474     0.0150     0.0150     0.0150
Triad:        127633.9437     0.0150     0.0150     0.0150
Sum of a is = 9.226406249987199D+19
Sum of b is = 1.845281249999994D+19
Sum of c is = 2.460375000000000D+19
[SX-6/8A]
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 80000000
Offset = 0
The total memory requirement is 1831 MB
You are running each test 10 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 29 microseconds
The tests below will each take a time on the order
of 6047 microseconds
(= 209 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function      Rate (MB/s)   RMS time   Min time   Max time
Copy:         202627.2054     0.0064     0.0063     0.0065
Scale:        192306.2280     0.0067     0.0067     0.0068
Add:          190231.3486     0.0102     0.0101     0.0104
Triad:        213024.2882     0.0092     0.0090     0.0093
Sum of a is = 9.226406249990398D+19
Sum of b is = 1.845281250000000D+19
Sum of c is = 2.460375000000000D+19
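
The "clock ticks" line in each run above appears to be the estimated test time divided by the timer granularity, and the output asks for at least 20 ticks per test. The sketch below is a minimal check of the four runs using the values printed in the output. The tuple layout and names are illustrative, and since the printed granularity is itself rounded, the recomputed tick counts are only approximate.

```python
# (label, timer granularity in us, estimated test time in us, ticks as printed)
runs = [
    ("SX-6/1A", 28, 40113, 1433),
    ("SX-6/2A", 28, 20146, 720),
    ("SX-6/4A", 28, 10132, 362),
    ("SX-6/8A", 29, 6047, 209),
]

MIN_TICKS = 20  # threshold quoted in the STREAM output

# Recompute ticks per test as estimated time / granularity.
ticks = {label: est_us / gran_us for label, gran_us, est_us, _ in runs}
for label, _, _, printed in runs:
    ok = ticks[label] >= MIN_TICKS
    print(f"{label}: {ticks[label]:.1f} ticks (printed {printed}), enough: {ok}")
```

Even the 8-CPU run, the shortest of the four, gets roughly 200 ticks per test, about ten times the suggested minimum.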
-------