Dear John,
Here are some STREAM results for a Hitachi S3600. It may be
(almost) obsolete, but it is still faster than the average desktop... The
results are one run with an array size of 4,000,000, followed by runs with
an array size of 20,000,000 at offset 0 (twice), offset 8 and offset 10.
Triad is the only test that is markedly changed by the offset, and I have
not investigated other offsets.
If nothing else, this should double the number of Hitachi
machines for which you have data!
Michael
Compiled with "f77 -W0,'OPT(O(S)),HAP' stream_d.f t_second.f"
Array sizes were increased because I don't trust the timing routine very far.
Run on an empty machine as an unprivileged user, except for the offset != 0
runs, which were on a busy machine.
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 4000000
Offset = 0
The total memory requirement is 91 MB
You are running each test 5 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 28 microseconds
The tests below will each take a time on the order
of 7461 microseconds
(= 266 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function     Rate (MB/s)   RMS time   Min time   Max time
Copy:         12407.9100     0.0052     0.0052     0.0052
Scale:         6500.1016     0.0100     0.0098     0.0102
Add:          11320.7547     0.0087     0.0085     0.0091
Triad:         7548.3567     0.0137     0.0127     0.0142
Sum of a is = 6075000000000.00000
Sum of b is = 1215000000000.00000
Sum of c is = 1620000000000.00000
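
As a cross-check on the table above, the Rate column can be reproduced approximately from the Min time column. The following is a minimal sketch, assuming the usual STREAM byte counts (Copy and Scale touch two 8-byte words per element, Add and Triad touch three) and that MB here means 10^6 bytes. The function name `stream_rate_mb_s` is illustrative and is not part of the benchmark. Because the printed times are rounded to four decimals, the computed values should only be expected to agree with the reported rates to within about a percent.

```python
# Words read or written per array element by each STREAM kernel
# (assumed standard counts), with 8-byte DOUBLE PRECISION words
# as stated in the benchmark output.
WORDS_PER_ELEMENT = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}
BYTES_PER_WORD = 8


def stream_rate_mb_s(kernel, array_size, min_time_s):
    """Bandwidth in units of 10**6 bytes per second for one kernel."""
    bytes_moved = WORDS_PER_ELEMENT[kernel] * BYTES_PER_WORD * array_size
    return bytes_moved / min_time_s / 1.0e6


# Min times (seconds) copied from the array-size-4,000,000 run.
min_times = {"Copy": 0.0052, "Scale": 0.0098, "Add": 0.0085, "Triad": 0.0127}

for kernel, t in min_times.items():
    rate = stream_rate_mb_s(kernel, 4_000_000, t)
    print(f"{kernel:6s} {rate:10.1f} MB/s")
```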
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 20000000
Offset = 0
The total memory requirement is 457 MB
You are running each test 5 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 28 microseconds
The tests below will each take a time on the order
of 37276 microseconds
(= 1331 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function     Rate (MB/s)   RMS time   Min time   Max time
Copy:         12782.6156     0.0258     0.0250     0.0261
Scale:         6346.3102     0.0507     0.0504     0.0514
Add:          11202.1284     0.0433     0.0428     0.0435
Triad:         7031.2157     0.0697     0.0683     0.0706
Sum of a is = 30375000000000.0000
Sum of b is = 6075000000000.00000
Sum of c is = 8100000000000.00000
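
The header figures printed before each table can be checked in the same spirit. This is a rough sketch, assuming the memory requirement counts three arrays of 8-byte words, is reported in units of 2^20 bytes, and has its fractional part dropped; that last point is inferred from the printed numbers (91 and 457), not from the benchmark source. The helper names are illustrative only.

```python
BYTES_PER_WORD = 8   # "Assuming 8 bytes per DOUBLE PRECISION word"
NUM_ARRAYS = 3       # STREAM operates on three arrays: a, b and c


def memory_requirement_mib(array_size):
    """Total size of the three arrays in units of 2**20 bytes."""
    return NUM_ARRAYS * BYTES_PER_WORD * array_size / 2**20


def clock_ticks(test_time_us, granularity_us):
    """Approximate number of timer ticks spanned by one test."""
    return test_time_us / granularity_us


for n in (4_000_000, 20_000_000):
    print(f"N = {n:>10,d}: {memory_requirement_mib(n):6.2f} x 2**20 bytes")

# Estimated test time and granularity from the run just above.
print(f"about {clock_ticks(37276, 28):.0f} clock ticks per test")
```

The tick count is only approximate, since the granularity is itself printed as a whole number of microseconds.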
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 20000000
Offset = 0
The total memory requirement is 457 MB
You are running each test 5 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 27 microseconds
The tests below will each take a time on the order
of 37321 microseconds
(= 1382 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function     Rate (MB/s)   RMS time   Min time   Max time
Copy:         12700.4286     0.0258     0.0252     0.0262
Scale:         6346.0585     0.0507     0.0504     0.0513
Add:          11183.3368     0.0434     0.0429     0.0440
Triad:         6980.5997     0.0696     0.0688     0.0705
Sum of a is = 30375000000000.0000
Sum of b is = 6075000000000.00000
Sum of c is = 8100000000000.00000
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 20000000
Offset = 8
The total memory requirement is 457 MB
You are running each test 5 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 28 microseconds
The tests below will each take a time on the order
of 37263 microseconds
(= 1331 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function     Rate (MB/s)   RMS time   Min time   Max time
Copy:         12235.6900     0.0269     0.0262     0.0274
Scale:         6338.0142     0.0507     0.0505     0.0508
Add:          11437.5581     0.0422     0.0420     0.0424
Triad:        10168.6298     0.0504     0.0472     0.0599
Sum of a is = 30375000000000.0000
Sum of b is = 6075000000000.00000
Sum of c is = 8100000000000.00000
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 20000000
Offset = 10
The total memory requirement is 457 MB
You are running each test 5 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 28 microseconds
The tests below will each take a time on the order
of 37404 microseconds
(= 1336 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function     Rate (MB/s)   RMS time   Min time   Max time
Copy:         12062.7262     0.0273     0.0265     0.0288
Scale:         6351.4747     0.0509     0.0504     0.0512
Add:          11291.9921     0.0426     0.0425     0.0428
Triad:        10819.8273     0.0445     0.0444     0.0449
Sum of a is = 30375000000000.0000
Sum of b is = 6075000000000.00000
Sum of c is = 8100000000000.00000
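
To put a number on the offset sensitivity seen in the array-size-20,000,000 runs above, the short sketch below divides each kernel's rate at offsets 8 and 10 by the better of its two offset-0 rates. The rates are copied from the tables; the dictionary layout and the function name `ratio_to_offset0` are illustrative choices. Bear in mind that the non-zero-offset runs were made on a busy machine, so small differences should not be over-interpreted.

```python
# Rates (MB/s) for the array-size-20,000,000 runs, copied from the
# tables above. Offset 0 was run twice, so it has two entries.
RATES = {
    "Copy":  {0: [12782.6156, 12700.4286], 8: [12235.6900], 10: [12062.7262]},
    "Scale": {0: [6346.3102, 6346.0585],   8: [6338.0142],  10: [6351.4747]},
    "Add":   {0: [11202.1284, 11183.3368], 8: [11437.5581], 10: [11291.9921]},
    "Triad": {0: [7031.2157, 6980.5997],   8: [10168.6298], 10: [10819.8273]},
}


def ratio_to_offset0(kernel, offset):
    """Rate at `offset` divided by the better of the two offset-0 rates."""
    baseline = max(RATES[kernel][0])
    return max(RATES[kernel][offset]) / baseline


for kernel in RATES:
    r8 = ratio_to_offset0(kernel, 8)
    r10 = ratio_to_offset0(kernel, 10)
    print(f"{kernel:6s} offset 8: {r8:.2f}x   offset 10: {r10:.2f}x")
```

With these figures Triad comes out roughly 1.4 to 1.5 times faster at the non-zero offsets, while the other three kernels stay within about 6% of their offset-0 rates.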
This archive was generated by hypermail 2b29 : Tue Apr 18 2000 - 05:23:07 CDT