Here's the log file from the run. Please let me know if it
looks okay. The machine is bsw-1.cray.com; the root password is
monday33 if you need to log in there to check anything. The
STREAM files are in /tmp/stream. I ran it twice, once in
single-user mode and once in multi-user mode, and the numbers
look comparable to me. Here is the multi-cpu log:
Running on 1 cpus
3 trials with N=2e6
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 2000000
Offset = 0
The total memory requirement is 45 MB
You are running each test 10 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 99483 microseconds
(= 99483 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 294.45 0.1087 0.1225 0.1198 0.1198 0.1208
Scale: 302.08 0.1059 0.1167 0.1156 0.1156 0.1167
Add: 317.36 0.1512 0.1515 0.1514 0.1514 0.1514
Triad: 313.72 0.1530 0.1540 0.1537 0.1537 0.1537
-----------------------------------------------------------------------------
All times are
0.1087 0.1059 0.1514 0.1537
0.1208 0.1164 0.1514 0.1540
0.1207 0.1166 0.1515 0.1537
0.1208 0.1167 0.1512 0.1539
0.1209 0.1167 0.1514 0.1538
0.1207 0.1167 0.1515 0.1537
0.1210 0.1166 0.1513 0.1536
0.1208 0.1167 0.1513 0.1540
0.1208 0.1167 0.1513 0.1530
0.1225 0.1167 0.1515 0.1539
-----------------------------------------------------------------------------
Sum of a is = 115330078125.0000
Sum of b is = 23066015625.00000
Sum of c is = 30754687500.00000
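The summary row of each trial can be cross-checked against its "All times are" column. Below is a minimal sketch in Python, assuming the summary statistics are the plain min, max, arithmetic mean, root mean square, and median of the per-iteration times; the function name `summarize` is hypothetical, and the sample values are the Copy column of the trial above.

```python
import math
import statistics

def summarize(times):
    """Recompute the summary statistics for one column of per-iteration times.

    Assumption: RMS means sqrt(mean(t**2)); the log does not define it.
    """
    return {
        "min": min(times),
        "max": max(times),
        "mean": statistics.mean(times),
        "rms": math.sqrt(sum(t * t for t in times) / len(times)),
        "median": statistics.median(times),
    }

# Copy column from the first 1-cpu, N=2e6 trial (seconds).
copy_times = [0.1087, 0.1208, 0.1207, 0.1208, 0.1209,
              0.1207, 0.1210, 0.1208, 0.1208, 0.1225]

for name, value in summarize(copy_times).items():
    print(f"{name:>6}: {value:.4f}")
```

The first iteration is noticeably faster than the rest in the Copy and Scale columns, which is why the minimum sits well below the median for those two kernels.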
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 2000000
Offset = 0
The total memory requirement is 45 MB
You are running each test 10 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 106125 microseconds
(= 106125 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 298.12 0.1073 0.1205 0.1190 0.1190 0.1201
Scale: 301.36 0.1062 0.1168 0.1155 0.1156 0.1166
Add: 317.51 0.1512 0.1517 0.1513 0.1513 0.1512
Triad: 314.03 0.1529 0.1532 0.1530 0.1530 0.1531
-----------------------------------------------------------------------------
All times are
0.1073 0.1062 0.1512 0.1529
0.1205 0.1165 0.1517 0.1532
0.1202 0.1164 0.1513 0.1532
0.1203 0.1165 0.1515 0.1530
0.1200 0.1165 0.1512 0.1531
0.1202 0.1167 0.1512 0.1530
0.1202 0.1165 0.1515 0.1529
0.1204 0.1166 0.1512 0.1530
0.1202 0.1168 0.1513 0.1530
0.1203 0.1165 0.1513 0.1530
-----------------------------------------------------------------------------
Sum of a is = 115330078125.0000
Sum of b is = 23066015625.00000
Sum of c is = 30754687500.00000
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 2000000
Offset = 0
The total memory requirement is 45 MB
You are running each test 10 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 103388 microseconds
(= 103388 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 299.79 0.1067 0.1202 0.1186 0.1187 0.1200
Scale: 307.58 0.1040 0.1162 0.1148 0.1149 0.1160
Add: 316.79 0.1515 0.1518 0.1517 0.1517 0.1516
Triad: 315.79 0.1520 0.1525 0.1522 0.1522 0.1524
-----------------------------------------------------------------------------
All times are
0.1067 0.1040 0.1518 0.1521
0.1202 0.1159 0.1516 0.1521
0.1198 0.1161 0.1516 0.1525
0.1200 0.1160 0.1517 0.1520
0.1202 0.1159 0.1516 0.1524
0.1199 0.1162 0.1515 0.1524
0.1198 0.1160 0.1518 0.1521
0.1201 0.1159 0.1518 0.1523
0.1198 0.1162 0.1517 0.1523
0.1201 0.1161 0.1518 0.1521
-----------------------------------------------------------------------------
Sum of a is = 115330078125.0000
Sum of b is = 23066015625.00000
Sum of c is = 30754687500.00000
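The reported rates appear to follow from the minimum times and the array size. Here is a rough Python sketch, assuming Copy and Scale touch two arrays per element, Add and Triad touch three, a word is 8 bytes (as the log states), and 1 MB = 10^6 bytes for the rate; the function and table names are hypothetical. The memory figure in the log seems to be in units of 2^20 bytes and truncated, which is also an assumption.

```python
BYTES_PER_WORD = 8  # "Assuming 8 bytes per DOUBLE PRECISION word"

# Arrays read or written per element by each kernel (assumed convention).
ARRAYS_TOUCHED = {"copy": 2, "scale": 2, "add": 3, "triad": 3}

def stream_rate_mb_s(kernel, n, min_time_s):
    """Bandwidth in MB/s (1 MB = 1e6 bytes) from the best time of a kernel."""
    bytes_moved = ARRAYS_TOUCHED[kernel] * BYTES_PER_WORD * n
    return bytes_moved / min_time_s / 1e6

def total_memory_mib(n, n_arrays=3):
    """Memory for the three work arrays, in units of 2**20 bytes."""
    return n_arrays * BYTES_PER_WORD * n / 2**20

# Values from the first N=2e6 trial; the log prints times rounded to
# four digits, so the recomputed rates differ slightly from the printed ones.
print(stream_rate_mb_s("copy", 2_000_000, 0.1087))
print(stream_rate_mb_s("add", 2_000_000, 0.1512))
print(total_memory_mib(2_000_000))
```

With these assumptions the recomputed Copy and Add rates land within about 0.1 MB/s of the 294.45 and 317.36 printed in the log.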
3 trials with N=5e6
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 5242880
Offset = 255
The total memory requirement is 120 MB
You are running each test 5 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 288140 microseconds
(= 288140 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 279.58 0.3000 0.3379 0.3302 0.3306 0.3378
Scale: 288.93 0.2903 0.3223 0.3158 0.3160 0.3221
Add: 312.10 0.4032 0.4042 0.4034 0.4034 0.4032
Triad: 272.97 0.4610 0.4618 0.4612 0.4612 0.4618
-----------------------------------------------------------------------------
All times are
0.3000 0.2903 0.4042 0.4610
0.3377 0.3223 0.4033 0.4610
0.3378 0.3221 0.4032 0.4618
0.3379 0.3223 0.4032 0.4611
0.3379 0.3219 0.4033 0.4610
-----------------------------------------------------------------------------
Sum of a is = 303750.0000000000
Sum of b is = 60750.00000000000
Sum of c is = 81000.00000000000
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 5242880
Offset = 255
The total memory requirement is 120 MB
You are running each test 5 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 273045 microseconds
(= 273045 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 281.40 0.2981 0.3358 0.3282 0.3286 0.3357
Scale: 292.31 0.2870 0.3196 0.3130 0.3133 0.3196
Add: 314.48 0.4001 0.4009 0.4003 0.4003 0.4003
Triad: 275.14 0.4573 0.4576 0.4574 0.4574 0.4573
-----------------------------------------------------------------------------
All times are
0.2981 0.2870 0.4009 0.4574
0.3358 0.3196 0.4002 0.4576
0.3357 0.3196 0.4003 0.4573
0.3357 0.3196 0.4003 0.4574
0.3358 0.3195 0.4001 0.4574
-----------------------------------------------------------------------------
Sum of a is = 303750.0000000000
Sum of b is = 60750.00000000000
Sum of c is = 81000.00000000000
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 5242880
Offset = 255
The total memory requirement is 120 MB
You are running each test 5 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 283174 microseconds
(= 283174 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 278.61 0.3011 0.3386 0.3310 0.3313 0.3383
Scale: 287.68 0.2916 0.3231 0.3167 0.3170 0.3231
Add: 312.61 0.4025 0.4036 0.4028 0.4028 0.4027
Triad: 272.80 0.4613 0.4616 0.4614 0.4614 0.4615
-----------------------------------------------------------------------------
All times are
0.3011 0.2916 0.4036 0.4613
0.3386 0.3229 0.4027 0.4614
0.3383 0.3231 0.4027 0.4615
0.3386 0.3230 0.4025 0.4614
0.3383 0.3230 0.4025 0.4616
-----------------------------------------------------------------------------
Sum of a is = 303750.0000000000
Sum of b is = 60750.00000000000
Sum of c is = 81000.00000000000
Running on 2 cpus
3 trials with N=2e6
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 2000000
Offset = 0
The total memory requirement is 45 MB
You are running each test 10 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 67243 microseconds
(= 67243 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 322.23 0.0993 0.1050 0.1043 0.1043 0.1050
Scale: 329.39 0.0971 0.1061 0.1045 0.1045 0.1043
Add: 356.41 0.1347 0.1355 0.1352 0.1352 0.1350
Triad: 355.75 0.1349 0.1360 0.1351 0.1351 0.1355
-----------------------------------------------------------------------------
All times are
0.0993 0.0971 0.1354 0.1351
0.1049 0.1056 0.1355 0.1349
0.1049 0.1057 0.1354 0.1350
0.1049 0.1054 0.1353 0.1349
0.1049 0.1025 0.1347 0.1360
0.1050 0.1061 0.1352 0.1349
0.1048 0.1057 0.1352 0.1350
0.1048 0.1055 0.1352 0.1350
0.1049 0.1054 0.1352 0.1349
0.1048 0.1057 0.1351 0.1350
-----------------------------------------------------------------------------
Sum of a is = 57665039062.50000
Sum of b is = 11533007812.50000
Sum of c is = 15377343750.00000
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 2000000
Offset = 0
The total memory requirement is 45 MB
You are running each test 10 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 67626 microseconds
(= 67626 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 321.64 0.0995 0.1054 0.1047 0.1047 0.1052
Scale: 327.52 0.0977 0.1058 0.1048 0.1048 0.1056
Add: 354.48 0.1354 0.1359 0.1357 0.1357 0.1357
Triad: 357.62 0.1342 0.1348 0.1345 0.1345 0.1344
-----------------------------------------------------------------------------
All times are
0.0995 0.0977 0.1359 0.1347
0.1052 0.1055 0.1359 0.1344
0.1052 0.1056 0.1355 0.1346
0.1052 0.1055 0.1357 0.1343
0.1052 0.1057 0.1356 0.1347
0.1052 0.1055 0.1358 0.1342
0.1052 0.1057 0.1355 0.1348
0.1053 0.1054 0.1358 0.1342
0.1053 0.1058 0.1354 0.1346
0.1054 0.1055 0.1358 0.1347
-----------------------------------------------------------------------------
Sum of a is = 57665039062.50000
Sum of b is = 11533007812.50000
Sum of c is = 15377343750.00000
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 2000000
Offset = 0
The total memory requirement is 45 MB
You are running each test 10 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 63427 microseconds
(= 63427 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 326.64 0.0980 0.1050 0.1042 0.1043 0.1049
Scale: 332.20 0.0963 0.1056 0.1046 0.1046 0.1055
Add: 354.67 0.1353 0.1357 0.1355 0.1355 0.1355
Triad: 356.94 0.1345 0.1348 0.1346 0.1346 0.1346
-----------------------------------------------------------------------------
All times are
0.0980 0.0963 0.1357 0.1348
0.1049 0.1055 0.1355 0.1346
0.1049 0.1055 0.1357 0.1346
0.1050 0.1056 0.1354 0.1346
0.1049 0.1054 0.1356 0.1345
0.1049 0.1056 0.1353 0.1346
0.1050 0.1054 0.1357 0.1345
0.1048 0.1056 0.1354 0.1345
0.1049 0.1055 0.1354 0.1346
0.1050 0.1055 0.1356 0.1345
-----------------------------------------------------------------------------
Sum of a is = 57665039062.50000
Sum of b is = 11533007812.50000
Sum of c is = 15377343750.00000
3 trials with N=5e6
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 5242880
Offset = 255
The total memory requirement is 120 MB
You are running each test 5 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 179947 microseconds
(= 179947 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 331.26 0.2532 0.2752 0.2706 0.2708 0.2750
Scale: 332.00 0.2527 0.2776 0.2719 0.2721 0.2765
Add: 356.45 0.3530 0.3544 0.3538 0.3538 0.3544
Triad: 360.48 0.3491 0.3500 0.3496 0.3496 0.3491
-----------------------------------------------------------------------------
All times are
0.2532 0.2527 0.3536 0.3492
0.2748 0.2764 0.3542 0.3500
0.2750 0.2765 0.3544 0.3491
0.2752 0.2765 0.3530 0.3499
0.2749 0.2776 0.3538 0.3496
-----------------------------------------------------------------------------
Sum of a is = 151875.0000000000
Sum of b is = 30375.00000000000
Sum of c is = 40500.00000000000
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 5242880
Offset = 255
The total memory requirement is 120 MB
You are running each test 5 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 173858 microseconds
(= 173858 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 330.73 0.2536 0.2746 0.2703 0.2704 0.2744
Scale: 334.70 0.2506 0.2762 0.2709 0.2711 0.2760
Add: 355.75 0.3537 0.3541 0.3539 0.3539 0.3537
Triad: 360.47 0.3491 0.3492 0.3492 0.3492 0.3491
-----------------------------------------------------------------------------
All times are
0.2536 0.2506 0.3538 0.3492
0.2744 0.2760 0.3540 0.3491
0.2744 0.2760 0.3537 0.3491
0.2746 0.2758 0.3541 0.3492
0.2744 0.2762 0.3540 0.3492
-----------------------------------------------------------------------------
Sum of a is = 151875.0000000000
Sum of b is = 30375.00000000000
Sum of c is = 40500.00000000000
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 5242880
Offset = 255
The total memory requirement is 120 MB
You are running each test 5 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 176200 microseconds
(= 176200 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 332.77 0.2521 0.2747 0.2699 0.2700 0.2747
Scale: 333.43 0.2516 0.2765 0.2714 0.2716 0.2765
Add: 355.72 0.3537 0.3540 0.3539 0.3539 0.3537
Triad: 369.26 0.3408 0.3497 0.3477 0.3478 0.3497
-----------------------------------------------------------------------------
All times are
0.2521 0.2516 0.3540 0.3494
0.2741 0.2763 0.3540 0.3494
0.2747 0.2765 0.3537 0.3497
0.2743 0.2765 0.3539 0.3494
0.2742 0.2762 0.3539 0.3408
-----------------------------------------------------------------------------
Sum of a is = 151875.0000000000
Sum of b is = 30375.00000000000
Sum of c is = 40500.00000000000
Running on 4 cpus
3 trials with N=2e6
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 2000000
Offset = 0
The total memory requirement is 45 MB
You are running each test 10 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 37289 microseconds
(= 37289 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 609.11 0.0525 0.0707 0.0563 0.0567 0.0527
Scale: 624.52 0.0512 0.0702 0.0569 0.0572 0.0537
Add: 676.15 0.0710 0.0916 0.0752 0.0757 0.0713
Triad: 703.82 0.0682 0.0892 0.0727 0.0731 0.0687
-----------------------------------------------------------------------------
All times are
0.0545 0.0512 0.0712 0.0683
0.0527 0.0557 0.0710 0.0688
0.0526 0.0539 0.0713 0.0686
0.0527 0.0538 0.0710 0.0682
0.0528 0.0537 0.0710 0.0686
0.0526 0.0538 0.0716 0.0687
0.0526 0.0539 0.0711 0.0686
0.0525 0.0538 0.0712 0.0685
0.0694 0.0688 0.0916 0.0892
0.0707 0.0702 0.0915 0.0892
-----------------------------------------------------------------------------
Sum of a is = 38443378596.67969
Sum of b is = 7688675719.335938
Sum of c is = 10251567625.78125
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 2000000
Offset = 0
The total memory requirement is 45 MB
You are running each test 10 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 36693 microseconds
(= 36693 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 590.64 0.0542 0.0708 0.0564 0.0566 0.0544
Scale: 629.56 0.0508 0.0714 0.0575 0.0578 0.0554
Add: 654.40 0.0733 0.0940 0.0775 0.0778 0.0735
Triad: 675.10 0.0711 0.0940 0.0755 0.0759 0.0715
-----------------------------------------------------------------------------
All times are
0.0552 0.0508 0.0737 0.0712
0.0546 0.0555 0.0733 0.0717
0.0564 0.0553 0.0735 0.0712
0.0544 0.0553 0.0742 0.0711
0.0544 0.0555 0.0735 0.0717
0.0543 0.0554 0.0734 0.0712
0.0542 0.0554 0.0738 0.0711
0.0547 0.0560 0.0735 0.0716
0.0547 0.0647 0.0940 0.0940
0.0708 0.0714 0.0916 0.0898
-----------------------------------------------------------------------------
Sum of a is = 38443378596.67969
Sum of b is = 7688675719.335938
Sum of c is = 10251567625.78125
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 2000000
Offset = 0
The total memory requirement is 45 MB
You are running each test 10 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 36638 microseconds
(= 36638 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 581.55 0.0550 0.0709 0.0583 0.0585 0.0551
Scale: 613.44 0.0522 0.0707 0.0590 0.0592 0.0592
Add: 648.45 0.0740 0.0928 0.0780 0.0783 0.0741
Triad: 678.33 0.0708 0.0917 0.0751 0.0755 0.0711
-----------------------------------------------------------------------------
All times are
0.0565 0.0522 0.0749 0.0710
0.0553 0.0574 0.0742 0.0710
0.0558 0.0566 0.0748 0.0715
0.0552 0.0562 0.0748 0.0709
0.0550 0.0603 0.0740 0.0710
0.0553 0.0582 0.0742 0.0713
0.0579 0.0562 0.0746 0.0708
0.0559 0.0565 0.0741 0.0724
0.0651 0.0658 0.0928 0.0917
0.0709 0.0707 0.0915 0.0892
-----------------------------------------------------------------------------
Sum of a is = 38443378596.67969
Sum of b is = 7688675719.335938
Sum of c is = 10251567625.78125
3 trials with N=5e6
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 5242880
Offset = 255
The total memory requirement is 120 MB
You are running each test 5 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 90695 microseconds
(= 90695 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 605.41 0.1386 0.1856 0.1592 0.1606 0.1434
Scale: 641.87 0.1307 0.1853 0.1560 0.1575 0.1425
Add: 660.28 0.1906 0.2410 0.2205 0.2218 0.2410
Triad: 691.04 0.1821 0.2340 0.2128 0.2143 0.2340
-----------------------------------------------------------------------------
All times are
0.1386 0.1307 0.1907 0.1821
0.1428 0.1438 0.1906 0.1821
0.1434 0.1425 0.2410 0.2340
0.1856 0.1778 0.2409 0.2330
0.1856 0.1853 0.2393 0.2328
-----------------------------------------------------------------------------
Sum of a is = 101250.0193119049
Sum of b is = 20250.00386238098
Sum of c is = 27000.00514984131
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 5242880
Offset = 255
The total memory requirement is 120 MB
You are running each test 5 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 89745 microseconds
(= 89745 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 632.41 0.1326 0.1867 0.1477 0.1491 0.1399
Scale: 662.86 0.1266 0.1860 0.1457 0.1471 0.1390
Add: 695.31 0.1810 0.2402 0.1931 0.1945 0.1812
Triad: 721.73 0.1743 0.2351 0.1980 0.2001 0.1743
-----------------------------------------------------------------------------
All times are
0.1326 0.1266 0.1810 0.1750
0.1399 0.1387 0.1813 0.1748
0.1399 0.1390 0.1812 0.1743
0.1396 0.1381 0.1818 0.2308
0.1867 0.1860 0.2402 0.2351
-----------------------------------------------------------------------------
Sum of a is = 101250.0193119049
Sum of b is = 20250.00386238098
Sum of c is = 27000.00514984131
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 5242880
Offset = 255
The total memory requirement is 120 MB
You are running each test 5 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 99430 microseconds
(= 99430 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 609.53 0.1376 0.1850 0.1579 0.1592 0.1428
Scale: 612.42 0.1370 0.1862 0.1599 0.1612 0.1471
Add: 651.75 0.1931 0.2515 0.2143 0.2158 0.1931
Triad: 681.27 0.1847 0.2362 0.2055 0.2070 0.1862
-----------------------------------------------------------------------------
All times are
0.1376 0.1370 0.1936 0.1851
0.1437 0.1471 0.1931 0.1847
0.1428 0.1471 0.1931 0.1862
0.1804 0.1821 0.2515 0.2362
0.1850 0.1862 0.2401 0.2353
-----------------------------------------------------------------------------
Sum of a is = 101250.0193119049
Sum of b is = 20250.00386238098
Sum of c is = 27000.00514984131
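To compare the 1, 2, and 4 cpu runs above, one option is to take the best Triad rate of the three N=2e6 trials for each cpu count and compute speedup and parallel efficiency relative to 1 cpu. This is only a sketch: picking the best of three trials is my choice, not something the log prescribes, and the helper names are hypothetical.

```python
# Best Triad rate (MB/s) over the three N=2e6 trials, copied from the log.
best_triad = {
    1: max(313.72, 314.03, 315.79),
    2: max(355.75, 357.62, 356.94),
    4: max(703.82, 675.10, 678.33),
}

def speedup(rate, baseline_rate):
    """Ratio of a measured rate to the 1-cpu baseline rate."""
    return rate / baseline_rate

def efficiency(rate, baseline_rate, cpus):
    """Speedup divided by the number of cpus used."""
    return speedup(rate, baseline_rate) / cpus

base = best_triad[1]
for cpus, rate in sorted(best_triad.items()):
    print(f"{cpus} cpus: {rate:7.2f} MB/s, "
          f"speedup {speedup(rate, base):.2f}, "
          f"efficiency {efficiency(rate, base, cpus):.2f}")
```

On these numbers, going from 1 to 2 cpus gains only about 13 percent, while 4 cpus give roughly 2.2 times the 1-cpu rate.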
Now running 2 threads -- 1 per node
3 trials with N=2e6
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 2000000
Offset = 0
The total memory requirement is 45 MB
You are running each test 10 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 37724 microseconds
(= 37724 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 563.74 0.0568 0.0602 0.0598 0.0598 0.0602
Scale: 569.12 0.0562 0.0618 0.0610 0.0611 0.0615
Add: 591.83 0.0811 0.0833 0.0814 0.0814 0.0812
Triad: 588.59 0.0816 0.0819 0.0817 0.0817 0.0817
-----------------------------------------------------------------------------
All times are
0.0568 0.0562 0.0812 0.0818
0.0602 0.0615 0.0812 0.0817
0.0601 0.0618 0.0812 0.0819
0.0601 0.0616 0.0812 0.0817
0.0602 0.0616 0.0812 0.0817
0.0602 0.0615 0.0813 0.0817
0.0602 0.0615 0.0811 0.0819
0.0602 0.0615 0.0811 0.0816
0.0601 0.0615 0.0815 0.0816
0.0602 0.0615 0.0833 0.0817
-----------------------------------------------------------------------------
Sum of a is = 57665039062.50000
Sum of b is = 11533007812.50000
Sum of c is = 15377343750.00000
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 2000000
Offset = 0
The total memory requirement is 45 MB
You are running each test 10 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 40005 microseconds
(= 40005 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 572.21 0.0559 0.0603 0.0598 0.0598 0.0602
Scale: 577.41 0.0554 0.0612 0.0605 0.0606 0.0611
Add: 592.53 0.0810 0.0814 0.0811 0.0811 0.0811
Triad: 598.38 0.0802 0.0805 0.0803 0.0803 0.0803
-----------------------------------------------------------------------------
All times are
0.0559 0.0554 0.0810 0.0804
0.0603 0.0611 0.0811 0.0803
0.0602 0.0611 0.0813 0.0803
0.0601 0.0612 0.0811 0.0805
0.0602 0.0612 0.0811 0.0803
0.0602 0.0611 0.0811 0.0803
0.0602 0.0611 0.0814 0.0804
0.0603 0.0611 0.0812 0.0802
0.0602 0.0611 0.0810 0.0804
0.0603 0.0610 0.0811 0.0803
-----------------------------------------------------------------------------
Sum of a is = 57665039062.50000
Sum of b is = 11533007812.50000
Sum of c is = 15377343750.00000
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 2000000
Offset = 0
The total memory requirement is 45 MB
You are running each test 10 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 36931 microseconds
(= 36931 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 548.72 0.0583 0.0604 0.0600 0.0600 0.0603
Scale: 582.02 0.0550 0.0616 0.0608 0.0608 0.0614
Add: 591.78 0.0811 0.0815 0.0812 0.0812 0.0814
Triad: 596.88 0.0804 0.0806 0.0805 0.0805 0.0804
-----------------------------------------------------------------------------
All times are
0.0583 0.0550 0.0812 0.0805
0.0601 0.0615 0.0811 0.0805
0.0601 0.0616 0.0812 0.0805
0.0601 0.0614 0.0812 0.0805
0.0601 0.0614 0.0814 0.0804
0.0604 0.0613 0.0815 0.0805
0.0604 0.0613 0.0811 0.0805
0.0601 0.0615 0.0811 0.0805
0.0601 0.0614 0.0812 0.0806
0.0603 0.0614 0.0812 0.0804
-----------------------------------------------------------------------------
Sum of a is = 57665039062.50000
Sum of b is = 11533007812.50000
Sum of c is = 15377343750.00000
3 trials with N=5e6
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 5242880
Offset = 255
The total memory requirement is 120 MB
You are running each test 5 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 132988 microseconds
(= 132988 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 589.98 0.1422 0.1556 0.1527 0.1528 0.1553
Scale: 578.73 0.1449 0.1595 0.1565 0.1566 0.1595
Add: 603.20 0.2086 0.2093 0.2091 0.2091 0.2093
Triad: 600.18 0.2097 0.2099 0.2098 0.2098 0.2099
-----------------------------------------------------------------------------
All times are
0.1422 0.1449 0.2086 0.2099
0.1552 0.1595 0.2093 0.2098
0.1553 0.1595 0.2093 0.2099
0.1552 0.1593 0.2093 0.2099
0.1556 0.1591 0.2092 0.2097
-----------------------------------------------------------------------------
Sum of a is = 151875.0000000000
Sum of b is = 30375.00000000000
Sum of c is = 40500.00000000000
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 5242880
Offset = 255
The total memory requirement is 120 MB
You are running each test 5 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 129742 microseconds
(= 129742 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 584.43 0.1435 0.3380 0.2620 0.2776 0.3345
Scale: 592.58 0.1416 0.3224 0.2519 0.2654 0.3148
Add: 612.03 0.2056 0.4013 0.2851 0.3004 0.4009
Triad: 607.11 0.2073 0.4597 0.3101 0.3332 0.4597
-----------------------------------------------------------------------------
All times are
0.1435 0.1416 0.2056 0.2073
0.1561 0.1583 0.2057 0.2091
0.3345 0.3148 0.4009 0.4597
0.3380 0.3223 0.4013 0.4591
0.3377 0.3224 0.2121 0.2152
-----------------------------------------------------------------------------
Sum of a is = 151875.0000000000
Sum of b is = 30375.00000000000
Sum of c is = 40500.00000000000
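A note on the trial just above: three of its five iterations took roughly twice as long as the others, yet the reported rates are in line with the neighbouring trials because only the best time is used. The sketch below recomputes min, max and mean from the "All times are" rows of that trial to show how far the mean drifts while the minimum stays put. The names (rows, columns, summarize) are illustrative only and are not part of the stream code.

```python
from statistics import mean

# Per-iteration times in seconds, copied from the "All times are" block of
# the trial above. Column order follows the log: Copy, Scale, Add, Triad.
rows = [
    (0.1435, 0.1416, 0.2056, 0.2073),
    (0.1561, 0.1583, 0.2057, 0.2091),
    (0.3345, 0.3148, 0.4009, 0.4597),
    (0.3380, 0.3223, 0.4013, 0.4591),
    (0.3377, 0.3224, 0.2121, 0.2152),
]
kernels = ("Copy", "Scale", "Add", "Triad")

# Transpose the rows so that each kernel maps to its own tuple of times.
columns = dict(zip(kernels, zip(*rows)))


def summarize(times):
    """Return the min, max and mean of one kernel's iteration times."""
    return {"min": min(times), "max": max(times), "mean": mean(times)}


for kernel in kernels:
    s = summarize(columns[kernel])
    print(f"{kernel:6s} min={s['min']:.4f} max={s['max']:.4f} "
          f"mean={s['mean']:.4f} max/min={s['max'] / s['min']:.2f}")
```

The slow iterations more than double the max/min ratio for every kernel, which is consistent with something else competing for the machine during that trial; the log itself does not say what.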
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 5242880
Offset = 255
The total memory requirement is 120 MB
You are running each test 5 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 134463 microseconds
(= 134463 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 594.46 0.1411 0.1557 0.1527 0.1528 0.1555
Scale: 588.73 0.1425 0.1596 0.1559 0.1561 0.1594
Add: 604.24 0.2082 0.2092 0.2085 0.2085 0.2084
Triad: 601.58 0.2092 0.2100 0.2095 0.2095 0.2095
-----------------------------------------------------------------------------
All times are
0.1411 0.1425 0.2084 0.2092
0.1557 0.1596 0.2092 0.2100
0.1555 0.1594 0.2084 0.2095
0.1556 0.1590 0.2084 0.2093
0.1554 0.1590 0.2082 0.2094
-----------------------------------------------------------------------------
Sum of a is = 151875.0000000000
Sum of b is = 30375.00000000000
Sum of c is = 40500.00000000000
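For anyone checking the numbers by hand, the Rate column appears to be the bytes moved divided by the minimum time. The sketch below assumes 8-byte words (as the log states), that Copy and Scale touch two arrays and Add and Triad touch three, and that 1 MB means 10^6 bytes; that last assumption is inferred from the figures, not stated in the log. The function name stream_rate_mb_s is made up for illustration. Because the min times are printed to only four decimals, the results agree with the log to within a fraction of a MB/s, not exactly.

```python
# Assumed bytes per array element, matching the log line
# "Assuming 8 bytes per DOUBLE PRECISION word".
WORD_BYTES = 8

# Assumed number of arrays read or written per element by each kernel.
ARRAYS_TOUCHED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}


def stream_rate_mb_s(kernel, n, min_time_s):
    """Estimate bandwidth in MB/s (1 MB = 1e6 bytes) from the best time."""
    bytes_moved = ARRAYS_TOUCHED[kernel] * WORD_BYTES * n
    return bytes_moved / min_time_s / 1e6


# Array size and min times from the first "Array size = 5242880" trial.
n = 5242880
min_times = {"Copy": 0.1422, "Scale": 0.1449, "Add": 0.2086, "Triad": 0.2097}

for kernel, t in min_times.items():
    print(f"{kernel:6s} {stream_rate_mb_s(kernel, n, t):8.2f} MB/s")
```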