Here's the log file from the run - please let me know if it
looks okay. The machine is - root passwd is
monday33 if you need to log in there to see anything. I have
the stream stuff in /tmp/stream. I ran it twice - once in
single-user mode and once in multi-user mode - but the numbers look
comparable to me. Here is the multi-cpu log:
Running on 1 cpus
3 trials with N=2e6
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
Array size = 2000000
Offset = 0
The total memory requirement is 45 MB
You are running each test 10 times
The *best* time for each test is used
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 99483 microseconds
(= 99483 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 294.45 0.1087 0.1225 0.1198 0.1198 0.1208
Scale: 302.08 0.1059 0.1167 0.1156 0.1156 0.1167
Add: 317.36 0.1512 0.1515 0.1514 0.1514 0.1514
Triad: 313.72 0.1530 0.1540 0.1537 0.1537 0.1537
All times are
0.1087 0.1059 0.1514 0.1537
0.1208 0.1164 0.1514 0.1540
0.1207 0.1166 0.1515 0.1537
0.1208 0.1167 0.1512 0.1539
0.1209 0.1167 0.1514 0.1538
0.1207 0.1167 0.1515 0.1537
0.1210 0.1166 0.1513 0.1536
0.1208 0.1167 0.1513 0.1540
0.1208 0.1167 0.1513 0.1530
0.1225 0.1167 0.1515 0.1539
Sum of a is = 115330078125.0000
Sum of b is = 23066015625.00000
Sum of c is = 30754687500.00000
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
Array size = 2000000
Offset = 0
The total memory requirement is 45 MB
You are running each test 10 times
The *best* time for each test is used
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 106125 microseconds
(= 106125 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 298.12 0.1073 0.1205 0.1190 0.1190 0.1201
Scale: 301.36 0.1062 0.1168 0.1155 0.1156 0.1166
Add: 317.51 0.1512 0.1517 0.1513 0.1513 0.1512
Triad: 314.03 0.1529 0.1532 0.1530 0.1530 0.1531
All times are
0.1073 0.1062 0.1512 0.1529
0.1205 0.1165 0.1517 0.1532
0.1202 0.1164 0.1513 0.1532
0.1203 0.1165 0.1515 0.1530
0.1200 0.1165 0.1512 0.1531
0.1202 0.1167 0.1512 0.1530
0.1202 0.1165 0.1515 0.1529
0.1204 0.1166 0.1512 0.1530
0.1202 0.1168 0.1513 0.1530
0.1203 0.1165 0.1513 0.1530
Sum of a is = 115330078125.0000
Sum of b is = 23066015625.00000
Sum of c is = 30754687500.00000
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
Array size = 2000000
Offset = 0
The total memory requirement is 45 MB
You are running each test 10 times
The *best* time for each test is used
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 103388 microseconds
(= 103388 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 299.79 0.1067 0.1202 0.1186 0.1187 0.1200
Scale: 307.58 0.1040 0.1162 0.1148 0.1149 0.1160
Add: 316.79 0.1515 0.1518 0.1517 0.1517 0.1516
Triad: 315.79 0.1520 0.1525 0.1522 0.1522 0.1524
All times are
0.1067 0.1040 0.1518 0.1521
0.1202 0.1159 0.1516 0.1521
0.1198 0.1161 0.1516 0.1525
0.1200 0.1160 0.1517 0.1520
0.1202 0.1159 0.1516 0.1524
0.1199 0.1162 0.1515 0.1524
0.1198 0.1160 0.1518 0.1521
0.1201 0.1159 0.1518 0.1523
0.1198 0.1162 0.1517 0.1523
0.1201 0.1161 0.1518 0.1521
Sum of a is = 115330078125.0000
Sum of b is = 23066015625.00000
Sum of c is = 30754687500.00000
3 trials with N=5e6
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
Array size = 5242880
Offset = 255
The total memory requirement is 120 MB
You are running each test 5 times
The *best* time for each test is used
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 288140 microseconds
(= 288140 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 279.58 0.3000 0.3379 0.3302 0.3306 0.3378
Scale: 288.93 0.2903 0.3223 0.3158 0.3160 0.3221
Add: 312.10 0.4032 0.4042 0.4034 0.4034 0.4032
Triad: 272.97 0.4610 0.4618 0.4612 0.4612 0.4618
All times are
0.3000 0.2903 0.4042 0.4610
0.3377 0.3223 0.4033 0.4610
0.3378 0.3221 0.4032 0.4618
0.3379 0.3223 0.4032 0.4611
0.3379 0.3219 0.4033 0.4610
Sum of a is = 303750.0000000000
Sum of b is = 60750.00000000000
Sum of c is = 81000.00000000000
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
Array size = 5242880
Offset = 255
The total memory requirement is 120 MB
You are running each test 5 times
The *best* time for each test is used
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 273045 microseconds
(= 273045 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 281.40 0.2981 0.3358 0.3282 0.3286 0.3357
Scale: 292.31 0.2870 0.3196 0.3130 0.3133 0.3196
Add: 314.48 0.4001 0.4009 0.4003 0.4003 0.4003
Triad: 275.14 0.4573 0.4576 0.4574 0.4574 0.4573
All times are
0.2981 0.2870 0.4009 0.4574
0.3358 0.3196 0.4002 0.4576
0.3357 0.3196 0.4003 0.4573
0.3357 0.3196 0.4003 0.4574
0.3358 0.3195 0.4001 0.4574
Sum of a is = 303750.0000000000
Sum of b is = 60750.00000000000
Sum of c is = 81000.00000000000
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
Array size = 5242880
Offset = 255
The total memory requirement is 120 MB
You are running each test 5 times
The *best* time for each test is used
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 283174 microseconds
(= 283174 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 278.61 0.3011 0.3386 0.3310 0.3313 0.3383
Scale: 287.68 0.2916 0.3231 0.3167 0.3170 0.3231
Add: 312.61 0.4025 0.4036 0.4028 0.4028 0.4027
Triad: 272.80 0.4613 0.4616 0.4614 0.4614 0.4615
All times are
0.3011 0.2916 0.4036 0.4613
0.3386 0.3229 0.4027 0.4614
0.3383 0.3231 0.4027 0.4615
0.3386 0.3230 0.4025 0.4614
0.3383 0.3230 0.4025 0.4616
Sum of a is = 303750.0000000000
Sum of b is = 60750.00000000000
Sum of c is = 81000.00000000000
Running on 2 cpus
3 trials with N=2e6
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
Array size = 2000000
Offset = 0
The total memory requirement is 45 MB
You are running each test 10 times
The *best* time for each test is used
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 67243 microseconds
(= 67243 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 322.23 0.0993 0.1050 0.1043 0.1043 0.1050
Scale: 329.39 0.0971 0.1061 0.1045 0.1045 0.1043
Add: 356.41 0.1347 0.1355 0.1352 0.1352 0.1350
Triad: 355.75 0.1349 0.1360 0.1351 0.1351 0.1355
All times are
0.0993 0.0971 0.1354 0.1351
0.1049 0.1056 0.1355 0.1349
0.1049 0.1057 0.1354 0.1350
0.1049 0.1054 0.1353 0.1349
0.1049 0.1025 0.1347 0.1360
0.1050 0.1061 0.1352 0.1349
0.1048 0.1057 0.1352 0.1350
0.1048 0.1055 0.1352 0.1350
0.1049 0.1054 0.1352 0.1349
0.1048 0.1057 0.1351 0.1350
Sum of a is = 57665039062.50000
Sum of b is = 11533007812.50000
Sum of c is = 15377343750.00000
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
Array size = 2000000
Offset = 0
The total memory requirement is 45 MB
You are running each test 10 times
The *best* time for each test is used
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 67626 microseconds
(= 67626 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 321.64 0.0995 0.1054 0.1047 0.1047 0.1052
Scale: 327.52 0.0977 0.1058 0.1048 0.1048 0.1056
Add: 354.48 0.1354 0.1359 0.1357 0.1357 0.1357
Triad: 357.62 0.1342 0.1348 0.1345 0.1345 0.1344
All times are
0.0995 0.0977 0.1359 0.1347
0.1052 0.1055 0.1359 0.1344
0.1052 0.1056 0.1355 0.1346
0.1052 0.1055 0.1357 0.1343
0.1052 0.1057 0.1356 0.1347
0.1052 0.1055 0.1358 0.1342
0.1052 0.1057 0.1355 0.1348
0.1053 0.1054 0.1358 0.1342
0.1053 0.1058 0.1354 0.1346
0.1054 0.1055 0.1358 0.1347
Sum of a is = 57665039062.50000
Sum of b is = 11533007812.50000
Sum of c is = 15377343750.00000
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
Array size = 2000000
Offset = 0
The total memory requirement is 45 MB
You are running each test 10 times
The *best* time for each test is used
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 63427 microseconds
(= 63427 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 326.64 0.0980 0.1050 0.1042 0.1043 0.1049
Scale: 332.20 0.0963 0.1056 0.1046 0.1046 0.1055
Add: 354.67 0.1353 0.1357 0.1355 0.1355 0.1355
Triad: 356.94 0.1345 0.1348 0.1346 0.1346 0.1346
All times are
0.0980 0.0963 0.1357 0.1348
0.1049 0.1055 0.1355 0.1346
0.1049 0.1055 0.1357 0.1346
0.1050 0.1056 0.1354 0.1346
0.1049 0.1054 0.1356 0.1345
0.1049 0.1056 0.1353 0.1346
0.1050 0.1054 0.1357 0.1345
0.1048 0.1056 0.1354 0.1345
0.1049 0.1055 0.1354 0.1346
0.1050 0.1055 0.1356 0.1345
Sum of a is = 57665039062.50000
Sum of b is = 11533007812.50000
Sum of c is = 15377343750.00000
3 trials with N=5e6
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
Array size = 5242880
Offset = 255
The total memory requirement is 120 MB
You are running each test 5 times
The *best* time for each test is used
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 179947 microseconds
(= 179947 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 331.26 0.2532 0.2752 0.2706 0.2708 0.2750
Scale: 332.00 0.2527 0.2776 0.2719 0.2721 0.2765
Add: 356.45 0.3530 0.3544 0.3538 0.3538 0.3544
Triad: 360.48 0.3491 0.3500 0.3496 0.3496 0.3491
All times are
0.2532 0.2527 0.3536 0.3492
0.2748 0.2764 0.3542 0.3500
0.2750 0.2765 0.3544 0.3491
0.2752 0.2765 0.3530 0.3499
0.2749 0.2776 0.3538 0.3496
Sum of a is = 151875.0000000000
Sum of b is = 30375.00000000000
Sum of c is = 40500.00000000000
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
Array size = 5242880
Offset = 255
The total memory requirement is 120 MB
You are running each test 5 times
The *best* time for each test is used
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 173858 microseconds
(= 173858 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 330.73 0.2536 0.2746 0.2703 0.2704 0.2744
Scale: 334.70 0.2506 0.2762 0.2709 0.2711 0.2760
Add: 355.75 0.3537 0.3541 0.3539 0.3539 0.3537
Triad: 360.47 0.3491 0.3492 0.3492 0.3492 0.3491
All times are
0.2536 0.2506 0.3538 0.3492
0.2744 0.2760 0.3540 0.3491
0.2744 0.2760 0.3537 0.3491
0.2746 0.2758 0.3541 0.3492
0.2744 0.2762 0.3540 0.3492
Sum of a is = 151875.0000000000
Sum of b is = 30375.00000000000
Sum of c is = 40500.00000000000
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
Array size = 5242880
Offset = 255
The total memory requirement is 120 MB
You are running each test 5 times
The *best* time for each test is used
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 176200 microseconds
(= 176200 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 332.77 0.2521 0.2747 0.2699 0.2700 0.2747
Scale: 333.43 0.2516 0.2765 0.2714 0.2716 0.2765
Add: 355.72 0.3537 0.3540 0.3539 0.3539 0.3537
Triad: 369.26 0.3408 0.3497 0.3477 0.3478 0.3497
All times are
0.2521 0.2516 0.3540 0.3494
0.2741 0.2763 0.3540 0.3494
0.2747 0.2765 0.3537 0.3497
0.2743 0.2765 0.3539 0.3494
0.2742 0.2762 0.3539 0.3408
Sum of a is = 151875.0000000000
Sum of b is = 30375.00000000000
Sum of c is = 40500.00000000000
Running on 4 cpus
3 trials with N=2e6
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
Array size = 2000000
Offset = 0
The total memory requirement is 45 MB
You are running each test 10 times
The *best* time for each test is used
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 37289 microseconds
(= 37289 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 609.11 0.0525 0.0707 0.0563 0.0567 0.0527
Scale: 624.52 0.0512 0.0702 0.0569 0.0572 0.0537
Add: 676.15 0.0710 0.0916 0.0752 0.0757 0.0713
Triad: 703.82 0.0682 0.0892 0.0727 0.0731 0.0687
All times are
0.0545 0.0512 0.0712 0.0683
0.0527 0.0557 0.0710 0.0688
0.0526 0.0539 0.0713 0.0686
0.0527 0.0538 0.0710 0.0682
0.0528 0.0537 0.0710 0.0686
0.0526 0.0538 0.0716 0.0687
0.0526 0.0539 0.0711 0.0686
0.0525 0.0538 0.0712 0.0685
0.0694 0.0688 0.0916 0.0892
0.0707 0.0702 0.0915 0.0892
Sum of a is = 38443378596.67969
Sum of b is = 7688675719.335938
Sum of c is = 10251567625.78125
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
Array size = 2000000
Offset = 0
The total memory requirement is 45 MB
You are running each test 10 times
The *best* time for each test is used
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 36693 microseconds
(= 36693 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 590.64 0.0542 0.0708 0.0564 0.0566 0.0544
Scale: 629.56 0.0508 0.0714 0.0575 0.0578 0.0554
Add: 654.40 0.0733 0.0940 0.0775 0.0778 0.0735
Triad: 675.10 0.0711 0.0940 0.0755 0.0759 0.0715
All times are
0.0552 0.0508 0.0737 0.0712
0.0546 0.0555 0.0733 0.0717
0.0564 0.0553 0.0735 0.0712
0.0544 0.0553 0.0742 0.0711
0.0544 0.0555 0.0735 0.0717
0.0543 0.0554 0.0734 0.0712
0.0542 0.0554 0.0738 0.0711
0.0547 0.0560 0.0735 0.0716
0.0547 0.0647 0.0940 0.0940
0.0708 0.0714 0.0916 0.0898
Sum of a is = 38443378596.67969
Sum of b is = 7688675719.335938
Sum of c is = 10251567625.78125
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
Array size = 2000000
Offset = 0
The total memory requirement is 45 MB
You are running each test 10 times
The *best* time for each test is used
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 36638 microseconds
(= 36638 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 581.55 0.0550 0.0709 0.0583 0.0585 0.0551
Scale: 613.44 0.0522 0.0707 0.0590 0.0592 0.0592
Add: 648.45 0.0740 0.0928 0.0780 0.0783 0.0741
Triad: 678.33 0.0708 0.0917 0.0751 0.0755 0.0711
All times are
0.0565 0.0522 0.0749 0.0710
0.0553 0.0574 0.0742 0.0710
0.0558 0.0566 0.0748 0.0715
0.0552 0.0562 0.0748 0.0709
0.0550 0.0603 0.0740 0.0710
0.0553 0.0582 0.0742 0.0713
0.0579 0.0562 0.0746 0.0708
0.0559 0.0565 0.0741 0.0724
0.0651 0.0658 0.0928 0.0917
0.0709 0.0707 0.0915 0.0892
Sum of a is = 38443378596.67969
Sum of b is = 7688675719.335938
Sum of c is = 10251567625.78125
3 trials with N=5e6
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
Array size = 5242880
Offset = 255
The total memory requirement is 120 MB
You are running each test 5 times
The *best* time for each test is used
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 90695 microseconds
(= 90695 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 605.41 0.1386 0.1856 0.1592 0.1606 0.1434
Scale: 641.87 0.1307 0.1853 0.1560 0.1575 0.1425
Add: 660.28 0.1906 0.2410 0.2205 0.2218 0.2410
Triad: 691.04 0.1821 0.2340 0.2128 0.2143 0.2340
All times are
0.1386 0.1307 0.1907 0.1821
0.1428 0.1438 0.1906 0.1821
0.1434 0.1425 0.2410 0.2340
0.1856 0.1778 0.2409 0.2330
0.1856 0.1853 0.2393 0.2328
Sum of a is = 101250.0193119049
Sum of b is = 20250.00386238098
Sum of c is = 27000.00514984131
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
Array size = 5242880
Offset = 255
The total memory requirement is 120 MB
You are running each test 5 times
The *best* time for each test is used
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 89745 microseconds
(= 89745 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 632.41 0.1326 0.1867 0.1477 0.1491 0.1399
Scale: 662.86 0.1266 0.1860 0.1457 0.1471 0.1390
Add: 695.31 0.1810 0.2402 0.1931 0.1945 0.1812
Triad: 721.73 0.1743 0.2351 0.1980 0.2001 0.1743
All times are
0.1326 0.1266 0.1810 0.1750
0.1399 0.1387 0.1813 0.1748
0.1399 0.1390 0.1812 0.1743
0.1396 0.1381 0.1818 0.2308
0.1867 0.1860 0.2402 0.2351
Sum of a is = 101250.0193119049
Sum of b is = 20250.00386238098
Sum of c is = 27000.00514984131
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
Array size = 5242880
Offset = 255
The total memory requirement is 120 MB
You are running each test 5 times
The *best* time for each test is used
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 99430 microseconds
(= 99430 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 609.53 0.1376 0.1850 0.1579 0.1592 0.1428
Scale: 612.42 0.1370 0.1862 0.1599 0.1612 0.1471
Add: 651.75 0.1931 0.2515 0.2143 0.2158 0.1931
Triad: 681.27 0.1847 0.2362 0.2055 0.2070 0.1862
All times are
0.1376 0.1370 0.1936 0.1851
0.1437 0.1471 0.1931 0.1847
0.1428 0.1471 0.1931 0.1862
0.1804 0.1821 0.2515 0.2362
0.1850 0.1862 0.2401 0.2353
Sum of a is = 101250.0193119049
Sum of b is = 20250.00386238098
Sum of c is = 27000.00514984131
Now running 2 threads -- 1 per node
3 trials with N=2e6
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
Array size = 2000000
Offset = 0
The total memory requirement is 45 MB
You are running each test 10 times
The *best* time for each test is used
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 37724 microseconds
(= 37724 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 563.74 0.0568 0.0602 0.0598 0.0598 0.0602
Scale: 569.12 0.0562 0.0618 0.0610 0.0611 0.0615
Add: 591.83 0.0811 0.0833 0.0814 0.0814 0.0812
Triad: 588.59 0.0816 0.0819 0.0817 0.0817 0.0817
All times are
0.0568 0.0562 0.0812 0.0818
0.0602 0.0615 0.0812 0.0817
0.0601 0.0618 0.0812 0.0819
0.0601 0.0616 0.0812 0.0817
0.0602 0.0616 0.0812 0.0817
0.0602 0.0615 0.0813 0.0817
0.0602 0.0615 0.0811 0.0819
0.0602 0.0615 0.0811 0.0816
0.0601 0.0615 0.0815 0.0816
0.0602 0.0615 0.0833 0.0817
Sum of a is = 57665039062.50000
Sum of b is = 11533007812.50000
Sum of c is = 15377343750.00000
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
Array size = 2000000
Offset = 0
The total memory requirement is 45 MB
You are running each test 10 times
The *best* time for each test is used
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 40005 microseconds
(= 40005 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 572.21 0.0559 0.0603 0.0598 0.0598 0.0602
Scale: 577.41 0.0554 0.0612 0.0605 0.0606 0.0611
Add: 592.53 0.0810 0.0814 0.0811 0.0811 0.0811
Triad: 598.38 0.0802 0.0805 0.0803 0.0803 0.0803
All times are
0.0559 0.0554 0.0810 0.0804
0.0603 0.0611 0.0811 0.0803
0.0602 0.0611 0.0813 0.0803
0.0601 0.0612 0.0811 0.0805
0.0602 0.0612 0.0811 0.0803
0.0602 0.0611 0.0811 0.0803
0.0602 0.0611 0.0814 0.0804
0.0603 0.0611 0.0812 0.0802
0.0602 0.0611 0.0810 0.0804
0.0603 0.0610 0.0811 0.0803
Sum of a is = 57665039062.50000
Sum of b is = 11533007812.50000
Sum of c is = 15377343750.00000
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
Array size = 2000000
Offset = 0
The total memory requirement is 45 MB
You are running each test 10 times
The *best* time for each test is used
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 36931 microseconds
(= 36931 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 548.72 0.0583 0.0604 0.0600 0.0600 0.0603
Scale: 582.02 0.0550 0.0616 0.0608 0.0608 0.0614
Add: 591.78 0.0811 0.0815 0.0812 0.0812 0.0814
Triad: 596.88 0.0804 0.0806 0.0805 0.0805 0.0804
All times are
0.0583 0.0550 0.0812 0.0805
0.0601 0.0615 0.0811 0.0805
0.0601 0.0616 0.0812 0.0805
0.0601 0.0614 0.0812 0.0805
0.0601 0.0614 0.0814 0.0804
0.0604 0.0613 0.0815 0.0805
0.0604 0.0613 0.0811 0.0805
0.0601 0.0615 0.0811 0.0805
0.0601 0.0614 0.0812 0.0806
0.0603 0.0614 0.0812 0.0804
Sum of a is = 57665039062.50000
Sum of b is = 11533007812.50000
Sum of c is = 15377343750.00000
3 trials with N=5e6
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
Array size = 5242880
Offset = 255
The total memory requirement is 120 MB
You are running each test 5 times
The *best* time for each test is used
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 132988 microseconds
(= 132988 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 589.98 0.1422 0.1556 0.1527 0.1528 0.1553
Scale: 578.73 0.1449 0.1595 0.1565 0.1566 0.1595
Add: 603.20 0.2086 0.2093 0.2091 0.2091 0.2093
Triad: 600.18 0.2097 0.2099 0.2098 0.2098 0.2099
All times are
0.1422 0.1449 0.2086 0.2099
0.1552 0.1595 0.2093 0.2098
0.1553 0.1595 0.2093 0.2099
0.1552 0.1593 0.2093 0.2099
0.1556 0.1591 0.2092 0.2097
Sum of a is = 151875.0000000000
Sum of b is = 30375.00000000000
Sum of c is = 40500.00000000000
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
Array size = 5242880
Offset = 255
The total memory requirement is 120 MB
You are running each test 5 times
The *best* time for each test is used
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 129742 microseconds
(= 129742 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 584.43 0.1435 0.3380 0.2620 0.2776 0.3345
Scale: 592.58 0.1416 0.3224 0.2519 0.2654 0.3148
Add: 612.03 0.2056 0.4013 0.2851 0.3004 0.4009
Triad: 607.11 0.2073 0.4597 0.3101 0.3332 0.4597
All times are
0.1435 0.1416 0.2056 0.2073
0.1561 0.1583 0.2057 0.2091
0.3345 0.3148 0.4009 0.4597
0.3380 0.3223 0.4013 0.4591
0.3377 0.3224 0.2121 0.2152
Sum of a is = 151875.0000000000
Sum of b is = 30375.00000000000
Sum of c is = 40500.00000000000
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
Array size = 5242880
Offset = 255
The total memory requirement is 120 MB
You are running each test 5 times
The *best* time for each test is used
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 134463 microseconds
(= 134463 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
Function Rate (MB/s) Min time Max time Mean time RMS time Median
Copy: 594.46 0.1411 0.1557 0.1527 0.1528 0.1555
Scale: 588.73 0.1425 0.1596 0.1559 0.1561 0.1594
Add: 604.24 0.2082 0.2092 0.2085 0.2085 0.2084
Triad: 601.58 0.2092 0.2100 0.2095 0.2095 0.2095
All times are
0.1411 0.1425 0.2084 0.2092
0.1557 0.1596 0.2092 0.2100
0.1555 0.1594 0.2084 0.2095
0.1556 0.1590 0.2084 0.2093
0.1554 0.1590 0.2082 0.2094
Sum of a is = 151875.0000000000
Sum of b is = 30375.00000000000
Sum of c is = 40500.00000000000
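
For anyone checking whether the numbers look okay, here is a rough sketch of how the Rate (MB/s) column appears to follow from the Min time column. It assumes the usual STREAM byte-counting convention (Copy and Scale touch two arrays per element, Add and Triad touch three) and that MB means 10^6 bytes; neither is stated in the log, but both are consistent with the figures printed above. The function names are made up for illustration. The footprint helper assumes three arrays and reports binary megabytes, which matches the "45 MB" and "120 MB" lines once the fraction is dropped.

```python
# Sanity check of the STREAM log: rate = bytes moved / best (minimum) time.
# Assumptions (not stated in the log): the standard STREAM array counts
# per kernel, and 1 MB = 1e6 bytes for the rate column.

BYTES_PER_WORD = 8  # from the log: "Assuming 8 bytes per DOUBLE PRECISION word"

# Number of arrays read or written per element by each kernel (assumed).
ARRAYS_MOVED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}


def stream_rate_mb_s(kernel, n, min_time_s):
    """Return the bandwidth in MB/s (1e6 bytes) for one kernel."""
    bytes_moved = ARRAYS_MOVED[kernel] * BYTES_PER_WORD * n
    return bytes_moved / min_time_s / 1.0e6


def footprint_mib(n, arrays=3):
    """Return the memory needed for the arrays, in units of 2**20 bytes."""
    return arrays * BYTES_PER_WORD * n / 2**20


if __name__ == "__main__":
    # (kernel, array size, min time) taken from the first 1-cpu trials above.
    samples = [
        ("Copy", 2000000, 0.1087),
        ("Triad", 2000000, 0.1530),
        ("Copy", 5242880, 0.3000),
    ]
    for kernel, n, t in samples:
        rate = stream_rate_mb_s(kernel, n, t)
        print(f"{kernel:6s} N={n:8d} min={t:.4f}s rate={rate:.2f} MB/s")
    print(f"footprint N=2000000: {footprint_mib(2000000):.2f} MiB")
    print(f"footprint N=5242880: {footprint_mib(5242880):.2f} MiB")
```

The recomputed rates land within a fraction of a percent of the logged ones; the small gap is expected because the log prints the minimum time to only four decimal places.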
This archive was generated by hypermail 2b29 : Tue Apr 18 2000 - 05:23:07 CDT