[I hope you haven't made your last posting on this subject, since
there might well be other contributors; I got back as quickly as I
could.]
Here are the results of your benchmark for two Stardent machines,
the Vistra 800b and the ST2000.
(This is the double-precision benchmark in all cases.)
1) Vistra 800b (i860-based) Compiler: Portland Group FTN, Rev 1.4 (beta)
Compiler output:
f77 -O4 -Mvect -Mbeta -Mx,0,2 -o mcc2 mcc2.f
PGFTN-I-Beta Release Optimizations Activated
Vect: streaming data and stripmining loop at line 106. strip size = 252.
Vect: loop at line 106 replaced by call to __add8s.
Vect: streaming data and stripmining loop at line 99. strip size = 252.
Vect: loop at line 99 replaced by call to __add8s.
Vect: streaming data and stripmining loop at line 92. strip size = 504.
Vect: streaming data and stripmining loop at line 85. strip size = 504.
Vect: streaming data and stripmining loop at line 69. strip size = 252.
# SW pipelined loop w/ 23 cycles and 2 columns w/ cnt 4 gend for line 115
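
[A note on the "stripmining" messages above: strip-mining splits one long loop into fixed-size chunks so that each chunk can go to a vector routine or stay resident in fast storage. The sketch below is only an illustration of that idea and is not the code PGFTN generates. The function names are hypothetical, the log does not say why 252 or 504 was chosen, and treating __add8s as a vector add routine is an assumption based on its name.]

```python
def strip_bounds(n, strip):
    """Return (start, stop) index pairs covering range(n) in chunks of `strip`."""
    return [(s, min(s + strip, n)) for s in range(0, n, strip)]


def stripmined_add(a, b, strip=252):
    """Element-wise a + b, processed one strip at a time.

    The per-strip list comprehension stands in for a call to a vector
    library routine (assumed here to be what __add8s does in the log).
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    c = [0.0] * len(a)
    for start, stop in strip_bounds(len(a), strip):
        c[start:stop] = [x + y for x, y in zip(a[start:stop], b[start:stop])]
    return c


if __name__ == "__main__":
    bounds = strip_bounds(300000, 252)
    # Number of strips, and the short final strip that handles the remainder.
    print(len(bounds), bounds[-1])
```

[With 300000 elements and a strip size of 252, the last strip is shorter than the rest, so the stop index has to be clamped to the array length.]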
Linking:
Runtime output:
--------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLEPRECISION word
--------------------------------------
Timing calibration ; time = 99.99999627470970 hundredths of a second
Increase the size of the arrays if this is <30
and your clock precision is =<1/100 second
---------------------------------------------------
Function      Rate (MB/s)   RMS time   Min time   Max time
Assignment:      160.0002     0.0373     0.0300     0.0400
Scaling   :      160.0014     0.0354     0.0300     0.0400
Summing   :      120.0001     0.0652     0.0600     0.0700
SAXPYing  :      120.0001     0.0652     0.0600     0.0700
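
[The rates above appear to follow directly from the minimum times. The sketch below reproduces them under three assumptions that are not spelled out in the output: the arrays hold 300000 elements (the loop bound shown in the ST2000 summary), Assignment and Scaling touch two arrays per iteration while Summing and SAXPYing touch three, and 1 MB means 10^6 bytes. The names are hypothetical.]

```python
N = 300000            # assumed array length, from the ST2000 loop bounds
BYTES_PER_WORD = 8    # "Assuming 8 bytes per DOUBLEPRECISION word"

# Assumed number of arrays read or written per loop iteration.
ARRAYS_TOUCHED = {
    "Assignment": 2,
    "Scaling": 2,
    "Summing": 3,
    "SAXPYing": 3,
}


def rate_mb_per_s(kernel, min_time_s, n=N):
    """Bytes moved by one pass of `kernel`, divided by its minimum time."""
    bytes_moved = ARRAYS_TOUCHED[kernel] * BYTES_PER_WORD * n
    return bytes_moved / min_time_s / 1.0e6


if __name__ == "__main__":
    for kernel, t in [("Assignment", 0.03), ("Summing", 0.06)]:
        print(f"{kernel:10s} {rate_mb_per_s(kernel, t):9.4f} MB/s")
```

[With minimum times of 0.03 s and 0.06 s this gives 160 and 120 MB/s, which agrees with the table to within the rounding of the printed times. It also shows that a clock precision of 1/100 second limits the Vistra figures to a coarse set of possible values.]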
2) Stardent ST2000 Compiler: Stardent f77 compiler (Rev 2.3)
(second() modified for Stellix) (vector parallel)
Compiler output:
f77 -O3 -o mcc2 mcc2_gs.f
"mcc2_gs.f", line 125: Advisory: argument variable 'RMSTIME' may inhibit
vectorization.
"mcc2_gs.f", line 125: Advisory: argument variable 'MINTIME' may inhibit
vectorization.
"mcc2_gs.f", line 125: Advisory: argument variable 'MAXTIME' may inhibit
vectorization.
"mcc2_gs.f", line 69: Loop (J-loop) fully parallelized
"mcc2_gs.f", line 69: Loop (J-loop) fully vectorized
"mcc2_gs.f", line 82: Loop (K-loop) contains a subroutine or function
call or character expressions
"mcc2_gs.f", line 82: Loop (K-loop) not vectorized
"mcc2_gs.f", line 85: Loop (J-loop) fully parallelized
"mcc2_gs.f", line 85: Loop (J-loop) fully vectorized
"mcc2_gs.f", line 92: Loop (J-loop) fully parallelized
"mcc2_gs.f", line 92: Loop (J-loop) fully vectorized
"mcc2_gs.f", line 99: Loop (J-loop) fully parallelized
"mcc2_gs.f", line 99: Loop (J-loop) fully vectorized
"mcc2_gs.f", line 106: Loop (J-loop) fully parallelized
"mcc2_gs.f", line 106: Loop (J-loop) fully vectorized
"mcc2_gs.f", line 115: Loop (J-loop) fully vectorized
"mcc2_gs.f", line 122: Loop (J-loop) performs I/O or contains character
expressions
"mcc2_gs.f", line 122: Loop (J-loop) not vectorized
Vectorization-Parallelization Summary for Routine STREAM
Line  Index  Label  Start    Stop  Step  Vec.  Par.  Reason
---------------------------------------------------------------------------
  69  J         10      1  300000     1  FULL  FULL
  82  K         60      1      10     1  None  None  Library or function call
  85  J         20      1  300000     1  FULL  FULL
  92  J         30      1  300000     1  FULL  FULL
  99  J         40      1  300000     1  FULL  FULL
 106  J         50      1  300000     1  FULL  FULL
 114  K         80      1      10     1  None  None  Outer loop of nest
 115  J         70      1       4     1  FULL  None
 122  J         90      1       4     1  None  None  I/O or char expressions
"mcc2_gs.f", line 181: Warning: label '10' defined but not referenced.
"mcc2_gs.f", line 181: Loop (J-loop) fully vectorized
"mcc2_gs.f", line 185: Loop (J-loop) or a contained loop has multiple exits
"mcc2_gs.f", line 185: Loop (J-loop) not vectorized
Vectorization-Parallelization Summary for Routine REALSIZE
Line  Index  Label  Start    Stop  Step  Vec.  Par.  Reason
---------------------------------------------------------------------------
 181  J         20      1      30     1  FULL  None
 185  J         30      1      30     1  None  None  Inner loop: many exits
Runtime output:
--------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLEPRECISION word
--------------------------------------
Timing calibration ; time = 72.9492187500000 hundredths of a second
Increase the size of the arrays if this is <30
and your clock precision is =<1/100 second
---------------------------------------------------
Function      Rate (MB/s)   RMS time   Min time   Max time
Assignment:      163.8400     0.0357     0.0293     0.0498
Scaling   :      163.8400     0.0353     0.0293     0.0400
Summing   :      179.8244     0.0463     0.0400     0.0508
SAXPYing  :      179.8244     0.0502     0.0400     0.0605
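
[The ST2000 figures look quantized by the timer. One reading that fits every number printed above is a clock tick of 1/1024 second; that value is inferred from the output and is not stated anywhere in it. The sketch below checks the inference, assuming 300000-element arrays, two arrays touched by Assignment and Scaling, three by Summing and SAXPYing, and 1 MB = 10^6 bytes. All names are hypothetical.]

```python
from fractions import Fraction

TICK = Fraction(1, 1024)   # assumed timer resolution in seconds (inferred)
N = 300000                 # array length, from the loop bounds in the summary
BYTES_PER_WORD = 8


def ticks(seconds, tick=TICK):
    """Nearest whole number of timer ticks for a printed time in seconds."""
    return round(Fraction(seconds) / tick)


def rate_from_ticks(arrays_touched, n_ticks, n=N, tick=TICK):
    """MB/s for a kernel that touches `arrays_touched` arrays in `n_ticks` ticks."""
    bytes_moved = arrays_touched * BYTES_PER_WORD * n
    return float(Fraction(bytes_moved) / (n_ticks * tick) / 10**6)


if __name__ == "__main__":
    # Printed minimum times: 0.0293 s (Assignment) and 0.0400 s (Summing).
    print(ticks(0.0293), round(rate_from_ticks(2, ticks(0.0293)), 4))  # 30 163.84
    print(ticks(0.0400), round(rate_from_ticks(3, ticks(0.0400)), 4))  # 41 179.8244
```

[Under this assumption the calibration time of 72.94921875 hundredths of a second is exactly 747 ticks, and the 163.8400 and 179.8244 MB/s rates correspond to 30 and 41 ticks respectively.]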