> I don't know the details. I know that there is a barrier tree in
> the router network and that the design came from the T3E folks
> in Chippewa Falls.
I don't have time to recheck, but this sounds more like the faster T3D
barrier tree.
> >PS> Have you had any IBM formal submission of STREAM numbers for Model
> > 260? I wouldn't want to submit mine.
>
> No word from IBM. I have gotten numbers from a number of customers,
> and they are all in the 550-800 MB/s range. I may put them in the
> standard table soon, just to try to get IBM's attention....
Well, in that case I'm not breaking any embarrassing secret.
These are the best results I managed to get, compiling with flags for
Power3 and using a high-resolution PowerPC timer:
Compiler: xlf 6.1.0.0 and also 5.1.1.0
Flags:    -O3 -qarch=pwr3 -qtune=pwr3

ncpus    COPY   SCALE     ADD   TRIAD   (MB/s)
    1   548.9   545.3   766.0   767.6
Using the standard timer may actually give slightly higher numbers, but
each measurement then spans only a very small number of timer ticks.
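
To make the tick concern concrete, here is a minimal C sketch, loosely in the spirit of the granularity check in the STREAM source. It is an illustration only: clock_gettime() is an assumed stand-in for whichever timer is being tested (it is not the PowerPC timer used above), and the function names are hypothetical.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <time.h>

/* Wall-clock time in seconds. Assumption: the POSIX monotonic clock
   stands in for the timer under test. */
double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + 1.0e-9 * (double)ts.tv_nsec;
}

/* Estimate the smallest observable timer increment, in seconds:
   spin until the clock value changes, repeat, and keep the minimum step. */
double timer_granularity(int samples)
{
    assert(samples > 0);
    double min_step = 1.0e30;
    for (int i = 0; i < samples; i++) {
        double t0 = now_seconds();
        double t1;
        do {
            t1 = now_seconds();
        } while (t1 <= t0);
        if (t1 - t0 < min_step)
            min_step = t1 - t0;
    }
    return min_step;
}

/* How many timer increments one timed kernel run covers. */
double ticks_in_interval(double elapsed, double granularity)
{
    return elapsed / granularity;
}
```

If ticks_in_interval() comes out as only a handful for a single kernel pass, the quantization error of the timer is a sizeable fraction of the measured time, and the resulting MB/s figure should be read with that in mind.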
Following the suggestions in "RS/6000 Scientific and Technical
Computing: POWER3 Introduction and Tuning Guide" (which describe how to
tune BLAS/ESSL dcopy()), I transformed the STREAM source code for COPY
and managed to get approximately 862.3 MB/s of user-visible COPY
bandwidth(*). This is still below IBM's claims. In any case, the
standard STREAM results are supposed to be attainable without
hand-tuning, *and* the technique used is so simple that the compiler
should have it built in as a special case, given how frequently codes
copy vectors.
(*) I had to compile with -O2 instead of -O3 as -O3 negated the
effects of the hand-tuning.
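
For reference, this is roughly how a COPY figure like the ones above is derived. It is a sketch in C with hypothetical function names; it shows only the plain, untuned kernel and the byte accounting, and does not reproduce the hand-tuning from the tuning guide.

```c
#include <stddef.h>

/* Plain COPY kernel as STREAM defines it: c[i] = a[i]. No hand-tuning. */
void stream_copy(double *c, const double *a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        c[i] = a[i];
}

/* User-visible COPY bandwidth in MB/s (1 MB = 10^6 bytes).
   STREAM counts one 8-byte read and one 8-byte write per element,
   i.e. only the traffic the program explicitly asks for. */
double copy_bandwidth_mbs(size_t n, double seconds)
{
    double bytes = 2.0 * (double)sizeof(double) * (double)n;
    return 1.0e-6 * bytes / seconds;
}
```

Because only the explicit read and write are counted, any extra memory traffic the hardware generates on stores does not show up in the reported number.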
Constantinos Evangelinos
Center for Fluid Mechanics
Brown University/Division of Applied Mathematics
This archive was generated by hypermail 2b29 : Tue Apr 18 2000 - 05:23:08 CDT