John -- here is the STREAM result for our new Blade product.
Peter Wong will submit the corresponding Linux measurements.
Could you please make this public as soon as possible? Please
tell me ASAP if there are any more details I need to supply.
Thanks,
Carl Ponder, Ph.D.
AIX/Blade Performance
SPEC/HPG Representative
==========================================================================
Here are the details:
Hardware:
  IBM BladeCenter JS22
  Measurements were made with a single blade (2 chips)
  POWER6 processors, 4.0 GHz, 2 cores per chip
  16 GB of Memory per blade: 4 x 4 GB DIMMs, DDR2 667 MHz
  Primary Cache: 64 KB I + 64 KB D on chip per core
  Secondary Cache: 4 MB I+D on chip per core
  L3 Cache: None
  SMT is enabled
OS & Environment:
  IBM AIX 5L V5.3, updated with the 5300-07 Technology Level.
  MEMORY_AFFINITY=MCM
  OMP_NUM_THREADS=8
  XLSMPOPTS=startproc=0:stride=1
C Compile:
  XL C/C++ Enterprise Edition Version 9.0 for AIX, updated to the
  October 2007 PTF Level.
  xlc_r -q64 -O5 -qsmp=omp -qthreaded -bdatapsize:64K -btextpsize:64K
*Result description:*
Computer                               Array size  SMT  Triad score (MB/s)
-------------------------------------  ----------  ---  ------------------
IBM BladeCenter JS22 (4.0 GHz POWER6)  80,000,000  on   15456.5800
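
As a quick sanity check on the array size and thread count, the sketch below recomputes the memory footprint and the number of hardware threads on the blade. It assumes STREAM's usual three arrays of 8-byte doubles and 2-way SMT on POWER6; the variable names are illustrative and are not taken from the benchmark source.

```python
# Sanity check of the configuration (illustrative names, not from stream.c).
ARRAY_SIZE = 80_000_000        # elements per array, from the result table
BYTES_PER_WORD = 8             # double precision, as the STREAM report states
NUM_ARRAYS = 3                 # assumption: STREAM's three arrays a, b, c

total_bytes = NUM_ARRAYS * ARRAY_SIZE * BYTES_PER_WORD
total_mb = total_bytes / 2**20  # the report's "MB" is taken as 2**20 bytes

CHIPS = 2                      # single blade, 2 chips
CORES_PER_CHIP = 2
SMT_THREADS_PER_CORE = 2       # assumption: 2-way SMT on POWER6
hw_threads = CHIPS * CORES_PER_CHIP * SMT_THREADS_PER_CORE

print(f"Total memory required: {total_mb:.1f} MB")
print(f"Hardware threads:      {hw_threads}")
```

Under these assumptions the footprint matches the "Total memory required" line in the full report, and the hardware thread count matches OMP_NUM_THREADS=8.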
*Full report:*
==========================================================================
-------------------------------------------------------------
STREAM version $Revision: 5.8 $
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 80000000, Offset = 0
Total memory required = 1831.1 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Number of Threads requested = 8
-------------------------------------------------------------
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 86977 microseconds.
(= 86977 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function      Rate (MB/s)   Avg time     Min time     Max time
Copy:       13606.5842       0.0942       0.0941       0.0942
Scale:      13589.1228       0.0943       0.0942       0.0943
Add:        15416.3682       0.1246       0.1245       0.1247
Triad:      15456.5800       0.1243       0.1242       0.1245
-------------------------------------------------------------
Solution Validates
-------------------------------------------------------------
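
The reported rates can be cross-checked against the minimum times. The sketch below assumes the usual STREAM byte counting (Copy and Scale touch two arrays per iteration, Add and Triad touch three) and 1 MB = 10^6 bytes for the rate column. Because the printed minimum times are rounded to four decimals, the recomputed figures should agree only to within about 0.1%. The helper name `rate_mb_per_s` is hypothetical.

```python
# Cross-check of the reported STREAM rates from the printed minimum times.
ARRAY_SIZE = 80_000_000
BYTES_PER_WORD = 8

# kernel: (arrays touched per iteration, reported MB/s, reported min time in s)
# The array counts are an assumption based on standard STREAM byte counting.
reported = {
    "Copy":  (2, 13606.5842, 0.0941),
    "Scale": (2, 13589.1228, 0.0942),
    "Add":   (3, 15416.3682, 0.1245),
    "Triad": (3, 15456.5800, 0.1242),
}

def rate_mb_per_s(arrays, min_time):
    """Bytes moved per pass divided by the best time, in 10**6 bytes/s."""
    return 1e-6 * arrays * BYTES_PER_WORD * ARRAY_SIZE / min_time

for name, (arrays, rate, min_time) in reported.items():
    estimate = rate_mb_per_s(arrays, min_time)
    rel_diff = abs(estimate - rate) / rate
    print(f"{name:6s} recomputed {estimate:10.1f} MB/s, "
          f"reported {rate:10.1f} MB/s, diff {rel_diff:.3%}")
```

The small residual differences come from the rounding of the printed times, not from the measurement itself.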
Received on Tue Nov 06 10:36:05 2007