-O2 -fomit-frame-pointer -funroll-loops -malign-double
-O2 -Mvect -p p6
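The way to reset the precision of FPU operations (here, to double precision on Linux) is:
#ifdef __linux__
/* Declared in <fpu_control.h>. Clears the precision- and rounding-control
   bits (8-11) of the default control word, then selects double precision. */
__setfpucw( (unsigned short)((_FPU_DEFAULT & 0xF0FF) | _FPU_DOUBLE) );
#endif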
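The mask arithmetic in that call can be checked on its own, without touching the FPU. The following is a minimal sketch, assuming the usual x87 control-word layout (precision control in bits 8-9, rounding control in bits 10-11). The names `CW_DEFAULT`, `CW_DOUBLE`, `CW_PC_MASK` and `double_precision_cw` are hypothetical stand-ins, and the constant values are assumed to match what glibc defines for x86; they are not taken from the text above.

```c
#include <assert.h>

/* Assumed x87 control-word values (hypothetical names, x86 layout). */
#define CW_DEFAULT 0x037Fu /* assumed default: extended precision, round to nearest */
#define CW_DOUBLE  0x0200u /* precision-control code for a 53-bit mantissa */
#define CW_PC_MASK 0x0300u /* precision-control field, bits 8-9 */

/* Same arithmetic as the __setfpucw() argument: clear bits 8-11
   (precision and rounding control), then set the double-precision code. */
unsigned short double_precision_cw(unsigned short cw)
{
    /* Sanity check: the precision code must lie inside the PC field. */
    assert((CW_DOUBLE & ~CW_PC_MASK) == 0u);
    return (unsigned short)((cw & 0xF0FFu) | CW_DOUBLE);
}
```

Under these assumptions, `double_precision_cw(CW_DEFAULT)` yields `0x027F`: the exception masks in the low bits are left alone, rounding is reset to nearest, and only the precision field changes.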
MACHINE, COMPILER, COMPILATION OPTIONS TIME DATE
--------------------------------------------------------------- ------ -------
DEC AlphaStn 500/500(EV5 500MHz/8MB),f77 4.1 [1] ............... 6.93s 17Jun97
DEC AlphaStn 500/500(EV5 500MHz/8MB),f77 4.1 [2] ............... 7.17s 17Jun97
DEC AlphaStn 500/500(EV5 500MHz/8MB),f77 4.1 -fast -speculate all 7.28s 17Jun97
[1] -fast -speculate all -tune host -feedback x.fb
[2] -fast -speculate all -tune host
SGI Origin 2000 R10000 (195MHz/4MB SC), f77 7.1 [^] ........... 7.65s+ 11Dec96
[^] -64 -O3 -mips4 -align64 -r10000 -OPT:Ol=0:IEEE_a=3:ro=3:pad=on
-TENV:X=3 -LNO:fis=0:pref=0:cs2=4M -GCM:ag:ar -CG:body_ins=0
-INLINE:=ON:preemp -TARG:pl=ip27
...
Dell XPS H266 266MHz Pentium II, Linux, Portland pgf77 1.6 [*] 19.9 s 29Sep97
Dell XPS H266 266MHz Pentium II, Linux, Portland pgf77 1.4 [^] 20.4 s 21Aug97
[*] -O2 -Minline -Mvect=smallvect -Munroll -Minfo -tp p6 -pc 64
[^] -O2 -Mvect=altcode -Munroll -tpp6
...
Dell PentiumPro 200MHz/256K, Win95, Intel F77 2.4, -G6 -Qxi ... 22.6 s 22Oct97
...
PentiumPro 200MHz/256K cache, Linux, g77 0.5.18+gcc 2.7.2 [*] . 29.6 s 25Apr97
PentiumPro 200MHz/256K cache, Linux, g77 0.5.18+gcc 2.7.2 [^] . 33.0 s 25Apr97
[*] as [^], plus -fstrength-reduce -fthread-jumps -mno-ieee-fp
[^] -O3 -malign-double -funroll-all-loops -fomit-frame-pointer
-ffast-math
...
Dell XPS,I.PentiumPro200MHz,WindowsNT 3.51,djgpp 2.0+f2c+gcc[^] 31.6 s 10Apr96
[^] gcc -c -O3 -m486 -ffast-math -fstrength-reduce
-fthread-jumps -mno-ieee-fp
...
Pentium 586 200MHz, 256k cache, Win95, MS PwrStatn 4.0, -G5 -Ox 46.7 s 08Nov96
Pentium 586 200MHz, 256k cache, Win95, MS PwrStatn 4.0 ........ 76.6 s 08Nov96
...
PC, PPro 200MHz/256Kb, OS/2 Warp 4.0, g77 0.5.21 [^] .......... 47 s 10Nov97
[^] -O6 -funroll-all-loops -fstrength-reduce -fthread-jumps
-fomit-frame-pointer -ffast-math -malign-double -fno-automatic
...
Micron Pentium Pro 200MHz, Linux, g77 2.7.2.f.1 [^] ........... 49.5 s 18Jul96
Micron Pentium Pro 200MHz, Linux, g77 2.7.2.f.1, -O6 .......... 51.6 s 18Jul96
Micron Pentium Pro 200MHz, Linux, g77 2.7.2.f.1, -O ........... 52.6 s 18Jul96
Micron Pentium Pro 200MHz, Linux, g77 2.7.2.f.1 ............... 83.1 s 18Jul96
[^] -O3 -ffast-math -fstrength-reduce -fthread-jumps -mno-ieee-fp
...
Intel i860 40 MHz (board Microway on a 386 PC) [^] ............ 172 s 13Feb91
[^] analog of the Paragon node.
Nodes 1 2 4 8 16 32 64 128 256 512
HP/735 2.93 1.49 0.78 0.48
Cray-T3E 1.47 0.73 0.37 0.18 0.094 0.054 0.034 0.025
SGI-Origin 1.19 0.17 0.095 0.055
Cray-T3D 0.77 0.40 0.21 0.13 0.086 0.068
Cray-C90 0.51 0.263 0.14 0.074
IBM/SP2 1.93 0.98 0.50 0.26 0.14 0.09
Intel Paragon 14.37 1.92 0.98 0.52 0.29 0.18 0.14 0.098
Intel Delta 0.17 0.13
DEC-alpha 0.96 0.51 0.27 0.14
TERRA 5.03 1.00 0.505 0.27 0.15
IBM/SP1 6.55 3.42 1.78 0.96 0.59 0.28 0.22 0.19
CM-5E 2.24 0.32 0.20
SGI P. Challenge 2.88 1.46 0.78 0.38 0.21
SGI/R10000 1.61 0.82 0.42 0.22
Convex CSPP-1000 2.50 1.27 0.69 0.39 0.22
Intel Gamma 20.72 10.49 5.31 2.66 1.37 0.73 0.42 0.26
CM-5 6.10 0.90 0.63 0.42 0.30
Cray-J90 2.00 1.038 0.565 0.343
PentiumPro 2.04 1.05
HP-J200 2.63 1.39
Cray Y-MP 1.01
Convex - 3 3.11
HP/C180 1.24
IBM-560 7.45
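One way to read the per-machine timings above is to look at the ratio between neighbouring entries. The sketch below assumes that consecutive entries in a row belong to node counts that double each time; the listing has lost its column alignment, so this is only an assumption and will not hold for every row. The function name `doubling_speedups` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* For timings t[0..n-1], assumed to be measured at successively doubled
   node counts, store ratio[i] = t[i] / t[i+1] for i = 0..n-2.
   A ratio of 2.0 would mean perfect scaling for that doubling step.
   Returns the number of ratios written (n-1, or 0 if n < 2). */
size_t doubling_speedups(const double *t, size_t n, double *ratio)
{
    assert(t != NULL && ratio != NULL);
    if (n < 2) {
        return 0;
    }
    for (size_t i = 0; i + 1 < n; ++i) {
        ratio[i] = t[i] / t[i + 1];
    }
    return n - 1;
}
```

Fed the Intel Gamma figures, the ratios fall between roughly 1.6 and 2.0, which is what the doubling assumption would predict. A ratio well above 2, such as between the first two Intel Paragon entries (14.37 and 1.92), suggests that those two entries are not adjacent node counts.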
| Machines (details) | Clock | C4H4 | C4H4 | C4H6 | Ti2H8 | Thymine | Thymine |
|---|---|---|---|---|---|---|---|
| | | MCSCF | MCSCF | GVB | MP2 | RHF | RHF |
| | | grad Full NR | grad SOSCF | hessian | energy | gradient | gradient, Direct |
| SGI 4*R8k (R8000/75MHz, 4 CPUs) | CPU (0.94) | 696 (0.76) | **** (****) | 607 (1.06) | 1224 (0.98) | 1045 (0.97) | **** (****) |
| Pentium 2/266 (266MHz, Gateway(?) with 128MB memory. Red Hat Linux 4.2. SPECfp95 = 7.68) | CPU (0.74) | 608 (0.66) | **** (****) | 530 (0.93) | 745 (0.60) | 845 (0.79) | **** (****) |
| SGI Power Challenge (R10000/194MHz. SPECfp95 = 13.80) | CPU (0.53) | 512 (0.56) | **** (****) | 346 (0.61) | 603 (0.48) | 499 (0.46) | **** (****) |
| DEC 500/500 (AXP5/500MHz. SPECfp95 = 20.40) | CPU (0.39) | 328 (0.36) | **** (****) | 263 (0.46) | 439 (0.35) | 437 (0.41) | **** (****) |

| Operation | MPI | NX |
|---|---|---|
| Initialization | MPI_Init() | - |
| Shutdown | MPI_Finalize() | - |
| Blocking Comm. | MPI_Send(), MPI_Recv() | csend(), crecv() |
| Non-blocking Comm. | MPI_Isend(), MPI_Irecv() | isend(), irecv() |
| Asynchronous Comm. | none | hrecv() |
| Collective Comm. | MPI_Scatter(), MPI_Gather(), MPI_Reduce() | none |