gcc performance regression on md2 from gcc-2.95.3?

gcc performance regression on md2 from gcc-2.95.3?

Dan Kegel-2
I'm looking into http://gcc.gnu.org/PR19923
which claims that gcc-4.0 is slower on 'openssl speed'
than earlier versions.  The only huge regression seems
to be in md2.  Has anyone else looked at this yet?
I imagine it's a gcc problem, but I thought I'd ask
here just in case.

OpenSSL 0.9.7e 25 Oct 2004
built on: Mon May 30 09:18:03 PDT 2005
options:bn(64,32) md2(int) rc4(idx,int) des(ptr,risc1,16,long) aes(partial) blowfish(idx)
compiler: /usr/crosstool/gcc-2.95.3-glibc-2.2.2-hdrs-2.6.11.2/i686-unknown-linux-gnu/bin/i686-unknown-linux-gnu-gcc -fPIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DOPENSSL_NO_KRB5 -DOPENSSL_NO_IDEA -DOPENSSL_NO_MDC2
-DOPENSSL_NO_RC5 -DDSA_PRECOMPUTE -DL_ENDIAN -DTERMIO -O3 -fomit-frame-pointer -mcpu=pentiumpro -Wall -DSHA1_ASM -DMD5_ASM -DRMD160_ASM
available timing options: TIMES TIMEB HZ=100 [sysconf value]
timing function used: times
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md2               2444.10k     5557.76k     8213.50k     9278.81k     9693.87k


OpenSSL 0.9.7e 25 Oct 2004
built on: Mon May 30 09:46:19 PDT 2005
options:bn(64,32) md2(int) rc4(idx,int) des(ptr,risc1,16,long) aes(partial) blowfish(idx)
compiler: /usr/crosstool/gcc-3.4.3-glibc-2.2.2-hdrs-2.6.11.2/i686-unknown-linux-gnu/bin/i686-unknown-linux-gnu-gcc -fPIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DOPENSSL_NO_KRB5 -DOPENSSL_NO_IDEA -DOPENSSL_NO_MDC2
-DOPENSSL_NO_RC5 -DDSA_PRECOMPUTE -DL_ENDIAN -DTERMIO -O3 -fomit-frame-pointer -mcpu=pentiumpro -Wall -DSHA1_ASM -DMD5_ASM -DRMD160_ASM
available timing options: TIMES TIMEB HZ=100 [sysconf value]
timing function used: times
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md2               2272.52k     5173.93k     7487.15k     8409.09k     9063.08k


OpenSSL 0.9.7e 25 Oct 2004
built on: Mon May 30 10:04:02 PDT 2005
options:bn(64,32) md2(int) rc4(idx,int) des(ptr,risc1,16,long) aes(partial) blowfish(idx)
compiler: /usr/crosstool/gcc-4.0.0-glibc-2.2.2-hdrs-2.6.11.2/i686-unknown-linux-gnu/bin/i686-unknown-linux-gnu-gcc -fPIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DOPENSSL_NO_KRB5 -DOPENSSL_NO_IDEA -DOPENSSL_NO_MDC2
-DOPENSSL_NO_RC5 -DDSA_PRECOMPUTE -DL_ENDIAN -DTERMIO -O3 -fomit-frame-pointer -mcpu=pentiumpro -Wall -DSHA1_ASM -DMD5_ASM -DRMD160_ASM
available timing options: TIMES TIMEB HZ=100 [sysconf value]
timing function used: times
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md2               1805.88k     3974.55k     5463.89k     6073.00k     5789.01k
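
To put numbers on the regression, the md2 rows above can be divided column by column. The snippet below is only a sketch and not part of the original report: the dictionary and function names are illustrative, and the values are copied from the three 'openssl speed' runs quoted above.

```python
# Throughput in 1000s of bytes per second, copied from the md2 rows above.
BLOCK_SIZES = [16, 64, 256, 1024, 8192]
MD2_SPEED = {
    "gcc-2.95.3": [2444.10, 5557.76, 8213.50, 9278.81, 9693.87],
    "gcc-3.4.3": [2272.52, 5173.93, 7487.15, 8409.09, 9063.08],
    "gcc-4.0.0": [1805.88, 3974.55, 5463.89, 6073.00, 5789.01],
}


def slowdown_percent(baseline, candidate):
    """Return the percentage of throughput lost per block size, relative to baseline."""
    return [100.0 * (1.0 - c / b) for b, c in zip(baseline, candidate)]


if __name__ == "__main__":
    base = MD2_SPEED["gcc-2.95.3"]
    for name in ("gcc-3.4.3", "gcc-4.0.0"):
        losses = slowdown_percent(base, MD2_SPEED[name])
        row = "  ".join(
            f"{size}B: {loss:5.1f}%" for size, loss in zip(BLOCK_SIZES, losses)
        )
        print(f"{name} vs gcc-2.95.3  {row}")
```

By this measure gcc-3.4.3 loses roughly 6 to 10 percent against gcc-2.95.3, while gcc-4.0.0 loses roughly 26 to 40 percent, with the largest loss at the 8192-byte block size.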

--
Trying to get a job as a c++ developer?  See http://kegel.com/academy/getting-hired.html
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [hidden email]
Automated List Manager                           [hidden email]

Re: gcc performance regression on md2 from gcc-2.95.3?

Andy Polyakov
> I'm looking into http://gcc.gnu.org/PR19923
> which claims that gcc-4.0 is slower on 'openssl speed'
> than earlier versions.   The only huge regression seems
> to be in md2.

Note that most of the code involved in the report in question is
hand-coded assembler. This means that the report [or your conclusion that
md2 is the only one suffering] isn't necessarily representative with
respect to compiler optimizations per se. If you want a fairer
comparison between compiler versions, configure the toolkit with the no-asm
option and compare compiler-generated code.

> Has anyone else looked at this yet?

No. A.

Re: gcc performance regression on md2 from gcc-2.95.3?

Dan Kegel-2
Andy Polyakov wrote:

>> I'm looking into http://gcc.gnu.org/PR19923
>> which claims that gcc-4.0 is slower on 'openssl speed'
>> than earlier versions.   The only huge regression seems
>> to be in md2.
>
>
> Note that most of the code involved in the report in question is
> hand-coded assembler. This means that the report [or your conclusion that
> md2 is the only one suffering] isn't necessarily representative with
> respect to compiler optimizations per se. If you want a fairer
> comparison between compiler versions, configure the toolkit with the no-asm
> option and compare compiler-generated code.

I'm interested in the observed performance regression even with the
hand-coded assembly; that simply should not be happening.

But thanks for the tip; I will also look for performance regressions
with the no-asm option.

>> Has anyone else looked at this yet?
>
>
> No.

Thanks for the info.
- Dan


Re: gcc performance regression on md2 from gcc-2.95.3?

Andy Polyakov
>>> I'm looking into http://gcc.gnu.org/PR19923
>>> which claims that gcc-4.0 is slower on 'openssl speed'
>>> than earlier versions.   The only huge regression seems
>>> to be in md2.
>>
>> Note that most of the code involved in the report in question is
>> hand-coded assembler. This means that the report [or your conclusion
>> that md2 is the only one suffering] isn't necessarily representative
>> with respect to compiler optimizations per se. If you want a fairer
>> comparison between compiler versions, configure the toolkit with the
>> no-asm option and compare compiler-generated code.
>
> I'm interested in the observed performance regression even with the
> hand-coded assembly; that simply should not be happening.

Well, hand-coded assembler doesn't do *all* of the job; compiler-generated
code is always involved to some degree, so one can argue that if the
compiler managed to "sink" "assembler" performance, then it has to be
really bad... Even more reason to test with no-asm :-) But seriously
speaking, smaller differences in "assembler" performance [a few percent]
can just as well be caused by a different layout of the resulting code in
memory [different TLB and cache hit/miss patterns], which naturally varies
from one compiler version to another and is not really something
to worry about. no-asm is really the only representative option for
comparing compilers or compiler versions. A.

Re: gcc performance regression on md2 from gcc-2.95.3?

Dan Kegel-2
Andy Polyakov wrote:

>>>> I'm looking into http://gcc.gnu.org/PR19923
>>>> which claims that gcc-4.0 is slower on 'openssl speed'
>>>> than earlier versions.   The only huge regression seems
>>>> to be in md2.  ...
>>
>> I'm interested in the observed performance regression even with the
>> hand-coded assembly; that simply should not be happening.
>
> ... no-asm is really the only representative option to
> compare compilers or compiler versions.

As it turned out, the same regression was present with
no-asm.  (Hmm.)  It only occurs for -fPIC with -O2 or -O3.
Omitting -fPIC made the slowdown go away.
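
Narrowing a slowdown to a flag combination like this amounts to building the same file under every compiler, optimization level, and PIC setting, then timing each result. The following is a minimal sketch of generating that build matrix; the compiler command names, the source file name `md2test.c`, and the output naming scheme are all hypothetical placeholders, and -fPIC is the flag shown in the compiler lines earlier in the thread.

```python
from itertools import product

# Hypothetical names: adjust to the toolchains and test file actually in use.
COMPILERS = ["gcc-2.95.3", "gcc-3.4.3", "gcc-4.0.0"]
OPT_LEVELS = ["-O1", "-O2", "-O3"]
PIC_FLAGS = [None, "-fPIC"]


def build_commands(source="md2test.c"):
    """Return (label, argv) pairs, one per compiler/optimization/PIC combination."""
    commands = []
    for cc, opt, pic in product(COMPILERS, OPT_LEVELS, PIC_FLAGS):
        flags = [opt] + ([pic] if pic else [])
        # The label doubles as the output binary name, e.g. gcc-4.0.0_O3_fPIC.
        label = "_".join([cc] + [flag.lstrip("-") for flag in flags])
        argv = [cc, *flags, "-o", label, source]
        commands.append((label, argv))
    return commands


if __name__ == "__main__":
    for label, argv in build_commands():
        print(" ".join(argv))
```

Each printed line is one build to run and time; comparing the timings for binaries that differ in a single flag shows which flag triggers the slowdown.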

Deriving a minimal test case
was extremely straightforward; here's how it evolved:
   1. verify "openssl speed" shows slowdown
   2. verify "openssl speed md2" shows slowdown
   3. build crypto/md2/md2.c, verify that 'time md2 < 32megabytefile' shows slowdown
   4. replace -lcrypto with .o files from build directory, verify that still shows slowdown
   5. replace .o files with .c files, verify
   6. cat .c files together into one big one, verify
   7. preprocess big .c file into big .i file, verify
   8. remove most of .i file, get rid of argv handling, verify
And that yielded a nice 200 line preprocessed C source file
which showed the problem.  Uploading that to
http://gcc.gnu.org/PR19923, along with timing
results for various compilers and optimization options
showing the regression, was like throwing chum to sharks.
Within a couple hours, the gcc developers had confirmed
the problem, and posted the offending generated code.
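
Every "verify" step above is the same measurement repeated on a smaller program. A small timing helper along the following lines could automate it; this is a sketch under assumptions, not what was actually used (the steps above used the shell's 'time'). The function name is made up, and it measures wall-clock time only.

```python
import subprocess
import sys
import time


def best_wall_time(argv, stdin_path=None, repeats=3):
    """Run argv `repeats` times and return the fastest wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        stdin = open(stdin_path, "rb") if stdin_path else subprocess.DEVNULL
        try:
            start = time.perf_counter()
            subprocess.run(argv, stdin=stdin, stdout=subprocess.DEVNULL, check=True)
            elapsed = time.perf_counter() - start
        finally:
            if stdin_path:
                stdin.close()
        best = min(best, elapsed)
    return best


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere. For the real check, pass
    # the test binary (name is up to you) and stdin_path="32megabytefile".
    print(f"{best_wall_time([sys.executable, '-c', 'pass']):.3f} s")
```

Taking the fastest of several runs is one common way to reduce scheduling noise; if the slowdown disappears after a reduction step, that step removed something the regression depends on.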

How quickly they'll fix it is another matter, but at
least now they're on the case.  If we're lucky, gcc-4.0.1
will have a fix.
- Dan


Re: gcc performance regression on md2 from gcc-2.95.3?

Andy Polyakov
>>>>> I'm looking into http://gcc.gnu.org/PR19923
>>>>> which claims that gcc-4.0 is slower on 'openssl speed'
>>>>> than earlier versions.   The only huge regression seems
>>>>> to be in md2.  ...
>>>
>>> I'm interested in the observed performance regression even with the
>>> hand-coded assembly; that simply should not be happening.
>>
>> ... no-asm is really the only representative option to compare
>> compilers or compiler versions.
>
> As it turned out, the same regression was present with
> no-asm.  (Hmm.)

When I suggested no-asm, I was really suggesting examining results for
*other* algorithms. The point is that md2 is rarely used, and it's more
interesting/important to pick a test case with a more popular algorithm.
Does the above mean that even with no-asm you observed the same regression
coefficients for *other* algorithms?

> It only occurs for -fPIC with -O2 or -O3.
> Omitting -fPIC made the slowdown go away.
>
> Deriving a minimal test case
> was extremely straightforward; here's how it evolved:
>   1. verify "openssl speed" shows slowdown
>   2. verify "openssl speed md2" shows slowdown

For md2 *alone*, no-asm naturally won't make any difference, as md2 has
no assembler implementation :-) Fixing md2 might [and most likely will]
have a positive effect on other algorithms as well, so thanks. A.