AES x86_64 assembler

AES x86_64 assembler

Andy Polyakov
The AES x86_64 assembler implementation is now available in HEAD. The
code was benchmarked on an Opteron CPU and showed a 50% improvement over
code generated by gcc 3.3.2. I wonder if somebody could compare the two,
compiler-generated and hand-coded, on EM64T. To do so, grab the latest
openssl-SNAP-* at ftp://ftp.openssl.org/snapshot/ and run
'./config no-asm; make; apps/openssl speed aes-128-cbc' and then
'./config; make; apps/openssl speed aes-128-cbc'. When submitting
results, do mention your CPU clock frequency. Many thanks in advance. A.
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [hidden email]
Automated List Manager                           [hidden email]

Re: AES x86_64 assembler

Xuekun Hu
I measured on a 3.6GHz Nocona.

with no-asm, the results are:
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128 cbc     117668.96k   127171.52k   134233.69k   135039.66k   135012.87k

with asm, the results are:
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128 cbc      74138.23k   127064.68k   158567.68k   169018.37k   169525.25k

So, on 8192 bytes there is a 25% performance boost; however, why does
the performance degrade so much on 16 bytes?
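
For reference, the per-block-size change between the two runs can be
worked out directly from the rows above. This is only a minimal sketch
in Python; the names BLOCK_SIZES, NO_ASM, ASM, relative_change and
compare are made up for illustration and are not part of OpenSSL. The
"k" figures are thousands of bytes per second, as printed by
'openssl speed'.

```python
# Throughput figures copied from the two `openssl speed` runs above,
# in thousands of bytes per second (the "k" suffix).
BLOCK_SIZES = [16, 64, 256, 1024, 8192]
NO_ASM = [117668.96, 127171.52, 134233.69, 135039.66, 135012.87]
ASM = [74138.23, 127064.68, 158567.68, 169018.37, 169525.25]


def relative_change(baseline, candidate):
    """Return the percentage change of candidate relative to baseline."""
    return (candidate / baseline - 1.0) * 100.0


def compare(sizes, baseline, candidate):
    """Map each block size to the asm-vs-no-asm percentage change."""
    return {
        n: relative_change(b, c)
        for n, b, c in zip(sizes, baseline, candidate)
    }


if __name__ == "__main__":
    for size, change in compare(BLOCK_SIZES, NO_ASM, ASM).items():
        print(f"{size:>5} bytes: {change:+6.1f}%")
```

With these numbers the 16-byte case comes out well over a third slower
with asm, the 64-byte case is essentially a tie, and the 1024- and
8192-byte cases are both about a quarter faster.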

Thx, Xuekun


Re: AES x86_64 assembler

Andy Polyakov
> I measured on Nocona3.6GHz.
>
> with no-asm, the results are:
> type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
> aes-128 cbc     117668.96k   127171.52k   134233.69k   135039.66k   135012.87k
>
> with asm, the results are:
> type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
> aes-128 cbc      74138.23k   127064.68k   158567.68k   169018.37k   169525.25k
>
> So, on 8192 bytes there are 25% performance boost,

This sounds like the original version. An updated version was uploaded
yesterday; it benchmarked at 160860k on a 3.0GHz Xeon, so try the very
latest snapshot too. Do you get more than 190M at 3.6GHz?
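
The 190M figure follows from scaling the 3.0GHz result by clock
frequency. A minimal sketch of that arithmetic, assuming throughput
scales linearly with clock on otherwise comparable hardware (an
assumption; memory and cache effects can break it). The function name
scale_by_clock is hypothetical.

```python
def scale_by_clock(throughput_k, measured_ghz, target_ghz):
    """Scale a throughput figure linearly with CPU clock frequency.

    Assumes the work is CPU-bound and the two machines differ only in
    clock speed, which is a simplification.
    """
    return throughput_k * target_ghz / measured_ghz


# 160860k was reported on a 3.0GHz Xeon; project it to 3.6GHz.
estimate_k = scale_by_clock(160860.0, 3.0, 3.6)

# The 8192-byte asm result quoted above, measured at 3.6GHz.
measured_k = 169525.25

if __name__ == "__main__":
    print(f"projected at 3.6GHz: {estimate_k:.0f}k")
    print(f"measured at 3.6GHz:  {measured_k:.0f}k")
```

The projection lands a little above 190000k, noticeably higher than the
169525k measured above, which is consistent with that measurement
having been taken on the original rather than the updated code.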

> however why on 16
> bytes, the performance degrade a lot?

This is perfectly expected, because the CBC assembler implementation
attempts to mitigate the impact of cache-timing attacks by copying the
key schedule to a controlled place on the stack and prefetching the
S-box tables. In the "16 bytes" case it does this for every 16 bytes,
and so on. Naturally this affects small-block performance. A.
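
The effect of a fixed per-call setup cost can be illustrated with a
simple model, t(n) = setup + per_byte * n, fitted through the 16-byte
and 8192-byte asm measurements quoted above. This is a rough sketch
under that assumed model, not a description of what the assembler
actually does; all function names here are hypothetical.

```python
# asm throughput from the thread, in thousands of bytes per second
ASM_K = {
    16: 74138.23,
    64: 127064.68,
    256: 158567.68,
    1024: 169018.37,
    8192: 169525.25,
}


def call_time_ns(size, throughput_k):
    """Time for one call processing `size` bytes, in nanoseconds."""
    return size / (throughput_k * 1000.0) * 1e9


def fit_overhead(measured, small=16, large=8192):
    """Fit t(n) = setup + per_byte * n through two measured points."""
    t_small = call_time_ns(small, measured[small])
    t_large = call_time_ns(large, measured[large])
    per_byte = (t_large - t_small) / (large - small)
    setup = t_small - per_byte * small
    return setup, per_byte


def predict_k(size, setup, per_byte):
    """Predicted throughput in thousands of bytes per second."""
    return size / (setup + per_byte * size) * 1e9 / 1000.0


if __name__ == "__main__":
    setup_ns, per_byte_ns = fit_overhead(ASM_K)
    print(f"setup: {setup_ns:.1f} ns/call, {per_byte_ns:.3f} ns/byte")
    for size in (64, 256, 1024):
        print(f"{size:>5} bytes: predicted "
              f"{predict_k(size, setup_ns, per_byte_ns):.0f}k, "
              f"measured {ASM_K[size]:.0f}k")
```

Under this model the fitted setup cost is on the order of 100 ns per
call, and the predictions for the 64-, 256- and 1024-byte cases fall
within a few percent of the measured values, so a constant per-call
cost amortized over larger buffers is at least consistent with the
numbers.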