Re: [openssl-team] Discussion: design issue: async and -lpthread


Re: [openssl-team] Discussion: design issue: async and -lpthread

Viktor Dukhovni

> On Nov 24, 2015, at 2:13 AM, Nico Williams <[hidden email]> wrote:
>
> If the OpenSSL team finally decides to do something about sane locking
> by default, then it will be a huge improvement.  If this thread provides
> the impetus, so much the better.

I hope that happens.  It would certainly make a big contribution to the
ecosystem around OpenSSL.  The key question is whether there is the right
someone outside the team to step forward and write the code, or whether
it needs to be developed by an OpenSSL team member (and whether cycles for
that are available in the near term).  I hope that one or the other may
come to pass.

--
        Viktor.



_______________________________________________
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev

Re: [openssl-team] Discussion: design issue: async and -lpthread

Jonathan Larmour
In reply to this post by Matt Caswell-2
On 23/11/15 20:34, Matt Caswell wrote:

> On 23/11/15 17:49, Nico Williams wrote:
>
>> Still, if -lpthread avoidance were still desired, you'd have to find an
>> alternative to pthread_key_create(), pthread_getspecific(), and friends.
>
> Just a point to note about this. The async code that introduced this has
> 3 different implementations:
>
> - posix
> - windows
> - null
>
> The detection code will check if you have a suitable posix or windows
> implementation and use that. Otherwise the fallback position is to use
> the null implementation. With "null" everything will compile and run but
> you won't be able to use any of the new async functionality.

I hope there will be the ability to plug in different operating systems
and it's not hard coded to just these choices. There are embedded systems
which use OpenSSL which do not have POSIX thread APIs (POSIX is very
heavyweight for many).

For example, GCC abstracts their threading dependencies with a "gthread"
API which OS implementers can write against. What would be bad would be
direct calls to pthread* dotted around the OpenSSL code base.
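
A minimal sketch of the kind of indirection described above, in C. The `thread_port` structure and the `thread_port_install`, `thread_local_set` and `thread_local_get` names are hypothetical; they are not part of OpenSSL or of GCC's gthread interface. The only point is that library code calls the wrappers, and an OS port supplies the implementation behind them.

```c
#include <stddef.h>

/* Hypothetical port interface; illustrative names, not OpenSSL API. */
typedef struct {
    const char *name;
    int   (*local_set)(void *value);   /* store the calling thread's value */
    void *(*local_get)(void);          /* fetch the calling thread's value */
} thread_port;

/* "null" port: only correct on a single-threaded system. */
static void *null_slot;
static int   null_set(void *value) { null_slot = value; return 1; }
static void *null_get(void)        { return null_slot; }
static const thread_port null_port = { "null", null_set, null_get };

static const thread_port *active_port = &null_port;

/* An OS port installs its own implementation once, at start-up. */
void thread_port_install(const thread_port *port)
{
    active_port = (port != NULL) ? port : &null_port;
}

/* Library code calls only these wrappers, never pthread_* directly. */
const char *thread_port_name(void)  { return active_port->name; }
int   thread_local_set(void *value) { return active_port->local_set(value); }
void *thread_local_get(void)        { return active_port->local_get(); }
```

A pthread, Windows or RTOS port would fill in its own `thread_port` and pass it to `thread_port_install()`; nothing else in the library would need to change.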

> One other option we could pursue is to use the "__thread" syntax for
> thread local variables and avoid the need for libpthread altogether. An
> earlier version of the code did this. I have not found a way to reliably
> detect at compile time the capability to do this and my understanding is
> that this is a lot less portable.

Behind the scenes, if pthreads are available on an OS, then the compiler
will probably just end up using pthread functions anyway, meaning there's
no advantage over having just linked against libpthread.

Jifl
--
eCosCentric Limited      http://www.eCosCentric.com/     The eCos experts
Barnwell House, Barnwell Drive, Cambridge, UK.       Tel: +44 1223 245571
Registered in England and Wales: Reg No 4422071.
------["Si fractum non sit, noli id reficere"]------       Opinions==mine

Re: [openssl-team] Discussion: design issue: async and -lpthread

Matt Caswell-2


On 24/11/15 15:16, Jonathan Larmour wrote:

> On 23/11/15 20:34, Matt Caswell wrote:
>> On 23/11/15 17:49, Nico Williams wrote:
>>
>>> Still, if -lpthread avoidance were still desired, you'd have to find an
>>> alternative to pthread_key_create(), pthread_getspecific(), and friends.
>>
>> Just a point to note about this. The async code that introduced this has
>> 3 different implementations:
>>
>> - posix
>> - windows
>> - null
>>
>> The detection code will check if you have a suitable posix or windows
>> implementation and use that. Otherwise the fallback position is to use
>> the null implementation. With "null" everything will compile and run but
>> you won't be able to use any of the new async functionality.
>
> I hope there will be the ability to plug in different operating systems
> and it's not hard coded to just these choices. There are embedded systems
> which use OpenSSL which do not have POSIX thread APIs (POSIX is very
> heavyweight for many).
>
> For example, GCC abstracts their threading dependencies with a "gthread"
> API which OS implementers can write against. What would be bad would be
> direct calls to pthread* dotted around the OpenSSL code base.

We are talking specifically about the new async functionality here. The
three implementations are all you get for that. That has no impact on
the rest of OpenSSL functionality. So if you happen to be on a non-posix
and non-windows platform then OpenSSL will function as it does today.
You just don't get the new async functionality on top.

There are pthread references in one (internal) header file and one c
source code file. If the new threading API discussed elsewhere in this
thread goes ahead, then those references would be replaced with calls to
that instead.
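
The three-way choice described above could be made at compile time along these lines. This is a sketch under assumptions: the `DEMO_ASYNC_*` macros and `demo_async_*` functions are invented for illustration, and the platform checks are deliberately simplified rather than copied from OpenSSL's detection code.

```c
/* Hypothetical names throughout; not OpenSSL's actual detection logic. */
#if defined(_WIN32)
# define DEMO_ASYNC_BACKEND "windows"
#elif defined(__unix__) || (defined(__APPLE__) && defined(__MACH__))
# include <unistd.h>               /* defines _POSIX_VERSION where available */
# if defined(_POSIX_VERSION) && _POSIX_VERSION >= 200112L
#  define DEMO_ASYNC_BACKEND "posix"
# endif
#endif

#if defined(DEMO_ASYNC_BACKEND)
# define DEMO_ASYNC_CAPABLE 1
#else
# define DEMO_ASYNC_BACKEND "null"  /* everything builds, async is unavailable */
# define DEMO_ASYNC_CAPABLE 0
#endif

/* Which backend was selected when this file was compiled. */
const char *demo_async_backend(void) { return DEMO_ASYNC_BACKEND; }

/* Callers check this before relying on any async feature. */
int demo_async_is_capable(void)      { return DEMO_ASYNC_CAPABLE; }
```

On a platform that matches neither check, the code still compiles and `demo_async_is_capable()` simply reports 0, which mirrors the "null" behaviour described above.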

Matt

Re: [openssl-team] Discussion: design issue: async and -lpthread

Kurt Roeckx
In reply to this post by Jonathan Larmour
On Tue, Nov 24, 2015 at 03:16:59PM +0000, Jonathan Larmour wrote:

> On 23/11/15 20:34, Matt Caswell wrote:
> > One other option we could pursue is to use the "__thread" syntax for
> > thread local variables and avoid the need for libpthread altogether. An
> > earlier version of the code did this. I have not found a way to reliably
> > detect at compile time the capability to do this and my understanding is
> > that this is a lot less portable.
>
> Behind the scenes, if pthreads are available on an OS, then the compiler
> will probably just end up using pthread functions anyway, meaning there's
> no advantage over having just linked against libpthread.

__thread can be used in something that doesn't use threads.  It
gets put in a different section in the elf file (.tbss, .tdata).
But that still requires that things like the dynamic loader
support it.
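
As a small illustration of the point above, the following sketch uses `__thread` in code that never creates a thread and never calls a pthread function. It assumes GCC or clang on an ELF platform; `__thread` is a compiler extension, and the function and variable names are made up for the example.

```c
/* Assumes GCC or clang on an ELF platform; `__thread` is an extension. */

/* Zero-initialised thread-local data is placed in .tbss. */
static __thread unsigned long call_count;

/* Thread-local data with a non-zero initialiser is placed in .tdata. */
static __thread int last_error = -1;

/* Record an error code and return how often this thread has called us. */
unsigned long note_call(int err)
{
    last_error = err;
    return ++call_count;
}

int get_last_error(void)
{
    return last_error;
}
```

Each thread, including the only thread of a single-threaded program, gets its own copy of both variables. The toolchain and the dynamic loader still have to set up the thread-local storage segment for this to work.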


Kurt


Re: [openssl-team] Discussion: design issue: async and -lpthread

Florian Weimer-2
In reply to this post by Kurt Roeckx
On 11/23/2015 11:08 PM, Kurt Roeckx wrote:

> I think that we currently don't do any compile / link test to
> detect features but that we instead explicitly say so for each
> platform.
>
> Anyway, the gcc documentation is here:
> https://gcc.gnu.org/onlinedocs/gcc/Thread-Local.html
>
> TLS clearly isn't supported everywhere.

The most portable approach is to switch to C++11, which provides you
thread-local variables with destructors (and you also get portable
atomics).  There are simply no standards-based C solutions as long as
you have to support the Microsoft compiler under Windows.

Florian

Re: [openssl-team] Discussion: design issue: async and -lpthread

Kurt Roeckx
On Wed, Nov 25, 2015 at 01:02:29PM +0100, Florian Weimer wrote:

> On 11/23/2015 11:08 PM, Kurt Roeckx wrote:
>
> > I think that we currently don't do any compile / link test to
> > detect features but that we instead explicitly say so for each
> > platform.
> >
> > Anyway, the gcc documentation is here:
> > https://gcc.gnu.org/onlinedocs/gcc/Thread-Local.html
> >
> > TLS clearly isn't supported everywhere.
>
> The most portable approach is to switch to C++11, which provides you
> thread-local variables with destructors (and you also get portable
> atomics).  There are simply no standards-based C solutions as long as
> you have to support the Microsoft compiler under Windows.

Please note that we use C, not C++.  But C11 has the same atomics
extensions as C++11.

We're also currently still targeting C89/C90 (with some minor
extensions), but I think we should try to use them if the platform
supports it.


Kurt


Re: [openssl-team] Discussion: design issue: async and -lpthread

Hubert Kario
In reply to this post by Dr Paul Dale
On Tuesday 24 November 2015 10:49:26 Paul Dale wrote:

> On Mon, 23 Nov 2015 11:11:37 PM Alessandro Ghedini wrote:
> > Is this TLS connections?
>
> Yes, this is just measuring the TLS handshake.  Renegotiations
> predominately. We deliberately didn't test the bulk symmetric crypto
> phase of the connection.
> > I'd like to know more...
>
> The data are a bit rough and ready but I've included what I can.  I
> wasn't directly involved in taking these measurements, so Chinese
> whispers are entirely possible.  I've been tasked with trying to find
> some performance enhancements.
>
>     The TLS stack results are:
>
>     stack         CPU %  connections/s
>     OpenSSL         85      11,935
>     atomic patch    22      16,465    proof of concept only, the stack is broken elsewhere
>     NSS             47      46,507    !!!!!

are you sure that the negotiated cipher suite is the same and that the
NSS is not configured to reuse the server key share if you're using DHE
or ECDHE?

--
Regards,
Hubert Kario
Senior Quality Engineer, QE BaseOS Security team
Web: www.cz.redhat.com
Red Hat Czech s.r.o., Purkyňova 99/71, 612 45, Brno, Czech Republic

Re: [openssl-team] Discussion: design issue: async and -lpthread

Nico Williams
In reply to this post by Viktor Dukhovni
On Mon, Nov 23, 2015 at 11:56:54PM +0000, Viktor Dukhovni wrote:
> > It may be a good idea to rethink locking completely.
>
> There is some glimmer of hope in that as various libcrypto structures
> become opaque, the locking moves from application code into the
> library.  For example, we now have (yet to be documented):
>
> X509_up_ref()

Ideally there would be very little locking in OpenSSL, and instead the
app would be responsible for most locking (if needed).

But that will be a lengthy transition, no?  Maybe we'll need functions
by which to indicate that the app will be doing locking for specific
objects.  Still, functions like RAND_bytes() that have no context object
will need locking, so new functions will be needed that take contexts so
as to minimize locking.

> Doing this requires a global review of the API, and filling in
> missing functions and documentation. :-(

Yes.

Nico
--

Re: [openssl-team] Discussion: design issue: async and -lpthread

Dr Paul Dale
In reply to this post by Hubert Kario

> are you sure that the negotiated cipher suite is the same and that the
> NSS is not configured to reuse the server key share if you're using DHE
> or ECDHE?

The cipher suite was the same.  I'd have to check to see exactly which
was used.  It is certainly possible that NSS was configured as you
suggest and, if so, this would improve its performance.

However, the obstacle preventing 100% CPU utilisation for both stacks
is lock contention.  The NSS folks apparently spent a lot of effort
addressing this and they have a far more scalable locking model than
OpenSSL: one lock per context for all the different kinds of context
versus a small number of global locks.

There is definitely scope for improvement here.  My atomic operation
suggestion is one approach which was quick and easy to validate,
better might be more locks since it doesn't introduce a new paradigm
and is more widely supported (C11 notwithstanding).

Regards,

Pauli
--
Oracle
Dr Paul Dale | Cryptographer | Network Security & Encryption
Phone +61 7 3031 7217
Oracle Australia



Re: [openssl-team] Discussion: design issue: async and -lpthread

Nico Williams
On Tue, Dec 01, 2015 at 09:21:34AM +1000, Paul Dale wrote:
> However, the obstacle preventing 100% CPU utilisation for both stacks
> is lock contention.  The NSS folks apparently spent a lot of effort
> addressing this and they have a far more scalable locking model than
> OpenSSL: one lock per context for all the different kinds of context
> versus a small number of global locks.

I prefer APIs which state that they are "thread-safe provided the
application accesses each XYZ context from only one thread at a time".

Leave it to the application to do locking, as much as possible.  Many
threaded applications won't need locking here because they may naturally
have only one thread using a given context.

Also, for something like a TLS context, ideally it should be naturally
possible to have two threads active, as long as one thread only reads
and the other thread only writes.  There can be some dragons here with
respect to fatal events and deletion of a context, but the simplest
thing to do is to use atomics for manipulating state like "had a fatal
alert", and use reference counts to defer deletion (then if the
application developer wants it this way, each of the reader and writer
threads can have a reference and the last one to stop using the context
deletes it).
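
The scheme in the paragraph above can be sketched with C11 atomics. This is an illustration under assumptions: `demo_ctx` and its functions are hypothetical and are not OpenSSL structures or calls, and a C11 compiler with `<stdatomic.h>` is assumed.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical context type; not an OpenSSL structure. */
typedef struct {
    atomic_int  refs;         /* one reference per thread using the context */
    atomic_bool fatal_alert;  /* may be set by either side, read by both */
} demo_ctx;

demo_ctx *demo_ctx_new(void)
{
    demo_ctx *ctx = malloc(sizeof(*ctx));
    if (ctx == NULL)
        return NULL;
    atomic_init(&ctx->refs, 1);
    atomic_init(&ctx->fatal_alert, false);
    return ctx;
}

/* Take an extra reference, e.g. before handing the context to a writer. */
void demo_ctx_up_ref(demo_ctx *ctx) { atomic_fetch_add(&ctx->refs, 1); }

/* Returns 1 if this call dropped the last reference and freed the context. */
int demo_ctx_free(demo_ctx *ctx)
{
    if (atomic_fetch_sub(&ctx->refs, 1) != 1)
        return 0;             /* another thread still holds a reference */
    free(ctx);
    return 1;
}

void demo_ctx_set_fatal(demo_ctx *ctx) { atomic_store(&ctx->fatal_alert, true); }
bool demo_ctx_is_fatal(demo_ctx *ctx)  { return atomic_load(&ctx->fatal_alert); }
```

With one reference held by the reader and one by the writer, whichever thread calls `demo_ctx_free()` last performs the actual deletion, and neither needs a lock to observe the fatal flag.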

> There is definitely scope for improvement here.  My atomic operation
> suggestion is one approach which was quick and easy to validate,
> better might be more locks since it doesn't introduce a new paradigm
> and is more widely supported (C11 notwithstanding).

A platform compatibility atomics library would be simple enough (plenty
exist, I believe).  For platforms where no suitable implementation
exists you can use a single global lock, and if there's not even that,
then you can use non-atomic implementations and pretend it's all OK or
fail to build (users of such platforms will quickly provide real
implementations).

(Most compilers have pre-C11 atomics intrinsics and many OSes have
atomics libraries.)
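
A rough sketch of what such a compatibility layer might look like for a single operation. The `compat_int` type and `compat_atomic_add` function are hypothetical names; the C11, GCC `__sync` and Win32 `InterlockedExchangeAdd` calls are real, but the selection order is only one possible choice, and the final branch takes the "fail to build" option mentioned above.

```c
/* Hypothetical wrapper; compat_atomic_add is not an existing library call. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L \
    && !defined(__STDC_NO_ATOMICS__)
# include <stdatomic.h>
typedef atomic_int compat_int;
/* C11: atomic_fetch_add returns the old value, so add n for the new one. */
int compat_atomic_add(compat_int *p, int n)
{
    return atomic_fetch_add(p, n) + n;
}
#elif defined(__GNUC__)
typedef volatile int compat_int;
/* Pre-C11 GCC/clang intrinsic; returns the new value directly. */
int compat_atomic_add(compat_int *p, int n)
{
    return __sync_add_and_fetch(p, n);
}
#elif defined(_MSC_VER)
# include <windows.h>
typedef volatile LONG compat_int;
/* Win32: InterlockedExchangeAdd returns the value before the addition. */
int compat_atomic_add(compat_int *p, int n)
{
    return (int)(InterlockedExchangeAdd(p, n) + n);
}
#else
# error "no atomics for this platform: add an implementation"
#endif
```

Callers see one function that returns the updated value, whichever branch was compiled in.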

Nico
--

Re: [openssl-team] Discussion: design issue: async and -lpthread

Peter Waltenberg
I'd suggest checking where the bottlenecks are before making major
structural changes.  I'll admit we have made a few changes to the basic
OpenSSL sources, but I don't see unacceptable amounts of locking even on
large machines (hundreds of processing units) with thousands of threads.

Blinding and the RNGs were the hot spots and relatively easy to address.
Also, you use TRNGs for things like blinding where a PRNG will do; fixing
that also helps performance.

Peter


-----"openssl-dev" <[hidden email]> wrote: -----
To: [hidden email], [hidden email]
From: Nico Williams
Sent by: "openssl-dev"
Date: 12/01/2015 10:16AM
Subject: Re: [openssl-dev] [openssl-team] Discussion: design issue: async and -lpthread

On Tue, Dec 01, 2015 at 09:21:34AM +1000, Paul Dale wrote:
> However, the obstacle preventing 100% CPU utilisation for both stacks
> is lock contention.  The NSS folks apparently spent a lot of effort
> addressing this and they have a far more scalable locking model than
> OpenSSL: one lock per context for all the different kinds of context
> versus a small number of global locks.

I prefer APIs which state that they are "thread-safe provided the
application accesses each XYZ context from only one thread at a time".

Leave it to the application to do locking, as much as possible.  Many
threaded applications won't need locking here because they may naturally
have only one thread using a given context.

Also, for something like a TLS context, ideally it should be naturally
possible to have two threads active, as long as one thread only reads
and the other thread only writes.  There can be some dragons here with
respect to fatal events and deletion of a context, but the simplest
thing to do is to use atomics for manipulating state like "had a fatal
alert", and use reference counts to defer deletion (then if the
application developer wants it this way, each of the reader and writer
threads can have a reference and the last one to stop using the context
deletes it).

> There is definitely scope for improvement here.  My atomic operation
> suggestion is one approach which was quick and easy to validate,
> better might be more locks since it doesn't introduce a new paradigm
> and is more widely supported (C11 notwithstanding).

A platform compatibility atomics library would be simple enough (plenty
exist, I believe).  For platforms where no suitable implementation
exists you can use a single global lock, and if there's not even that,
then you can use non-atomic implementations and pretend it's all OK or
fail to build (users of such platforms will quickly provide real
implementations).

(Most compilers have pre-C11 atomics intrinsics and many OSes have
atomics libraries.)

Nico
--

Re: [openssl-team] Discussion: design issue: async and -lpthread

Hubert Kario
In reply to this post by Dr Paul Dale
On Tuesday 01 December 2015 09:21:34 Paul Dale wrote:
> > are you sure that the negotiated cipher suite is the same and that
> > the NSS is not configured to reuse the server key share if you're
> > using DHE or ECDHE?
>
> There is definitely scope for improvement here.  My atomic operation
> suggestion is one approach which was quick and easy to validate,
> better might be more locks since it doesn't introduce a new paradigm
> and is more widely supported (C11 notwithstanding).

I'm not saying there is no room for improvement or that the improvements
are useless. But as long as we're not comparing apples-to-apples the
statistic is useless.

Other things to look for: ServerKeyExchange curve or group and signature
algorithm used as well as the key size of server. Each of those things
can have impact completely overshadowing the lock contention differences
(picking big RSA key size can easily slash performance by an order of
magnitude).
--
Regards,
Hubert Kario
Senior Quality Engineer, QE BaseOS Security team
Web: www.cz.redhat.com
Red Hat Czech s.r.o., Purkyňova 99/71, 612 45, Brno, Czech Republic

Re: [openssl-team] Discussion: design issue: async and -lpthread

Dr Paul Dale

The figures were for connection re-establishment; RSA computations etc.
simply don't feature.  For initial connection establishment, on the other
hand, they are the single largest factor.  The crypto is definitely not
the bottleneck for this case.

Pauli
--
Oracle
Dr Paul Dale | Cryptographer | Network Security & Encryption
Phone +61 7 3031 7217
Oracle Australia



Re: [openssl-team] Discussion: design issue: async and -lpthread

Florian Weimer-2
In reply to this post by Kurt Roeckx
On 11/25/2015 06:48 PM, Kurt Roeckx wrote:

> On Wed, Nov 25, 2015 at 01:02:29PM +0100, Florian Weimer wrote:
>> On 11/23/2015 11:08 PM, Kurt Roeckx wrote:
>>
>>> I think that we currently don't do any compile / link test to
>>> detect features but that we instead explicitly say so for each
>>> platform.
>>>
>>> Anyway, the gcc documentation is here:
>>> https://gcc.gnu.org/onlinedocs/gcc/Thread-Local.html
>>>
>>> TLS clearly isn't supported everywhere.
>>
>> The most portable approach is to switch to C++11, which provides you
>> thread-local variables with destructors (and you also get portable
>> atomics).  There are simply no standards-based C solutions as long as
>> you have to support the Microsoft compiler under Windows.
>
> Please note that we use C, not C++.  But C11 has the same atomics
> extensions as C++11.

C++11 support is much more widespread than C11 support.  You will have
trouble finding reliable support for C11 atomics with the Microsoft
toolchain.

> We're also currently still targeting C89/C90 (with some minor
> extensions), but I think we should try to use them if the platform
> supports it.

It is a lot of work getting the atomics right on all supported platforms.

Florian


Re: [openssl-team] Discussion: design issue: async and -lpthread

Nico Williams
On Mon, Dec 07, 2015 at 02:41:35PM +0100, Florian Weimer wrote:

> On 11/25/2015 06:48 PM, Kurt Roeckx wrote:
> > Please note that we use C, not C++.  But C11 has the same atomics
> > extensions as C++11.
>
> C++11 support is much more widespread than C11 support.  You will have
> trouble finding reliable support for C11 atomics with the Microsoft
> toolchain.
>
> [...]
>
> It is a lot of work getting the atomics right on all supported platforms.

The MSFT toolchain has its own intrinsics, as do GCC/clang.  A variety of
OSes have their own atomics libraries (e.g., Solaris/Illumos, FreeBSD,
and others).  Linux has several as well, but I am not sure that the
licensing on those will be compatible to link against (much less to
incorporate as source in OpenSSL).  Some of the BSD or CDDL licensed
libraries might be possible to incorporate as source into OpenSSL.

It's a solvable problem, but yes, a lot of work :(  Still, it seems
worth doing.

Nico
--

Re: [openssl-team] Discussion: design issue: async and -lpthread

Nico Williams
Maybe http://trac.mpich.org/projects/openpa/ would fit the bill?

Re: [openssl-team] Discussion: design issue: async and -lpthread

Florian Weimer
* Nico Williams:

> Maybe http://trac.mpich.org/projects/openpa/ would fit the bill?

It seems to have trouble keeping up with new architectures.

Re: [openssl-team] Discussion: design issue: async and -lpthread

Nico Williams
On Tue, Dec 08, 2015 at 11:19:32AM +0100, Florian Weimer wrote:
> > Maybe http://trac.mpich.org/projects/openpa/ would fit the bill?
>
> It seems to have trouble keeping up with new architectures.

New architectures are not really a problem because between a) decent
compilers with C11 and/or non-C11 atomic intrinsics, b) asm-coded
atomics, and c) mutex-based dumb atomics, we can get full coverage.
Anyone who's still not satisfied can then contribute missing asm-coded
atomics to OpenPA.  I suspect that OpenSSL using OpenPA is likely to
lead to contributions to OpenPA that will make it better anyways.

What's the alternative anyways?

We're talking about API and performance enhancements to OpenSSL to go
faster on platforms for which there are atomics, and maybe slower
otherwise -- or maybe not; maybe we can implement context up-/down-ref
functions that use fine-grained (or even global) locking as a fallback
that yields performance comparable to today's.

If OpenPA's (or some other such library's) license works for OpenSSL,
someone might start using it.  That someone might be me.  So that seems
like a good question to ask: is OpenPA's license compatible with
OpenSSL's?  For inclusion into OpenSSL's tree, or for use by OpenSSL?

Nico
--

Re: [openssl-team] Discussion: design issue: async and -lpthread

Dr Paul Dale
It will be possible to support atomics in such a way that there is no performance penalty for machines without them or for single threaded operation.
My sketchy design is along the lines of adding a new API CRYPTO_add_atomic that takes the same arguments as CRYPTO_add (i.e. reference to counter, value to add and lock to use):

CRYPTO_add_atomic(int *addr, int amount, int lock)
    if have-atomics then
        atomic_add(addr, amount)
    else if (lock == have-lock-already)
        *addr += amount
    else
        CRYPTO_add(addr, amount, lock)

The have-lock-already will need to be a new code that indicates that the caller has the relevant lock held and there is no need to lock before the add.  Some conditional compilation like CRYPTO_add & CRYPTO_add_lock have can be done to get the overhead down to zero in the single threaded case and the case where it is known beforehand that there are no atomic operations.  It is also possible for the atomic_add function to be passed in as a user call back as per the other locking callbacks which means OSSL doesn't actually need to know how any of this works underneath.
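
The design above, including the user-callback variant, might look roughly like this in C. Everything here is a stand-in: `DEMO_LOCK_HELD` plays the part of the proposed have-lock-already code, `demo_locked_add()` stands in for CRYPTO_add with the locking elided, and the `demo_*` names are hypothetical. The example callback assumes a GCC or clang toolchain.

```c
#include <stddef.h>

#define DEMO_LOCK_HELD (-1)   /* hypothetical "caller already holds the lock" */

/* Optional user-supplied atomic add, in the style of the locking callbacks. */
typedef int (*demo_atomic_add_cb)(int *addr, int amount);
static demo_atomic_add_cb atomic_add_cb;   /* NULL: no atomics available */

void demo_set_atomic_add_callback(demo_atomic_add_cb cb) { atomic_add_cb = cb; }

/* Stand-in for CRYPTO_add(): lock, add, unlock.  Locking is elided here. */
static int demo_locked_add(int *addr, int amount, int lock)
{
    (void)lock;               /* a real version would take and release it */
    *addr += amount;
    return *addr;
}

int demo_add_atomic(int *addr, int amount, int lock)
{
    if (atomic_add_cb != NULL)
        return atomic_add_cb(addr, amount);      /* lock-free path */
    if (lock == DEMO_LOCK_HELD) {
        *addr += amount;                         /* caller holds the lock */
        return *addr;
    }
    return demo_locked_add(addr, amount, lock);  /* existing locked path */
}

/* Example callback using a GCC/clang builtin (a toolchain assumption). */
int demo_gcc_atomic_add(int *addr, int amount)
{
    return __sync_add_and_fetch(addr, amount);
}
```

An application with atomics would call `demo_set_atomic_add_callback(demo_gcc_atomic_add)` once at start-up. Without a callback, the function falls back to the plain or locked add, so callers behave the same either way.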

Once this is done, most instances of CRYPTO_add can be changed to CRYPTO_add_atomic.  Unfortunately, not all can be changed, so this would involve manual inspection of each lock for which CRYPTO_add is used to see if atomics are suitable.  I've done a partial list of which could be changed over (attached) but it is pretty rough and needs rechecking.

It would be prudent to have a CRYPTO_add_atomic_lock call underneath CRYPTO_add_atomic, like CRYPTO_add has CRYPTO_add_lock, to get the extra debug output.


Finally, can someone explain what the callback passed to CRYPTO_set_add_lock_callback is supposed to do?  Superficially, it seems like a way to use atomic operations instead of full locking -- but that breaks things due to the way the locking is done elsewhere.  So this callback needs to lock, add and unlock like the alternate code path in the CRYPTO_add_lock function.  There is no obvious benefit to providing it.


Pauli

On Tue, 8 Dec 2015 11:22:01 AM Nico Williams wrote:

> On Tue, Dec 08, 2015 at 11:19:32AM +0100, Florian Weimer wrote:
> > > Maybe http://trac.mpich.org/projects/openpa/ would fit the bill?
> >
> > It seems to have trouble keeping up with new architectures.
>
> New architectures are not really a problem because between a) decent
> compilers with C11 and/or non-C11 atomic intrinsics, b) asm-coded
> atomics, and c) mutex-based dumb atomics, we can get full coverage.
> Anyone who's still not satisfied can then contribute missing asm-coded
> atomics to OpenPA.  I suspect that OpenSSL using OpenPA is likely to
> lead to contributions to OpenPA that will make it better anyways.
>
> What's the alternative anyways?
>
> We're talking about API and performance enhancements to OpenSSL to go
> faster on platforms for which there are atomics, and maybe slower
> otherwise -- or maybe not; maybe we can implement context up-/down-ref
> functions that use fine-grained (or even global) locking as a fallback
> that yields performance comparable to today's.
>
> If OpenPA's (or some other such library's) license works for OpenSSL,
> someone might start using it.  That someone might be me.  So that seems
> like a good question to ask: is OpenPA's license compatible with
> OpenSSL's?  For inclusion into OpenSSL's tree, or for use by OpenSSL?
>
> Nico
>
--
Oracle
Dr Paul Dale | Cryptographer | Network Security & Encryption
Phone +61 7 3031 7217
Oracle Australia


locks-openssl (2K) Download Attachment

Re: [openssl-team] Discussion: design issue: async and -lpthread

Nico Williams
On Wed, Dec 09, 2015 at 09:27:16AM +1000, Paul Dale wrote:

> It will be possible to support atomics in such a way that there is no
> performance penalty for machines without them or for single threaded
> operation.  My sketchy design is along the lines of adding a new API
> CRYPTO_add_atomic that takes the same arguments as CRYPTO_add (i.e.
> reference to counter, value to add and lock to use):
>
> CRYPTO_add_atomic(int *addr, int amount, int lock)
>     if have-atomics then
>         atomic_add(addr, amount)
>     else if (lock == have-lock-already)
>         *addr += amount
>     else
>         CRYPTO_add(addr, amount, lock)

"have-atomics" must be known at compile time.

"lock" should not be needed because we should always have atomics, even
when we don't have true atomics: just use a global lock in a stub
implementation of atomic_add() and such.  KISS.  Besides, this will add
pressure to add true atomics wherever they are truly needed.
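
A sketch of that approach, with the choice made at compile time and a global-lock stub where no real atomics are configured. `DEMO_HAVE_ATOMICS` and `demo_atomic_add` are hypothetical names. The stub assumes a POSIX system for `pthread_mutex_t`, and the true-atomics branch assumes GCC or clang for `__atomic_add_fetch`.

```c
/* DEMO_HAVE_ATOMICS is a hypothetical switch set by the build configuration. */
#if defined(DEMO_HAVE_ATOMICS)

/* True atomics: a GCC/clang builtin that returns the updated value. */
int demo_atomic_add(int *addr, int amount)
{
    return __atomic_add_fetch(addr, amount, __ATOMIC_SEQ_CST);
}

#else

# include <pthread.h>

/* Stub: one global lock serialises every caller of demo_atomic_add(). */
static pthread_mutex_t demo_atomic_stub_lock = PTHREAD_MUTEX_INITIALIZER;

int demo_atomic_add(int *addr, int amount)
{
    int result;

    pthread_mutex_lock(&demo_atomic_stub_lock);
    *addr += amount;
    result = *addr;
    pthread_mutex_unlock(&demo_atomic_stub_lock);
    return result;
}

#endif
```

Callers never pass a lock argument and cannot tell which branch was built. The stub is only atomic with respect to other callers of `demo_atomic_add()`, so every access to such a counter has to go through this interface.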

Nico
--