Improving OpenSSL default RNG

Improving OpenSSL default RNG

Alessandro Ghedini
Hello everyone,

(sorry for the wall of text...)

one of the things that both BoringSSL and LibreSSL have in common is the
replacement of OpenSSL's default RNG, RAND_SSLeay(), with a simpler and saner
alternative. Given the complexity of RAND_SSLeay(), I think it would be worth at
least considering possible alternatives for OpenSSL.

BoringSSL started using the system RNG (e.g. /dev/urandom) for every call to
RAND_bytes(). Additionally, if the RDRAND instruction is available, the output
of RDRAND is mixed with the output of the system RNG using ChaCha20. This uses
thread-local storage to keep the global RNG state.

Incidentally, BoringSSL added a whole new API for thread-local storage which
OpenSSL could adopt given that e.g. the ASYNC support could benefit from it
(there are other interesting bits in BoringSSL, like the new threading API that
could also be adopted by OpenSSL).

The BoringSSL method is very simple but it needs a read from /dev/urandom for
every call to RAND_bytes() which can be slow (though, BoringSSL's RAND_bytes()
seems to implement some sort of buffering for /dev/urandom so the cost may be
lower).

On the other hand, LibreSSL replaced the whole RAND_* API with calls to
OpenBSD's arc4random(). This is a nice and simple scheme that uses ChaCha20 to
mix the internal RNG state, which is regularly reseeded from the system RNG.
The core logic of this (excluding ChaCha20 and platform-specific bits) is
implemented in less than 200 lines of code and, at least in theory, it's the
one that provides the best performance/simplicity trade-off (ChaCha20 can be
pretty fast even for non-asm platform-generic implementations).

Both of these methods are robust and mostly platform-independent (e.g. neither
of them uses the system time, PID or uninitialized buffers to seed the RNG state)
and have simple implementations, so I think OpenSSL can benefit a lot from
adopting one of them. Astute readers may point out that OpenSSL doesn't
support ChaCha20 yet, but that's hopefully coming soon.

I think there's also room for improvement in the platform-specific RAND_poll()
implementations, e.g.:

- on Linux getrandom() should be used if available
- on OpenBSD getentropy() should be used instead of arc4random()
- the /dev/urandom code IMO can be simplified
- the non-CryptGenRandom() code on Windows is just crazy. Do we even support
  Windows versions before XP?
- is EGD actually used anywhere today?
- what about Netware, OS/2 and VMS, do we have any users on them? IIRC support
  for other platforms has already been removed, what are the criteria for
  keeping support for one?
- etc...

For comparison, OpenBSD's getentropy() implementation [0] is much cleaner and
supports many of the platforms supported by OpenSSL.

So, any thoughts? If there's interest in this, I can look into these things in
more detail and propose possible patches.

Cheers

[0] https://github.com/libressl-portable/openbsd/tree/master/src/lib/libcrypto/crypto

_______________________________________________
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: Improving OpenSSL default RNG

Tomas Mraz-2
On Fri, 2015-10-23 at 15:22 +0200, Alessandro Ghedini wrote:
...
> BoringSSL started using the system RNG (e.g. /dev/urandom) for every call to
> RAND_bytes(). Additionally, if the RDRAND instruction is available, the output
> of RDRAND is mixed with the output of the system RNG using ChaCha20. This uses
> thread-local storage to keep the global RNG state.
...

> The BoringSSL method is very simple but it needs a read from /dev/urandom for
> every call to RAND_bytes() which can be slow (though, BoringSSL's RAND_bytes()
> seems to implement some sort of buffering for /dev/urandom so the cost may be
> lower).
>
> On the other hand, LibreSSL replaced the whole RAND_* API with calls to
> OpenBSD's arc4random(). This is a nice and simple scheme that uses ChaCha20 to
> mix the internal RNG state, which is regularly reseeded from the system RNG.
> The core logic of this (excluding ChaCha20 and platform-specific bits) is
> implemented in less than 200 lines of code and, at least in theory, it's the
> one that provides the best performance/simplicity trade-off (ChaCha20 can be
> pretty fast even for non-asm platform-generic implementations).
>
> Both of these methods are robust and mostly platform-indipendent (e.g. none of
> them uses the system time, PID or uninitilized buffers to seed the RNG state)
> and have simple implementations, so I think OpenSSL can benefit a lot from
> adopting one of them. The astute readers may point out that OpenSSL doesn't
> support ChaCha20 yet, but that's hopefully coming soon.

How do these libraries handle the generation of random numbers after a
fork()? Mixing in the system time & PID before pulling bytes from the
RNG prevents forked processes from sharing two identical streams of
random numbers. If data pulled from the kernel RNG is buffered, it is
not sufficient to just say that all the data is pulled from the kernel
and is thus unique.
--
Tomas Mraz
No matter how far down the wrong road you've gone, turn back.
                                              Turkish proverb
(You'll never know whether the road is wrong though.)



Re: Improving OpenSSL default RNG

Alessandro Ghedini
On Fri, Oct 23, 2015 at 03:38:51pm +0200, Tomas Mraz wrote:

> On Fri, 2015-10-23 at 15:22 +0200, Alessandro Ghedini wrote:
> ...
> > BoringSSL started using the system RNG (e.g. /dev/urandom) for every call to
> > RAND_bytes(). Additionally, if the RDRAND instruction is available, the output
> > of RDRAND is mixed with the output of the system RNG using ChaCha20. This uses
> > thread-local storage to keep the global RNG state.
> ...
> > The BoringSSL method is very simple but it needs a read from /dev/urandom for
> > every call to RAND_bytes() which can be slow (though, BoringSSL's RAND_bytes()
> > seems to implement some sort of buffering for /dev/urandom so the cost may be
> > lower).
> >
> > On the other hand, LibreSSL replaced the whole RAND_* API with calls to
> > OpenBSD's arc4random(). This is a nice and simple scheme that uses ChaCha20 to
> > mix the internal RNG state, which is regularly reseeded from the system RNG.
> > The core logic of this (excluding ChaCha20 and platform-specific bits) is
> > implemented in less than 200 lines of code and, at least in theory, it's the
> > one that provides the best performance/simplicity trade-off (ChaCha20 can be
> > pretty fast even for non-asm platform-generic implementations).
> >
> > Both of these methods are robust and mostly platform-indipendent (e.g. none of
> > them uses the system time, PID or uninitilized buffers to seed the RNG state)
> > and have simple implementations, so I think OpenSSL can benefit a lot from
> > adopting one of them. The astute readers may point out that OpenSSL doesn't
> > support ChaCha20 yet, but that's hopefully coming soon.
>
> How these libraries handle generation of random numbers after the
> fork()? The mixing in of the system time & PID before pulling bytes from
> RNG prevents sharing two identical streams of random numbers among
> forked processes. If there is a buffering of data pulled from the kernel
> RNG it is not sufficient to just say that all the data are pulled from
> the kernel and thus unique.
So, it seems that BoringSSL's /dev/urandom buffering is disabled by default and
needs to be explicitly enabled by calling RAND_enable_fork_unsafe_buffering().

arc4random() detects forks by registering a pthread_atfork() handler and/or by
checking for changes in the getpid() output before returning any randomness.
When a fork is detected, the internal state is automatically reseeded from the
system RNG.
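
To make the getpid() check concrete, here is a minimal sketch of the idea. It
is not arc4random()'s actual code: the struct, the function names and the
32-byte seed size are made up for illustration, and the "state" is just a seed
buffer rather than a real generator.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-process RNG state (illustration only). */
struct rng_state {
    pid_t owner;            /* PID that last seeded the state; 0 = never seeded */
    unsigned char seed[32]; /* stand-in for the real generator state */
    unsigned long reseeds;  /* how many times the state was (re)seeded */
};

/* Read exactly len bytes from /dev/urandom. Returns 0 on success, -1 on error. */
static int read_urandom(unsigned char *buf, size_t len)
{
    int fd = open("/dev/urandom", O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return -1;
    while (len > 0) {
        ssize_t n = read(fd, buf, len);
        if (n < 0 && errno == EINTR)
            continue;               /* interrupted, try again */
        if (n <= 0) {
            close(fd);
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    close(fd);
    return 0;
}

/*
 * Call before handing out any random bytes. If the state was seeded by a
 * different PID (i.e. it was inherited across fork()), or never seeded,
 * it is reseeded from the system RNG. Returns 0 on success, -1 on failure.
 */
static int rng_check_fork(struct rng_state *st)
{
    pid_t me = getpid();

    if (st->owner == me)
        return 0;
    if (read_urandom(st->seed, sizeof st->seed) != 0)
        return -1;
    st->owner = me;
    st->reseeds++;
    return 0;
}
```

A PID comparison alone can miss a fork if PIDs wrap around and a descendant
ends up with the PID stored in the state, which is presumably why the
pthread_atfork() handler is used as well.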

Cheers


Re: Improving OpenSSL default RNG

Salz, Rich
In reply to this post by Alessandro Ghedini
I am very interested in cleaning this area up.  We still do care about Netware, OS/2, and VMS; I don't think we care about pre-XP Windows.

We have broader portability issues than BoringSSL does, so my thoughts on threading are different:  two builds, either "not threaded" or "use native system threads", and internally use an API that is a very small thin layer per-OS.

Re: Improving OpenSSL default RNG

Dr. Matthias St. Pierre
In reply to this post by Alessandro Ghedini

Hi,

I have a related question concerning alternative RNGs, hope it is not too off-topic:

Currently we are using the NIST-SP800-90a compliant DRBG (FIPS_drbg_method()), because it seemed to us to be more
sophisticated and mature than the default RAND_SSLeay(). At least it's better documented and tested.

Currently this DRBG is only available through the FIPS object module, so you need to build a FIPS capable OpenSSL library in
order to use it.

Shouldn't the FIPS DRBG code be added to the normal code base in master, too, as an alternative RNG implementation?
Or is the NIST-SP800-90a DRBG construction already obsolete outside of the FIPS world?


Regards,
Matthias

Re: Improving OpenSSL default RNG

Loganaden Velvindron
In reply to this post by Alessandro Ghedini


On Fri, Oct 23, 2015 at 5:22 PM, Alessandro Ghedini <[hidden email]> wrote:
> Hello everyone,
>
> ...
>
> For comparison, OpenBSD's getentropy() implementation [0] is much cleaner and
> supports many of the platforms supported by OpenSSL.
>
> So, any thought? If there's interest in this, I can look into investigating
> these things more in detail and propose possible patches.


LibreSSL has been tested on a wider range of platforms.



Re: Improving OpenSSL default RNG

Dmitry Belyavsky-3
In reply to this post by Alessandro Ghedini
Hello Alexander,

On Fri, Oct 23, 2015 at 4:22 PM, Alessandro Ghedini <[hidden email]> wrote:
 
> So, any thought? If there's interest in this, I can look into investigating
> these things more in detail and propose possible patches.


In Russia we have to certify RNG hardware and software for use in organizations that require certified products.
Currently we are able to implement custom RAND_METHODs and provide them via engines. So if the hardware is unavailable, the RAND_bytes() call fails.

In the 1.0.* versions of the OpenSSL library not all calls to RAND* functions were checked for success, and this caused some problems.
LibreSSL treats its RNG functions as never failing, and I do not know about BoringSSL.

So we need a non-void RAND API and the possibility to provide our own RAND_METHODs. If the current code is to be refactored, I ask that these options remain possible.
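
As a rough illustration of why the non-void API matters, here is a small sketch. The struct below is a made-up stand-in, not OpenSSL's RAND_METHOD, and both sources are dummies; the only thing borrowed from OpenSSL is the convention that, like RAND_bytes(), a source returns 1 on success.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical pluggable RNG method (illustration only, not RAND_METHOD). */
struct rng_method {
    const char *name;
    int (*bytes)(unsigned char *buf, size_t len); /* 1 on success, 0 on failure */
};

/* Stand-in for a certified hardware RNG that is currently unplugged. */
static int hw_rng_unavailable(unsigned char *buf, size_t len)
{
    (void)buf;
    (void)len;
    return 0;
}

/* Stand-in for a working source; fills a fixed pattern, NOT random data. */
static int demo_rng_ok(unsigned char *buf, size_t len)
{
    memset(buf, 0xA5, len);
    return 1;
}

/*
 * Generate a 16-byte key. The caller gets 0 back if the RNG failed, and the
 * buffer is cleared so that stale or uninitialized bytes cannot be used as
 * a key by mistake.
 */
static int make_key(const struct rng_method *m, unsigned char key[16])
{
    if (m == NULL || m->bytes == NULL || m->bytes(key, 16) != 1) {
        memset(key, 0, 16);
        return 0;
    }
    return 1;
}
```

With a void-returning (never-failing) API, make_key() would have no way to notice that the hardware was missing.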

--
SY, Dmitry Belyavsky


Re: Improving OpenSSL default RNG

Alessandro Ghedini
In reply to this post by Salz, Rich
On Fri, Oct 23, 2015 at 02:30:14pm +0000, Salz, Rich wrote:
> I am very interested in cleaning this area up.  We still do care about
> Netware, OS/2, and VMS; I don't think we care about pre-XP Windows.

Ok.

> We have broader portability issues than boringSSL does, so my thoughts on
> threading are different:  two builds, either "not threaded" or "use native
> system threads" and internally use an API that is a very small thin layer
> per-OS.

Yes, that's what BoringSSL does. They have three implementations: pthread,
Windows and none (which is just no-ops). I don't know what the availability of
pthreads is on the above platforms (NW, OS/2 and VMS), but it should cover
quite a few platforms.

Basically they deprecated the current CRYPTO_lock and CRYPTO_THREADID API, and
replaced that with mutex objects (CRYPTO_MUTEX). Additionally, this API
provides thread-local storage support and "once" objects (to execute functions
only once, for example for initialization).

On top of the CRYPTO_MUTEX they added a reference counting API (which can use
C11 atomics instead of mutexes), but this is not used a lot so it can be
ignored for now.
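
For reference, reference counting on top of C11 atomics can be as small as
the sketch below. The names are made up for illustration and this is not
BoringSSL's code, just the general technique.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reference count (illustration only). */
struct refcount {
    atomic_uint count;
};

/* A new object starts with one reference, held by its creator. */
static void refcount_init(struct refcount *r)
{
    atomic_init(&r->count, 1);
}

/* Take an additional reference; no ordering is needed for a plain increment. */
static void refcount_inc(struct refcount *r)
{
    atomic_fetch_add_explicit(&r->count, 1, memory_order_relaxed);
}

/*
 * Drop a reference. Returns true if the caller dropped the last one and is
 * therefore responsible for freeing the object.
 */
static bool refcount_dec_and_test(struct refcount *r)
{
    return atomic_fetch_sub_explicit(&r->count, 1, memory_order_acq_rel) == 1;
}
```

On platforms without C11 atomics the same three functions could be written
with a mutex around a plain counter.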

Cheers


Re: Improving OpenSSL default RNG

Alessandro Ghedini
In reply to this post by Dmitry Belyavsky-3
On Fri, Oct 23, 2015 at 05:40:29PM +0300, Dmitry Belyavsky wrote:

> Hello Alexander,
>
> On Fri, Oct 23, 2015 at 4:22 PM, Alessandro Ghedini <[hidden email]>
> wrote:
>
>
> > So, any thought? If there's interest in this, I can look into investigating
> > these things more in detail and propose possible patches.
> >
> >
> In Russia we have to certify the RNG hardware and software for using in
> organizations where the certified products are required.
> Currently we are able to implement custom RAND_METHODs and provide it via
> engines. So if the hardware is unavailable, the RAND_bytes() call fails.
>
> In the 1.0.* versions of the OpenSSL library not all calls to RAND*
> functions were checked for success, and it caused some problems.
> LibreSSL treats their RNG functions as never-failed, and I do not know
> about BoringSSL.
>
> So we need non-void RAND API and possibility to provide our own
> RAND_METHODs. If the current code is to be refactored, I ask to leave these
> options possible.
Yeah, the idea is to keep the current ENGINE API, and only change the default
RAND_METHOD which is returned by RAND_SSLeay(). So if you use any other RNG
this change shouldn't affect you.

Cheers


Re: Improving OpenSSL default RNG

Salz, Rich
In reply to this post by Alessandro Ghedini
> Yes, that's what BoringSSL does. They have three implementations: pthread,
> windows and none (which is just nops). I don't know what the availability of
> pthreads is on the above platforms (NW, OS/2 and VMS), but it should cover
> quite a bit of platforms.

I'd instead have two:  native or none.  Where hopefully native is mostly pthreads, but for NW, OS/2 and VMS it can at least start out as none.
 
> Basically they deprecated the current CRYPTO_lock and CRYPTO_THREADID
> API, and replaced that with mutex objects (CRYPTO_MUTEX).

Not sure about that, I could be persuaded either way, for CRYPTO_lock.  The THREADID might need to stay for platform portability.

> Additionally,
> this API provides thread-local storage support and "once" objects (to
> execute functions only once, for example for initialization).

I'd probably leave thread-local storage up to the application.  But the once stuff would be useful *if and only if we used it to make initialization simpler.*

 
> On top of the CRYPTO_MUTEX they added a reference counting API (which
> can use
> C11 atomics instead of mutexes), but this is not used a lot so it can be ignored
> for now.

We already have CRYPTO_add ...


Re: Improving OpenSSL default RNG

Alessandro Ghedini
In reply to this post by Dr. Matthias St. Pierre
On Fri, Oct 23, 2015 at 04:34:11PM +0200, Dr. Matthias St. Pierre wrote:

>
> Hi,
>
> I have a related question concerning alternative RNGs, hope it is not too
> off-topic:
>
> Currently we are using the NIST-SP800-90a compliant DRBG (fips_drbg_method()),
> because it seemed to us to be more sophisticated and mature than the default
> RAND_SSLeay(). At least it's better documented and tested.
>
> Currently this DRBG is only available through the FIPS object module, so you
> need to build a FIPS capable OpenSSL library in order to use it.
>
> Shouldn't the FIPS DRBG code be added to the normal code base in master, too,
> as an alternative RNG implemtation? Or is the NIST-SP800-90a DRG construction
> already obsolete outside of FIPS world?
FWIW, the FIPS module was recently removed, so FIPS_drbg_method() is not present
in master anymore. I think there are plans to reimplement the whole thing, but
I don't know anything about that.

In general the NIST DRBGs seem fairly complicated (or completely untrustworthy
like Dual EC DRBG), so I'd rather have a different implementation as default
RNG for OpenSSL.

Cheers


Re: Improving OpenSSL default RNG

Kurt Roeckx
In reply to this post by Alessandro Ghedini
On Fri, Oct 23, 2015 at 03:22:39PM +0200, Alessandro Ghedini wrote:
> Hello everyone,
>
> (sorry for the wall of text...)
>
> one of the things that both BoringSSL and LibreSSL have in common is the
> replacement of OpenSSL's default RNG RAND_SSLeay() with a simpler and saner
> alternative. Given RAND_SSLeay() complexity I think it'd be worth to at least
> consider possible alternatives for OpenSSL.

I think at least a few of us want to see something changed, but
I'm not sure what direction it's going to go.

> BoringSSL started using the system RNG (e.g. /dev/urandom) for every call to
> RAND_bytes(). Additionally, if the RDRAND instruction is available, the output
> of RDRAND is mixed with the output of the system RNG using ChaCha20. This uses
> thread-local storage to keep the global RNG state.

The problem with only relying on /dev/urandom on Linux is that
it can give you data before enough entropy has been fed into it.
At least getrandom() will fix that, but that has only been in the
kernel since 3.17 and a lot of people still don't have that.  But
maybe most people will actually have it on systems that are going
to run 1.1.

This is of course not a new problem, we also already have that
now.
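
A possible shape for "use getrandom() if available, else /dev/urandom" is
sketched below. The function names are made up, the raw syscall is used so
that it does not depend on the libc providing a wrapper, and error handling
is kept to the bare minimum.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>
#if defined(__linux__)
#include <sys/syscall.h>
#endif

/* Read exactly len bytes from /dev/urandom. Returns 0 on success, -1 on error. */
static int urandom_bytes(unsigned char *buf, size_t len)
{
    int fd = open("/dev/urandom", O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return -1;
    while (len > 0) {
        ssize_t n = read(fd, buf, len);
        if (n < 0 && errno == EINTR)
            continue;               /* interrupted, try again */
        if (n <= 0) {
            close(fd);
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    close(fd);
    return 0;
}

/*
 * Fill buf from the system RNG. On Linux, getrandom() with flags == 0 is
 * tried first; it blocks until the kernel pool has been initialized. If the
 * syscall is missing (ENOSYS on kernels older than 3.17) or fails for
 * another reason, the remaining bytes are read from /dev/urandom.
 * Returns 0 on success, -1 on error.
 */
static int system_random_bytes(unsigned char *buf, size_t len)
{
#if defined(__linux__) && defined(SYS_getrandom)
    size_t done = 0;

    while (done < len) {
        long n = syscall(SYS_getrandom, buf + done, len - done, 0);
        if (n > 0) {
            done += (size_t)n;
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;
        break;                      /* give up on getrandom(), fall back */
    }
    if (done == len)
        return 0;
    buf += done;
    len -= done;
#endif
    return urandom_bytes(buf, len);
}
```

Note that the fallback path keeps the weakness described above: on an old
kernel it can still return data before the pool is properly seeded.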

I think some people have concerns about relying only on the OS's
RNG and instead want to get a high entropy seed from it and use
that to feed our own RNG.  We might use a better (by whatever
definition) algorithm than the kernel is using.

Only using /dev/urandom might result in people complaining that we
start to drain their available entropy.  I don't know enough about
the Linux kernel's implementation of the RNG, but their
calculation just seems wrong to me.

> Incidentally, BoringSSL added a whole new API for thread-local storage which
> OpenSSL could adopt given that e.g. the ASYNC support could benefit from it
> (there are other interesting bits in BoringSSL, like the new threading API that
> could also be adopted by OpenSSL).

Threads, and therefore thread-local storage, are all very OS-specific.
And I'm afraid we don't have them on all platforms we still want
to support.  But I think that belongs in a different discussion.

> The BoringSSL method is very simple but it needs a read from /dev/urandom for
> every call to RAND_bytes() which can be slow (though, BoringSSL's RAND_bytes()
> seems to implement some sort of buffering for /dev/urandom so the cost may be
> lower).

Another issue is that /dev/urandom might not be available in a
chroot or when you run out of file descriptors.  The chroot case is
at least something LibreSSL seems to be running into, but there is
nothing we can really do about it on all OSes.

> On the other hand, LibreSSL replaced the whole RAND_* API with calls to
> OpenBSD's arc4random(). This is a nice and simple scheme that uses ChaCha20 to
> mix the internal RNG state, which is regularly reseeded from the system RNG.

I think the arc4random() thing is such a misleading name since it
doesn't have anything to do with (A)RC4 anymore.  I'm also not
sure that the implementation with ChaCha20 is the best way to go,
you could also go for something like Fortuna, or stay with
whatever we use now.

We currently don't reseed, and I have no idea how much data
we can safely extract from it now.  I think it's
rather large, but reseeding should never hurt.



Kurt


Re: Improving OpenSSL default RNG

Joey Yandle
In reply to this post by Alessandro Ghedini
> - the non-CryptGenRandom() code on Windows is just crazy. Do we even support
>    Windows versions before XP?

Some of that code really needs to go away, specifically the heap walk
code.  It is extremely unsafe, and crashes ~66% of the time when running
under the Visual Studio debugger.  There's nothing OpenSSL can do about
the crashes, because they occur deep in ntdll code.

It used to be possible to avoid calling RAND_poll on Windows, via
RAND_screen etc (at least that's what Mr Google thinks).  But
RAND_screen now *calls* RAND_poll.  And ssleay_get_rand_bytes has a
weird static local variable that guarantees calling RAND_poll at least
once even if you preseed the RNG via RAND_add.

I removed the heap walk and all the other insane kernel loading code
from my local tree, and just do the CryptGenRandom and the mixins at the
end (pid, etc).  I'd strongly suggest doing something similar in the
future.

cheers,

Joey

Re: Improving OpenSSL default RNG

Benjamin Kaduk
In reply to this post by Alessandro Ghedini
On 10/23/2015 08:22 AM, Alessandro Ghedini wrote:
> Hello everyone,
>
> (sorry for the wall of text...)
>
> one of the things that both BoringSSL and LibreSSL have in common is the
> replacement of OpenSSL's default RNG RAND_SSLeay() with a simpler and saner
> alternative. Given RAND_SSLeay() complexity I think it'd be worth to at least
> consider possible alternatives for OpenSSL.

I heartily support this; the existing RAND_SSLeay() is a bit frightening
(though I take some solace in the existence of ENGINE_rdrand()).

> BoringSSL started using the system RNG (e.g. /dev/urandom) for every call to
> RAND_bytes(). Additionally, if the RDRAND instruction is available, the output
> of RDRAND is mixed with the output of the system RNG using ChaCha20. This uses
> thread-local storage to keep the global RNG state.

/dev/urandom is simple and safe absent the chroot case.  (Note that
capsicum-using applications will frequently open a fd for /dev/urandom
before entering capability mode and leave it open; the same might be
worth considering.)  Concerns about "running out of entropy" are
unfounded; the kernel uses a CS-PRNG and if we trust its output to seed
our own scheme, we can trust its output indefinitely.
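
The "open the fd early and keep it" approach could look roughly like the
following sketch. The names are made up, and a real version would also need
to think about thread safety and about callers closing descriptors behind
the library's back.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Descriptor for /dev/urandom, opened once and kept for the process lifetime. */
static int urandom_fd = -1;

/*
 * Call early, before chroot() or before entering a sandbox, while
 * /dev/urandom is still reachable. Returns 0 on success, -1 on error.
 */
static int entropy_init(void)
{
    if (urandom_fd >= 0)
        return 0;                   /* already open */
    urandom_fd = open("/dev/urandom", O_RDONLY | O_CLOEXEC);
    return urandom_fd >= 0 ? 0 : -1;
}

/*
 * Read exactly len bytes from the pre-opened descriptor. Needs no path
 * lookup and no new descriptor. Returns 0 on success, -1 on error.
 */
static int entropy_read(unsigned char *buf, size_t len)
{
    if (urandom_fd < 0)
        return -1;                  /* entropy_init() was never called */
    while (len > 0) {
        ssize_t n = read(urandom_fd, buf, len);
        if (n < 0 && errno == EINTR)
            continue;               /* interrupted, try again */
        if (n <= 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}
```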

Intel recommends calling RDRAND in a loop since it does not always
return successfully, and IIRC best practice is to mix it in with other
inputs (i.e., not use it directly).
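
A sketch of the retry-and-mix pattern follows. To keep it portable, the
hardware source is passed in as a function pointer and simulated by a dummy
that fails two calls out of three; on x86 one would wrap the RDRAND
intrinsic instead. The limit of 10 attempts and the use of XOR for mixing
are choices made for this illustration (BoringSSL is said above to mix with
ChaCha20), and all names are made up.

```c
#include <stddef.h>
#include <stdint.h>

/* A hardware source: returns 1 and sets *out on success, 0 on transient failure. */
typedef int (*hw_rand_fn)(uint64_t *out);

/* Simulated source (illustration only): fails two calls out of every three. */
static int flaky_source(uint64_t *out)
{
    static unsigned calls;

    if (++calls % 3 != 0)
        return 0;
    *out = 0x0123456789abcdefULL;   /* fixed value, NOT random */
    return 1;
}

/* Try the source up to max_tries times. Returns 1 on success, 0 if all tries failed. */
static int hw_rand_retry(hw_rand_fn fn, uint64_t *out, int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        if (fn(out))
            return 1;
    }
    return 0;
}

/*
 * XOR hardware output into a buffer that was already filled from another
 * source (e.g. the system RNG), so the hardware output is never used on its
 * own. Returns 1 on success, 0 if the hardware source kept failing; bytes
 * mixed before the failure stay mixed.
 */
static int mix_hw_into(unsigned char *buf, size_t len, hw_rand_fn fn)
{
    size_t i = 0;

    while (i < len) {
        uint64_t w;

        if (!hw_rand_retry(fn, &w, 10))
            return 0;
        for (int b = 0; b < 8 && i < len; b++, i++)
            buf[i] ^= (unsigned char)(w >> (8 * b));
    }
    return 1;
}
```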

> Incidentally, BoringSSL added a whole new API for thread-local storage which
> OpenSSL could adopt given that e.g. the ASYNC support could benefit from it
> (there are other interesting bits in BoringSSL, like the new threading API that
> could also be adopted by OpenSSL).
>
> The BoringSSL method is very simple but it needs a read from /dev/urandom for
> every call to RAND_bytes() which can be slow (though, BoringSSL's RAND_bytes()
> seems to implement some sort of buffering for /dev/urandom so the cost may be
> lower).

Keeping the default method simple, slow, and reliable could be a
reasonable approach, given that there is always the option of inserting
an alternate implementation if performance is a concern.  ("Simple"
probably means "rely on the kernel for everything, do not use
thread-local storage, etc.")

It might also be worth having a more complicated scheme that does use
thread-local storage (on systems where we know how to implement it) and
runs fortuna or something similar, but that does not necessarily need to
be the default implementation, in my opinion.

>
> On the other hand, LibreSSL replaced the whole RAND_* API with calls to
> OpenBSD's arc4random(). This is a nice and simple scheme that uses ChaCha20 to
> mix the internal RNG state, which is regularly reseeded from the system RNG.
> The core logic of this (excluding ChaCha20 and platform-specific bits) is
> implemented in less than 200 lines of code and, at least in theory, it's the
> one that provides the best performance/simplicity trade-off (ChaCha20 can be
> pretty fast even for non-asm platform-generic implementations).

A single syscall to get entropy is nice, whether it's a sysctl node,
getentropy(), getrandom(), or some other spelling; a library call like
arc4random() is almost as good.  But I don't think we're in a position
to rip out the RAND_* API layer as LibreSSL did.

> Both of these methods are robust and mostly platform-indipendent (e.g. none of
> them uses the system time, PID or uninitilized buffers to seed the RNG state)
> and have simple implementations, so I think OpenSSL can benefit a lot from
> adopting one of them. The astute readers may point out that OpenSSL doesn't
> support ChaCha20 yet, but that's hopefully coming soon.
>
> I think there's also room for improvement in the platform-specific RAND_poll()
> implementations, e.g.:
>
> - on Linux getrandom() should be used if available
> - on OpenBSD getentropy() should be used instead of arc4random()
> - the /dev/urandom code IMO can be simplified
> - the non-CryptGenRandom() code on Windows is just crazy. Do we even support
>   Windows versions before XP?
> - is EGD actually used anywhere today?

"I really hope not."

> - what about Netware, OS/2 and VMS, do we have any users on them? IIRC support
>   for other platforms has already been removed, what are the criteria for
>   keeping support for one?
> - etc...
>
> For comparison, OpenBSD's getentropy() implementation [0] is much cleaner and
> supports many of the platforms supported by OpenSSL.
>
> So, any thought? If there's interest in this, I can look into investigating
> these things more in detail and propose possible patches.
>

I'll just note in closing that there are a number of "fortuna-like"
implementations out there that are not actually Fortuna, for example,
ones that were implemented from the Wikipedia article rather than the
actual textbook.
https://github.com/krb5/krb5/blob/master/src/lib/crypto/krb/prng_fortuna.c
was implemented from _Cryptography Engineering_, and I think
https://github.com/freebsd/freebsd/blob/master/sys/dev/random/fortuna.c
is another.

-Ben

Re: Improving OpenSSL default RNG

Peter Waltenberg
If you are going to make all that effort you may as well go for FIPS compliance as the default.

SP800-90 A/B/C do cover the areas of concern; the algorithms are simple and clear, as is the overall flow of processing that starts from 'noise' and produces safe and reliable TRNGs/PRNGs.
More importantly, you already have most of the necessary code in OpenSSL-FIPS.

And you can always swap out AES/SHA in the core for other algorithms to cater for the very paranoid and those who don't trust US algorithms, or just leave the RNG code 'pluggable' as it is now.

Peter



-----"openssl-dev" <[hidden email]> wrote: -----
To: [hidden email]
From: Benjamin Kaduk
Sent by: "openssl-dev"
Date: 10/24/2015 08:46AM
Subject: Re: [openssl-dev] Improving OpenSSL default RNG

On 10/23/2015 08:22 AM, Alessandro Ghedini wrote:
> Hello everyone,
>
> (sorry for the wall of text...)
>
> one of the things that both BoringSSL and LibreSSL have in common is the
> replacement of OpenSSL's default RNG RAND_SSLeay() with a simpler and saner
> alternative. Given RAND_SSLeay() complexity I think it'd be worth to at least
> consider possible alternatives for OpenSSL.

I heartily support this; the existing RAND_SSLeay() is a bit frightening
(though I take some solace in the existence of ENGINE_rdrand()).

> BoringSSL started using the system RNG (e.g. /dev/urandom) for every call to
> RAND_bytes(). Additionally, if the RDRAND instruction is available, the output
> of RDRAND is mixed with the output of the system RNG using ChaCha20. This uses
> thread-local storage to keep the global RNG state.

/dev/urandom is simple and safe absent the chroot case.  (Note that
capsicum-using applications will frequently open a fd for /dev/urandom
before entering capability mode and leave it open; the same might be
worth considering.)  Concerns about "running out of entropy" are
unfounded; the kernel uses a CS-PRNG and if we trust its output to seed
our own scheme, we can trust its output indefinitely.

Intel recommends calling RDRAND in a loop since it does not always
return successfully, and IIRC best practice is to mix it in with other
inputs (i.e., not use it directly).

> Incidentally, BoringSSL added a whole new API for thread-local storage which
> OpenSSL could adopt given that e.g. the ASYNC support could benefit from it
> (there are other interesting bits in BoringSSL, like the new threading API that
> could also be adopted by OpenSSL).
>
> The BoringSSL method is very simple but it needs a read from /dev/urandom for
> every call to RAND_bytes() which can be slow (though, BoringSSL's RAND_bytes()
> seems to implement some sort of buffering for /dev/urandom so the cost may be
> lower).

Keeping the default method simple, slow, and reliable could be a
reasonable approach, given that there is always the option of inserting
an alternate implementation if performance is a concern.  ("Simple"
probably means "rely on the kernel for everything, do not use
thread-local storage, etc.")

It might also be worth having a more complicated scheme that does use
thread-local storage (on systems where we know how to implement it) and
runs fortuna or something similar, but that does not necessarily need to
be the default implementation, in my opinion.

>
> On the other hand, LibreSSL replaced the whole RAND_* API with calls to
> OpenBSD's arc4random(). This is a nice and simple scheme that uses ChaCha20 to
> mix the internal RNG state, which is regularly reseeded from the system RNG.
> The core logic of this (excluding ChaCha20 and platform-specific bits) is
> implemented in less than 200 lines of code and, at least in theory, it's the
> one that provides the best performance/simplicity trade-off (ChaCha20 can be
> pretty fast even for non-asm platform-generic implementations).

A single syscall to get entropy is nice, whether it's a sysctl node,
getentropy(), getrandom(), or some other spelling; a library call like
arc4random() is almost as good.  But I don't think we're in a position
to rip out the RAND_* API layer as LibreSSL did.
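
As a rough sketch of the "single syscall" case on Linux, assuming glibc 2.25 or later for <sys/random.h>: kernel_random_bytes() is a made-up helper name, not an existing OpenSSL function, and something like it could sit behind the existing RAND_* layer rather than replace it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/random.h>   /* getrandom(): Linux, glibc >= 2.25 */

/* Fill buf with len bytes from the kernel CSPRNG.
 * Returns 0 on success, -1 on error (errno is left as set by getrandom()). */
static int kernel_random_bytes(unsigned char *buf, size_t len)
{
    size_t off = 0;

    assert(buf != NULL || len == 0);
    while (off < len) {
        /* flags == 0: block until the kernel pool has been initialized */
        ssize_t n = getrandom(buf + off, len - off, 0);

        if (n < 0) {
            if (errno == EINTR)
                continue;      /* interrupted by a signal: retry */
            return -1;
        }
        off += (size_t)n;      /* short reads are possible: keep going */
    }
    return 0;
}
```

No file descriptor is involved, so this keeps working in a chroot or after the process has run out of descriptors, which is the main attraction over opening /dev/urandom.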

> Both of these methods are robust and mostly platform-independent (e.g. none of
> them uses the system time, PID or uninitialized buffers to seed the RNG state)
> and have simple implementations, so I think OpenSSL can benefit a lot from
> adopting one of them. The astute readers may point out that OpenSSL doesn't
> support ChaCha20 yet, but that's hopefully coming soon.
>
> I think there's also room for improvement in the platform-specific RAND_poll()
> implementations, e.g.:
>
> - on Linux getrandom() should be used if available
> - on OpenBSD getentropy() should be used instead of arc4random()
> - the /dev/urandom code IMO can be simplified
> - the non-CryptGenRandom() code on Windows is just crazy. Do we even support
>   Windows versions before XP?
> - is EGD actually used anywhere today?

"I really hope not."

> - what about Netware, OS/2 and VMS, do we have any users on them? IIRC support
>   for other platforms has already been removed, what are the criteria for
>   keeping support for one?
> - etc...
>
> For comparison, OpenBSD's getentropy() implementation [0] is much cleaner and
> supports many of the platforms supported by OpenSSL.
>
> So, any thoughts? If there's interest in this, I can investigate these
> things in more detail and propose possible patches.
>

I'll just note in closing that there are a number of "Fortuna-like"
implementations out there that are not actually Fortuna; some, for
example, were implemented from the Wikipedia article rather than from
the actual textbook.
https://github.com/krb5/krb5/blob/master/src/lib/crypto/krb/prng_fortuna.c
was implemented from _Cryptography Engineering_, and I think
https://github.com/freebsd/freebsd/blob/master/sys/dev/random/fortuna.c
is another.

-Ben
_______________________________________________
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev




Re: Improving OpenSSL default RNG

Alessandro Ghedini
In reply to this post by Salz, Rich
On Fri, Oct 23, 2015 at 05:13:02pm +0000, Salz, Rich wrote:
> > Yes, that's what BoringSSL does. They have three implementations: pthread,
> > windows and none (which is just nops). I don't know what the availability of
> > pthreads is on the above platforms (NW, OS/2 and VMS), but it should cover
> > quite a bit of platforms.
>
> I'd instead have two:  native or none.  Where hopefully native is mostly
> pthreads, but for NW, OS/2, VMS it can at least start out as none.

Yeah, I guess the pthread and Windows implementations can both be called
"native".

FWIW I did some quick research and NW, OS/2 and VMS all seem to support pthreads
(but I don't know anything about those platforms, so I may be wrong).

> > Basically they deprecated the current CRYPTO_lock and CRYPTO_THREADID
> > API, and replaced that with mutex objects (CRYPTO_MUTEX).
>
> Not sure about that, I could be persuaded either way, for CRYPTO_lock.  The
> THREADID might need to stay for platform portability.

I think it could be possible to implement CRYPTO_MUTEX and thread-local storage
support by using CRYPTO_lock and CRYPTO_THREADID, so that could be kept as
fallback (e.g. it could be hidden behind OPENSSL_NO_DEPRECATE).

Incidentally a big user of the lock and thread-id API is mem_dbg.c, and looking
at the code in it I was wondering whether we really need it, considering that
now we have better tools to debug memory problems. So at some point I'd like to
try and make OPENSSL_malloc & co. aliases for malloc(), realloc() and free()
and remove (or deprecate) the custom memory functions... but that's probably a
whole different discussion.

> > Additionally,
> > this API provides thread-local storage support and "once" objects (to
> > execute functions only once, for example for initialization).
>
> I'd probably leave thread-local storage up to the application.

FWIW the ASYNC pull request [0] already uses thread-local storage, but instead
of using the pthread API (which is probably more portable) it uses the __thread
syntax.

The ERR_STATE thing could also be simplified a lot by using thread-local
storage (and the fallback thread-local support can be implemented using
THREADID as it's currently done in ERR_STATE itself, but all the complexity
would be moved to its own file, leaving err.c cleaner).

> But the once stuff would be useful *if and only if we used it to make
> initialization simpler.*

The RAND code would be an obvious candidate, but even in BoringSSL once objects
are not used a lot (not that it would add a lot of code since it would be just
a wrapper for pthread_once()).
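
To show how thin such a wrapper would be, here is a sketch on top of pthread_once(). The demo_* and rand_* names are hypothetical and are not taken from BoringSSL or OpenSSL; the init function only counts calls where real code would seed the RNG.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical "once" object: just a renamed pthread_once_t. */
typedef pthread_once_t demo_once_t;
#define DEMO_ONCE_INIT PTHREAD_ONCE_INIT

/* Run init exactly once per once object, no matter how many threads call. */
static void demo_once(demo_once_t *once, void (*init)(void))
{
    int rc = pthread_once(once, init);

    assert(rc == 0);
    (void)rc;   /* silence unused-variable warnings when NDEBUG is set */
}

/* Example user: lazy, thread-safe initialization of RNG state. */
static demo_once_t rand_once = DEMO_ONCE_INIT;
static int rand_init_count;

static void rand_do_init(void)
{
    rand_init_count++;   /* a real implementation would seed the RNG here */
}

static void rand_ensure_init(void)
{
    demo_once(&rand_once, rand_do_init);
}
```

Every entry point of the RAND code would call rand_ensure_init() first, which removes the need for an explicit "initialized" flag guarded by a lock.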

Cheers

[0] https://github.com/openssl/openssl/pull/433


Re: Improving OpenSSL default RNG

Kurt Roeckx
On Sat, Oct 24, 2015 at 04:22:38PM +0200, Alessandro Ghedini wrote:
>
> So at some point I'd like to
> try and make OPENSSL_malloc & co. aliases for malloc(), realloc() and free()
> and remove (or deprecate) the custom memory functions... but that's probably a
> whole different discussion.

Please note that at least on Windows you need a way to ensure that
malloc() and free() are called from the same DLL.
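
The usual way to get that guarantee is for the library to export its own free routine next to every routine that allocates, so both ends run against the same C runtime heap. A small sketch, with made-up lib_* names:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical library-side API: memory handed out by the library ... */
char *lib_strdup(const char *s)
{
    size_t n;
    char *p;

    assert(s != NULL);
    n = strlen(s) + 1;
    p = malloc(n);           /* allocated with the library's own runtime */
    if (p != NULL)
        memcpy(p, s, n);
    return p;
}

/* ... must be released through the library as well, never by calling
 * free() directly from the application. */
void lib_free(void *p)
{
    free(p);                 /* freed with the same runtime that allocated */
}
```

Keeping wrappers like these costs almost nothing even if they end up being plain calls to malloc() and free().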

> > > Additionally,
> > > this API provides thread-local storage support and "once" objects (to
> > > execute functions only once, for example for initialization).
> >
> > I'd probably leave thread-local storage up to the application.
>
> FWIW the ASYNC pull request [0] already uses thread-local storage, but instead
> of using the pthread API (which is probably more portable) it uses the __thread
> syntax.

__thread isn't exactly portable; the "__" should make that clear.
C11 does add support for threads, but we're still stuck at C89
with some minor extensions.

But threading libraries always have to provide a way to do
thread-local storage: pthreads provides pthread_key_create() and
Windows has TlsAlloc().
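
For the pthreads side, a minimal sketch of per-thread state looks roughly like this. The rng_state struct and the rng_* names are hypothetical; on Windows, TlsAlloc() with TlsGetValue() and TlsSetValue() would play the same role.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct rng_state { unsigned long counter; };   /* hypothetical per-thread state */

static pthread_key_t rng_key;
static pthread_once_t rng_key_once = PTHREAD_ONCE_INIT;

static void rng_key_init(void)
{
    /* free() is the destructor: it runs at thread exit on non-NULL values */
    int rc = pthread_key_create(&rng_key, free);

    assert(rc == 0);
    (void)rc;
}

/* Return the calling thread's state, allocating it on first use.
 * Returns NULL if allocation or registration fails. */
static struct rng_state *rng_get_state(void)
{
    struct rng_state *st;

    pthread_once(&rng_key_once, rng_key_init);
    st = pthread_getspecific(rng_key);
    if (st == NULL) {
        st = calloc(1, sizeof(*st));
        if (st != NULL && pthread_setspecific(rng_key, st) != 0) {
            free(st);
            st = NULL;
        }
    }
    return st;
}

/* Demo thread body: reports the counter value it sees in its own state. */
static void *worker(void *arg)
{
    struct rng_state *st = rng_get_state();

    *(unsigned long *)arg = (st != NULL) ? st->counter : (unsigned long)-1;
    return NULL;
}
```

A new thread always starts with fresh zeroed state, whatever the main thread has stored in its own copy, and none of this needs anything beyond C89 plus the platform's threading library.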


Kurt


Re: Improving OpenSSL default RNG

Marcus Meissner
In reply to this post by Alessandro Ghedini
On Fri, Oct 23, 2015 at 07:19:11PM +0200, Alessandro Ghedini wrote:

> On Fri, Oct 23, 2015 at 04:34:11PM +0200, Dr. Matthias St. Pierre wrote:
> >
> > Hi,
> >
> > I have a related question concerning alternative RNGs, hope it is not too
> > off-topic:
> >
> > Currently we are using the NIST-SP800-90a compliant DRBG (FIPS_drbg_method()),
> > because it seemed to us to be more sophisticated and mature than the default
> > RAND_SSLeay(). At least it's better documented and tested.
> >
> > Currently this DRBG is only available through the FIPS object module, so you
> > need to build a FIPS capable OpenSSL library in order to use it.
> >
> > Shouldn't the FIPS DRBG code be added to the normal code base in master, too,
> > as an alternative RNG implementation? Or is the NIST-SP800-90a DRBG construction
> > already obsolete outside of FIPS world?
>
> FWIW, the FIPS module was recently removed, so FIPS_drbg_method() is not present
> in master anymore. I think there are plans to reimplement the whole thing, but
> I don't know anything about that.
>
> In general the NIST DRBGs seem fairly complicated (or completely untrustworthy
> like Dual EC DRBG), so I'd rather have a different implementation as default
> RNG for OpenSSL.

Well, the Dual EC has been removed from the guidance.

The other 3 modes described in NIST 800-90a make sense though. I suggest reading
the standard; the main things making it long are all the error handling and
reseeding strategies.

Ciao, Marcus

Re: Improving OpenSSL default RNG

Salz, Rich
In reply to this post by Alessandro Ghedini
> Yeah, I guess the pthread and Windows implementations can both be called
> "native".

Yes, that's the intent.
 
> FWIW I did some quick research and NW, OS/2 and VMS all seem to support
> pthreads (but I don't know anything about those platforms, so I may be
> wrong).

That would be good.

> Incidentally a big user of the lock and thread-id API is mem_dbg.c, and
> looking at the code in it I was wondering whether we really need it,

Take a look at: https://github.com/openssl/openssl/pull/450 

> FWIW the ASYNC pull request [0] already uses thread-local storage, but
> instead of using the pthread API (which is probably more portable) it uses
> the __thread syntax.

That should probably be changed.
 
> The ERR_STATE thing could also be simplified a lot by using thread-local
> storage (and the fallback thread-local support can be implemented using
> THREADID as it's currently done in ERR_STATE itself, but all the complexity
> would be moved to its own file, leaving err.c cleaner).

Yes, that should be changed too.

In case it's not clear, I've changed my thoughts.  Thread-local storage is
more important and useful (thanks Kaduk and Ghedini!) than pthread-once kinds
of things. I'd like to see a single API that initializes *everything* (or
maybe takes flag bits) and a peer routine that tears everything down.  It
would handle fork() and reset the RNG, and so on.
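
A very rough sketch of the shape I have in mind follows. Every name in it is hypothetical, it is not thread-safe as written, and the "subsystems" are only flags; the point is the pairing of one init call with one cleanup call, plus a fork handler that marks the RNG for reseeding in the child.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical flag bits selecting which subsystems to bring up. */
#define LIB_INIT_RAND 0x01u
#define LIB_INIT_ERR  0x02u
#define LIB_INIT_ALL  (LIB_INIT_RAND | LIB_INIT_ERR)

static unsigned int lib_inited;     /* subsystems currently initialized */
static int rng_needs_reseed;        /* set at init and in a forked child */
static int atfork_registered;

/* Runs in the child after fork(): it must not reuse the parent's RNG state. */
static void lib_atfork_child(void)
{
    rng_needs_reseed = 1;
}

/* Initialize the requested subsystems.  Returns 1 on success, 0 on error. */
int lib_init(unsigned int flags)
{
    if ((flags & ~LIB_INIT_ALL) != 0)
        return 0;                   /* unknown flag bit */
    if (!atfork_registered) {
        if (pthread_atfork(NULL, NULL, lib_atfork_child) != 0)
            return 0;
        atfork_registered = 1;
    }
    if (flags & LIB_INIT_RAND)
        rng_needs_reseed = 1;       /* seed lazily on first use */
    lib_inited |= flags;
    return 1;
}

/* Called by the (hypothetical) RNG before it produces any output. */
int lib_rng_must_reseed(void)
{
    assert(lib_inited & LIB_INIT_RAND);   /* RNG used before lib_init() */
    return rng_needs_reseed;
}

/* Peer routine: tear everything down again. */
void lib_cleanup(void)
{
    lib_inited = 0;
    rng_needs_reseed = 0;
}
```

An application would call lib_init(LIB_INIT_ALL) once at startup and lib_cleanup() once at exit, instead of the current collection of per-subsystem setup calls.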


Re: Improving OpenSSL default RNG

Dr. Matthias St. Pierre
In reply to this post by Marcus Meissner

On 10/24/2015 05:55 PM, Marcus Meissner wrote:

> On Fri, Oct 23, 2015 at 07:19:11PM +0200, Alessandro Ghedini wrote:
>> On Fri, Oct 23, 2015 at 04:34:11PM +0200, Dr. Matthias St. Pierre wrote:
>> ...
>> In general the NIST DRBGs seem fairly complicated (or completely untrustworthy
>> like Dual EC DRBG), so I'd rather have a different implementation as default
>> RNG for OpenSSL.
>
> Well, the Dual EC has been removed from the guidance.
>
> The other 3 modes described in NIST 800-90a make sense though. I suggest reading
> the standard; the main things making it long are all the error handling and
> reseeding strategies.
>
> Ciao, Marcus

I agree; to me it seems to be a rather straightforward implementation of a hybrid RNG. To get an impression of the
essentials, e.g. for the DRBG based on AES-CTR, it helps to have a look at Figures 11 (p.49) and 12 (p.51)
of <http://csrc.nist.gov/publications/nistpubs/800-90A/SP800-90A.pdf>.

The nice part about the DRBG is that one can connect it to an external entropy source and configure
the reseed interval. It also supports prediction resistance on demand, although this feature is not available through
FIPS_drbg_method(), only if one uses FIPS_drbg_generate() directly.

So it would be convenient for us to have it available in the normal OpenSSL library without having to fiddle
with the FIPS object module. It wouldn't have to be the default OpenSSL RNG, though.

Regards, Matthias
