Potential timing problems in OpenSSL RSA implementation

Potential timing problems in OpenSSL RSA implementation

Hanno Böck
Hi,

As many have probably seen, some people (including me) recently published
the so-called ROBOT attack [1], which is the rebirth of the old
Bleichenbacher attack against RSA in TLS.

We mostly focused on non-timing issues, and OpenSSL is not among the
vulnerable implementations. However, during various conversations I
learned something about timing problems that affect OpenSSL (but very
likely also almost every other TLS library out there).

The problem is the use of the bignum library, which uses variable-size
number representations. This means that numbers with leading zeros take
up slightly less space and thus can cause a timing signal. A more
detailed description follows below.

For RSA this means that if the result of an RSA operation has one or
several leading zero bytes, copying the data around will be slightly
faster. An attacker can thus learn something about the decrypted value.
In the end this could be turned into an attack very similar to a
Bleichenbacher attack.
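To make the variable-length behaviour concrete, here is a small
illustrative sketch (mine, using only the public BN API; build with
"cc demo.c -lcrypto"):

#include <stdio.h>
#include <openssl/bn.h>

int main(void)
{
    /* Two 4-byte big-endian inputs; the second has a leading zero byte. */
    static const unsigned char full[4] = { 0x80, 0x11, 0x22, 0x33 };
    static const unsigned char zero[4] = { 0x00, 0x11, 0x22, 0x33 };
    unsigned char out[4];

    BIGNUM *a = BN_bin2bn(full, sizeof(full), NULL);
    BIGNUM *b = BN_bin2bn(zero, sizeof(zero), NULL);

    /* BN_num_bytes() reflects the minimal encoding, not the input size. */
    printf("a: %d bytes, b: %d bytes\n", BN_num_bytes(a), BN_num_bytes(b));

    /* BN_bn2bin() writes only that many bytes; the caller has to left-pad
     * to a fixed width itself, and the amount of padding and copying
     * depends on the (secret) value. */
    int n = BN_bn2bin(b, out);
    printf("BN_bn2bin wrote %d of %d bytes\n", n, (int)sizeof(out));

    BN_free(a);
    BN_free(b);
    return 0;
}

This prints "a: 4 bytes, b: 3 bytes", i.e. the serialized length (and
the work done to produce it) follows the numeric value.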

I'd like to stress that this is highly speculative; it may very well be
that this is not exploitable in any practical way. But I thought it
important enough that it should be public knowledge. (Also, "This leaves
a small timing channel, [...] but it is not believed to be large
enough to be exploitable" is what the TLS 1.2 specification said when
it described what later became Lucky13.)

Fixing this would likely require changing the bignum library so that it
knows fixed-size types, allowing e.g. a 256-byte number to be treated
in the same way as a 255-byte number.
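For illustration, a serialization built on such fixed-size types could
look like the following sketch (my code, not OpenSSL's; it assumes the
limb count is a fixed property of the type rather than of the value):

#include <stddef.h>
#include <stdint.h>

/* Sketch of the "fixed size types" idea: with a fixed limb count,
 * serialization always touches every output byte, so a 255-byte value
 * takes exactly the same path as a 256-byte one. */
void limbs_to_be_bytes(const uint64_t *limbs, size_t nlimbs,
                       unsigned char *out /* nlimbs * 8 bytes */)
{
    for (size_t i = 0; i < nlimbs; i++) {
        uint64_t w = limbs[nlimbs - 1 - i];   /* most-significant limb first */
        for (size_t j = 0; j < 8; j++)
            out[i * 8 + j] = (unsigned char)(w >> (56 - 8 * j));
    }
}

(Newer OpenSSL versions expose a fixed-width conversion as
BN_bn2binpad(), though that addresses the output side rather than the
internal representation.)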


---------------------------

Here is the top-level RSA decrypt for OpenSSL 1.0.2's TLS implementation:
<https://github.com/openssl/openssl/blob/OpenSSL_1_0_2-stable/ssl/s3_srvr.c#L2278>
It calls RSA_private_decrypt, which ends up here:
<https://github.com/openssl/openssl/blob/OpenSSL_1_0_2-stable/crypto/rsa/rsa_eay.c#L483>
Note that that function calls BN_bn2bin
<https://github.com/openssl/openssl/blob/OpenSSL_1_0_2-stable/crypto/rsa/rsa_eay.c#L580>
to convert the resulting BIGNUM to a big-endian buffer. That function
<https://github.com/openssl/openssl/blob/OpenSSL_1_0_2-stable/crypto/bn/bn_lib.c#L646>
works based on the number of bytes in the BIGNUM. Since BIGNUMs can
never have a most-significant zero limb, if the RSA result starts with
32 (or 64) leading zero bits, there'll be a timing signal in the limb
count. Also, if it starts with 8 leading zero bits, there'll be a
timing signal in BN_bn2bin itself. BoringSSL solves the latter problem
<https://boringssl.googlesource.com/boringssl/+/296a61d6007688a1472798879b81517920e35dff/crypto/fipsmodule/bn/bytes.c#208>
but not the former.
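A sketch in the spirit of that BoringSSL fix (my illustration, not a
copy of the code linked above): every output byte is taken from limb
data, or from implicit zero padding, by the same arithmetic, so a zero
top byte costs the same as a nonzero one. The limb count itself still
varies, which is the remaining "former" problem.

#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;

static void limbs_to_bytes(unsigned char *out, size_t out_len,
                           const limb_t *in, size_t in_len)
{
    for (size_t i = 0; i < out_len; i++) {
        size_t byte_idx = out_len - 1 - i;   /* 0 = least-significant byte */
        size_t limb_idx = byte_idx / sizeof(limb_t);
        size_t shift = 8 * (byte_idx % sizeof(limb_t));
        /* This bounds check depends only on the limb count, not on the
         * byte values -- exactly the remaining 32/64-bit signal. */
        limb_t w = limb_idx < in_len ? in[limb_idx] : 0;
        out[i] = (unsigned char)(w >> shift);
    }
}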



[1] https://robotattack.org/

--
Hanno Böck
https://hboeck.de/

mail/jabber: [hidden email]
GPG: FE73757FA60E4E21B937579FA5880072BBB51E42

Re: Potential timing problems in OpenSSL RSA implementation

Andy Polyakov
Hi,

> I'd like to stress that this is highly speculative; it may very well be
> that this is not exploitable in any practical way. But I thought it
> important enough that it should be public knowledge. (Also, "This leaves
> a small timing channel, [...] but it is not believed to be large
> enough to be exploitable" is what the TLS 1.2 specification said when
> it described what later became Lucky13.)
>
> Fixing this would likely require changing the bignum library so that it
> knows fixed-size types, allowing e.g. a 256-byte number to be treated
> in the same way as a 255-byte number.

One has to keep in mind that in order to measure some/any difference,
your instrument's accuracy has to be sufficient. And I'd say this is
what led to the failure to recognize the significance of the timing
channel that became Lucky13. I mean, it was a failure to appreciate the
accuracy of the measurement, wasn't it? [In this context one can even
wonder how significant it was that the attacker was ~2x faster than the
victim in the original paper :-)] Another essential component is that
the oracle timing has to be *reliably* distinguishable from "natural"
variation. [I write "natural" in quotes because program behaviour is
not directly governed by the laws of physics, is it? Even if complex
programs tend to exhibit timing that is not purely deterministic.] With
this in mind one can actually wonder whether the timing difference
between handling 256- and 254-byte numbers at the end of the operation
can actually be measured (a rough harness for this is sketched below),
as we would be looking at a difference of a handful of cycles when the
"natural" variation is more like a hundred[s].

However! I'm not saying that it's absolutely infeasible(*), but rather
that the above might be the wrong *first* question to ask, because a
timing signal from input lengths that differ by a *limb* is more likely
to be reliably distinguishable. One of the key reasons is that inputs
of unusual lengths are customarily handled by distinct and less
optimized code paths. What *presumably* saves the day is that the
probability of hitting a limb-shorter result is prohibitively low (on
the order of 2^-64 for a random result with 64-bit limbs). [Not to
forget that the attacker would be limited by the victim's resources.]
Is this a reasonable assumption?

(*) The attacker might be in a position to amplify the signal, for
example by invalidating the victim guest's cache in a virtualized
environment.
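For what it's worth, a rough way to start on the measurability question
(my sketch, assuming POSIX clock_gettime; link with -lcrypto; taking
the minimum over many runs as a crude noise filter):

#include <stdio.h>
#include <time.h>
#include <openssl/bn.h>

static double min_ns(const BIGNUM *a, unsigned char *out, int iters)
{
    struct timespec t0, t1;
    double best = 1e18;
    for (int i = 0; i < iters; i++) {
        clock_gettime(CLOCK_MONOTONIC, &t0);
        BN_bn2bin(a, out);
        clock_gettime(CLOCK_MONOTONIC, &t1);
        double ns = (t1.tv_sec - t0.tv_sec) * 1e9
                  + (t1.tv_nsec - t0.tv_nsec);
        if (ns < best)
            best = ns;
    }
    return best;
}

int main(void)
{
    BIGNUM *full = BN_new(), *shrt = BN_new();
    unsigned char buf[256] = { 0 };

    BN_set_bit(full, 2047);        /* full 256-byte (2048-bit) value */
    BN_set_bit(shrt, 2047 - 64);   /* one 64-bit limb shorter */

    printf("full: %.0f ns, limb-short: %.0f ns\n",
           min_ns(full, buf, 100000), min_ns(shrt, buf, 100000));

    BN_free(full);
    BN_free(shrt);
    return 0;
}

Whether any difference this shows would survive network jitter and
scheduling noise on the victim's side is, of course, the real question.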