Race Condition

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

Race Condition

Serti Ayoub
Hi,

We upgraded a code base from openssl 1.0.2 to openssl1.1.1b.
The upgrade was straightforward and we manage to activate and use TLS1.3.
OpenSSL is used to implement multi-threaded HTTPS server.

While using TLS1.3 as the minimum version with option SSL_OP_NO_TICKET  the app crash randomly due to a race condition in openssl, exactly when 2 threads use the same SSL_SESSION*.

Some t

We don't install any session management callback and we keep session caching mode to 2 ( SSL_SESS_CACHE_SERVER).

I made some debugging/tracing and I found that SSL_OP_NO_TICKET force openssl to call lookup_sess_in_cache() function ( file ssl_sess.c ) 

CRYPTO_THREAD_read_lock(s->session_ctx->lock);
ret = lh_SSL_SESSION_retrieve(s->session_ctx->sessions, &data);
if (ret != NULL) {
/* don't allow other threads to steal it: */
SSL_SESSION_up_ref(ret);
}
CRYPTO_THREAD_unlock(s->session_ctx->lock);

lh_SSL_SESSION_retriev() doesn't make any lock check in retrieved session and lookup_sess_in_cache don't lock the session avec SSL_SESSION have a CRYPT_RWLOCK member.

I can't provide a sample to reproduce the crash, it's totaly random.
 
Here example of thread call stack:

Thread1:
libssl-1_1-x64.dll!tls_parse_ctos_supported_groups(ssl_st * s, PACKET * pkt, unsigned int context, x509_st * x, unsigned __int64 chainidx) Ligne 966 C
  libssl-1_1-x64.dll!tls_parse_extension(ssl_st * s, tlsext_index_en idx, int context, raw_extension_st * exts, x509_st * x, unsigned __int64 chainidx) Ligne 715 C
  libssl-1_1-x64.dll!tls_parse_all_extensions(ssl_st * s, int context, raw_extension_st * exts, x509_st * x, unsigned __int64 chainidx, int fin) Ligne 748 C
  libssl-1_1-x64.dll!tls_early_post_process_client_hello(ssl_st * s) Ligne 1883 C
  libssl-1_1-x64.dll!tls_post_process_client_hello(ssl_st * s, WORK_STATE wst) Ligne 2222 C
  libssl-1_1-x64.dll!ossl_statem_server_post_process_message(ssl_st * s, WORK_STATE wst) Ligne 1220 C
  libssl-1_1-x64.dll!read_state_machine(ssl_st * s) Ligne 664 C
  libssl-1_1-x64.dll!state_machine(ssl_st * s, int server) Ligne 434 C
  libssl-1_1-x64.dll!ossl_statem_accept(ssl_st * s) Ligne 256 C
  libssl-1_1-x64.dll!ssl3_read_bytes(ssl_st * s, int type, int * recvd_type, unsigned char * buf, unsigned __int64 len, int peek, unsigned __int64 * readbytes) Ligne 1270 C
  libssl-1_1-x64.dll!ssl3_read_internal(ssl_st * s, void * buf, unsigned __int64 len, int peek, unsigned __int64 * readbytes) Ligne 4473 C
  libssl-1_1-x64.dll!ssl3_read(ssl_st * s, void * buf, unsigned __int64 len, unsigned __int64 * readbytes) Ligne 4498 C
  libssl-1_1-x64.dll!ssl_read_internal(ssl_st * s, void * buf, unsigned __int64 num, unsigned __int64 * readbytes) Ligne 1754 C
  libssl-1_1-x64.dll!SSL_read(ssl_st * s, void * buf, int num) Ligne 1766 C

Thread2:
  libcrypto-1_1-x64.dll!CRYPTO_malloc(unsigned __int64 num, const char * file, int line) Ligne 222 C
 libssl-1_1-x64.dll!tls1_save_u16(PACKET * pkt, wchar_t * * pdest, unsigned __int64 * pdestlen) Ligne 1779 C
  libssl-1_1-x64.dll!tls_parse_ctos_supported_groups(ssl_st * s, PACKET * pkt, unsigned int context, x509_st * x, unsigned __int64 chainidx) Ligne 968 C
  libssl-1_1-x64.dll!tls_parse_extension(ssl_st * s, tlsext_index_en idx, int context, raw_extension_st * exts, x509_st * x, unsigned __int64 chainidx) Ligne 715 C
  libssl-1_1-x64.dll!tls_parse_all_extensions(ssl_st * s, int context, raw_extension_st * exts, x509_st * x, unsigned __int64 chainidx, int fin) Ligne 748 C
  libssl-1_1-x64.dll!tls_early_post_process_client_hello(ssl_st * s) Ligne 1883 C
  libssl-1_1-x64.dll!tls_post_process_client_hello(ssl_st * s, WORK_STATE wst) Ligne 2222 C
  libssl-1_1-x64.dll!ossl_statem_server_post_process_message(ssl_st * s, WORK_STATE wst) Ligne 1220 C
  libssl-1_1-x64.dll!read_state_machine(ssl_st * s) Ligne 664 C
  libssl-1_1-x64.dll!state_machine(ssl_st * s, int server) Ligne 434 C
  libssl-1_1-x64.dll!ossl_statem_accept(ssl_st * s) Ligne 256 C
  libssl-1_1-x64.dll!ssl3_read_bytes(ssl_st * s, int type, int * recvd_type, unsigned char * buf, unsigned __int64 len, int peek, unsigned __int64 * readbytes) Ligne 1270 C
  libssl-1_1-x64.dll!ssl3_read_internal(ssl_st * s, void * buf, unsigned __int64 len, int peek, unsigned __int64 * readbytes) Ligne 4473 C
  libssl-1_1-x64.dll!ssl3_read(ssl_st * s, void * buf, unsigned __int64 len, unsigned __int64 * readbytes) Ligne 4498 C
  libssl-1_1-x64.dll!ssl_read_internal(ssl_st * s, void * buf, unsigned __int64 num, unsigned __int64 * readbytes) Ligne 1754 C
  libssl-1_1-x64.dll!SSL_read(ssl_st * s, void * buf, int num) Ligne 1766 C


Reply | Threaded
Open this post in threaded view
|

Re: Race Condition

Dr Paul Dale
The SSL sessions are not thread safe.  It is up to the calling application to ensure that this race condition does not occur.


Pauli
-- 
Dr Paul Dale | Cryptographer | Network Security & Encryption 
Phone +61 7 3031 7217
Oracle Australia



On 14 Jun 2019, at 8:09 pm, Serti Ayoub <[hidden email]> wrote:

Hi,

We upgraded a code base from openssl 1.0.2 to openssl1.1.1b.
The upgrade was straightforward and we manage to activate and use TLS1.3.
OpenSSL is used to implement multi-threaded HTTPS server.

While using TLS1.3 as the minimum version with option SSL_OP_NO_TICKET  the app crash randomly due to a race condition in openssl, exactly when 2 threads use the same SSL_SESSION*.

Some t

We don't install any session management callback and we keep session caching mode to 2 ( SSL_SESS_CACHE_SERVER).

I made some debugging/tracing and I found that SSL_OP_NO_TICKET force openssl to call lookup_sess_in_cache() function ( file ssl_sess.c ) 

CRYPTO_THREAD_read_lock(s->session_ctx->lock);
ret = lh_SSL_SESSION_retrieve(s->session_ctx->sessions, &data);
if (ret != NULL) {
/* don't allow other threads to steal it: */
SSL_SESSION_up_ref(ret);
}
CRYPTO_THREAD_unlock(s->session_ctx->lock);

lh_SSL_SESSION_retriev() doesn't make any lock check in retrieved session and lookup_sess_in_cache don't lock the session avec SSL_SESSION have a CRYPT_RWLOCK member.

I can't provide a sample to reproduce the crash, it's totaly random.
 
Here example of thread call stack:

Thread1:
libssl-1_1-x64.dll!tls_parse_ctos_supported_groups(ssl_st * s, PACKET * pkt, unsigned int context, x509_st * x, unsigned __int64 chainidx) Ligne 966 C
  libssl-1_1-x64.dll!tls_parse_extension(ssl_st * s, tlsext_index_en idx, int context, raw_extension_st * exts, x509_st * x, unsigned __int64 chainidx) Ligne 715 C
  libssl-1_1-x64.dll!tls_parse_all_extensions(ssl_st * s, int context, raw_extension_st * exts, x509_st * x, unsigned __int64 chainidx, int fin) Ligne 748 C
  libssl-1_1-x64.dll!tls_early_post_process_client_hello(ssl_st * s) Ligne 1883 C
  libssl-1_1-x64.dll!tls_post_process_client_hello(ssl_st * s, WORK_STATE wst) Ligne 2222 C
  libssl-1_1-x64.dll!ossl_statem_server_post_process_message(ssl_st * s, WORK_STATE wst) Ligne 1220 C
  libssl-1_1-x64.dll!read_state_machine(ssl_st * s) Ligne 664 C
  libssl-1_1-x64.dll!state_machine(ssl_st * s, int server) Ligne 434 C
  libssl-1_1-x64.dll!ossl_statem_accept(ssl_st * s) Ligne 256 C
  libssl-1_1-x64.dll!ssl3_read_bytes(ssl_st * s, int type, int * recvd_type, unsigned char * buf, unsigned __int64 len, int peek, unsigned __int64 * readbytes) Ligne 1270 C
  libssl-1_1-x64.dll!ssl3_read_internal(ssl_st * s, void * buf, unsigned __int64 len, int peek, unsigned __int64 * readbytes) Ligne 4473 C
  libssl-1_1-x64.dll!ssl3_read(ssl_st * s, void * buf, unsigned __int64 len, unsigned __int64 * readbytes) Ligne 4498 C
  libssl-1_1-x64.dll!ssl_read_internal(ssl_st * s, void * buf, unsigned __int64 num, unsigned __int64 * readbytes) Ligne 1754 C
  libssl-1_1-x64.dll!SSL_read(ssl_st * s, void * buf, int num) Ligne 1766 C

Thread2:
  libcrypto-1_1-x64.dll!CRYPTO_malloc(unsigned __int64 num, const char * file, int line) Ligne 222 C
 libssl-1_1-x64.dll!tls1_save_u16(PACKET * pkt, wchar_t * * pdest, unsigned __int64 * pdestlen) Ligne 1779 C
  libssl-1_1-x64.dll!tls_parse_ctos_supported_groups(ssl_st * s, PACKET * pkt, unsigned int context, x509_st * x, unsigned __int64 chainidx) Ligne 968 C
  libssl-1_1-x64.dll!tls_parse_extension(ssl_st * s, tlsext_index_en idx, int context, raw_extension_st * exts, x509_st * x, unsigned __int64 chainidx) Ligne 715 C
  libssl-1_1-x64.dll!tls_parse_all_extensions(ssl_st * s, int context, raw_extension_st * exts, x509_st * x, unsigned __int64 chainidx, int fin) Ligne 748 C
  libssl-1_1-x64.dll!tls_early_post_process_client_hello(ssl_st * s) Ligne 1883 C
  libssl-1_1-x64.dll!tls_post_process_client_hello(ssl_st * s, WORK_STATE wst) Ligne 2222 C
  libssl-1_1-x64.dll!ossl_statem_server_post_process_message(ssl_st * s, WORK_STATE wst) Ligne 1220 C
  libssl-1_1-x64.dll!read_state_machine(ssl_st * s) Ligne 664 C
  libssl-1_1-x64.dll!state_machine(ssl_st * s, int server) Ligne 434 C
  libssl-1_1-x64.dll!ossl_statem_accept(ssl_st * s) Ligne 256 C
  libssl-1_1-x64.dll!ssl3_read_bytes(ssl_st * s, int type, int * recvd_type, unsigned char * buf, unsigned __int64 len, int peek, unsigned __int64 * readbytes) Ligne 1270 C
  libssl-1_1-x64.dll!ssl3_read_internal(ssl_st * s, void * buf, unsigned __int64 len, int peek, unsigned __int64 * readbytes) Ligne 4473 C
  libssl-1_1-x64.dll!ssl3_read(ssl_st * s, void * buf, unsigned __int64 len, unsigned __int64 * readbytes) Ligne 4498 C
  libssl-1_1-x64.dll!ssl_read_internal(ssl_st * s, void * buf, unsigned __int64 num, unsigned __int64 * readbytes) Ligne 1754 C
  libssl-1_1-x64.dll!SSL_read(ssl_st * s, void * buf, int num) Ligne 1766 C



Reply | Threaded
Open this post in threaded view
|

Re: Race Condition

Matt Caswell-2
In reply to this post by Serti Ayoub


On 14/06/2019 11:09, Serti Ayoub wrote:
>
> I can't provide a sample to reproduce the crash, it's totaly random.
>  
> Here example of thread call stack:

Yes, this does look like a bug. My guess is most people don't hit this because
they don't set SSL_OP_NO_TICKET in TLSv1.3. The default behaviour is to use
stateless tickets which aren't shared between threads, so no race condition is
possible. However, with SSL_OP_NO_TICKET we use stateful tickets which means the
session objects *are* shared.

Session objects are supposed to be immutable after the initial handshake is
complete so that this sort of thing doesn't happen. Looks like that isn't the
case in the handling of supported groups. In reality there is no reason at all
to store the supported groups information in the session object since we don't
reuse that information from one session resume to another anyway so its just
misplaced in the session object.

Please try out this patch:

https://github.com/openssl/openssl/pull/9162

Matt



Reply | Threaded
Open this post in threaded view
|

Re: Race Condition

Viktor Dukhovni
In reply to this post by Dr Paul Dale
> On Jun 14, 2019, at 8:02 AM, Dr Paul Dale <[hidden email]> wrote:
>
> The SSL sessions are not thread safe.  It is up to the calling application to ensure that this race condition does not occur.

Paul, it sounds like you're confusing (SSL_SESSION *) with (SSL *).

--
        Viktor.

Reply | Threaded
Open this post in threaded view
|

Re: Race Condition

Dr Paul Dale
I did confuse things, apologies.
One day I’ll learn that I shouldn’t answer questions late on a Friday evening after a long week.

Pauli
-- 
Dr Paul Dale | Cryptographer | Network Security & Encryption 
Phone +61 7 3031 7217
Oracle Australia



On 15 Jun 2019, at 5:33 am, Viktor Dukhovni <[hidden email]> wrote:

On Jun 14, 2019, at 8:02 AM, Dr Paul Dale <[hidden email]> wrote:

The SSL sessions are not thread safe.  It is up to the calling application to ensure that this race condition does not occur.

Paul, it sounds like you're confusing (SSL_SESSION *) with (SSL *).

--
Viktor.