thread-safety questions on 1.0.1c

Thomas Eckert
I am seeing lots of errors whose error message reads
  "S <server_ip>: 2851965808:error:14092105:SSL
routines:SSL3_GET_SERVER_HELLO:wrong cipher returned:s3_clnt.c:963:"
if I run my application in several (8+) threads. Single-threaded it all
works fine, so I guess the kind of issue is obvious.

I assumed this was related to my code initiating OpenSSL thread-safety
with deprecated calls, e.g.
     CRYPTO_set_locking_callback(ssl_lock);
     CRYPTO_set_id_callback(ssl_get_thread_id);
where ssl_lock() simply uses glib mutexes to do the locking and
ssl_get_thread_id() uses pthread_self() to return the thread's id. These
have worked perfectly in the past for a long time so I didn't expect
them to be the source of the problem. Anyway, since the docs at
http://www.openssl.org/docs/crypto/threads.html advise to use the new
calls with any version >= 1.0.0 I replaced
     CRYPTO_set_locking_callback(ssl_lock);
with
     CRYPTO_THREADID_set_callback(threadid_func);
where threadid_func is just
     CRYPTO_THREADID_set_numeric(id, pthread_self());

and also added the dynamic locking functions. Before, though, I checked
the OpenSSL sources and I got the feeling those dynamic locks would only
rarely (if at all) get called. So far, my dynamic locks have not
been called once by OpenSSL - confirming this here
http://fixunix.com/openssl/359957-re-clarification-questions-openssl-thread-safe-support.html

In my application, there is only one global context and it is used to
set up all SSL sessions. To be able to do so, it is modified heavily
(read: for every connection) prior to calling SSL_new(ssl_ctx). This may
include setting the ciphers, the certificates, SNI, etc., depending on
the situation and the needs for that connection. Since I couldn't find a
locking callback inside the SSL_CTX, the whole code is protected by a
mutex on my end so I am fairly sure concurrent access on the SSL_CTX in
my code is not the problem. But maybe after calling SSL_new(ssl_ctx)
there's some magic going on behind the doors which accesses the context
again ? Of course, such access would no longer be safe and would need to
be controlled (how?).

As a side note: Am I correct in assuming the 'old'
CRYPTO_set_locking_callback() function did not get a replacement, as did
CRYPTO_set_id_callback() ? I couldn't find any such replacement in the
sources and I suppose that's one of the places where the dyn locks are
supposed to come in, in future versions.

Anyone got an idea on how to proceed ?

Regards,
  Thomas
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
User Support Mailing List                    [hidden email]
Automated List Manager                           [hidden email]

RE: thread-safety questions on 1.0.1c

J. J. Farrell-2
> From: Thomas Eckert [mailto:[hidden email]]
> Sent: Tuesday, November 20, 2012 9:44 AM
>
> I am seeing lots of errors whose error message reads
>   "S <server_ip>: 2851965808:error:14092105:SSL
> routines:SSL3_GET_SERVER_HELLO:wrong cipher returned:s3_clnt.c:963:"
> if I run my application in several (8+) threads. Single-threaded it all
> works fine, so I guess the kind of issue is obvious.
>
> I assumed this was related to my code initiating OpenSSL thread-safety
> with deprecated calls, e.g.
>      CRYPTO_set_locking_callback(ssl_lock);
>      CRYPTO_set_id_callback(ssl_get_thread_id);
> where ssl_lock() simply uses glib mutexes to do the locking and
> ssl_get_thread_id() uses pthread_self() to return the thread's id.
> These
> have worked perfectly in the past for a long time so I didn't expect
> them to be the source of the problem. Anyway, since the docs at
> http://www.openssl.org/docs/crypto/threads.html advise to use the new
> calls with any version >= 1.0.0 I replaced
>      CRYPTO_set_locking_callback(ssl_lock);
> with
>      CRYPTO_THREADID_set_callback(threadid_func);
> where threadid_func is just
>      CRYPTO_THREADID_set_numeric(id, pthread_self());

That threads man page was updated as part of these changes, which was a great idea, but unfortunately it got mangled in the process. In particular, while it talks in detail about locking and thread ID callbacks, it apparently no longer gives any clue how to actually set the locking callback. It also now contains a lot of material which appears to be internal implementation detail and APIs private to the library, which gets in the way of finding the required information for the external APIs (the most important bit of which is no longer there, or at least is so well hidden that I can't find it).

My understanding is that you still need to call CRYPTO_set_locking_callback() to register your locking function, as before. Without that you have no locking, leading to what you're seeing.
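For illustration, here is a minimal sketch of what such a locking function and its lock table could look like. It is not taken from either poster's code: it uses pthread mutexes rather than glib, and the names lock_table, ssl_locks_init, ssl_locks_cleanup and SKETCH_CRYPTO_LOCK are hypothetical. To stay compilable without the OpenSSL headers, the sketch takes the lock count as a parameter and mirrors the CRYPTO_LOCK flag locally; a real program would presumably pass CRYPTO_num_locks(), test against CRYPTO_LOCK from <openssl/crypto.h>, and register the function with CRYPTO_set_locking_callback(ssl_lock) before starting any threads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Assumption: mirrors the value of CRYPTO_LOCK in the OpenSSL 1.0.x headers.
 * Real code should include <openssl/crypto.h> and use CRYPTO_LOCK itself. */
#define SKETCH_CRYPTO_LOCK 1

/* One mutex per static lock that the library may ask for. */
static pthread_mutex_t *lock_table = NULL;
static int lock_count = 0;

/* Allocate and initialise num_locks mutexes (in real code: CRYPTO_num_locks()).
 * Returns 0 on success, -1 on failure. */
int ssl_locks_init(int num_locks)
{
    int i;

    if (num_locks <= 0)
        return -1;
    lock_table = malloc((size_t)num_locks * sizeof(*lock_table));
    if (lock_table == NULL)
        return -1;
    for (i = 0; i < num_locks; i++) {
        if (pthread_mutex_init(&lock_table[i], NULL) != 0) {
            /* Roll back the mutexes that were already initialised. */
            while (i-- > 0)
                pthread_mutex_destroy(&lock_table[i]);
            free(lock_table);
            lock_table = NULL;
            return -1;
        }
    }
    lock_count = num_locks;
    return 0;
}

/* Same signature as the function passed to CRYPTO_set_locking_callback():
 * lock mutex n when the lock bit is set in mode, otherwise unlock it. */
void ssl_lock(int mode, int n, const char *file, int line)
{
    (void)file; /* file and line are only useful for debugging */
    (void)line;
    assert(n >= 0 && n < lock_count);
    if (mode & SKETCH_CRYPTO_LOCK)
        pthread_mutex_lock(&lock_table[n]);
    else
        pthread_mutex_unlock(&lock_table[n]);
}

/* Destroy the mutexes; call only after the callback has been unregistered. */
void ssl_locks_cleanup(void)
{
    int i;

    for (i = 0; i < lock_count; i++)
        pthread_mutex_destroy(&lock_table[i]);
    free(lock_table);
    lock_table = NULL;
    lock_count = 0;
}
```

The table has to be fully initialised before the callback is registered, since the library may call the function at any time after that.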

What's really changed is the thread ID mechanism. The biggest change for people working on "normal" OSes (such as Windows, and UNIX and other POSIXy things) is that you no longer need thread ID callbacks at all. If your OS is Windows, or if errno has a different address in each thread, then the built-in thread ID mechanism is all you need.
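If you want to check the errno condition on a given platform, a small probe along these lines should do. This is only a sketch with pthreads; the helper names record_errno_address and errno_is_per_thread are made up for illustration and are not part of OpenSSL.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Thread body: store the address of this thread's errno as an integer. */
static void *record_errno_address(void *out)
{
    *(uintptr_t *)out = (uintptr_t)&errno;
    return NULL;
}

/* Returns 1 if a second thread sees a different errno address than the
 * calling thread, 0 if both see the same address, -1 on a thread error. */
int errno_is_per_thread(void)
{
    uintptr_t other = 0;
    pthread_t worker;

    if (pthread_create(&worker, NULL, record_errno_address, &other) != 0)
        return -1;
    if (pthread_join(worker, NULL) != 0)
        return -1;
    return other != (uintptr_t)&errno;
}
```

A result of 1 means the address of errno distinguishes threads, which is the condition described above for relying on the built-in thread ID mechanism.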

My code runs only on such server OSes, and my change when moving up to 1.0.0 and later was simply to remove my dodgy thread ID callback function and the call to CRYPTO_set_id_callback(). All the standard and dynamic locking stuff stays the same.

I can't say for certain that this is correct, and I've only just made the change and haven't yet tested it thoroughly, but it's my understanding after some thinking and digging.

Not that this explains why you started seeing the problem when you still had your original locking callback in place; that is worrying ...

Regards,
                 jjf


> and also added the dynamic locking functions. Before, though, I checked
> the OpenSSL sources and I got the feeling those dynamic locks would
> only
> rarely (if at all) get called. So far, my dynamic locks have not
> been called once by OpenSSL - confirming this here
> http://fixunix.com/openssl/359957-re-clarification-questions-openssl-
> thread-safe-support.html
>
> In my application, there is only one global context and it is used to
> set up all SSL sessions. To be able to do so, it is modified heavily
> (read: for every connection) prior to calling SSL_new(ssl_ctx). This
> may
> include setting the ciphers, the certificates, SNI, etc., depending on
> the situation and the needs for that connection. Since I couldn't find
> a
> locking callback inside the SSL_CTX, the whole code is protected by a
> mutex on my end so I am fairly sure concurrent access on the SSL_CTX in
> my code is not the problem. But maybe after calling SSL_new(ssl_ctx)
> there's some magic going on behind the doors which accesses the context
> again ? Of course, such access would no longer be safe and would need
> to
> be controlled (how?).
>
> As a side note: Am I correct in assuming the 'old'
> CRYPTO_set_locking_callback() function did not get a replacement, as
> did
> CRYPTO_set_id_callback() ? I couldn't find any such replacement in the
> sources and I suppose that's one of the places where the dyn locks are
> supposed to come in, in future versions.
>
> Anyone got an idea on how to proceed ?
>
> Regards,
>   Thomas