Crash at SSL_accept

Nauman Akbar

Dear Users,

I have had this problem for a long time. Initially I thought it was a problem with the multi-threading configuration, but it persists with multi-threading removed.

I have developed a simple SSL-based multi-threaded server application. Previously, OpenSSL data was shared among threads, but now all SSL functions are performed in a single thread. I am developing this application on RH9 using OpenSSL 0.9.7a. Only one client connects to this server, always using the same credentials. Both client and server use only ADH with SSLv3.

The problem is that SSL_accept sometimes fails, apparently at random, taking the server down with it. It may be a segmentation fault or some other exception. Since I connect to the machine remotely, I cannot monitor the application at all times (although I have tried), so I don’t know for certain what error is generated when the server application crashes.

One thing is always the same: the server terminates while doing a new SSL_accept. The client receives this error on the other side: 21298:error:140943FC:SSL routines:SSL3_READ_BYTES:sslv3 alert bad record mac:s3_pkt.c:1052:SSL alert number 20.
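
The hexadecimal field in that error line packs three numbers together. A minimal sketch of unpacking it, assuming the layout used by the ERR_GET_LIB, ERR_GET_FUNC and ERR_GET_REASON macros in the OpenSSL 0.9.x headers (8-bit library, 12-bit function, 12-bit reason). The struct and function names below are made up for illustration and are not part of OpenSSL:

```c
#include <stdio.h>

/* Hypothetical holder for the three fields of a packed OpenSSL error code. */
struct err_fields {
    unsigned lib;     /* library number, top 8 bits */
    unsigned func;    /* function number, middle 12 bits */
    unsigned reason;  /* reason number, low 12 bits */
};

/* Split a packed code, assuming the 0.9.x layout:
 * (lib << 24) | (func << 12) | reason. */
struct err_fields split_err_code(unsigned long code)
{
    struct err_fields f;
    f.lib    = (unsigned)((code >> 24) & 0xffUL);
    f.func   = (unsigned)((code >> 12) & 0xfffUL);
    f.reason = (unsigned)(code & 0xfffUL);
    return f;
}

/* Print the three fields in decimal on one line. */
void print_err_code(unsigned long code)
{
    struct err_fields f = split_err_code(code);
    printf("lib=%u func=%u reason=%u\n", f.lib, f.func, f.reason);
}
```

Passing 0x140943FC to split_err_code gives library 20, function 148 and reason 1020. The reason is the alert number 20 from the log line plus an offset of 1000. In a program linked against OpenSSL, the ERR_GET_* macros and ERR_error_string already do this work.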

 

Even more bizarre, sometimes it handles fewer than 500 connections before crashing, sometimes fewer than 1000. There have been a few cases of 1000-5000 requests. Last week it crashed after about 2-3 weeks, with a request count in excess of 11000. It crashed again yesterday after running for less than 24 hours and handling 40000 requests. It crashed again today within 24 hours, after 700 requests. After every crash, I changed various multi-threading options (both generic and OpenSSL-specific) to try to make it work. However, during the last 2 runs no SSL functions or data were shared among threads, so this is not a case of multi-threading failing or of a race condition causing the crash. Additionally, the application explicitly keeps the thread count under 10, so it can’t be a matter of running out of memory. The server program is quite linear and does not use dynamically allocated memory except for certain class/structure objects (no arrays etc.), so an index overrun or anything similar is also not plausible. As a sanity check, I am also having my code reviewed by others.

The situation has become very urgent, as I have to deliver this by the coming Friday and I still don’t know what is causing this. The only plausible explanation I am left with is that 0.9.7a has some issue with SSL_accept. I am trying to get a newer version installed on the system. In the meantime, can anyone guide me with this problem? Is this really a version issue, or is there anything else I need to look at?

Regards

Nauman Akbar

Concise Solutions

Re: Crash at SSL_accept

Dan Trainor-3

Nauman -

We have been battling this exact same situation for the last three
weeks, to no avail.  We're regretfully considering other options such as
GnuTLS which looks promising, although this may not be an option for you.

I wish we could have worked out the issues with OpenSSL.  Perhaps it's
our coding that is messing up, but we are unable to get any help with
our problems.  It seems as if this error is quite common, and many
people have had it, yet not many can explain it, and even fewer know how
to fix it.  We've tried compiling and linking our app against
OpenSSL-0.9.5 through OpenSSL-0.9.7g, with almost exactly the same error.
I am telling you this to save you the time.

I'm not sure that this helps, but at least we understand what you're
going through :)

Thanks
-dant
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
User Support Mailing List                    [hidden email]
Automated List Manager                           [hidden email]

Stop Compile?

Tom Spence
Hello Users,
 
I want to update the software (OpenSSL 0.9.7g), but the build stops partway through, at one file.  Here is what I did:
 
/* AIX 5100-06 */
 
# ./Configure aix-gcc zlib
.... (done)
# make
....
gcc -c -I.. -I../.. -I../../include -DOPENSSL_SYSNAME_AIX -DZLIB -DOPENSSL_THREADS -D_THREAD_SAFE -DDSO_DLFCN -DHAVE_DLFCN_H -DOPENSSL_NO_KRB5 -O3 -DB_ENDIAN  -o asm/aix_ppc32.o asm/aix_ppc32.s
<ctrl-c>
 
I had to stop it because it stayed there forever...  Any idea why?
 
Thanks.


(__[TomCigar]___{{{{~~~

Re: Crash at SSL_accept

Dr. Stephen Henson
In reply to this post by Dan Trainor-3
On Tue, May 24, 2005, dan wrote:

> Nauman -
>
> We have been battling this exact same situation for the last three
> weeks, to no avail.  We're regretfully considering other options such as
> GnuTLS which looks promising, although this may not be an option for you.
>
> I wish we could have worked out the issues with OpenSSL.  Perhaps it's
> our coding that is messing up, but we are unable to get any help with
> our problems.  It seems as if this error is quite common, and many
> people have had it, yet not many can explain it, and even fewer know how
> to fix it.  We've tried compiling and linking our app against
> OpenSSL-0.9.5 through OpenSSL-0.9.7g, with almost exactly the same error.
> I am telling you this to save you the time.
>
> I'm not sure that this helps, but at least we understand what you're
> going through :)
>

In brief, the "bad record mac" error is raised when the record MAC (a checksum
of sorts) that OpenSSL calculates doesn't agree with the value the peer has given.

There are a very large number of possible causes for this error, among them
an implementation bug (in either OpenSSL or the peer), corruption of network
data, a badly written application, or even malicious activity.

If this is causing a crash then a stack trace is needed to have a reasonable
chance to trace the cause.

Steve.
--
Dr Stephen N. Henson. Email, S/MIME and PGP keys: see homepage
OpenSSL project core developer and freelance consultant.
Funding needed! Details on homepage.
Homepage: http://www.drh-consultancy.demon.co.uk
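
When nobody is watching the server at the moment it dies, the process itself can leave a trace behind. A minimal sketch, assuming a glibc system: backtrace and backtrace_symbols_fd from <execinfo.h> are glibc extensions, not part of OpenSSL or POSIX. install_crash_handler and crash_handler are illustrative names, not something from the original poster's code:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <execinfo.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

#define MAX_FRAMES 64

/* Write the current call stack to stderr, then let the signal kill us. */
static void crash_handler(int sig)
{
    static const char msg[] = "fatal signal, backtrace follows:\n";
    void *frames[MAX_FRAMES];
    int n = backtrace(frames, MAX_FRAMES);
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof msg - 1);

    (void)ignored;
    /* backtrace_symbols_fd writes straight to the descriptor without malloc. */
    backtrace_symbols_fd(frames, n, STDERR_FILENO);
    /* SA_RESETHAND already restored the default action, so the re-raised
     * signal ends the process (and dumps core if core dumps are enabled). */
    raise(sig);
}

/* Install the handler for SIGSEGV and SIGABRT. Returns 0 on success, -1 on error. */
int install_crash_handler(void)
{
    struct sigaction sa;
    void *warmup[1];

    /* Call backtrace once up front so any lazy loading it needs
     * happens here rather than inside the signal handler. */
    backtrace(warmup, 1);

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = crash_handler;
    sa.sa_flags = SA_RESETHAND;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;
    if (sigaction(SIGABRT, &sa, NULL) != 0)
        return -1;
    return 0;
}
```

Calling install_crash_handler() once at startup, with stderr redirected to a log file, should leave the failing call chain in the log. Frames show up with symbol names only if the binary is linked with -rdynamic; otherwise the raw addresses can be resolved afterwards with a debugger. Enabling core dumps and opening the core file in gdb is another route to the same information.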

SSL3_READ_BYTES:reason(1000)

tony vong
I got this error after calling SSL_peek, which returns 0.
What does the error mean?
SSL3_READ_BYTES:reason(1000)
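
A reason number of 1000 or above from SSL3_READ_BYTES appears to be an alert received from the peer: as far as can be told from s3_pkt.c, the library reports a fatal alert as 1000 plus the alert description, and the "reason(1000)" form is printed when no text is registered for that number. On that reading, 1000 would be alert 0, close_notify. A small sketch of the mapping, with the alert numbers taken from the SSLv3 specification; the function name and the offset macro are invented for illustration:

```c
#include <stddef.h>

/* Assumed to match SSL_AD_REASON_OFFSET in the OpenSSL headers. */
#define ALERT_REASON_OFFSET 1000

/* Map an SSL3_READ_BYTES reason number to the SSLv3 alert name,
 * or return NULL if the number is not a known alert. */
const char *alert_name_from_reason(int reason)
{
    switch (reason - ALERT_REASON_OFFSET) {
    case 0:  return "close_notify";
    case 10: return "unexpected_message";
    case 20: return "bad_record_mac";
    case 30: return "decompression_failure";
    case 40: return "handshake_failure";
    case 41: return "no_certificate";
    case 42: return "bad_certificate";
    case 43: return "unsupported_certificate";
    case 44: return "certificate_revoked";
    case 45: return "certificate_expired";
    case 46: return "certificate_unknown";
    case 47: return "illegal_parameter";
    default: return NULL;
    }
}
```

With this table, reason 1000 maps to close_notify and reason 1020 (the one in the original report) maps to bad_record_mac.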
