Deadlock in RAND_poll's Heap32First call

Deadlock in RAND_poll's Heap32First call

sandeep kiran p
Hi,

OpenSSL Version: 0.9.8o
OS : Windows Server 2008 R2 SP1

I am seeing a deadlock in a Windows application between two threads: one thread is calling Heap32First from OpenSSL's RAND_poll, and the other is allocating memory from the heap.

Here are the relevant stack traces from both threads involved in the deadlock.

Thread 523
----------------
ntdll!ZwWaitForSingleObject+a
ntdll!RtlpWaitOnCriticalSection+e8
ntdll!RtlEnterCriticalSection+d1
ntdll!RtlpAllocateHeap+18a6
ntdll!RtlAllocateHeap+16c
ntdll!RtlpAllocateUserBlock+145
ntdll!RtlpLowFragHeapAllocFromContext+4e7
ntdll!RtlAllocateHeap+e4
ntdll!RtlInitializeCriticalSectionEx+d2
ntdll!RtlpActivateLowFragmentationHeap+181
ntdll!RtlpPerformHeapMaintenance+27
ntdll!RtlpAllocateHeap+1819
ntdll!RtlAllocateHeap+16c


Thread 454
-----------------
ntdll!NtWaitForSingleObject+0xa
ntdll!RtlpWaitOnCriticalSection+0xe8
ntdll!RtlEnterCriticalSection+0xd1
ntdll!RtlLockHeap+0x3b
ntdll!RtlpQueryExtendedHeapInformation+0xf4
ntdll!RtlQueryHeapInformation+0x3c
ntdll!RtlQueryProcessHeapInformation+0x3ad
ntdll!RtlQueryProcessDebugInformation+0x3b0
kernel32!Heap32First+0x71

WinDBG reports that threads 523 and 454 both hold locks and are waiting for each other's locks, thereby resulting in a deadlock.

On searching, I have found a couple of instances where such an issue has been reported with Heap32Next on Windows 7, but I haven't found anything that helps me solve the problem. Most of the references I found conclude that this could be caused by a bug in the heap traversal APIs. If someone has faced a similar problem, can you guide me to possible workarounds by which I can avoid the deadlock? Can I remove the heap traversal routines and find some other sources of entropy?

Thanks for your help.

Regards
Sandeep





Re: Deadlock in RAND_poll's Heap32First call

Jakob Bohm-7
 From the evidence given, I would *almost* certainly characterize
this as a deadlock bug in ntdll.dll, the deepest, most trusted
user mode component of Windows!

Specifically, nothing should allow regular user code such as
OpenSSL to hold onto NT internal critical sections while not
running inside NTDLL, and NTDLL should be designed not to
deadlock against itself.

There is one other possibility though:

The OpenSSL code in rand_win.c holds on to a "snapshot" lock
on some of the heap data while walking it.  It may be doing
this in a way not permitted by the rules that are presumed
by the deadlock avoidance design of the speed critical heap
locking code.

Enjoy

Jakob
--
Jakob Bohm, CIO, Partner, WiseMo A/S.  http://www.wisemo.com
Transformervej 29, 2730 Herlev, Denmark.  Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded

______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
User Support Mailing List                    [hidden email]
Automated List Manager                           [hidden email]

Re: Deadlock in RAND_poll's Heap32First call

sandeep kiran p
You mentioned that OpenSSL is holding a "snapshot" lock in rand_win.c. I couldn't find anything like that in that file. Can you point me to the specific code that you are referring to? I would also like to get an opinion on possible workarounds that I can apply to avoid the deadlock.

1. Can I remove the heap traversal routines Heap32First and Heap32Next? Will it badly affect the PRNG output later on?

2. Can I replace Heap32First and Heap32Next calls with any other sources of entropy? What if I make a call to CryptGenRandom again in place of the heap traversal routines?

3. Any other possible ways out?

Thanks,
Sandeep

Resources for certificates using OpenSSL (newbie)

Jaquez Jr, Hector L.

Hello,

I am new to certificates: how to create them, how to import them, etc. I am looking for good training material that I can read to learn more about this. I can create a CSR file but don't know how to import it using the command line or a GUI, for that matter. We have servers that use Apache, so I need to learn how to import the CRT once I get it using OpenSSL, the format the certificate needs to be in, or whether there is a GUI I can use to import the certificate. If anyone knows of a good one-stop-shop resource, please let me know.

Thanks,

Hector L. Jaquez Jr.
Data Security Analyst II
HQ AAFES, Information Technology Governance
W 214-312-4449
BB 214-794-3641

Re: Resources for certificates using OpenSSL (newbie)

Michael S. Zick-4
On Fri February 24 2012, Jaquez Jr, Hector L. wrote:
> Hello,
> I am new to certificates, how to create them, how to import them etc.

You must be new to mailing lists also.
Start your own thread, they are cheap here, don't hijack another topic.

Mike

RE: Resources for certificates using OpenSSL (newbie)

Edward Ned Harvey (openssl)
> From: [hidden email] [mailto:owner-openssl-
> [hidden email]] On Behalf Of Michael S. Zick
>
> You must be new to mailing lists also.
> Start your own thread, they are cheap here, don't hijack another topic.

Mike, How do you call that a thread hijack?  New subject, new thread id...
I don't see how it was a thread hijack.

Hector, I wish I had a good resource to send your way. My experience has
been like this:  Years ago when I didn't know anything about generating or
installing certs, I just found some random webpages about how to generate
self-signed certs and I copied them brainlessly, but gained some
familiarity.  Later I wanted to have trusted signed certs, so I paid for
services such as GoDaddy and Thawte, and brainlessly followed their
instructions, but gained further experience.  More recently, I'm a fan of
startssl.com.

Often when I do this sort of stuff, the instructions written by whoever are
slightly too specific, or the starting point or resources available to you
at the time are slightly different.  The industry keeps evolving a little
bit.  Targets move.

For example, on a Cisco ASA, last year I generated a CSR and got it signed.
This year I went to regenerate and renew, but I found the ASA is only
capable of signing using MD5, which is no longer accepted by the
certificate authority, so even though I'm doing precisely the same task as I
did 1 year ago, I can't follow the same process anymore.

Hopefully someone can refer you to a good introductory set of materials, but
I think most likely, you'll find too often something isn't written precisely
for what you need, or something else has changed.  

I suggest you basically just start experimenting and learning.  Ask
questions here when you get stuck.  The more exposure you give yourself, the
better you'll learn.

Re: Resources for certificates using OpenSSL (newbie)

Jakob Bohm-7
On 2/24/2012 8:27 PM, Edward Ned Harvey wrote:
>> From: [hidden email] [mailto:owner-openssl-
>> [hidden email]] On Behalf Of Michael S. Zick
>>
>> You must be new to mailing lists also.
>> Start your own thread, they are cheap here, don't hijack another topic.
> Mike, How do you call that a thread hijack?  New subject, new thread id...
> I don't see how it was a thread hijack.
This depends on your choice of e-mail program!

If you look at Hector's mail headers, they list the
mails about "Deadlock in RAND_poll's Heap32First call"
in the "References:" header, which is what many mail
programs (but obviously not Outlook version 14) use
to find out which mails belong to the same thread.
If Outlook 14 has a button to "create a new
thread", it is unfortunate if that button adds
headers related to other threads anyway.
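
The threading rule described above can be sketched in C. This is only
an illustration, not the logic of any particular mail program: the
function name and the sample Message-IDs are hypothetical, and a real
client would first unfold and parse the header according to the mail
format rules.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True for characters that may delimit a Message-ID inside a header value. */
static bool is_separator(char c)
{
    return c == '\0' || c == ' ' || c == '\t' || c == '\r' || c == '\n';
}

/* Returns true if the value of a "References:" header lists message_id
 * as a whole whitespace-delimited token (hypothetical helper). */
bool references_contain(const char *references, const char *message_id)
{
    assert(references != NULL && message_id != NULL);

    size_t id_len = strlen(message_id);
    if (id_len == 0)
        return false;

    for (const char *p = strstr(references, message_id); p != NULL;
         p = strstr(p + 1, message_id)) {
        bool starts_token = (p == references) || is_separator(p[-1]);
        bool ends_token = is_separator(p[id_len]);
        if (starts_token && ends_token)
            return true;
    }
    return false;
}
```

A mail reader that threads this way would place a new message under
an existing thread whenever this check succeeds for any Message-ID
already in that thread, whatever the Subject line says.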

Enjoy

Jakob

Re: Deadlock in RAND_poll's Heap32First call

Jakob Bohm-7
In reply to this post by sandeep kiran p
On 2/24/2012 2:14 PM, sandeep kiran p wrote:
> You mentioned that OpenSSL is holding a "snapshot" lock in rand_win.c.
> I couldn't find anything like that in that file. Can you specifically
> point me to the code that you are referring to? I would also like to
> get an opinion on possible workarounds that I can enforce to avoid the
> deadlock.
>
In OpenSSL 1.0.0 it is line 486 of rand_win.c, which says

          module_next && (handle = snap(TH32CS_SNAPALL,0))

where snap is a pointer to KERNEL32.CreateToolhelp32Snapshot()
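
For readers unfamiliar with the Toolhelp API, the general shape of
such a heap walk can be sketched as follows. This is not the OpenSSL
code: it is a minimal sketch that assumes a snapshot of the current
process only (TH32CS_SNAPHEAPLIST rather than TH32CS_SNAPALL), the
function name is hypothetical, and on non-Windows builds it compiles
to a stub that returns -1.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
#include <tlhelp32.h>
#endif

/* Walks the heaps of the current process and returns the number of heap
 * blocks visited (at most max_entries), or -1 if no snapshot could be
 * taken or the platform is not Windows. Hypothetical helper. */
long count_own_heap_entries(long max_entries)
{
    assert(max_entries >= 0);
#ifdef _WIN32
    HANDLE snapshot = CreateToolhelp32Snapshot(TH32CS_SNAPHEAPLIST,
                                               GetCurrentProcessId());
    if (snapshot == INVALID_HANDLE_VALUE)
        return -1;

    long visited = 0;
    HEAPLIST32 heap_list;
    heap_list.dwSize = sizeof(heap_list);

    if (Heap32ListFirst(snapshot, &heap_list)) {
        do {
            HEAPENTRY32 entry;
            ZeroMemory(&entry, sizeof(entry));
            entry.dwSize = sizeof(entry);

            /* Heap32First is the call at the bottom of the reported
             * stack trace for the thread doing the heap walk. */
            if (Heap32First(&entry, heap_list.th32ProcessID,
                            heap_list.th32HeapID)) {
                do {
                    if (visited >= max_entries)
                        break;
                    visited++;
                } while (Heap32Next(&entry));
            }
            heap_list.dwSize = sizeof(heap_list);
        } while (visited < max_entries &&
                 Heap32ListNext(snapshot, &heap_list));
    }

    CloseHandle(snapshot);
    return visited;
#else
    (void)max_entries;
    return -1;
#endif
}
```

Capping the number of visited blocks keeps the walk short, but it
does not by itself rule out the lock contention reported in this
thread.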

> 1. Can I remove the heap traversal routines Heap32First and
> Heap32Next? Will it badly affect the PRNG output later on?
It depends on how good the other sources of random numbers are,
more below.
>
> 2. Can I replace Heap32First and Heap32Next calls with any other
> sources of entropy? What if I make a call to CryptGenRandom again in
> place of the heap traversal routines?
Calling CryptGenRandom() twice isn't going to help much.

If CryptGenRandom() is as good as it is "supposed to" be,
the other entropy sources are not really needed.  But if
CryptGenRandom() is somehow broken or untrustworthy,
calling it a million times wouldn't help.

Anyway, I have my doubts about the value of using the local
heap walking functions as a source of entropy, as they
reflect only the state of your own process.  Pretending that
the address and size of each malloc()-ed memory block in
your process contributes 3 to 5 bytes of additional entropy
(which is what the comments say) is wildly optimistic and
quite unrealistic.

In a long-running web browser or a similarly long-running
web server, the net total of the memory layout effects of
thousands of semi-chaotic previous network requests and
user actions might contribute a total of 10 to 50 bits of
entropy.  But in a typical freshly started process, the
layout is going to be pretty deterministic (if the OS
uses address space layout randomization, it probably does so
based on entropy sources already incorporated into its
standard random source, i.e. CryptGenRandom() on Windows).
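
To put numbers on that gap, here is a small arithmetic sketch. The
3 to 5 bytes per block and the 10 to 50 bits total come from the
estimates above; the block count of 100 used in the note below and
both function names are hypothetical.

```c
#include <assert.h>

/* Entropy, in bits, that would be credited if every walked heap block
 * were assumed to contribute bytes_per_block bytes of entropy. */
long credited_entropy_bits(long blocks_walked, long bytes_per_block)
{
    assert(blocks_walked >= 0 && bytes_per_block >= 0);
    return blocks_walked * bytes_per_block * 8;
}

/* Whole-number factor by which the credited amount exceeds a more
 * realistic estimate, both given in bits. */
long overcredit_factor(long credited_bits, long realistic_bits)
{
    assert(credited_bits >= 0 && realistic_bits > 0);
    return credited_bits / realistic_bits;
}
```

With 100 blocks walked, credited_entropy_bits(100, 3) gives 2400 bits
and credited_entropy_bits(100, 5) gives 4000 bits. Against the upper
estimate of 50 bits, that is an over-credit by a factor of 48 to 80.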

Enjoy

Jakob

Re: Deadlock in RAND_poll's Heap32First call

Jeffrey Walton-3
On Fri, Feb 24, 2012 at 4:08 PM, Jakob Bohm <[hidden email]> wrote:

> On 2/24/2012 2:14 PM, sandeep kiran p wrote:
>>
>> You mentioned that OpenSSL is holding a "snapshot" lock in rand_win.c. I
>> couldn't find anything like that in that file. Can you specifically point me
>> to the code that you are referring to? I would also like to get an opinion
>> on possible workarounds that I can enforce to avoid the deadlock.
>>
> In OpenSSL 1.0.0 it is line 486 which says
>
>         module_next && (handle = snap(TH32CS_SNAPALL,0))
>
> where snap is a pointer to KERNEL32.CreateToolhelp32Snapshot()
I've found that creating too many Toolhelp snapshots too frequently
causes problems (in a different problem domain). How frequently is
OpenSSL doing it? Just once during module startup? From where is it
being called (it can't be DllMain)?

If the heap walk occurs once after DllMain, there should not be any
problems (in theory).

>> 1. Can I remove the heap traversal routines Heap32First and Heap32Next?
>> Will it badly affect the PRNG output later on?
>
> It depends how good the other sources of random numbers are,
> more below.
>>
>>
>> 2. Can I replace Heap32First and Heap32Next calls with any other sources
>> of entropy? What if I make a call to CryptGenRandom again in place of the
>> heap traversal routines?
>
> Calling CryptGenRandom() twice isn't going to help much.
>
> If CryptGenRandom() is as good as it is "supposed to" be,
> the other entropy sources are not really needed.  But if
> CryptGenRandom() is somehow broken or untrustworthy,
> calling it a million times wouldn't help.
"Cryptanalysis of the Random Number Generator of the Windows Operating
System," eprint.iacr.org/2007/419.pdf

>
> [SNIP]
>

Also of interest might be "Analysis of the Linux Random Number
Generator," eprint.iacr.org/2006/086.pdf.

Jeff

Re: Resources for certificates using OpenSSL (newbie)

Michael S. Zick-4
In reply to this post by Edward Ned Harvey (openssl)
On Fri February 24 2012, Edward Ned Harvey wrote:
> > From: [hidden email] [mailto:owner-openssl-
> > [hidden email]] On Behalf Of Michael S. Zick
> >
> > You must be new to mailing lists also.
> > Start your own thread, they are cheap here, don't hijack another topic.
>
> Mike, How do you call that a thread hijack?  New subject, new thread id...
> I don't see how it was a thread hijack.
>

Message-ID: <[hidden email]>
References: <[hidden email]>
        <[hidden email]>
 <[hidden email]>
In-Reply-To: <[hidden email]>
 


Re: Deadlock in RAND_poll's Heap32First call

sandeep kiran p
In reply to this post by Jakob Bohm-7
MSDN says

"To enumerate the heap or module states for all processes, specify TH32CS_SNAPALL and set th32ProcessID to zero."

So it presumably does the heap and module walk for all processes and not only for the current process.

Do you think CreateToolhelp32Snapshot's lock on the read-only snapshot could be a possible culprit?

I am now thinking about removing the calls to Heap32First and Heap32Next in rand_win.c and looking for alternate sources of entropy.

Thanks for your help.

Regards
Sandeep

On Sat, Feb 25, 2012 at 2:38 AM, Jakob Bohm <[hidden email]> wrote:
On 2/24/2012 2:14 PM, sandeep kiran p wrote:
You mentioned that OpenSSL is holding a "snapshot" lock in rand_win.c. I couldn't find anything like that in that file. Can you specifically point me to the code that you are referring to? I would also like to get an opinion on possible workarounds that I can enforce to avoid the deadlock.

In OpenSSL 1.0.0 it is line 486 which says

        module_next && (handle = snap(TH32CS_SNAPALL,0))

where snap is a pointer to KERNEL32.CreateToolhelp32Snapshot()


1. Can I remove the heap traversal routines Heap32First and Heap32Next? Will it badly affect the PRNG output later on?
It depends how good the other sources of random numbers are,
more below.


2. Can I replace Heap32First and Heap32Next calls with any other sources of entropy? What if I make a call to CryptGenRandom again in place of the heap traversal routines?
Calling CryptGenRandom() twice isn't going to help much.

If CryptGenRandom() is as good as it is "supposed to" be,
the other entropy sources are not really needed.  But if
CryptGenRandom() is somehow broken or untrustworthy,
calling it a million times wouldn't help.

Anyway, I have my doubts about the value of using the local
heap walking functions as a source of entropy, as they
reflect only the state of your own process.  Pretending that
the address and size of each malloc()-ed memory block in
your process contributes 3 to 5 bytes of additional entropy
(which is what the comments say) is wildly optimistic and
quite unrealistic.

In a long-running web browser or a similarly long running
web server, the net total of the memory layout effects of
thousands of semi-chaotic previous network requests and
user actions might contribute a total of 10 to 50 bits of
entropy.  But in a typical freshly started process, the
layout is going to be pretty deterministic (if the OS
uses address layout randomization, it probably does so
based on entropy sources already incorporated into its
standard random source, i.e. CryptGenRandom() on Windows).


3. Any other possible ways out?

Thanks,
Sandeep

On Thu, Feb 23, 2012 at 10:08 PM, Jakob Bohm <[hidden email] <mailto:[hidden email]>> wrote:

   From the evidence given, I would *almost* certainly characterize
   this as a deadlock bug in ntdll.dll, the deepest, most trusted
   user mode component of Windows!

   Specifically, nothing should allow regular user code such as
   OpenSSL to hold onto NT internal critical sections while not
   running inside NTDLL, and NTDLL should be designed not to
   deadlock against itself.

   There is one other possibility though:

   The OpenSSL code in rand_win.c holds on to a "snapshot" lock
   on some of the heap data while walking it.  It may be doing
   this in a way not permitted by the rules that are presumed
   by the deadlock avoidance design of the speed critical heap
   locking code.


   On 2/23/2012 2:11 PM, sandeep kiran p wrote:

       Hi,

       OpenSSL Version: 0.9.8o
       OS : Windows Server 2008 R2 SP1

       I am seeing a deadlock in a windows application between two
       threads, one thread calling Heap32First from OpenSSL's
       RAND_poll and the other that allocates memory over the heap.

       Here is the relevant stack trace from both the threads
       involved in deadlock.

       Thread 523
       ----------------
       ntdll!ZwWaitForSingleObject+a
       ntdll!RtlpWaitOnCriticalSection+e8
       ntdll!RtlEnterCriticalSection+d1
       ntdll!RtlpAllocateHeap+18a6
       ntdll!RtlAllocateHeap+16c
       ntdll!RtlpAllocateUserBlock+145
       ntdll!RtlpLowFragHeapAllocFromContext+4e7
       ntdll!RtlAllocateHeap+e4
       ntdll!RtlInitializeCriticalSectionEx+d2
       ntdll!RtlpActivateLowFragmentationHeap+181
       ntdll!RtlpPerformHeapMaintenance+27
       ntdll!RtlpAllocateHeap+1819
       ntdll!RtlAllocateHeap+16c


       Thread 454
       -----------------
       ntdll!NtWaitForSingleObject+0xa
       ntdll!RtlpWaitOnCriticalSection+0xe8
       ntdll!RtlEnterCriticalSection+0xd1
       ntdll!RtlLockHeap+0x3b
       ntdll!RtlpQueryExtendedHeapInformation+0xf4
       ntdll!RtlQueryHeapInformation+0x3c
       ntdll!RtlQueryProcessHeapInformation+0x3ad
       ntdll!RtlQueryProcessDebugInformation+0x3b0
       kernel32!Heap32First+0x71

       WinDBG reports that thread 523 and 454 both hold locks and are
       waiting for each other locks thereby resulting in a deadlock.

       On searching, I have found a couple instances where such an
       issue has been reported with Heap32Next on Windows 7 but
       haven't found anything that helps me solve the problem. Most
       of the references I found conclude that this could be because
       of a possible bug in heap traversal APIs. If someone has faced
       a similar problem, can you guide me to possible workarounds by
       which I can avoid the deadlock? Can I remove the heap
       traversal routines and find some other sources of entropy?

       Thanks for your help.


Enjoy

Jakob
--
Jakob Bohm, CIO, Partner, WiseMo A/S.  http://www.wisemo.com
Transformervej 29, 2730 Herlev, Denmark.  Direct <a href="tel:%2B45%2031%2013%2016%2010" value="+4531131610" target="_blank">+45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded

______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
User Support Mailing List                    [hidden email]
Automated List Manager                           [hidden email]


Re: Deadlock in RAND_poll's Heap32First call

Jakob Bohm-7
On 2/25/2012 3:30 PM, sandeep kiran p wrote:
> MSDN says
>
> " To enumerate the heap or module states for all processes, specify
> TH32CS_SNAPALL and set /th32ProcessID/ to zero. "
>
> So it presumably does the heap and module walk for all processes and
> not only for the current process.
>
Aha!  I missed that detail in this hard-to-read code.  I had
enough trouble untangling the crazy run-on lines and the
function pointers, whose names differ wildly from the
functions they point to, not to mention the lack of a comment
explaining why the code doesn't check for a missing pointer
to the snapshot close function (there is a reason, several
pages further down in the code, but still no comment).
> Do you think *CreateToolhelp32Snapshot's* lock on the read-only
> snapshot could be a possible culprit?
That was the guess, but only a guess; it is hard to know without
spending several days reverse engineering that particular
version of the heap code in ntdll.
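
For anyone following along, the general shape of such a deadlock, and
the usual try-and-back-off way to avoid it, can be sketched in portable
C with pthreads.  This is only an illustration under stated
assumptions: heap_lock and lfh_lock are hypothetical stand-ins for the
two critical sections in the traces, try_lock_pair() and unlock_pair()
are made-up helpers, and nothing here reflects what ntdll actually does
internally.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two locks seen in the stack traces. */
static pthread_mutex_t heap_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lfh_lock  = PTHREAD_MUTEX_INITIALIZER;

/*
 * Try to take `first` and then `second` without ever blocking.
 * If either lock is busy, release whatever was taken and report
 * failure, so the caller never waits while holding one lock and
 * wanting the other (the condition that produces a deadlock).
 */
static bool try_lock_pair(pthread_mutex_t *first, pthread_mutex_t *second)
{
    if (pthread_mutex_trylock(first) != 0)
        return false;
    if (pthread_mutex_trylock(second) != 0) {
        pthread_mutex_unlock(first);   /* back off completely */
        return false;
    }
    return true;
}

/* Release both locks in the reverse order of acquisition. */
static void unlock_pair(pthread_mutex_t *first, pthread_mutex_t *second)
{
    int rc = pthread_mutex_unlock(second);
    assert(rc == 0);
    rc = pthread_mutex_unlock(first);
    assert(rc == 0);
    (void)rc;
}
```

A caller that gets false simply retries later.  Because it never sleeps
while holding one lock and wanting the other, two threads taking the
locks in opposite orders cannot block each other forever.  Application
code cannot retrofit this onto ntdll's own locks, of course.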

>
> I am now thinking about removing the calls to Heap32First and
> Heap32Next in rand_win.c and looking for alternate sources of entropy.
>
> Thanks for your help.
>
> Regards
> Sandeep
>
> On Sat, Feb 25, 2012 at 2:38 AM, Jakob Bohm <[hidden email]> wrote:
>
>     On 2/24/2012 2:14 PM, sandeep kiran p wrote:
>
>         You mentioned that OpenSSL is holding a "snapshot" lock in
>         rand_win.c. I couldn't find anything like that in that file.
>         Can you specifically point me to the code that you are
>         referring to? I would also like to get an opinion on possible
>         workarounds that I can enforce to avoid the deadlock.
>
>     In OpenSSL 1.0.0 it is line 486 which says
>
>             module_next && (handle = snap(TH32CS_SNAPALL,0))
>
>     where snap is a pointer to KERNEL32.CreateToolhelp32Snapshot()
>
>
>         1. Can I remove the heap traversal routines Heap32First and
>         Heap32Next? Will it badly affect the PRNG output later on?
>
>     It depends how good the other sources of random numbers are,
>     more below.
>
>
>         2. Can I replace Heap32First and Heap32Next calls with any
>         other sources of entropy? What if I make a call to
>         CryptGenRandom again in place of the heap traversal routines?
>
>     Calling CryptGenRandom() twice isn't going to help much.
>
>     If CryptGenRandom() is as good as it is "supposed to" be,
>     the other entropy sources are not really needed.  But if
>     CryptGenRandom() is somehow broken or untrustworthy,
>     calling it a million times wouldn't help.
>
>     Anyway, I have my doubts about the value of using the local
>     heap walking functions as a source of entropy, as they
>     reflect only the state of your own process.  Pretending that
>     the address and size of each malloc()-ed memory block in
>     your process contributes 3 to 5 bytes of additional entropy
>     (which is what the comments say) is wildly optimistic and
>     quite unrealistic.
>
>     In a long-running web browser or a similarly long running
>     web server, the net total of the memory layout effects of
>     thousands of semi-chaotic previous network requests and
>     user actions might contribute a total of 10 to 50 bits of
>     entropy.  But in a typical freshly started process, the
>     layout is going to be pretty deterministic (if the OS
>     uses address layout randomization, it probably does so
>     based on entropy sources already incorporated into its
>     standard random source, i.e. CryptGenRandom() on Windows).
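
To make the point about entropy credit concrete, here is a toy sketch
of a pool that mixes in whatever it is given but only counts the
entropy the caller is willing to vouch for, similar in spirit to the
entropy argument of RAND_add().  Everything in it (toy_pool,
toy_pool_add, TOY_POOL_INIT, the FNV-1a style mixing) is hypothetical
and not cryptographically sound; it only shows that mixing in heap-walk
data and crediting entropy for it are separate decisions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FNV-1a 64-bit offset basis, used here only as a starting state. */
#define TOY_POOL_INIT 14695981039346656037ULL

/* Hypothetical toy pool: a mixing state plus a running entropy credit. */
struct toy_pool {
    uint64_t state;         /* NOT a secure mixer, illustration only */
    double   credited_bits; /* entropy the caller has vouched for */
};

/*
 * Mix `len` bytes into the pool and add `credit_bits` to the estimate.
 * Passing credit_bits = 0.0 still stirs the data in, but claims nothing
 * about how unpredictable it is.
 */
static void toy_pool_add(struct toy_pool *p, const void *data, size_t len,
                         double credit_bits)
{
    const unsigned char *bytes = data;

    assert(p != NULL);
    for (size_t i = 0; i < len; i++) {
        p->state ^= bytes[i];
        p->state *= 1099511628211ULL;  /* FNV-1a 64-bit prime */
    }
    p->credited_bits += credit_bits;
}
```

Heap addresses and sizes would be fed in with credit_bits set to 0.0,
while only output from a trusted source would get a nonzero credit.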
>
>
>         3. Any other possible ways out?
>
>         Thanks,
>         Sandeep
>
>         On Thu, Feb 23, 2012 at 10:08 PM, Jakob Bohm <[hidden email]> wrote:
>
>            From the evidence given, I would *almost* certainly characterize
>            this as a deadlock bug in ntdll.dll, the deepest, most trusted
>            user mode component of Windows!
>
>            Specifically, nothing should allow regular user code such as
>            OpenSSL to hold onto NT internal critical sections while not
>            running inside NTDLL, and NTDLL should be designed not to
>            deadlock against itself.
>
>            There is one other possibility though:
>
>            The OpenSSL code in rand_win.c holds on to a "snapshot" lock
>            on some of the heap data while walking it.  It may be doing
>            this in a way not permitted by the rules that are presumed
>            by the deadlock avoidance design of the speed critical heap
>            locking code.
>
>
>            On 2/23/2012 2:11 PM, sandeep kiran p wrote:
>
>                Hi,
>
>                OpenSSL Version: 0.9.8o
>                OS : Windows Server 2008 R2 SP1
>
>                I am seeing a deadlock in a windows application between two
>                threads, one thread calling Heap32First from OpenSSL's
>                RAND_poll and the other that allocates memory over the heap.
>
>                Here is the relevant stack trace from both the threads
>                involved in deadlock.
>
>                Thread 523
>                ----------------
>                ntdll!ZwWaitForSingleObject+a
>                ntdll!RtlpWaitOnCriticalSection+e8
>                ntdll!RtlEnterCriticalSection+d1
>                ntdll!RtlpAllocateHeap+18a6
>                ntdll!RtlAllocateHeap+16c
>                ntdll!RtlpAllocateUserBlock+145
>                ntdll!RtlpLowFragHeapAllocFromContext+4e7
>                ntdll!RtlAllocateHeap+e4
>                ntdll!RtlInitializeCriticalSectionEx+d2
>                ntdll!RtlpActivateLowFragmentationHeap+181
>                ntdll!RtlpPerformHeapMaintenance+27
>                ntdll!RtlpAllocateHeap+1819
>                ntdll!RtlAllocateHeap+16c
>
>
>                Thread 454
>                -----------------
>                ntdll!NtWaitForSingleObject+0xa
>                ntdll!RtlpWaitOnCriticalSection+0xe8
>                ntdll!RtlEnterCriticalSection+0xd1
>                ntdll!RtlLockHeap+0x3b
>                ntdll!RtlpQueryExtendedHeapInformation+0xf4
>                ntdll!RtlQueryHeapInformation+0x3c
>                ntdll!RtlQueryProcessHeapInformation+0x3ad
>                ntdll!RtlQueryProcessDebugInformation+0x3b0
>                kernel32!Heap32First+0x71
>
>                WinDBG reports that threads 523 and 454 both hold locks and
>                are waiting for each other's locks, resulting in a deadlock.
>
>                On searching, I have found a couple instances where such an
>                issue has been reported with Heap32Next on Windows 7 but
>                haven't found anything that helps me solve the problem. Most
>                of the references I found conclude that this could be because
>                of a possible bug in heap traversal APIs. If someone has faced
>                a similar problem, can you guide me to possible workarounds by
>                which I can avoid the deadlock? Can I remove the heap
>                traversal routines and find some other sources of entropy?
>
>                Thanks for your help.
>
>
Enjoy

Jakob
--
Jakob Bohm, CIO, Partner, WiseMo A/S.  http://www.wisemo.com
Transformervej 29, 2730 Herlev, Denmark.  Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded
