Re: Win32 OPENSSL_USE_APPLINK usage

Re: Win32 OPENSSL_USE_APPLINK usage

Modem Man
Andy Polyakov wrote:

>>>> I actually ended up solving it by removing all uses of BIO_new_fp() in
>>>> favor of my own custom BIO  that I just finished writing earlier this
>>>> week.
>>> Why not BIO_new_file?
>>
>> Yeah, I discovered while analyzing the code that using BIO_new_file()
>> rather than BIO_new_fp() would disengage applink, however that was not
>> an option for me because BIO_new_file() can't open a file whose name
>> contains non-ANSI Unicode characters on Windows.  I have to open the
>> file myself using _wfopen().
>
> There was a suggestion from a VMware engineer a while ago to fall back
> to wfopen, but he couldn't provide a patch (not that it would be very
> complex) and it was not followed up. The idea must have been something
> similar to the just-committed http://cvs.openssl.org/chngview?cn=19610.
Why not add the following to BIO_new_file()?

- BIO interface still uses char * (meaning latin ASCII 0x20..0x7F)
- BIO implementation calls UTF8_to_UCS16() on all platforms supporting
wfopen or _wfopen
- BIO implementation then calls wfopen / _wfopen with this UCS16 value
(sometimes known as WCHAR*)
- For Win32 and Win32_WinCE the conversion can be done with the
FormatMessage() API. It's always available.
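
The conversion step in the list above could look roughly like the
following portable C sketch. The function name utf8_to_utf16() and its
signature are made up for illustration; this is not an existing OpenSSL
call, and a real implementation would probably rely on a system API
instead. It decodes UTF-8 into UTF-16 code units (the wchar_t format on
Windows) and rejects malformed input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of OpenSSL): decode a NUL-terminated
 * UTF-8 string into UTF-16 code units. Returns the number of units
 * written, not counting the terminating 0, or -1 if the input is
 * malformed or the output buffer (cap units) is too small. */
static long utf8_to_utf16(const char *utf8, uint16_t *out, size_t cap)
{
    /* Smallest code point allowed for each sequence length (1..4 bytes);
     * anything below it is an overlong encoding. */
    static const uint32_t min_cp[4] = { 0, 0x80, 0x800, 0x10000 };
    const unsigned char *p = (const unsigned char *)utf8;
    size_t n = 0;

    assert(utf8 != NULL && out != NULL);
    while (*p != 0) {
        uint32_t cp;
        int extra, i;

        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else return -1;               /* invalid lead byte */
        p++;
        for (i = 0; i < extra; i++, p++) {
            if ((*p & 0xC0) != 0x80)  /* also catches a premature NUL */
                return -1;
            cp = (cp << 6) | (*p & 0x3F);
        }
        if (cp < min_cp[extra] || cp > 0x10FFFF ||
            (cp >= 0xD800 && cp <= 0xDFFF))
            return -1;                /* overlong, out of range, surrogate */

        if (cp < 0x10000) {
            if (n + 1 >= cap) return -1;   /* keep room for terminator */
            out[n++] = (uint16_t)cp;
        } else {
            if (n + 2 >= cap) return -1;
            cp -= 0x10000;                 /* encode as surrogate pair */
            out[n++] = (uint16_t)(0xD800 | (cp >> 10));
            out[n++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        }
    }
    if (n >= cap) return -1;
    out[n] = 0;
    return (long)n;
}
```

Note that a file name stored in a single-byte locale encoding is usually
not valid UTF-8, so this function would return -1 for it rather than
convert it.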

... just my 5 cents.

The Modem Man

______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
User Support Mailing List                    [hidden email]
Automated List Manager                           [hidden email]

Re: Win32 OPENSSL_USE_APPLINK usage

Andy Polyakov
>>>>> I actually ended up solving it by removing all uses of BIO_new_fp() in
>>>>> favor of my own custom BIO  that I just finished writing earlier this
>>>>> week.
>>>> Why not BIO_new_file?
>>> Yeah, I discovered while analyzing the code that using BIO_new_file()
>>> rather than BIO_new_fp() would disengage applink, however that was not
>>> an option for me because BIO_new_file() can't open a file whose name
>>> contains non-ANSI Unicode characters on Windows.  I have to open the
>>> file myself using _wfopen().
>> There was a suggestion from a VMware engineer a while ago to fall back
>> to wfopen, but he couldn't provide a patch (not that it would be very
>> complex) and it was not followed up. The idea must have been something
>> similar to the just-committed http://cvs.openssl.org/chngview?cn=19610.
> Why not add the following to BIO_new_file()?
>
> - BIO interface still uses char * (meaning latin ASCII 0x20..0x7F)

This is an incorrect assumption. BIO_new_file and fopen allow for
characters recognized as valid in the current locale, which is not
necessarily limited to ASCII. In other words, it's perfectly possible to
BIO_new_file as well as to fopen a file with international characters in
its name, but the set of allowed non-ASCII characters varies from locale
to locale.

> - BIO implementation calls UTF8_to_UCS16()

I.e. the suggestion is to unconditionally assume UTF8 encoding of the
file name. That's not a safe assumption (see above).

> on all platforms supporting
> wfopen or _wfopen

Which platforms support wfopen? Among those supported by OpenSSL, it's
only the Windows platforms.

> - BIO implementation then calls wfopen / _wfopen with this UCS16 value
> (sometimes known as WCHAR*)
> - For Win32 and Win32_WinCE the conversion can be done with the
> FormatMessage() API. It's always available.

??? So is MultiByteToWideChar... A.

Re: Win32 OPENSSL_USE_APPLINK usage

Modem Man
Hi Andy Polyakov,
thank you for your comments on my proposal.
I understand them to mean:
    Please do not try to break fopen()/setlocale() compatibility of
BIO_new_file().
This is a good idea, no doubt! But let me go into depth by quoting your
replies...

MM>> - For Win32 and Win32_WinCE the conversion can be done with the
MM>> FormatMessage() API. It's always available.

AP> ??? So is MultiByteToWideChar... A.

Right, this was my mistake. I meant the same: MultiByteToWideChar().


MM>> Why not add the following to BIO_new_file()?
MM>> - BIO interface still uses char * (meaning latin ASCII 0x20..0x7F)

AP> This is an incorrect assumption. BIO_new_file and fopen allow for
AP> characters recognized as valid in the current locale, which is not
AP> necessarily limited to ASCII. In other words, it's perfectly possible to
AP> BIO_new_file as well as to fopen a file with international characters in
AP> its name, but the set of allowed non-ASCII characters varies.

I have had very bad experiences with filenames containing
locale-dependent characters.
They only work in homogeneous environments, without networks that
cross borders.
Please note, we have reached the year 2010 ;-)
So handling filenames the "national" (racialist) way is unlikely to
work well all the time.
Maybe I'm unusually aggressive about this because my surname contains
the German 'sz ligature' letter, which is 0xDF in /one/ of your locales,
0xF1 in /another/....


MM>>  on all platforms supporting
MM>>  wfopen or _wfopen

AP> What are the platforms supporting wfopen? Among those supported by
AP> OpenSSL? It's Windows platforms.

I don't know. For Windows XYZ: yes. For others: who knows?
How do other platforms handle such mixed-character filenames?
Are they all still doing it the racialist way?
I can't believe this! Well, Linux maybe. But HP-UX? VMS? All the same?

I often get files with basically German names (containing umlauts and
the 'sz ligature') that are then completed with Chinese or Cyrillic
suffixes. Or I get FTP directories filled with files from Germany and
Denmark (which are indeed _not_ from the same locale).
All wchar_t functions and UTF8 functions handle this perfectly.
But the "locale" shit like 1974's fopen() can't deal with such files, right?

In my eyes, "locale" is a dying species. Or at least it should be!
The future is either a mix of UTF8 and UCS16 on systems that use
wchar_t, with UTF8 as the 8-bit interface format and UCS16 used
internally, or clean UTF8 throughout.

Anyway, in summary, we can see two critical requirements:
a) do not break fopen() / setlocale() compatibility of BIO_new_file()
b) support contemporary file naming (3rd millennium!)
and from these
c) we conclude that BIO_new_file() should _not_ be changed at all.

Hmmmm. Give it another try?

Another well-proven way is to create "W" or "U" variants of each
relevant API, like:
BIO_new_fileW( const wchar_t *filename, const wchar_t *mode );
  or
BIO_new_fileU( const char *filenameUTF8, const char *mode );
The first is what Microsoft did for their API. The latter is what
I use to port locale-dependent old code into the current millennium.

How about this?
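
As a rough illustration of what the file-opening part of a "U" variant
might do, here is a minimal sketch. The name fopen_utf8() is
hypothetical and not an OpenSSL function; how the resulting FILE * would
be wrapped in a BIO is deliberately left out. On Windows it converts the
UTF-8 name with MultiByteToWideChar() and opens it with _wfopen(); on
other platforms it assumes the file system takes the bytes as they are.

```c
#include <assert.h>
#include <stdio.h>
#ifdef _WIN32
#include <wchar.h>
#include <windows.h>
#endif

/* Hypothetical helper (not part of OpenSSL): open a file whose name is
 * given as UTF-8, independent of the current locale. Returns NULL on
 * failure, like fopen(). */
static FILE *fopen_utf8(const char *name_utf8, const char *mode)
{
    assert(name_utf8 != NULL && mode != NULL);
#ifdef _WIN32
    {
        wchar_t wname[MAX_PATH];
        wchar_t wmode[16];

        /* Length -1 means the input is NUL-terminated and the output
         * includes the terminator. A return value of 0 signals failure,
         * e.g. invalid UTF-8 or a buffer that is too small. */
        if (MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                name_utf8, -1, wname, MAX_PATH) == 0)
            return NULL;
        if (MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                mode, -1, wmode, 16) == 0)
            return NULL;
        return _wfopen(wname, wmode);
    }
#else
    /* Assumption: file names are stored as raw bytes here, so a UTF-8
     * name can be passed to fopen() unchanged. */
    return fopen(name_utf8, mode);
#endif
}
```

Because this is a separate entry point, the existing BIO_new_file()
keeps its fopen()/setlocale() behaviour, which covers requirement a)
above.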


with best regards,
Modem-Man
