Building recent versions on Windows with VS 2010

Steven Kneizys
Greetings,

I have been building the newer 1.0.1 releases and the beta/snapshots of
1.0.2.  For the 32-bit builds, I like to add the "/safeseh" flag.
For 64-bit builds I like to put the output in a separate directory (such
as out64dll instead of out32dll) so that the 32- and 64-bit versions can
be built in the same tree and it is easier to keep things straight.  Here
are the changes I make to do that.  I am wondering whether this kind of
thing might be rolled into the distribution, or whether there is a better
way to do it:

--- openssl-1.0.2-stable-SNAP-20140226/util/pl/VC-32_ORIG.pl 2014-02-27 11:01:09.248294010 -0400
+++ openssl-1.0.2-stable-SNAP-20140226/util/pl/VC-32.pl 2014-02-27 11:01:09.248294010 -0400
@@ -134,7 +134,7 @@
     $ff = "/fixed";
     $opt_cflags=$f.' /Ox /O2 /Ob2';
     $dbg_cflags=$f.'d /Od -DDEBUG -D_DEBUG';
-    $lflags="/nologo /subsystem:console /opt:ref";
+    $lflags="/nologo /subsystem:console /opt:ref /safeseh";
     }
 $lib_cflag='/Zl' if (!$shlib); # remove /DEFAULTLIBs from static lib
 $mlflags='';
@@ -145,6 +145,13 @@
  $tmp_def.='_$(TARGETCPU)' if ($FLAVOR =~ /CE/);
 $inc_def="inc32";
+if ($FLAVOR =~ /WIN64/) {
+  # change directory names so 32 and 64 bit can somewhat co-exist in same build tree
+  $out_def =~ s/32/64/;
+  $tmp_def =~ s/32/64/;
+  $inc_def =~ s/32/64/;
+}
+
 if ($debug)
  {
  $cflags=$dbg_cflags.$base_cflags;
@@ -228,7 +235,7 @@
  $asmtype="win32n";
  $afile='-o ';
 } else {
- $asm='ml /nologo /Cp /coff /c /Cx /Zi';
+ $asm='ml /nologo /Cp /coff /c /Cx /Zi /safeseh';
  $afile='/Fo';
  $asmtype="win32";
 }
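
The WIN64 directory renaming in the patch above can be sketched outside Perl as follows. This is only a minimal Python illustration, not the build script itself: the function name `win64_dirs` is made up, and the starting values `out32` and `tmp32` are assumed defaults (only `inc32` and the `out32dll` output name appear above). Like Perl's `s/32/64/` without `/g`, it replaces just the first `32` in each name.

```python
import re

def win64_dirs(flavor, out_def="out32", tmp_def="tmp32", inc_def="inc32"):
    """Return (out, tmp, inc) directory names for a build flavor.

    Hypothetical helper: mirrors the patch's s/32/64/ on each name when
    the flavor string contains WIN64, and leaves the names alone otherwise.
    """
    names = [out_def, tmp_def, inc_def]
    if re.search(r"WIN64", flavor):
        # count=1 matches Perl's s/32/64/ (no /g): first occurrence only
        names = [re.sub("32", "64", name, count=1) for name in names]
    return tuple(names)

print(win64_dirs("WIN64A"))  # ('out64', 'tmp64', 'inc64')
print(win64_dirs("WIN32"))   # ('out32', 'tmp32', 'inc32')
```

Any suffix added to the name later (such as the `dll` in `out32dll`) is untouched, since only the `32` is rewritten.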


--------------------------------------------------

On the 1.0.2 builds there is an issue with VS and sha256-586.asm:
a DD declaration is too big.  It is reported in:

https://github.com/openssl/openssl/issues/34

There are two issues in #34; I am currently addressing only this one:
tmp32\sha256-586.asm(264) : error A2042:statement too complex
tmp32\sha256-586.asm(264) : error A2039:line too long

I have a patch for that to break down such declarations into smaller
chunks if needed:

--- openssl-1.0.2-stable-SNAP-20140226\crypto\perlasm\x86masm_ORIG.pl 2014-02-27 10:40:36.599122930 -0400
+++ openssl-1.0.2-stable-SNAP-20140226\crypto\perlasm\x86masm.pl 2014-02-27 10:40:36.600123057 -0400
@@ -160,13 +160,13 @@
 {   push(@out,"PUBLIC\t".&::LABEL($_[0],$nmdecor.$_[0])."\n");   }
 sub ::data_byte
-{   push(@out,("DB\t").join(',',@_)."\n"); }
+{   push @out, (("DW\t").join(',',splice(@_, 0, 16))."\n") while @_; }
 sub ::data_short
-{   push(@out,("DW\t").join(',',@_)."\n"); }
+{   push @out, (("DW\t").join(',',splice(@_, 0, 8))."\n") while @_; }
 sub ::data_word
-{   push(@out,("DD\t").join(',',@_)."\n"); }
+{   push @out, (("DD\t").join(',',splice(@_, 0, 4))."\n") while @_; }
 sub ::align
 {   push(@out,"ALIGN\t$_[0]\n"); }


Thanks,

Steve...

--
Steve Kneizys
Senior Business Process Engineer
Voice: (610) 256-1396  [For Emergency Service (888)864-3282]
Ferrilli Information Group -- Quality Service and Solutions for Higher Education
web: http://www.ferrilli.com/

Making you a success while exceeding your expectations.
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [hidden email]
Automated List Manager                           [hidden email]

Re: Building recent versions on Windows with VS 2010

Steven Kneizys
I have a small correction to my previous post: when pasting in my fix
for the asm code, I had a "DW" where a "DB" belonged.  Here is the
corrected diff:
--- openssl-1.0.2-stable-SNAP-20140226\crypto\perlasm\x86masm_ORIG.pl 2014-02-27 10:40:36.599122930 -0400
+++ openssl-1.0.2-stable-SNAP-20140226\crypto\perlasm\x86masm.pl 2014-02-27 10:40:36.600123057 -0400
@@ -160,13 +160,13 @@
 {   push(@out,"PUBLIC\t".&::LABEL($_[0],$nmdecor.$_[0])."\n");   }
 sub ::data_byte
-{   push(@out,("DB\t").join(',',@_)."\n"); }
+{   push @out, (("DB\t").join(',',splice(@_, 0, 16))."\n") while @_; }
 sub ::data_short
-{   push(@out,("DW\t").join(',',@_)."\n"); }
+{   push @out, (("DW\t").join(',',splice(@_, 0, 8))."\n") while @_; }
 sub ::data_word
-{   push(@out,("DD\t").join(',',@_)."\n"); }
+{   push @out, (("DD\t").join(',',splice(@_, 0, 4))."\n") while @_; }
 sub ::align
 {   push(@out,"ALIGN\t$_[0]\n"); }
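
For anyone who does not read Perl, the chunking that the patched subs perform can be sketched like this. It is a Python illustration under my own names (`emit_data` is hypothetical, not part of perlasm); only the directive names and the per-line limits of 16, 8 and 4 come from the diff.

```python
def emit_data(directive, values, per_line):
    """Split values into assembler data lines of at most per_line items."""
    lines = []
    for start in range(0, len(values), per_line):
        chunk = values[start:start + per_line]
        lines.append(directive + "\t" + ",".join(str(v) for v in chunk) + "\n")
    return lines

# Chunk sizes taken from the patch: 16 bytes, 8 shorts or 4 words per line.
def data_byte(*values):
    return emit_data("DB", values, 16)

def data_short(*values):
    return emit_data("DW", values, 8)

def data_word(*values):
    return emit_data("DD", values, 4)

# Ten words become three DD lines: 4 + 4 + 2 items.
print("".join(data_word(*range(10))), end="")
```

One side effect worth knowing: with an empty argument list the patched Perl subs emit nothing at all, whereas the originals would have emitted a bare directive line. The sketch behaves the same way.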


The open issue at https://github.com/openssl/openssl/issues/34 has
another part besides the "statement too complex/line too long" errors:
tmp32\sha256-586.asm(4422) : error A2070:invalid instruction operands
tmp32\sha256-586.asm(4424) : error A2070:invalid instruction operands
tmp32\sha256-586.asm(4425) : error A2070:invalid instruction operands
tmp32\sha256-586.asm(4426) : error A2070:invalid instruction operands
tmp32\sha256-586.asm(4559) : error A2070:invalid instruction operands
tmp32\sha256-586.asm(4712) : error A2070:invalid instruction operands
tmp32\sha256-586.asm(4865) : error A2070:invalid instruction operands
tmp32\sha256-586.asm(5018) : error A2070:invalid instruction operands
NMAKE : fatal error U1077: '"C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\BIN\ml.EXE"' : return code '0x1'
Stop.

I am not an expert on this asm code or sha or anything, but armed with
the "tools of the inter-web" I thought I might see what I could see!
I looked at the code, and I noticed a mix of vpaddd with QWORD, and I
wondered what would happen if I flipped the vpaddd's in those to
vpaddq instead.  I did it, and the thing compiled and passed the
"ms\test.bat" process!  (I have no idea if what I did is
mathematically correct!)  So I am now able to get the build for 32 bit
on Visual Studio 2010 to work by adding in this change in addition to
the one above:

--- openssl-1.0.2-stable-SNAP-20140226/crypto/sha/asm/sha256-586_ORIG.pl 2014-02-27 13:53:05.182539228 -0400
+++ openssl-1.0.2-stable-SNAP-20140226/crypto/sha/asm/sha256-586.pl 2014-02-27 13:53:05.182539228 -0400
@@ -851,11 +851,11 @@
  &mov (&DWP(96+4,"esp"),"edi");
  &vpshufb (@X[1],@X[1],$t3);
  &vpshufb (@X[2],@X[2],$t3);
- &vpaddd ($t0,@X[0],&QWP(0,$K256));
+ &vpaddq ($t0,@X[0],&QWP(0,$K256));
  &vpshufb (@X[3],@X[3],$t3);
- &vpaddd ($t1,@X[1],&QWP(16,$K256));
- &vpaddd ($t2,@X[2],&QWP(32,$K256));
- &vpaddd ($t3,@X[3],&QWP(48,$K256));
+ &vpaddq ($t1,@X[1],&QWP(16,$K256));
+ &vpaddq ($t2,@X[2],&QWP(32,$K256));
+ &vpaddq ($t3,@X[3],&QWP(48,$K256));
  &vmovdqa (&QWP(32+0,"esp"),$t0);
  &vmovdqa (&QWP(32+16,"esp"),$t1);
  &vmovdqa (&QWP(32+32,"esp"),$t2);
@@ -916,7 +916,7 @@
     eval($insn = shift(@insns));
     eval(shift(@insns)) if ($insn =~ /rorx/ && @insns[0] =~ /rorx/);
  }
- &vpaddd ($t2,@X[0],&QWP(16*$j,$K256));
+ &vpaddq ($t2,@X[0],&QWP(16*$j,$K256));
  foreach (@insns) { eval; } # remaining instructions
  &vmovdqa (&QWP(32+16*$j,"esp"),$t2);
 }
@@ -1048,11 +1048,11 @@
  &mov (&DWP(96+4,"esp"),"edi");
  &vpshufb (@X[1],@X[1],$t3);
  &vpshufb (@X[2],@X[2],$t3);
- &vpaddd ($t0,@X[0],&QWP(0,$K256));
+ &vpaddq ($t0,@X[0],&QWP(0,$K256));
  &vpshufb (@X[3],@X[3],$t3);
- &vpaddd ($t1,@X[1],&QWP(16,$K256));
- &vpaddd ($t2,@X[2],&QWP(32,$K256));
- &vpaddd ($t3,@X[3],&QWP(48,$K256));
+ &vpaddq ($t1,@X[1],&QWP(16,$K256));
+ &vpaddq ($t2,@X[2],&QWP(32,$K256));
+ &vpaddq ($t3,@X[3],&QWP(48,$K256));
  &vmovdqa (&QWP(32+0,"esp"),$t0);
  &vmovdqa (&QWP(32+16,"esp"),$t1);
  &vmovdqa (&QWP(32+32,"esp"),$t2);


Hopefully this information will be useful to somebody.

Thanks,

Steve...

In reply to the post by Steven Kneizys of Thu, Feb 27, 2014 at 10:23 AM.

Re: Building recent versions on Windows with VS 2010

Andy Polyakov-2
In reply to this post by Steven Kneizys
> I have been building the newer 1.0.1's and the beta/snapshots of
> 1.0.2.  For the 32 bit builds, I like to put in the "/safeseh" flag.

Quoting INSTALL.W32:

- Netwide Assembler, a.k.a. NASM, available from http://nasm.sourceforge.net/
   is required if you intend to utilize assembler modules. Note that NASM
   is now the only supported assembler.

With nasm on your %PATH% you should get safe SEH tables without any
extra flags. nasm also assembles sha256-586.asm. [Replacing vpaddd with
vpaddq will produce incorrect results.]
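
To illustrate why the swap matters: SHA-256 works on 32-bit words, vpaddd adds four independent 32-bit lanes of an XMM register, and vpaddq adds the same 128 bits as two 64-bit lanes, so a carry out of one 32-bit word leaks into its neighbour. The following is a small Python model of that lane arithmetic only, not of the real instruction stream; the input values are made up for the example.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def paddd(a, b):
    """Add four independent 32-bit lanes, wrapping each lane on overflow."""
    return [(x + y) & MASK32 for x, y in zip(a, b)]

def paddq(a, b):
    """Add the same four words as two 64-bit lanes (pairs 0-1 and 2-3)."""
    out = []
    for i in (0, 2):
        qa = a[i] | (a[i + 1] << 32)   # little-endian: word i is the low half
        qb = b[i] | (b[i + 1] << 32)
        total = (qa + qb) & MASK64
        out += [total & MASK32, total >> 32]
    return out

w = [0xFFFFFFFF, 0x00000001, 0x00000002, 0x00000003]  # made-up message words
k = [0x00000001, 0x00000010, 0x00000020, 0x00000030]  # made-up constants

print([hex(v) for v in paddd(w, k)])  # ['0x0', '0x11', '0x22', '0x33']
print([hex(v) for v in paddq(w, k)])  # ['0x0', '0x12', '0x22', '0x33']
```

The two results differ only in the word next to the one that overflowed, which is why the damage shows up only for inputs where a low word carries.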

Re: Building recent versions on Windows with VS 2010

Steven Kneizys
Hi Andy,

Thanks so much.  I did see the note about NASM, but I could have sworn
I saw something else someplace that contradicted it.  But yes, I see
the same note on the wiki too.  The "too complex" patch I put in
would/should still make the generated asm a little more
readable/printable for that one instance anyway ;-).  Even with NASM
doing the SafeSEH stuff, I think I do have to put the /safeseh part in
myself for VS to complete the build properly, don't I?  I remember that
when I did not put it in for ml.exe, it complained about the other stuff
not being safeseh, so I think that is needed for VS.

As for the incorrect results ... that is the "fun" part!  I compared
my new "I used vpaddq for the heck of it" 32 bit build to a 64 bit
build out there in net-land from a year ago, and I get the same hash,
see below.  That is why I was wondering, scratching my head :-)

Thanks again,

Steve...

e:\usr_local_src\openssl-1.0.2-stable-SNAP-20140226\out32dll>echo HELLO_WORLD_HASH_ME|.\openssl sha256
(stdin)= 62aea5920c4c3fc02c68b207b3b1ae5756a40732bdfcad7fe108291a8ee4deee

e:\usr_local_src\openssl-1.0.2-stable-SNAP-20140226\out32dll>.\openssl version -a
OpenSSL 1.0.2-beta2-dev xx XXX xxxx
built on: Thu Feb 27 17:26:57 2014
platform: VC-WIN32
options:  bn(64,32) rc4(8x,mmx) des(idx,cisc,2,long) idea(int) blowfish(idx)
compiler: cl  /MD /Ox /O2 /Ob2 -DOPENSSL_THREADS  -DDSO_WIN32 -W3 -Gs0
-GF -Gy -nologo -DOPENSSL_SYSNAME_WIN32 -DWIN32_LEAN_AND_MEAN
-DL_ENDIAN -D_CRT_SECURE_NO_DEPRECATE -DOPENSSL_BN_ASM_PART_WORDS
-DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m
-DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DRMD160_ASM -DAES_ASM
-DVPAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DOPENSSL_USE_APPLINK
-I. -DOPENSSL_NO_RC5 -DOPENSSL_NO_MD2 -DOPENSSL_NO_KRB5
-DOPENSSL_NO_JPAKE -DOPENSSL_NO_STATIC_ENGINE
OPENSSLDIR: "C:/ApacheSoftware/Apache22/conf"


C:\ApacheSoftware\Apache22 - Copy\bin>echo HELLO_WORLD_HASH_ME|.\openssl sha256
(stdin)= 62aea5920c4c3fc02c68b207b3b1ae5756a40732bdfcad7fe108291a8ee4deee

C:\ApacheSoftware\Apache22 - Copy\bin>.\openssl version -a
OpenSSL 1.0.1e 11 Feb 2013
built on: Tue Feb 19 20:26:29 2013
platform: VC-WIN64A
options:  bn(64,64) rc4(16x,int) des(idx,cisc,2,long) idea(int) blowfish(idx)
compiler: cl  /MD /Ox -DOPENSSL_THREADS  -DDSO_WIN32 -W3 -Gs0 -Gy
-nologo -DOPENSSL_SYSNAME_WIN32 -DWIN32_LEAN_AND_MEAN -DL_ENDIAN
-DUNICODE -D_UNICODE -D_CRT_SECURE_NO_DEPRECATE -DOPENSSL_IA32_SSE2
-DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m
-DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM
-DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DOPENSSL_USE_APPLINK
-I. -DOPENSSL_NO_RC5 -DOPENSSL_NO_MD2 -DOPENSSL_NO_KRB5
-DOPENSSL_NO_JPAKE -DOPENSSL_NO_STATIC_ENGINE
OPENSSLDIR: "c:/openssl-1.0.1e-X64/ssl"
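
A digest like the ones above can also be cross-checked against an implementation that shares no code with either OpenSSL build, for example Python's hashlib. This is a sketch with one assumption spelled out: cmd.exe's `echo` normally appends a CRLF, but exactly which bytes reached `openssl sha256` in the runs above has not been verified, so three candidate inputs are tried and no particular match is claimed.

```python
import hashlib

# Digest reported by both builds in the console output above.
reported = "62aea5920c4c3fc02c68b207b3b1ae5756a40732bdfcad7fe108291a8ee4deee"

text = b"HELLO_WORLD_HASH_ME"
candidates = [
    ("no newline", text),
    ("LF", text + b"\n"),
    ("CRLF", text + b"\r\n"),   # assumed to be what cmd.exe echo sends
]

for label, data in candidates:
    digest = hashlib.sha256(data).hexdigest()
    print(f"{label:10s} {digest} match={digest == reported}")
```

Comparing two builds only against each other shows that they agree on whichever code path ran on that machine. An independent reference at least pins down what the correct answer is.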

In reply to the post by Andy Polyakov of Thu, Feb 27, 2014 at 2:52 PM.

Re: Building recent versions on Windows with VS 2010

Andy Polyakov-2
Hi,

> ...  Even with NASM doing the SafeSEH stuff, I
> think I do have to put the /safeseh part in myself for VS to complete
> the build properly, don't I?

No.

> I remember when I did not put it in for
> ml.exe that it complained about the other stuff not being safeseh, so
> I think that is needed for VS.

That is correct. You must add /safeseh to ml. But you don't need it with
nasm. Basically it works like this. If all .obj modules you try to link
at any given occasion are safeseh-aware, then linker will generate
safeseh table even without /safeseh argument. If at least one .obj is
not safeseh-aware (like one generated by ml without /safeseh flag), then
linker will fail if you pass /safeseh argument and silently omit the
table otherwise.
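
The linker behaviour described above boils down to a small decision table. Here it is as a Python sketch; the function name and the result strings are mine and purely illustrative, and the rules are only those stated in the paragraph above.

```python
def link_result(objs_safeseh_aware, safeseh_flag):
    """Model the described linker behaviour.

    objs_safeseh_aware: one bool per .obj module being linked.
    safeseh_flag: True if /safeseh is passed to the linker.
    """
    if all(objs_safeseh_aware):
        # Every module is safeseh-aware: table is generated with or without the flag.
        return "safeseh table generated"
    if safeseh_flag:
        # At least one module is not aware and /safeseh was requested.
        return "link fails"
    return "safeseh table silently omitted"

print(link_result([True, True], safeseh_flag=False))   # safeseh table generated
print(link_result([True, False], safeseh_flag=True))   # link fails
print(link_result([True, False], safeseh_flag=False))  # safeseh table silently omitted
```

The last case is the one to watch for: the build succeeds, but the resulting binary has no table.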

> As for the incorrect results ... that is the "fun" part!  I compared
> my new "I used vpaddq for the heck of it" 32 bit build to a 64 bit
> build out there in net-land from a year ago, and I get the same hash,
> see below.  That is why I was wondering, scratching my head :-)

Is your processor AVX-capable? The code in question is AVX and it takes
an AVX-capable processor to observe the incorrect result. Otherwise
*another* code path is executed and produces the correct result.
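
A toy model of that situation, in Python: two implementations sit behind a CPU-feature switch, and a self-test only ever exercises the one the local processor selects. Every name here is hypothetical and the "broken" path is a deliberate stand-in, not a model of the real AVX code.

```python
def path_plain(words):
    """Stand-in for the non-AVX code path (correct)."""
    return [w & 0xFFFFFFFF for w in words]

def path_avx_patched(words):
    """Stand-in for a modified AVX path that returns wrong data."""
    return [(w + 1) & 0xFFFFFFFF for w in words]

def process(words, cpu_has_avx):
    # Runtime dispatch: the AVX path is only reachable on AVX-capable CPUs.
    impl = path_avx_patched if cpu_has_avx else path_plain
    return impl(words)

def self_test(cpu_has_avx):
    sample = [1, 2, 3, 4]
    expected = [1, 2, 3, 4]
    return process(sample, cpu_has_avx) == expected

print(self_test(cpu_has_avx=False))  # True: the broken path never ran
print(self_test(cpu_has_avx=True))   # False: the broken path is exercised
```

So a passing test run on a non-AVX machine says nothing about the AVX path.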

Re: Building recent versions on Windows with VS 2010

Steven Kneizys
Greetings,

Thanks for the details, that explains it all very nicely.  I grabbed
NASM, installed it, and everything built just fine.

Just for fun, I grabbed "objconv" and disassembled sha256-586.obj to
look at the sections of code that were an issue when using the
unsupported ml.exe.  They looked like this when NASM created the object
code:

L$011grand_avx LABEL NEAR
        vmovdqu xmm0, xmmword ptr [edi]                 ; 33A0 _ C5 FA: 6F. 07
        vmovdqu xmm1, xmmword ptr [edi+10H]             ; 33A4 _ C5 FA: 6F. 4F, 10
        vmovdqu xmm2, xmmword ptr [edi+20H]             ; 33A9 _ C5 FA: 6F. 57, 20
        vmovdqu xmm3, xmmword ptr [edi+30H]             ; 33AE _ C5 FA: 6F. 5F, 30
; Note: Immediate operand could be made smaller by sign extension
        add     edi, 64                                 ; 33B3 _ 81. C7, 00000040
        vpshufb xmm0, xmm0, xmm7                        ; 33B9 _ C4 E2 79: 00. C7
        mov     dword ptr [esp+64H], edi                ; 33BE _ 89. 7C 24, 64
        vpshufb xmm1, xmm1, xmm7                        ; 33C2 _ C4 E2 71: 00. CF
        vpshufb xmm2, xmm2, xmm7                        ; 33C7 _ C4 E2 69: 00. D7
        vpaddd  xmm4, xmm0, xmmword ptr [ebp]           ; 33CC _ C5 F9: FE. 65, 00
        vpshufb xmm3, xmm3, xmm7                        ; 33D1 _ C4 E2 61: 00. DF
        vpaddd  xmm5, xmm1, xmmword ptr [ebp+10H]       ; 33D6 _ C5 F1: FE. 6D, 10
        vpaddd  xmm6, xmm2, xmmword ptr [ebp+20H]       ; 33DB _ C5 E9: FE. 75, 20
        vpaddd  xmm7, xmm3, xmmword ptr [ebp+30H]       ; 33E0 _ C5 E1: FE. 7D, 30
        vmovdqa xmmword ptr [esp+20H], xmm4             ; 33E5 _ C5 F9: 7F. 64 24, 20
        vmovdqa xmmword ptr [esp+30H], xmm5             ; 33EB _ C5 F9: 7F. 6C 24, 30
        vmovdqa xmmword ptr [esp+40H], xmm6             ; 33F1 _ C5 F9: 7F. 74 24, 40
        vmovdqa xmmword ptr [esp+50H], xmm7             ; 33F7 _ C5 F9: 7F. 7C 24, 50
; Note: Immediate operand could be made smaller by sign extension
        jmp     L$012avx_00_47                          ; 33FD _ E9, 0000000E

I remembered seeing XMMWORD being taken into consideration if the
opcode args were of the xmm variety, so out of curiosity I added a
little code to my previous patch to modify the opcode's third arg as
well, if present:

--- openssl-1.0.2-stable-SNAP-20140226\crypto\perlasm\x86masm_ORIG.pl 2014-02-27 10:40:36.599122930 -0400
+++ openssl-1.0.2-stable-SNAP-20140226\crypto\perlasm\x86masm.pl 2014-02-27 23:10:15.051142244 -0400
@@ -22,6 +22,10 @@
     { # fix xmm references
 $arg[0] =~ s/\b[A-Z]+WORD\s+PTR/XMMWORD PTR/i if ($arg[1]=~/\bxmm[0-7]\b/i);
 $arg[1] =~ s/\b[A-Z]+WORD\s+PTR/XMMWORD PTR/i if ($arg[0]=~/\bxmm[0-7]\b/i);
+ if (defined($arg[2])) {
+ $arg[2] =~ s/\b[A-Z]+WORD\s+PTR/XMMWORD PTR/i if ($arg[1]=~/\bxmm[0-7]\b/i);
+ $arg[2] =~ s/\b[A-Z]+WORD\s+PTR/XMMWORD PTR/i if ($arg[0]=~/\bxmm[0-7]\b/i);
+ }
     }
     &::emit($opcode,@arg);
@@ -160,13 +164,13 @@
 {   push(@out,"PUBLIC\t".&::LABEL($_[0],$nmdecor.$_[0])."\n");   }
 sub ::data_byte
-{   push(@out,("DB\t").join(',',@_)."\n"); }
+{   push @out, (("DB\t").join(',',splice(@_, 0, 16))."\n") while @_; }
 sub ::data_short
-{   push(@out,("DW\t").join(',',@_)."\n"); }
+{   push @out, (("DW\t").join(',',splice(@_, 0, 8))."\n") while @_; }
 sub ::data_word
-{   push(@out,("DD\t").join(',',@_)."\n"); }
+{   push @out, (("DD\t").join(',',splice(@_, 0, 4))."\n") while @_; }
 sub ::align
 {   push(@out,"ALIGN\t$_[0]\n"); }
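
The operand rewrite in the first hunk can be tried in isolation. Below is a Python sketch using the same two regular expressions as the patch; `fix_xmm_refs` is a hypothetical name, and the loop is a slight generalization (each operand is rewritten if any other operand is an xmm register), not a line-for-line port of the Perl.

```python
import re

XMM_REG = re.compile(r"\bxmm[0-7]\b", re.IGNORECASE)
SIZE_PTR = re.compile(r"\b[A-Z]+WORD\s+PTR", re.IGNORECASE)

def fix_xmm_refs(args):
    """Rewrite '<size>WORD PTR' to 'XMMWORD PTR' in memory operands
    when some other operand of the instruction is an xmm register."""
    fixed = []
    for i, arg in enumerate(args):
        others = args[:i] + args[i + 1:]
        if any(XMM_REG.search(other) for other in others):
            arg = SIZE_PTR.sub("XMMWORD PTR", arg, count=1)
        fixed.append(arg)
    return fixed

# Three-operand AVX form, as in the vpaddd lines that ml rejected.
print(fix_xmm_refs(["xmm4", "xmm0", "QWORD PTR [ebp]"]))
# Non-xmm instructions are left alone.
print(fix_xmm_refs(["DWORD PTR [esp+100]", "edi"]))
```

The first call turns the third operand into `XMMWORD PTR [ebp]`, which is the form that shows up in the disassembly; the second call returns its input unchanged.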


And I generated it again the wrong way using ml.exe.  It passed the
tests as far as my older CPU could take it.  Then I disassembled that
section:

$L011grand_avx LABEL NEAR
        vmovdqu xmm0, xmmword ptr [edi]                 ; 3380 _ C5 FA: 6F. 07
        vmovdqu xmm1, xmmword ptr [edi+10H]             ; 3384 _ C5 FA: 6F. 4F, 10
        vmovdqu xmm2, xmmword ptr [edi+20H]             ; 3389 _ C5 FA: 6F. 57, 20
        vmovdqu xmm3, xmmword ptr [edi+30H]             ; 338E _ C5 FA: 6F. 5F, 30
        add     edi, 64                                 ; 3393 _ 83. C7, 40
        vpshufb xmm0, xmm0, xmm7                        ; 3396 _ C4 E2 79: 00. C7
        mov     dword ptr [esp+64H], edi                ; 339B _ 89. 7C 24, 64
        vpshufb xmm1, xmm1, xmm7                        ; 339F _ C4 E2 71: 00. CF
        vpshufb xmm2, xmm2, xmm7                        ; 33A4 _ C4 E2 69: 00. D7
        vpaddd  xmm4, xmm0, xmmword ptr [ebp]           ; 33A9 _ C5 F9: FE. 65, 00
        vpshufb xmm3, xmm3, xmm7                        ; 33AE _ C4 E2 61: 00. DF
        vpaddd  xmm5, xmm1, xmmword ptr [ebp+10H]       ; 33B3 _ C5 F1: FE. 6D, 10
        vpaddd  xmm6, xmm2, xmmword ptr [ebp+20H]       ; 33B8 _ C5 E9: FE. 75, 20
        vpaddd  xmm7, xmm3, xmmword ptr [ebp+30H]       ; 33BD _ C5 E1: FE. 7D, 30
        vmovdqa xmmword ptr [esp+20H], xmm4             ; 33C2 _ C5 F9: 7F. 64 24, 20
        vmovdqa xmmword ptr [esp+30H], xmm5             ; 33C8 _ C5 F9: 7F. 6C 24, 30
        vmovdqa xmmword ptr [esp+40H], xmm6             ; 33CE _ C5 F9: 7F. 74 24, 40
        vmovdqa xmmword ptr [esp+50H], xmm7             ; 33D4 _ C5 F9: 7F. 7C 24, 50
        jmp     $L012avx_00_47                          ; 33DA _ EB, 04


It looks like it is generating better code now for that opcode at
least.  I know the issue on
https://github.com/openssl/openssl/issues/34 is officially due to an
"unsupported" use, but just in case that old masm code is still in the
distribution for a reason I thought I'd report what I found out.
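
To make the comparison of the two listings concrete, here are three of the encodings side by side, with the bytes copied from the objconv comment columns above (immediates and jump displacements written out little-endian). It is a small Python sketch; the dictionary keys are just labels.

```python
# Instruction bytes as shown in the two disassemblies above.
nasm = {
    "add edi, 64":                          bytes.fromhex("81C740000000"),
    "vpaddd xmm4, xmm0, xmmword ptr [ebp]": bytes.fromhex("C5F9FE6500"),
    "jmp avx_00_47":                        bytes.fromhex("E90E000000"),
}
ml = {
    "add edi, 64":                          bytes.fromhex("83C740"),
    "vpaddd xmm4, xmm0, xmmword ptr [ebp]": bytes.fromhex("C5F9FE6500"),
    "jmp avx_00_47":                        bytes.fromhex("EB04"),
}

for insn in nasm:
    identical = nasm[insn] == ml[insn]
    print(f"{insn:38s} nasm={len(nasm[insn])}B ml={len(ml[insn])}B identical={identical}")
```

In this excerpt the vpaddd bytes are the same from both assemblers. The size differences come from the `add` immediate and the `jmp` displacement, where the ml output uses the short forms that objconv's "could be made smaller" notes point at in the NASM output.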

Thanks again,

Steve...

In reply to the post by Andy Polyakov of Thu, Feb 27, 2014 at 3:58 PM.

Re: Building recent versions on Windows with VS 2010

Andy Polyakov-2
> I remembered seeing XMMWORD being taken into consideration if the
> opcode args were of the xmm variety, so out of curiosity I added a
> little code to my previous patch to modify the opcode's third arg as
> well, if present:

It will be looked into as time permits, but not for the 1.0.2 release. If
anything, one should also look into AVX2 and AVX512 support.

Re: Building recent versions on Windows with VS 2010

Steven Kneizys
I'm working on a little fun thing to see if I can get Visual Studio to
compile and assemble (64 bit)  the 1.0.2 snapshots ... more on that
later on in the week.  I am trying to figure out what the MASM 12.x
assembler wants that is different from what NASM takes right now.  I
know things are going through the translator on OpenSSL, but the
current MASM-generated code is not working, and I know it is
officially unsupported.  But I am looking at it :-)

In reply to the post by Andy Polyakov of Fri, Feb 28, 2014 at 5:00 PM.