I was just designing a time-critical subroutine. As always, my first rule is to avoid floating point and use integer variables.
In PowerBASIC we can additionally use register variables.
The compiler uses the ESI register for the first register variable and the EDI register for the second.
Take these lines for example:
Local A,C as DWORD
REGISTER B AS DWORD
A=B*C
Looking at the result, I realized that PowerBASIC turns my line into this:
408708 DB451C FILD LONG PTR [EBP+1C]
40870B 8975A4 MOV DWORD PTR [EBP-5C], ESI
40870E C745A800000000 MOV DWORD PTR [EBP-58], DWORD 00000000
408715 DF6DA4 FILD QUAD PTR [EBP-5C]
408718 DEC9 FMULP ST(1), ST
40871A E8AB110000 CALL L4098CA
40871F 89856CFFFFFF MOV DWORD PTR [EBP+FFFFFF6C], EAX
That is understandable in a way, as the compiler does not know what size the result will have.
From my standpoint, an integer multiplication is what I need here.
' Multiplies: P1 = P2 * P3
' Uses flags, uses EAX and EDX. Result is in EDX:EAX (only EAX, the low 32 bits, is stored in P1)
' P1, P2, P3 are variables or register names
MACRO A_MUL(P1,P2,P3)
! MOV EAX,P2
! MUL P3
! MOV P1,EAX
END MACRO
In the code, it simply replaces the line:
A=B*C
' becomes
A_MUL(A,B,C)
As a result, DisASM (http://home.midsouth.rr.com/theirware/DisAsm.htm) shows what I expected:
408708 8BC6 MOV EAX, ESI
40870A F7651C MUL DWORD PTR [EBP+1C]
40870D 89856CFFFFFF MOV DWORD PTR [EBP+FFFFFF6C], EAX
Fewer cycles and no floating point used.
CPU flags:
Please note that the MUL instruction uses EDX:EAX as its result register pair (high half in EDX, low half in EAX).
If the result fits in 32 bits, EDX is set to 0 and the CF and OF flags are cleared; otherwise both flags are set to 1.
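To make the flag behaviour concrete, here is a small sketch in C (not PowerBASIC) that models what the 32-bit MUL instruction leaves in EDX:EAX and in CF/OF. The names `mul_result` and `mul32` are made up for this illustration; this is only a model of the instruction, not something the compiler emits.

```c
#include <stdint.h>

/* Illustrative model of the 32-bit x86 MUL instruction.
   The type and function names are hypothetical. */
typedef struct {
    uint32_t eax;    /* low 32 bits of the product  */
    uint32_t edx;    /* high 32 bits of the product */
    int      cf_of;  /* CF and OF: 1 if the high half is non-zero, else 0 */
} mul_result;

static mul_result mul32(uint32_t a, uint32_t b)
{
    /* Widen first so the full 64-bit product is kept. */
    uint64_t wide = (uint64_t)a * (uint64_t)b;
    mul_result r;
    r.eax   = (uint32_t)wide;          /* low half  -> EAX */
    r.edx   = (uint32_t)(wide >> 32);  /* high half -> EDX */
    r.cf_of = (r.edx != 0);            /* flags report a result wider than 32 bits */
    return r;
}
```

For example, `mul32(0x10000, 0x10000)` yields EAX = 0, EDX = 1 and the flags set, which is the case where storing only EAX (as the A_MUL macro does) silently loses the upper half.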
Interesting is this:
If I declare my variables AS LONG instead of AS DWORD, the compiler generates this:
40869B 8B451C MOV EAX, DWORD PTR [EBP+1C]
40869E F7EE IMUL ESI
4086A0 89856CFFFFFF MOV DWORD PTR [EBP+FFFFFF6C], EAX
Which is just what I needed in this case.
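A side note on why the signed form can stand in for the unsigned one when only a 32-bit result is stored: the low 32 bits of a signed and an unsigned 32x32 multiplication are identical, only the high half (and with it the overflow flags) differs. The C sketch below (again C, not PowerBASIC, with made-up function names) illustrates this; it assumes the usual two's-complement conversion from `uint32_t` to `int32_t`.

```c
#include <stdint.h>

/* Low and high halves of an unsigned multiply (what MUL produces). */
static uint32_t low_unsigned(uint32_t a, uint32_t b)
{
    return (uint32_t)((uint64_t)a * b);
}

static uint32_t high_unsigned(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * b) >> 32);
}

/* Same bit patterns, but interpreted as signed (what IMUL produces).
   Assumption: converting to int32_t reinterprets the bits as two's complement. */
static uint32_t low_signed(uint32_t a, uint32_t b)
{
    int64_t wide = (int64_t)(int32_t)a * (int64_t)(int32_t)b;
    return (uint32_t)wide;
}

static uint32_t high_signed(uint32_t a, uint32_t b)
{
    int64_t wide = (int64_t)(int32_t)a * (int64_t)(int32_t)b;
    return (uint32_t)((uint64_t)wide >> 32);
}
```

With the operands 0xFFFFFFFF and 2, both low halves are 0xFFFFFFFE, while the high half is 1 for the unsigned product and 0xFFFFFFFF for the signed one.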
Again, the PowerBasic compilers are presently optimized for LONG (signed)
arithmetic.
Steve Hutchesson posted this gem some time ago and I use it where appropriate:
asm multiply tricks (http://www.powerbasic.com/support/forums/Forum4/HTML/003673.html)
Really a GEM!
! mov eax, var
' ! lea eax, [eax+eax] ; x 2
' ! lea eax, [eax*2+eax] ; x 3
' ! lea eax, [eax*4] ; x 4
' ! lea eax, [eax*4+eax] ; x 5
' ! lea ecx, [eax*2]
' ! lea eax, [eax*4+ecx] ; x 6
' ! lea ecx, [eax*2+eax]
' ! lea eax, [eax*4+ecx] ; x 7
' ! lea eax, [eax*8] ; x 8
' ! lea eax, [eax*8+eax] ; x 9
' ! lea ecx, [eax*2]
' ! lea eax, [eax*8+ecx] ; x 10
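For anyone who wants to check the decompositions, the listing can be modelled in C (not PowerBASIC). The helper `lea` below is a hypothetical stand-in for the address calculation base + index*scale that the instruction performs, and the `timesN` names are made up; the sketch only confirms the arithmetic, including wrap-around at 32 bits, and says nothing about timing.

```c
#include <stdint.h>

/* Hypothetical model of "lea dst, [base + index*scale]".
   On x86 the scale can only be 1, 2, 4 or 8. */
static uint32_t lea(uint32_t base, uint32_t index, uint32_t scale)
{
    return base + index * scale;  /* wraps modulo 2^32, like the register */
}

static uint32_t times3(uint32_t x) { return lea(x, x, 2); }  /* [eax*2+eax] */
static uint32_t times5(uint32_t x) { return lea(x, x, 4); }  /* [eax*4+eax] */
static uint32_t times9(uint32_t x) { return lea(x, x, 8); }  /* [eax*8+eax] */

static uint32_t times6(uint32_t x)
{
    uint32_t ecx = lea(0, x, 2);  /* lea ecx, [eax*2]     */
    return lea(ecx, x, 4);        /* lea eax, [eax*4+ecx] */
}

static uint32_t times7(uint32_t x)
{
    uint32_t ecx = lea(x, x, 2);  /* lea ecx, [eax*2+eax] */
    return lea(ecx, x, 4);        /* lea eax, [eax*4+ecx] */
}

static uint32_t times10(uint32_t x)
{
    uint32_t ecx = lea(0, x, 2);  /* lea ecx, [eax*2]     */
    return lea(ecx, x, 8);        /* lea eax, [eax*8+ecx] */
}
```

Note that the x 6, x 7 and x 10 variants need ECX as a scratch register, so ECX is overwritten when they are used.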
There is more like this in Agner Fog's optimization manuals.
Get them here:
http://www.agner.org/optimize/