Forum: Too Lazy BBS

8086 32-bit multiply

From Paul Edwards@mutazilah@nospicedham.gmail.com to comp.lang.asm.x86 on Fri Apr 23 04:42:29 2021

From Newsgroup: comp.lang.asm.x86

Hi.

Since 1994 I have been working on a project to
create a public domain version of MSDOS, called
PDOS. There is an 8086 version and an 80386
version which can be found here:

http://pdos.sourceforge.net/

I took some shortcuts along the way to get it to
work at all, and one of those has finally bitten me.

I'm getting incorrect results from this:

https://sourceforge.net/p/pdos/gitcode/ci/master/tree/pdpclib/dossupa.asm

; multiply cx:bx by dx:ax, result in dx:ax

public __I4M
__I4M:
public __U4M
__U4M:
public f_lxmul@
f_lxmul@ proc
push bp
mov bp,sp
push cx

push ax
mul cx
mov cx, ax
pop ax
mul bx
add dx, cx

pop cx
pop bp
ret
f_lxmul@ endp

Does anyone have some public domain (explicit notice)
8086 (not 80386) code they are willing to share to do
this? Not LGPL. Not BSD. Public domain. The entire
codebase of tens of thousands of lines of code is
public domain.

Also let me know if you wish to be acknowledged in
the source code and/or code check-in. Some people
prefer to remain anonymous.

There are other routines in there that may not work
properly either, but I haven't come across them yet.

Thanks. Paul.

--- Synchronet 3.21d-Linux NewsLink 1.2

From DJ Delorie@dj@nospicedham.delorie.com to comp.lang.asm.x86 on Fri Apr 23 19:24:37 2021

From Newsgroup: comp.lang.asm.x86

Paul Edwards <mutazilah@nospicedham.gmail.com> writes:

> ; multiply cx:bx by dx:ax, result in dx:ax

Such would have three multiplies and a few adds:

LSW = bx * ax (lower 16, save upper 16 in XX)

MSW = bx * dx + cx * ax + XX (from lsw)
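
For anyone who wants to check the arithmetic away from a DOS box, the scheme above can be modeled in C. This is only a sketch: the names mul32_from_halves, a_hi, a_lo, b_hi and b_lo are made up for illustration, and uint16_t values stand in for the 16-bit registers (a is dx:ax, b is cx:bx).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the thread): multiply a_hi:a_lo by
   b_hi:b_lo using only 16x16 -> 32 multiplies, keeping the low 32 bits. */
uint32_t mul32_from_halves(uint16_t a_hi, uint16_t a_lo,
                           uint16_t b_hi, uint16_t b_lo)
{
    uint32_t low = (uint32_t)a_lo * b_lo;   /* lo * lo, full 32-bit product */
    uint16_t lsw = (uint16_t)low;           /* LSW: lower 16 bits */
    uint16_t xx  = (uint16_t)(low >> 16);   /* XX: upper 16 bits, added into MSW */

    /* MSW = lo*hi + hi*lo + XX, truncated to 16 bits. Anything that
       overflows here would land at bit 32 or above, so it is dropped. */
    uint16_t msw = (uint16_t)((uint32_t)a_lo * b_hi +
                              (uint32_t)a_hi * b_lo + xx);

    return ((uint32_t)msw << 16) | lsw;
}

/* Small self-check against the compiler's own 32-bit multiply. */
void check_mul32_from_halves(void)
{
    uint32_t a = 0x00012345u, b = 0x00006789u;
    assert(mul32_from_halves((uint16_t)(a >> 16), (uint16_t)a,
                             (uint16_t)(b >> 16), (uint16_t)b)
           == (uint32_t)(a * b));
}
```

Only the lo * lo product needs both of its halves; the two cross products contribute just their low 16 bits, which is why three multiplies are enough for a 32-bit result.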

--- Synchronet 3.21d-Linux NewsLink 1.2

From wolfgang kern@nowhere@nospicedham.never.at to comp.lang.asm.x86 on Sat Apr 24 02:46:36 2021

From Newsgroup: comp.lang.asm.x86

On 23.04.2021 13:42, Paul Edwards wrote:

[x8086 only]

> ; multiply cx:bx by dx:ax, result in dx:ax

the result of 32*32 bit doesn't fit into 32 bit.
either go with the given limits (16*16 bit) or
build a cascade with intermediate variables aka
MUL-ADD chains.
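
To make the size problem concrete, here is a hedged C sketch of one possible MUL-ADD cascade that keeps all 64 bits of the product. The function name mul32x32_64 and its output parameters are invented for this example; it is not code from PDOS, and it uses 32-bit temporaries where 8086 code would use register pairs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: full 64-bit product of two 32-bit values, built
   from four 16x16 -> 32 partial products. Result is returned as hi:lo. */
void mul32x32_64(uint32_t a, uint32_t b, uint32_t *hi, uint32_t *lo)
{
    uint16_t a0 = (uint16_t)a, a1 = (uint16_t)(a >> 16);
    uint16_t b0 = (uint16_t)b, b1 = (uint16_t)(b >> 16);

    uint32_t p00 = (uint32_t)a0 * b0;   /* weight 2^0  */
    uint32_t p01 = (uint32_t)a0 * b1;   /* weight 2^16 */
    uint32_t p10 = (uint32_t)a1 * b0;   /* weight 2^16 */
    uint32_t p11 = (uint32_t)a1 * b1;   /* weight 2^32 */

    /* Second 16-bit column: upper half of p00 plus the lower halves of
       both cross products. At most 3 * 0xFFFF, so it cannot overflow. */
    uint32_t mid = (p00 >> 16) + (p01 & 0xFFFFu) + (p10 & 0xFFFFu);

    *lo = (mid << 16) | (p00 & 0xFFFFu);
    /* Upper 32 bits: p11, the upper halves of the cross products, and
       the carry out of the second column. */
    *hi = p11 + (p01 >> 16) + (p10 >> 16) + (mid >> 16);
}

/* Small self-check against the compiler's 64-bit multiply. */
void check_mul32x32_64(void)
{
    uint32_t hi, lo;
    mul32x32_64(0x12345678u, 0x9ABCDEF0u, &hi, &lo);
    assert((((uint64_t)hi << 32) | lo)
           == (uint64_t)0x12345678u * 0x9ABCDEF0u);
}
```

With both high words non-zero, for example 0x10000 * 0x10000, the low half comes out as 0 and the high half as 1, which is the case a 32-bit result cannot represent.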
__
wolfgang

--- Synchronet 3.21d-Linux NewsLink 1.2

From Paul Edwards@mutazilah@nospicedham.gmail.com to comp.lang.asm.x86 on Fri Apr 23 20:16:04 2021

From Newsgroup: comp.lang.asm.x86

On Saturday, April 24, 2021 at 9:36:28 AM UTC+10, DJ Delorie wrote:

> Paul Edwards <muta...@nospicedham.gmail.com> writes:

>> ; multiply cx:bx by dx:ax, result in dx:ax

> Such would have three multiplies and a few adds:

> LSW = bx * ax (lower 16, save upper 16 in XX)

> MSW = bx * dx + cx * ax + XX (from lsw)

Thanks for the algorithm! I thought I might be able to do that,
but my brain started to melt down. Here's what I came up with,
which causes a hang, but at least it happened after I got the
results of some calculations. I'll see if I can figure out what
is happening.

; multiply cx:bx by dx:ax, result in dx:ax

public __I4M
__I4M:
public __U4M
__U4M:
public f_lxmul@
f_lxmul@ proc
push bp
mov bp,sp
push bx
push cx
push si
push di

push ax
push bx

; I think this multiplies bx * ax and puts the upper 16 bits in ax
; and lower 16 bits in bx
mul bx

; Save upper 16 in si and lower 16 in di
mov si, ax
mov di, bx

; This does the equivalent of bx * dx
pop bx
mov ax, dx
mul bx
mov dx, ax

; Now we do cx * ax with upper 16 bits in ax and lower in cx
pop ax
mul cx

; Now we need to add the results of those two multiplies together
; lower 16 bits first, so we can get the carry
push bp ; ran out of registers!
mov bp, bx
mov bx, ax
mov ax, 1
add dx, cx
jc noone
mov ax, 1
noone:

push ax

; Now the other lower 16 bits we saved
mov ax, 1
add dx, di
jc noone2
mov ax, 1
noone2:

push ax

; Upper 16 bits
mov ax, bx
add bx, ax
pop ax
add bx, ax ; one carry
pop ax
add bx, ax ; the other carry
mov ax, bp
add bx, ax

; store in proper output register
mov dx, bx

pop bp

pop di
pop si
pop cx
pop bx
pop bp
ret
f_lxmul@ endp

BFN. Paul.

--- Synchronet 3.21d-Linux NewsLink 1.2

From Paul Edwards@mutazilah@nospicedham.gmail.com to comp.lang.asm.x86 on Fri Apr 23 20:17:33 2021

From Newsgroup: comp.lang.asm.x86

On Saturday, April 24, 2021 at 10:51:35 AM UTC+10, wolfgang kern wrote:

> [x8086 only]

>> ; multiply cx:bx by dx:ax, result in dx:ax

> the result of 32*32 bit doesn't fit into 32 bit.

Good point. I didn't think of that. I can't multiply
17 bits by 17 bits, one of the registers needs to
be 0. But I assume I need to at least overflow in
a predictable manner.

> either go with the given limits (16*16 bit) or
> build a cascade with intermediate variables aka
> MUL-ADD chains.

See my most recent post. :-)

BFN. Paul.

--- Synchronet 3.21d-Linux NewsLink 1.2

From wolfgang kern@nowhere@nospicedham.never.at to comp.lang.asm.x86 on Sat Apr 24 10:36:46 2021

From Newsgroup: comp.lang.asm.x86

On 24.04.2021 05:17, Paul Edwards wrote:

>> [x8086 only]

>>> ; multiply cx:bx by dx:ax, result in dx:ax

>> the result of 32*32 bit doesn't fit into 32 bit.

> Good point. I didn't think of that. I can't multiply
> 17 bits by 17 bits, one of the registers needs to
> be 0. But I assume I need to at least overflow in
> a predictable manner.

>> either go with the given limits (16*16 bit) or
>> build a cascade with intermediate variables aka
>> MUL-ADD chains.

> See my most recent post. :-)

you create a stack frame but use not a single variable there.
and it may hang because your stack isn't balanced.
__
wolfgang

--- Synchronet 3.21d-Linux NewsLink 1.2

From Terje Mathisen@terje.mathisen@nospicedham.tmsw.no to comp.lang.asm.x86 on Sat Apr 24 12:17:08 2021

From Newsgroup: comp.lang.asm.x86

Paul Edwards wrote:

> Hi.

> Since 1994 I have been working on a project to
> create a public domain version of MSDOS, called
> PDOS. There is an 8086 version and an 80386
> version which can be found here:

> http://pdos.sourceforge.net/

> I took some shortcuts along the way to get it to
> work at all, and one of those has finally bitten me.

> I'm getting incorrect results from this:

> https://sourceforge.net/p/pdos/gitcode/ci/master/tree/pdpclib/dossupa.asm

> ; multiply cx:bx by dx:ax, result in dx:ax

> public __I4M
> __I4M:
> public __U4M
> __U4M:
> public f_lxmul@
> f_lxmul@ proc
> push bp
> mov bp,sp
> push cx

> push ax
> mul cx
> mov cx, ax
> pop ax
> mul bx
> add dx, cx

> pop cx
> pop bp
> ret
> f_lxmul@ endp

As several have noted, the code above is missing at least one MUL!

Please test it, then feel free to use (with or without attribution) this totally untested but reasonably efficient/short code:

mov si,ax
mov di,dx
mul cx ;; hi * lo
xchg ax,di ;; First mul saved, grab org dx
mul bx ;; lo * hi
add di,ax ;; top word of result

mov ax,si ;; retrieve original AX
mul bx ;; lo * lo
add dx,di

At this point DX:AX has the low 32 bits of the multiplication result.
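
One way to convince yourself that the sequence above is right is to step through it in a C model of the registers. This is a sketch, not Terje's code: the Regs struct, mul16 and lxmul_model are names invented here, and mul16 assumes the documented 8086 behaviour of MUL with a 16-bit operand (DX:AX = AX * operand).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the registers the routine touches. */
typedef struct { uint16_t ax, bx, cx, dx, si, di; } Regs;

/* Model of "mul r16": DX:AX = AX * src. */
void mul16(Regs *r, uint16_t src)
{
    uint32_t p = (uint32_t)r->ax * src;
    r->ax = (uint16_t)p;
    r->dx = (uint16_t)(p >> 16);
}

/* Runs the instruction sequence with a in dx:ax and b in cx:bx,
   and returns the final dx:ax. */
uint32_t lxmul_model(uint32_t a, uint32_t b)
{
    Regs r = {0};
    uint16_t t;

    r.ax = (uint16_t)a;  r.dx = (uint16_t)(a >> 16);
    r.bx = (uint16_t)b;  r.cx = (uint16_t)(b >> 16);

    r.si = r.ax;                        /* mov si,ax  */
    r.di = r.dx;                        /* mov di,dx  */
    mul16(&r, r.cx);                    /* mul cx     */
    t = r.ax; r.ax = r.di; r.di = t;    /* xchg ax,di */
    mul16(&r, r.bx);                    /* mul bx     */
    r.di = (uint16_t)(r.di + r.ax);     /* add di,ax  */

    r.ax = r.si;                        /* mov ax,si  */
    mul16(&r, r.bx);                    /* mul bx     */
    r.dx = (uint16_t)(r.dx + r.di);     /* add dx,di  */

    return ((uint32_t)r.dx << 16) | r.ax;
}

/* Small self-check against the compiler's own 32-bit multiply. */
void check_lxmul_model(void)
{
    uint32_t a = 0x00012345u, b = 0x00006789u;
    assert(lxmul_model(a, b) == (uint32_t)(a * b));
}
```

Note that the sequence overwrites SI and DI, so a caller that needs them preserved would have to save and restore them around it.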

Terje
--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

--- Synchronet 3.21d-Linux NewsLink 1.2

From anton@anton@nospicedham.mips.complang.tuwien.ac.at (Anton Ertl) to comp.lang.asm.x86 on Sat Apr 24 14:01:21 2021

From Newsgroup: comp.lang.asm.x86

Paul Edwards <mutazilah@nospicedham.gmail.com> writes:

> On Saturday, April 24, 2021 at 10:51:35 AM UTC+10, wolfgang kern wrote:

>> [x8086 only]

>>> ; multiply cx:bx by dx:ax, result in dx:ax

>> the result of 32*32 bit doesn't fit into 32 bit.

> Good point. I didn't think of that. I can't multiply
> 17 bits by 17 bits, one of the registers needs to
> be 0. But I assume I need to at least overflow in
> a predictable manner.

The usual way is to produce the lower 32 bits of the result, i.e.,
produce a*b mod 2^32. And thanks to the magic of 2s-complement
arithmetic, the result is the same for unsigned multiplication and for
signed multiplication (the results for the high 32 bits would differ,
but you are not interested in that).
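
The signed/unsigned equivalence is easy to check in C on a machine with 64-bit integers. The helper names below are made up for this sketch; it widens to 64 bits, which the 8086 routine cannot do, purely so that the low and high halves can be compared.

```c
#include <assert.h>
#include <stdint.h>

/* Low 32 bits of the unsigned product. */
uint32_t low32_unsigned(uint32_t a, uint32_t b)
{
    return (uint32_t)((uint64_t)a * b);
}

/* Low 32 bits of the signed product; the int64_t product cannot overflow. */
uint32_t low32_signed(int32_t a, int32_t b)
{
    return (uint32_t)(uint64_t)((int64_t)a * b);
}

/* High 32 bits of the unsigned product. */
uint32_t high32_unsigned(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * b) >> 32);
}

/* High 32 bits of the signed product, as a raw bit pattern. */
uint32_t high32_signed(int32_t a, int32_t b)
{
    return (uint32_t)((uint64_t)((int64_t)a * b) >> 32);
}

void check_low_halves_agree(void)
{
    /* -1 and 0xFFFFFFFF share the same 32-bit pattern. */
    assert(low32_signed(-1, 2) == low32_unsigned(0xFFFFFFFFu, 2u));
    /* The high halves differ: 0xFFFFFFFF signed, 0x00000001 unsigned. */
    assert(high32_signed(-1, 2) != high32_unsigned(0xFFFFFFFFu, 2u));
}
```

This is why a single routine can serve as both the signed and the unsigned 32-bit multiply, as long as only the low 32 bits are returned.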

- anton
--
M. Anton Ertl  Some things have to be seen to be believed
anton@mips.complang.tuwien.ac.at Most things have to be believed to be seen
http://www.complang.tuwien.ac.at/anton/home.html

--- Synchronet 3.21d-Linux NewsLink 1.2

From Paul Edwards@mutazilah@nospicedham.gmail.com to comp.lang.asm.x86 on Sat Apr 24 14:00:07 2021

From Newsgroup: comp.lang.asm.x86

On Saturday, April 24, 2021 at 8:22:39 PM UTC+10, Terje Mathisen wrote:

> Paul Edwards wrote:

>> Hi.

>> Since 1994 I have been working on a project to
>> create a public domain version of MSDOS, called
>> PDOS. There is an 8086 version and an 80386
>> version which can be found here:

>> http://pdos.sourceforge.net/

>> I took some shortcuts along the way to get it to
>> work at all, and one of those has finally bitten me.

>> I'm getting incorrect results from this:

>> https://sourceforge.net/p/pdos/gitcode/ci/master/tree/pdpclib/dossupa.asm

>> ; multiply cx:bx by dx:ax, result in dx:ax

>> public __I4M
>> __I4M:
>> public __U4M
>> __U4M:
>> public f_lxmul@
>> f_lxmul@ proc
>> push bp
>> mov bp,sp
>> push cx

>> push ax
>> mul cx
>> mov cx, ax
>> pop ax
>> mul bx
>> add dx, cx

>> pop cx
>> pop bp
>> ret
>> f_lxmul@ endp

> As several have noted, the code above is missing at least one MUL!

> Please test it, then feel free to use (with or without attribution) this totally untested but reasonably efficient/short code:

> mov si,ax
> mov di,dx
> mul cx ;; hi * lo
> xchg ax,di ;; First mul saved, grab org dx
> mul bx ;; lo * hi
> add di,ax ;; top word of result

> mov ax,si ;; retrieve original AX
> mul bx ;; lo * lo
> add dx,di

> At this point DX:AX has the low 32 bits of the multiplication result.

Thanks so much!!!

I have tested it and it works fine. I have committed the
change, with attribution:

https://sourceforge.net/p/pdos/gitcode/ci/master/tree/pdpclib/dossupa.asm

BFN. Paul.

--- Synchronet 3.21d-Linux NewsLink 1.2
