So, what is BSD2.9 sbrk() or signal() doing or not doing such that the
signal dispatcher in the kernel explodes trying to JSR PC,signalhandler?
Is there something special on BSD2.9 with sbrk() or signal()?
If I claim memory and stick the stack there, and set a signal,
when the signal is triggered everything blows up with the
signal dispatcher in the kernel unable to access memory to
stack to call the signal handler.
This doesn't happen if I leave the stack at the value on
entry at &FF00+xxx, and doesn't happen with Bell Unix v5/6/7.
I can't just leave the stack at &FFxx as it will descend
across I/O memory at &E000 upwards.
I've stripped my test code down to this:
ORG 0
EQUW &0107 ; magic number, also branch to code
EQUW _DATA%-_TEXT% ; size of text
EQUW _BSS%-_DATA% ; size of initialised data
EQUW _END%-_BSS% ; size of uninitialised data
EQUW &0000 ; size of symbol data
EQUW _ENTRY%-_TEXT% ; entry point
EQUW &0000 ; not used
EQUW &0001 ; no relocation info
;
ORG 0
_TEXT%:
_ENTRY%:
mov #1,r0
trap 4 ; write(stdout, "RUN"..)
equw msg1
equw msg2-msg1
;
trap 17 ; sbrk(&E000)
equw &E000
bcs quit ; failed
mov #1,r0
trap 4 ; write(stdout, "SBRK"..)
equw msg2
equw msg3-msg2
;
trap 48 ; signal(SIGQUIT, quit)
equw 3
equw quit
mov #1,r0
trap 4 ; write(stdout, "SET"..)
equw msg3
equw msg4-msg3
;
mov #&E000,sp ; put stack at top of memory
On 2022-07-18 22:46, Jonathan Harston wrote:
Is there something special on BSD2.9 with sbrk() or signal()?
If I claim memory and stick the stack there, and set a signal,
when the signal is triggered everything blows up with the
signal dispatcher in the kernel unable to access memory to
stack to call the signal handler.
Nothing I'm aware of.
This doesn't happen if I leave the stack at the value on
entry at &FF00+xxx, and doesn't happen with Bell Unix v5/6/7.
I can't just leave the stack at &FFxx as it will descend
across I/O memory at &E000 upwards.
What are you talking about? User programs don't have I/O memory in the
top page.
I've stripped my test code down to this:
ORG 0
EQUW &0107 ; magic number, also branch to code
EQUW _DATA%-_TEXT% ; size of text
EQUW _BSS%-_DATA% ; size of initialised data
EQUW _END%-_BSS% ; size of uninitialised data
EQUW &0000 ; size of symbol data
EQUW _ENTRY%-_TEXT% ; entry point
EQUW &0000 ; not used
EQUW &0001 ; no relocation info
;
ORG 0
_TEXT%:
_ENTRY%:
mov #1,r0
trap 4 ; write(stdout, "RUN"..)
equw msg1
equw msg2-msg1
;
trap 17 ; sbrk(&E000)
equw &E000
bcs quit ; failed
mov #1,r0
trap 4 ; write(stdout, "SBRK"..)
equw msg2
equw msg3-msg2
;
trap 48 ; signal(SIGQUIT, quit)
equw 3
equw quit
mov #1,r0
trap 4 ; write(stdout, "SET"..)
equw msg3
equw msg4-msg3
;
mov #&E000,sp ; put stack at top of memory
What assembler is this? I find the notation a bit weird. Is #&E000
really the constant E000? Or is it whatever value is located at E000?
With DEC assemblers, I guess it would be
MOV #160000,SP
and with the normal Unix assembler, it would be
MOV $160000,SP
Looking a few lines higher up, it certainly looked like you used just #n
for a literal constant, and not #&, so this all looks very fishy to me.
Can you run the code with a debugger and verify that you are getting a
good value? Also, what kind of CPU are you running on? I'm sort of
wondering, in case this is something like an 11/70, if the stack limit
register is used by the OS (I haven't checked). That could play tricks
with you if so. But I wouldn't think the system plays with the stack
limit register. My primary suspect is that you are not actually setting
the SP to what you think you are.
Johnny
By the way - another thing. This can't have been compiled and run on a 2.9BSD system. Where did you get those syscall numbers from?
I am quite certain the numbers have not changed between 2.9BSD and
2.11BSD, and I can tell:
4 is write
17 is *not* sbrk, but chflags. sbrk is 69.
48 is *not* signal, but getegid. signal is just a wrapper around sigvec, which is 108.
However, even the call to write is wrong. In 2.11BSD (and I believe
2.9BSD), you basically should have all arguments on the stack.
On Tuesday, 19 July 2022 at 15:03:42 UTC+1, Johnny Billquist wrote:
By the way - another thing. This can't have been compiled and run on a
2.9BSD system. Where did you get those syscall numbers from?
https://www.tuhs.org/cgi-bin/utree.pl?file=2.9BSD/usr/src/sys/sys/sysent.c and elsewhere.
I am quite certain the numbers have not changed between 2.9BSD and
2.11BSD, and I can tell:
4 is write
17 is *not* sbrk, but chflags. sbrk is 69.
48 is *not* signal, but getegid. signal is just a wrapper around sigvec,
which is 108.
My BSD2.11 agrees with you, my BSD2.9 does not.
However, even the call to write is wrong. In 2.11BSD (and I believe
2.9BSD), you basically should have all arguments on the stack.
That's what I thought, and I was initially pulling my hair out with everything falling over with the parameters on the stack, but with
BSD2.9 they are definitely inline. Here is the assembler for signal()
from the above C snippet:
\ On entry: sp=>ret, signum, func
_signal:
MOV R5,-(SP) :\ Save R5, sp=>R5, ret, signum, func
MOV SP,R5 :\ R5=>stack frame
MOV &0004(R5),R1 :\ R1=signum
CMP R1,#&0014 :\ CMP MAXSIG
BCC L00B8 :\ Too big, bad signum
MOV &0006(R5),R0 :\ R0=func
MOV R1,&0170 :\ Store signum in TRAP
ASL R1 :\ signum*2, index into dispatch table
MOV &0178(R1),-(SP) :\ Stack old entry (default=0)
MOV R0,&0178(R1) :\ Store func in table
MOV R0,&0172 :\ Also store func in TRAP in case not C function
BEQ L00A4 :\ If zero, jump to pass to TRAP to turn off
BIT #&0001,R0 :\ Is func.b0=1?
BNE L00A4 :\ Also jump to pass to TRAP to disable
ASL R1 :\ R1 is now signum*4
ADD #&00C0,R1 :\ Index into jump block
MOV R1,&0172 :\ Store this as func in TRAP
L00A4:
TRAP &00 :\ TRAP indir
EQUW &016E :\ signal, signum, func
BCS L00BC :\ Error, jump to return it
BIT #&0001,R0 :\ Was old func bit 0 set?
BEQ L00B2 :\ No, skip past to return it
MOV R0,(SP) :\ Yes, overwrite stacked old func
L00B2:
MOV (SP)+,R0 :\ R0=old func from table
MOV (SP)+,R5 :\ Restore R5
RTS PC :\ Return
(snip)
sigtrap:
TRAP &30 :\ 016E 30 89 0.
HALT :\ 0170 00 00 ..
HALT :\ 0172 00 00 ..
At L00A4 it's definitely doing a Bell-style indirect TRAP with inline parameters.
jgh
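The TRAP words in the listing above can be checked by hand or mechanically. What follows is a minimal Python sketch, assuming the standard PDP-11 encoding in which TRAP n is the word 104400 octal (&8900) plus an 8-bit n. The function name decode_trap is made up for illustration and is not part of any toolchain mentioned in this thread.

```python
TRAP_BASE = 0o104400  # PDP-11 TRAP opcode with operand 0, i.e. &8900

def decode_trap(word):
    """Return the trap number if word is a TRAP instruction, else None."""
    if word & 0xFF00 == TRAP_BASE:
        return word & 0xFF
    return None

# &8930 is the word at sigtrap (bytes 30 89, low byte first);
# &8911 is the word stored into TRAP_BUF by the memory-claiming code.
for word in (0x8930, 0x8911, 0x8900):
    print(f"&{word:04X} -> trap {decode_trap(word)}")
```

Under that assumption, &8930 decodes to trap 48 and &8911 to trap 17, which matches the numbers used in the test code at the top of the thread.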
But anyway. Let's assume that is correct then. What about the argument
to sbrk that I pointed at? It certainly looks broken to me. (And I'm
still confused by what assembler you are using.)
When I do:
clr r1 ; r1=top of memory, start at &10000
mov #&8911,TRAP_BUF ; SYS sbrk
InitMemLp:
sub #256,r1 ; Step down 256 bytes
mov r1,TRAP_BUF+2 ; Store as TRAP argument
TRAP 0 ; SYS sbrk,addr
EQUW TRAP_BUF
bcs InitMemLp ; Memory not claimable, try a bit less
rts pc
TRAP_BUF:
EQUW 0
EQUW 0
EQUW 0
EQUW 0
I end up with E000 as the highest top of memory I can claim.
The addresses at E000 upwards are inaccessible as they are
"elsewhere". My documentation tells me that Exxx is used
for paged memory access, and Fxxx is I/O access. "E000+ is
I/O" is just lazy shorthand for "not addresses I (ie the
code) can access".
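The probing loop above can be modelled away from the target to see how many attempts it makes. This is only a sketch in Python: fake_brk is a hypothetical stand-in for the kernel that accepts any break at or below a limit, and the limit of &E000 is simply the result reported above, not something derived from the 2.9BSD sources.

```python
def probe_top_of_memory(try_brk):
    """Step a 16-bit address down by 256 until try_brk accepts it."""
    addr = 0
    attempts = 0
    while True:
        addr = (addr - 256) & 0xFFFF   # sub #256,r1 on a 16-bit register
        attempts += 1
        if try_brk(addr):              # carry clear in the original loop
            return addr, attempts

def fake_brk(addr, limit=0xE000):
    # Hypothetical stand-in: succeed only at or below the assumed limit.
    return addr <= limit

top, attempts = probe_top_of_memory(fake_brk)
print(f"&{top:04X} after {attempts} attempts")
```

With the assumed limit, the first address tried is &FF00 and the loop stops at &E000 on the 32nd attempt.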
I did a load of digging and single-stepping through the BSD2.9 code
and think I've worked out what's going on - or at least enough to
get a solution that's workable enough for me.
In Bell PDP11 Unix, the signal dispatcher pages in the entire
process before calling its signal handler. BSD 2.9 appears to
do a form of lazy task switching in that it pages in the text
and stack segments and waits for actual memory access to
trigger any of the data to be paged in.
This works fine if the stack is in the stack segment, but if
the stack is in the data segment, it falls over because pushing
to the stack doesn't trigger the paging in.
This may not be completely accurate, but close enough to
work out what's happening.
Further digging finds there's a 'nostk' function call that
disconnects the SP register from the stack segment and allows
you to put it anywhere. See: https://minnie.tuhs.org/cgi-bin/utree.pl?file=2.9BSD/usr/man/cat2/nostk.2
So, I add a call to nostk to my code, with careful shuffling
around to avoid the stack disappearing under my feet:
; Assume the only thing on the stack is a return address
; Any stack frame above SP has already been processed
;
clr r1 ; r1=top of memory, start at &0000-256=&FF00
mov #&8911,TRAP_BUF ; SYS brk
.IO_InitMemLp
sub #256,r1 ; Step down 256 bytes
mov r1,TRAP_BUF+2
trap 0 ; SYS brk,addr
equw TRAP_BUF
bcs IO_InitMemLp ; Memory not claimable, try a bit less
; ; r1=top of claimed memory
; ; and Carry is clear
mov (sp),r0 ; Get return address
mov r1,sp ; Put stack at top of claimed memory
mov r0,-(sp) ; and push the return address on it
trap 58 ; sys local,nostk
equw TRAP_NOSTK
bcs nostk_not_needed ; nostk doesn't exist, don't need it anyway
; ; nb: SIGSYS caught to set Cy and return
...
rts pc
...
TRAP_NOSTK:
trap 4 ; local nostk
TRAP_BUF:
trap 0
equw 0
equw 0
equw 0
2.11BSD has more syscalls than 2.9BSD. For the common ones, the syscall numbers are similar, but not always identical.
See the sysent tables
https://www.retro11.de/ouxr/29bsd/usr/src/sys/sys/sysent.c.html#s:_sysent
https://www.retro11.de/ouxr/211bsd/usr/src/sys/sys/init_sysent.c.html#s:_sysent
'setuid' is 23 in 2.9BSD and 45 in 2.11BSD.
Beyond that, a remark on octal and hex. Earlier on in this thread I saw a lot of HEX numbers.
All PDP-11 hardware documentation uses octal.
The MACRO-11 assembler has binary, octal and decimal radix, but no hex.
HEX is an alien in the PDP-11 hardware world, and shouldn't be used.
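For anyone comparing the hex values in this thread against octal PDP-11 documentation, a small conversion helper may save some arithmetic. This is a sketch in Python. The name bbc_hex_to_octal is invented here, and it assumes the "&" prefix always means hexadecimal, as it appears to in the listings above.

```python
def bbc_hex_to_octal(text):
    """Convert '&E000' style hex to a 6-digit octal string."""
    value = int(text.lstrip("&"), 16)
    return f"{value:06o}"

# Values taken from the listings earlier in the thread.
for h in ("&0107", "&E000", "&FF00", "&8911"):
    print(h, "=", bbc_hex_to_octal(h))
```

This gives 000407 for the a.out magic number, 160000 for &E000 (the value Johnny wrote in his MOV examples), 177400 for &FF00 and 104421 for the trap word &8911.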