https://gitlab.synchro.net/main/sbbs/-/issues/1169#note_9440
## Root-caused and fixed
This is **not** the native "Initializing User Objects" step, and it's **not** related to #1153. It's a regression in the TLS body-read path, introduced by `50258e70b` (*notion-31-talks*, "detect TLS client disconnect", #1155).
### Mechanism
A webv4 login/logout is an HTTPS `POST /api/auth.ssjs` carrying a body (the credentials). That body frequently arrives **in the same TLS record as the headers**, so once the headers are parsed it sits decrypted-but-unread inside the TLS layer (`tls_pending`), with **nothing left on the raw socket**.
`read_post_data()` → `recvbufsocket()` gates each read on `session_check()`. Before `50258e70b`, `session_check()` short-circuited to "readable" on `tls_pending`. That commit changed the short-circuit to `peeked_valid` (a single peeked byte) to detect a bare-FIN client disconnect. The body-read path has no peeked byte, so `session_check()` now falls through to `socket_check()` on the **raw socket**. The raw socket never becomes readable, because the data is already in the TLS buffer, so the call blocks for the full `MaxInactivity` before the buffered body is finally read.
Header reads were unaffected: `sockreadline()` kept its own `tls_pending` guard, so only the **body** read (i.e. only POST = login/logout) regressed.
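To make the before/after difference concrete, here is a minimal model of the two short-circuit conditions. It is a sketch only: `struct model_session`, `model_socket_check()`, and the two `wait_*` functions are hypothetical stand-ins, not the actual Synchronet code, and the model tracks nothing except how long the caller would wait before attempting the read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified connection state (not the real Synchronet structs). */
struct model_session {
    size_t   tls_pending;     /* decrypted bytes already buffered in the TLS layer */
    bool     peeked_valid;    /* a single peeked byte is being held */
    bool     raw_readable;    /* would the raw socket report readable? */
    unsigned max_inactivity;  /* configured timeout, in seconds */
};

/* Stand-in for waiting on the raw socket: returns the seconds spent waiting. */
unsigned model_socket_check(const struct model_session *s)
{
    assert(s != NULL);
    return s->raw_readable ? 0 : s->max_inactivity;
}

/* Model of the check before 50258e70b: short-circuit on tls_pending. */
unsigned wait_before(const struct model_session *s)
{
    if (s->tls_pending > 0)
        return 0;                     /* buffered TLS data counts as readable */
    return model_socket_check(s);
}

/* Model of the check after 50258e70b: short-circuit on peeked_valid only. */
unsigned wait_after(const struct model_session *s)
{
    if (s->peeked_valid)
        return 0;
    return model_socket_check(s);     /* buffered TLS data is not considered */
}
```

With a body buffered in TLS, no peeked byte, and an idle raw socket, `wait_before()` returns 0 while `wait_after()` returns the full `max_inactivity`, which is the stall described above.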
This matches every reported symptom:
- **Login/logout only** — those are the POST-with-body requests; GETs never call `read_post_data()`.
- **Stall == `MaxInactivity`** — 60s where it's `1m`, 90s where it's `1m30s`. (The "exactly 90s every time" is just the configured timeout.)
- **Idle system, zero wire traffic in Wireshark** — it's a `select()` on the raw socket waiting on data that already arrived and is buffered in TLS.
- **"JS runs in 0.02s"** — correct; the block is in the native body read, before the script.
- The pre-probe builds logged the gap right after `Initializing User Objects` simply because that was the nearest preceding log line; the probes (`f6d382c13`) place it precisely between `Authorization check complete` and `Responding to request`, i.e. inside `read_post_data()`.
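The "zero wire traffic" symptom can be reproduced in isolation with plain POSIX calls. The sketch below is an illustration under stated assumptions, not Synchronet code: a `socketpair()` stands in for the client connection, a single `read()` into an application buffer stands in for the TLS layer consuming a record that holds both headers and body, and `select_after_drain()` is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Sends headers and body in one write, drains them into buf with one read,
 * then waits on the now-empty socket. Returns select()'s result (0 means the
 * wait timed out) or -1 on setup failure. *buffered receives the byte count
 * sitting unread in buf.
 */
int select_after_drain(char *buf, size_t buflen, size_t *buffered, long timeout_ms)
{
    static const char request[] = "POST /api/auth.ssjs HTTP/1.1\r\n\r\nbody";
    const size_t reqlen = sizeof request - 1;
    int sv[2];
    int rc = -1;

    assert(buf != NULL && buffered != NULL && buflen >= reqlen);
    *buffered = 0;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    /* "Client" side: headers and body leave together. */
    if (write(sv[0], request, reqlen) == (ssize_t)reqlen) {
        /* Buffering layer: everything is pulled off the socket at once. */
        ssize_t n = read(sv[1], buf, buflen);
        if (n > 0) {
            fd_set rfds;
            struct timeval tv;

            *buffered = (size_t)n;
            FD_ZERO(&rfds);
            FD_SET(sv[1], &rfds);
            tv.tv_sec = timeout_ms / 1000;
            tv.tv_usec = (timeout_ms % 1000) * 1000;
            /* The body is in buf, so the socket itself has nothing to report. */
            rc = select(sv[1] + 1, &rfds, NULL, NULL, &tv);
        }
    }
    close(sv[0]);
    close(sv[1]);
    return rc;
}
```

Calling it with a short timeout returns 0 (timed out) even though the whole request is sitting in `buf`; scale the timeout up to `MaxInactivity` and you get the observed stall.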
### It's the version, not the platform
The only reason this looked OS-related is that @xbit compared **v3.21f on Linux** (no stall) against **v3.22a on Windows** (stall). The determinant is the version:
- **v3.21f and earlier** — no regression (`session_check()` still short-circuits on `tls_pending`); unaffected. That's why @xbit's Linux box was fast.
- **v3.22a** (post-`50258e70b`) — affected on **all platforms**. Reproduced on **Linux v3.22a** by @deathr0w_ and @nelgin, and on **Windows v3.22a** (vert and @xbit's Windows box).
(A single login test on hot-git web.synchro.net didn't surface it, most likely because of TLS record/segment timing or a low `MaxInactivity`, but the defect is present there too.)
### Fix
`b0f02c4e6` (*pink-27-boss*) on master: guard the `recvbufsocket()` wait with `tls_pending` exactly the way `sockreadline()` already does — when TLS data is already buffered, read it directly instead of waiting on the raw socket. `session_check()` / the #1155 disconnect detection is untouched.
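The shape of that guard can be sketched as follows. This is a simplified model, not the code in `b0f02c4e6`: `struct model_conn`, `model_wait_raw()`, `model_tls_read()`, and `model_recv_body()` are hypothetical names, and the raw-socket wait is reduced to a counter so the effect of the guard is observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical connection model (not the real Synchronet structs). */
struct model_conn {
    const char *tls_buf;      /* decrypted bytes held by the TLS layer */
    size_t      tls_pending;  /* how many of those bytes are still unread */
    bool        raw_readable; /* would the raw socket report readable? */
    unsigned    waits;        /* number of times we waited on the raw socket */
};

/* Stand-in for the session/socket wait; counts each call. */
bool model_wait_raw(struct model_conn *c)
{
    c->waits++;
    return c->raw_readable;
}

/* Stand-in for a TLS read: hands out bytes that are already buffered. */
size_t model_tls_read(struct model_conn *c, char *dst, size_t len)
{
    size_t n = len < c->tls_pending ? len : c->tls_pending;

    memcpy(dst, c->tls_buf, n);
    c->tls_buf += n;
    c->tls_pending -= n;
    return n;
}

/* Body read with the guard: wait on the raw socket only when TLS is empty. */
size_t model_recv_body(struct model_conn *c, char *dst, size_t len)
{
    size_t got = 0;

    assert(c != NULL && dst != NULL);
    while (got < len) {
        if (c->tls_pending == 0 && !model_wait_raw(c))
            break;                    /* nothing buffered and nothing arrived */
        size_t n = model_tls_read(c, dst + got, len - got);
        if (n == 0)
            break;                    /* this model never refills the buffer */
        got += n;
    }
    return got;
}
```

When the whole body is already buffered, `model_recv_body()` returns it without touching the raw socket at all (`waits` stays 0), and the wait path is left exactly as it was for the case where nothing is buffered.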
**Verified on vert (v3.22a):** the auth-POST gap between `Authorization check complete` and `Responding to request` dropped from **60s to 0s**; `auth.ssjs` then runs in ~0.5s and login is instant.
@deathr0w_ / @nelgin — a rebuild of `libwebsrvr.so` at `b0f02c4e6` should confirm the fix on Linux v3.22a.
*(Separately, the post-login portal page render is slow on heavily-loaded installs with a network-mounted data dir — that's a different I/O issue, not this stall.)*
— *Authored by Claude (Claude Code), on behalf of @rswindell*
--- SBBSecho 3.37-Linux
* Origin: Vertrauen - [vert/cvs/bbs].synchro.net (1:103/705)