On Sun, Dec 7, 2025 at 5:19 PM Mark Millard <marklmi@yahoo.com> wrote:

A somewhat better comparison is now available from the
On Dec 6, 2025, at 19:03, Mark Millard <marklmi@yahoo.com> wrote:
On Dec 6, 2025, at 14:25, Warner Losh <imp@bsdimp.com> wrote:
On Sat, Dec 6, 2025, 3:06 PM Mark Millard <marklmi@yahoo.com> wrote:
On Dec 6, 2025, at 06:14, Mark Millard <marklmi@yahoo.com> wrote:
Mateusz Guzik <mjguzik_at_gmail.com> wrote on
Date: Sat, 06 Dec 2025 10:50:08 UTC :
I got pointed at phoronix: https://www.phoronix.com/review/freebsd-15-amd-epyc
While I don't treat their results as gospel, a FreeBSD vs FreeBSD test
showing a slowdown most definitely warrants a closer look.
They observed slowdowns when using iperf over localhost and when compiling llvm.
I can confirm both problems and more.
I found the profiling tooling for userspace to be broken again, so I
did not investigate much and I'm not going to dig into it further.
Test box is AMD EPYC 9454 48-Core Processor, with the 2 systems
running as 8 core vms under kvm.
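For anyone wanting to poke at the iperf-over-localhost case themselves, a
minimal sketch of the sort of run involved (whether Phoronix used iperf2 or
iperf3, and their exact options, I do not know; this assumes benchmarks/iperf3
is installed):

  iperf3 -s -D                 # server in the background
  iperf3 -c 127.0.0.1 -t 30    # 30-second localhost throughput run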
. . .
Both of the below are from ampere3 (aarch64) instead, its
2 most recent "bulk -a" runs that completed, elapsed times
shown for qt6-webengine-6.9.3 builds:
150releng-arm64-quarterly qt6-webengine-6.9.3 53:33:46
135arm64-default qt6-webengine-6.9.3 38:43:36
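For scale, my arithmetic on just those two elapsed-time totals:

  53:33:46 = 192826 s
  38:43:36 = 139416 s
  192826 / 139416 =~ 1.38

i.e. roughly 38% more elapsed time for the 150releng-arm64-quarterly build.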
I remind of what started this for my specific

A few months before I landed the jemalloc patches, I did 4 or 5 from-dirt
buildworlds. The elapsed time was, IIRC, within 1 or 2%: enough to maybe see
a diff with the small sample size, but not enough for ministat to trigger at
95%. I don't recall keeping the data for this and can't find it now. And I'm
not even sure, in hindsight, I ran a good experiment. It might be related, or
not, but it would be easy enough for someone to set up two jails: one just
before and one just after. Build the world from scratch (same hash) on both.
That would test it, since you'd be holding all other variables constant.

For reference:
Host OSVERSION: 1600000
Jail OSVERSION: 1500068
vs.
Host OSVERSION: 1600000
Jail OSVERSION: 1305000
The difference for the above is in the jails' world builds,
not in the booted (kernel+world) builds.
For reference:
https://pkg-status.freebsd.org/ampere3/build.html?mastername=150releng-arm64-quarterly&build=88084f9163ae
build of www/qt6-webengine | qt6-webengine-6.9.3 ended at Sun Nov 30 05:40:02 -00 2025
build time: 2D:05:33:52
https://pkg-status.freebsd.org/ampere3/build.html?mastername=135arm64-default&build=f5384fe59be6
build of www/qt6-webengine | qt6-webengine-6.9.3 ended at Sat Nov 22 15:33:34 -00 2025
build time: 1D:14:43:41
Expanding the notes to before and after jemalloc 5.3.0
was merged to main: beefy18 was the main-amd64 builder
before and somewhat after the jemalloc 5.3.0 merge from
vendor branch:
Before: p2650762431ca_s51affb7e971 261:29:13 building 36074 port-packages, start 05 Aug 2025 01:10:59 GMT
( jemalloc 5.3.0 merge from vendor branch: 15 Aug 2025)
After : p9652f95ce8e4_sb45a181a74c 428:49:20 building 36318 port-packages, start 19 Aug 2025 01:30:33 GMT
(The log files are long gone for port-packages built.)
main-15 used a debug jail world but 15.0-RELEASE does not.
I'm not aware of such a port-package builder context for a
non-debug jail world before and after a jemalloc 5.3.0 merge.
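For rough scale (my arithmetic on the two beefy18 totals above; the package
counts, and presumably the ports trees, differ between the runs, so this is
not a controlled comparison):

  261:29:13 = 941353 s for 36074 port-packages
  428:49:20 = 1543760 s for 36318 port-packages
  1543760 / 941353 =~ 1.64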
When we imported the tip of FreeBSD main at work, we didn't get a CPU change
trigger from our tests, that I recall...
The range of commits looks like:
git: 9a7c512a6149 - main - ucred groups: restore a useful comment  Eric van Gyzen
git: bf6039f09a30 - main - jemalloc: Unthin contrib/jemalloc  Warner Losh
git: a0dfba697132 - main - jemalloc: Update jemalloc.xml.in per FreeBSD-diffs  Warner Losh
git: 718b13ba6c5d - main - jemalloc: Add FreeBSD's updates to jemalloc_preamble.h.in  Warner Losh
git: 6371645df7b0 - main - jemalloc: Add JEMALLOC_PRIVATE_NAMESPACE for the libc namespace  Warner Losh
git: da260ab23f26 - main - jemalloc: Only replace _pthread_mutex_init_calloc_cb in private namespace  Warner Losh
git: c43cad871720 - main - jemalloc: Merge from jemalloc 5.3.0 vendor branch  Warner Losh
git: 69af14a57c9e - main - jemalloc: Note update in UPDATING and RELNOTES  Warner Losh
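If someone wants to double check that range locally, something like the
following should list it (a sketch; it assumes a checkout of main and that
9a7c512a6149 is the direct parent of the first jemalloc commit):

  git -C /usr/src log --oneline 9a7c512a6149^..69af14a57c9e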
I've started a build of a non-debug 9a7c512a6149 world
to later create a chroot to do a test buildworld in.
I'll also do a build of a non-debug 69af14a57c9e world
to later create the other chroot to do a test
buildworld in.
non-debug means my use of:
WITH_MALLOC_PRODUCTION=
WITHOUT_ASSERT_DEBUG=
WITHOUT_PTHREADS_ASSERTIONS=
WITHOUT_LLVM_ASSERTIONS=
I've used "env WITH_META_MODE=" as it cuts down on the
volume and frequency of scrolling output. I'll do the
same later.
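Put together, the sort of sequence I have in mind looks roughly like this
(a sketch only: the chroot paths, the mkdir/mount steps, and -j8 are
illustrative, and each chroot also needs a copy of the matching /usr/src
placed inside it):

  # "before" world with the non-debug knobs above (e.g. in /etc/src.conf)
  git -C /usr/src checkout 9a7c512a6149
  cd /usr/src && make -j8 buildworld
  mkdir -p /scratch/chroot-before
  make installworld DESTDIR=/scratch/chroot-before
  make distribution DESTDIR=/scratch/chroot-before
  mount -t devfs devfs /scratch/chroot-before/dev

  # timed test buildworld inside the chroot
  chroot /scratch/chroot-before sh -c \
      'cd /usr/src && /usr/bin/time env WITH_META_MODE= make -j8 buildworld'

  # then the same again with 69af14a57c9e and /scratch/chroot-after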
If there is anything you want controlled in a different
way, let me know.
The Windows Dev Kit 2023 is booted (world and kernel)
with:
# uname -apKU
FreeBSD aarch64-main-pbase 16.0-CURRENT FreeBSD 16.0-CURRENT main-n281922-4872b48b175c GENERIC-NODEBUG arm64 aarch64 1600004 1600004
which is from an official pkgbase distribution. So the
boot-world is a debug world but the boot-kernel is not.
The Windows Dev Kit 2023 will take some time for such
-j8 builds and I may end up sleeping in the middle of
the sequence someplace. So it may be a while before
I've any comparison/contrast data to report.
Summary for jemalloc, before vs. at 5.3.0,
for *non-debug* contexts doing the buildworld:
before 5.3.0: 9754 seconds (about 2.7 hrs)
with 5.3.0: 9384 seconds (about 2.6 hrs)
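Spelling out the arithmetic on those two single-run figures:

  9754 - 9384 = 370 s
  370 / 9754 =~ 3.8%

i.e. the 5.3.0 world's buildworld was slightly faster here, not slower.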
While in principle this can accurately reflect the difference, the
benchmark itself is not valid as is.

First, you can't just run it once -- the result needs to be proven
repeatable and profiled. For a build of that duration, with this few
resources, for all I know the real factor was randomness from I/O.

For comparison to:
Not for a change of scale to instead be

That aside, you need a sanitized baseline. From the description it is
not clear to me at all if you are doing the build with the clang perf
regression fixed or not.

My result indicate, in part, that it is not a

Even that aside, I outlined 3 more regressions:
- slower binary startup to begin with
- slower syscalls which fail with an error
- slower syscall interface in the first place
Out of these, the first one is most important here.

Do you expect any combination of those to be a

If I was to work on this, seeing that the question at hand is whether
the jemalloc update is a problem, I would bypass all of the above and
instead take 14.3 (not stable/14!) as a baseline + the jemalloc update
on top. This eliminates all of the factors other than jemalloc itself.

I would not claim that we are targeting the same
I think the specifics of the qt6-webengine-6.9.3
I'll note that ampere1 with a 14.3 jail took 38:25:51

Building world also seems a little fishy here and it is not clear to
me at all what version you have built -- was the new jemalloc thing
building new jemalloc and old jemalloc building old jemalloc? More
importantly, I would be worried that some of the build picks up
whatever jemalloc it finds to use during some of the build.

The 9xxx sec timings were both building:
Using qt6-webengine-6.9.3 would mean using a known
150releng-arm64-quarterly on ampere3:

I would benchmark this by building a big port (not timing dependencies
of the port, just the port itself -- maybe even chromium or firefox).

That's of course quite a bit of effort and if there is nobody to do
that (or something comparable), imo the pragmatic play is to revert
the jemalloc update for the time being. This restores the known
working state and, should the update be a good thing, it can land for
15.1, maybe fixed up.
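On the "you can't just run it once" point: a sketch of what would give
ministat(1) something to judge (the run count, file names, and clearing
/usr/obj between runs are assumptions on my part):

  cd /usr/src
  for i in 1 2 3 4 5; do
      rm -rf /usr/obj/*
      /usr/bin/time -a -o before.times make -j8 buildworld > /dev/null 2>&1
  done
  # switch to the other world/chroot, repeat into after.times, then:
  ministat before.times after.times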
On Mon, 8 Dec 2025 09:23:52 -0800
Mark Millard <marklmi@yahoo.com> wrote:
But, as of yet, I've no good evidence for blaming
jemalloc as a major contributor to those timing
ratios -- or for blaming any other specific part
of 15.0.
If you want to bench jemalloc, there are other ways to do that without building anything.
Try to find some synthetic benchmarks.
Also, jemalloc can be built without an OS rebuild and linked against a benchmark.
These two things can reduce the time to test, but they do eliminate the OS integration factors.
Running the same benchmark on different OSes may give more info.
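A sketch of one way to do the second suggestion (the repo URL is upstream
jemalloc, the benchmark binary is a placeholder, and I have not verified that
LD_PRELOAD interposition over libc's malloc on FreeBSD behaves the same as a
libc built against the new jemalloc):

  git clone https://github.com/jemalloc/jemalloc
  cd jemalloc && git checkout 5.3.0    # and an older tag for the "before" case
  ./autogen.sh && ./configure && make -j8
  LD_PRELOAD=$PWD/lib/libjemalloc.so.2 ./your-malloc-microbenchmark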