• Re: performance regressions in 15.0

    From Warner Losh@imp@bsdimp.com to muc.lists.freebsd.stable on Sat Dec 6 15:25:36 2025
    From Newsgroup: muc.lists.freebsd.stable

    On Sat, Dec 6, 2025, 3:06 PM Mark Millard <marklmi@yahoo.com> wrote:


    On Dec 6, 2025, at 06:14, Mark Millard <marklmi@yahoo.com> wrote:

    Mateusz Guzik <mjguzik_at_gmail.com> wrote on
    Date: Sat, 06 Dec 2025 10:50:08 UTC :

    I got pointed at phoronix: https://www.phoronix.com/review/freebsd-15-amd-epyc

    While I don't treat their results as gospel, a FreeBSD vs FreeBSD test
    showing a slowdown most definitely warrants a closer look.

    They observed slowdowns when using iperf over localhost and when
    compiling llvm.

    I can confirm both problems and more.

    I found the profiling tooling for userspace to be broken again so I
    did not investigate much and I'm not going to dig into it further.

    Test box is AMD EPYC 9454 48-Core Processor, with the 2 systems
    running as 8 core vms under kvm.
    . . .



    Both of the below are from ampere3 (aarch64) instead, its
    2 most recent "bulk -a" runs that completed, elapsed times
    shown for qt6-webengine-6.9.3 builds:

    150releng-arm64-quarterly qt6-webengine-6.9.3 53:33:46
    135arm64-default qt6-webengine-6.9.3 38:43:36

    For reference:

    Host OSVERSION: 1600000
    Jail OSVERSION: 1500068

    vs.

    Host OSVERSION: 1600000
    Jail OSVERSION: 1305000

    The difference for the above is in the Jail's world builds,
    not in the boot's (kernel+world) builds.
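
    For scale, the gap between those two elapsed times can be sketched with
    plain sh arithmetic (the to_secs helper is mine, not from the thread):

    ```shell
    # Convert the poudriere-style HH:MM:SS elapsed times quoted above to
    # seconds, then compute the integer percent increase of the 15.0 jail's
    # build over the 13.5 jail's.
    to_secs() {
        IFS=: read -r h m s <<EOF
    $1
    EOF
        echo $(( h * 3600 + m * 60 + s ))
    }

    a=$(to_secs 53:33:46)            # 150releng-arm64-quarterly
    b=$(to_secs 38:43:36)            # 135arm64-default
    echo "$(( (a - b) * 100 / b ))% longer"
    ```

    That works out to roughly 38% more wall-clock time for the 15.0-jail
    build of the same port.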


    For reference:



    https://pkg-status.freebsd.org/ampere3/build.html?mastername=150releng-arm64-quarterly&build=88084f9163ae

    build of www/qt6-webengine | qt6-webengine-6.9.3 ended at Sun Nov 30
    05:40:02 -00 2025
    build time: 2D:05:33:52



    https://pkg-status.freebsd.org/ampere3/build.html?mastername=135arm64-default&build=f5384fe59be6

    build of www/qt6-webengine | qt6-webengine-6.9.3 ended at Sat Nov 22
    15:33:34 -00 2025
    build time: 1D:14:43:41


    Expanding the notes to before and after jemalloc 5.3.0
    was merged to main: beefy18 was the main-amd64 builder
    before and somewhat after the jemalloc 5.3.0 merge from
    vendor branch:

    Before: p2650762431ca_s51affb7e971 261:29:13 building 36074 port-packages,
    start 05 Aug 2025 01:10:59 GMT
    ( jemalloc 5.3.0 merge from vendor branch: 15 Aug 2025)
    After : p9652f95ce8e4_sb45a181a74c 428:49:20 building 36318 port-packages,
    start 19 Aug 2025 01:30:33 GMT

    (The log files are long gone for port-packages built.)

    main-15 used a debug jail world but 15.0-RELEASE does not.

    I'm not aware of such a port-package builder context for a
    non-debug jail world before and after a jemalloc 5.3.0 merge.
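
    Since the package counts of the two runs differ slightly, seconds per
    package is the fairer comparison; an illustrative awk one-liner over the
    numbers above:

    ```shell
    # Totals are HHH:MM:SS and package counts from the two beefy18 runs above.
    awk 'BEGIN {
        before = (261*3600 + 29*60 + 13) / 36074   # p2650762431ca_s51affb7e971
        after  = (428*3600 + 49*60 + 20) / 36318   # p9652f95ce8e4_sb45a181a74c
        printf "%.1f s/pkg -> %.1f s/pkg (+%.0f%%)\n", before, after,
               (after/before - 1) * 100
    }'
    ```

    i.e. roughly 26 s to roughly 43 s per package, about 63% more, though
    the port set and port versions also moved between the two runs.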


    A few months before I landed the jemalloc patches, I did 4 or 5 from-dirt
    buildworlds. The elapsed time was, IIRC, within 1 or 2%. Enough to maybe
    see a diff with the small sample size, but not enough for ministat to
    trigger at 95%. I don't recall keeping the data for this and can't find
    it now. And I'm not even sure, in hindsight, that I ran a good experiment.
    It might be related, or not, but it would be easy enough for someone to
    set up two jails: one just before and one just after. Build the world
    from scratch (same hash) on both. That would test it, since you'd be
    holding all other variables constant.
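
    A hypothetical sketch of that experiment; every name, path, and hash
    here is illustrative, and "run" just echoes each command so this prints
    the plan instead of spending days building:

    ```shell
    run() { echo "$@"; }    # dry-run: print each command instead of executing it
    SRC_HASH=0123abcd       # illustrative: one fixed src commit, built in BOTH jails

    # One chroot whose world is from just before the change, one from just
    # after; time several from-scratch buildworlds of the same src hash in each.
    for jail in before after; do
        for i in 1 2 3; do
            run chroot /jails/$jail sh -c "cd /usr/src &&
                git checkout -q $SRC_HASH &&
                make cleanworld &&
                /usr/bin/time -a -o /tmp/$jail.times make -j8 buildworld"
        done
    done

    # ministat then reports whether the two samples differ at 95% confidence
    run ministat /tmp/before.times /tmp/after.times
    ```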

    When we imported the tip of FreeBSD main at work, we didn't get a cpu
    change trigger from our tests that I recall...

    Warner






    --
    Posted automagically by a mail2news gateway at muc.de e.V.
    Please direct questions, flames, donations, etc. to news-admin@muc.de
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Mark Millard@marklmi@yahoo.com to muc.lists.freebsd.stable on Sat Dec 6 19:03:43 2025

    On Dec 6, 2025, at 14:25, Warner Losh <imp@bsdimp.com> wrote:
    On Sat, Dec 6, 2025, 3:06 PM Mark Millard <marklmi@yahoo.com> wrote:

    On Dec 6, 2025, at 06:14, Mark Millard <marklmi@yahoo.com> wrote:

    Mateusz Guzik <mjguzik_at_gmail.com> wrote on
    Date: Sat, 06 Dec 2025 10:50:08 UTC :

    I got pointed at phoronix: https://www.phoronix.com/review/freebsd-15-amd-epyc

    While I don't treat their results as gospel, a FreeBSD vs FreeBSD test
    showing a slowdown most definitely warrants a closer look.

    They observed slowdowns when using iperf over localhost and when compiling llvm.

    I can confirm both problems and more.

    I found the profiling tooling for userspace to be broken again so I
    did not investigate much and I'm not going to dig into it further.

    Test box is AMD EPYC 9454 48-Core Processor, with the 2 systems
    running as 8 core vms under kvm.
    . . .



    Both of the below are from ampere3 (aarch64) instead, its
    2 most recent "bulk -a" runs that completed, elapsed times
    shown for qt6-webengine-6.9.3 builds:

    150releng-arm64-quarterly qt6-webengine-6.9.3 53:33:46
    135arm64-default qt6-webengine-6.9.3 38:43:36

    For reference:

    Host OSVERSION: 1600000
    Jail OSVERSION: 1500068

    vs.

    Host OSVERSION: 1600000
    Jail OSVERSION: 1305000

    The difference for the above is in the Jail's world builds,
    not in the boot's (kernel+world) builds.


    For reference:


    https://pkg-status.freebsd.org/ampere3/build.html?mastername=150releng-arm64-quarterly&build=88084f9163ae

    build of www/qt6-webengine | qt6-webengine-6.9.3 ended at Sun Nov 30 05:40:02 -00 2025
    build time: 2D:05:33:52


    https://pkg-status.freebsd.org/ampere3/build.html?mastername=135arm64-default&build=f5384fe59be6

    build of www/qt6-webengine | qt6-webengine-6.9.3 ended at Sat Nov 22 15:33:34 -00 2025
    build time: 1D:14:43:41


    Expanding the notes to before and after jemalloc 5.3.0
    was merged to main: beefy18 was the main-amd64 builder
    before and somewhat after the jemalloc 5.3.0 merge from
    vendor branch:

    Before: p2650762431ca_s51affb7e971 261:29:13 building 36074 port-packages, start 05 Aug 2025 01:10:59 GMT
    ( jemalloc 5.3.0 merge from vendor branch: 15 Aug 2025)
    After : p9652f95ce8e4_sb45a181a74c 428:49:20 building 36318 port-packages, start 19 Aug 2025 01:30:33 GMT

    (The log files are long gone for port-packages built.)

    main-15 used a debug jail world but 15.0-RELEASE does not.

    I'm not aware of such a port-package builder context for a
    non-debug jail world before and after a jemalloc 5.3.0 merge.

    A few months before I landed the jemalloc patches, i did 4 or 5 from dirt buildworlds. The elasped time was, iirc, with 1 or 2%. Enough to see maybe a diff with the small sample size, but not enough for ministat to trigger at 95%. I didn't recall keeping the data for this and can't find it now. And I'm not even sure, in hindsight, I ran a good experiment. It might be related, or not, but it would be easy enough for someone to setup a two jails: one just before and one just after. Build from scratch the world (same hash) on both. That would test it since you'd be holding all other variables constant.

    When we imported the tip of FreeBSD main at work, we didn't get a cpu change trigger from our tests that I recall...
    The range of commits looks like:

    • git: 9a7c512a6149 - main - ucred groups: restore a useful comment  Eric van Gyzen
    • git: bf6039f09a30 - main - jemalloc: Unthin contrib/jemalloc  Warner Losh
    • git: a0dfba697132 - main - jemalloc: Update jemalloc.xml.in per FreeBSD-diffs  Warner Losh
    • git: 718b13ba6c5d - main - jemalloc: Add FreeBSD's updates to jemalloc_preamble.h.in  Warner Losh
    • git: 6371645df7b0 - main - jemalloc: Add JEMALLOC_PRIVATE_NAMESPACE for the libc namespace  Warner Losh
    • git: da260ab23f26 - main - jemalloc: Only replace _pthread_mutex_init_calloc_cb in private namespace  Warner Losh
    • git: c43cad871720 - main - jemalloc: Merge from jemalloc 5.3.0 vendor branch  Warner Losh
    • git: 69af14a57c9e - main - jemalloc: Note update in UPDATING and RELNOTES  Warner Losh
    I've started a build of a non-debug 9a7c512a6149 world
    to later create a chroot to do a test buildworld in.
    I'll also do a build of a non-debug 69af14a57c9e world
    to later create the other chroot to do a test
    buildworld in.

    non-debug means my use of:

    WITH_MALLOC_PRODUCTION=
    WITHOUT_ASSERT_DEBUG=
    WITHOUT_PTHREADS_ASSERTIONS=
    WITHOUT_LLVM_ASSERTIONS=

    I've used "env WITH_META_MODE=" as it cuts down on the
    volume and frequency of scrolling output. I'll do the
    same later.

    If there is anything you want controlled in a different
    way, let me know.

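
    For concreteness, a sketch of where those knobs would live, assuming
    the usual /etc/src.conf mechanism (the file path and the -j value are
    my assumptions, not something stated above):

    ```
    # /etc/src.conf inside each chroot: the non-debug knobs listed above
    WITH_MALLOC_PRODUCTION=
    WITHOUT_ASSERT_DEBUG=
    WITHOUT_PTHREADS_ASSERTIONS=
    WITHOUT_LLVM_ASSERTIONS=
    ```

    with the build then run as something like
    "env WITH_META_MODE= make -j8 buildworld".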
    The Windows Dev Kit 2023 is booted (world and kernel)
    with:
    # uname -apKU
    FreeBSD aarch64-main-pbase 16.0-CURRENT FreeBSD 16.0-CURRENT main-n281922-4872b48b175c GENERIC-NODEBUG arm64 aarch64 1600004 1600004
    which is from an official pkgbase distribution. So the
    boot-world is a debug world but the boot-kernel is not.

    The Windows Dev Kit 2023 will take some time for such
    -j8 builds and I may end up sleeping in the middle of
    the sequence someplace. So it may be a while before
    I've any comparison/contrast data to report.

    ===
    Mark Millard
    marklmi at yahoo.com
  • From Konstantin Belousov@kib@freebsd.org to muc.lists.freebsd.stable on Mon Dec 8 03:59:17 2025

    On Mon, Dec 08, 2025 at 03:51:05AM +0200, Rozhuk Ivan wrote:
    On Mon, 8 Dec 2025 02:15:33 +0200
    Konstantin Belousov <kib@freebsd.org> wrote:

    Next, the change of llvm components to dynamically link with the llvm
    libs is how upstream does it. Not to mention that this way of using
    clang+lld saves both disk space (not very important) and memory (much
    more important).

    It wastes time and energy = it wastes money, "multiplies CO2 production".
    And there is nothing good in it for the user to justify paying this price.


    I have:

    # pkg version -vI | grep llvm
    libclc-llvm15-15.0.7 = up-to-date with index
    llvm15-15.0.7_10 = up-to-date with index
    llvm17-17.0.6_8 = up-to-date with index
    llvm18-18.1.8_2 = up-to-date with index
    llvm19-19.1.7_1 = up-to-date with index

    there are no crappy libprivateclang.so/libprivatellvm.so shared libs:

    # ldd /usr/local/llvm19/bin/clang-19
    /usr/local/llvm19/bin/clang-19:
    libthr.so.3 => /lib/libthr.so.3 (0x801063000)
    libclang-cpp.so.19.1 => /usr/local/llvm19/bin/../lib/libclang-cpp.so.19.1 (0x801200000)
    libLLVM.so.19.1 => /usr/local/llvm19/bin/../lib/libLLVM.so.19.1 (0x805c00000)
    Did you notice this line?

    libc++.so.1 => /lib/libc++.so.1 (0x801092000)
    libcxxrt.so.1 => /lib/libcxxrt.so.1 (0x80119b000)
    libm.so.5 => /lib/libm.so.5 (0x8011bd000)
    libc.so.7 => /lib/libc.so.7 (0x80d663000)
    librt.so.1 => /lib/librt.so.1 (0x805bcb000)
    libexecinfo.so.1 => /usr/lib/libexecinfo.so.1 (0x805bd4000)
    libz.so.6 => /lib/libz.so.6 (0x805bda000)
    libzstd.so.1 => /usr/local/lib/libzstd.so.1 (0x80d963000)
    libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x80da38000)
    libelf.so.2 => /lib/libelf.so.2 (0x80da59000)
    [vdso] (0x7ffffffff000)

    But
    # ls /usr/bin/cc
    -r-xr-xr-x 6 root wheel 82M Oct 19 18:10:39 2025 /usr/bin/cc*
    # ls /usr/local/llvm19/bin/clang-19
    -rwxr-xr-x 2 root wheel 125K Aug 18 06:43:31 2025 /usr/local/llvm19/bin/clang-19*
    So it is dynamically linked....
    ....
    And we find in the port:
    CMAKE_ARGS= -DLLVM_BUILD_LLVM_DYLIB=ON
    CMAKE_ARGS+= -DLLVM_LINK_LLVM_DYLIB=ON
    (present since the first llvm6 port, commit 372b8a151352984140f74c342a62eae2236b2c2c,
    and copy-pasted into all subsequent llvm ports by brooks@FreeBSD.org)

    According to https://llvm.org/docs/CMake.html:
    =============================================================================================
    BUILD_SHARED_LIBS is only recommended for use by LLVM developers.
    If you want to build LLVM as a shared library, you should use the LLVM_BUILD_LLVM_DYLIB option.
    =============================================================================================

    So upstream DOES NOT RECOMMEND that users build shared libs!!!
    I am curious about the motivation.

    JFYI, shared llvm libs are required for a lot of things. An incomplete
    list of examples that I am aware of: dri drivers and the Intel ispc compiler.


    Why does FreeBSD use shared libs for LLVM in ports, and now in base!???

    @brooks - why do you do that?


    The implied load on rtld is something that could be handled: there is
    definitely no need for such a huge surface of exported symbols on
    both libllvm and esp. libclang. Perhaps the internal libraries could
    use protected symbols by default; normally C++ does not rely on
    interposing. But such 'fixes' must occur upstream.

    So far, all the clang toolchain changes have been aligning it with what
    the llvm project does.


    No, upstream does not recommend shared libs to llvm users.


  • From Konstantin Belousov@kib@freebsd.org to muc.lists.freebsd.stable on Mon Dec 8 13:38:57 2025

    On Mon, Dec 08, 2025 at 07:45:40AM +0000, Poul-Henning Kamp wrote:
    --------
    Konstantin Belousov writes:

    JFYI, shared llvm libs are required for lot of things. The incomplete
    list of examples that I am aware of are dri drivers and ispc Intel compiler.

    But installing the shared libs for those other users does not mean we
    have to link the compiler itself against the shared lib?

    Sure, we do not have to.

    But there are other benefits from linking the libraries dynamically.
    E.g. the same (?) user shed crocodile tears over the memory usage of a
    64-bit system, and linking libllvm dynamically reduces the memory
    profile precisely by sharing a significant part of the text of cc, lld,
    and the minor binutils.

