• Re: performance regressions in 15.0 [14.3-STABLE performs much like 15.0-RELEASE for my aarch64 devel/cmake-core build tests, not like 14.3-RELEASE]

    From Mark Millard@marklmi@yahoo.com to muc.lists.freebsd.stable on Tue Dec 9 20:10:53 2025
    From Newsgroup: muc.lists.freebsd.stable

    On Dec 9, 2025, at 17:15, Mark Millard <marklmi@yahoo.com> wrote:
    On Dec 9, 2025, at 12:32, Mark Millard <marklmi@yahoo.com> wrote:

    On Dec 9, 2025, at 07:22, Rozhuk Ivan <rozhuk.im@gmail.com> wrote:

    On Mon, 8 Dec 2025 09:23:52 -0800
    Mark Millard <marklmi@yahoo.com> wrote:

    But, as of yet, I've no good evidence for blaming
    jemalloc as a major contributor to those timing
    ratios --or for blaming any other specific part
    of 15.0 .

    If you want to bench jemalloc, there are other ways to do that without building something.
    Try to find some synthetic benchmarks.
    Also, jemalloc can be built without an OS rebuild and linked with the benchmark.

    These two things can reduce the time to test, but they will eliminate OS integration factors.
    Running the same benchmark on different OSes may give more info.


    [I've eliminated direct Email to most everyone
    for this reply. There is not even minor new
    technical content.]

    At this point I'm more likely to explore if I
    get similar ratios as ampere[13] do for some
    port-package builds that have the large ratios on
    ampere[13]. There are examples that are not as
    overall time consuming for ampere[13] as what I've
    already referenced (but are still non-trivial for
    the time taken). As stands, I do not have a good
    reproduce-the-issue context, much less one with
    build time frames I'd be willing to deal with in
    my environment.

    Time-ratios similar to the ampere[13] ones for
    15.0 vs. 14.3 (or 13.5) were easily repeatable
    on the Microsoft Windows Dev Kit 2023 for doing
    poudriere builds of the examples that I tried.

    port-package builds tested for below: devel/cmake-core
    TMPFS_BLACKLIST empty
    ALLOW_MAKE_JOBS= in use (no explicit MAKE_JOBS_NUMBER like restrictions)
    UFS context (except for what USE_TMPFS=all does in poudriere)
    The below did not update /usr/ports/distfiles/ .

    This does some exploration of USE_TMPFS=no vs.
    USE_TMPFS=all as well, starting with
    USE_TMPFS=no .

    Listed in the sequence executed, first time
    runs shown first:


    USE_TMPFS=no . . .
    (Note: The first times had other port-packages to build first.)

    15.0 poudriere jail:
    [00:37:37] [01] [00:12:30] Finished devel/cmake-core | cmake-core-3.31.9: Success

    14.3 poudriere jail:
    [00:28:26] [01] [00:09:38] Finished devel/cmake-core | cmake-core-3.31.9: Success

    Approx. 1.30 time ratio (15.0's 12:30 / 14.3's 9:38)
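The ratio above can be reproduced from the two "Finished" lines. The following is a minimal Python sketch, assuming the third bracketed field is the per-package build time (which is how the 12:30 and 9:38 figures are used here). The names `FINISHED` and `build_seconds` are illustrative only and are not part of poudriere.

```python
import re

# Assumed layout of a poudriere "Finished" line:
#   [elapsed] [builder] [package build time] Finished origin | pkgname: result
FINISHED = re.compile(
    r"\[\d+:\d+:\d+\]\s+\[\d+\]\s+\[(\d+):(\d+):(\d+)\]\s+Finished\s+(\S+)"
)


def build_seconds(line):
    """Return the package build time of a "Finished" line, in seconds."""
    match = FINISHED.search(line)
    if match is None:
        raise ValueError("not a poudriere 'Finished' line: %r" % line)
    hours, minutes, seconds = (int(match.group(i)) for i in (1, 2, 3))
    return hours * 3600 + minutes * 60 + seconds


line_15_0 = "[00:37:37] [01] [00:12:30] Finished devel/cmake-core | cmake-core-3.31.9: Success"
line_14_3 = "[00:28:26] [01] [00:09:38] Finished devel/cmake-core | cmake-core-3.31.9: Success"

# 750 s for the 15.0 jail versus 578 s for the 14.3 jail.
ratio = build_seconds(line_15_0) / build_seconds(line_14_3)
print(f"{ratio:.2f}")  # prints 1.30
```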


    USE_TMPFS=all (no tmpfs black list) . . .

    14.3 poudriere jail:
    [00:09:32] [03] [00:09:24] Finished devel/cmake-core | cmake-core-3.31.9: Success

    15.0 poudriere jail:
    [00:12:45] [03] [00:12:34] Finished devel/cmake-core | cmake-core-3.31.9: Success

    Approx. 1.34 time ratio (15.0's/14.3's)


    The following also prefixed the poudriere bulk -C command
    with: time -l

    15.0 poudriere jail:
    [00:12:36] [04] [00:12:25] Finished devel/cmake-core | cmake-core-3.31.9: Success
    . . .
    757.10 real 4613.06 user 251.09 sys
    866580 maximum resident set size
    131 average shared memory size
    27 average unshared data size
    234 average unshared stack size
    31148816 page reclaims
    0 page faults
    0 swaps
    14 block input operations
    36 block output operations
    37061 messages sent
    33671 messages received
    1758 signals received
    143987 voluntary context switches
    167515 involuntary context switches

    14.3 poudriere jail:
    [00:09:23] [01] [00:09:15] Finished devel/cmake-core | cmake-core-3.31.9: Success
    . . .
    564.48 real 3449.89 user 204.14 sys
    822900 maximum resident set size
    64692 average shared memory size
    791 average unshared data size
    235 average unshared stack size
    28153497 page reclaims
    0 page faults
    0 swaps
    9 block input operations
    12 block output operations
    34180 messages sent
    31539 messages received
    1758 signals received
    131899 voluntary context switches
    132775 involuntary context switches

    Approx. 1.34 time ratio (15.0's/14.3's)
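To see where the extra time goes, the two `time -l` reports above can be compared field by field. This is a rough Python sketch, assuming the report layout shown in this message (one "real/user/sys" line followed by "value label" lines). `parse_time_l` and `busy_cpus` are hypothetical helper names, and only a few of the report lines are copied in to keep it short.

```python
def parse_time_l(text):
    """Parse `time -l` style output into a {label: number} dict."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    stats = {}
    # First line looks like: "757.10 real 4613.06 user 251.09 sys"
    fields = lines[0].split()
    for value, label in zip(fields[0::2], fields[1::2]):
        stats[label] = float(value)
    # Remaining lines look like: "31148816 page reclaims"
    for ln in lines[1:]:
        value, label = ln.split(None, 1)
        stats[label] = int(value)
    return stats


# Excerpts of the USE_TMPFS=all reports quoted above.
RUN_15_0 = """
757.10 real 4613.06 user 251.09 sys
866580 maximum resident set size
31148816 page reclaims
"""
RUN_14_3 = """
564.48 real 3449.89 user 204.14 sys
822900 maximum resident set size
28153497 page reclaims
"""

new, old = parse_time_l(RUN_15_0), parse_time_l(RUN_14_3)
ratios = {key: new[key] / old[key] for key in new}


def busy_cpus(stats):
    """Average number of CPUs kept busy: (user + sys) / real."""
    return (stats["user"] + stats["sys"]) / stats["real"]


for key in ("real", "user", "sys", "page reclaims"):
    print(f"{key:>14}: {ratios[key]:.2f}")
print(f"busy CPUs 15.0: {busy_cpus(new):.2f}, 14.3: {busy_cpus(old):.2f}")
```

With these figures the user-time ratio (about 1.34) tracks the wall-clock ratio, while (user + sys) / real is about 6.4 for both jails. That suggests the extra time is CPU work rather than idle waiting, but it is only a reading of the numbers, not a diagnosis.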


    USE_TMPFS=no . . . (again)

    15.0 poudriere jail:
    [00:13:01] [04] [00:12:27] Finished devel/cmake-core | cmake-core-3.31.9: Success
    . . .
    784.89 real 4596.42 user 257.12 sys
    866600 maximum resident set size
    128 average shared memory size
    25 average unshared data size
    234 average unshared stack size
    31194466 page reclaims
    2371 page faults
    0 swaps
    3573 block input operations
    6687 block output operations
    37643 messages sent
    33840 messages received
    1756 signals received
    241548 voluntary context switches
    304249 involuntary context switches

    14.3 poudriere jail:
    [00:09:49] [04] [00:09:18] Finished devel/cmake-core | cmake-core-3.31.9: Success
    . . .
    592.83 real 3446.18 user 207.61 sys
    823880 maximum resident set size
    64712 average shared memory size
    787 average unshared data size
    236 average unshared stack size
    28176650 page reclaims
    2374 page faults
    0 swaps
    3481 block input operations
    5148 block output operations
    34521 messages sent
    31580 messages received
    1758 signals received
    218881 voluntary context switches
    255193 involuntary context switches

    Approx. 1.34 time ratio (15.0's/14.3's)
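One way to check that USE_TMPFS is not what drives the ratio is to put the four `time -l` reports side by side. This is a small Python sketch using the "real" and "user" seconds quoted in this message. Note that "real" covers the whole poudriere bulk command, so its ratio differs slightly from the per-package one. `RUNS`, `ratio` and `tmpfs_saving` are names made up for the illustration.

```python
# "real" and "user" seconds copied from the `time -l` reports in this message,
# keyed by (jail version, USE_TMPFS setting).
RUNS = {
    ("15.0", "all"): {"real": 757.10, "user": 4613.06},
    ("14.3", "all"): {"real": 564.48, "user": 3449.89},
    ("15.0", "no"): {"real": 784.89, "user": 4596.42},
    ("14.3", "no"): {"real": 592.83, "user": 3446.18},
}


def ratio(tmpfs, field):
    """15.0 / 14.3 ratio of one field for a given USE_TMPFS setting."""
    return RUNS[("15.0", tmpfs)][field] / RUNS[("14.3", tmpfs)][field]


def tmpfs_saving(version):
    """Wall-clock seconds saved by USE_TMPFS=all relative to USE_TMPFS=no."""
    return RUNS[(version, "no")]["real"] - RUNS[(version, "all")]["real"]


for tmpfs in ("all", "no"):
    print(f"USE_TMPFS={tmpfs}: real {ratio(tmpfs, 'real'):.2f}, "
          f"user {ratio(tmpfs, 'user'):.2f}")
for version in ("15.0", "14.3"):
    print(f"{version}: USE_TMPFS=all saves {tmpfs_saving(version):.2f} s")
```

In these runs tmpfs saves roughly 28 seconds of wall-clock time on either jail, while the 15.0 vs. 14.3 gap is about 190 seconds either way.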


    Only some port-packages have time-ratios
    near 1.34. For example, building lang/gcc15
    does not on ampere[13]: closer to 1.1 as
    I remember. (For the most part, lang/gcc15
    does most of its own building based on a
    smaller amount of clang-built code
    to bootstrap.)


    For reference:

    # poudriere jail -l
    JAILNAME VERSION OSVERSION ARCH METHOD TIMESTAMP PATH
    release14-aarch64 14.3-RELEASE-p6 1403000 arm64.aarch64 ftp-archive 2025-12-09 12:54:06 /usr/local/poudriere/jails/release14-aarch64
    . . .
    release-aarch64 15.0-RELEASE 1500068 aarch64 pkgbase 2025-12-06 11:34:39 /usr/local/poudriere/jails/release-aarch64
    . . .

    # ~/fbsd-based-on-what-commit.sh -C /usr/ports
    bb7b77417165 (HEAD -> main, freebsd/main, freebsd/HEAD) www/hurl: update 7.0.0 -> 7.1.0
    Author: Rodrigo Osorio <rodrigo@FreeBSD.org>
    Commit: Rodrigo Osorio <rodrigo@FreeBSD.org>
    CommitDate: 2025-11-28 23:11:52 +0000
    branch: main
    merge-base: bb7b774171651eea0dc56376c225fe976231daa5
    merge-base: CommitDate: 2025-11-28 23:11:52 +0000
    n726888 (--first-parent --count for merge-base)

    # uname -apKU
    FreeBSD aarch64-main-pbase 16.0-CURRENT FreeBSD 16.0-CURRENT main-n281922-4872b48b175c GENERIC-NODEBUG arm64 aarch64 1600004 1600004

    (That last was an official pkgbase distribution.)

    14.3-STABLE does not have jemalloc 5.3.0 or libsys
    but performs like 15.0-RELEASE, not 14.3-RELEASE
    for the aarch64 devel/cmake-core build tests.

    But 14.3-STABLE does have:

    # ldd /usr/local/poudriere/jails/official14-aarch64/usr/bin/cc
    /usr/local/poudriere/jails/official14-aarch64/usr/bin/cc:
    libprivateclang.so.19 => /usr/lib/libprivateclang.so.19 (0x732e0d600000)
    libprivatellvm.so.19 => /usr/lib/libprivatellvm.so.19 (0x732e12600000)
    . . .

    while 14.3-RELEASE does not.

    (Another data point is that lang/gcc15 does not have
    nearly as large of a time-ratio vs. 14.3-RELEASE
    in the data from ampere[13] .)

    Details from the Microsoft Dev Kit 2023 experiments
    . . .

    I've collected a sequence for a new poudriere jail
    to compare/contrast with:

    # poudriere jail -l
    JAILNAME VERSION OSVERSION ARCH METHOD TIMESTAMP PATH
    . . .
    official14-aarch64 14.3-STABLE 1403506 arm64.aarch64 freebsdci 2025-12-09 18:24:20 /usr/local/poudriere/jails/official14-aarch64
    . . .

    (ampere[13] do not have examples of recent 14.3-STABLE builds at this point.)


    USE_TMPFS=no . . .
    (Note: The first times had other port-packages to build first.
    But the system still has the file system data cached.)

    stable/14 poudriere jail:
    [00:36:29] [01] [00:12:31] Finished devel/cmake-core | cmake-core-3.31.9: Success

    So: 12:31 is far more like 15.0-RELEASE


    USE_TMPFS=all . . .

    stable/14 poudriere jail:
    [00:12:21] [07] [00:12:10] Finished devel/cmake-core | cmake-core-3.31.9: Success
    . . .
    742.70 real 4586.53 user 248.37 sys
    864996 maximum resident set size
    133 average shared memory size
    24 average unshared data size
    235 average unshared stack size
    30958626 page reclaims
    0 page faults
    0 swaps
    456 block input operations
    80 block output operations
    35920 messages sent
    33223 messages received
    1760 signals received
    140580 voluntary context switches
    164112 involuntary context switches
    So: 12:10 is far more like 15.0-RELEASE

    stable/14 poudriere jail (again):
    [00:12:30] [08] [00:12:19] Finished devel/cmake-core | cmake-core-3.31.9: Success
    . . .
    751.98 real 4604.85 user 251.40 sys
    866056 maximum resident set size
    125 average shared memory size
    21 average unshared data size
    235 average unshared stack size
    30976603 page reclaims
    0 page faults
    0 swaps
    20 block input operations
    11 block output operations
    36297 messages sent
    33327 messages received
    1761 signals received
    144213 voluntary context switches
    166975 involuntary context switches
    So: 12:19 is far more like 15.0-RELEASE


    USE_TMPFS=no . . .
    (Note: The first times had other port-packages to build first.)

    stable/14 poudriere jail:
    [00:13:16] [05] [00:12:49] Finished devel/cmake-core | cmake-core-3.31.9: Success
    . . .
    799.95 real 4626.06 user 261.49 sys
    865940 maximum resident set size
    134 average shared memory size
    24 average unshared data size
    235 average unshared stack size
    31110419 page reclaims
    2380 page faults
    0 swaps
    3577 block input operations
    6262 block output operations
    37253 messages sent
    33801 messages received
    1758 signals received
    236161 voluntary context switches
    312615 involuntary context switches
    So: 12:49 is far more like 15.0-RELEASE
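The "far more like 15.0-RELEASE" observation can be checked against the user-time figures from the USE_TMPFS=all runs. This is a minimal Python sketch using only numbers quoted in this message. `REFERENCE_USER`, `STABLE_14_USER` and `closest_reference` are names invented for the example.

```python
# User CPU seconds from the USE_TMPFS=all `time -l` reports in this message.
REFERENCE_USER = {"14.3-RELEASE": 3449.89, "15.0-RELEASE": 4613.06}
STABLE_14_USER = [4586.53, 4604.85]  # the two stable/14 runs


def closest_reference(user_seconds):
    """Name of the reference jail whose user time is nearest to the given one."""
    return min(
        REFERENCE_USER,
        key=lambda name: abs(REFERENCE_USER[name] - user_seconds),
    )


for user in STABLE_14_USER:
    vs_14_3 = user / REFERENCE_USER["14.3-RELEASE"]
    vs_15_0 = user / REFERENCE_USER["15.0-RELEASE"]
    print(f"{user:.2f} s: {vs_14_3:.2f}x 14.3-RELEASE, "
          f"{vs_15_0:.2f}x 15.0-RELEASE, closest: {closest_reference(user)}")
```

Both stable/14 runs land within about 1% of the 15.0-RELEASE user time and about 33% above the 14.3-RELEASE one.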

    (Nice to have a known repeatable context to try
    variations with.)

    ===
    Mark Millard
    marklmi at yahoo.com