• [gentoo-dev] [RFC] Future changes to LLVM eclasses (or how do you use L

    From Michał Górny@21:1/5 to All on Tue Dec 3 16:40:01 2024
    Hello,

    TL;DR: the way llvm/llvm-r1 eclasses currently mangle PATH is broken,
    and I'd like to replace that with something better (possibly in
    llvm-r2.eclass, given how fragile this thing is). So I'd like to discuss
    potential "better" solutions -- and particularly ask you what your
    LLVM-using packages need.


    Background
    ==========

    The current logic goes way back to llvm.eclass, and EAPIs that did not
    have native cross-build support. Back then, prepending the slotted LLVM
    bindir to PATH was the obvious way of getting software to find the right
    LLVM version.

    When I added EAPI 7 support, I went for prepending the following thing
    to PATH:

    ${ESYSROOT}/usr/lib/llvm/.../bin

    People doing cross will clearly notice the mistake here -- it's using
    binaries from ESYSROOT rather than BROOT! Except it's not a mistake,
    but an ugly hack. What we're doing here is:

    1. Relying on a fancy CMake behavior of locating CMake files relative to
    PATH, and

    2. Relying on the package either not caring about LLVM executables or
    the system not being able to execute the executables in ESYSROOT
    and gracefully falling back to other locations in PATH.

    So what we're really doing is implicitly telling CMake to use:

    ${ESYSROOT}/usr/lib/llvm/.../lib*/cmake
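
    To make the mechanism concrete: CMake's find_package() turns every PATH
    entry that ends in /bin or /sbin into its parent directory and then looks
    for package config files below that prefix (for instance under
    lib*/cmake). The following is a minimal shell sketch of just that prefix
    derivation, not of CMake's full search procedure. path_prefixes is a
    hypothetical helper name and the example paths are made up.

```sh
# Hypothetical helper: print the install prefixes that find_package() would
# derive from a PATH-style string passed as $1. Runs in a subshell so the
# IFS and globbing changes do not leak into the caller.
path_prefixes() (
    set -f      # do not glob-expand PATH entries while splitting
    IFS=:
    for dir in ${1}; do
        case ${dir} in
            '')
                ;;  # skip empty entries
            */bin|*/sbin)
                # strip the trailing /bin or /sbin; a bare "/bin" maps to "/"
                prefix=${dir%/*}
                printf '%s\n' "${prefix:-/}"
                ;;
            *)
                printf '%s\n' "${dir}"
                ;;
        esac
    done
)
```

    For example, path_prefixes "/sysroot/usr/lib/llvm/17/bin:/usr/bin" prints
    /sysroot/usr/lib/llvm/17 and /usr, which is why putting a bindir from
    ESYSROOT first in PATH is enough to steer CMake towards the CMake files
    in ESYSROOT.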

    Yes, it's awful. And yes, it already did backfire in the past, so I've
    ended up adding quite complex logic to prevent these path
    manipulations from overriding the toolchain set by the user. For example,
    if the user has CC=clang, which normally evaluates to clang-19, we now
    adjust CC so that it doesn't suddenly switch to clang-17 just because
    the package uses libLLVM-17. Meh.
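
    The idea behind that CC adjustment can be sketched as follows. This is
    not the eclass code, only an illustration under assumptions:
    pin_compiler_version is a hypothetical name, the version text is passed
    in as a string (in practice it would come from running the compiler with
    --version before PATH is touched), and versioned clang-N / clang++-N
    executables are assumed to exist.

```sh
# Hypothetical helper: given a compiler name ($1) and the text printed by
# "<compiler> --version" ($2), print a name pinned to the major version.
# Anything other than an unversioned clang/clang++ is printed unchanged.
pin_compiler_version() (
    cc=${1}
    version_output=${2}
    case ${cc} in
        clang|clang++) ;;
        *) printf '%s\n' "${cc}"; exit 0 ;;
    esac
    # Extract the major version from a line like "clang version 19.1.4".
    major=$(printf '%s\n' "${version_output}" |
        sed -n 's/.*clang version \([0-9][0-9]*\).*/\1/p' | head -n 1)
    # Append "-<major>" only if a version was found.
    printf '%s\n' "${cc}${major:+-${major}}"
)
```

    Used as CC=$(pin_compiler_version "${CC}" "$("${CC}" --version)") before
    PATH is modified, CC=clang would become CC=clang-19 on a system where
    clang currently resolves to version 19.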

    When working on llvm-r1, I've focused on the more immediate problem of
    horribly complex and broken package dependencies, and forgot about this.
    I've only recalled the problem during the initial rust.eclass reviews,
    since it happened to copy that incorrect logic.


    Future options
    ==============

    Some of the options that already popped up during discussions include:

    1. No longer exporting pkg_setup() at all, and expecting people to
    explicitly pass the LLVM path to the build system, e.g. something like:

    -DLLVM_CMAKE_PATH="$(get_llvm_prefix -d)"

    2. Setting specific environment variables (such as LLVM_ROOT, CLANG_ROOT
    and so on for CMake, or perhaps CMAKE_PREFIX_PATH).

    3. Creating a minimal llvm-config wrapper in ${T}, and adding it to
    ${PATH} instead of the whole LLVM tree. Note that we'd need to write
    our own since llvm-config is an executable, so we can't run the one from
    ESYSROOT, and we can't rely on BROOT having a match (or don't want to
    force a second copy of LLVM unnecessarily).
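
    As an illustration of option 3, a minimal wrapper could be generated
    roughly like this. It is a sketch under assumptions, not a proposed
    implementation: write_llvm_config_wrapper is a hypothetical name, only a
    handful of llvm-config query options are answered, the values must not
    contain single quotes, and the cmake directory is assumed to be
    <libdir>/cmake/llvm below the prefix.

```sh
# Hypothetical helper: write a tiny llvm-config replacement that answers a
# few path queries from fixed values instead of running a real binary.
#   $1: directory to create the wrapper in (e.g. "${T}/llvm-bin")
#   $2: LLVM prefix to report (e.g. "${ESYSROOT}/usr/lib/llvm/19")
#   $3: LLVM version to report
#   $4: libdir name below the prefix (default: lib)
write_llvm_config_wrapper() {
    mkdir -p "${1}" || return 1
    {
        printf '%s\n' '#!/bin/sh'
        # Bake the values into the script (assumed free of single quotes).
        printf "prefix='%s'\nversion='%s'\nlibdir='%s'\n" \
            "${2}" "${3}" "${4:-lib}"
        printf '%s\n' \
            'for arg in "$@"; do' \
            '    case ${arg} in' \
            '        --version) echo "${version}" ;;' \
            '        --prefix) echo "${prefix}" ;;' \
            '        --bindir) echo "${prefix}/bin" ;;' \
            '        --includedir) echo "${prefix}/include" ;;' \
            '        --libdir) echo "${prefix}/${libdir}" ;;' \
            '        --cmakedir) echo "${prefix}/${libdir}/cmake/llvm" ;;' \
            '        *) echo "llvm-config wrapper: unsupported option: ${arg}" >&2; exit 1 ;;' \
            '    esac' \
            'done'
    } > "${1}/llvm-config" || return 1
    chmod +x "${1}/llvm-config"
}
```

    Failing loudly on unsupported options is a deliberate choice in this
    sketch: a package that needs more than path queries (for example --libs)
    would then be noticed instead of silently getting empty output.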

    Any other ideas? How does your package select LLVM version, and which
    of these options would work best for you?


    --
    Best regards,
    Michał Górny



    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Gerion Entrup@21:1/5 to All on Tue Dec 3 17:29:39 2024
    Hi,

    this is not from a Gentoo packaging perspective but from the perspective
    of a developer who needs to compile with and against LLVM on several
    distributions. For building a package against LLVM, LLVM offers two
    possibilities: llvm-config and its CMake files. I only have experience
    with the first one. Furthermore, LLVM does not offer a consistent way to
    install its binaries (including llvm-config) in different versions, so
    most Linux distributions do that in a different way.

    Therefore, we needed to find a way to tell the build system itself
    how to manage this. As a concrete example, we are doing this for
    ARA/PARROT here [1] (see the native-<distribution>.ini files). Meson is
    made aware of the path to llvm-config with a distribution-specific config
    file. Then Meson discovers other binaries with the help of llvm-config
    [2, 3]. Overall, this works well but needs work per distribution.
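
    For readers unfamiliar with Meson native files, such a per-distribution
    file essentially contains a [binaries] section that names the
    llvm-config to use. The sketch below generates one from shell. The
    function name meson_native_file and the example path are assumptions for
    illustration; the actual files live in the repository referenced as [1].

```sh
# Hypothetical helper: print a Meson native file that points Meson at a
# specific llvm-config executable ($1). The path must not contain single
# quotes, since it is emitted as a single-quoted Meson string.
meson_native_file() {
    printf '[binaries]\n'
    printf "llvm-config = '%s'\n" "${1}"
}
```

    The output can be written to a file, e.g.
    meson_native_file /usr/lib/llvm/19/bin/llvm-config > llvm.ini, and then
    passed to Meson with --native-file llvm.ini when setting up the build
    directory.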

    Here is a Meson bug that I created to classify the solutions of
    different Linux distributions [4]. I also created a Gentoo bug for it
    some time ago (where you recommended an upstream fix) [5]. Here is the
    old LLVM bug for the same problem (I do not know if they transferred
    it to GitHub) [6].

    I have no clear solution to the problem. I wish that LLVM itself would
    create versioned symlinks to all of its binary tools that distributions
    could install in /usr/bin and build systems could use to find specific
    versions of LLVM libraries.

    Kind regards
    Gerion

    [1] https://github.com/luhsra/PARROT
    [2] https://github.com/luhsra/ara-toolchains/blob/9a3570017a8a61cf078ed5142d272ed279f8d112/meson.build#L10
    [3] https://mesonbuild.com/Dependencies.html#llvm
    [4] https://github.com/mesonbuild/meson/issues/5370
    [5] https://bugs.gentoo.org/677504
    [6] https://bugs.llvm.org/show_bug.cgi?id=41794

  • From Matt Jolly@21:1/5 to All on Wed Dec 4 02:20:01 2024
    Hi Michał,

    I use llvm-r1 in a few packages, and for the intended purpose of
    consistently selecting and depending on a specific LLVM I've had
    no major issues. Overall things work well, and the addition of
    LLVM_SLOT USE_EXPAND for -r1 has made influencing the selection
    as an end-user (and developer) so much more straightforward.

    I don't think that the Rust eclass could work properly without
    llvm-r1 given how tightly coupled dev-lang/rust is to its vendored
    LLVM version and the issues that we've encountered mixing those.

    I'm not opposed to any of the options you've presented; they seem
    reasonable and an improvement over the current situation.

    At a high level:

    - option 1: seems to put a lot of the burden on package maintainers to
    ensure that their build system is set up to support this and may
    require upstream changes.
    - option 2: seems "fine" for CMake-based projects, but I have concerns
    about how other build systems will be catered to; is this something
    you could elaborate on - how might a non-CMake build system consume
    more generic variables? Are these widely used/supported and I'm just
    unaware of it?
    - option 3: seems quite straightforward, and I can see this being quite
    flexible in terms of being called within an ebuild if necessary
    (though consuming LLVM_SLOT might get ebuilds most of the way there?).

    Overall perhaps some combination of options 2 and 3 might be the easiest
    thing for eclass consumers to use flexibly at the cost of additional
    eclass complexity. I'm interested in how others feel about this.
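
    From the consumer's side, a combination of options 2 and 3 might look
    roughly like the following. This is only a sketch: llvm_export_env is a
    hypothetical name, the variable names LLVM_ROOT and CLANG_ROOT are taken
    from option 2 as proposed, and the wrapper directory is assumed to
    already contain an llvm-config wrapper. It does not answer how non-CMake
    build systems would consume these variables.

```sh
# Hypothetical helper combining options 2 and 3.
#   $1: LLVM prefix in ESYSROOT (e.g. "${ESYSROOT}/usr/lib/llvm/19")
#   $2: directory that holds an llvm-config wrapper (e.g. "${T}/llvm-bin")
llvm_export_env() {
    # Option 2: point CMake-style consumers at the prefix.
    export LLVM_ROOT="${1}" CLANG_ROOT="${1}"
    # Option 3: expose only the wrapper directory, not the whole LLVM bindir.
    export PATH="${2}:${PATH}"
}
```

    Because only the wrapper directory is prepended to PATH, tools such as
    clang or lld from the selected slot would not shadow the user's
    toolchain in this arrangement.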

    I wonder if there's some space for catering to those packages which
    (ab)use LLVM_COMPAT as a proxy for 'Only these Clang versions are
    supported' - usually to get `llvm_gen_dep` for appropriate toolchain
    components.

    For www-client/chromium, where we force `CC=clang` because it's the only
    supported path upstream (and I simply don't have it in me to maintain
    GCC patches for three channels a week), I have been stung a few
    times re: PATH manipulation where, for example, on an ~arch system with
    multiple LLVM slots installed, and LTO enabled:

    1. `CC=clang` is set, then `llvm-r1_pkg_setup` is called.
    2. First, llvm-r1 fixes CC=clang to CC=clang-19 because that's the latest
    in PATH.
    3. llvm-r1 uses LLVM_SLOT from the profile and does PATH manipulation.
    4. Compilation proceeds normally, however at link time `lld` is called
    from the prefixed `/usr/lib/llvm/18/bin` resulting in an error like:
    `... (Producer: 'LLVM19.1.4' Reader: 'LLVM 18.1.8')`

    I suspect that this may come up on other systems where `CC=clang` is set
    via make.conf and LTO is enabled (which is a good argument for avoiding
    PATH manipulation by default).
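
    A mismatch like the one in step 4 could in principle be detected before
    the build by comparing the major versions reported by the compiler and
    the linker. The sketch below works on version strings passed in as
    arguments; the function names are hypothetical, and the strings are
    assumed to start with non-digit text followed by an X.Y.Z version, as in
    the error message quoted above.

```sh
# Hypothetical helper: print the major version from a string such as
# "clang version 19.1.4" or "LLVM 18.1.8"; prints nothing if none is found.
llvm_major() {
    printf '%s\n' "${1}" |
        sed -n 's/^[^0-9]*\([0-9][0-9]*\)\.[0-9][0-9]*\.[0-9][0-9]*.*/\1/p' |
        head -n 1
}

# Succeeds only if both strings carry a version and the majors are equal.
same_llvm_major() (
    a=$(llvm_major "${1}")
    b=$(llvm_major "${2}")
    [ -n "${a}" ] && [ "${a}" = "${b}" ]
)
```

    With the producer and reader strings from the error above,
    same_llvm_major "LLVM19.1.4" "LLVM 18.1.8" fails, which is the condition
    an ebuild could turn into an early, readable error.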

    I've worked around this in Chromium where we now call
    `llvm-r1_pkg_setup` _then_ set CC and friends to include `LLVM_SLOT`
    to enable consistent selection of tooling via `llvm_slot_x` USE. I see
    some value in providing eclass consumers with a mechanism to select
    appropriate Clang toolchain components consistently, be it an additional
    variable or some manually-called `clang_setup` function that follows
    much of the existing LLVM path prefix logic.
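
    The workaround described above amounts to something like the following
    sketch. It is not the Chromium ebuild code: pin_clang_to_slot is a
    hypothetical name, only CC and CXX are handled, and versioned clang-N and
    clang++-N executables are assumed to be available.

```sh
# Hypothetical helper: select the compiler matching a given LLVM slot ($1),
# so that compiler and LLVM tooling come from the same version.
pin_clang_to_slot() {
    export CC="clang-${1}" CXX="clang++-${1}"
}

# Intended use inside an ebuild (shown as a comment because it needs the
# eclass environment):
#   pkg_setup() {
#       llvm-r1_pkg_setup
#       pin_clang_to_slot "${LLVM_SLOT}"
#   }
```

    Calling it after llvm-r1_pkg_setup, rather than before, is the point of
    the ordering: the compiler follows the selected LLVM_SLOT instead of
    whatever clang happened to be first in PATH.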

    To play devil's advocate, I admit that Chromium (and maybe Firefox) are
    probably the only packages to have a _need_ to force a Clang toolchain
    (due to overheads and the need to get security updates for web browsers
    to users quickly), and both can continue to do this outside the eclass -
    it's the "LLVM eclass" not "Clang eclass" after all.

    I don't really have strong opinions for packages that I maintain; I
    actually need to go prod an upstream because they still only support
    LLVM >14, so thanks for the reminder! I'm interested in seeing how
    others use LLVM in packages and their opinions.

    Hopefully some of this was useful!

    Cheers,

    Matt


  • From James Le Cuirot@21:1/5 to All on Wed Dec 4 14:20:01 2024
    I did come up with something similar to #3 back in 2019, but you were so
    dead against it that I threw that work away. It wrapped around BROOT's
    llvm-config, which you don't want to do, but that didn't seem to be the
    part you had a problem with. Doing it that way would be easier, and maybe
    not such a big deal since you need BROOT to have a matching slot to build
    LLVM in the first place. Writing something based around pkg-config would
    be nicer though. I'd be happy with either. I might even be able to help
    out. :)
