Hello,
Given that the number of LLVM packages is growing, and probably will
grow again (I'm introducing "offload" right now, expect at least MLIR
soon, there are open requests for flang, polly...), I'd like to propose creating dedicated categories for these packages and moving them there.
If nothing else, this will help with applying flags and keywords
consistently to these packages (`/etc/portage/package.*` accept
wildcards).
My initial idea would be to use two categories: one for the toolchain packages, another for runtimes, e.g.:
llvm-core/
  clang
  clang-common
  clang-runtime
  clang-toolchain-symlinks
  lld
  lld-toolchain-symlinks
  lldb
  llvm
  llvm-common
  llvm-toolchain-symlinks
  llvmgold

llvm-runtimes/
  compiler-rt
  compiler-rt-sanitizers
  libclc
  libcxx
  libcxxabi
  libomp (-> openmp?)
  llvm-offload (-> offload)
  llvm-unwind (-> unwind?)

clang-python, lit and llvm-ocaml would remain in their language
categories.
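
To illustrate the wildcard point, here is a minimal sketch of what per-category entries could look like. It assumes the two proposed category names are adopted as written; the file names, the `~amd64` keyword and the `debug` flag are only example choices, not something the proposal specifies.

```
# /etc/portage/package.accept_keywords/llvm  (file name is arbitrary)
# Accept testing keywords for everything in both proposed categories.
llvm-core/* ~amd64
llvm-runtimes/* ~amd64

# /etc/portage/package.use/llvm  (file name is arbitrary)
# Apply one USE flag to every package in a category.
llvm-core/* debug
```

With the packages spread over sys-devel, sys-libs and other categories, the same effect needs one line per package.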
WDYT?
Michał Górny <mgorny@gentoo.org> writes:
Hello,
Given that the number of LLVM packages is growing, and probably will
grow again (I'm introducing "offload" right now, expect at least MLIR
soon, there are open requests for flang, polly...), I'd like to propose
creating dedicated categories for these packages and moving them there.
[...]
WDYT?
I fear this sort of assumes we won't switch to monobuild any time soon.
I keep thinking [0] about how sustainable our current setup is:
* Fedora moved away from it for >=18 [1].
* As we saw with offload, it broke a few times in just a week. So it's
only Gentoo who cares about this configuration AFAIK.
* It's not working great if we're already not able to easily package
mlir, flang, or polly.
Violet attempted working on a merge before [2].
The trade-offs I see are:
* Increased disk space requirement for building LLVM
I think that's a legitimate concern although perhaps not that big of a
deal given it is, after all, a whole toolchain. GCC takes quite a bit
to build too.
* Build time
Build time for mono LLVM would increase as we're building more
components at least for some users.
But the time added by
building more components (see below) should be balanced out by ccache if
doing development and also, importantly, not needing to force on all
targets anymore (they keep growing).
The cumulative time should be the same (although maybe a bit less
given the targets change) given that most people need the
same set of components because of mesa, firefox, or other things which
need libclang.
* Moving to an upstream supported-configuration
We may have a bit of pain in moving to it and getting used to it, but
we're no longer swimming upstream and being the only people caring
about our choice of build configuration (and regularly having to
explain and justify it to upstream).
Upstream also recommend doing a bootstrap build. We don't and can't do
that right now.
Folks upstream state at every opportunity that they don't care about
standalone builds,
e.g. https://discourse.llvm.org/t/rfc-do-something-with-the-subproject-tarballs-in-the-release-page/75024/2.
* Better support for LLVM as a system toolchain
The current setup doesn't work well for people using LLVM as a system
toolchain (because some of the components *must* be upgraded together),
it doesn't work well for people who want to use mlir/flang/polly, and it
doesn't work well for users on constrained hardware because we have to
force on all targets. It also prohibits more optimisation, PGO, and
bootstrapping it to test reliability.
(This is why I'm not too sympathetic to claims that the monobuild is
mostly for binary distributions, because we're actually *more*
vulnerable to issues as a result of it being split when building from
source if using the LLVM toolchain.)
* Maintaining older LLVMs
It's easier for older LLVMs to be either maintained by somebody else
(for e.g. Rust's purposes) or in an overlay if it's a single package
rather than many.
This is also true for e.g. testing old snapshots or keeping binary
packages to downgrade to once they got cleaned up.
[...]
I'm not sure if I'm sold on *two*. What happens for stuff like mlir
where it's not a runtime but it's arguably more of one than core?
It just doesn't feel like the division works great. Or maybe it's just because I feel like llvm-core will keep growing and llvm-runtimes won't.
On Sun, 2024-12-08 at 04:53 +0000, Sam James wrote:
I fear this sort of assumes we won't switch to monobuild any time soon.
I don't see one precluding the other. Categories are cheap. Package
moves aren't necessarily, but switching to monorepo will be a complete pain whether one more package move is involved or not.
I keep thinking [0] about how sustainable our current setup is:
* Fedora moved away from it for >=18 [1].
* As we saw with offload, it broke a few times in just a week. So it's
only Gentoo who cares about this configuration AFAIK.
It broke once, and only because the pull request merged preceded my
changes, and the author dealt with merge conflicts wrong.
That said, it's not like I didn't fix the monorepo build as well this
week, because it was broken from day one.
We're on our own either way.
* Build time
Build time for mono LLVM would increase as we're building more
components at least for some users.
But the time added by
building more components (see below) should be balanced out by ccache if
doing development and also, importantly, not needing to force on all
targets anymore (they keep growing).
I don't see how we would avoid forcing targets if *external* projects
(wasn't the bug about Rust originally?) can still be broken if you
change targets.
The cumulative time should be the same (although maybe a bit less
given the targets change) given that most people need the
same set of components because of mesa, firefox, or other things which
need libclang.
So you spend hours building LLVM and Clang. Then you spend hours
building everything again because one more package needs LLD. Then
more hours because you've decided to try LLDB.
I've been rebuilding three LLVM versions recently because of cpp-httplib changing subslot multiple times. With the proposed change, I'd
be rebuilding everything instead.
In fact, I've already started considering splitting llvm-debuginfod.
At the moment, I fear us choosing the non-recommended path gives us very
little, and causes a bunch of problems in return.
If you consider being able to have a really working LLVM package "very little", so be it.
If you choose to go for monorepo, I'm stepping down, because
I definitely won't be able to deal with this shit without being able to
split it into smaller parts.
I don't like the idea that any minor patch (think of compiler-rt that regularly needs to be updated for newer glibc) will require spending
hours rebuilding everything. In multiple LLVM versions. For every
single person, including all the people who don't build compiler-rt
at all.
I don't like the idea of not being able to run parts of the test suite
without resorting to ugly hacks.
I don't like the idea of spending hours building everything before I'm
even able to start running tests, just to learn that LLVM is broken
and there's no point in even starting to build the rest.
Or having the test suite fail after a few hours
because of some minor configuration issue (like the one we'd had with
libcxx, and I've hit that one three times), and having to start
everything over again.
And ccache is not a solution. It's a cheap hack, and a costly one. Maintaining a cache for this thing requires tons of wasted disk space.
And unless you go out of your way to reconfigure it, building 2-3 LLVM versions will normally push all previous objects out of the cache, so it
won't work for most of the people at all. Provided they go out of their
way to configure it in the first place.
In the end, LLVM is a project primarily maintained by people working for shitty corporations that only care about being able to build their
shitty browser written in bad C++. It sucks we ended up having to
maintain it, but that's not our choice.
I don't like the idea of spending hours building everything before I'm
even able to start running tests, just to learn that LLVM is broken
and there's no point in even starting to build the rest.
I don't follow this bit -- you need the new LLVM merged before you can
build Clang's tests, right? And if any of it fails to build, it's not
like we can commit the release or snapshot?
What am I missing on this bit?
On 12/8/24 4:45 PM, Sam James wrote:
I don't like the idea of spending hours building everything before I'm even able to start running tests, just to learn that LLVM is broken
and there's no point in even starting to build the rest.
I don't follow this bit -- you need the new LLVM merged before you can build Clang's tests, right? And if any of it fails to build, it's not
like we can commit the release or snapshot?
What am I missing on this bit?
I think the point here is that currently, one can build sys-devel/llvm
with tests enabled, and if it fails, there's no point in also building sys-devel/clang. But with a monorepo build, you'd have to build llvm,
clang (and various others) first, and then launch tests for llvm and
clang together, only to get a test failure in the llvm tests that
indicates everything else is broken too. Depending on cmake test
ordering, you may also run half the clang tests before hitting the llvm failures, even.
In theory this could be solved by building monorepo-llvm with
FEATURES=test USE="-clang" to start running tests, and then if that
passes, rebuild llvm again but this time with clang etc. enabled. Not
sure this is actually solving the stated objection...
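
The two-pass idea can be sketched as a small dry-run script. This is only an illustration under stated assumptions: a monobuild package with a `clang` USE flag is hypothetical, the atom defaults to `sys-devel/llvm` as named above, and keeping `FEATURES=test` on the second pass is a guess at the intent. The script prints the commands rather than running them.

```shell
#!/bin/sh
# Hypothetical: a single monobuild package that exposes a "clang" USE flag.
# Override PKG if the package ends up under another category.
PKG="${PKG:-sys-devel/llvm}"

# Print the two emerge invocations in order; nothing is executed here.
two_pass_plan() {
    # Pass 1: LLVM alone with its test suite, so a core failure shows up
    # before any time is spent building clang.
    printf 'FEATURES=test USE="-clang" emerge --oneshot %s\n' "$PKG"
    # Pass 2: only worth running if pass 1 succeeded. Tests are kept on
    # (an assumption) so that clang's suite runs as well.
    printf 'FEATURES=test USE="clang" emerge --oneshot %s\n' "$PKG"
}

two_pass_plan
```

The cost noted above still applies: the second pass rebuilds LLVM itself, so the early failure signal is paid for with a duplicated build.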