• Bug#1091394: nproc: add new option to reduce emitted processors by syst

    From Michael Stone@21:1/5 to Helmut Grohne on Thu Dec 26 21:10:01 2024
    XPost: linux.debian.devel

    On Thu, Dec 26, 2024 at 09:01:30AM +0100, Helmut Grohne wrote:
    > What other place would be suitable for including this functionality?

    As I suggested: you need two tools or one new tool because what you're
    looking for is the min of ncpus and (available_mem / process_size). The
    result of that calculation is not the "number of cpus", it is the number
    of processes you want to run.
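
    A minimal sketch of that min() calculation, assuming Linux's
    /proc/meminfo and an invented figure of 2 GiB per compiler process
    (both the figure and the use of MemAvailable are assumptions, not part
    of the proposal):

        # Sketch only: MemAvailable is Linux-specific and the 2 GiB per
        # job is an assumed, not measured, per-process footprint.
        ncpus=$(nproc)
        avail_kib=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
        mem_jobs=$(( avail_kib / (2 * 1024 * 1024) ))   # 2 GiB per job, in KiB
        jobs=$(( ncpus < mem_jobs ? ncpus : mem_jobs ))
        [ "$jobs" -lt 1 ] && jobs=1
        make -j"$jobs"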

    > have a pattern of packages coming up with code chewing /proc/meminfo
    > using various means (refer to my initial mail referenced from the bug
    > submission) and reducing parallelism based on it

    Yes, I think that's basically what you need to do.

    > Do you see the computation of allocatable RAM as something we can
    > accommodate in coreutils? Michael suggested adding "nmem" between the
    > lines. Did you mean that in an ironic way or are you open to adding
    > such a tool? It would solve a quite platform-dependent part of the
    > problem and significantly reduce the boilerplate in real use cases.

    Here's the problem: the definition of "available memory" is very vague.
    `free -hwv` output from a random machine:

              total     used     free   shared  buffers    cache  available
    Mem:       30Gi    6.7Gi    2.4Gi    560Mi    594Mi     21Gi       23Gi
    Swap:      11Gi    2.5Mi     11Gi
    Comm:      27Gi     22Gi    4.3Gi

    Is the amount of available memory 2.4Gi, 23Gi, maybe 23+11Gi? Or 4.3Gi?
    IMO, there is no good answer to that question. It's going to vary based
    on how/whether virtual memory is implemented, the purpose of the system
    (e.g., is it dedicated to building this one thing or does it have other
    roles that shouldn't be impacted), the particulars of the build process
    (is reducing disk cache better or worse than reducing ||ism?), etc.--and
    we haven't even gotten to cgroups or other esoteric factors yet. Long
    before asking where nmem should go, you'd need to figure out how nmem
    would work. You're implicitly looking for this tool to be portable (or
    else, what's wrong with using /proc/meminfo directly?) but I don't have
    any idea how that would work. You'd need to somehow get people to
    define policies; what would that look like? I'd suggest starting by
    writing a proof of concept and shopping it around to get buy-in and/or
    see if it's useful. The answers you get from someone doing HPC on linux
    may be different from the administrator of an openbsd server or a
    developer on an OS/X laptop or windows desktop. I'm personally
    skeptical that this is a problem that can be solved, but maybe you'll
    be able to demonstrate otherwise. At any rate, looking for a project to
    host & distribute the tool would seem to be just about the last step.
    Actually naming the thing won't be easy either, but showing how it
    works is probably a better place to start.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Julien Plissonneau Duquène@21:1/5 to All on Fri Dec 27 10:20:01 2024
    XPost: linux.debian.devel

    Hi,

    On 2024-12-26 20:57, Michael Stone wrote:

    > As I suggested: you need two tools or one new tool because what you're
    > looking for is the min of ncpus and (available_mem / process_size). The
    > result of that calculation is not the "number of cpus", it is the
    > number of processes you want to run.

    This is definitely true. "nproc" could potentially be repurposed to
    mean "number of processes" though.

    > Here's the problem: the definition of "available memory" is very vague.
    > `free -hwv` output from a random machine:
    >
    >           total     used     free   shared  buffers    cache  available
    > Mem:       30Gi    6.7Gi    2.4Gi    560Mi    594Mi     21Gi       23Gi
    > Swap:      11Gi    2.5Mi     11Gi
    > Comm:      27Gi     22Gi    4.3Gi
    >
    > Is the amount of available memory 2.4Gi, 23Gi, maybe 23+11Gi? Or 4.3Gi?
    > IMO, there is no good answer to that question.

    I would rather argue that there is no perfect answer to that question,
    but that the 23GiB in the "Available" column are good enough for most
    use cases including building stuff, IF (and only if) you take into
    account that you can't have all of it committed by processes, as you
    still need a decent amount of cache and buffers (how much? very good
    question, thank you) for that build to run smoothly and efficiently.
    Swap should be ignored for all practical purposes here.
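
    As a rough sketch of that idea (the 4 GiB cache/buffer reserve and the
    2 GiB per worker below are invented placeholders, not recommendations):

        # Keep some headroom for page cache, then divide the rest among workers.
        avail_kib=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
        reserve_kib=$((4 * 1024 * 1024))    # headroom kept free for cache/buffers
        per_job_kib=$((2 * 1024 * 1024))    # assumed RAM cost of one worker
        jobs=$(( (avail_kib - reserve_kib) / per_job_kib ))
        [ "$jobs" -lt 1 ] && jobs=1
        echo "$jobs"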

    > (or else, what's wrong with using /proc/meminfo directly?)

    I haven't looked at how packages currently try to compute potential
    parallelism using data from /proc/meminfo, but my own experience with
    Java stuff and otherwise perfectly competent, highly qualified
    engineers getting available RAM computation wrong makes me not too
    optimistic about the overall accuracy of these guesses.

    E.g., a few hours ago:

    > I fear your rebuild is ooming workers (...) it seems that some package
    > is reducing its parallelism to two c++ compilers and that still
    > exceeds 20G

    Providing a simple tool that standardizes the calculation and
    documenting examples and guidelines is certainly going to help here. It
    will also move the logic to collect, parse and compute the result into
    a single place, reducing logic duplication and maintenance burden
    across packages.

    > You'd need to somehow get people to define policies; what would that
    > look like?

    I would suggest making it possible to input the overall marginal RAM
    requirements per parallelized process. That is, the amount of
    additional "available RAM" needed for every additional process. As that
    value is very probably going to be larger for the first processes, and
    as this fact matters more in constrained environments (e.g. containers,
    busy CI runners etc.), making it possible to sort of define a curve
    (e.g. 8 GiB - 5 GiB - 2 GiB - 2 GiB ... => 7 workers with 23 GiB
    available RAM) will allow a closer match to the constraints of these
    environments.
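
    A small sketch of how such a curve could be evaluated (the 8/5/2 GiB
    figures are just the example above, and the awk-based implementation is
    only illustrative):

        # Count how many workers fit when the first worker costs 8 GiB, the
        # second 5 GiB and every further one 2 GiB.
        avail_gib=23
        awk -v avail="$avail_gib" 'BEGIN {
            n = split("8 5 2", cost, " ")   # last entry repeats for further workers
            used = 0; workers = 0
            while (1) {
                c = (workers + 1 <= n) ? cost[workers + 1] : cost[n]
                if (used + c > avail) break
                used += c; workers++
            }
            print workers                   # prints 7 for 23 GiB available
        }'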

    In addition, providing an option to limit the computed result to the
    number of actual available CPU cores (not vcpus/threads), and another
    one to place an arbitrary upper limit on the number of processes beyond
    which no gains are expected, would be nice.
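
    One possible way to obtain those two limits on Linux (using util-linux
    lscpu; the cap of 16 below is an arbitrary example value):

        # Physical cores = unique (core, socket) pairs reported by lscpu.
        cores=$(lscpu -p=CORE,SOCKET | grep -v '^#' | sort -u | wc -l)
        cap=16                                  # arbitrary ceiling on workers
        jobs=$(( cores < cap ? cores : cap ))
        echo "$jobs"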

    Cheers,

    --
    Julien Plissonneau Duquène

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Helmut Grohne@21:1/5 to Michael Stone on Fri Dec 27 13:00:01 2024
    XPost: linux.debian.devel

    Control: tags -1 + wontfix
    Control: close -1

    Hi Michael,

    On Thu, Dec 26, 2024 at 02:57:12PM -0500, Michael Stone wrote:
    > On Thu, Dec 26, 2024 at 09:01:30AM +0100, Helmut Grohne wrote:
    > > What other place would be suitable for including this functionality?
    >
    > As I suggested: you need two tools or one new tool because what you're
    > looking for is the min of ncpus and (available_mem / process_size). The
    > result of that calculation is not the "number of cpus", it is the
    > number of processes you want to run.

    This reinforces the question asked in my previous mail about what use
    case nproc solves. There I have been arguing that changing
    circumstances render a significant fraction of what I see as its use
    cases broken.

    > Here's the problem: the definition of "available memory" is very vague.
    > `free -hwv` output from a random machine:

    There is no question about that. You are looking at it from a different
    angle than I am though. Perfection is not the goal here. The goal is
    guessing better than we currently do. There are two kinds of errors we
    may make here.

    We may guess a higher concurrency than actually works. This is the
    status quo and it causes failing builds. As a result we have been
    limiting the number of processors available to build machines and thus
    reducing efficiency. So whatever we do here can hardly be worse than
    the status quo.

    We may guess a lower concurrency than actually works. In this case, we
    slow down builds. To a certain extent, this will happen. In return, we
    get fewer failing builds and a higher concurrency available to the
    majority of builds that do not require huge amounts of RAM. We are not
    optimizing build latency here, but build throughput as well as reducing
    spurious build failures. Accepting this error is part of the proposed
    strategy.

    > IMO, there is no good answer to that question. It's going to vary
    > based on how/whether virtual memory is implemented, the purpose of the
    > system (e.g., is it dedicated to building this one thing or does it
    > have other roles that shouldn't be impacted), the particulars of the
    > build process (is reducing disk cache better or worse than reducing
    > ||ism?), etc.--and we haven't even gotten to cgroups or other esoteric
    > factors yet. Long before asking where nmem should go, you'd need to
    > figure out how nmem would work. You're

    This is exactly why I supplied a patch, right? I am past figuring out
    how it should work, as I have now translated the proposed
    implementation into a third programming language. As far as I can see,
    it works for the typical build machine that does little beyond
    compiling software.

    > implicitly looking for this tool to be portable (or else, what's wrong
    > with using /proc/meminfo directly?) but I don't have any idea how that
    > would work. You'd need to somehow get people to define policies; what
    > would that look like? I'd suggest starting by writing a proof of
    > concept and shopping it around to get buy-in and/or see if it's
    > useful. The answers you get from someone doing HPC on linux may be
    > different from the administrator of an openbsd server or a developer
    > on an OS/X laptop or windows desktop. I'm personally skeptical that
    > this is a problem that can be solved, but maybe you'll be able to
    > demonstrate otherwise. At any rate, looking for a project to host &
    > distribute the tool would seem to be just about the last step.
    > Actually naming the thing won't be easy either, but showing how it
    > works is probably a better place to start.

    Your resistance is constructive. Both of us agree that the proposed
    heuristic falls short in a number of situations and will need
    improvements to cover more situations. Iterating on this via repeated
    coreutils updates is likely a disservice to users as it causes long
    iteration times and renders coreutils (or part of it)
    unreliable/unstable. As a result, you suggest self-hosting it at least
    for a while. I was initially disregarding this option as it looked like
    such a simple feature, but your reasoning convinces me more and more
    that it is not as simple as originally anticipated. Doing it as a new
    upstream project actually has some merit as the number of expected
    users is fairly low.

    Thanks for engaging in this discussion and clarifying your views, as
    that moved the discussion forward. You made me agree that coreutils is
    not a good place (at least not now). Especially your and Guillem's
    earlier feedback significantly changed the way I look at this.

    Helmut

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Otto Kekäläinen@21:1/5 to All on Fri Dec 27 18:00:01 2024
    XPost: linux.debian.devel

    Hi,

    > Before we move on, please allow me to ask what problem the nproc tool
    > is supposed to solve. Of course it tells you the number of processors,
    > but that is not a use case on its own. If you search for uses,
    > "make -j$(nproc)" (with varying build systems) is the immediate hit.
    > This use is now broken by hardware trends (see below). It can also be
    > used to partition a system and to compute partition sizes. This use
    > case continues to work. Are there other use cases? Unless I am missing
    > something, it seems like nproc no longer solves its most common use
    > case (and this bug is asking to fix it).
    >
    > Thus far, I am effectively turned down by:
    > * coreutils
    > * debhelper
    > * dpkg
    >
    > What other place would be suitable for including this functionality? We

    Thanks Helmut for trying to solve this. I rely on nproc in all my
    packages to instruct Make on how many parallel build processes to run,
    and I have repeatedly run into the issue that on large hardware with
    16+ cores available, builds tend to crash on memory issues unless
    there is also at least 2 GB of physical RAM on the system per core. The
    man page of nproc says its purpose is to "print the number of
    processing units available", so I too would have assumed that adding a
    memory cap would fit nproc best. The nproc man page also already has
    the concept of an "offline CPU" for other reasons. Creating a new tool
    just for this seems like a lot of overhead.
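
    A minimal sketch of that rule of thumb (assuming Linux's /proc/meminfo;
    the 2 GB per core is only the figure mentioned above, not a measured
    value):

        # Cap the job count at total physical RAM divided by 2 GiB per core.
        total_kib=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
        mem_jobs=$(( total_kib / (2 * 1024 * 1024) ))
        ncpus=$(nproc)
        jobs=$(( ncpus < mem_jobs ? ncpus : mem_jobs ))
        [ "$jobs" -lt 1 ] && jobs=1
        make -j"$jobs"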

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)