Hi folks,
I let my access to ppc64el expire, since CUDA (>> 12.4) no longer supports
ppc64el for the trixie+1 cycle. However, the recent CUDA 12.2 -> 12.4
transition requires a rebuild of pytorch-cuda, and I have already lost
that access.
The help I need is pretty simple -- manually rebuild pytorch-cuda and
upload the resulting binaries. Note that the build process
involves two major non-free dependencies:
(1) nvidia-cuda-toolkit: from the non-free section
(2) nvidia-cudnn: this is my installer script, which downloads the binary
blobs during postinst.
They are the direct reason why XS-Autobuild and porterbox do not work.
Steps
=====
1. Get the source of pytorch-cuda and make sure the version is 2.6.0+dfsg-7

    apt source pytorch-cuda

2. Do the manual binNMU with sbuild

    sbuild --no-clean -c unstable-ppc64el-sbuild \
      --build=ppc64el --arch=ppc64el \
      --make-binNMU="Rebuild against CUDA 12.4." \
      -m "your name <your email>" \
      pytorch-cuda_2.6.0+dfsg-7.dsc -d sid

3. Sign the built packages and upload

    debsign pytorch-cuda_2.6.0+dfsg-7+b1_ppc64el.changes
    dput ftp-master pytorch-cuda_2.6.0+dfsg-7+b1_ppc64el.changes

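
The version check from step 1 can be scripted before starting the long
build. This is only a rough sketch: dsc_version is a hypothetical helper
(not part of devscripts or dpkg), and it assumes the .dsc fetched by
"apt source" sits in the current directory.

```shell
#!/bin/sh
# dsc_version is a hypothetical helper, not an existing Debian tool.
# Usage: dsc_version FILE.dsc
# Prints the first "Version:" field of the .dsc; "Standards-Version:"
# does not match because the pattern is anchored at line start.
dsc_version() {
    awk '/^Version:/ { print $2; exit }' "$1"
}

want="2.6.0+dfsg-7"
dsc="pytorch-cuda_${want}.dsc"

# Only check when the file is actually present in the current directory.
if [ -f "$dsc" ]; then
    got=$(dsc_version "$dsc")
    if [ "$got" = "$want" ]; then
        echo "OK: $dsc is version $got"
    else
        echo "MISMATCH: wanted $want, found $got" >&2
    fi
fi
```

If the check reports a mismatch, the archive has probably moved on to a
newer source version and the file names in steps 2 and 3 need adjusting.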
Parallelism and RAM
===================
On amd64/ppc64el, building pytorch-cuda needs 4GB of RAM per job to avoid
OOM during parallel linking. On arm64 it requires 8GB per job. It is
OK to allocate a large swap, as it is mostly used to absorb the
RAM spikes during parallel linker invocations.
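
To turn the per-job figures above into a job count, something like the
following can help. It is a minimal sketch: max_jobs is a hypothetical
helper, and counting RAM plus swap together is my reading of the note
about swap, not a measured rule.

```shell
#!/bin/sh
# max_jobs is a hypothetical helper, not part of sbuild or dpkg.
# Usage: max_jobs TOTAL_GB PER_JOB_GB NCPU
# Prints min(NCPU, TOTAL_GB / PER_JOB_GB), but never less than 1.
max_jobs() {
    mj_jobs=$(( $1 / $2 ))
    if [ "$mj_jobs" -gt "$3" ]; then mj_jobs=$3; fi
    if [ "$mj_jobs" -lt 1 ]; then mj_jobs=1; fi
    echo "$mj_jobs"
}

# RAM plus swap in GiB; /proc/meminfo reports both in kB.
if [ -r /proc/meminfo ]; then
    total_gb=$(awk '/^(MemTotal|SwapTotal):/ { kb += $2 }
                    END { print int(kb / 1048576) }' /proc/meminfo)
    per_job_gb=4    # 4 on amd64/ppc64el, 8 on arm64 (figures from this mail)
    ncpu=$(nproc 2>/dev/null || echo 1)
    echo "suggested: parallel=$(max_jobs "$total_gb" "$per_job_gb" "$ncpu")"
fi
```

The result could then be passed as DEB_BUILD_OPTIONS=parallel=N, assuming
the package's debian/rules honors that option; please check before
relying on it.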
I have already done the amd64 rebuild:
https://buildd.debian.org/status/package.php?p=pytorch%2dcuda
My arm64 rebuild is on the way, but it will take roughly one day
on my Raspberry Pi 5. If you have a stronger arm64 device,
feel free to help with the rebuild and upload before I do.
Note, arm64 needs roughly 8GB RAM/swap per job to avoid OOM.