Forum: Too Lazy BBS

RFH: manually rebuild pytorch-cuda on ppc64el and upload binaries

From M. Zhou@21:1/5 to All on Sun Jun 15 03:00:01 2025

Hi folks,

I let my access to ppc64el expire because CUDA (>> 12.4) no longer supports
ppc64el in the trixie+1 cycle. However, the recent CUDA 12.2 -> 12.4 transition requires a rebuild of pytorch-cuda, and I have already lost access.

The help I need is pretty simple: manually rebuild pytorch-cuda and
upload the resulting binaries. Note that the build process
involves two major non-free dependencies:

(1) nvidia-cuda-toolkit: from non-free section
(2) nvidia-cudnn: my installer script, which downloads the binary
blobs during postinst.

They are the direct reason why XS-Autobuild and porterbox do not work.

Steps
=====

1. get the source of pytorch-cuda and make sure the version is 2.6.0+dfsg-7

apt source pytorch-cuda

2. do the manual binNMU with sbuild

sbuild --no-clean -c unstable-ppc64el-sbuild \
--build=ppc64el --arch=ppc64el \
--make-binNMU="Rebuild against CUDA 12.4." \
-m "your name <your email>" \
pytorch-cuda_2.6.0+dfsg-7.dsc -d sid

3. sign the built packages and upload

debsign pytorch-cuda_2.6.0+dfsg-7+b1_ppc64el.changes
dput ftp-master pytorch-cuda_2.6.0+dfsg-7+b1_ppc64el.changes
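
Step 1 asks you to make sure the source version is 2.6.0+dfsg-7. A minimal sketch of how that check could be scripted before starting a long build is below. The helper names dsc_version and check_dsc are hypothetical, not part of devscripts or sbuild; they simply read the Version field of the .dsc that apt source downloads.

```shell
# dsc_version FILE -> print the Version field of a .dsc file.
# "Standards-Version:" is not matched because the pattern is anchored at line start.
dsc_version() {
    awk '/^Version:/ { print $2; exit }' "$1"
}

# check_dsc FILE EXPECTED -> succeed only if the .dsc carries the expected version.
check_dsc() {
    found=$(dsc_version "$1")
    if [ "$found" = "$2" ]; then
        echo "ok: $found"
    else
        echo "mismatch: found '$found', expected '$2'" >&2
        return 1
    fi
}
```

For example, running check_dsc pytorch-cuda_2.6.0+dfsg-7.dsc 2.6.0+dfsg-7 before step 2 would stop early if the archive has moved to a newer upload.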

Parallelism and RAM
===================

On amd64 and ppc64el, building pytorch-cuda needs 4GB per job to avoid
OOM during parallel linking. On arm64 it requires 8GB per job. It is
OK to allocate a large swap, since swap is mostly used to absorb the
RAM spikes during parallel linker invocations.
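
To turn the per-job figures above into a job count, here is a rough sketch. It assumes a Linux host with /proc/meminfo and counts RAM plus swap together. The helper names are hypothetical, and whether the package build honors DEB_BUILD_OPTIONS=parallel=N is an assumption you should verify against debian/rules.

```shell
# jobs_for_mem TOTAL_GB PER_JOB_GB -> print how many jobs fit, never less than 1.
jobs_for_mem() {
    jobs=$(( $1 / $2 ))
    if [ "$jobs" -lt 1 ]; then
        jobs=1
    fi
    echo "$jobs"
}

# mem_plus_swap_gb -> print MemTotal + SwapTotal from /proc/meminfo in whole GB.
mem_plus_swap_gb() {
    awk '/^(MemTotal|SwapTotal):/ { kb += $2 } END { print int(kb / 1048576) }' /proc/meminfo
}

# Per-job budget from the post: 8GB on arm64, 4GB on amd64/ppc64el.
case "$(uname -m)" in
    aarch64) per_job=8 ;;
    *)       per_job=4 ;;
esac

if [ -r /proc/meminfo ]; then
    total=$(mem_plus_swap_gb)
    echo "suggested: DEB_BUILD_OPTIONS=parallel=$(jobs_for_mem "$total" "$per_job")"
fi
```

The result is only an upper bound from the memory side; you may still want to cap it at the number of CPU cores.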

I have already done the amd64 rebuild: https://buildd.debian.org/status/package.php?p=pytorch%2dcuda

My arm64 rebuild is on the way, but it will take roughly one day
on my Raspberry Pi 5. If you have a more powerful arm64 machine,
feel free to do the rebuild and upload before I do.
Note that arm64 needs roughly 8GB of RAM/swap per job to avoid OOM.

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From M. Zhou@21:1/5 to M. Zhou on Wed Jun 18 05:30:01 2025

This is no longer needed. The ppc64el nvidia driver is gone from testing,
which means the dependency libcuda1 can no longer be satisfied.

On Sat, 2025-06-14 at 20:51 -0400, M. Zhou wrote:

The help I need is pretty simple -- manually rebuild pytorch-cuda and
upload the resulting binaries. Note the building process
involves two major non-free dependencies:

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)
