• Filesystem snapshotting in dpkg (was Re: A 2025 NewYear present: make d

    From Guillem Jover@21:1/5 to Jonathan Kamens on Sat Dec 28 15:30:01 2024
    Hi!

    On Fri, 2024-12-27 at 12:46:02 -0500, Jonathan Kamens wrote:
    On 12/27/24 7:34 AM, Geert Stappers wrote:
    Yeah, it feels wrong that dpkg gets file system code, gets code for one particular file system.

    I disagree. If there is a significant optimization that dpkg can implement that is only available for btrfs, and if enough people use btrfs that there would be significant communal benefit in that optimization being
    implemented, and if it is easiest to implement the optimization within dpkg as seems to be the case here (indeed, it may /only/ be possible to implement the optimization within dpkg), then it is perfectly reasonable to implement the optimization in dpkg. Dpkg is a low-level OS-level utility, it is entirely reasonable for it to have OS-level optimizations implemented within it.

    Port-specific or hardware-specific optimizations might make sense in
    dpkg, but that depends on the type, semantics, testability and
    intrusiveness, among other things.

    In this case (filesystem snapshotting), I do think dpkg is (currently at
    least) really the wrong place, for at least the following reasons:

    * No overall transaction visibility:

    Frontends, such as apt, split installation (and configuration) into
    multiple dpkg invocations. For installation at least, AFAIR, that
    means one invocation per Essential:yes package or pre-depended
    package (group?). So dpkg does not currently have visibility of the
    whole transaction going on.

    * No filesystem layout awareness:

    dpkg has no idea (and should not need to have it) of the current
    filesystem layout, and would need to start scanning all current
    mount points, discern on what filesystems it can use snapshotting
    (as in where the feature is present), and then enable that only on
    the ones where the .deb might end up writing into (before having
    unpacked it!), and not enable that on the ones where only user data
    might reside (say /home, if that is even on a different filesystem).
    Consider also that dpkg needs to be able to operate on chroots.

    Enabling filesystem snapshotting on all mount points that support
    it seems potentially dangerous, as I don't think dpkg should be
    placed in a position where it needs to decide whether to roll back
    to get back into a good system state vs not rolling back to avoid
    losing user data (say from /home; or, given that this can be user
    dependent, it would need to check every current non-system user on
    the system for any other home location).
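
    As a rough illustration of the bookkeeping described above, the
    sketch below parses a /proc/mounts-style table and picks out the
    snapshot-capable mount points that a given set of package paths
    would write into. It is not dpkg code: the function names, the
    sample mount table and the choice to treat only btrfs as
    snapshot-capable are all assumptions made for the example. It also
    sidesteps the hard parts named above, since it takes the package's
    paths as already known and ignores chroots entirely.

```python
# Illustrative sketch only; not dpkg code. All names here are hypothetical.
SNAPSHOT_CAPABLE = {"btrfs"}  # assumption: only btrfs is treated as capable


def parse_mounts(text):
    """Parse /proc/mounts-style lines into (mount_point, fs_type) pairs."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            mounts.append((fields[1], fields[2]))
    return mounts


def owning_mount(path, mounts):
    """Return the (mount_point, fs_type) whose mount point is the longest
    prefix of path, or None if nothing matches."""
    best = None
    for mount_point, fs_type in mounts:
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if best is None or len(mount_point) > len(best[0]):
                best = (mount_point, fs_type)
    return best


def snapshot_candidates(package_paths, mounts):
    """Snapshot-capable mount points that the package paths would touch."""
    touched = set()
    for path in package_paths:
        owner = owning_mount(path, mounts)
        if owner and owner[1] in SNAPSHOT_CAPABLE:
            touched.add(owner[0])
    return sorted(touched)


# Made-up mount table for demonstration.
SAMPLE_MOUNTS = """\
/dev/sda1 / btrfs rw 0 0
/dev/sda2 /home btrfs rw 0 0
/dev/sda3 /var ext4 rw 0 0
"""
```

    With the sample table, a package shipping /usr/bin/foo and
    /var/lib/foo/data would select only "/", leaving /home (user data)
    and /var (no snapshot support) alone.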

    * No trivial testing:

    Even if the above would be non-issues, I'd be very uncomfortable
    having this code in dpkg for something this central to its operation,
    where I personally would not be exercising it daily, and where
    testing this would imply having to perform VM installations with
    such filesystems, and then having to force system crashes, reboots,
    etc. While this is all certainly doable, it's going to be
    rather slow, and thus painful to run as some kind of integration
    tests in a CI pipeline.


    But other types of optimizations do make sense in dpkg, even when they
    are port or hardware specific. For example I've got queued a branch
    to add a selectable feature to stop ordering database loads (the .list
    files) based on physical offsets (currently through Linux's fiemap),
    which no longer makes sense on non-mechanical disks. This will require
    for now enabling/disabling it explicitly (depending on the intended
    sense of the option) to not regress existing installations, but
    perhaps the sense could be inverted in the future to assume by default
    that non-mechanical disks are in use.
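
    To make the idea concrete, here is a minimal sketch of a
    selectable load order. It is not the queued branch and not dpkg's
    implementation: load_order, its sort_by_offset flag and the offset
    table are hypothetical, and the physical offsets are made-up numbers
    standing in for what a fiemap lookup might return.

```python
# Illustrative sketch only; not dpkg's implementation. Names are hypothetical.

def load_order(list_files, physical_offset, sort_by_offset):
    """Return the order in which to read the given .list files.

    physical_offset is a caller-supplied function mapping a file name to
    its first on-disk offset (a stand-in for a fiemap-based lookup).
    With sort_by_offset False the input order is kept unchanged, which
    is the behaviour one would want when seek order does not matter.
    """
    if not sort_by_offset:
        return list(list_files)
    return sorted(list_files, key=physical_offset)


# Made-up offsets for demonstration.
FAKE_OFFSETS = {"a.list": 9000, "b.list": 100, "c.list": 4000}

ordered = load_order(FAKE_OFFSETS, FAKE_OFFSETS.get, sort_by_offset=True)
unordered = load_order(FAKE_OFFSETS, FAKE_OFFSETS.get, sort_by_offset=False)
```

    Here ordered reads b.list, c.list, a.list (ascending offset), while
    unordered keeps the original a, b, c sequence. Which of the two is
    the default is exactly the "sense of the option" question above.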

    Thanks,
    Guillem
