• Trivial Backup dilemma

    From pinnerite@pinnerite@gmail.com to alt.os.linux.mint on Mon Feb 2 22:32:35 2026
    From Newsgroup: alt.os.linux.mint

    I rsync my main drive to backups.

    However I am a bit messy at filing documents, just keeping them in
    temporary places, like "Downloads" for example. Eventually I will
    decide to create one or more directories and move stuff into them.

    When I backup of course, the new directories (and contents) will be
    copied but the originals will remain on the backup drive. It would be
    time-consuming to go through the backup drive, work out what had been
    duplicated, and delete the original files.

    In the past, because I have multiple backups, I would delete all the
    contents of a backup drive before backing up to it again.

    That though is time consuming and seems like using a hammer to crack a
    nut.
    --
    Linux Mint 22.1 kernel version 6.8.0-84-generic Cinnamon 6.4.8
    AMD Ryzen 7 7700, Radeon RX 6600, 32GB DDR5, 2TB SSD, 2TB Barracuda
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Mike Easter@MikeE@ster.invalid to alt.os.linux.mint on Mon Feb 2 15:04:39 2026
    From Newsgroup: alt.os.linux.mint

    pinnerite wrote:
    That though is time consuming and seems like using a hammer to crack a
    nut.

    The key to good storage 'maintenance' is *starting with* good organization.

    Theoretically, once upon a time a person would accumulate a big pile of disorderly files that need to be organized into directories.

    However, the 'nature' of that disorder post hoc begins to emerge as
    greater order, as the post hoc organizer comes along and designates
    an order with directories, and maybe hierarchies, based on what he
    has accumulated.

    Logically, it would seem that *future* would-be disorderly files could
    fall into a type of order which emerged from the prior post hoc ordering experience.

    Ya' know whattuh mean, Gene?
    --
    Mike Easter
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Lawrence =?iso-8859-13?q?D=FFOliveiro?=@ldo@nz.invalid to alt.os.linux.mint on Mon Feb 2 23:13:53 2026
    From Newsgroup: alt.os.linux.mint

    On Mon, 2 Feb 2026 22:32:35 +0000, pinnerite wrote:

    When I backup of course, the new directories (and contents) will be
    copied but the originals will remain on the backup drive. It would be
    time-consuming to go through the backup drive, work out what had been
    duplicated and delete the original files.

    rsync has the --link-dest option so it can do the deduping for you.
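    A minimal sketch of a snapshot run using that option (the paths here
    are illustrative, not anyone's actual layout):

    ```shell
    #!/bin/sh
    # Hypothetical source and backup locations.
    SRC="$HOME/Documents/"
    DEST="/mnt/backup"

    NEW="$DEST/$(date +%F-%H%M%S)"   # one directory per snapshot
    PREV="$DEST/latest"              # symlink to the previous snapshot

    # Unchanged files are hard-linked against the previous snapshot
    # instead of being copied again. On the very first run "latest"
    # does not exist yet; rsync warns and just does a full copy.
    rsync -a --link-dest="$PREV" "$SRC" "$NEW"
    ln -sfn "$NEW" "$PREV"
    ```

    Each snapshot directory then looks like a complete copy, but only
    changed files consume new space.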
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Paul@nospam@needed.invalid to alt.os.linux.mint on Mon Feb 2 21:58:28 2026
    From Newsgroup: alt.os.linux.mint

    On Mon, 2/2/2026 5:32 PM, pinnerite wrote:
    I rsync my main drive to backups.

    However I am a bit messy at filing documents, just keeping them in
    temporary places, like "Downloads" for example. Eventually I will
    decide to create one or more directories and move stuff into them.

    When I backup of course, the new directories (and contents) will be
    copied but the originals will remain on the backup drive. It would be
    time-consuming to go through the backup drive, work out what had been
    duplicated and delete the original files.

    In the past, because I have multiple backups, I would delete all the
    contents of a backup drive before backing up to it again.

    That though is time consuming and seems like using a hammer to crack a
    nut.



    There is an article here on hardlinking, and using it to do
    a Full-incremental-incremental-incremental kind of thing.

    # Intro and inode number demo

    https://www.admin-magazine.com/Articles/Using-rsync-for-Backups

    # More meaty

    http://www.mikerubel.org/computers/rsync_snapshots/

    By positioning yourself in a particular backup instance directory
    like backup.1 , you get a consolidated view of a point in time.
    The intent of the method seems to be to make the structure work for
    you, instead of you working for it.

    And this does not dedup. If the source directory has apple/b and
    baker/b, the method copies both of them, and does not attempt any
    further space-saving hijinks. But when you delete a directory at the
    source, the deletion "propagates" to the time-based views in
    backup.1, backup.2 and so on. If you select a date where a file
    existed, you can still access it: when doing a restore, you can pick
    a date where the "b" file existed, or a date where it did not.

    The purpose of hardlinking is to have two file pointers to the same
    set of data inodes. That's how you can have consistent
    points-in-time, for the cost of the file system overhead for the
    linkage. As a real-world example, on a rather large collection of
    files, the hardlink overhead was 500MB. There is a cost associated
    with the method, but the consistent points-in-time are the benefit.
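    The two-names-one-inode behaviour is easy to demonstrate by hand:

    ```shell
    #!/bin/sh
    # Demo: a hard link is a second directory entry for the same inode,
    # so the data itself is stored only once.
    tmp=$(mktemp -d) && cd "$tmp"

    echo "point-in-time data" > original.txt
    ln original.txt linked.txt

    # Both names show the same inode number and a link count of 2.
    stat -c '%i %h %n' original.txt linked.txt

    # Removing one name leaves the data reachable via the other.
    rm original.txt
    cat linked.txt    # prints: point-in-time data
    ```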

    If you dedup the source directory, that can be a bit dangerous if you don't understand the details of the implementation. Maybe it's just better to back
    up things as they stand, at a particular point in time.

    And as with any "rotation" strategy, you should not rely on any one
    filesystem maintaining sanity forever. Journaled file systems are a
    hell of a lot more reliable than the un-journaled ones. While it
    might be tempting to have Full-1-2-3...100, there is a risk that the
    Delete Fairy gets in there (the execution of a badly formatted
    command by the user wipes out the entire cache), so it's helpful to
    have more than one filesystem you can look to for your points in
    time.

                 Disk1: Full, 1, 2, 3    # This is not a RAID.
                /                        # The two devices are intended to have some independence:
        Source -                         # a power supply failure doesn't wipe both out;
                \                        # a fire burns one, and not the other.
                 Disk2: Full, 1, 2, 3

    Paul
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Lawrence =?iso-8859-13?q?D=FFOliveiro?=@ldo@nz.invalid to alt.os.linux.mint on Tue Feb 3 03:27:21 2026
    From Newsgroup: alt.os.linux.mint

    On Mon, 2 Feb 2026 23:13:53 -0000 (UTC), I wrote:

    rsync has the --link-dest option so it can do the deduping for you.

    Just to clarify, I mean deduping across different backup snapshots, to
    avoid creating additional copies of a file that has not changed, not
    within a single backup snapshot.
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Gordon@Gordon@leaf.net.nz to alt.os.linux.mint on Tue Feb 3 07:10:47 2026
    From Newsgroup: alt.os.linux.mint

    On 2026-02-02, Dan Purgert <dan@djph.net> wrote:
    On 2026-02-02, pinnerite wrote:
    I rsync my main drive to backups.

    However I am a bit messy at filing documents, just keeping them in
    temporary places, like "Downloads" for example. Eventually I will
    decide to create one or more directories and move stuff into them.

    When I backup of course, the new directories (and contents) will be
    copied but the originals will remain on the backup drive. It would be
    time-consuming to go through the backup drive, work out what had been
    duplicated and delete the original files.

    In the past, because I have multiple backups, I would delete all the
    contents of a backup drive before backing up to it again.

    That though is time consuming and seems like using a hammer to crack a
    nut.

    So use something like fdupes or the occasional --delete[*] switch with your rsync job?

    If Alex just wants his backup (via rsync) to be identical to the
    source (directory), then the --delete option will do it.

    Do a test run on a sample directory, to see if you get what you are expecting.

    --del                an alias for --delete-during
    --delete             delete extraneous files from dest dirs
    --delete-before      receiver deletes before xfer, not during
    --delete-during      receiver deletes during the transfer
    --delete-delay       find deletions during, delete after
    --delete-after       receiver deletes after transfer, not during
    --delete-excluded    also delete excluded files from dest dirs




    [*] NB -- there are several 'options' you can use with --delete; such as 'before' or 'after' the transfer (and some additional variants thereto)

    See above.
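    Whichever variant you pick, a dry run first shows what --delete
    would remove (the directories here are placeholders):

    ```shell
    #!/bin/sh
    # -n (--dry-run) makes no changes; with -v, each file that --delete
    # would remove is listed as "deleting <name>".
    rsync -avn --delete "$HOME/Documents/" /mnt/backup/Documents/

    # If the preview looks right, rerun the same command without -n.
    ```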
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Mike Scott@usenet.16@scottsonline.org.uk.invalid to alt.os.linux.mint on Tue Feb 3 08:53:10 2026
    From Newsgroup: alt.os.linux.mint

    On 03/02/2026 03:27, Lawrence D'Oliveiro wrote:
    On Mon, 2 Feb 2026 23:13:53 -0000 (UTC), I wrote:

    rsync has the --link-dest option so it can do the deduping for you.

    Just to clarify, I mean deduping across different backup snapshots, to
    avoid creating additional copies of a file that has not changed, not
    within a single backup snapshot.

    Isn't this where timeshift and 'back in time' come in? They're really
    just front-ends to rsync plus scheduling of one sort or another.
    They seem to work well enough.

    Or are we talking archiving here, as opposed to backup? Or some mixture?
    --
    Mike Scott
    Harlow, England
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From pinnerite@pinnerite@gmail.com to alt.os.linux.mint on Tue Feb 3 12:25:32 2026
    From Newsgroup: alt.os.linux.mint

    On Mon, 2 Feb 2026 22:32:35 +0000
    pinnerite <pinnerite@gmail.com> wrote:

    I rsync my main drive to backups.

    However I am a bit messy at filing documents, just keeping them in
    temporary places, like "Downloads" for example. Eventually I will
    decide to create one or more directories and move stuff into them.

    When I backup of course, the new directories (and contents) will be
    copied but the originals will remain on the backup drive. It would be
    time-consuming to go through the backup drive, work out what had been
    duplicated and delete the original files.

    In the past, because I have multiple backups, I would delete all the
    contents of a backup drive before backing up to it again.

    That though is time consuming and seems like using a hammer to crack a
    nut.


    The advice given (thank you) appears to assume that the files will
    stay in the same locations. They won't. The source file will have
    been moved to a more suitable directory. The existing backup copy
    will still match the original location of the file (which will no
    longer be there, of course).

    That is why I now get duplicates.
    --
    Linux Mint 22.1 kernel version 6.8.0-84-generic Cinnamon 6.4.8
    AMD Ryzen 7 7700, Radeon RX 6600, 32GB DDR5, 2TB SSD, 2TB Barracuda
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Lawrence =?iso-8859-13?q?D=FFOliveiro?=@ldo@nz.invalid to alt.os.linux.mint on Tue Feb 3 21:20:53 2026
    From Newsgroup: alt.os.linux.mint

    On Tue, 3 Feb 2026 08:53:10 +0000, Mike Scott wrote:

    On 03/02/2026 03:27, Lawrence D'Oliveiro wrote:

    Just to clarify, I mean deduping across different backup
    snapshots...

    Or are we talking archiving here, as opposed to backup? Or some
    mixture?

    I was talking about backup snapshots.
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Handsome Jack@jack@handsome.com to alt.os.linux.mint on Wed Feb 4 09:38:34 2026
    From Newsgroup: alt.os.linux.mint

    On Tue, 3 Feb 2026 08:53:10 +0000, Mike Scott wrote:

    On 03/02/2026 03:27, Lawrence D'Oliveiro wrote:
    On Mon, 2 Feb 2026 23:13:53 -0000 (UTC), I wrote:

    rsync has the --link-dest option so it can do the deduping for you.

    Just to clarify, I mean deduping across different backup snapshots, to
    avoid creating additional copies of a file that has not changed, not
    within a single backup snapshot.

    Isn't this where timeshift and 'back in time' come in? They're really
    just front-ends to rsync plus scheduling of one sort or another.
    They seem to work well enough.

    Or are we talking archiving here, as opposed to backup? Or some mixture?

    Timeshift doesn't work on non-ext4 destination disks, which rules it out
    for removable disks that might have to be used on a spare Windows machine.
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Paul@nospam@needed.invalid to alt.os.linux.mint on Wed Feb 4 06:57:54 2026
    From Newsgroup: alt.os.linux.mint

    On Tue, 2/3/2026 7:25 AM, pinnerite wrote:
    On Mon, 2 Feb 2026 22:32:35 +0000
    pinnerite <pinnerite@gmail.com> wrote:

    I rsync my main drive to backups.

    However I am a bit messy at filing documents, just keeping them in
    temporary places, like "Downloads" for example. Eventually I will
    decide to create one or more directories and move stuff into them.

    When I backup of course, the new directories (and contents) will be
    copied but the originals will remain on the backup drive. It would be
    time-consuming to go through the backup drive, work out what had been
    duplicated and delete the original files.

    In the past, because I have multiple backups, I would delete all the
    contents of a backup drive before backing up to it again.

    That though is time consuming and seems like using a hammer to crack a
    nut.


    The advice given (thank you) appears to assume that the files will
    stay in the same locations. They won't. The source file will have
    been moved to a more suitable directory. The existing backup copy
    will still match the original location of the file (which will no
    longer be there, of course).

    That is why I now get duplicates.


    If you don't like this scheme, because it doesn't happen to
    hardlink moved duplicates...

    # Intro and inode number demo

    https://www.admin-magazine.com/Articles/Using-rsync-for-Backups

    # More meaty

    http://www.mikerubel.org/computers/rsync_snapshots/

    you could ask the AI how to modify the script to achieve that end.
    The AI seems to be able to read material if you provide a URL
    (which means the free ones have limited agentic capability). The
    site of the URL though, may not like the visit of Agentic AI,
    and may repulse the thing.

    Or you can show it the recommended scripted sequence. The first
    article, tries to pick the best of what it found on the second
    web page.

    rm -rf backup.3
    mv backup.2 backup.3
    mv backup.1 backup.2
    cp -al backup.0 backup.1 <=== I think this hard links the entire backup
    rsync -a --delete source_directory/ backup.0/

    If I was personally "nervous" about what was going on,
    then I would "track" the source directory and the destination
    directories (the point-in-time backups with their hardlinks
    for unchanged files), and manually deduplicate the latest backup
    made (backup.0) so that additional things in there which had
    been assigned brand new inodes, were replaced with hardlinks
    to existing files. Perhaps the rsync can generate a log of sufficient
    quality, to track everything that has happened from generation
    to generation.
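    That manual dedup pass over backup.0 could be sketched like this
    (not the articles' script, just a checksum-based simplification; it
    assumes GNU md5sum/stat and filenames free of tabs and newlines):

    ```shell
    #!/bin/bash
    # Sketch: within one snapshot, replace content-identical files with
    # hard links, so a file that merely moved costs no extra space.
    # Matching is by checksum only -- the duplicate's own permissions
    # and timestamps are discarded, a deliberate simplification.
    SNAP="backup.0"

    find "$SNAP" -type f -print0 | xargs -0 md5sum |
    awk '{ h = substr($0, 1, 32); f = substr($0, 35)
           if (h in first) print first[h] "\t" f
           else first[h] = f }' |
    while IFS=$'\t' read -r keep dup; do
        ln -f "$keep" "$dup"    # duplicate becomes a hard link
    done
    ```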

    Summary: An insistence on complexity, eventually leads to disaster.
    At some point, it's OK to suffer a little inefficiency, if it
    means "the thing will never break". You have demonstrated in this
    group already, your problems with getting altered NFS mounts to
    work properly, as an example of a complexity that required quite
    a bit of boot-kicking to fix.

    Why do I like Macrium backups ? It's definitely not the efficiency.
    It's the "hard to blow up" behavior. I know that the integrity of
    backups can be ruined by... bad RAM in the computer! It already happened!
    And, by using Verify capability, I caught it in time! So while it wastes
    gobs of space for the "free version", it gets the job done in such a
    way that I never have to worry about details. It's a point-in-time.
    It's "complete". Is it ideal ? Does it cure cancer ? Nope.

    If you want it completely deduplicated, you're probably pretty close
    already. The AI may figure out a way to finish the job. Your job then,
    is to assemble a testbench case, of moved duplicates and so on, in the
    same style as the author of the first link above.

    Paul
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Lawrence =?iso-8859-13?q?D=FFOliveiro?=@ldo@nz.invalid to alt.os.linux.mint on Wed Feb 4 20:42:54 2026
    From Newsgroup: alt.os.linux.mint

    On Wed, 4 Feb 2026 09:38:34 -0000 (UTC), Handsome Jack wrote:

    Timeshift doesn't work on non-ext4 destination disks ...

    Funnily enough, rsync works on any filesystem that Linux will support.
    --- Synchronet 3.21b-Linux NewsLink 1.2
  • From Gordon@Gordon@leaf.net.nz to alt.os.linux.mint on Wed Feb 4 22:42:54 2026
    From Newsgroup: alt.os.linux.mint

    On 2026-02-04, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Wed, 4 Feb 2026 09:38:34 -0000 (UTC), Handsome Jack wrote:

    Timeshift doesn't work on non-ext4 destination disks ...

    Funnily enough, rsync works on any filesystem that Linux will support.

    Timeshift also works with btrfs, if there is a subvolume @ in the picture.
    --- Synchronet 3.21b-Linux NewsLink 1.2