• [gentoo-user] backup horror story (happy ending)

    From ralfconn@21:1/5 to All on Sat Nov 9 11:40:02 2024
    I have a 4TB hard disk half full of videos and photos my daughter took
    with her cell phone over the years, shared with Win11 in a dual boot box
    so it is NTFS-formatted. The disk is backed up on a different EXT4 disk
    and the backup is performed by (ana)cron via an rsync bash script.

    Last evening there was a power outage. When I rebooted into Linux the NTFS
    disk would not mount. OK, just e2fsck the disk and it will fix it, I
    thought, forgetting that it was NTFS, not EXT. e2fsck -y started
    finding and fixing hundreds of issues on the disk until I got bored,
    stopped it and rebooted into Win11, which chkdsk'd the disk and mounted
    it with no problem in less than 10s.

    Finally I realized the huge mistake I had made, allowing e2fsck to
    delete thousands of otherwise fine clusters/nodes/whatever on a
    filesystem it does not understand.
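
    A minimal Python sketch of the check that would have avoided this: look
    up the filesystem type first and refuse to run a checker that does not
    match. The CHECKERS table and the function names are hypothetical;
    blkid, e2fsck -p and ntfsfix are the real tools, and ntfsfix (from
    ntfs-3g) only repairs a few common problems, it is not a chkdsk
    replacement.

```python
import subprocess

# Hypothetical mapping from filesystem type to a check command; adjust to taste.
CHECKERS = {
    "ext2": ["e2fsck", "-p"],
    "ext3": ["e2fsck", "-p"],
    "ext4": ["e2fsck", "-p"],
    "ntfs": ["ntfsfix"],
}


def detect_fstype(device):
    """Ask blkid for the filesystem type of a block device (usually needs root)."""
    result = subprocess.run(
        ["blkid", "-o", "value", "-s", "TYPE", device],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def checker_command(fstype, device):
    """Return the check command for a filesystem type, or refuse to guess."""
    try:
        return CHECKERS[fstype] + [device]
    except KeyError:
        raise ValueError(f"no known checker for {fstype!r}, refusing to guess")
```

    Something like checker_command(detect_fstype("/dev/sdx1"), "/dev/sdx1")
    would then hand back the command to run, or raise instead of letting
    e2fsck loose on a filesystem it does not understand.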

    But I have a backup, no problem... till I realized the cron job had
    already run, so I assumed it had overwritten the old files with the
    new, corrupt versions.

    Fortunately rsync uses file size and modification time to quickly find
    potential differences, and since e2fsck did not touch those, the backup
    was still fine.
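
    For illustration, rsync's default "quick check" compares only size and
    modification time, which can be sketched in Python roughly as follows.
    The function name is made up, and real rsync has more options (time
    window, checksums) than this shows.

```python
import os


def quick_check_differs(src, dst):
    """Return True if dst is missing or differs from src in size or mtime.

    File contents are never read, so a file whose data changed while its
    size and mtime stayed the same is treated as unchanged.
    """
    try:
        d = os.stat(dst)
    except FileNotFoundError:
        return True
    s = os.stat(src)
    return s.st_size != d.st_size or int(s.st_mtime) != int(d.st_mtime)
```

    This is why damaged files with an untouched size and mtime were skipped.
    Running rsync with --checksum compares contents instead, which in this
    case would have copied the damage into the backup.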

    Later I found that the disk had not mounted in Linux because mount could
    no longer find the NTFS UUID that I had in fstab, but it mounted
    fine with /dev/sdx.
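
    A quick way to spot this kind of problem is to compare the UUID= entries
    in fstab with what udev lists under /dev/disk/by-uuid. A rough Python
    sketch, with hypothetical function names:

```python
import os


def fstab_uuids(fstab_text):
    """Return the UUIDs referenced by UUID=... entries in fstab text."""
    uuids = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        spec = line.split()[0]  # first field is the device spec
        if spec.startswith("UUID="):
            uuids.append(spec[len("UUID="):].strip('"'))
    return uuids


def missing_uuids(fstab_text, present):
    """Return the fstab UUIDs that are not in the `present` collection."""
    known = set(present)
    return [u for u in fstab_uuids(fstab_text) if u not in known]


def check_system(fstab="/etc/fstab", by_uuid="/dev/disk/by-uuid"):
    """Compare the live fstab against the UUIDs the kernel currently sees."""
    with open(fstab) as f:
        return missing_uuids(f.read(), os.listdir(by_uuid))
```

    An empty list from check_system() means every UUID in fstab is currently
    visible; anything else names the entries mount will fail to find.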

    I felt like a triple-idiot.

    raf

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rich Freeman@21:1/5 to mentadent47@yahoo.com on Sat Nov 9 15:50:01 2024
    On Sat, Nov 9, 2024 at 5:30 AM ralfconn <mentadent47@yahoo.com> wrote:

    But I have a backup, no problem... till I realize the cron job had
    already run so it had overwritten the old files with the new, corrupt versions.


    I highly recommend having multiple backups to avoid this sort of problem.

    If you aren't wedded to rsync, then restic seems to be the platform of
    choice these days. Not sure offhand if it handles ntfs very well. I
    use duplicati for backing up windows hosts to an S3 backend and that
    works great, but that is more of a windows solution (VSS and so on).
    I imagine that restic doesn't care much about the filesystem if it is
    running on linux and everything is mounted.

    If you are wedded to the rsync approach where your backups are just a
    big directory tree that you can easily access, then I suggest using
    rsnapshot. It is basically a wrapper around rsync that maintains a
    backup history, in a very clean way. Basically it does a hard-link
    copy of your entire backup set to a new directory tree named for its
    backup interval (daily.0, daily.1 and so on), and then it runs rsync
    to sync that new tree. The result is that you effectively get
    file-level deduplication, and otherwise get what looks like a nice
    big full copy of the backup source in each directory. It probably
    won't be as efficient as something like restic
    since I'm guessing that can do deduplication below the file level (I
    think it can also deduplicate across multiple hosts/etc if you're
    using it that way). Some of these tools (duplicity, for one) use
    librsync under the hood, and the others do their own chunking, so the
    actual data transfer is about as efficient.
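
    To make the hard-link idea concrete, here is a small Python sketch of a
    snapshot that links unchanged files to the previous snapshot and copies
    only changed ones. This is not rsnapshot's actual code (it drives cp and
    rsync); the function names are made up, and hard links only work when
    both snapshots live on the same filesystem.

```python
import os
import shutil


def _unchanged(src_file, old_file):
    """Size/mtime comparison, in the spirit of rsync's default quick check."""
    try:
        old = os.stat(old_file)
    except FileNotFoundError:
        return False
    new = os.stat(src_file)
    return new.st_size == old.st_size and int(new.st_mtime) == int(old.st_mtime)


def snapshot(source, previous, target):
    """Create `target` as a full-looking copy of `source`.

    Files unchanged since `previous` are hard-linked instead of copied, so
    they take no extra space. `previous` may be None for the first run.
    """
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        os.makedirs(os.path.join(target, rel), exist_ok=True)
        for name in files:
            src_file = os.path.join(root, name)
            new_file = os.path.join(target, rel, name)
            old_file = os.path.join(previous, rel, name) if previous else None
            if old_file and _unchanged(src_file, old_file):
                os.link(old_file, new_file)       # share data with old snapshot
            else:
                shutil.copy2(src_file, new_file)  # new data, mtime preserved
```

    Each snapshot directory can be browsed or restored on its own, and a
    file that gets corrupted later only affects snapshots taken after the
    change, since older snapshots keep pointing at the old data.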

    rsync by itself is nice for its simplicity, but it just isn't a very
    elegant backup solution. You can tell it to preserve old file
    versions, but those end up stored next to the original files with
    different filenames and that can be a real mess to restore if you
    don't want to end up with all those old versions. With
    restic/rsnapshot you can go back in time but still get a clean
    restore, and you can still extract individual files from various
    snapshots.

    If you really are running rsync at any kind of scale also consider
    rclone, which is often faster since it can transfer multiple files in
    parallel, which is useful if you're more bound by latency than disk
    IOPS (often the case on solid state drives over networks).

    --
    Rich

  • From Philip Webb@21:1/5 to All on Sat Nov 9 22:10:01 2024
    Lesson 1 : always have multiple back-ups on different devices,
    incl >= 1 off-site, eg USB stick in your bank safe-deposit box.
    Lesson 2 : don't store anything important in Windows format.

    It's good & lucky you escaped unscathed this time (smile).

    --
    ========================,,============================================
    SUPPORT ___________//___, Philip Webb
    ELECTRIC /] [] [] [] [] []| Cities Centre, University of Toronto
    TRANSIT `-O----------O---' purslowatcadotinterdotnet
