• protect against bit-rot?

    From Mike Scott@usenet.16@scottsonline.org.uk.invalid to alt.os.linux.mint on Thu Dec 4 11:42:09 2025
    From Newsgroup: alt.os.linux.mint

    Hmmm.

    Checking over things, I found some old files with dates in the future.
    One directory lists as:

    ls -li
    total 36
    3671727 -rw-rw-r-- 1 mike mike 1230 Oct 16 2018 cd1.k3b
    3671728 -rw-rw-r-- 1 mike mike 1160 Oct 16 2018 cd1.k3b.files
    3671729 -rw-r--r-- 1 mike mike 137 Oct 16 2018 k3b2list.sh
    3671730 -rw-r--r-- 1 mike mike 17873 Feb 7 2106 maindata.xml
    3671731 -rw-r--r-- 1 mike mike 17 Feb 7 2106 mimetype


    Note the future dates for the last two. This stuff has been left around
    since 2018, unmodified (by me, at least). The contents look reasonable,
    so it's just the metadata messed up - they seem to be the only two files affected.

    The data was on a freebsd machine until a few weeks ago, when the whole
    lot was rsync'd from spinning rust to the present SSD on a mint server.

    A bit worrying: freebsd failure? rsync failure? SSD failure? linux
    failure? Gremlins?

    But how can anyone possibly realistically detect this sort of thing?
    With an unknown cause?
    --
    Mike Scott
    Harlow, England

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Paul@nospam@needed.invalid to alt.os.linux.mint on Thu Dec 4 09:09:05 2025
    From Newsgroup: alt.os.linux.mint

    On Thu, 12/4/2025 6:42 AM, Mike Scott wrote:
    Hmmm.

    Checking over things, I found some old files with dates in the future. One directory lists as:

    ls -li
    total 36
    3671727 -rw-rw-r-- 1 mike mike 1230 Oct 16 2018 cd1.k3b
    3671728 -rw-rw-r-- 1 mike mike 1160 Oct 16 2018 cd1.k3b.files
    3671729 -rw-r--r-- 1 mike mike 137 Oct 16 2018 k3b2list.sh
    3671730 -rw-r--r-- 1 mike mike 17873 Feb 7 2106 maindata.xml
    3671731 -rw-r--r-- 1 mike mike 17 Feb 7 2106 mimetype


    Note the future dates for the last two. This stuff has been left around since 2018, unmodified (by me, at least). The contents look reasonable, so it's just the metadata messed up - they seem to be the only two files affected.

    The data was on a freebsd machine until a few weeks ago, when the whole lot was rsync'd from spinning rust to the present SSD on a mint server.

    A bit worrying: freebsd failure? rsync failure? SSD failure? linux failure? Gremlins?

    But how can anyone possibly realistically detect this sort of thing? With an unknown cause?



    https://github.com/antrea-io/antrea/issues/1417

    "https://tools.ietf.org/html/rfc7011#section-6.1.7 does state:

    The dateTimeSeconds data type is an unsigned 32-bit integer in
    network byte order containing the number of seconds since the UNIX
    epoch, 1 January 1970 at 00:00 UTC, as defined in [POSIX.1].
    dateTimeSeconds is encoded identically to the IPFIX Message Header
    Export Time field. It can represent dates between 1 January 1970 and
    7 February 2106 without wraparound; see Section 5.2 for wraparound considerations."
    ^^^^^^^^^^^^^^^

    That appears to be a magic number and not a random corruption.
    That would be a software problem of some sort.

    Paul
  • From Mike Scott@usenet.16@scottsonline.org.uk.invalid to alt.os.linux.mint on Thu Dec 4 16:43:14 2025
    From Newsgroup: alt.os.linux.mint

    On 04/12/2025 14:09, Paul wrote:
    On Thu, 12/4/2025 6:42 AM, Mike Scott wrote:
    Hmmm.

    Checking over things, I found some old files with dates in the future. One directory lists as:

    ls -li
    total 36
    3671727 -rw-rw-r-- 1 mike mike 1230 Oct 16 2018 cd1.k3b
    3671728 -rw-rw-r-- 1 mike mike 1160 Oct 16 2018 cd1.k3b.files
    3671729 -rw-r--r-- 1 mike mike 137 Oct 16 2018 k3b2list.sh
    3671730 -rw-r--r-- 1 mike mike 17873 Feb 7 2106 maindata.xml
    3671731 -rw-r--r-- 1 mike mike 17 Feb 7 2106 mimetype


    Note the future dates for the last two. This stuff has been left around since 2018, unmodified (by me, at least). The contents look reasonable, so it's just the metadata messed up - they seem to be the only two files affected.

    The data was on a freebsd machine until a few weeks ago, when the whole lot was rsync'd from spinning rust to the present SSD on a mint server.

    A bit worrying: freebsd failure? rsync failure? SSD failure? linux failure? Gremlins?

    But how can anyone possibly realistically detect this sort of thing? With an unknown cause?



    https://github.com/antrea-io/antrea/issues/1417

    "https://tools.ietf.org/html/rfc7011#section-6.1.7 does state:

    The dateTimeSeconds data type is an unsigned 32-bit integer in
    network byte order containing the number of seconds since the UNIX
    epoch, 1 January 1970 at 00:00 UTC, as defined in [POSIX.1].
    dateTimeSeconds is encoded identically to the IPFIX Message Header
    Export Time field. It can represent dates between 1 January 1970 and
    7 February 2106 without wraparound; see Section 5.2 for wraparound considerations."
    ^^^^^^^^^^^^^^^

    That appears to be a magic number and not a random corruption.
    That would be a software problem of some sort.

    Thanks for pointing that out; I'd missed the significance.

    Nevertheless, /something/ changed the dates somewhen - they should all
    be similar. I've checked old dump files around from freebsd, but restore
    (on mint) sets the extracted file dates to the current time, which isn't helpful (and wrong behaviour?!)

    My random guess would be rsync. But that's just because the wind's in
    the south-west :-}


    Paul
    --
    Mike Scott
    Harlow, England
  • From Paul@nospam@needed.invalid to alt.os.linux.mint on Thu Dec 4 16:38:35 2025
    From Newsgroup: alt.os.linux.mint

    On Thu, 12/4/2025 11:43 AM, Mike Scott wrote:
    On 04/12/2025 14:09, Paul wrote:
    On Thu, 12/4/2025 6:42 AM, Mike Scott wrote:
    Hmmm.

    Checking over things, I found some old files with dates in the future. One directory lists as:

    ls -li
    total 36
    3671727 -rw-rw-r-- 1 mike mike 1230 Oct 16 2018 cd1.k3b
    3671728 -rw-rw-r-- 1 mike mike 1160 Oct 16 2018 cd1.k3b.files
    3671729 -rw-r--r-- 1 mike mike 137 Oct 16 2018 k3b2list.sh
    3671730 -rw-r--r-- 1 mike mike 17873 Feb 7 2106 maindata.xml
    3671731 -rw-r--r-- 1 mike mike 17 Feb 7 2106 mimetype


    Note the future dates for the last two. This stuff has been left around since 2018, unmodified (by me, at least). The contents look reasonable, so it's just the metadata messed up - they seem to be the only two files affected.

    The data was on a freebsd machine until a few weeks ago, when the whole lot was rsync'd from spinning rust to the present SSD on a mint server.

    A bit worrying: freebsd failure? rsync failure? SSD failure? linux failure? Gremlins?

    But how can anyone possibly realistically detect this sort of thing? With an unknown cause?



    https://github.com/antrea-io/antrea/issues/1417

    "https://tools.ietf.org/html/rfc7011#section-6.1.7 does state:

    The dateTimeSeconds data type is an unsigned 32-bit integer in
    network byte order containing the number of seconds since the UNIX
    epoch, 1 January 1970 at 00:00 UTC, as defined in [POSIX.1].
    dateTimeSeconds is encoded identically to the IPFIX Message Header
    Export Time field. It can represent dates between 1 January 1970 and
    7 February 2106 without wraparound; see Section 5.2 for wraparound considerations."
    ^^^^^^^^^^^^^^^

    That appears to be a magic number and not a random corruption.
    That would be a software problem of some sort.

    Thanks for pointing that out; I'd missed the significance.

    Nevertheless, /something/ changed the dates somewhen - they should all be similar.
    I've checked old dump files around from freebsd, but restore (on mint) sets the
    extracted file dates to the current time, which isn't helpful (and wrong behaviour?!)

    My random guess would be rsync. But that's just because the wind's in the south-west :-}

    OK, here is my theory. Rather than name and shame a tool, I look at it this way.

    A file system such as NTFS has 64-bit timestamps and, with its choice of
    resolution, can represent a huge range of dates, far more than a 32-bit UNIX
    timestamp can. On Linux, the NTFS metadata is used to populate what stat() returns,
    just so there is "something to munch on". There is an opportunity for an
    epoch-mismatch, right at the "stuffing of stat()" level.

    Another part of what you've done has 32-bit timestamps (you didn't make that
    choice, someone else did). OK, we can drop the 100ns crap no problem. A timestamp
    to the nearest second is plenty to not annoy anyone.

    But the year range of the 32-bit representation is strictly limited: an unsigned
    count of seconds from 1970 runs out on 7 February 2106.

    Let us say the year 9999 appears on NTFS and the "plumbing" supports at
    most 2106. Then what value do you send? Spock-like logic says we
    jam it to 2106 :-) Personally, I like using 1970, because people recognize
    that as a flag of a Time Lord snafu, whereas I had to do a Google to figure
    out I was hitting a limit-flag via 2106 on a smaller section of plumbing.
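
    The "jam it to 2106" idea can be sketched as below, next to the alternative of
    silently keeping only the low 32 bits. This is a minimal illustration in Python,
    not the code of rsync or of any file system driver, and the function names are
    hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
UINT32_MAX = 2**32 - 1


def to_epoch_seconds(moment):
    """Whole seconds between the UNIX epoch and an aware datetime."""
    delta = moment - EPOCH
    return delta.days * 86400 + delta.seconds


def from_epoch_seconds(seconds):
    """Inverse of to_epoch_seconds."""
    return EPOCH + timedelta(seconds=seconds)


def clamp_to_uint32(seconds):
    """Saturate: out-of-range values become the nearest limit (0 or 2**32 - 1)."""
    return max(0, min(seconds, UINT32_MAX))


def wrap_to_uint32(seconds):
    """Truncate: keep only the low 32 bits, so large values wrap around."""
    return seconds % 2**32


# A date a 64-bit file system can hold but a 32-bit field cannot.
far_future = to_epoch_seconds(datetime(9999, 1, 1, tzinfo=timezone.utc))

clamped = from_epoch_seconds(clamp_to_uint32(far_future))
wrapped = from_epoch_seconds(wrap_to_uint32(far_future))
print("clamped:", clamped.date())
print("wrapped:", wrapped.date())
```

    Clamping gives a recognisable flag value at the limit, while wrapping gives a
    date that looks plausible and is therefore much harder to spot.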

    Sometimes it's just a certain subset of file storage methods that foul up. Like a
    file with a "short file name": somehow the date gets trashed while
    the metadata on that is re-inflated.

    I've run into interesting limit cases before. At one time, I used two OSes
    that supported different file name lengths. I would take a ZIP over to the
    one with the shorter limit, unzip the folder and... there would be
    one file which had a name one character longer than the system could handle.
    The extra letter was silently discarded. There was no limit-flag behavior
    on that one, and every time I un-archived something, I had to keep an
    eye peeled for damaged goods.

    The choices for correction: one is to "detect all weird situations", which is
    overly ambitious. A second option is to transfer the date information
    from source to destination separately, and use GNU touch to set the date on
    each file from the manually transported metadata.
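
    A rough sketch of that second option, in Python rather than with touch: record the
    modification times on the source side, carry the record across, and reapply it on
    the destination. Here os.utime stands in for touch, and the function names and
    manifest layout are my own assumptions, not part of any existing tool.

```python
import os


def record_mtimes(root):
    """Map each regular file's path (relative to root) to its mtime in nanoseconds."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full):
                continue  # symlinks are skipped to keep the sketch simple
            manifest[os.path.relpath(full, root)] = os.stat(full).st_mtime_ns
    return manifest


def apply_mtimes(root, manifest):
    """Reset mtimes under root from the manifest; return the paths that differed."""
    changed = []
    for rel, mtime_ns in sorted(manifest.items()):
        full = os.path.join(root, rel)
        if os.path.islink(full) or not os.path.isfile(full):
            continue  # missing on this side, or not a regular file
        st = os.stat(full)
        if st.st_mtime_ns != mtime_ns:
            # Keep the access time as it is and replace only the modification time.
            os.utime(full, ns=(st.st_atime_ns, mtime_ns))
            changed.append(rel)
    return changed
```

    The manifest is a plain dict, so it can be written out with json.dump on one
    machine and read back on the other. The list returned by apply_mtimes doubles as
    a report of which files arrived with the wrong date.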

    I would like to think that turning on a Verify function in the plumbing
    would catch it. But that would only work if the Verify function was end
    to end, which is unlikely to be the case.

    *******

    To start, take the tree from the source end, and list it and sort by date.
    This may cause the damaged goods to appear at one end of the listing file.

    Using my Notes file, I have examples of "old" attempts to catalog some goods.

    find /media/WIN2KAS -type d -exec ls -al -1 -d {} + > directories.txt
    find /media/WIN2KAS -type f -exec ls -al -1 {} + > filelist.txt

    sudo find /media/WIN2KAS -type f -exec stat --printf='%010Y %y %n\n' {} + > statlist.txt

    On Windows, the GnuWin32 packages contain ports of the usual suspects, including find and sort.

    find.exe C:\Downloads -type f -exec ls -al -1 {} ; | sort.exe -t: -k3 > SortedList.txt

    I don't usually write extensive notes for the one-liners, leaving
    that for analysis later. In any case, you can craft some one-liners
    to make metadata verification at the two ends a bit easier.
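
    For the original question of how to detect this sort of thing, a scan along the
    following lines may be enough. It is a minimal Python sketch, not a finished tool:
    it flags files dated in the future and files whose mtime sits exactly on a value
    that often marks a conversion limit. The set of "limit" values and the one-day
    slack are assumptions of mine.

```python
import os
import time

# Values that tend to show up when a timestamp is clamped, defaulted or
# overflowed: 0, the signed 32-bit maximum and the unsigned 32-bit maximum.
LIMIT_VALUES = (0, 2**31 - 1, 2**32 - 1)


def suspicious_mtimes(root, now=None, slack=86400):
    """Yield (path, mtime, reason) for files with an implausible modification time."""
    now = time.time() if now is None else now
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = int(os.lstat(path).st_mtime)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if mtime in LIMIT_VALUES:
                yield path, mtime, "limit value"
            elif mtime > now + slack:
                yield path, mtime, "future"
```

    Looping over suspicious_mtimes("/some/tree") and printing each tuple gives a short
    list to inspect by hand. Running it on both source and destination shows on which
    side the bad dates first appear.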

    Paul


  • From Lawrence D'Oliveiro@ldo@nz.invalid to alt.os.linux.mint on Sun Dec 14 07:28:31 2025
    From Newsgroup: alt.os.linux.mint

    On Thu, 4 Dec 2025 11:42:09 +0000, Mike Scott wrote:

    3671730 -rw-r--r-- 1 mike mike 17873 Feb 7 2106 maindata.xml
    3671731 -rw-r--r-- 1 mike mike 17 Feb 7 2106 mimetype

    Hmmm ...

    ldo@theon:~> TZ=UTC date -d "07-Feb-2106" +%s
    4294944000
    ldo@theon:~> TZ=UTC date -d "@4294967295"
    Sun 07 Feb 2106 06:28:15 UTC

    If you look at the full timestamp, down to the nearest second, do you
    see the above date/time?
  • From Mike Scott@usenet.16@scottsonline.org.uk.invalid to alt.os.linux.mint on Sun Dec 14 09:31:35 2025
    From Newsgroup: alt.os.linux.mint

    On 14/12/2025 07:28, Lawrence D'Oliveiro wrote:
    On Thu, 4 Dec 2025 11:42:09 +0000, Mike Scott wrote:

    3671730 -rw-r--r-- 1 mike mike 17873 Feb 7 2106 maindata.xml
    3671731 -rw-r--r-- 1 mike mike 17 Feb 7 2106 mimetype

    Hmmm ...

    ldo@theon:~> TZ=UTC date -d "07-Feb-2106" +%s
    4294944000
    ldo@theon:~> TZ=UTC date -d "@4294967295"
    Sun 07 Feb 2106 06:28:15 UTC

    If you look at the full timestamp, down to the nearest second, do you
    see the above date/time?

    Can't check now, I'm afraid. I scanned the whole drive for other
    anomalies (none found) and reset the file dates to something kosher. But thanks for replying.
    --
    Mike Scott
    Harlow, England