• Hard drive question

    From Felix@none@nowhere.com to alt.os.linux.mint on Fri Jul 25 14:47:51 2025
    From Newsgroup: alt.os.linux.mint


    How does LM treat HD bad sectors? Can it identify and mark them (if any)
    'not for use'? or is there an app that will do it? thanks all,
    --
    Linux Mint 22.1

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Mike Easter@MikeE@ster.invalid to alt.os.linux.mint on Thu Jul 24 23:04:47 2025
    From Newsgroup: alt.os.linux.mint

    Felix wrote:
    How does LM treat HD bad sectors? Can it identify and mark them (if any) 'not for use'? or is there an app that will do it?

Default LM has Disks (gnome-disks), which has a SMART function
that can check and self-test.

    But I think the way to go about dealing w/ sector exclusion on a disk
    you plan to continue to use is to 'start over' and format it, as opposed
    to trying to 'retro' exclude.

There are instructions for using badblocks to list them to a file and
then having fsck use the file to exclude them; but I'm not sure I like
doing it that way instead of the format.

    ------------

Steps to Check and Mark Bad Sectors:

1. Identify the disk:
   Use fdisk -l to list all available disks and their partitions.

2. Unmount the disk:
   Before scanning, unmount the partition you want to check using
   sudo umount /dev/sda1 (replace sda1 with your actual partition).

3. Scan for bad blocks:
   Use badblocks -v /dev/sda1 > badsectors.txt to scan for bad blocks
   and save the results to a file (replace sda1 with your partition).

4. Mark bad sectors:
   Use sudo e2fsck -l badsectors.txt /dev/sda1 to mark the listed
   blocks as unusable on ext2/ext3/ext4 filesystems; other filesystems
   need their own tools for this.

5. Remount the disk:
   After the scan and marking process, remount the disk using
   sudo mount /dev/sda1.

6. Check disk health with SMART:
   Use the Disks application (found in the menu under Accessories) to
   check the disk's SMART data and run self-tests.
    ------------
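The steps above can be sketched as a shell sequence. This is a dry run: the run() helper just prints each command instead of executing it, since the real steps are destructive and device-specific; /dev/sda1 and badsectors.txt are placeholders.

```shell
#!/bin/sh
# Dry-run sketch of the bad-sector procedure above.
# DEV and LIST are placeholders for your actual partition and list file.
DEV=/dev/sda1
LIST=badsectors.txt

# Print each command rather than running it; drop the "run" prefix
# (and the helper) to execute for real.
run() { echo "+ $*"; }

run sudo fdisk -l                          # 1. identify the disk
run sudo umount "$DEV"                     # 2. unmount before scanning
run sudo badblocks -v -o "$LIST" "$DEV"    # 3. save bad blocks to a file
run sudo e2fsck -l "$LIST" "$DEV"          # 4. mark them unusable (ext2/3/4)
run sudo mount "$DEV"                      # 5. remount when done
```

Note badblocks -o badsectors.txt is equivalent to redirecting stdout with > badsectors.txt; the list file format is one block number per line, which is what e2fsck -l expects.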
    --
    Mike Easter
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Lawrence D'Oliveiro@ldo@nz.invalid to alt.os.linux.mint on Fri Jul 25 08:35:24 2025
    From Newsgroup: alt.os.linux.mint

    On Fri, 25 Jul 2025 14:47:51 +1000, Felix wrote:

    How does LM treat HD bad sectors?

    You have to specify a list of them to be excluded from file allocations
    when you initialize a filesystem on the disk. Note the -c option in mke2fs <https://manpages.debian.org/mke2fs(8)>, which tells it to do a scan to
    find these for itself. Or the -l option to get a list of them from a
    separate file.
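As a sketch, those variants look like this. The commands are built as strings rather than executed, since each would destroy the filesystem on the target; /dev/sdb1 is a hypothetical partition.

```shell
#!/bin/sh
DEV=/dev/sdb1   # hypothetical target partition

# -c once does a read-only bad-block scan before formatting;
# -c -c does a slower, destructive read-write test;
# -l takes a pre-made list of bad blocks from a file.
SCAN_RO="sudo mkfs.ext4 -c $DEV"
SCAN_RW="sudo mkfs.ext4 -c -c $DEV"
FROM_LIST="sudo mkfs.ext4 -l badsectors.txt $DEV"

printf '%s\n' "$SCAN_RO" "$SCAN_RW" "$FROM_LIST"
```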

Frankly, I wouldn't like to touch a disk with bad blocks on it. Once
they start happening, they're only likely to get worse.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Paul@nospam@needed.invalid to alt.os.linux.mint on Fri Jul 25 10:01:09 2025
    From Newsgroup: alt.os.linux.mint

    On Fri, 7/25/2025 12:47 AM, Felix wrote:

    How does LM treat HD bad sectors? Can it identify and
    mark them (if any) 'not for use'? or is there an app
    that will do it? thanks all,

    https://askubuntu.com/questions/1127377/mark-ext4-blocks-as-bad-manually-without-trying-to-read-write

    The problem I have with this idea, is later if you buy a
    new hard drive, and you want to clone over the drive (say using
    ddrescue), you would also copy the portion of the file system that declares some blocks bad. When cloning, the badblock information
    is really "private" to that particular drive.

What you have to decide for yourself is how far to push the HDD
before transferring the data to a second drive.

    *******

    The hard drive has automatic sparing, which means if there
    is trouble with a sector, the drive has some spare sectors
    in the immediate area. And a table of spared blocks is
    maintained by the drive, independent of anything the user
    is doing. When the drive is getting low on spare sectors,
    the SMART "Reallocated" statistic raw data field goes non-zero,
    indicating drive life is on the warning track.

    The "smartctl" utility from smartmontools package, can tell
    you how healthy the drive is.

    sudo smartctl -a /dev/sda
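A minimal sketch of pulling the raw Reallocated count out of that output with awk; the sample line below is illustrative, in the usual smartctl ATA attribute-table layout where the raw value is the last field.

```shell
#!/bin/sh
# Extract the raw value of the Reallocated_Sector_Ct attribute
# from `smartctl -a` output; the raw count is the last field.
realloc_count() { awk '/Reallocated_Sector_Ct/ { print $NF }'; }

# Illustrative attribute line as smartctl typically prints it:
sample='  5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0'
echo "$sample" | realloc_count   # prints 0

# Real usage (needs smartmontools installed):
#   sudo smartctl -a /dev/sda | realloc_count
```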

    SMART gives its best warnings, when the drive errors are
    independent of one another, and uniformly spread out. SMART
    gives a less-useful warning, when the drive has a "bad spot",
    as all the spares in the bad spot can be exhausted and yet
    the drive health will be declared as "Good".

    A bad spot in a disk, can be detected (and not all that accurately),
    by benchmark testing the disk with a transfer benchmark. For example,
    one drive I had, there was a 70GB wide area that transferred data
    at 10MB/sec (which is abnormally low). The drive health was listed
    as "Good" which is rubbish, as the drive was obviously not normal
    at that point. I transferred the data off the drive.

    *******

Blocks with problems are maintained in a queue for maintenance activity
when an attempt is made to write the block. The drive checks whether
the write succeeded, decides whether the block needs to be spared,
spares it out if so, and so on. This is all automated and may slow the
drive down a bit while the determination is made.

    If you write the drive surface:

    # Do a backup first, *before* the next command

    smartctl -a /dev/sda # Record health info before run begins.

    sudo dd if=/dev/zero of=/dev/sda bs=221184 # Destructive write test

    Then do some reads:

    sudo dd if=/dev/sda of=/dev/null bs=221184 # Read verify, test will stop if bad block present

sudo ddrescue -f -n /dev/sda /dev/null /root/rescue.log
    # Alternately, ddrescue (from the gddrescue package) can be used
    # to generate a logfile with badblock info. Unlike the previous
    # command, this one should always run to completion.
xed /root/rescue.log   # Inspect the logfile in an editor.

    smartctl -a /dev/sda # Look to see if Reallocated raw data increased by a couple hundred,
    # indicating the questionable blocks have been permanently harvested.
    # The raw data field might have a range of 0..5500 or so, just to give
    # some idea how worried you should be when Reallocated = 300.

    # Restore disk from backup once the harvesting is complete and you are happy.

    But when the Reallocated SMART parameter raw data field goes non-zero,
    it is time to move the data off the disk and onto another disk.
You can punish a drive and use up all the spares in a region,
forcing the drive to declare an actual "CRC error" on a block there;
at that point you need to start using badblocks on EXT4 to manage the
defects and keep the file system from using the now non-functional
blocks. But if you resort to manual badblock management,
the main danger is accidentally transferring the (inaccurate for a
second hard drive) badblock data to a new disk. You are really better
off letting the disks do their own bad block management, with you, the
operator, monitoring SMART Reallocated and watching for "benchmark
bad spots" as indicators that the drive is at end-of-life.

    *******

    The last hard drive I opened, a Seagate, I was shocked at what I found.
    The drive only had about 10,000 hours on it, when taken out of service.
    The Reallocated might have been 300. What did I find ? A single platter,
    which is to be expected on some of your hard drive fleet of course.
    What I didn't expect to find, is there was no landing ramp for the
    heads inside the drive. The head just sits on the platter. I looked it
up, and after the "stiction era" (Quantum Fireball era or so), they
    had found a way to "laser pattern" the area near the platter hub and
    make a "non-stiction area" for the heads to park when the drive
    spins down. While modern lubricants (polymer finish) are fairly
    robust, not having a landing ramp for the head, that is just not a
    best practice, and guarantees if you cycle the power every day
    on the computer, the drive does lots of spinning down and wearing
    the heads as the heads skate over the surface.

    And that's why the drive had lasted only 10,000 hours. It was because
    even though the drives are in the modern era and science had discovered
    the benefits of landing ramps, my drive didn't have a plastic landing ramp.

    And this is just in case you do not understand why you didn't get
    50,000 hours from a HDD. But you only figure things like this out,
    by examining the drive after it reaches end of life, to see whether
    the drive was too cheaply made. I never expected to find such an
    idiotic development, as to be dragging the heads across the platter
    when I opened the drive. I had expected to find dirt or rubbish inside
    the drive, proportional to a surface degradation, but the filter
pack was still lily white and the platter surface was impeccable
    to the eye, yet it had spared out enough blocks to be end-of-life.
    This means I'd need a microscope to find the damage that was
    present on the drive platter.

    *******

    When the first hard drives came out for consumers, I tested them
    in the lab. I took the factory bad block list, and the grown
    defect list, reset them, and had the drive scan for bad blocks.
    What was interesting, is the drive exactly reproduced the same
    defect list as was present in the lists. This is just in case
    you were thinking "oh, those blocks aren't really bad and
    a re-scan would uncover lots of good blocks, if only I could
    reset the automatic sparing system". In my tests, what I discovered
    at the time, is no, resetting any automatic sparing would
    achieve nothing. Still, this is a natural hypothesis for users
    to reach, that if only they could give the automatic sparing
    a whack upside the head, their drive would be "rendered new again".
    It's not true. The drive does make good, high quality determinations
    of its bad blocks. When it tells you a block is bad, it's bad.
    And reproducibly so. These were the first full height 5MB and
    10MB Seagate consumer drives (complete with floppy-like head movement
    and stepper motors for driving the head in and out instead of a
    voice coil).

    Paul



    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Gordon@Gordon@leaf.net.nz to alt.os.linux.mint on Sat Jul 26 00:12:34 2025
    From Newsgroup: alt.os.linux.mint

    On 2025-07-25, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Fri, 25 Jul 2025 14:47:51 +1000, Felix wrote:

    How does LM treat HD bad sectors?

    You have to specify a list of them to be excluded from file allocations
    when you initialize a filesystem on the disk. Note the -c option in mke2fs
    <https://manpages.debian.org/mke2fs(8)>, which tells it to do a scan to
    find these for itself. Or the -l option to get a list of them from a separate file.

Frankly, I wouldn't like to touch a disk with bad blocks on it. Once
they start happening, they're only likely to get worse.

    Agreed. Once a HD starts to fail it is only a matter of time before it will take your data.

The blocks fail on the so-called bathtub curve (Backblaze)

    https://www.backblaze.com/blog/drive-failure-over-time-the-bathtub-curve-is-leaking/

    More at

    https://darwinsdata.com/how-likely-is-a-hdd-to-fail/
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Lawrence D'Oliveiro@ldo@nz.invalid to alt.os.linux.mint on Sat Jul 26 00:55:12 2025
    From Newsgroup: alt.os.linux.mint

    On Fri, 25 Jul 2025 10:01:09 -0400, Paul wrote:

    The problem I have with this idea, is later if you buy a new hard drive,
    and you want to clone over the drive (say using ddrescue), you would
    also copy the portion of the file system that declares some blocks bad.

This is why I always do file-level copies, rather than trying to "clone"
a drive. rsync is a great tool for this.

I think Windows users have this assumption that, for OS installs at
least, you must use some kind of "cloning" utility to transfer them,
otherwise they won't work.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Paul@nospam@needed.invalid to alt.os.linux.mint on Fri Jul 25 22:55:48 2025
    From Newsgroup: alt.os.linux.mint

    On Fri, 7/25/2025 8:55 PM, Lawrence D'Oliveiro wrote:
    On Fri, 25 Jul 2025 10:01:09 -0400, Paul wrote:

    The problem I have with this idea, is later if you buy a new hard drive,
    and you want to clone over the drive (say using ddrescue), you would
    also copy the portion of the file system that declares some blocks bad.

This is why I always do file-level copies, rather than trying to "clone"
a drive. rsync is a great tool for this.

I think Windows users have this assumption that, for OS installs at
least, you must use some kind of "cloning" utility to transfer them,
otherwise they won't work.


    Windows users should "clone", because nobody really understands
    the permission model :-)

    The C: partition has many more features applied to it, than Data
    partitions do.

    Paul
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Felix@none@nowhere.com to alt.os.linux.mint on Sun Jul 27 19:44:10 2025
    From Newsgroup: alt.os.linux.mint

    Mike Easter wrote:
    Felix wrote:
    How does LM treat HD bad sectors? Can it identify and mark them (if
    any) 'not for use'? or is there an app that will do it?

Default LM has Disks (gnome-disks), which has a SMART function
that can check and self-test.

    But I think the way to go about dealing w/ sector exclusion on a disk
    you plan to continue to use is to 'start over' and format it, as
    opposed to trying to 'retro' exclude.

There are instructions for using badblocks to list them to a file and
then having fsck use the file to exclude them; but I'm not sure I like
doing it that way instead of the format.

    so does formatting mark sectors 'not for use'? even 'fast format'?


    ------------

Steps to Check and Mark Bad Sectors:

1. Identify the disk:
   Use fdisk -l to list all available disks and their partitions.

2. Unmount the disk:
   Before scanning, unmount the partition you want to check using
   sudo umount /dev/sda1 (replace sda1 with your actual partition).

3. Scan for bad blocks:
   Use badblocks -v /dev/sda1 > badsectors.txt to scan for bad blocks
   and save the results to a file (replace sda1 with your partition).

4. Mark bad sectors:
   Use sudo e2fsck -l badsectors.txt /dev/sda1 to mark the listed
   blocks as unusable on ext2/ext3/ext4 filesystems; other filesystems
   need their own tools for this.

5. Remount the disk:
   After the scan and marking process, remount the disk using
   sudo mount /dev/sda1.

6. Check disk health with SMART:
   Use the Disks application (found in the menu under Accessories) to
   check the disk's SMART data and run self-tests.
    ------------


    --
    Linux Mint 22.1

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Felix@none@nowhere.com to alt.os.linux.mint on Sun Jul 27 19:59:28 2025
    From Newsgroup: alt.os.linux.mint

    Lawrence D'Oliveiro wrote:
    On Fri, 25 Jul 2025 14:47:51 +1000, Felix wrote:

    How does LM treat HD bad sectors?
    You have to specify a list of them to be excluded from file allocations
    when you initialize a filesystem on the disk. Note the -c option in mke2fs <https://manpages.debian.org/mke2fs(8)>, which tells it to do a scan to
    find these for itself.

    so this is the correct command to create ext4 file system with normal
    cluster size and read/write test?..

    *mkfs.ext4 -c -c*

    Or the -l option to get a list of them from a
    separate file.

Frankly, I wouldn't like to touch a disk with bad blocks on it. Once
they start happening, they're only likely to get worse.

    I want to know if disks are good to use them or not
    --
    Linux Mint 22.1

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Felix@none@nowhere.com to alt.os.linux.mint on Sun Jul 27 20:21:01 2025
    From Newsgroup: alt.os.linux.mint

    Paul wrote:
    On Fri, 7/25/2025 12:47 AM, Felix wrote:
    How does LM treat HD bad sectors? Can it identify and
    mark them (if any) 'not for use'? or is there an app
    that will do it? thanks all,
    https://askubuntu.com/questions/1127377/mark-ext4-blocks-as-bad-manually-without-trying-to-read-write

    The problem I have with this idea, is later if you buy a
    new hard drive, and you want to clone over the drive (say using
    ddrescue), you would also copy the portion of the file system that declares some blocks bad. When cloning, the badblock information
    is really "private" to that particular drive.

What you have to decide for yourself is how far to push the HDD
before transferring the data to a second drive.

    *******

    The hard drive has automatic sparing, which means if there
    is trouble with a sector, the drive has some spare sectors
    in the immediate area.

    that's good. is it solely a Linux thing, or windows also?

    And a table of spared blocks is
    maintained by the drive, independent of anything the user
    is doing. When the drive is getting low on spare sectors,
    the SMART "Reallocated" statistic raw data field goes non-zero,
    indicating drive life is on the warning track.

    The "smartctl" utility from smartmontools package, can tell
    you how healthy the drive is.

    sudo smartctl -a /dev/sda

    how would I specify other than the C drive?


    SMART gives its best warnings, when the drive errors are
    independent of one another, and uniformly spread out. SMART
    gives a less-useful warning, when the drive has a "bad spot",
    as all the spares in the bad spot can be exhausted and yet
    the drive health will be declared as "Good".

    A bad spot in a disk, can be detected (and not all that accurately),
    by benchmark testing the disk with a transfer benchmark. For example,
    one drive I had, there was a 70GB wide area that transferred data
    at 10MB/sec (which is abnormally low). The drive health was listed
    as "Good" which is rubbish, as the drive was obviously not normal
    at that point. I transferred the data off the drive.

    *******

Blocks with problems are maintained in a queue for maintenance activity
when an attempt is made to write the block. The drive checks whether
the write succeeded, decides whether the block needs to be spared,
spares it out if so, and so on. This is all automated and may slow the
drive down a bit while the determination is made.

    so drives self test themselves and avoid writing to bad blocks making
    drive testing somewhat unnecessary?


    If you write the drive surface:

    # Do a backup first, *before* the next command

    smartctl -a /dev/sda # Record health info before run begins.

    sudo dd if=/dev/zero of=/dev/sda bs=221184 # Destructive write test

    Then do some reads:

    sudo dd if=/dev/sda of=/dev/null bs=221184 # Read verify, test will stop if bad block present

sudo ddrescue -f -n /dev/sda /dev/null /root/rescue.log
    # Alternately, ddrescue (from the gddrescue package) can be used
    # to generate a logfile with badblock info. Unlike the previous
    # command, this one should always run to completion.
xed /root/rescue.log   # Inspect the logfile in an editor.

    smartctl -a /dev/sda # Look to see if Reallocated raw data increased by a couple hundred,
    # indicating the questionable blocks have been permanently harvested.
    # The raw data field might have a range of 0..5500 or so, just to give
    # some idea how worried you should be when Reallocated = 300.

    # Restore disk from backup once the harvesting is complete and you are happy.

    But when the Reallocated SMART parameter raw data field goes non-zero,
    it is time to move the data off the disk and onto another disk.
You can punish a drive and use up all the spares in a region,
forcing the drive to declare an actual "CRC error" on a block there;
at that point you need to start using badblocks on EXT4 to manage the
defects and keep the file system from using the now non-functional
blocks. But if you resort to manual badblock management,
the main danger is accidentally transferring the (inaccurate for a
second hard drive) badblock data to a new disk. You are really better
off letting the disks do their own bad block management, with you, the
operator, monitoring SMART Reallocated and watching for "benchmark
bad spots" as indicators that the drive is at end-of-life.

    *******

    The last hard drive I opened, a Seagate, I was shocked at what I found.
    The drive only had about 10,000 hours on it, when taken out of service.
The Reallocated might have been 300. What did I find? A single platter,
which is to be expected on some of your hard drive fleet of course.
    What I didn't expect to find, is there was no landing ramp for the
    heads inside the drive. The head just sits on the platter. I looked it
up, and after the "stiction era" (Quantum Fireball era or so), they
    had found a way to "laser pattern" the area near the platter hub and
    make a "non-stiction area" for the heads to park when the drive
    spins down. While modern lubricants (polymer finish) are fairly
    robust, not having a landing ramp for the head, that is just not a
    best practice, and guarantees if you cycle the power every day
    on the computer, the drive does lots of spinning down and wearing
    the heads as the heads skate over the surface.

    And that's why the drive had lasted only 10,000 hours. It was because
    even though the drives are in the modern era and science had discovered
    the benefits of landing ramps, my drive didn't have a plastic landing ramp.

    And this is just in case you do not understand why you didn't get
    50,000 hours from a HDD. But you only figure things like this out,
    by examining the drive after it reaches end of life, to see whether
    the drive was too cheaply made. I never expected to find such an
    idiotic development, as to be dragging the heads across the platter
    when I opened the drive. I had expected to find dirt or rubbish inside
    the drive, proportional to a surface degradation, but the filter
pack was still lily white and the platter surface was impeccable
    to the eye, yet it had spared out enough blocks to be end-of-life.
    This means I'd need a microscope to find the damage that was
    present on the drive platter.

    *******

    When the first hard drives came out for consumers, I tested them
    in the lab. I took the factory bad block list, and the grown
    defect list, reset them, and had the drive scan for bad blocks.
    What was interesting, is the drive exactly reproduced the same
    defect list as was present in the lists. This is just in case
    you were thinking "oh, those blocks aren't really bad and
    a re-scan would uncover lots of good blocks, if only I could
    reset the automatic sparing system". In my tests, what I discovered
    at the time, is no, resetting any automatic sparing would
    achieve nothing. Still, this is a natural hypothesis for users
    to reach, that if only they could give the automatic sparing
    a whack upside the head, their drive would be "rendered new again".
    It's not true. The drive does make good, high quality determinations
    of its bad blocks. When it tells you a block is bad, it's bad.
    And reproducibly so. These were the first full height 5MB and
    10MB Seagate consumer drives (complete with floppy-like head movement
    and stepper motors for driving the head in and out instead of a
    voice coil).

I remember when a 25MB drive was huge :)


    Paul



    --
    Linux Mint 22.1

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Paul@nospam@needed.invalid to alt.os.linux.mint on Sun Jul 27 07:10:59 2025
    From Newsgroup: alt.os.linux.mint

    On Sun, 7/27/2025 5:44 AM, Felix wrote:


    so does formatting mark sectors 'not for use'? even 'fast format'?

    If you Quick Format a partition, there is only sufficient time
    (ten seconds), to erase the metadata and lay down empty metadata
    structures. There is no time during a Quick Format, for a
    read-verify of the platter surface.

    Reading the surface of a disk drive and certifying it is functional,
    takes hours to do. It is a very slow process.

    That's why we don't do this check very often, it takes a long time
    to do it. It takes longer than the time to do a backup <wink>.

    Paul
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Paul@nospam@needed.invalid to alt.os.linux.mint on Sun Jul 27 07:43:44 2025
    From Newsgroup: alt.os.linux.mint

    On Sun, 7/27/2025 6:21 AM, Felix wrote:
    Paul wrote:

    The hard drive has automatic sparing, which means if there
    is trouble with a sector, the drive has some spare sectors
    in the immediate area.

    that's good. is it solely a Linux thing, or windows also?

    Automatic sparing on ATA drives (SATA or IDE) is a hardware-supported
    activity. It happens no matter what OS is involved, it even
    works for Macintosh computers :-)

sudo smartctl -a /dev/sda

    how would I specify other than the C drive?

    On Linux, the letter on the end is the drive identifier

    Windows Disk0 Linux /dev/sda
    Windows Disk1 Linux /dev/sdb
    Windows Disk2 Linux /dev/sdc

    Windows has Disk Management (diskmgmt.msc), while Linux has "gnome-disks".
    Use the menu in "gnome-disks" to select a particular hard drive like
    /dev/sdc , to show the partitions on it.

    The Linux gparted utility, can also display disks in a format that
    sort of looks like Windows Disk Management.
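The mapping above can be sketched as a tiny helper (illustrative only; it assumes the classic /dev/sdX naming and fewer than 26 drives):

```shell
#!/bin/sh
# Map a Windows disk number (Disk0, Disk1, ...) to the matching
# Linux device name: Disk0 -> /dev/sda, Disk1 -> /dev/sdb, ...
disk_to_dev() {
  # 97 is ASCII 'a'; convert 0 -> a, 1 -> b, ... via an octal escape.
  letter=$(printf "\\$(printf '%03o' $((97 + $1)))")
  echo "/dev/sd$letter"
}

disk_to_dev 0   # /dev/sda
disk_to_dev 2   # /dev/sdc
```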


    so drives self test themselves and avoid writing to bad blocks making drive testing somewhat unnecessary?

    No!

Drives watch a sector while it is being read for symptoms that the
sector needs to be checked. Sectors that no program ever reads can
remain unverified for years and years.

    The SMART short test or the SMART long test, both of those complete too quickly to verify
    the entire surface.

Automatic sparing responds to sectors you are visiting at the moment.
The busier a partition is, the more likely some of its sectors will be
evaluated and spared out if something is wrong with them.

    But if you want to know the state of the entire surface, you run a thorough surface
    scan, which takes hours and uses an application program for the determination.

    Take the following list of activities:

    (1) Read the entire surface
    (2) Write the entire surface
    (3) Read the entire surface

    Now, all the automatic sparing should be up to date.
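That three-pass exercise can be sketched as follows. It is a dry run: the echo prefix keeps the destructive commands from executing, and /dev/sdX and the block size are placeholders; remove the echo only after making a verified backup.

```shell
#!/bin/sh
DEV=/dev/sdX   # placeholder; find the real device with lsblk or fdisk -l
BS=1M

# Dry run: each command is printed, not executed.
echo sudo dd if="$DEV" of=/dev/null bs="$BS"   # (1) read the entire surface
echo sudo dd if=/dev/zero of="$DEV" bs="$BS"   # (2) write the entire surface (destructive!)
echo sudo dd if="$DEV" of=/dev/null bs="$BS"   # (3) read it back
```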


I remember when a 25MB drive was huge :)

    I was able to put two years worth of files, onto a 10MB drive.
    That's how long it took to fill up the drive. The average file
    size back then was 2KB, there were no picture files on the
    disk drive, and the document editor was non-WYSIWYG and used
    "format commands", which meant there was no bloat in formatted
    documents either. It was certainly a different time, in terms
    of what kind of files went onto a disk drive.

    Paul

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Paul@nospam@needed.invalid to alt.os.linux.mint on Sun Jul 27 07:46:36 2025
    From Newsgroup: alt.os.linux.mint

    On Sun, 7/27/2025 5:59 AM, Felix wrote:
    Lawrence D'Oliveiro wrote:
    On Fri, 25 Jul 2025 14:47:51 +1000, Felix wrote:

    How does LM treat HD bad sectors?
    You have to specify a list of them to be excluded from file allocations
when you initialize a filesystem on the disk. Note the -c option in mke2fs
<https://manpages.debian.org/mke2fs(8)>, which tells it to do a scan to
    find these for itself.

    so this is the correct command to create ext4 file system with normal cluster size and read/write test?..

    *mkfs.ext4 -c -c*

Or the -l option to get a list of them from a
    separate file.

Frankly, I wouldn't like to touch a disk with bad blocks on it. Once
they start happening, they're only likely to get worse.

    I want to know if disks are good to use them or not


    You do a surface scan.

    Some tools provide a graphical output as a "map" to give you some
    idea whether there is any pattern to the CRC errors.

    http://www.hdtune.com/errorscan_failed.png

    It can take hours, to finish one of those surface scans.

    Paul
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Felix@none@nowhere.com to alt.os.linux.mint on Sun Jul 27 21:54:12 2025
    From Newsgroup: alt.os.linux.mint

    Gordon wrote:
    On 2025-07-25, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Fri, 25 Jul 2025 14:47:51 +1000, Felix wrote:

    How does LM treat HD bad sectors?
    You have to specify a list of them to be excluded from file allocations
when you initialize a filesystem on the disk. Note the -c option in mke2fs
<https://manpages.debian.org/mke2fs(8)>, which tells it to do a scan to
    find these for itself. Or the -l option to get a list of them from a
    separate file.

Frankly, I wouldn't like to touch a disk with bad blocks on it. Once
they start happening, they're only likely to get worse.
Agreed. Once a HD starts to fail it is only a matter of time before it
will take your data.

The blocks fail on the so-called bathtub curve (Backblaze)

    https://www.backblaze.com/blog/drive-failure-over-time-the-bathtub-curve-is-leaking/

    More at

    https://darwinsdata.com/how-likely-is-a-hdd-to-fail/

    I've never actually had a HD drive fail in use. (touch wood)
    --
    Linux Mint 22.1

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Paul@nospam@needed.invalid to alt.os.linux.mint on Sun Jul 27 08:46:14 2025
    From Newsgroup: alt.os.linux.mint

    On Sun, 7/27/2025 7:54 AM, Felix wrote:
    Gordon wrote:
    On 2025-07-25, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Fri, 25 Jul 2025 14:47:51 +1000, Felix wrote:

    How does LM treat HD bad sectors?
    You have to specify a list of them to be excluded from file allocations
when you initialize a filesystem on the disk. Note the -c option in mke2fs
<https://manpages.debian.org/mke2fs(8)>, which tells it to do a scan to
    find these for itself. Or the -l option to get a list of them from a
    separate file.

Frankly, I wouldn't like to touch a disk with bad blocks on it. Once
they start happening, they're only likely to get worse.
Agreed. Once a HD starts to fail it is only a matter of time before it
will take your data.

The blocks fail on the so-called bathtub curve (Backblaze)

    https://www.backblaze.com/blog/drive-failure-over-time-the-bathtub-curve-is-leaking/

    More at

    https://darwinsdata.com/how-likely-is-a-hdd-to-fail/

    I've never actually had a HD drive fail in use. (touch wood)


    The behavior has changed with time.

    Disk failures used to be fast and catastrophic.
I lost a couple of 40GB Maxtor drives. The second one was alive for a
while; I turned off the power and got some sleep, and the next day it
would not ID itself and start up. I had a relatively small window to
attempt data recovery in that case. I did not succeed; no data got saved.

    That is how nasty drive failures used to be.

Today, when the Reallocated hits 200 or 300, you change the drive
because it is too slow, or because you got an error in a file you
just wrote.

    But, the drive today keeps spinning, and it keeps IDing itself. This
    tells us the critical data block it gets from the drive, continues
    to be in good shape. And it's only various patches of data which
    are in bad shape.

    But at some point, the continued usage of a "sick" drive does not
    make sense. We're not on the Moon or on Mars, we're on Earth and
    we can do something about our sketchy hard drive.

    You should have a healthy primary drive, and you should have a
    healthy backup drive. That represents two copies of data you can
    rely upon. If the first copy is found to be corrupted, you use
    the second copy of the data to correct the situation.
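
    The two-copy principle above can be sketched as a script. This is a
    minimal illustration on throwaway temp directories (the paths are
    placeholders, not anyone's actual setup); on a real system the backup
    would live on a second physical drive, and a tool such as rsync would
    normally replace the plain cp used here for portability.

```shell
#!/bin/sh
# Sketch of the "two healthy copies" principle: mirror a data
# directory to a backup location, then verify the copies match.
# Directories here are temporary stand-ins for demonstration only.
set -eu

primary=$(mktemp -d)   # stand-in for the primary data directory
backup=$(mktemp -d)    # stand-in for the backup drive

echo "important data" > "$primary/notes.txt"

cp -a "$primary/." "$backup/"     # take the second copy

diff -r "$primary" "$backup" && echo "copies match"

rm -rf "$primary" "$backup"
```

    If diff ever reports a difference, you compare against the backup and
    decide which side is the corrupted one before restoring.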

    That is the basic principle of reliable operation. If you are
    using a computer for your livelihood, you're making money by
    using the computer, then you want the reliable operation model
    to apply. That might include purchasing a PC which has working
    ECC on main memory.

    Paul
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Gordon@Gordon@leaf.net.nz to alt.os.linux.mint on Mon Jul 28 04:09:15 2025
    From Newsgroup: alt.os.linux.mint

    On 2025-07-27, Felix <none@nowhere.com> wrote:
    Paul wrote:
    On Fri, 7/25/2025 12:47 AM, Felix wrote:
    How does LM treat HD bad sectors? Can it identify and
    mark them (if any) 'not for use'? or is there an app
    that will do it? thanks all,
    https://askubuntu.com/questions/1127377/mark-ext4-blocks-as-bad-manually-without-trying-to-read-write

    The problem I have with this idea, is later if you buy a
    new hard drive, and you want to clone over the drive (say using
    ddrescue), you would also copy the portion of the file system that declares >> some blocks bad. When cloning, the badblock information
    is really "private" to that particular drive.

    What you have to decide for yourself, is how far to push
    HDD, before transferring the data to a second drive.

    *******

    The hard drive has automatic sparing, which means if there
    is trouble with a sector, the drive has some spare sectors
    in the immediate area.

    that's good. is it solely a Linux thing, or windows also?

    Neither, and both.

    A HD has an interface attached to it, on the backside of it: Integrated
    Drive Electronics, or IDE, in the old days of the parallel ATA interface.

    https://www.techtarget.com/searchstorage/definition/IDE

    This allows the computer to talk to the HD. A keyboard command says Format, and the drive is told to format the disk. The OS is unimportant.



    And a table of spared blocks is
    maintained by the drive, independent of anything the user
    is doing. When the drive is getting low on spare sectors,
    the SMART "Reallocated" statistic raw data field goes non-zero,
    indicating drive life is on the warning track.

    The "smartctl" utility from smartmontools package, can tell
    you how healthy the drive is.

    sudo smartctl -a /dev/sda

    how would I specify other than the C drive?


    SMART gives its best warnings, when the drive errors are
    independent of one another, and uniformly spread out. SMART
    gives a less-useful warning, when the drive has a "bad spot",
    as all the spares in the bad spot can be exhausted and yet
    the drive health will be declared as "Good".

    A bad spot in a disk, can be detected (and not all that accurately),
    by benchmark testing the disk with a transfer benchmark. For example,
    one drive I had, there was a 70GB wide area that transferred data
    at 10MB/sec (which is abnormally low). The drive health was listed
    as "Good" which is rubbish, as the drive was obviously not normal
    at that point. I transferred the data off the drive.
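
    The transfer-benchmark idea can be sketched in shell. This is a rough
    illustration that times fixed-size reads at spaced offsets; it runs
    against a scratch file here so nothing is at risk, but on a real disk
    you would point DEV at the raw device (e.g. /dev/sda, read-only) and
    use much larger chunks. The device name and chunk sizes are assumptions
    for the sketch, not anything specified in the thread.

```shell
#!/bin/sh
# Crude transfer benchmark: read fixed-size chunks at evenly
# spaced offsets and report the time per chunk. A healthy drive
# shows roughly uniform times; a "bad spot" shows up as a band
# of much slower reads.
set -eu

DEV=$(mktemp)                       # scratch file standing in for /dev/sdX
dd if=/dev/zero of="$DEV" bs=1M count=64 2>/dev/null

chunk=8                             # MiB per sample read
for off in 0 16 32 48; do           # sample offsets, in MiB
    start=$(date +%s%N)
    dd if="$DEV" of=/dev/null bs=1M skip=$off count=$chunk 2>/dev/null
    end=$(date +%s%N)
    echo "offset ${off}MiB: $(( (end - start) / 1000000 )) ms"
done
rm -f "$DEV"
```

    On a real drive, a stretch of offsets that reads an order of magnitude
    slower than its neighbours is the "bad spot" Paul describes, even when
    SMART still says "Good".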

    *******

    Blocks with problems, are maintained in a queue for maintenance activity
    when an attempt is made to write the block. The drive will check whether
    the write is working or not, whether the block needs to be spared, it
    spares the block out and so on. This is all automated and may slow the
    drive down a bit while the determination is made.

    so drives self test themselves and avoid writing to bad blocks making
    drive testing somewhat unnecessary?


    If you write the drive surface:

    # Do a backup first, *before* the next command

    smartctl -a /dev/sda # Record health info before run begins.

    sudo dd if=/dev/zero of=/dev/sda bs=221184 # Destructive write test
    Then do some reads:

    sudo dd if=/dev/sda of=/dev/null bs=221184 # Read verify, test will stop if bad block present

    sudo ddrescue -f -n /dev/sda /dev/null /root/rescue.log # Alternately, ddrescue of gddrescue package can be
    xed /root/rescue.log # used to generate a logfile with badblock info.
    # This sequence differs from the previous command
    # in that the command should always finish.

    smartctl -a /dev/sda # Look to see if Reallocated raw data increased by a couple hundred,
    # indicating the questionable blocks have been permanently harvested.
    # The raw data field might have a range of 0..5500 or so, just to give
    # some idea how worried you should be when Reallocated = 300.

    # Restore disk from backup once the harvesting is complete and you are happy.

    But when the Reallocated SMART parameter raw data field goes non-zero,
    it is time to move the data off the disk and onto another disk.
    You can punish a drive, use up all the spares in a region,
    forcing the drive to declare an actual "CRC error" on a block there;
    then you need to start using badblocks for EXT4 to manage the
    defects and keep the file system from using the now non-functional
    inodes. And if you do that, if you resort to manual badblock management,
    the main danger is accidentally transferring the (inaccurate for a second
    hard drive) badblock data to a new disk. You are really better off
    with the disks doing their own bad block management, and you the
    operator, monitoring SMART Reallocated plus watching for "benchmark
    bad spots" as indicators the drive is at end-of-life.
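
    Monitoring the Reallocated raw value can be scripted. The sketch below
    assumes the usual smartmontools attribute-table layout (raw value in
    the tenth column) and runs against a captured sample line, so the
    pipeline itself can be demonstrated without a disk; on a live system
    the input would come from sudo smartctl -A /dev/sda.

```shell
#!/bin/sh
# Sketch: pull the raw Reallocated_Sector_Ct value out of smartctl
# attribute output. The sample line below stands in for
#     sudo smartctl -A /dev/sda
# and follows the standard smartmontools column layout.
set -eu

sample="  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0"

realloc=$(echo "$sample" | awk '/Reallocated_Sector_Ct/ {print $10}')
echo "Reallocated raw value: $realloc"

if [ "$realloc" -gt 0 ]; then
    echo "WARNING: drive has spared sectors; plan to move data off"
else
    echo "no reallocations reported"
fi
```

    A non-zero raw value here is exactly the "time to move the data off"
    signal described above.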

    *******

    The last hard drive I opened, a Seagate, I was shocked at what I found.
    The drive only had about 10,000 hours on it, when taken out of service.
    The Reallocated might have been 300. What did I find ? A single platter,
    which is to be expected on some of your hard drive fleet of course.
    What I didn't expect to find, is there was no landing ramp for the
    heads inside the drive. The head just sits on the platter. I looked it
    up, and after the "stiction era" (quantum fireball era or so), they
    had found a way to "laser pattern" the area near the platter hub and
    make a "non-stiction area" for the heads to park when the drive
    spins down. While modern lubricants (polymer finish) are fairly
    robust, not having a landing ramp for the head, that is just not a
    best practice, and guarantees if you cycle the power every day
    on the computer, the drive does lots of spinning down and wearing
    the heads as the heads skate over the surface.

    And that's why the drive had lasted only 10,000 hours. It was because
    even though the drives are in the modern era and science had discovered
    the benefits of landing ramps, my drive didn't have a plastic landing ramp. >>
    And this is just in case you do not understand why you didn't get
    50,000 hours from a HDD. But you only figure things like this out,
    by examining the drive after it reaches end of life, to see whether
    the drive was too cheaply made. I never expected to find such an
    idiotic development, as to be dragging the heads across the platter
    when I opened the drive. I had expected to find dirt or rubbish inside
    the drive, proportional to a surface degradation, but the filter
    pack was still lily white and the platter surface was impeccable
    to the eye, yet it had spared out enough blocks to be end-of-life.
    This means I'd need a microscope to find the damage that was
    present on the drive platter.

    *******

    When the first hard drives came out for consumers, I tested them
    in the lab. I took the factory bad block list, and the grown
    defect list, reset them, and had the drive scan for bad blocks.
    What was interesting, is the drive exactly reproduced the same
    defect list as was present in the lists. This is just in case
    you were thinking "oh, those blocks aren't really bad and
    a re-scan would uncover lots of good blocks, if only I could
    reset the automatic sparing system". In my tests, what I discovered
    at the time, is no, resetting any automatic sparing would
    achieve nothing. Still, this is a natural hypothesis for users
    to reach, that if only they could give the automatic sparing
    a whack upside the head, their drive would be "rendered new again".
    It's not true. The drive does make good, high quality determinations
    of its bad blocks. When it tells you a block is bad, it's bad.
    And reproducibly so. These were the first full height 5MB and
    10MB Seagate consumer drives (complete with floppy-like head movement
    and stepper motors for driving the head in and out instead of a
    voice coil).

    I remember when a 25MB drive was huge :)


    Paul





    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Gordon@Gordon@leaf.net.nz to alt.os.linux.mint on Mon Jul 28 04:17:53 2025
    From Newsgroup: alt.os.linux.mint

    On 2025-07-27, Felix <none@nowhere.com> wrote:
    Lawrence D'Oliveiro wrote:
    On Fri, 25 Jul 2025 14:47:51 +1000, Felix wrote:

    How does LM treat HD bad sectors?
    You have to specify a list of them to be excluded from file allocations
    when you initialize a filesystem on the disk. Note the -c option in mke2fs >> <https://manpages.debian.org/mke2fs(8)>, which tells it to do a scan to
    find these for itself.

    so this is the correct command to create an ext4 file system with normal cluster size and a read/write test?..

    *mkfs.ext4 -c -c*

    Or the -l option to get a list of them from a
    separate file.

    Frankly, I wouldn't like to touch a disk with bad blocks on it. Once they
    start happening, they're only likely to get worse.

    I want to know if disks are good to use them or not

    There is no gauge which has good, maybe, stuffed on it. All that you can know
    is how many sectors have failed.

    So people who have had the experience of data going poof into thin air
    start considering how much they are willing to risk this happening.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Mike Easter@MikeE@ster.invalid to alt.os.linux.mint on Mon Jul 28 13:15:51 2025
    From Newsgroup: alt.os.linux.mint

    Mike Easter wrote:
    Felix wrote:
    How does LM treat HD bad sectors? Can it identify and mark them (if
    any) 'not for use'? or is there an app that will do it?

    Default LM has Disks which is gnome-disks which has a SMART function
    which can check and self-test.

    Maybe Felix can bring this around to something practical based on what triggered the question in the first place.

    Perhaps he saw (or even heard) some kind of warning from something.

    People write articles or do YTs of what might constitute a warning and
    then go to the SMART business to see if there is something wrong there.
    --
    Mike Easter
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Felix@none@nowhere.com to alt.os.linux.mint on Tue Jul 29 09:55:21 2025
    From Newsgroup: alt.os.linux.mint

    Paul wrote:
    On Sun, 7/27/2025 6:21 AM, Felix wrote:
    Paul wrote:
    The hard drive has automatic sparing, which means if there
    is trouble with a sector, the drive has some spare sectors
    in the immediate area.
    that's good. is it solely a Linux thing, or windows also?
    Automatic sparing on ATA drives (SATA or IDE) is a hardware-supported activity. It happens no matter what OS is involved, it even
    works for Macintosh computers :-)

    sudo smartctl -a /dev/sda
    how would I specify other than the C drive?
    On Linux, the letter on the end is the drive identifier

    Windows Disk0 Linux /dev/sda
    Windows Disk1 Linux /dev/sdb
    Windows Disk2 Linux /dev/sdc

    Windows has Disk Management (diskmgmt.msc), while Linux has "gnome-disks". Use the menu in "gnome-disks" to select a particular hard drive like
    /dev/sdc , to show the partitions on it.

    The Linux gparted utility, can also display disks in a format that
    sort of looks like Windows Disk Management.
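
    From a terminal, lsblk (part of the util-linux package, installed by
    default on Mint) gives a similar overview, which helps when deciding
    whether you want /dev/sda, /dev/sdb, and so on:

```shell
# List whole disks only (-d), to see which /dev/sdX you are dealing with.
lsblk -d -o NAME,SIZE,MODEL

# List disks together with their partitions and mount points.
lsblk -o NAME,SIZE,FSTYPE,MOUNTPOINT
```

    The NAME column is what you hand to smartctl, e.g. sudo smartctl -a /dev/sdb.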

    so drives self test themselves and avoid writing to bad blocks making drive testing somewhat unnecessary?
    No!

    Drives watch a sector that is being read, for symptoms that the sector needs to be checked.
    Sectors which you did not use a computer program to read, can remain unverified for years
    and years.

    The SMART short test or the SMART long test, both of those complete too quickly to verify
    the entire surface.

    Automatic sparing responds to sectors you are visiting at the moment. The more
    busy a partition is, the more likely some of the sectors will be evaluated and
    spared out if something is wrong with them.

    But if you want to know the state of the entire surface, you run a thorough surface
    scan, which takes hours and uses an application program for the determination.

    Take the following list of activities:

    (1) Read the entire surface
    (2) Write the entire surface
    (3) Read the entire surface

    Now, all the automatic sparing should be up to date.
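
    Those three steps can be sketched as dd commands. The sketch below
    substitutes a scratch image file for the real device, since step (2)
    destroys all data; on an actual drive DEV would be /dev/sdX and a full
    backup would come first.

```shell
#!/bin/sh
# Sketch of the (1) read, (2) write, (3) read surface exercise.
# A scratch image file stands in for the drive so the sequence can
# be run harmlessly; on real hardware, step (2) is destructive.
set -eu

DEV=$(mktemp)
dd if=/dev/zero of="$DEV" bs=1M count=16 2>/dev/null   # make the "drive"

dd if="$DEV" of=/dev/null bs=1M 2>/dev/null            # (1) read the surface
dd if=/dev/zero of="$DEV" bs=1M count=16 2>/dev/null   # (2) write the surface (destructive)
dd if="$DEV" of=/dev/null bs=1M 2>/dev/null            # (3) read it back

echo "surface exercise complete"
rm -f "$DEV"
```

    After a pass like this on real hardware, comparing smartctl -a output
    from before and after shows whether the drive harvested any sectors.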

    I remember when a 25MB drive was huge :)
    I was able to put two years worth of files, onto a 10MB drive.
    That's how long it took to fill up the drive. The average file
    size back then was 2KB, there were no picture files on the
    disk drive, and the document editor was non-WYSIWYG and used
    "format commands", which meant there was no bloat in formatted
    documents either. It was certainly a different time, in terms
    of what kind of files went onto a disk drive.

    Paul


    Thanks for that. :) I'm saving some posts here for reference now, given
    how abysmal my memory is. And speaking of things from bygone days.. My
    first PC was the first generation Apple Mac, a second hand one. It had
    only a 3.5" floppy for data storage and loading programs into memory.
    The amount of disk swapping required to do anything on it was insane!
    And a second disk drive cost an arm and a leg, over $1000 in today's
    money. (but you most likely know these things)
    --
    Linux Mint 22.1

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Felix@none@nowhere.com to alt.os.linux.mint on Tue Jul 29 10:02:21 2025
    From Newsgroup: alt.os.linux.mint

    Paul wrote:
    On Sun, 7/27/2025 7:54 AM, Felix wrote:
    Gordon wrote:
    On 2025-07-25, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Fri, 25 Jul 2025 14:47:51 +1000, Felix wrote:

    How does LM treat HD bad sectors?
    You have to specify a list of them to be excluded from file allocations
    when you initialize a filesystem on the disk. Note the -c option in mke2fs
    <https://manpages.debian.org/mke2fs(8)>, which tells it to do a scan to
    find these for itself. Or the -l option to get a list of them from a
    separate file.

    Frankly, I wouldn't like to touch a disk with bad blocks on it. Once they
    start happening, they're only likely to get worse.
    Agreed. Once a HD starts to fail it is only a matter of time before it will
    take your data.

    The blocks fail on the so called bath tub curve (Backblaze)

    https://www.backblaze.com/blog/drive-failure-over-time-the-bathtub-curve-is-leaking/

    More at

    https://darwinsdata.com/how-likely-is-a-hdd-to-fail/
    I've never actually had a HD drive fail in use. (touch wood)

    The behavior has changed with time.

    Disk failures used to be fast and catastrophic.
    I lost a couple 40GB Maxtor drives, and the second one, it was
    alive for a while, I turned off the power and got some sleep,
    and the next day, it would not ID itself and start up. I had a
    relatively small window, to attempt data recovery in that case.
    I did not succeed at it. No data got saved.

    That is how nasty drive failures used to be.

    Today, when the Reallocated count hits 200 or 300, you are changing the
    drive because the drive is too slow, or you are getting an error in a file
    you just wrote.

    But, the drive today keeps spinning, and it keeps IDing itself. This
    tells us the critical data block it gets from the drive, continues
    to be in good shape. And it's only various patches of data which
    are in bad shape.

    But at some point, the continued usage of a "sick" drive does not
    make sense. We're not on the Moon or on Mars, we're on Earth and
    we can do something about our sketchy hard drive.

    You should have a healthy primary drive, and you should have a
    healthy backup drive.

    I do

    That represents two copies of data you can
    rely upon. If the first copy is found to be corrupted, you use
    the second copy of the data to correct the situation.

    That is the basic principle of reliable operation. If you are
    using a computer for your livelihood, you're making money by
    using the computer, then you want the reliable operation model
    to apply. That might include purchasing a PC which has working
    ECC on main memory.

    I build my own PC's


    Paul
    --
    Linux Mint 22.1

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Felix@none@nowhere.com to alt.os.linux.mint on Tue Jul 29 11:11:50 2025
    From Newsgroup: alt.os.linux.mint

    Mike Easter wrote:
    Mike Easter wrote:
    Felix wrote:
    How does LM treat HD bad sectors? Can it identify and mark them (if
    any) 'not for use'? or is there an app that will do it?

    Default LM has Disks which is gnome-disks which has a SMART function
    which can check and self-test.

    Maybe Felix can bring this around to something practical based on what triggered the question in the first place.

    Perhaps he saw (or even heard) some kind of warning from something.

    no, nothing like that. just wanting to learn about LM, and how it does
    things


    People write articles or do YTs of what might constitute a warning and
    then go to the SMART business to see if there is something wrong there.

    --
    Linux Mint 22.1

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Mike Easter@MikeE@ster.invalid to alt.os.linux.mint on Mon Jul 28 18:33:01 2025
    From Newsgroup: alt.os.linux.mint

    Felix wrote:
    Mike Easter wrote:

    Perhaps he saw (or even heard) some kind of warning from something.

    no, nothing like that. just wanting to learn about LM, and how it does things

    Ah, so. Got it.

    I've had a hdd fail. 'All of a sudden' as they say, it wasn't there
    anymore. A combination of conditions, including its age and abundance
    of other devices, caused me to decide not to investigate it. But I
    continued to use the device, disconnecting the power/data to the hdd and employing the USB slots for live linux distro/s.

    Now I'm still using it w/ a SSD more than I did the USBs for Ventoy live
    (and persistent live) linux.
    --
    Mike Easter
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Paul@nospam@needed.invalid to alt.os.linux.mint on Tue Jul 29 08:31:59 2025
    From Newsgroup: alt.os.linux.mint

    On Mon, 7/28/2025 7:55 PM, Felix wrote:


    Thanks for that. :) I'm saving some posts here for reference now, given how abysmal my memory is. And speaking of things from bygone days.. My first PC was the first generation Apple Mac, a second hand one. It had only a 3.5" floppy for data storage and loading programs into memory. The amount of disk swapping required to do anything on it was insane! And a second disk drive cost an arm and a leg, over $1000 in today's money. (but you most likely know these things)


    We did these things too, just a slightly different era.

    The one floppy machines would drive you nuts with "Insert disc #1",
    "Insert disc #2", and after a while you're not even paying
    attention to the prompts, you're just flipping flipping flipping
    the things. It was pretty nutty. I think my machine had 128KB of RAM,
    which is why the buffer size for floppy transfers was so low. That's
    why there was so much flipping.

    You did not look forward to "cloning" an 8 inch floppy diskette,
    when it took that much media flipping.

    We had a few machines with dual floppies, and those were considered
    the "Cadillac" machines. You could copy from one drive to the other,
    sit back and sip your coffee while it happened.

    Yes, everything was expensive back then. Unreasonably so.

    Paul
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From pinnerite@pinnerite@gmail.com to alt.os.linux.mint on Tue Jul 29 17:06:44 2025
    From Newsgroup: alt.os.linux.mint

    On Fri, 25 Jul 2025 10:01:09 -0400
    Paul <nospam@needed.invalid> wrote:

    On Fri, 7/25/2025 12:47 AM, Felix wrote:

    How does LM treat HD bad sectors? Can it identify and
    mark them (if any) 'not for use'? or is there an app
    that will do it? thanks all,

    https://askubuntu.com/questions/1127377/mark-ext4-blocks-as-bad-manually-without-trying-to-read-write

    The problem I have with this idea, is later if you buy a
    new hard drive, and you want to clone over the drive (say using
    ddrescue), you would also copy the portion of the file system that declares some blocks bad. When cloning, the badblock information
    is really "private" to that particular drive.

    What you have to decide for yourself, is how far to push
    HDD, before transferring the data to a second drive.

    *******

    The hard drive has automatic sparing, which means if there
    is trouble with a sector, the drive has some spare sectors
    in the immediate area. And a table of spared blocks is
    maintained by the drive, independent of anything the user
    is doing. When the drive is getting low on spare sectors,
    the SMART "Reallocated" statistic raw data field goes non-zero,
    indicating drive life is on the warning track.

    The "smartctl" utility from smartmontools package, can tell
    you how healthy the drive is.

    sudo smartctl -a /dev/sda

    SMART gives its best warnings, when the drive errors are
    independent of one another, and uniformly spread out. SMART
    gives a less-useful warning, when the drive has a "bad spot",
    as all the spares in the bad spot can be exhausted and yet
    the drive health will be declared as "Good".

    A bad spot in a disk, can be detected (and not all that accurately),
    by benchmark testing the disk with a transfer benchmark. For example,
    one drive I had, there was a 70GB wide area that transferred data
    at 10MB/sec (which is abnormally low). The drive health was listed
    as "Good" which is rubbish, as the drive was obviously not normal
    at that point. I transferred the data off the drive.

    *******

    Blocks with problems, are maintained in a queue for maintenance activity
    when an attempt is made to write the block. The drive will check whether
    the write is working or not, whether the block needs to be spared, it
    spares the block out and so on. This is all automated and may slow the
    drive down a bit while the determination is made.

    If you write the drive surface:

    # Do a backup first, *before* the next command

    smartctl -a /dev/sda # Record health info before run begins.

    sudo dd if=/dev/zero of=/dev/sda bs=221184 # Destructive write test

    Then do some reads:

    sudo dd if=/dev/sda of=/dev/null bs=221184 # Read verify, test will stop if bad block present

    sudo ddrescue -f -n /dev/sda /dev/null /root/rescue.log # Alternately, ddrescue of gddrescue package can be
    xed /root/rescue.log # used to generate a logfile with badblock info.
    # This sequence differs from the previous command
    # in that the command should always finish.

    smartctl -a /dev/sda # Look to see if Reallocated raw data increased by a couple hundred,
    # indicating the questionable blocks have been permanently harvested.
    # The raw data field might have a range of 0..5500 or so, just to give
    # some idea how worried you should be when Reallocated = 300.

    # Restore disk from backup once the harvesting is complete and you are happy.

    But when the Reallocated SMART parameter raw data field goes non-zero,
    it is time to move the data off the disk and onto another disk.
    You can punish a drive, use up all the spares in a region,
    forcing the drive to declare an actual "CRC error" on a block there;
    then you need to start using badblocks for EXT4 to manage the
    defects and keep the file system from using the now non-functional
    inodes. And if you do that, if you resort to manual badblock management,
    the main danger is accidentally transferring the (inaccurate for a second hard drive) badblock data to a new disk. You are really better off
    with the disks doing their own bad block management, and you the
    operator, monitoring SMART Reallocated plus watching for "benchmark
    bad spots" as indicators the drive is at end-of-life.

    *******

    The last hard drive I opened, a Seagate, I was shocked at what I found.
    The drive only had about 10,000 hours on it, when taken out of service.
    The Reallocated might have been 300. What did I find ? A single platter, which is to be expected on some of your hard drive fleet of course.
    What I didn't expect to find, is there was no landing ramp for the
    heads inside the drive. The head just sits on the platter. I looked it
    up, and after the "stiction era" (quantum fireball era or so), they
    had found a way to "laser pattern" the area near the platter hub and
    make a "non-stiction area" for the heads to park when the drive
    spins down. While modern lubricants (polymer finish) are fairly
    robust, not having a landing ramp for the head, that is just not a
    best practice, and guarantees if you cycle the power every day
    on the computer, the drive does lots of spinning down and wearing
    the heads as the heads skate over the surface.

    And that's why the drive had lasted only 10,000 hours. It was because
    even though the drives are in the modern era and science had discovered
    the benefits of landing ramps, my drive didn't have a plastic landing ramp.

    And this is just in case you do not understand why you didn't get
    50,000 hours from a HDD. But you only figure things like this out,
    by examining the drive after it reaches end of life, to see whether
    the drive was too cheaply made. I never expected to find such an
    idiotic development, as to be dragging the heads across the platter
    when I opened the drive. I had expected to find dirt or rubbish inside
    the drive, proportional to a surface degradation, but the filter
    pack was still lily white and the platter surface was impeccable
    to the eye, yet it had spared out enough blocks to be end-of-life.
    This means I'd need a microscope to find the damage that was
    present on the drive platter.

    *******

    When the first hard drives came out for consumers, I tested them
    in the lab. I took the factory bad block list, and the grown
    defect list, reset them, and had the drive scan for bad blocks.
    What was interesting, is the drive exactly reproduced the same
    defect list as was present in the lists. This is just in case
    you were thinking "oh, those blocks aren't really bad and
    a re-scan would uncover lots of good blocks, if only I could
    reset the automatic sparing system". In my tests, what I discovered
    at the time, is no, resetting any automatic sparing would
    achieve nothing. Still, this is a natural hypothesis for users
    to reach, that if only they could give the automatic sparing
    a whack upside the head, their drive would be "rendered new again".
    It's not true. The drive does make good, high quality determinations
    of its bad blocks. When it tells you a block is bad, it's bad.
    And reproducibly so. These were the first full height 5MB and
    10MB Seagate consumer drives (complete with floppy-like head movement
    and stepper motors for driving the head in and out instead of a
    voice coil).

    Paul


    Around 20 years ago, I found that using:

    # fsck -y -C -V /dev/sda(x)

    the (x) represents a partition number, for example.

    would either solve drive error problems, sometimes permanently, often
    only for a short time, indicating the drive was on its way out.

    Alternatively to check for bad blocks I used:

    # /sbin/dumpe2fs /dev/sda(x) | grep 'Block size'

    which returned: Block size: 4096
    and then ran:

    #badblocks -b 4096 -s -v /dev/sda(x)
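
    From there, the badblocks list can be handed straight to mke2fs with
    -l, as mentioned earlier in the thread. Below is a sketch on a scratch
    image file, so no real disk or partition is touched; it assumes the
    e2fsprogs package (standard on Mint) provides badblocks, mkfs.ext4 and
    dumpe2fs.

```shell
#!/bin/sh
# Sketch: scan a scratch image with badblocks, then create an ext4
# filesystem on it with the listed blocks excluded via -l. On a real
# disk the image path would be /dev/sda(x) and the scan would take hours.
set -eu

img=$(mktemp)
dd if=/dev/zero of="$img" bs=4096 count=4096 2>/dev/null   # 16 MiB image

badblocks -b 4096 -s -v "$img" > badsectors.txt            # read-only scan
mkfs.ext4 -q -F -b 4096 -l badsectors.txt "$img"           # exclude listed blocks

dumpe2fs -b "$img"          # print the filesystem's bad-block list (empty here)
rm -f "$img" badsectors.txt
```

    As Paul notes elsewhere in the thread, this list is private to the one
    drive it was scanned on, so it must not be cloned onto a replacement disk.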

    To copy a partition converting bad blocks to null bytes, I recorded two commands but I cannot recall ever using them myself:

    # dd if=/dev/sd(source) of=/dev/sd(target) conv=noerror,sync

    or

    # ddrescue -d -v /dev/sd(source) /dev/sd(target) /home/<your_name>/logfile

    I hope that's helpful. No guarantees

    Regards, Alan

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Lawrence D'Oliveiro@ldo@nz.invalid to alt.os.linux.mint on Fri Aug 8 04:02:10 2025
    From Newsgroup: alt.os.linux.mint

    On Sun, 27 Jul 2025 21:54:12 +1000, Felix wrote:

    I've never actually had a HD drive fail in use. (touch wood)

    I have had several failures, in about the last 3-4 decades that I have had
    a computer at home. I learned not to trust hard drives (or floppy disks,
    for that matter) even before that.

    I have also seen clients suffer various failures. There was one time we powered down a machine to replace a drive that was showing signs of
    dodginess, powered it up again ... only to discover that another,
    different drive would no longer start up.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Lawrence D'Oliveiro@ldo@nz.invalid to alt.os.linux.mint on Fri Aug 8 04:05:33 2025
    From Newsgroup: alt.os.linux.mint

    On Sun, 27 Jul 2025 07:43:44 -0400, Paul wrote:

    On Linux, the letter on the end is the drive identifier

    Windows Disk0 Linux /dev/sda
    Windows Disk1 Linux /dev/sdb
    Windows Disk2 Linux /dev/sdc

    Note also that on *nix systems, the device name is not used to access the files on the volume. Instead, you mount the volume in an empty directory called the "mount point", and the contents of that volume become visible in that directory. This allows you to have more meaningful names for your different volumes while they are online.

    This contrasts with DOS/Windows, where "drive letters" like A, B, C etc are used both to refer to the device and to the mounted volume. This kind
    of scheme does not scale.
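
    The mount-point model can be inspected directly with findmnt (from
    util-linux), which shows which device is mounted where:

```shell
# Show which device backs the root filesystem, its type, and where
# it is mounted.
findmnt -no SOURCE,FSTYPE,TARGET /

# Show the whole mount tree, one line per mounted volume.
findmnt
```

    The same device can even appear at several mount points at once,
    something drive letters cannot express.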
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Paul@nospam@needed.invalid to alt.os.linux.mint on Fri Aug 8 01:09:57 2025
    From Newsgroup: alt.os.linux.mint

    On Fri, 8/8/2025 12:05 AM, Lawrence D'Oliveiro wrote:
    On Sun, 27 Jul 2025 07:43:44 -0400, Paul wrote:

    On Linux, the letter on the end is the drive identifier

    Windows Disk0 Linux /dev/sda
    Windows Disk1 Linux /dev/sdb
    Windows Disk2 Linux /dev/sdc

    Note also that on *nix systems, the device name is not used to access the files on the volume. Instead, you mount the volume in an empty directory called the "mount point", and the contents of that volume become visible in that directory. This allows you to have more meaningful names for your different volumes while they are online.

    This contrasts with DOS/Windows, where "drive letters" like A, B, C etc are used both to refer to the device and to the mounted volume. This kind
    of scheme does not scale.


    https://learn.microsoft.com/en-us/dotnet/standard/io/file-path-formats

    In addition to identifying a drive by its drive letter,
    you can identify a volume by using its volume GUID. This takes the form:

    \\.\Volume{b75e2c83-0000-0000-0000-602f00000000}\Test\Foo.txt
    \\?\Volume{b75e2c83-0000-0000-0000-602f00000000}\Test\Foo.txt

    I believe it is also possible to call CHKDSK with an identifier
    similar to that, instead of C: or D:. This can be helpful in cases
    where a drive does not have a letter. Note that, while the "disktype"
    utility returns a "partition GUID", these mountvol identifiers
    seem to be different, and I don't know where they come from.

    chkdsk "\\?\Volume{eb38d03c-29ed-11e2-be65-806e6f6e6963}"

    For example, you can do:

    mountvol

    \\?\Volume{0bd6166a-0836-4041-891c-792df2c72abd}\
    C:\

    \\?\Volume{c3bc5ab1-c5f0-4dae-838c-751ef868e237}\ <=== This is a hidden partition, equivalent to type 0x27
    *** NO MOUNT POINTS ***

    Notice, in this example, the trailing backslash should be removed.

    chkdsk \\?\Volume{c3bc5ab1-c5f0-4dae-838c-751ef868e237}

    The type of the file system is NTFS.

    WARNING! /F parameter not specified.
    Running CHKDSK in read-only mode.

    Stage 1: Examining basic file system structure ...
    ...
    Total duration: 16.57 milliseconds (16 ms).
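    As a sketch of the volume-GUID syntax itself (a hypothetical Python helper, not part of any Windows API), the GUID can be picked out of a `\\?\Volume{...}` or `\\.\Volume{...}` path like the ones above:

    ```python
    import re

    # Matches the volume-GUID prefix of paths such as
    #   \\?\Volume{b75e2c83-0000-0000-0000-602f00000000}\Test\Foo.txt
    VOLUME_RE = re.compile(
        r"^\\\\[.?]\\Volume\{"
        r"([0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})\}"
    )

    def volume_guid(path: str):
        """Return the volume GUID if the path uses volume-GUID syntax,
        otherwise None (e.g. for an ordinary drive-letter path)."""
        m = VOLUME_RE.match(path)
        return m.group(1) if m else None

    print(volume_guid(r"\\?\Volume{b75e2c83-0000-0000-0000-602f00000000}\Test\Foo.txt"))
    # b75e2c83-0000-0000-0000-602f00000000
    print(volume_guid(r"C:\Test\Foo.txt"))
    # None
    ```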

    Paul
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Felix@none@nowhere.com to alt.os.linux.mint on Fri Aug 8 16:16:29 2025
    From Newsgroup: alt.os.linux.mint

    Lawrence D'Oliveiro wrote:
    On Sun, 27 Jul 2025 21:54:12 +1000, Felix wrote:

    I've never actually had a HD drive fail in use. (touch wood)
    I have had several failures, in about the last 3-4 decades that I have had
    a computer at home. I learned not to trust hard drives (or floppy disks,
    for that matter) even before that.

    I have also seen clients suffer various failures. There was one time we powered down a machine to replace a drive that was showing signs of dodginess, powered it up again ... only to discover that another,
    different drive would no longer start up.

    actually I accidentally killed a drive once with ESD, but I was
    reluctant to mention that. :) I suppose the data would have been intact
    if I had just replaced the PCB.
    --
    Linux Mint 22.1

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Lawrence D'Oliveiro@ldo@nz.invalid to alt.os.linux.mint on Sat Aug 9 03:42:02 2025
    From Newsgroup: alt.os.linux.mint

    On Fri, 8 Aug 2025 01:09:57 -0400, Paul wrote:

    On Fri, 8/8/2025 12:05 AM, Lawrence D'Oliveiro wrote:

    This contrasts with DOS/Windows, where “drive letters” like A, B, C etc are used both to refer to the device and to the mounted volume. This kind of scheme does not scale.

    https://learn.microsoft.com/en-us/dotnet/standard/io/file-path-formats

    In addition to identifying a drive by its drive letter,
    you can identify a volume by using its volume GUID. This takes the form:

    \\.\Volume{b75e2c83-0000-0000-0000-602f00000000}\Test\Foo.txt
    \\?\Volume{b75e2c83-0000-0000-0000-602f00000000}\Test\Foo.txt

    I believe it is also possible to call CHKDSK with an identifier
    similar to that, instead of C: or D: .

    Only CHKDSK? Not other Windows software?

    And don't forget the nonsense over “reserved” Windows file names ...
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Paul@nospam@needed.invalid to alt.os.linux.mint on Sat Aug 9 00:06:03 2025
    From Newsgroup: alt.os.linux.mint

    On Fri, 8/8/2025 11:42 PM, Lawrence D'Oliveiro wrote:
    On Fri, 8 Aug 2025 01:09:57 -0400, Paul wrote:

    On Fri, 8/8/2025 12:05 AM, Lawrence D'Oliveiro wrote:

    This contrasts with DOS/Windows, where “drive letters” like A, B, C etc
    are used both to refer to the device and to the mounted volume. This kind of scheme does not scale.

    https://learn.microsoft.com/en-us/dotnet/standard/io/file-path-formats

    In addition to identifying a drive by its drive letter,
    you can identify a volume by using its volume GUID. This takes the form:
    \\.\Volume{b75e2c83-0000-0000-0000-602f00000000}\Test\Foo.txt
    \\?\Volume{b75e2c83-0000-0000-0000-602f00000000}\Test\Foo.txt

    I believe it is also possible to call CHKDSK with an identifier
    similar to that, instead of C: or D: .

    Only CHKDSK? Not other Windows software?

    And don't forget the nonsense over “reserved” Windows file names ...


    type "\\?\Volume{c3bc5ab1-c5f0-4dae-838c-751ef868e237}\Recovery\Logs\Reload.xml"

    <?xml version='1.0' encoding='utf-8'?>

    <WindowsRE version="2.0">
    <WinreBCD id="{2329e315-3b91-11f0-99b1-df5f4917c70d}"/>
    ...
    </WindowsRE>

    I can dump the contents of a text file, on a hidden NTFS.
    The "type" command is similar to "cat".

    But to do that, I had to use "testdisk.exe" to be able to list
    the partition and identify some candidate file and folder names.

    And the identifier used is not the same identifier that
    "disktype" prints out, either.

    Whether this is useful is another question entirely, but at least
    I can see the potential to read one file from each partition of
    a max-partition GPT disk (128 partitions).

    Paul
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Lawrence D'Oliveiro@ldo@nz.invalid to alt.os.linux.mint on Sun Aug 10 01:46:59 2025
    From Newsgroup: alt.os.linux.mint

    On Sat, 9 Aug 2025 00:06:03 -0400, Paul wrote:

    And the identifier used, is not the same identifier that
    "disktype" prints out, either.

    Consistency, Microsoft has heard of it!
    --- Synchronet 3.21a-Linux NewsLink 1.2