How does LM treat HD bad sectors? Can it identify and
mark them (if any) 'not for use'? or is there an app
that will do it? thanks all,
On Fri, 25 Jul 2025 14:47:51 +1000, Felix wrote:
How does LM treat HD bad sectors?
You have to specify a list of them to be excluded from file allocations
when you initialize a filesystem on the disk. Note the -c option in mke2fs
<https://manpages.debian.org/mke2fs(8)>, which tells it to do a scan to
find these for itself. Or the -l option to get a list of them from a separate file.
Frankly, I wouldn't like to touch a disk with bad blocks on it. Once they
start happening, they're only likely to get worse.
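For reference, a minimal sketch of both options (/dev/sdb1 is a placeholder,
and mke2fs destroys any existing filesystem on the target, so check the
device name carefully before running anything like this):
# -c once does a read-only badblocks scan while creating the filesystem;
# -c given twice does a slower, destructive read-write test instead.
sudo mke2fs -t ext4 -c /dev/sdb1
# Or scan separately and feed the resulting list in with -l:
sudo badblocks -sv /dev/sdb1 > badblocks.txt
sudo mke2fs -t ext4 -l badblocks.txt /dev/sdb1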
On Fri, 25 Jul 2025 10:01:09 -0400, Paul wrote:
The problem I have with this idea, is later if you buy a new hard drive,
and you want to clone over the drive (say using ddrescue), you would
also copy the portion of the file system that declares some blocks bad.
This is why I always do file-level copies, rather than trying to "clone" a
drive. rsync is a great tool for this.
I think Windows users have this assumption that, for OS installs at least,
you must use some kind of "cloning" utility to transfer them, otherwise
they won't work.
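For example, a sketch of an rsync invocation for that kind of file-level
copy (the mount points are placeholders; -a preserves permissions, ownership
and timestamps, -H hard links, -A ACLs, -X extended attributes, and -x keeps
it from crossing into other filesystems):
sudo rsync -aHAXx /mnt/olddisk/ /mnt/newdisk/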
Felix wrote:
How does LM treat HD bad sectors? Can it identify and mark them (if
any) 'not for use'? or is there an app that will do it?
Default LM has Disks which is gnome-disks which has a SMART function
which can check and self-test.
But I think the way to go about dealing w/ sector exclusion on a disk
you plan to continue to use is to 'start over' and format it, as
opposed to trying to 'retro' exclude.
There are instructions for using badblocks to list them to a file and
then having fsck use that file to exclude them; but I don't know that I
like that approach, compared with doing it at format time.
------------
Steps to Check and Mark Bad Sectors:
1. Identify the disk: use fdisk -l to list all available disks and their
partitions.
2. Unmount the disk: before scanning, unmount the partition you want to
check using sudo umount /dev/sda1 (replace sda1 with your actual partition).
3. Scan for bad blocks: use sudo badblocks -v /dev/sda1 > badsectors.txt
to scan for bad blocks and save the results to a file (replace sda1 with
your partition).
4. Mark bad sectors: use sudo e2fsck -l badsectors.txt /dev/sda1 to mark
the bad sectors as unusable (this applies to ext2/ext3/ext4 filesystems;
most other filesystems have no equivalent option).
5. Remount the disk: after the scan and marking process, remount the
partition using sudo mount /dev/sda1.
6. Check disk health with SMART: use the Disks application (found in the
menu under Accessories) to check the disk's SMART data and run self-tests.
------------
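A minimal sketch of steps 2-4 combined (the partition name /dev/sda1 is a
placeholder; note that badblocks reports 1024-byte blocks by default, so it
should be told the filesystem's block size or e2fsck will misread the list):
sudo umount /dev/sda1
# Find the filesystem block size (commonly 4096 for ext4)
sudo tune2fs -l /dev/sda1 | grep 'Block size'
# Read-only scan, reporting block numbers in filesystem-sized units
sudo badblocks -b 4096 -sv /dev/sda1 > badsectors.txt
# Record the listed blocks in the filesystem's bad-blocks inode
sudo e2fsck -l badsectors.txt /dev/sda1
sudo mount /dev/sda1 /mnt    # /mnt is an example mount point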
On Fri, 7/25/2025 12:47 AM, Felix wrote:
How does LM treat HD bad sectors? Can it identify and
mark them (if any) 'not for use'? or is there an app
that will do it? thanks all,
https://askubuntu.com/questions/1127377/mark-ext4-blocks-as-bad-manually-without-trying-to-read-write
The problem I have with this idea, is later if you buy a
new hard drive, and you want to clone over the drive (say using
ddrescue), you would also copy the portion of the file system that declares some blocks bad. When cloning, the badblock information
is really "private" to that particular drive.
What you have to decide for yourself is how far to push the
HDD before transferring the data to a second drive.
*******
The hard drive has automatic sparing, which means if there
is trouble with a sector, the drive has some spare sectors
in the immediate area.
And a table of spared blocks is
maintained by the drive, independent of anything the user
is doing. When the drive is getting low on spare sectors,
the SMART "Reallocated" statistic raw data field goes non-zero,
indicating drive life is on the warning track.
The "smartctl" utility from smartmontools package, can tell
you how healthy the drive is.
sudo smartctl -a /dev/sda
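For example, to pull out just the sparing statistic (a sketch; attribute
naming varies a little by vendor, but Reallocated_Sector_Ct is the usual
name on ATA drives):
sudo smartctl -A /dev/sda | grep -i reallocated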
SMART gives its best warnings, when the drive errors are
independent of one another, and uniformly spread out. SMART
gives a less-useful warning, when the drive has a "bad spot",
as all the spares in the bad spot can be exhausted and yet
the drive health will be declared as "Good".
A bad spot on a disk can be detected (though not all that accurately)
by benchmark testing the disk with a transfer benchmark. For example,
on one drive I had, there was a 70GB wide area that transferred data
at 10MB/sec (which is abnormally low). The drive health was listed
as "Good" which is rubbish, as the drive was obviously not normal
at that point. I transferred the data off the drive.
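One crude way to probe for such a bad spot (a sketch; the device name and
offsets are placeholders, and the read is non-destructive) is to sample the
sequential read rate at a few positions and compare:
# Read 1 GiB starting about 100 GiB into the disk; note the MB/s figure,
# then repeat with different skip= values and compare the rates.
sudo dd if=/dev/sda of=/dev/null bs=1M skip=102400 count=1024 status=progress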
*******
Blocks with problems are maintained in a queue for maintenance activity
when an attempt is made to write the block. The drive will check whether
the write is working or not and whether the block needs to be spared;
it spares the block out if needed, and so on. This is all automated and
may slow the drive down a bit while the determination is made.
If you write the drive surface:
# Do a backup first, *before* the next command
smartctl -a /dev/sda # Record health info before run begins.
sudo dd if=/dev/zero of=/dev/sda bs=221184 # Destructive write test
Then do some reads:
sudo dd if=/dev/sda of=/dev/null bs=221184 # Read verify, test will stop if bad block present
sudo ddrescue -f -n /dev/sda /dev/null /root/rescue.log # Alternately, ddrescue of gddrescue package can be
xed /root/rescue.log # used to generate a logfile with badblock info.
# This sequence differs from the previous command
# in that the command should always finish.
smartctl -a /dev/sda # Look to see if Reallocated raw data increased by a couple hundred,
# indicating the questionable blocks have been permanently harvested.
# The raw data field might have a range of 0..5500 or so, just to give
# some idea how worried you should be when Reallocated = 300.
# Restore disk from backup once the harvesting is complete and you are happy.
But when the Reallocated SMART parameter raw data field goes non-zero,
it is time to move the data off the disk and onto another disk.
While you can punish a drive, using up all the spares in a region and
forcing the drive to declare an actual "CRC error" on a block there,
you then need to start using badblocks for EXT4 to manage the
defects and keep the file system from using the now non-functional
blocks. And if you do that, if you resort to manual badblock management,
the main danger is accidentally transferring the badblock data
(inaccurate for a second hard drive) to a new disk. You are really better off
with the disks doing their own bad block management, and you the
operator, monitoring SMART Reallocated plus watching for "benchmark
bad spots" as indicators the drive is at end-of-life.
*******
The last hard drive I opened, a Seagate, I was shocked at what I found.
The drive only had about 10,000 hours on it, when taken out of service.
The Reallocated might have been 300. What did I find? A single platter,
which is to be expected on some of your hard drive fleet, of course.
What I didn't expect to find, is there was no landing ramp for the
heads inside the drive. The head just sits on the platter. I looked it
up, and after the "stiction era" (Quantum Fireball era or so), they
had found a way to "laser pattern" the area near the platter hub and
make a "non-stiction area" for the heads to park when the drive
spins down. While modern lubricants (polymer finish) are fairly
robust, not having a landing ramp for the head is just not
best practice, and it guarantees that if you cycle the power every day
on the computer, the drive does lots of spinning down, wearing
the heads as they skate over the surface.
And that's why the drive had lasted only 10,000 hours: even though
the drives are in the modern era and science had discovered
the benefits of landing ramps, my drive didn't have a plastic landing ramp.
And this is just in case you do not understand why you didn't get
50,000 hours from a HDD. But you only figure things like this out,
by examining the drive after it reaches end of life, to see whether
the drive was too cheaply made. I never expected, when I opened the
drive, to find such an idiotic design as dragging the heads across
the platter. I had expected to find dirt or rubbish inside
the drive, proportional to the surface degradation, but the filter
pack was still lily white and the platter surface was impeccable
to the eye, yet it had spared out enough blocks to be end-of-life.
This means I'd need a microscope to find the damage that was
present on the drive platter.
*******
When the first hard drives came out for consumers, I tested them
in the lab. I took the factory bad block list, and the grown
defect list, reset them, and had the drive scan for bad blocks.
What was interesting is that the drive exactly reproduced the same
defect list as was present in the lists. This is just in case
you were thinking "oh, those blocks aren't really bad and
a re-scan would uncover lots of good blocks, if only I could
reset the automatic sparing system". In my tests, what I discovered
at the time, is no, resetting any automatic sparing would
achieve nothing. Still, this is a natural hypothesis for users
to reach, that if only they could give the automatic sparing
a whack upside the head, their drive would be "rendered new again".
It's not true. The drive does make good, high quality determinations
of its bad blocks. When it tells you a block is bad, it's bad.
And reproducibly so. These were the first full height 5MB and
10MB Seagate consumer drives (complete with floppy-like head movement
and stepper motors for driving the head in and out instead of a
voice coil).
Paul
so does formatting mark sectors 'not for use'? even 'fast format'?
Paul wrote:
The hard drive has automatic sparing, which means if there
is trouble with a sector, the drive has some spare sectors
in the immediate area.
that's good. is it solely a Linux thing, or windows also?
    sudo smartctl -a /dev/sda
how would I specify other than the C drive?
so drives self test themselves and avoid writing to bad blocks making drive testing somewhat unnecessary?
I remember when a 25MB drive was huge :)
Lawrence D'Oliveiro wrote:
On Fri, 25 Jul 2025 14:47:51 +1000, Felix wrote:
How does LM treat HD bad sectors?
You have to specify a list of them to be excluded from file allocations
when you initialize a filesystem on the disk. Note the -c option in mke2fs
<https://manpages.debian.org/mke2fs(8)>, which tells it to do a scan to
find these for itself.
so this is the correct command to create an ext4 file system with normal
cluster size and a read/write test?..
*mkfs.ext4 -c -c*
Or the -l option to get a list of them from a
separate file.
Frankly, I wouldn't like to touch a disk with bad blocks on it. Once they
start happening, they're only likely to get worse.
I want to know if disks are good to use or not
On 2025-07-25, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
On Fri, 25 Jul 2025 14:47:51 +1000, Felix wrote:
How does LM treat HD bad sectors?
You have to specify a list of them to be excluded from file allocations
when you initialize a filesystem on the disk. Note the -c option in mke2fs
<https://manpages.debian.org/mke2fs(8)>, which tells it to do a scan to
find these for itself. Or the -l option to get a list of them from a
separate file.
Frankly, I wouldn't like to touch a disk with bad blocks on it. Once they
start happening, they're only likely to get worse.
Agreed. Once a HD starts to fail it is only a matter of time before it will
take your data.
The blocks fail on the so called bath tub curve (Backblaze)
https://www.backblaze.com/blog/drive-failure-over-time-the-bathtub-curve-is-leaking/
More at
https://darwinsdata.com/how-likely-is-a-hdd-to-fail/
Gordon wrote:
On 2025-07-25, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
On Fri, 25 Jul 2025 14:47:51 +1000, Felix wrote:
How does LM treat HD bad sectors?
You have to specify a list of them to be excluded from file allocations
when you initialize a filesystem on the disk. Note the -c option in mke2fs
<https://manpages.debian.org/mke2fs(8)>, which tells it to do a scan to
find these for itself. Or the -l option to get a list of them from a
separate file.
Frankly, I wouldn't like to touch a disk with bad blocks on it. Once they
start happening, they're only likely to get worse.
Agreed. Once a HD starts to fail it is only a matter of time before it will
take your data.
The blocks fail on the so called bath tub curve (Backblaze)
https://www.backblaze.com/blog/drive-failure-over-time-the-bathtub-curve-is-leaking/
More at
https://darwinsdata.com/how-likely-is-a-hdd-to-fail/
I've never actually had a HD drive fail in use. (touch wood)
On Sun, 7/27/2025 6:21 AM, Felix wrote:
Paul wrote:
The hard drive has automatic sparing, which means if there
is trouble with a sector, the drive has some spare sectors
in the immediate area.
that's good. is it solely a Linux thing, or windows also?
Automatic sparing on ATA drives (SATA or IDE) is a hardware-supported
activity. It happens no matter what OS is involved; it even
works for Macintosh computers :-)
sudo smartctl -a /dev/sda
how would I specify other than the C drive?
On Linux, the letter on the end is the drive identifier:
Windows Disk0   Linux /dev/sda
Windows Disk1   Linux /dev/sdb
Windows Disk2   Linux /dev/sdc
Windows has Disk Management (diskmgmt.msc), while Linux has "gnome-disks".
Use the menu in "gnome-disks" to select a particular hard drive, like
/dev/sdc, to show the partitions on it.
The Linux gparted utility can also display disks in a format that
sort of looks like Windows Disk Management.
so drives self test themselves and avoid writing to bad blocks making
drive testing somewhat unnecessary?
No!
Drives watch a sector that is being read, for symptoms that the sector
needs to be checked. Sectors which you did not use a computer program
to read can remain unverified for years and years.
The SMART short test or the SMART long test both complete too quickly
to verify the entire surface.
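For reference, a sketch of how those self-tests are kicked off and read
back with smartctl (the long test can take hours on a large drive):
sudo smartctl -t short /dev/sda      # finishes in a couple of minutes
sudo smartctl -t long /dev/sda       # the longer, surface-oriented test
sudo smartctl -l selftest /dev/sda   # show the self-test log when done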
Automatic sparing responds to sectors you are visiting at the moment. The more
busy a partition is, the more likely some of the sectors will be evaluated and
spared out if something is wrong with them.
But if you want to know the state of the entire surface, you run a thorough surface
scan, which takes hours and uses an application program for the determination.
Take the following list of activities:
(1) Read the entire surface
(2) Write the entire surface
(3) Read the entire surface
Now, all the automatic sparing should be up to date.
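A sketch of that sequence with dd, assuming the disk's contents are backed
up or expendable, since the write pass in step (2) erases everything
(/dev/sdX is a placeholder):
sudo dd if=/dev/sdX of=/dev/null bs=1M status=progress   # (1) read the surface
sudo dd if=/dev/zero of=/dev/sdX bs=1M status=progress   # (2) write the surface (destructive)
sudo dd if=/dev/sdX of=/dev/null bs=1M status=progress   # (3) read back to verify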
I remember when a 25MB drive was huge :)
I was able to put two years worth of files onto a 10MB drive.
That's how long it took to fill up the drive. The average file
size back then was 2KB, there were no picture files on the
disk drive, and the document editor was non-WYSIWYG and used
"format commands", which meant there was no bloat in formatted
documents either. It was certainly a different time, in terms
of what kind of files went onto a disk drive.
Paul
On Sun, 7/27/2025 7:54 AM, Felix wrote:
Gordon wrote:
On 2025-07-25, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
On Fri, 25 Jul 2025 14:47:51 +1000, Felix wrote:
How does LM treat HD bad sectors?
You have to specify a list of them to be excluded from file allocations
when you initialize a filesystem on the disk. Note the -c option in mke2fs
<https://manpages.debian.org/mke2fs(8)>, which tells it to do a scan to
find these for itself. Or the -l option to get a list of them from a
separate file.
Frankly, I wouldn't like to touch a disk with bad blocks on it. Once they
start happening, they're only likely to get worse.
Agreed. Once a HD starts to fail it is only a matter of time before it will
take your data.
The blocks fail on the so called bath tub curve (Backblaze)
https://www.backblaze.com/blog/drive-failure-over-time-the-bathtub-curve-is-leaking/
More at
https://darwinsdata.com/how-likely-is-a-hdd-to-fail/
I've never actually had a HD drive fail in use. (touch wood)
The behavior has changed with time.
Disk failures used to be fast and catastrophic.
I lost a couple 40GB Maxtor drives, and the second one, it was
alive for a while, I turned off the power and got some sleep,
and the next day, it would not ID itself and start up. I had a
relatively small window, to attempt data recovery in that case.
I did not succeed at it. No data got saved.
That is how nasty drive failures used to be.
Today, when the Reallocated hits 200 or 300, you are changing the drive
because the drive is too slow, or because you are getting an error in a
file you just wrote.
But, the drive today keeps spinning, and it keeps IDing itself. This
tells us the critical data block it gets from the drive, continues
to be in good shape. And it's only various patches of data which
are in bad shape.
But at some point, the continued usage of a "sick" drive does not
make sense. We're not on the Moon or on Mars, we're on Earth and
we can do something about our sketchy hard drive.
You should have a healthy primary drive, and you should have a
healthy backup drive.
That represents two copies of data you can
rely upon. If the first copy is found to be corrupted, you use
the second copy of the data to correct the situation.
That is the basic principle of reliable operation. If you are
using a computer for your livelihood, making money by
using the computer, then you want the reliable operation model
to apply. That might include purchasing a PC which has working
ECC on main memory.
Paul
Mike Easter wrote:
Felix wrote:
How does LM treat HD bad sectors? Can it identify and mark them (if
any) 'not for use'? or is there an app that will do it?
Default LM has Disks which is gnome-disks which has a SMART function
which can check and self-test.
Maybe Felix can bring this around to something practical based on what
triggered the question in the first place.
Perhaps he saw (or even heard) some kind of warning from something.
People write articles or do YTs of what might constitute a warning and
then go to the SMART business to see if there is something wrong there.
Mike Easter wrote:
Perhaps he saw (or even heard) some kind of warning from something.
no, nothing like that. just wanting to learn about LM, and how it does things
Thanks for that. :) I'm saving some posts here for reference now, given
how abysmal my memory is.
And speaking of things from bygone days.. My first PC was the first
generation Apple Mac, a second hand one. It had only a 3.5" floppy for
data storage and loading programs into memory. The amount of disk
swapping required to do anything on it was insane! And a second disk
drive cost an arm and a leg, over $1000 in today's money. (but you most
likely know these things)
On Sun, 27 Jul 2025 07:43:44 -0400, Paul wrote:
On Linux, the letter on the end is the drive identifier
Windows Disk0 Linux /dev/sda
Windows Disk1 Linux /dev/sdb
Windows Disk2 Linux /dev/sdc
Note also that on *nix systems, the device name is not used to access the
files on the volume. Instead, you mount the volume in an empty directory
called the "mount point", and the contents of that volume become visible
in that directory. This allows you to have more meaningful names for your
different volumes while they are online.
This contrasts with DOS/Windows, where "drive letters" like A, B, C etc
are used both to refer to the device and to the mounted volume. This kind
of scheme does not scale.
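For example, a minimal sketch (the device and mount-point names are
placeholders):
sudo mkdir -p /media/backup          # create an empty directory as the mount point
sudo mount /dev/sdb1 /media/backup   # the volume's contents now appear there
ls /media/backup
sudo umount /media/backup            # detach it again when finished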
On Sun, 27 Jul 2025 21:54:12 +1000, Felix wrote:
I've never actually had a HD drive fail in use. (touch wood)
I have had several failures in about the last 3-4 decades that I have had
a computer at home. I learned not to trust hard drives (or floppy disks,
for that matter) even before that.
I have also seen clients suffer various failures. There was one time we powered down a machine to replace a drive that was showing signs of dodginess, powered it up again ... only to discover that another,
different drive would no longer start up.
On Fri, 8/8/2025 12:05 AM, Lawrence D'Oliveiro wrote:
https://learn.microsoft.com/en-us/dotnet/standard/io/file-path-formats
This contrasts with DOS/Windows, where "drive letters" like A, B, C etc
are used both to refer to the device and to the mounted volume. This kind
of scheme does not scale.
In addition to identifying a drive by its drive letter,
you can identify a volume by using its volume GUID. This takes the form:
\\.\Volume{b75e2c83-0000-0000-0000-602f00000000}\Test\Foo.txt
\\?\Volume{b75e2c83-0000-0000-0000-602f00000000}\Test\Foo.txt
I believe it is also possible to call CHKDSK with an identifier
similar to that, instead of C: or D: .
On Fri, 8 Aug 2025 01:09:57 -0400, Paul wrote:
On Fri, 8/8/2025 12:05 AM, Lawrence D'Oliveiro wrote:
https://learn.microsoft.com/en-us/dotnet/standard/io/file-path-formats
This contrasts with DOS/Windows, where "drive letters" like A, B, C etc
are used both to refer to the device and to the mounted volume. This kind
of scheme does not scale.
In addition to identifying a drive by its drive letter,
you can identify a volume by using its volume GUID. This takes the form:
\\.\Volume{b75e2c83-0000-0000-0000-602f00000000}\Test\Foo.txt
\\?\Volume{b75e2c83-0000-0000-0000-602f00000000}\Test\Foo.txt
I believe it is also possible to call CHKDSK with an identifier
similar to that, instead of C: or D: .
Only CHKDSK? Not other Windows software?
And don't forget the nonsense over "reserved" Windows file names ...
And the identifier used is not the same identifier that
"disktype" prints out, either.