• Re: [gentoo-user] Seagate hard drives with dual actuators.

    From Michael@21:1/5 to All on Sat Nov 16 11:02:25 2024
    On Friday 15 November 2024 22:13:27 GMT Rich Freeman wrote:
    On Fri, Nov 15, 2024 at 10:35 AM Michael <confabulate@kintzios.com> wrote:
    Host managed SMRs (HM-SMR) require the OS and FS to be aware of the need for sequential writes and manage submitted data sympathetically to this limitation of the SMR drive, by queuing up random writes in batches and submitting these as a sequential stream.

I understand the ext4-lazy option and some patches on btrfs have improved performance of these filesystems on SMR drives, but perhaps f2fs will perform better? :-/

    IMO a host-managed solution is likely to be the only thing that will
    work reliably.

    I think HM-SMRs may be used in commercial applications and DM-SMRs fed to the unaware retail consumer, not that we can tell without the OEMs providing this rather vital piece of information. I assume (simplistically) with DM-SMRs the discard-garbage collection is managed wholly by the onboard drive controller, while with HM-SMRs the OS will signal the drive to start trimming when the workload is low in order to distribute the timing overheads to the system's idle time.


    If the drive supports discard/trim MAYBE a dumber
    drive might be able to be used with the right filesystem. Even if
    you're doing "write-once" workloads any kind of metadata change is
    going to cause random writes unless the filesystem was designed for
    SMR. Ideally you'd store metadata on a non-SMR device, though it
    isn't strictly necessary with a log-based approach.

I've considered the flash storage trim operation to be loosely like the SMR behaviour of read-modify-write and similarly susceptible to write amplification, but with SSD the write speed is so much faster it doesn't cause the same noticeable slowdown as with SMR.
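To put rough numbers on the amplification (the band and erase block sizes below are made-up round figures, not any particular drive's geometry):

    # Write amplification when a small random write forces a full
    # read-modify-write of an SMR band, versus a flash erase block.
    band_size   = 256 * 1024 * 1024   # hypothetical 256 MiB shingled band
    erase_block = 4 * 1024 * 1024     # hypothetical 4 MiB flash erase block
    write_size  = 4 * 1024            # one 4 KiB random write

    print("SMR band amplification:   %dx" % (band_size // write_size))    # 65536x
    print("flash block amplification: %dx" % (erase_block // write_size)) # 1024x

    # Same mechanism in both cases, but the SMR band rewrites orders of
    # magnitude more data per random write, at spinning-rust speeds.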


    If the SMR drive tries really hard to not look like an SMR drive and
    doesn't support discard/trim then even an SMR-aware solution probably
    won't be able to use it effectively. The drive is going to keep doing read-before-write cycles to preserve data even if there is nothing
    useful to preserve.

    --
    Rich

Sure, trimming nudges by the OS won't have an effect on the DM-SMR drive, since small changes to shingled bands are managed internally by the onboard STL controller and the persistent SMR cache. However, there are some things a filesystem driver can still provide to improve performance: for example compression, or caching and batching to facilitate sequential writing of whole bands rather than random small regions within a band. In the case of ext4 'lazy writeback journaling' the allocated journal space is increased upfront and the metadata is written only once, into the journal, with a mapping kept in memory pointing at its location there instead of writing it a second time in place. The two links at the bottom of this page explain it better:

https://www.usenix.org/conference/fast17/technical-sessions/presentation/aghayev

    I'm not sure if this happens automatically, or if some mkfs and mount option(s) need to be called upon. With up to 30x faster write speed, perhaps this is something Dale may want to investigate and experiment with.
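The gist, as I understand it from the paper, in a toy sketch (invented data structures, nothing like ext4's real on-disk format):

    # 'Lazy' journaling in miniature: metadata is appended to the journal
    # once (a sequential, SMR-friendly write), and an in-memory map tells
    # the filesystem where in the journal each block now lives, instead of
    # writing the block a second time to its home location.
    journal = []      # append-only log on disk
    jmap = {}         # metadata block number -> index in the journal

    def write_metadata(block_no, data):
        journal.append(data)                # single sequential write
        jmap[block_no] = len(journal) - 1   # remember where it landed

    def read_metadata(block_no, home_blocks):
        # Prefer the copy in the journal; fall back to the home location.
        if block_no in jmap:
            return journal[jmap[block_no]]
        return home_blocks[block_no]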

  • From Rich Freeman@21:1/5 to confabulate@kintzios.com on Sat Nov 16 15:40:01 2024
    On Sat, Nov 16, 2024 at 6:02 AM Michael <confabulate@kintzios.com> wrote:

    I assume (simplistically) with DM-SMRs the
    discard-garbage collection is managed wholly by the onboard drive controller, while with HM-SMRs the OS will signal the drive to start trimming when the workload is low in order to distribute the timing overheads to the system's idle time.

    I'll admit I haven't looked into the details as I have no need for SMR
    and there aren't any good FOSS solutions for using it that I'm aware
    of (just a few that might be slightly less terrible). However, this
    doesn't seem correct for two reasons:

    First, I'm not sure why HM-SMR would even need a discard function.
    The discard command is used to tell a drive that a block is safe to
    overwrite without preservation. A host-managed SMR drive doesn't need
    to know what data is disposable and what data is not. It simply needs
    to write data when the host instructs it to do so, destroying other
    data in the process, and it is the host's job to not destroy anything
    it cares about. If a write requires a prior read, then the host needs
    to first do the read, then adjust the written data appropriately so
    that nothing is lost.

    Second, there is no reason that any drive of any kind (SMR or SSD)
    NEEDS to do discard/trim operations when the drive is idle, because discard/trim is entirely a metadata operation that doesn't require IO
    with the drive data itself. Now, some drives might CHOOSE to
    implement it that way, but they don't have to. On an SSD, a discard
    command does not mean that the drive needs to erase or move any data
    at all. It just means that if there is a subsequent erase that would
    impact that block, it isn't necessary to first read the data and
    re-write it afterwards. A discard could be implemented entirely in non-volatile metadata storage, such as with a bitmap. For a DM-SMR
    using flash for this purpose would make a lot of sense - you wouldn't
    need much of it.
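A toy version of that bitmap idea (invented structures, not any vendor's firmware): a trim is just a metadata update, and the read-before-write only gets skipped later if the whole band turns out to be trimmed.

    # Discard tracked purely as metadata: trimming a sector flips a bit,
    # moves no data, and only pays off when a whole band can be
    # overwritten without the read-modify-write dance.
    SECTORS_PER_BAND = 65536
    trimmed = set()    # stand-in for a small persistent (e.g. flash) bitmap

    def trim(sector):
        trimmed.add(sector)                 # metadata-only, no IO to the platters

    def band_fully_trimmed(band):
        start = band * SECTORS_PER_BAND
        return all(s in trimmed for s in range(start, start + SECTORS_PER_BAND))

    def needs_read_before_write(band):
        # Preserve neighbouring data only if the band still holds live sectors.
        return not band_fully_trimmed(band)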

    This is probably why you have endless arguing online about whether
    discard/trim is helpful for SSDs. It completely depends on how the
    drive implements the command. The drives I've owned can discard
    blocks without any impact on IO, but I've heard some have a terrible
    impact on IO. It is just like how you can complete the same sort
    operation in seconds or hours depending on how dumb your sorting
    algorithm is.

    In any case, to really take advantage of SMR the OS needs to
    understand exactly how to structure its writes so as to not take a
    penalty, and that requires information about the implementation of the
    storage that isn't visible in a DM-SMR. Sure, some designs will do
    better on SMR even without this information, but I don't think they'll
    ever be all that efficient. It is no different from putting f2fs on a
    flash drive with a brain-dead discard implementation - even if the OS
    does all its discards in nice consolidated contiguous operations it
    doesn't mean that the drive will handle that in milliseconds instead
    of just blocking all IO for an hour - sure, the drive COULD do the
    operation quickly, but that doesn't mean that the firmware designers
    didn't just ignore the simplest use case in favor of just optimizing
    around the assumption that NTFS is the only filesystem in the world.

    --
    Rich

• From Michael@21:1/5 to All on Sat Nov 16 19:47:02 2024
    On Saturday 16 November 2024 14:36:02 GMT Rich Freeman wrote:
    On Sat, Nov 16, 2024 at 6:02 AM Michael <confabulate@kintzios.com> wrote:
    I assume (simplistically) with DM-SMRs the
    discard-garbage collection is managed wholly by the onboard drive controller, while with HM-SMRs the OS will signal the drive to start trimming when the workload is low in order to distribute the timing overheads to the system's idle time.

    I'll admit I haven't looked into the details as I have no need for SMR
    and there aren't any good FOSS solutions for using it that I'm aware
    of (just a few that might be slightly less terrible). However, this
    doesn't seem correct for two reasons:

    First, I'm not sure why HM-SMR would even need a discard function.
    The discard command is used to tell a drive that a block is safe to
    overwrite without preservation. A host-managed SMR drive doesn't need
    to know what data is disposable and what data is not. It simply needs
    to write data when the host instructs it to do so, destroying other
    data in the process, and it is the host's job to not destroy anything
    it cares about. If a write requires a prior read, then the host needs
    to first do the read, then adjust the written data appropriately so
    that nothing is lost.

As I understand it from reading various articles, the constraint of having to rewrite a whole band sequentially when a random block changes within it works the same on both HM-SMR and the more common DM-SMR. What differs with HM-SMR is that the host is meant to take over the management of random writes and submit these as sequential whole-band streams to the drive, to be committed without a read-modify-write penalty. I suppose having the host read the whole band from the drive, modify it and then submit it back to be written as a whole band will be faster than letting the drive manage this operation internally and filling up its internal cache. This will not absolve the drive firmware from having to manage its own trim operations and the impact metadata changes could have on the drive, but some timing optimisation is perhaps reasonable. I can't recall where I read this bit - perhaps some presentation on XFS or ext4 - not sure.
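Something like this, in toy host-side form (the band size and the read/write callables are placeholders, not a real zoned-block API):

    # Host-managed read-modify-write of one shingled band: read the band
    # once, patch the changed blocks in memory, and resubmit the whole band
    # as a single sequential stream, so the drive's persistent cache never
    # has to absorb the random writes.
    BLOCK = 4096

    def rewrite_band(read_band, write_band, band_no, changes):
        """changes maps a block index within the band to its new 4 KiB payload."""
        data = bytearray(read_band(band_no))            # one big sequential read
        for idx, payload in changes.items():
            data[idx * BLOCK:(idx + 1) * BLOCK] = payload
        write_band(band_no, bytes(data))                # one big sequential write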


    Second, there is no reason that any drive of any kind (SMR or SSD)
    NEEDS to do discard/trim operations when the drive is idle, because discard/trim is entirely a metadata operation that doesn't require IO
    with the drive data itself. Now, some drives might CHOOSE to
    implement it that way, but they don't have to. On an SSD, a discard
    command does not mean that the drive needs to erase or move any data
    at all. It just means that if there is a subsequent erase that would
    impact that block, it isn't necessary to first read the data and
    re-write it afterwards. A discard could be implemented entirely in non-volatile metadata storage, such as with a bitmap. For a DM-SMR
    using flash for this purpose would make a lot of sense - you wouldn't
    need much of it.

I don't know if SMRs use flash to record their STL status and the data allocation between their persistent cache and the shingled storage space. I would think yes, or at least they ought to. Without metadata written to different media, for such a small random write to take place atomically a whole SMR band has to be read, modified in memory, written to a new temporary location and finally written back over the original SMR band.


    This is probably why you have endless arguing online about whether discard/trim is helpful for SSDs. It completely depends on how the
    drive implements the command. The drives I've owned can discard
    blocks without any impact on IO, but I've heard some have a terrible
    impact on IO. It is just like how you can complete the same sort
    operation in seconds or hours depending on how dumb your sorting
    algorithm is.

    I have an old OCZ which would increase IO latency to many seconds if not minutes whenever trim was running, to the point where users started complaining I had 'broken' their PC. As if I would do such a thing. LOL! Never mind trying to write anything, reading from the disk would take ages and the drive IO LED on the case stayed on for many minutes while TRIM was running. I reformatted with btrfs, overprovisioned enough spare capacity and reduced the cron job for trim to once a month, which stopped them complaining.
    I don't know if the firmware was trying to write zeros to the drive deterministically, instead of just de-allocating the trimmed blocks.


    In any case, to really take advantage of SMR the OS needs to
    understand exactly how to structure its writes so as to not take a
    penalty, and that requires information about the implementation of the storage that isn't visible in a DM-SMR.

Yes, I think all the OS can do is seek to minimise random writes, and from what I read an SMR-friendlier fs will try to do this.


    Sure, some designs will do
    better on SMR even without this information, but I don't think they'll
    ever be all that efficient. It is no different from putting f2fs on a
    flash drive with a brain-dead discard implementation - even if the OS
    does all its discards in nice consolidated contiguous operations it
    doesn't mean that the drive will handle that in milliseconds instead
    of just blocking all IO for an hour - sure, the drive COULD do the
    operation quickly, but that doesn't mean that the firmware designers
    didn't just ignore the simplest use case in favor of just optimizing
    around the assumption that NTFS is the only filesystem in the world.

    For all I know consumer grade USB sticks with their cheap controller chips use no wear levelling at all:

    https://support-en.sandisk.com/app/answers/detailweb/a_id/25185/~/learn-about-trim-support-for-usb-flash%2C-memory-cards%2C-and-ssd-on-windows-and

Consequently, all a flash-friendly fs can do is perhaps compress and write in batched mode to minimise write ops.

    I can see where an SMR drive would be a suitable solution for storing media files, but I don't know if the shingled bands would cause leakage due to their proximity and eventually start losing data. I haven't seen any reliability reports on this technology.
  • From Rich Freeman@21:1/5 to confabulate@kintzios.com on Sat Nov 16 21:20:02 2024
    On Sat, Nov 16, 2024 at 2:47 PM Michael <confabulate@kintzios.com> wrote:

    As I understand it from reading various articles, the constraint of having to write sequentially a whole band when a random block changes within a band works the same on both HM-SMR and the more common DM-SMR.

    Correct.

    What differs with
    HM-SMR instructions is the host is meant to take over the management of random
    writes and submit these as sequential whole band streams to the drive to be committed without a read-modify-write penalty. I suppose for the host to have
    to read the whole band first from the drive, modify it and then submit it to the drive to write it as a whole band will be faster than letting the drive manage this operation internally and getting its internal cache full.

    I doubt this would be any faster with a host-managed drive. The same
    pattern of writes is going to incur the same penalties.

    The idea of a host-managed drive is to avoid the random writes in the
    first place, and the need to do the random reads. For this to work
    the host has to know where the boundaries of the various regions are
    and where it is safe to begin writes in each region.

    Sure, a host could just use software to make the host-managed drive
    behave the same as a drive-managed drive, but there isn't much benefit
    there. You'd want to use a log-based storage system/etc to just avoid
    the random writes entirely. You might not even want to use a POSIX
    filesystem on it.

    This
    will not absolve the drive firmware from having to manage its own trim operations and the impact metadata changes could have on the drive, but some timing optimisation is perhaps reasonable.

    Why would a host-managed SMR drive have ANY trim operations? What
    does trimming even mean on a host-managed drive?

    Trimming is the act of telling the drive that it is safe to delete a
    block without preserving it. A host-managed drive shouldn't need to
    be concerned with preserving any data during a write operation. If it
    is told to write something, it will just overwrite the data in the
    subsequent overlapping cylinders.

    Trimming is helpful with drive-managed SMR because if the drive isn't
    told to trim a block that is about to be overwritten due to a write to
    a different block, then the drive needs to first read, and then
    rewrite the block. Trimming tells the drive that this step can be
    skipped, assuming ALL the blocks in that region have been trimmed.

    I don't know if SMRs use flash to record their STL status and data allocation between their persistent cache and shingled storage space. I would think yes,
    or at least they ought to. Without metadata written to different media, for such a small random write to take place atomically a whole SMR band will be read, modified in memory, written to a new temporary location and finally overwrite the original SMR band.

    Well, drive-managed SMR drives typically have CMR regions for data
    caching, and they could also be used to store the bitmap. Cheap
    drives might not support trim at all, and would just preserve all data
    on write. After all, it isn't performance that is driving the
    decision to sneak SMR into consumer drives. Flash would be the most
    sensible way to do it though.

    --
    Rich

  • From Wol@21:1/5 to Rich Freeman on Sun Nov 17 00:30:01 2024
    On 16/11/2024 20:13, Rich Freeman wrote:
    Well, drive-managed SMR drives typically have CMR regions for data
    caching, and they could also be used to store the bitmap. Cheap
    drives might not support trim at all, and would just preserve all data
    on write. After all, it isn't performance that is driving the
    decision to sneak SMR into consumer drives. Flash would be the most
    sensible way to do it though.

I would have thought the best way for a host-managed drive to avoid masses of read-write would simply be to stream the data (files) to one SMR region, and the metadata (directory structure) to a separate SMR region.

    That way, if you store the block list in the directory, you just drop
    data blocks, and if you keep track of which directory contents are
    stored in which SMR block, you can simply recover the space by copying
    the directory(ies) to a new block.
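Roughly this sort of cleaning loop (invented bookkeeping, just to show the shape): pick the zone with the most dead space and stream what is still live into the zone currently being filled, which is exactly the kind of job one actuator could read for while the other writes.

    # Log-structured cleaning: the zone with the most freed/dead space is
    # the cheapest to reclaim, because it has the fewest live blocks left
    # to copy into the currently open zone.
    zones = {0: {"live_blocks": 12}, 1: {"live_blocks": 880}, 2: {"live_blocks": 300}}

    def pick_victim(zones, open_zone):
        candidates = [z for z in zones if z != open_zone]
        return min(candidates, key=lambda z: zones[z]["live_blocks"])

    def clean_one_zone(zones, open_zone):
        victim = pick_victim(zones, open_zone)
        moved = zones[victim]["live_blocks"]          # stream these to the open zone
        zones[open_zone]["live_blocks"] += moved
        zones[victim]["live_blocks"] = 0              # victim zone is free again
        return victim, moved

    print(clean_one_zone(zones, open_zone=2))          # -> (0, 12)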

    Cheers,
    Wol

  • From Michael@21:1/5 to All on Sun Nov 17 11:22:15 2024
    On Saturday 16 November 2024 20:13:30 GMT Rich Freeman wrote:
    On Sat, Nov 16, 2024 at 2:47 PM Michael <confabulate@kintzios.com> wrote:

    What differs with
    HM-SMR instructions is the host is meant to take over the management of random writes and submit these as sequential whole band streams to the drive to be committed without a read-modify-write penalty. I suppose for the host to have to read the whole band first from the drive, modify it
    and then submit it to the drive to write it as a whole band will be
    faster than letting the drive manage this operation internally and
    getting its internal cache full.

    I doubt this would be any faster with a host-managed drive. The same
    pattern of writes is going to incur the same penalties.

    The idea of a host-managed drive is to avoid the random writes in the
    first place, and the need to do the random reads. For this to work
    the host has to know where the boundaries of the various regions are
    and where it is safe to begin writes in each region.

The random reads do not incur a time penalty; it is the R-M-W ops that cost time. The host doesn't need to know where bands start and finish, it only needs to submit data in whole sequential streams, so they can be written directly to the disk as on a CMR. As long as data and metadata are submitted and written directly, the SMR would behave like a CMR in terms of performance.


    Sure, a host could just use software to make the host-managed drive
    behave the same as a drive-managed drive, but there isn't much benefit
    there. You'd want to use a log-based storage system/etc to just avoid
    the random writes entirely. You might not even want to use a POSIX filesystem on it.

    This
    will not absolve the drive firmware from having to manage its own trim operations and the impact metadata changes could have on the drive, but some timing optimisation is perhaps reasonable.

    Why would a host-managed SMR drive have ANY trim operations? What
    does trimming even mean on a host-managed drive?

    Trimming is the act of telling the drive that it is safe to delete a
    block without preserving it. A host-managed drive shouldn't need to
    be concerned with preserving any data during a write operation. If it
    is told to write something, it will just overwrite the data in the
    subsequent overlapping cylinders.

I assumed, maybe wrongly, there is still an STL function performed by the controller on HM-SMRs, to de-allocate deleted data bands whenever files are deleted, perform secure data deletions via its firmware, etc. However, I can see that if this is managed at the fs journal layer the drive controller could be dumb in this respect. Perhaps what I had read referred to HM-SMR 'aware' drives, which may behave as DM-SMRs depending on the OS capability.

It would be interesting to see how different fs types perform on DM-SMRs. Looking at used drives on ebay they are still rather too pricey for me to splash out on one, but since my PVR's 14-year-old WD SATA 2 drive refuses to die they may get cheaper by the time I need a replacement.

  • From Rich Freeman@21:1/5 to confabulate@kintzios.com on Sun Nov 17 22:30:01 2024
    On Sun, Nov 17, 2024 at 6:22 AM Michael <confabulate@kintzios.com> wrote:

    On Saturday 16 November 2024 20:13:30 GMT Rich Freeman wrote:

    The idea of a host-managed drive is to avoid the random writes in the
    first place, and the need to do the random reads. For this to work
    the host has to know where the boundaries of the various regions are
    and where it is safe to begin writes in each region.

    The random reads do not incur a time penalty, it is the R-M-W ops that cost time.

    We're saying the same thing. If you don't preserve data that you
    overwrite, then there is no need to read anything. Random reads are
    the same speed on CMR and SMR, but not doing a read is faster than
    doing a read on either platform, and any read on an HDD is very slow.

The host doesn't need to know where bands start and finish, it only needs to submit data in whole sequential streams, so they can be written directly to the disk as on a CMR. As long as data and metadata are submitted and written directly, the SMR would behave like a CMR in terms of performance.

    Again, we're saying the same thing, but making different assumptions
    about how HM-SMR is implemented.

    SMR can be appended to without penalty, just like tape. In order to
    append and not overwrite, the host needs to know where the boundaries
    of the SMR domains are.
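A minimal model of what knowing the boundaries buys the host (a toy zone table, not the real Linux zoned-block-device interface): each zone carries a write pointer, appends at the pointer are free, and space behind the pointer only comes back by resetting the whole zone, tape-style.

    # Toy host-managed zone: sequential appends at the write pointer need no
    # read-modify-write; reusing anything behind the pointer means resetting
    # (and thus sacrificing) the entire zone.
    class Zone:
        def __init__(self, size_blocks):
            self.size = size_blocks
            self.wp = 0                       # write pointer

        def append(self, nblocks):
            if self.wp + nblocks > self.size:
                raise IOError("zone full: open or reset another zone")
            start = self.wp
            self.wp += nblocks                # pure append, no penalty
            return start                      # block offset the data landed at

        def reset(self):
            self.wp = 0                       # host declares the whole zone dead

    z = Zone(size_blocks=65536)
    print(z.append(256))    # 0
    print(z.append(256))    # 256
    z.reset()               # only way to reclaim the space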

    I assumed, may be wrongly, there is still an STL function performed by the controller on HM-SMRs, to de-allocate deleted data bands whenever files are deleted, perform secure data deletions via its firmware, etc. However, I can see if this is managed at the fs journal layer the drive controller could be dumb in this respect.

    Honestly, I don't know exactly what commands an HM-SMR implements, and
    since I doubt I'll ever use one, I can't really be bothered to look
    them up. The whole point of a HM-SMR drive is that the drive just
    does exactly what the host does, and doesn't try to shield the host
    from the details of how SMR works. That's why they can be used
    without performance penalties. They're just very destructive to data
    if they aren't used correctly.

    It would be interesting to see how different fs types perform on DM-SMRs.

    Not that interesting, for me personally. That's like asking how well
    different filesystems would perform on tape. If I'm storing data on
    tape, I'll use an algorithm designed to work on tape, and a tape drive
    that actually has a command set that doesn't try to pretend that it is
    useful for random writes. SMR is pretty analogous to tape, with the
    benefit of being as fast as CMR for random reads.

    If anything I've been trying to migrate away from HDD entirely. NVMe
    will always be more expensive I'm sure but the density and endurance
    are continuing to improve, and of course the speed is incomparable.
    Cost is only a few times more. Biggest challenge is the lanes but
    used workstations seem to be a way to get around that.

    --
    Rich

  • From Jack@21:1/5 to Rich Freeman on Mon Nov 18 00:10:02 2024
    On 2024.11.17 16:26, Rich Freeman wrote:
    On Sun, Nov 17, 2024 at 6:22 AM Michael <confabulate@kintzios.com>
    wrote:
    On Saturday 16 November 2024 20:13:30 GMT Rich Freeman wrote:
    [snip ....]
    It would be interesting to see how different fs types perform on
    DM-SMRs.

    Not that interesting, for me personally. That's like asking how well different filesystems would perform on tape. If I'm storing data on
    tape, I'll use an algorithm designed to work on tape, and a tape
    drive that actually has a command set that doesn't try to pretend
    that it is useful for random writes.
    What about DEC-Tape? :-) (https://en.wikipedia.org/wiki/DECtape) (I
    may even have a few left in a closet somewhere, if only I could find
    someone to read them.)

    SMR is pretty analogous to tape, with the benefit of being as fast as
    CMR for random reads.

    [snip ....]

  • From Rich Freeman@21:1/5 to ostroffjh@users.sourceforge.net on Mon Nov 18 01:30:01 2024
    On Sun, Nov 17, 2024 at 6:04 PM Jack <ostroffjh@users.sourceforge.net> wrote:

    What about DEC-Tape? :-) (https://en.wikipedia.org/wiki/DECtape) (I
    may even have a few left in a closet somewhere, if only I could find
    someone to read them.)


    LTO is pretty much the only sensible choice these days as I understand
    it. I've looked into it for backup but you need to store a LOT of
    data for it to make sense. The issue is that the drives are just super-expensive. You can get much older generation drives used for
reasonable prices, but then the tapes have a very low capacity and
    they aren't that cheap, so your cost per TB is pretty high, and of
    course you have the inconvenience of long backup times and lots of
    tape changes. The newer generation drives are very reasonable in
    terms of cost per TB, but the drives themselves cost thousands of
    dollars. Unless you're archiving hundreds of TB it is cheaper to just
    buy lots of USB3 hard drives at $15/TB, and then you get the random IO performance as a bonus. The main downside to HDD at smaller scales is
    that the drives themselves are more fragile, but that is mostly if
    you're dropping them - in terms of storage conditions tape needs
    better care than many appreciate for it to remain reliable.
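As a rough break-even sketch (the $15/TB figure is from above; the LTO drive and media prices below are placeholders I made up, not quotes):

    # Where tape starts to beat 'just buy more USB3 hard drives' on cost.
    # Prices are illustrative assumptions only.
    hdd_per_tb      = 15.0     # USD/TB for USB3 HDDs (figure from the thread)
    lto_drive_cost  = 4000.0   # assumed price of a current-generation LTO drive
    tape_per_tb     = 5.0      # assumed LTO media cost per TB

    def hdd_total(tb):
        return hdd_per_tb * tb

    def lto_total(tb):
        return lto_drive_cost + tape_per_tb * tb

    break_even_tb = lto_drive_cost / (hdd_per_tb - tape_per_tb)
    print("break-even at about %.0f TB" % break_even_tb)   # ~400 TB with these numbers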

    For my offline onsite backups I just use a pair of USB3 drives on ZFS
    right now. For actual storage I'm trying to buy U.2 NVMe for future
    expansion, but I don't have a pressing need for that until HDDs die or
    I need more space. Never makes sense to buy more than you need since
    all this stuff gets cheaper with time...

    --
    Rich

  • From Matt Jolly@21:1/5 to All on Mon Nov 18 03:40:01 2024
Hi,

> LTO is pretty much the only sensible choice these days as I understand it.

    That's really the case, for bulk storage of any type you need to be able
    to tier a lot of it offsite/offline. I'm responsible for a tape library
with a robot arm and about 13 drives ranging from LTO7 through to LTO9.

    I've looked into it for backup but you need to store a LOT of
    data for it to make sense. The issue is that the drives are just super-expensive. You can get much older generation drives used for reasonable prices, but then the tapes have a very low capacity but
    they aren't that cheap, so your cost per TB is pretty high, and of
    course you have the inconvenience of long backup times and lots of
    tape changes.

    The 7s are on their way out atm, so I'd expect to start seeing more pop
    up for sale secondhand.

    If you're doing tape though, 3:2:1 still applies, and you also (ideally)
    want two different manufacturer drives writing to two different
    manufacturer tapes to mitigate against issues like 'oh I got a firmware
    update and my tape drive started writing garbage'.

    The newer generation drives are very reasonable in
    terms of cost per TB, but the drives themselves cost thousands of
    dollars. Unless you're archiving hundreds of TB it is cheaper to just
    buy lots of USB3 hard drives at $15/TB, and then you get the random IO performance as a bonus.

    ... but you have to swap out hundreds of USB drives which, especially
    on the cheap side, are likely to be significantly less robust than
    tape carts over time.

    The main downside to HDD at smaller scales is
    that the drives themselves are more fragile, but that is mostly if
    you're dropping them - in terms of storage conditions tape needs
    better care than many appreciate for it to remain reliable.

    It's really all downsides at the home scale. Your best bet is often
    ensuring that you have an offsite (s3 + restic can be decent)
    backup of your essential data and otherwise assuming that everything
    else could be lost at any time and just not caring about it.

    Anecdotally rsync (or s3...) to a cheap Hetzner VPS can be cost
    effective for smaller datasets / backups. My 100T server at
    home though? If it dies I lose everything non-critical on it!

    Cheers,

    Matt

  • From Matt Jolly@21:1/5 to All on Thu Nov 14 01:50:01 2024
    Hi Dale,

    My question is this. Given they cost about $20 more, from what I've
    found anyway, is it worth it? Is there a downside to this new set of
    heads being added? I'm thinking a higher failure rate, more risk to
    data or something like that. I think this is a fairly new thing, last
    couple years or so maybe. We all know how some new things don't work out.

    At least one vendor has been trying to sell me on these recently,
    claiming higher bandwidth and better ability to seek. I have not yet
    actually used these.

    Just looking for thoughts and opinions, facts if someone has some.
    Failure rate compared to single actuator drives if there is such data.
My searches didn't help me find anything useful.

    There's no reason to suspect that these are going to be terrible, they
    appear to be the next step on the Seagate roadmap before HAMR drives
    hit the market in the coming years.

    I haven't seen much on the reliability side of things, however I
    wouldn't be too concerned, assuming that there's proper backup in place
    - Any other drive (including SSDs) could "just die" for a multitude of
    reasons.

P. S. My greens are growing like weeds. Usually they're ready to pick by now, but having to wait for the tree to be cut down and cut up delayed
    that. Should be ready by Christmas, I hope. Oh, planted oats, clover,
    kale and some other extra seeds I had in open area. I saw a LARGE buck
    deer the other night snacking on the oats. My neighbor would rather see
    it in his freezer tho. o_0

    Hopefully there's some left for you!

    Cheers,

    Matt

  • From Wols Lists@21:1/5 to Dale on Thu Nov 14 09:00:01 2024
    On 13/11/2024 23:10, Dale wrote:
    My question is this. Given they cost about $20 more, from what I've
    found anyway, is it worth it? Is there a downside to this new set of
    heads being added? I'm thinking a higher failure rate, more risk to
    data or something like that. I think this is a fairly new thing, last
    couple years or so maybe. We all know how some new things don't work out.

    I think this technology has actually been out for a long time. I'm sure
    I've heard of it ages ago.

    Thing is, it's probably one of those things that's been available in
    high-end drives for years, but the cost-benefit ratio has been low so
    few people bought them. Now presumably the economics have changed.

    If the actuators are mounted opposite each other, then they can't
    collide, and presumably can operate completely independent of each
    other. The costs of two of them were presumably just deemed not worth it.

    An opposite niche (and rather apposite for you) was when I started
    buying disk drives. I think my first was a 2GB Bigfoot, followed by a
    6GB, and I bought several 18GBs for work. They were "old tech", 5.25"
    5200rpm in an era of 3.5" 7500rpm, but their capacities were huge and
    cheap. If all you wanted was storage, they were great. Most people
    thought the size and speed of the smaller drives was better value, even
    if it cost more per meg.

    Cheers,
    Wol

  • From Michael@21:1/5 to All on Thu Nov 14 11:21:49 2024
    On Wednesday 13 November 2024 23:10:10 GMT Dale wrote:
    Howdy,

    One of my PVs is about 83% full. Time to add more space, soon anyway.
    I try not to go past 90%. Anyway, I was looking at hard drives and
    noticed something new. I think I saw one a while back but didn't look
    into it at the time. I'm looking at 18TB drives, right now. Some new Seagate drives have dual actuators. Basically, they have two sets of
    heads. In theory, if circumstances are right, it could read data twice
    as fast. Of course, most of the time that won't be the case but it can happen often enough to make it get data a little faster. Even a 25% or
    30% increase gives Seagate something to brag about. Another sales tool.
    Some heavy data users wouldn't mind either.

    My question is this. Given they cost about $20 more, from what I've
    found anyway, is it worth it? Is there a downside to this new set of
    heads being added? I'm thinking a higher failure rate, more risk to
    data or something like that. I think this is a fairly new thing, last
    couple years or so maybe. We all know how some new things don't work out.

    Just looking for thoughts and opinions, facts if someone has some.
    Failure rate compared to single actuator drives if there is such data.
My searches didn't help me find anything useful.

    Thanks.

    Dale

    :-) :-)

    I don't know much about these drives beyond what the OEM claims. From what I read, I can surmise the following hypotheses:

    These drives draw more power from your PSU and although they are filled with helium to mitigate against higher power/heat, they will require better cooling at the margin than a conventional drive.

    Your system will use dev-libs/libaio to read the whole disk as a single SATA drive (a SAS port will read it as two separate LUNs). The first 50% of LBAs will be accessed by the first head and the last 50% by the other head. So far, so good.

Theoretically, I suspect this creates a higher probability of failure. In the hypothetical scenario of a large sequential write where both heads are writing data of a single file, both heads must succeed in their write operation. The cumulative probability of success of heads A and B is P(A⋂B) = P(A) × P(B), assuming the heads fail independently. As an example, if the probability of a successful write of each head is 80%, the cumulative probability of both heads succeeding is only 64%:

    0.8 * 0.8 = 0.64

    As long as I didn't make any glaring errors, this simplistic thought experiment assumes all else being equal with a conventional single head drive, but it never is. The reliability of a conventional non-helium filled drive may be lower to start with. Seagate claim their Exos 2 reliability is comparable to other enterprise-grade hard drives, but I don't have any real world experience to share here. I expect by the time enough reliability statistics are available, the OEMs would have moved on to different drive technologies.
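The same sum for an arbitrary number of independent heads (the independence is of course the big assumption):

    # Probability that every actuator completes its share of a striped write,
    # assuming each head succeeds independently with probability p.
    def all_heads_succeed(p, heads):
        return p ** heads

    print(all_heads_succeed(0.8, 1))            # 0.8  - single actuator
    print(round(all_heads_succeed(0.8, 2), 2))  # 0.64 - the 64% worked out above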

    When considering buying this drive you could look at the market segment needs and use cases Seagate/WD could have tried to address by developing and marketing this technology. These drives are for cloud storage implementations, where higher IOPS, data density and speed of read/write is desired, while everything is RAID'ed and backed up. The trade off is power usage and heat.

    Personally, I tend to buy n-1 versions of storage solutions, for the following reasons:

    1. Price per GB is cheaper.
    2. Any bad news and rumours about novel failing technologies or unsuitable implementations (e.g. unmarked SMRs being used in NAS) tend to spread far and wide over time.
    3. High volume sellers start offering discounts for older models.

    However, I don't have a need to store the amount of data you do. Most of my drives stay empty. Here's a 4TB spinning disk with 3 OS and 9 partitions:

    ~ # gdisk -l /dev/sda | grep TiB
    Disk /dev/sda: 7814037168 sectors, 3.6 TiB
    Total free space is 6986885052 sectors (3.3 TiB)

    HTH
  • From Frank Steinmetzger@21:1/5 to All on Thu Nov 14 21:00:01 2024
    Am Donnerstag, 14. November 2024, 20:51:32 Mitteleuropäische Normalzeit schrieb Frank Steinmetzger:
    Am Donnerstag, 14. November 2024, 20:12:25 Mitteleuropäische Normalzeit

    The only Seagate 7200RPM disk I have started playing up a month ago. I
    now
    have to replace it. :-(

    The German tech bubble has a saying when it’s about Seagate: “Sie geht oder
    sie geht nicht”. It plays on the fact that “sie geht” (literally “she runs”¹, meaning “it works”) sounds very similar to “Seagate”. So the
    literal joke is “Either it works or it doesn’t”, and the meta joke is “Seagate or not Seagate”.


    Lol, writing the above text gave me the strange feeling of having written it before. So I looked into my archive and I have indeed: in June 2014 *and* in December 2020. 🫣

    --
    Grüße | Greetings | Salut | Qapla’
    What do brushing teeth and voting have in common?
    If you don’t do it, it becomes brown on its own.

  • From Michael@21:1/5 to All on Thu Nov 14 19:12:25 2024
    On Thursday 14 November 2024 17:00:07 GMT Dale wrote:
    Michael wrote:
    On Wednesday 13 November 2024 23:10:10 GMT Dale wrote:
    Howdy,

    One of my PVs is about 83% full. Time to add more space, soon anyway.
    I try not to go past 90%. Anyway, I was looking at hard drives and
    noticed something new. I think I saw one a while back but didn't look
    into it at the time. I'm looking at 18TB drives, right now. Some new
    Seagate drives have dual actuators. Basically, they have two sets of
    heads. In theory, if circumstances are right, it could read data twice
    as fast. Of course, most of the time that won't be the case but it can
    happen often enough to make it get data a little faster. Even a 25% or
30% increase gives Seagate something to brag about. Another sales tool.
    Some heavy data users wouldn't mind either.

    My question is this. Given they cost about $20 more, from what I've
    found anyway, is it worth it? Is there a downside to this new set of
    heads being added? I'm thinking a higher failure rate, more risk to
    data or something like that. I think this is a fairly new thing, last
    couple years or so maybe. We all know how some new things don't work
    out.

    Just looking for thoughts and opinions, facts if someone has some.
    Failure rate compared to single actuator drives if there is such data.
My searches didn't help me find anything useful.

    Thanks.

    Dale

    :-) :-)

    I don't know much about these drives beyond what the OEM claims. From
    what I read, I can surmise the following hypotheses:

    These drives draw more power from your PSU and although they are filled with helium to mitigate against higher power/heat, they will require
    better cooling at the margin than a conventional drive.

    Your system will use dev-libs/libaio to read the whole disk as a single SATA drive (a SAS port will read it as two separate LUNs). The first 50% of LBAs will be accessed by the first head and the last 50% by the other head. So far, so good.

    Theoretically, I suspect this creates a higher probability of failure. In the hypothetical scenario of a large sequential write where both heads
    are writing data of a single file, then both heads must succeed in their write operation. The cumulative probability of success of head A + head B is calculated as P(A⋂B). As an example, if say the probability of a successful write of each head is 80%, the cumulative probability of both heads succeeding is only 64%:

    0.8 * 0.8 = 0.64

    As long as I didn't make any glaring errors, this simplistic thought experiment assumes all else being equal with a conventional single head drive, but it never is. The reliability of a conventional non-helium filled drive may be lower to start with. Seagate claim their Exos 2 reliability is comparable to other enterprise-grade hard drives, but I don't have any real world experience to share here. I expect by the time enough reliability statistics are available, the OEMs would have moved on to different drive technologies.

    When considering buying this drive you could look at the market segment needs and use cases Seagate/WD could have tried to address by developing and marketing this technology. These drives are for cloud storage implementations, where higher IOPS, data density and speed of read/write
    is
    desired, while everything is RAID'ed and backed up. The trade off is
    power
    usage and heat.

    Personally, I tend to buy n-1 versions of storage solutions, for the following reasons:

    1. Price per GB is cheaper.
    2. Any bad news and rumours about novel failing technologies or unsuitable implementations (e.g. unmarked SMRs being used in NAS) tend to spread far and wide over time.
    3. High volume sellers start offering discounts for older models.

    However, I don't have a need to store the amount of data you do. Most of my drives stay empty. Here's a 4TB spinning disk with 3 OS and 9 partitions:

    ~ # gdisk -l /dev/sda | grep TiB
    Disk /dev/sda: 7814037168 sectors, 3.6 TiB
    Total free space is 6986885052 sectors (3.3 TiB)

    HTH

Sounds like my system may not even be able to handle one of these. I'm not
    sure my SATA ports support that stuff.

    I think your PC would handle these fine.


    It sounds like this is not something I really need anyway.

    Well, this is more to the point. ;-)


    After all, I'm already spanning my data
    over three drives. I'm sure some data is coming from each drive. No
    way to really know for sure but makes sense.

Do you have a link or something to a place that explains what the parts of the Seagate model number mean? I know ST is for Seagate. The size is next. After that, everything I find is old and outdated. I looked on the Seagate website too but had no luck. I figure someone made one, somewhere. A link would be fine.

    This document is from 2011, I don't know if they changed their nomenclature since then.

    https://www.seagate.com/files/staticfiles/docs/pdf/marketing/st-model-number-cheat-sheet-sc504-1-1102us.pdf


    Thanks.

    Dale

    :-) :-)

    The only Seagate 7200RPM disk I have started playing up a month ago. I now have to replace it. :-(
  • From Rich Freeman@21:1/5 to rdalek1967@gmail.com on Thu Nov 14 22:00:02 2024
    On Thu, Nov 14, 2024 at 3:33 PM Dale <rdalek1967@gmail.com> wrote:


    I've had a Seagate, a Maxtor from way back and a Western Digital go bad. This is one reason I don't knock any drive maker. Any of them can produce a bad drive.

    ++

    All the consumer drive manufacturers are in a super-price-conscious
    market. For the most part their drives work as advertised, and
    basically all of them have made a model with an abysmal failure rate
    at some point. I think Backblaze still publishes their stats and I
    don't think any manufacturer stood out when it comes to these cheaper
    consumer drives. The enterprise stuff probably is more reliable, but
    for HDD I don't think it is worth it - just use RAID.

    I've personally been trying to shift towards solid state. Granted, it
    is about double the price of large 5400RPM drives, but the performance
    is incomparable. I've also been moving more towards used enterprise
    drives with PLP/etc and where I can find the drive endurance info on
    the ebay listing. You can typically pay about $50/TB for an
    enterprise SSD - either SATA or NVMe. You'll pay a bit more if you
    want a high capacity drive (U.2 16+TB or whatever). That's in part
    because I've been shifting to Ceph which is pretty IOPS-sensitive.
    However, it is nice that when I add/replace a drive the cluster
    rebuilds in an hour at most with kinda shocking network IO.

    I'll use cheap consumer SATA/M.2 SSDs for OS drives that are easily
    reimaged. I'll use higher performance M.2 for gaming rigs (mostly read-oriented), and back it up. Be aware that the consumer write
    benchmarks only work for short bursts and fall to a fraction of the advertisement for sustained writes - the enterprise write benchmarks
    reflect sustained writes and you can run at that speed until the drive
    dies.

    --
    Rich

  • From Wols Lists@21:1/5 to Dale on Thu Nov 14 23:40:01 2024
    On 14/11/2024 20:33, Dale wrote:
    It's one thing that kinda gets on my nerves about SMR.  It seems,
    sounds, like they tried to hide it from people to make money.  Thing is,
    as some learned, they don't do well in a RAID and some other
    situations.  Heck, they do OK reading but when writing, they can get
    real slow when writing a lot of data.  Then you have to wait until it
    gets done redoing things so that it is complete.

    Incidentally, when I looked up HAMR (I didn't know what it was) it's
    touted as making SMR obsolete. I can see why ...

    And dual actuator? I would have thought that would be good for SMR
    drives. Not that I have a clue how they work internally, but I would
    have thought it made sense to have zones and a streaming log-structured
    layout. So when the user is using it, you're filling up the zones, and
    then when the drive has "free time", it takes a full zone that has the
    largest "freed/dead space" and streams it to the current zone, one
    actuator to read and one to write. Indeed, it could possibly do that
    while the drive is being used ...

    Cheers,
    Wol

  • From Peter Humphrey@21:1/5 to All on Fri Nov 15 00:20:01 2024
    On Thursday 14 November 2024 19:55:19 GMT Frank Steinmetzger wrote:

    Lol, writing the above text gave me the strange feeling of having written it before. So I looked into my archive and I have indeed: in June 2014 *and*
    in December 2020. 🫣

    Tiresomely repetitious, then...

    :)

    --
    Regards,
    Peter.

  • From Rich Freeman@21:1/5 to rdalek1967@gmail.com on Fri Nov 15 02:10:01 2024
    On Thu, Nov 14, 2024 at 6:10 PM Dale <rdalek1967@gmail.com> wrote:

The biggest downside to the large drives available now is that, even if SMART tells you a drive is failing, you likely won't have time to copy the data over to a new drive before it fails. On an 18TB drive, using pvmove, it can take a long time to move data.

    Very true. This is why I'm essentially running RAID6. Well, that and
    for various reasons you don't want to allow writes to Ceph without at
    least one drive worth of redundancy, so having an extra replica means
    that you can lose one and remain read-write, and then if you lose a
    second during recovery you might be read-only but you still have data integrity. (Don't want to get into details - it is a Ceph-specific
    issue.)

    I don't even want to think what it would cost to put
    all my 100TBs or so on SSD or NVME drives. WOW!!!

    # kubectl rook-ceph ceph osd df class ssd
ID  CLASS  WEIGHT    REWEIGHT  SIZE     RAW USE  DATA     OMAP     META     AVAIL    %USE   VAR   PGS  STATUS
 8  ssd     6.98630   1.00000  7.0 TiB  1.7 TiB  1.7 TiB   63 MiB  3.9 GiB  5.3 TiB  24.66  1.04  179  up
 4  ssd     1.74660   1.00000  1.7 TiB  465 GiB  462 GiB   16 MiB  2.5 GiB  1.3 TiB  25.99  1.10   45  up
12  ssd     1.74660   1.00000  1.7 TiB  547 GiB  545 GiB   30 MiB  2.1 GiB  1.2 TiB  30.57  1.29   52  up
 1  ssd     6.98630   1.00000  7.0 TiB  1.7 TiB  1.7 TiB   50 MiB  4.2 GiB  5.3 TiB  24.42  1.03  177  up
 5  ssd     6.98630   1.00000  7.0 TiB  1.8 TiB  1.8 TiB   24 MiB  5.0 GiB  5.2 TiB  25.14  1.07  180  up
 3  ssd     1.74660   1.00000  1.7 TiB  585 GiB  583 GiB   18 MiB  2.0 GiB  1.2 TiB  32.70  1.39   57  up
21  ssd     1.74660   1.00000  1.7 TiB  470 GiB  468 GiB   27 MiB  1.9 GiB  1.3 TiB  26.26  1.11   52  up
 9  ssd     1.74660   1.00000  1.7 TiB  506 GiB  504 GiB   11 MiB  2.0 GiB  1.3 TiB  28.29  1.20   49  up
18  ssd     1.74660   1.00000  1.7 TiB  565 GiB  563 GiB   16 MiB  1.7 GiB  1.2 TiB  31.59  1.34   55  up
10  ssd     1.74660   1.00000  1.7 TiB  490 GiB  489 GiB   28 MiB  1.6 GiB  1.3 TiB  27.42  1.16   53  up
22  ssd     1.74660   1.00000  1.7 TiB  479 GiB  478 GiB   19 MiB  1.7 GiB  1.3 TiB  26.80  1.14   50  up
19  ssd    13.97249   1.00000   14 TiB  2.3 TiB  2.3 TiB   87 MiB  5.2 GiB   12 TiB  16.81  0.71  262  up
    TOTAL                       49 TiB   12 TiB   12 TiB  388 MiB   34 GiB   37 TiB  23.61

    I'm getting there. Granted, at 3+2 erasure coding that's only a bit
    over 30TiB usable space.

    --
    Rich

  • From Michael@21:1/5 to All on Fri Nov 15 09:35:38 2024
    On Thursday 14 November 2024 22:38:59 GMT Wols Lists wrote:
    On 14/11/2024 20:33, Dale wrote:
    It's one thing that kinda gets on my nerves about SMR. It seems,
    sounds, like they tried to hide it from people to make money. Thing is,
    as some learned, they don't do well in a RAID and some other
    situations. Heck, they do OK reading but when writing, they can get
    real slow when writing a lot of data. Then you have to wait until it
    gets done redoing things so that it is complete.

    Incidentally, when I looked up HAMR (I didn't know what it was) it's
    touted as making SMR obsolete. I can see why ...

    aaaand .... it's gone! LOL! Apparently, HDMR is on the cards to replace
    HAMR.


    And dual actuator? I would have thought that would be good for SMR
    drives. Not that I have a clue how they work internally, but I would
    have thought it made sense to have zones and a streaming log-structured layout. So when the user is using it, you're filling up the zones, and
    then when the drive has "free time", it takes a full zone that has the largest "freed/dead space" and streams it to the current zone, one
    actuator to read and one to write. Indeed, it could possibly do that
    while the drive is being used ...

    Cheers,
    Wol

As I understand it the dual actuator drives are like two separate drives placed inside the same casing, accessed in parallel by their respective heads, hence the higher IOPS. The helium allows thinner platters, which makes it possible to have higher total storage capacity within the same physical size.

The SMR is more complicated in technological terms, with its overlapping multilayered recording. It achieves higher storage density than conventional drives, but with lower IOPS on write operations once its cache is exhausted.

  • From Michael@21:1/5 to All on Fri Nov 15 10:09:33 2024
    On Friday 15 November 2024 05:53:53 GMT Dale wrote:

    The thing about my data, it's mostly large video files. If I were
    storing documents or something, then SSD or something would be a good
    option. Plus, I mostly write once, then it either sits there a while or
    gets read on occasion.

    For a write-once, read-often use case, SMR drives are a good solution.
    They were designed for this purpose. Because of their shingled tracks
    they provide higher storage density than comparable CMR drives.


    I checked and I have some 56,000 videos. That
    doesn't include Youtube videos. This is also why I wanted to use that checksum script.

    I do wish there was an easy way to make columns work when we copy and
    paste into email. :/

    Dale

    :-) :-)



  • From Frank Steinmetzger@21:1/5 to All on Fri Nov 15 11:40:01 2024
    On Friday, 15 November 2024, 06:53:53 Central European Standard Time, Dale wrote:
    Rich Freeman wrote:
    On Thu, Nov 14, 2024 at 6:10 PM Dale <rdalek1967@gmail.com> wrote:
    The biggest downside to the large drives available now: even if SMART
    tells you a drive is failing, you likely won't have time to copy the
    data over to a new drive before it fails. On an 18TB drive, using
    pvmove, it can take a long time to move the data.
    […]

    I think I did some math on this once. I'm not positive on this and it
    could vary depending on how fast the system can move data. I think about
    8TB is as large as you want if you get a 24-hour notice from SMART and
    see that notice quickly enough to act on it. Anything beyond that and
    you may not have enough time to move the data, if the data is even
    still good.

    I have 6 TB drives in my NAS, good ol’ WD Reds from before the SMR era.
    When I scrub them, i.e. read out all of their data sequentially at 80 %
    capacity (so effectively around 5 TB), it takes 10½ hours. Looks like
    your math adds up. Maybe 10 or even 12 TB would also still work in that
    time window. Recently I switched from ZFS’s Raid6 to Raid5 because of
    said 80 % occupancy; I needed more space but had no free slots left and
    didn't want to buy new hardware. Fingers crossed …
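
    Frank's scrub figure also gives a quick way to sanity-check the "24
    hours to evacuate a failing drive" rule of thumb. A rough calculation in
    plain Python, assuming the ~5 TB in 10½ hours sequential rate quoted
    above and ignoring seek overhead and filesystem bookkeeping:

    # Rough evacuation-time estimate from the scrub numbers above.
    rate_mb_s = 5e12 / (10.5 * 3600) / 1e6          # ~132 MB/s sustained
    for size_tb in (6, 8, 12, 18):
        hours = size_tb * 1e12 / (rate_mb_s * 1e6) / 3600
        print(f"{size_tb:>2} TB at {rate_mb_s:.0f} MB/s: ~{hours:.0f} h")
    # 8 TB lands around 17 h, 12 TB around 25 h, and an 18 TB drive blows
    # well past a 24-hour SMART warning window -- which is Dale's concern.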


    I don't even want to think what it would cost to put
    all my 100TBs or so on SSD or NVME drives. WOW!!!

    # kubectl rook-ceph ceph osd df class ssd
    ID CLASS WEIGHT  REWEIGHT SIZE    RAW USE DATA    OMAP   META    AVAIL   %USE  VAR  PGS STATUS
     8 ssd   6.98630 1.00000  7.0 TiB 1.7 TiB 1.7 TiB 63 MiB 3.9 GiB 5.3 TiB 24.66 1.04 179 up
    […]

    I do wish there was an easy way to make columns work when we copy and
    paste into email. :/

    For special cases like this I think we wouldn’t mind using HTML mail. Or simply disable automatic wrapping and use long lines of text for the entire message. The client can then decide where to wrap.

    I know it’s like a religious debate whether to wrap at <80 columns
    (please don’t start one here), but there is actually an automatism for
    this: if you end a line with a space, you can still wrap your text
    statically at <80 columns, and the space tells the client that it may
    reflow at that point. I forgot the name of it though; I learned about it
    in the mutt user group.
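
    The trailing-space convention described above sounds like format=flowed
    (RFC 3676): a space at the end of a line marks a soft break the
    receiving client may rejoin, while a line without one is a hard break. A
    rough illustration of the idea in Python, not tied to any particular
    mail client:

    import textwrap

    def flow_wrap(paragraph: str, width: int = 72) -> str:
        """Wrap below `width` columns; every line except the last ends with
        a space, i.e. a soft break the receiving client may reflow."""
        lines = textwrap.wrap(paragraph, width=width - 1)
        return "\n".join([line + " " for line in lines[:-1]] + lines[-1:])

    def flow_unwrap(text: str) -> str:
        """Rejoin soft-broken lines; lines without a trailing space stay put."""
        out, buf = [], ""
        for line in text.split("\n"):
            if line.endswith(" "):
                buf += line                # soft break: keep accumulating
            else:
                out.append(buf + line)     # hard break: flush the paragraph
                buf = ""
        if buf:
            out.append(buf.rstrip())
        return "\n".join(out)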

    For me it’s easier, as I use vim in mutt. I usually let it do the
    wrapping for me (including the mechanism I described), but I can disable
    wrapping on-the-fly so I can paste longer terminal output.

    Dale

    :-) :-)

    --
    Grüße | Greetings | Salut | Qapla’
    Feet are fat, lumpy hands with short, stubby fingers.

  • From Michael@21:1/5 to All on Fri Nov 15 15:35:53 2024
    On Friday 15 November 2024 11:59:34 GMT Dale wrote:
    Michael wrote:
    On Friday 15 November 2024 05:53:53 GMT Dale wrote:
    The thing about my data, it's mostly large video files. If I were
    storing documents or something, then SSD or something would be a good
    option. Plus, I mostly write once, then it either sits there a while or
    gets read on occasion.

    For a write-once, read-often use case, SMR drives are a good solution.
    They were designed for this purpose. Because of their shingled tracks
    they provide higher storage density than comparable CMR drives.

    True, but I don't like it when I'm told a write is done and it kinda
    isn't. I recall a while back I reorganized some stuff, mostly renamed
    directories but also moved some files. Some were Youtube videos. It
    took about 30 minutes to update the data on the SMR backup drive. The
    part I can see, anyway.

    Right there is your problem: "... SMR backup drive". SMRs are best
    suited to sequential writes. With repeated random writes they go into a
    read-modify-write cycle and slow down.

    Consequently, they are well suited to storing media files, long-term
    archiving and similar write-once, read-often applications. They are not
    suited to heavy transactional loads or frequently overwritten data.


    It sat there for an hour at least doing that bumpy thing before it
    finally finished. I realize that if I just turn the drive off, the data
    is still there. Still, I don't like it appearing to be done when it
    really is still working on it.

    SMR drives have to read a whole band of shingled tracks, modify the
    small region where the data has changed, and then write the whole band
    back to the disk in one go. The onboard cache on drive-managed SMRs
    (DM-SMR) is meant to hide this from the OS by queuing up writes before
    committing them to disk as a sequential stream, but if you keep
    hammering the drive with many random writes you will soon exhaust the
    onboard cache and performance becomes glacial.
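
    A toy model of that behaviour, with invented numbers purely to show why
    performance falls off a cliff once the persistent cache fills; it is not
    a model of any specific drive:

    # Toy DM-SMR model: random writes land in a small cache at full speed;
    # once the cache is exhausted, each further write drags a whole shingled
    # band through a read-modify-write cycle.
    BAND_MB       = 256                              # one shingled band (made up)
    CACHE_MB      = 25_000                           # persistent cache (made up)
    FAST_MB_S     = 180                              # sequential/cache speed
    BAND_RMW_SECS = 2 * (BAND_MB / FAST_MB_S)        # read band + rewrite band

    def seconds_for_random_writes(total_mb: int) -> float:
        cached  = min(total_mb, CACHE_MB)
        spilled = total_mb - cached
        # assume each spilled megabyte of changes dirties a different band
        return cached / FAST_MB_S + spilled * BAND_RMW_SECS

    print(seconds_for_random_writes(10_000) / 60)    # inside the cache: ~1 minute
    print(seconds_for_random_writes(40_000) / 3600)  # cache blown: ~12 hours of "bumpy thing"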

    Host managed SMRs (HM-SMR) require the OS and FS to be aware of the need for sequential writes and manage submitted data sympathetically to this limitation of the SMR drive, by queuing up random writes in batches and submitting these as a sequential stream.
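
    A minimal sketch of what "host managed" means in practice, assuming a
    zone that only accepts writes at its write pointer while the host
    buffers random updates and flushes them sequentially. The interface is
    invented for illustration; real HM-SMR drives are driven through the
    kernel's zoned block device support (ZBC/ZAC), not an API like this:

    class Zone:
        def __init__(self, start_lba: int, length: int):
            self.start, self.length = start_lba, length
            self.write_pointer = start_lba          # next LBA the zone accepts

        def sequential_write(self, lba: int, nblocks: int) -> None:
            if lba != self.write_pointer:           # out-of-order write is an
                raise IOError("unaligned write")    # error -- no silent RMW
            self.write_pointer += nblocks

    class HostBatcher:
        """Queue random writes, then drain them as one sequential stream."""
        def __init__(self, zone: Zone):
            self.zone, self.pending = zone, []

        def submit(self, data: bytes) -> None:
            self.pending.append(data)               # absorb random write in RAM

        def flush(self) -> None:
            for data in self.pending:               # one sequential burst
                nblocks = (len(data) + 4095) // 4096
                self.zone.sequential_write(self.zone.write_pointer, nblocks)
            self.pending.clear()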

    I understand the ext4-lazy option and some patches on btrfs have improved performance of these filesystems on SMR drivers, but perhaps f2fs will perform better? :-/


    Another thing, I may switch to RAID one
    of these days. If I do, that drive isn't a good option.

    Ugh! RAID striping spreads each write across shingled bands on several
    drives, so a single random write can force more than one drive into a
    read-modify-write of a whole band. Whatever speed benefit striping is
    meant to provide will be reversed. In a NAS application, where many
    users could be accessing the storage simultaneously trying to save
    their interwebs downloads, etc., SMR performance will nose dive.
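
    To make the striping point concrete, a small sketch with invented
    parameters showing how one logical write fans out across array members,
    each of which then owes a whole-band rewrite on an SMR drive:

    # With a 64 KiB chunk size, a write spanning a few chunks touches several
    # member drives, and each touched SMR member must read-modify-write a band.
    CHUNK    = 64 * 1024
    N_DRIVES = 4

    def touched_drives(offset: int, length: int) -> set[int]:
        first = offset // CHUNK
        last  = (offset + length - 1) // CHUNK
        return {chunk % N_DRIVES for chunk in range(first, last + 1)}

    print(touched_drives(offset=1_000_000, length=200 * 1024))
    # -> all four drives: a single 200 KiB random write can trigger up to
    #    four separate band rewrites.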


    When I update my backups, I start the one I do with my NAS setup first.
    Then I start the home directory backup with the SMR drive. I then back
    up everything else I back up on other drives. I do that so that I can at
    least leave the SMR drive powered on while it does its bumpy thing and I
    do the other backups. Quite often, the SMR drive is the last one I put
    back in the safe. That bumpy thing can take quite a while at times.

    Instead of using the SMR drive for your /home fs backup, you would do
    better to repurpose it for media file and document backups, which do
    not change as frequently.


  • From Rich Freeman@21:1/5 to confabulate@kintzios.com on Fri Nov 15 23:20:01 2024
    On Fri, Nov 15, 2024 at 10:35 AM Michael <confabulate@kintzios.com> wrote:

    Host managed SMRs (HM-SMR) require the OS and FS to be aware of the need for sequential writes and manage submitted data sympathetically to this limitation
    of the SMR drive, by queuing up random writes in batches and submitting these as a sequential stream.

    I understand the ext4-lazy option and some patches on btrfs have improved performance of these filesystems on SMR drivers, but perhaps f2fs will perform
    better? :-/

    IMO a host-managed solution is likely to be the only thing that will
    work reliably. If the drive supports discard/trim MAYBE a dumber
    drive might be able to be used with the right filesystem. Even if
    you're doing "write-once" workloads any kind of metadata change is
    going to cause random writes unless the filesystem was designed for
    SMR. Ideally you'd store metadata on a non-SMR device, though it
    isn't strictly necessary with a log-based approach.

    If the SMR drive tries really hard to not look like an SMR drive and
    doesn't support discard/trim then even an SMR-aware solution probably
    won't be able to use it effectively. The drive is going to keep doing read-before-write cycles to preserve data even if there is nothing
    useful to preserve.

    --
    Rich
