Slashdot Mirror


Linux Not Quite Ready For New 4K-Sector Drives

Theovon writes "We've seen a few stories recently about the new Western Digital Green drives. According to WD, their new 4096-byte sector drives are problematic for Windows XP users but not Linux or most other OSes. Linux users should not be complacent about this, because not all the Linux tools like fdisk have caught up. The result is a reduction in write throughput by a factor of 3.3 across the board (a 230% overhead) when 4096-byte clusters are misaligned to 4096-byte physical sectors by one or more 512-byte logical sectors. The author does some benchmarks to demonstrate this. Also, from the comments on the article, it appears that even parted is not ready, since by default it aligns to 'cylinder' boundaries, which are not physical cylinder boundaries and are multiples of 63."
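To make the misalignment in the summary concrete, here is a minimal Python sketch. It assumes 512-byte logical sectors on top of 4096-byte physical sectors, as the summary describes; the function names are illustrative and do not come from any partitioning tool.

```python
LOGICAL = 512    # bytes per logical sector, as the drive reports
PHYSICAL = 4096  # bytes per physical sector, as the drive actually stores data


def misalignment(start_lba: int) -> int:
    """Bytes by which a partition starting at start_lba lies past the
    previous 4096-byte physical sector boundary (0 means aligned)."""
    return (start_lba * LOGICAL) % PHYSICAL


def is_aligned(start_lba: int) -> bool:
    """True if a partition starting at start_lba begins on a physical boundary."""
    return misalignment(start_lba) == 0


if __name__ == "__main__":
    # 63 is the traditional first-partition start; 126 and 504 are other
    # multiples of 63; 64 and 2048 are power-of-two starts.
    for lba in (63, 64, 126, 504, 2048):
        print(lba, misalignment(lba), is_aligned(lba))
```

A start at sector 63 lands 3584 bytes past a physical boundary, so every 4096-byte cluster in that partition straddles two physical sectors. Among multiples of 63, only those that are also multiples of 8 (such as 504) happen to be aligned.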

26 of 258 comments

  1. Set 32 sectors per track by tchuladdiass · · Score: 4, Insightful

    The simple solution is to set your Sectors per Track to 32. This would make sure that everything is properly aligned (except the first partition, usually /boot, which is mis-aligned by one cylinder).

    1. Re:Set 32 sectors per track by walshy007 · · Score: 4, Insightful

      the now-irrelevant concept of a terminal

      Speak for yourself, sir. I for one like my RS-232 terminals to be handy for when ethernet is down and you can't ssh (and can't be assed hooking up keyboard and monitor). Seriously, anyone adept at the command line uses it far more than the GUI to get things done; terminals will never disappear.

    2. Re:Set 32 sectors per track by rubycodez · · Score: 4, Insightful

      Terminals are a very necessary and relevant part of Linux. That's how most server administration is done. That's how sending commands to many network appliances is done. That's how setting up high end computers is done (e.g. set up a midrange Integrity or Superdome and you'll start with a terminal on the serial port, whether cu in Linux or HyperTerminal in Windows or a real terminal). It's also how certain tasks are performed in GUI environments. It doesn't matter that the terminal is now mostly virtual; the cursor control and font attribute features make convenient applications possible. Even on the weekend here I am chatting via IRC to some tech friends with irssi in a terminal under screen, and reading server status emails with mutt. The terminal: it's a 21st-century tool.

    3. Re:Set 32 sectors per track by Anonymous Coward · · Score: 5, Insightful

      terminals have nothing to do with the command line!
      i think the op is complaining about the fact that things like
      baud, stopbits and whatnot are deeply embedded in the
      linux kernel. these concepts are not necessary to
      have a command line. c.f. plan 9.

    4. Re:Set 32 sectors per track by amorsen · · Score: 4, Insightful

      Anyway - sooner or later we will have flash drives instead, and then this isn't a problem.

      Actually this problem is potentially much worse on SSD's. Erase blocks are huge, and read-modify-write really sucks on flash.

      --
      Finally! A year of moderation! Ready for 2019?
    5. Re:Set 32 sectors per track by kimvette · · Score: 4, Informative

      The terminal is not irrelevant. If your Cisco router is ever compromised (it happens) or if IOS becomes corrupt (or if you have an IOS install with a nasty bug where the password does not save correctly, or when an IOS upgrade goes badly) or someone fudges the configuration up, the only way you can recover it is often through the serial port. Serial ports are also very handy for integrating video surveillance with point-of-sale systems that are not IP-aware (or worse, antiquated DVR appliances which can't do POS integration over IP), for some smart switches, *NIX boxes that have been rooted (I've rescued a Solaris box through a serial connection in an enterprise environment where reinstall was not possible due to poor timing - week of finals - and backups were sabotaged by a disgruntled graduate student and logins through IP and at the console were blocked), and so forth. However, I'd rather see RS-485 or RS-422 take RS-232's place, since RS-485 and RS-422 can work over much longer distances and you can hang multiple serial devices off of a single bus.

      RS-232 might be absent from a lot of consumer motherboards, but it is far from dead and certainly not irrelevant, even now in 2010.

      --
      The Christian Right is Neither (Christian nor right). See: Matthew 23, Matthew 25, Ezekiel 16:48-50
    6. Re:Set 32 sectors per track by kimvette · · Score: 4, Insightful

      Oh, in addition, now that Windows Server (Core) has a real GUI-less mode and Powershell and UNIX environment shells on Windows finally have usable interfaces, shell prompts are becoming even more relevant even in large Windows shops. So, even Microsoft has acknowledged that the UNIX-y way of doing things is key for automation and uptime in an enterprise environment. Now, most PCs won't boot with output to the serial port, but some enterprise server boards do have such options.

      A GUI is great for basic tasks, but for repetitive tasks a command shell and scripting environment are key for efficiency, and reliable automation. VBS/Windows Scripting Host was an "acceptable" workaround for a while but in the past many Windows administrative tools required the box to not be headless, the workstation unlocked and the windows open for the GUI to be accessible for scripting - and even then it was iffy because not all GUI elements are accessible (especially third-party tools with custom controls).

      --
      The Christian Right is Neither (Christian nor right). See: Matthew 23, Matthew 25, Ezekiel 16:48-50
    7. Re:Set 32 sectors per track by bertok · · Score: 5, Informative

      Actually this problem is potentially much worse on SSD's. Erase blocks are huge, and read-modify-write really sucks on flash.

      Couldn't this be addressed (at least in part) by a battery-backed write cache like better RAID controllers use? Set it up like SAN snapshots (so it just stores the diff between what's in the actual flash storage and what's been changed so far), and then write the changed blocks when it's most advantageous (e.g. when there's an entire block's worth of data, so it would all have to be erased by the flash storage anyway).
      Maybe combine that with something like a disk defrag, except instead of storing frequently-sequentially-read data in physical sequence, store frequently-written data (regardless of if it's sequentially-read or not) in physical sequence.

      That's exactly what most SSD controllers do!

      Some now come with 32 to 64MB of cache, and some of the new Sandforce controller based SSDs also come with a little ultracapacitor that acts like a mini UPS. The cache is used as scratch space for reordering writes and defragging blocks.

      There was a firmware patch recently for the OCZ Vertex series of SSDs that enabled background defrag. If you let the drive sit there for a few minutes, it would start getting faster until it returned to 'as new' speeds.

    8. Re:Set 32 sectors per track by Anonymous Coward · · Score: 4, Insightful

      I don't think he was dissing command line interfaces.

      I think his complaint was that even newfangled RS-232 terminals had to jump through hoops to remain compatible with computers that were hooked up to typewriters and line printers. The protocols and underlying software have idiosyncrasies built into them that just don't make sense any more. Instead of throwing away the cruft to make something better, everybody's hacking onto the same old outdated shit. It's limiting progress, in a way.

  2. Good thread on this. by Anonymous Coward · · Score: 4, Informative
  3. Check with your distribution by macemoneta · · Score: 4, Interesting

    I know that Fedora seems to have addressed this with parted 2.1.1 and util-linux-ng 2.1. Both are scheduled for Fedora 13, but can be pulled into Fedora 12 by those getting the hardware early.

    --

    Can You Say Linux? I Knew That You Could.

  4. Oh slashdot.. by JeffSh · · Score: 5, Insightful

    Dear Slashdot,

    I've been around for a while. Enough to understand, nay, love the fact that you are Linux supporters and all that. But I remain an ardent supporter of truth and of speaking in ways which are concise and lead the reader in the direction of truth. Nothing in this news story is inaccurate, but to make it a point to say that Windows XP is incompatible with no mention of Vista and 7 being perfectly compatible should be an embarrassment of journalistic integrity.

    Windows XP may not work with the new WD Green drives, but Vista and on have been perfectly comfortable with 4096-byte sectors. A lay reader may read this story and not "read between the lines" as I have learned to do here. Their takeaway may be that Microsoft operating systems are broken in some way (which they are in a lot of ways), but not this one!

  5. Re:Open Source to the rescue by marcansoft · · Score: 4, Informative

    Exactly. Drives are pretending to have 512-byte sectors because Windows can't deal with 4k sectors, and then silently reducing performance when you believe them and use 512-byte sector sizes. Had the drives reported 4k sector sizes, they'd work great under Linux and not at all under Windows.

    This isn't a Linux problem, it's a drive problem caused by Windows. The solution is to implement yet another workaround for stupid devices, and start aligning partitions to 4k by default.

    Nitpick: SDHC card sectors are always 512 bytes, and most SD card sectors are 512 bytes too. Flash memory would benefit from larger sector sizes too, but they've probably stuck to 512 bytes for Windows compatibility.

  6. slashdot is not journalism by SuperBanana · · Score: 4, Insightful

    should be an embarrassment of journalistic integrity.

    Slashvertisements, basic English grammar and spelling problems, completely wrong summaries and titles...

    ...and you a)think that Slashdot is "journalism" and b)it's had integrity to lose in the first place?

    I like Slashdot, but gimme a break...it's a user-driven blog which directs readers to existing stories (now often lagging behind the major news wires) with good categorization and semi-sophisticated commenting system, utilized by a larger commenter population. Not much more, and definitely not journalism.

  7. Re:I just bought one of these by King Kwame Kilpatric · · Score: 5, Informative
    The problem is that WD doesn't tell the system about the sector size.
    /dev/sdd:

    Model=WDC WD15EARS-00Z5B1, FwRev=80.00A80, SerialNo=
    Config={ HardSect NotMFM HdSw>15uSec SpinMotCtl Fixed DTR>5Mbs FmtGapReq }
    RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=50
    BuffType=unknown, BuffSize=unknown, MaxMultSect=16, MultSect=16

    It looks to me that this should *really* be fixed by WD with a firmware update.

    Solution: Instead of fdisk, call it as fdisk -H 224 -S 56 as per Theodore Tso's blog.
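    Why that geometry helps can be checked with a little arithmetic. The sketch below assumes 512-byte logical sectors, 4096-byte physical sectors, and the old DOS-style convention that the first partition starts one track into the disk; the helper names are made up for illustration and are not part of fdisk.

```python
LOGICAL = 512
PHYSICAL = 4096
SECTORS_PER_PHYSICAL = PHYSICAL // LOGICAL  # 8 logical sectors per physical one


def cylinder_sectors(heads: int, sectors_per_track: int) -> int:
    """Logical sectors in one 'cylinder' of the fake CHS geometry."""
    return heads * sectors_per_track


def cylinders_always_aligned(heads: int, sectors_per_track: int) -> bool:
    """True if every cylinder boundary falls on a 4096-byte boundary."""
    return cylinder_sectors(heads, sectors_per_track) % SECTORS_PER_PHYSICAL == 0


def first_track_aligned(sectors_per_track: int) -> bool:
    """True if a partition starting one track in (assumed convention)
    begins on a 4096-byte boundary."""
    return sectors_per_track % SECTORS_PER_PHYSICAL == 0


if __name__ == "__main__":
    for heads, spt in ((255, 63), (224, 56)):
        print(heads, spt, cylinder_sectors(heads, spt),
              cylinders_always_aligned(heads, spt), first_track_aligned(spt))
```

    With 224 heads and 56 sectors per track, a cylinder is 12544 sectors, a multiple of 8, so cylinder-aligned partitions are also 4K-aligned. The common 255/63 geometry gives 16065 sectors per cylinder, which is not.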

  8. Re:if vista/win7 really do support this correctly. by walshy007 · · Score: 4, Insightful

    The real problem is that it is lying about its sector size: it's reporting 512 bytes when it's using 4k. If it told Linux it was using 4k, everything would be fine and dandy.

    Why does it lie about its sector size when it doesn't need to? Because if it didn't, the drives would not work on Windows XP at all, which would not bode well for sales.

    Once drives with 4k sectors arrive it's up to the individual maintainers of each affected tool (fdisk, et al.) to update their code.

    The kernel handles sector sizes, and could handle 4k sectors ages ago, but when the hardware reports something it tends to trust it, which it is now apparent it shouldn't. (512-byte sectors are implemented as an emulation layer of sorts on these drives, and enabled by default.)

  9. Drive lies and future fixes by Sits · · Score: 4, Interesting

    There is an excellent thread talking about how recent (2.6.31+) Linux kernels try to report the underlying hard drive architecture (found via the OSNews comments). Alas, it looks like some of these drives are not reporting this data correctly and thus automatic adjustment (at partitioning time) is not taking place. It looks like in the future, rather than relying only on detection by reported capability, fdisk (and hopefully gparted) will default to aligning partitions on 1MiB boundaries if the topology can't be found (unless your media is small).

    Additionally, I gather that recent Fedoras will try to adjust things like LVM to match larger sectors too. Hopefully whatever is laying out LVM will be fixed as well.

    Coincidentally, it looks like Oracle have a very committed dev trying to make this stuff work by default...
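
    The topology reporting mentioned above is exposed through sysfs on 2.6.31+ kernels. Below is a rough sketch of reading it and picking a partition alignment. The 1MiB fallback follows the behaviour described in this comment; the rule for when to fall back, and the function names, are assumptions for illustration and not what fdisk or parted actually implement.

```python
from pathlib import Path


def read_int(path: Path, default: int) -> int:
    """Read an integer from a sysfs-style file, or return default if the
    file is missing or unparsable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return default


def block_topology(device: str, sysfs_root: str = "/sys/block") -> dict:
    """Collect the sector sizes and alignment offset the kernel reports."""
    base = Path(sysfs_root) / device
    return {
        "logical": read_int(base / "queue" / "logical_block_size", 512),
        "physical": read_int(base / "queue" / "physical_block_size", 512),
        "alignment_offset": read_int(base / "alignment_offset", 0),
    }


def suggested_alignment(topology: dict, fallback: int = 1024 * 1024) -> int:
    """Alignment in bytes for new partitions (hypothetical policy).

    If the drive admits to physical sectors larger than its logical ones,
    use that size. Otherwise the report carries no useful information
    (it may be a 4K drive claiming 512), so fall back to 1 MiB.
    """
    if topology["physical"] > topology["logical"]:
        return topology["physical"]
    return fallback


if __name__ == "__main__":
    topo = block_topology("sda")  # "sda" is just an example device name
    print(topo, suggested_alignment(topo))
```

    A drive like the ones discussed here, which reports 512 for both sizes, would get the 1MiB fallback under this policy, and 1MiB is a multiple of 4096.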

  10. Re:Interesting by markus_baertschi · · Score: 5, Interesting

    About the microcode part. The drive pretends to be a 512-byte drive, but internally is using 4k sectors and claims to 'translate transparently'. I can understand that in a random-access scenario it has to read-modify-write 2 sectors each time and performance suffers (2 additional reads and one additional write). But in a sequential access scenario, the penalty should be once per sequence/file, not once per sector. Here the microcode fails completely to make the best out of the suboptimal situation.
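
    The random versus sequential argument can be put in numbers. This sketch counts how many physical sectors a write only partly covers, since each of those needs a read before it can be rewritten. It assumes 512-byte logical and 4096-byte physical sectors and says nothing about what the WD firmware really does; the function name is illustrative.

```python
LOGICAL = 512
PHYSICAL = 4096
RATIO = PHYSICAL // LOGICAL  # logical sectors per physical sector


def partial_physical_sectors(start_lba: int, count: int) -> int:
    """Number of physical sectors that a write of `count` logical sectors
    starting at `start_lba` covers only partially."""
    if count <= 0:
        return 0
    end_lba = start_lba + count  # exclusive end
    partial = set()
    if start_lba % RATIO:
        partial.add(start_lba // RATIO)      # head of the write is ragged
    if end_lba % RATIO:
        partial.add((end_lba - 1) // RATIO)  # tail of the write is ragged
    return len(partial)


if __name__ == "__main__":
    clusters = 1000
    start = 63  # misaligned partition start
    # Each 4K cluster handled on its own: two partial sectors per cluster.
    separate = sum(partial_physical_sectors(start + RATIO * i, RATIO)
                   for i in range(clusters))
    # The same data handled as one contiguous write: only the two ends.
    coalesced = partial_physical_sectors(start, RATIO * clusters)
    print(separate, coalesced)
```

    For 1000 misaligned clusters the per-cluster count is 2000 partial sectors, while the coalesced write has only 2, which is the once-per-sequence penalty the comment expects from smarter firmware.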

  11. Re:Interesting by hedwards · · Score: 5, Insightful

    That's true, but it's also true that having hardware lie to the OS isn't a great situation to be in. At the very least there should be some way of forcing it to be honest for the benefit of OSes that can handle the reality. A lot of the gunk and instability in computing comes from hardware that does things that are more appropriately done by software and vice versa.

    Forcing users to optimize isn't inherently wrong, it's just that they shouldn't need to do it for things which are somewhat standard as a workaround for weird hardware designs. And yes, I realize that the 4096-byte sectors aren't being implemented arbitrarily.

  12. I was worried about this... and am still unclear by bmajik · · Score: 4, Informative

    I just got one of the 1TB 64MB-cache WD drives that is known to be 4KB-sector based.

    Here is how it shows up in dmesg:
    [ 3.420488] sd 1:0:0:0: [sdb] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)

    and here's what hdparm -I says:
    ATA device, with non-removable media
    Model Number: WDC WD10EARS-00Y5B1
    Serial Number: WD-WCAV55227529
    Firmware Revision: 80.00A80
    Transport: Serial, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6
    Standards:
    Supported: 8 7 6 5
    Likely used: 8
    Configuration:
    Logical max current
    cylinders 16383 16383
    heads 16 16
    sectors/track 63 63
    --
    CHS current addressable sectors: 16514064
    LBA user addressable sectors: 268435455
    LBA48 user addressable sectors: 1953525168
    Logical/Physical Sector size: 512 bytes
    device size with M = 1024*1024: 953869 MBytes
    device size with M = 1000*1000: 1000204 MBytes (1000 GB)
    cache/buffer size = unknown
    Capabilities:
    LBA, IORDY(can be disabled)
    Queue depth: 32
    Standby timer values: spec'd by Standard, with device specific minimum
    R/W multiple sector transfer: Max = 16 Current = 1
    Recommended acoustic management value: 128, current value: 254
    DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
    Cycle time: min=120ns recommended=120ns
    PIO: pio0 pio1 pio2 pio3 pio4
    Cycle time: no flow control=120ns IORDY flow control=120ns
    Commands/features:
    Enabled Supported:
    * SMART feature set
    Security Mode feature set
    * Power Management feature set
    * Write cache
    * Look-ahead
    * Host Protected Area feature set
    * WRITE_BUFFER command
    * READ_B

    --
    My opinions are my own, and do not necessarily represent those of my employer.
  13. Re:Open Source to the rescue by rsmith-mac · · Score: 5, Insightful

    On the contrary, this has (almost) nothing to do with Windows - it has everything to do with old OSes. The IDEMA didn't approve the 4K sector standard until 2006; it was only in the late 90's that the first meaningful research was begun by IBM on whether 512B sectors would be an issue.

    As it turns out, yes, 512B sectors would be an issue, and drive manufacturers would be best served by moving to larger sectors (with some arguing over whether to go to 1K or 4K). So the IDEMA hashed this out over the first half of the decade, and finally in 2006 approved the 4K specification.

    The point of all of this is that software written at the turn of the century was all done well before changing drive sector sizes was a serious discussion. WinXP was released in 2001, Mac OS X 10.0 was in 2001, and of course Linux 2.4 was also in 2001. None of those OSes know what to do with anything other than a 512B sector - the only reason Windows factors into this equation is that WinXP just happens to be with us (no doubt trying to eat our brains) while the other two are dead. Anything circa 2005 or later such as WinVista, Linux 2.6, and Mac OS X 10.5 knows full well what to do with a 4K drive.

    But even that is beside the point. You don't just make major jumps like this, you have to do it in a transition so that you don't break old hardware and old software alike. Even if XP/Lin2.4/MacOSX knew what to do with 4K sectors, at some point you'd run into hardware, 3rd party devices, etc. that would not. A transition is necessary to let old hardware and software get flushed out of the ecosystem, and as such we're still years out from consumer drives offering native 4K access.

    In short: drives are pretending to have 512-byte sectors because there's a lot of old stuff, including Windows XP, that can't deal with 4K sectors.

  14. DragonFly's solution by m.dillon · · Score: 4, Interesting

    We're adjusting our disklabel64 utility and kernel support to set the partition base offset such that it is physically aligned instead of slice-aligned, and we are using 32K alignment. That should fix the problem without having to mess around with fdisk.

    The DragonFly 64-bit disklabel structure uses 64-bit byte offsets instead of sector addressing to specify everything. It ensures things are at least sector aligned but we wanted to make disk images more portable across devices with potentially different sector sizes. The HAMMER fs uses byte-granular addressing for the same reason, 16K aligned.

    -Matt
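
    The idea of choosing a partition base so that the absolute byte offset is physically aligned, rather than aligned relative to the slice, can be sketched as follows. This is not DragonFly's disklabel64 code; the 32K figure comes from the comment, while `reserved_bytes` and the function names are hypothetical.

```python
ALIGN = 32 * 1024  # 32K physical alignment, as stated in the comment


def align_up(offset: int, alignment: int = ALIGN) -> int:
    """Round a byte offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def partition_base(slice_start_bytes: int, reserved_bytes: int) -> int:
    """Base offset, relative to the slice start, such that
    slice_start_bytes + base is aligned on the physical disk.

    reserved_bytes is a hypothetical amount of space kept at the start
    of the slice for the label itself.
    """
    absolute = align_up(slice_start_bytes + reserved_bytes)
    return absolute - slice_start_bytes


if __name__ == "__main__":
    slice_start = 63 * 512  # a slice created at the traditional sector 63
    base = partition_base(slice_start, reserved_bytes=4096)
    print(base, (slice_start + base) % ALIGN)
```

    Because the offsets are plain bytes, the same calculation works whether the slice itself starts at sector 63 or somewhere already aligned, and it does not depend on the sector size the drive reports.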

  15. Poorly researched article. by Vellmont · · Score: 4, Insightful

    The article represents one data point, for one particular way to install a drive, on one (un-named) version of Gentoo, on one particular model of a WD drive that had a bugzilla entry entered by the author all of 2 days ago. So this is supposed to be an indictment of all of Linux?

    The author even mentions that Ubuntu has an option on parted that accomplishes the task properly. I'd be much more interested in an article that talks about how the default installer handles this task rather than concentrating on one particular expert tool that does so. It's still good to know that fdisk on his un-named Gentoo distribution does the wrong thing.. but this hardly means we should fire up the klaxon and declare "Linux not fully prepared for 4096 sector hard drives!". It's certainly interesting, but I'll withhold judgment until we actually know more about the implications of this across the entire spectrum of Linux distributions and the various 4096 sector HDs.

    --
    AccountKiller
    1. Re:Poorly researched article. by Radtoo · · Score: 4, Informative

      I agree with the headlines being grossly misleading. Linux does support 4k block sizes just fine. But this is not a distro-specific issue, so you are wrong, too.
      This is simply a matter of fdisk from that version of util-linux-ng (which is clearly named in the article) trusting the hardware vendor to specify correct block sizes. The vendor did not. Thus fdisk does not end up with 4k block sizes, as happens for many programs. And only(?) parted apparently contains a workaround that detects the correct block size.

      It's not that you can't use parted on Gentoo, though; it is just that in the world of user choices that is Gentoo, not everyone will be using that program or that particular option.

  16. Re:if vista/win7 really do support this correctly. by thisissilly · · Score: 4, Insightful

    It seems these drives need a new "don't lie to me, I can handle it" command, so OSes that don't have a problem with 4k size sectors can get the real info.

  17. Re:Open Source to the rescue by guruevi · · Score: 4, Insightful

    Ya know in the olden days, when I was young (love saying that) we didn't have hardware that configured itself so it would work on all platforms. We had to put in settings with jumpers and do low-level disk formats through the BIOS or a boot-floppy and WE LIKED IT (seriously).

    These days all ya new-fangled hardware doesn't have to worry about being a master or a slave, getting 5V or 3.3V to the PCI bus or RAM modules, CPU multipliers on the motherboard.

    I would simply do the same - get a jumper on the back of the drive that says 512 or 4k - we left it on 512 for ya because we assume you numbnuts still use Windows (XP) but if you want performance and use anything but DOS or WinDOS feel free to switch it. You can then reformat the drive.

    --
    Custom electronics and digital signage for your business: www.evcircuits.com