Slashdot Mirror


Linux Not Quite Ready For New 4K-Sector Drives

Theovon writes "We've seen a few stories recently about the new Western Digital Green drives. According to WD, their new 4096-byte sector drives are problematic for Windows XP users but not Linux or most other OSes. Linux users should not be complacent about this, because not all the Linux tools like fdisk have caught up. The result is a reduction in write throughput by a factor of 3.3 across the board (a 230% overhead) when 4096-byte clusters are misaligned to 4096-byte physical sectors by one or more 512-byte logical sectors. The author does some benchmarks to demonstrate this. Also, from the comments on the article, it appears that even parted is not ready, since by default it aligns to 'cylinder' boundaries, which are not physical cylinder boundaries and are multiples of 63."
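To make the misalignment in the summary concrete, here is a minimal sketch of the arithmetic. It assumes the 512-byte logical and 4096-byte physical sector sizes quoted above; the function names and the sample start sectors (63, 64, 2048) are illustrative and not taken from the article's benchmarks.

```python
LOGICAL = 512     # bytes per logical sector, as the drive reports to the OS
PHYSICAL = 4096   # bytes per physical sector on the new drives


def is_aligned(start_lba: int) -> bool:
    """True if a partition starting at this logical sector begins on a
    physical-sector boundary."""
    return (start_lba * LOGICAL) % PHYSICAL == 0


def physical_sectors_touched(byte_offset: int, length: int = 4096) -> int:
    """Number of physical sectors that a write of `length` bytes starting
    at `byte_offset` overlaps."""
    first = byte_offset // PHYSICAL
    last = (byte_offset + length - 1) // PHYSICAL
    return last - first + 1


if __name__ == "__main__":
    for lba in (63, 64, 2048):
        print(lba, is_aligned(lba), physical_sectors_touched(lba * LOGICAL))
```

A partition starting at sector 63 puts every 4096-byte cluster across two physical sectors, so the drive has to read, modify, and rewrite both of them. A start at sector 64 or 2048 keeps each cluster inside a single physical sector.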

7 of 258 comments

  1. Good thread on this. by Anonymous Coward · · Score: 4, Informative
  2. Re:Open Source to the rescue by marcansoft · · Score: 4, Informative

    Exactly. Drives are pretending to have 512-byte sectors because Windows can't deal with 4k sectors, and then silently reducing performance when you believe them and use 512-byte sector sizes. Had the drives reported 4k sector sizes, they'd work great under Linux and not at all under Windows.

    This isn't a Linux problem, it's a drive problem caused by Windows. The solution is to implement yet another workaround for stupid devices, and start aligning partitions to 4k by default.

    Nitpick: SDHC card sectors are always 512 bytes, and most SD card sectors are 512 bytes too. Flash memory would benefit from larger sector sizes too, but they've probably stuck to 512 bytes for Windows compatibility.

  3. Re:I just bought one of these by King Kwame Kilpatric · · Score: 5, Informative
    The problem is that WD doesn't tell the system about the sector size.
    /dev/sdd:

    Model=WDC WD15EARS-00Z5B1, FwRev=80.00A80, SerialNo=
    Config={ HardSect NotMFM HdSw>15uSec SpinMotCtl Fixed DTR>5Mbs FmtGapReq }
    RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=50
    BuffType=unknown, BuffSize=unknown, MaxMultSect=16, MultSect=16

    It looks to me that this should *really* be fixed by WD with a firmware update.

    Workaround: instead of plain fdisk, invoke it as fdisk -H 224 -S 56, as described on Theodore Ts'o's blog.
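
    Why that geometry helps can be checked with a little arithmetic. The sketch below assumes 512-byte logical sectors, 4096-byte physical sectors, and a first partition that starts one track in, as DOS-style partitioning tools traditionally do. The helper names are made up for illustration.

```python
LOGICAL = 512
PHYSICAL = 4096
SECTORS_PER_PHYSICAL = PHYSICAL // LOGICAL  # 8 logical sectors per physical one


def cylinder_sectors(heads: int, sectors_per_track: int) -> int:
    """Logical sectors in one fake 'cylinder' for the given geometry."""
    return heads * sectors_per_track


def geometry_is_4k_friendly(heads: int, sectors_per_track: int) -> bool:
    """True if both track and cylinder boundaries fall on multiples of
    8 logical sectors, i.e. on 4096-byte boundaries."""
    return (sectors_per_track % SECTORS_PER_PHYSICAL == 0
            and cylinder_sectors(heads, sectors_per_track) % SECTORS_PER_PHYSICAL == 0)


if __name__ == "__main__":
    for heads, spt in ((255, 63), (224, 56)):
        print(heads, spt, cylinder_sectors(heads, spt),
              geometry_is_4k_friendly(heads, spt))
```

    With the default 255 heads and 63 sectors, a cylinder is 16065 sectors, which is not a multiple of 8. With 224 heads and 56 sectors it is 12544 sectors, which is, so partitions placed on those boundaries start on a physical sector.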

  4. I was worried about this... and am still unclear by bmajik · · Score: 4, Informative

    I just got one of the 1TB WD drives with 64MB of cache that are known to use 4KB sectors.

    Here is how it shows up in dmesg:
    [ 3.420488] sd 1:0:0:0: [sdb] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)

    and here's what hdparm -I says:
    ATA device, with non-removable media
    Model Number: WDC WD10EARS-00Y5B1
    Serial Number: WD-WCAV55227529
    Firmware Revision: 80.00A80
    Transport: Serial, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6
    Standards:
    Supported: 8 7 6 5
    Likely used: 8
    Configuration:
    Logical max current
    cylinders 16383 16383
    heads 16 16
    sectors/track 63 63
    --
    CHS current addressable sectors: 16514064
    LBA user addressable sectors: 268435455
    LBA48 user addressable sectors: 1953525168
    Logical/Physical Sector size: 512 bytes
    device size with M = 1024*1024: 953869 MBytes
    device size with M = 1000*1000: 1000204 MBytes (1000 GB)
    cache/buffer size = unknown
    Capabilities:
    LBA, IORDY(can be disabled)
    Queue depth: 32
    Standby timer values: spec'd by Standard, with device specific minimum
    R/W multiple sector transfer: Max = 16 Current = 1
    Recommended acoustic management value: 128, current value: 254
    DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
    Cycle time: min=120ns recommended=120ns
    PIO: pio0 pio1 pio2 pio3 pio4
    Cycle time: no flow control=120ns IORDY flow control=120ns
    Commands/features:
    Enabled Supported:
    * SMART feature set
    Security Mode feature set
    * Power Management feature set
    * Write cache
    * Look-ahead
    * Host Protected Area feature set
    * WRITE_BUFFER command
    * READ_B
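
    The sizes the kernel believes can also be read from sysfs rather than from hdparm. This is a rough sketch, assuming a kernel recent enough to expose logical_block_size and physical_block_size under /sys/block/<device>/queue. The function names are hypothetical, and the sysfs_root parameter exists only so the code can be tried against a fake directory tree.

```python
from pathlib import Path


def block_sizes(device: str, sysfs_root: str = "/sys/block"):
    """Return (logical, physical) block sizes in bytes as reported by the
    kernel for the given device name, e.g. 'sdb'."""
    queue = Path(sysfs_root) / device / "queue"
    logical = int((queue / "logical_block_size").read_text().strip())
    physical = int((queue / "physical_block_size").read_text().strip())
    return logical, physical


def reports_512_emulation(logical: int, physical: int) -> bool:
    """True if the drive admits to physical sectors larger than its
    logical ones."""
    return physical > logical
```

    Calling block_sizes("sdb") on a drive that reports itself the way the output above does would presumably give 512 for both values, since the kernel can only pass on what the firmware claims.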

    --
    My opinions are my own, and do not necessarily represent those of my employer.
  5. Re:Set 32 sectors per track by kimvette · · Score: 4, Informative

    The terminal is not irrelevant. If your Cisco router is ever compromised (it happens) or if IOS becomes corrupt (or if you have an IOS install with a nasty bug where the password does not save correctly, or when an IOS upgrade goes badly) or someone fudges the configuration up, often the only way to recover it is through the serial port. Serial ports are also very handy for integrating video surveillance with point-of-sale systems that are not IP-aware (or worse, antiquated DVR appliances which can't do POS integration over IP), for some smart switches, *NIX boxes that have been rooted (I've rescued a Solaris box through a serial connection in an enterprise environment where reinstall was not possible due to poor timing - week of finals - and backups were sabotaged by a disgruntled graduate student and logins through IP and at the console were blocked), and so forth. However, I'd rather see RS-485 or RS-422 take RS-232's place, since RS-485 and RS-422 can work over much longer distances and you can hang multiple serial devices off of a single bus.

    RS-232 might be absent from a lot of consumer motherboards, but it is far from dead and certainly not irrelevant, even now in 2010.

    --
    The Christian Right is Neither (Christian nor right). See: Matthew 23, Matthew 25, Ezekiel 16:48-50
  6. Re:Set 32 sectors per track by bertok · · Score: 5, Informative

    Actually this problem is potentially much worse on SSDs. Erase blocks are huge, and read-modify-write really sucks on flash.

    Couldn't this be addressed (at least in part) by a battery-backed write cache like better RAID controllers use? Set it up like SAN snapshots (so it just stores the diff between what's in the actual flash storage and what's been changed so far), and then write the changed blocks when it's most advantageous (e.g. when there's an entire block's worth of data, so it would all have to be erased by the flash storage anyway).
    Maybe combine that with something like a disk defrag, except instead of storing frequently-sequentially-read data in physical sequence, store frequently-written data (regardless of if it's sequentially-read or not) in physical sequence.

    That's exactly what most SSD controllers do!

    Some now come with 32 to 64MB of cache, and some of the new Sandforce controller based SSDs also come with a little ultracapacitor that acts like a mini UPS. The cache is used as scratch space for reordering writes and defragging blocks.

    There was a firmware patch recently for the OCZ Vertex series of SSDs that enabled background defrag. If you let the drive sit there for a few minutes, it would start getting faster until it returned to 'as new' speeds.
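
    The write-coalescing idea can be sketched as a toy model. This is not how any particular controller's firmware works (that is proprietary). The class name, the four-pages-per-erase-block figure, and the `flushed` list standing in for flash are all assumptions made for illustration.

```python
from collections import defaultdict

PAGES_PER_ERASE_BLOCK = 4  # hypothetical and tiny, purely for illustration


class CoalescingCache:
    """Buffers page writes and only writes to 'flash' once a whole erase
    block's worth of pages has been collected."""

    def __init__(self, pages_per_block: int = PAGES_PER_ERASE_BLOCK):
        self.pages_per_block = pages_per_block
        self.pending = defaultdict(dict)  # block number -> {page index: data}
        self.flushed = []                 # (block number, [page data, ...])

    def write(self, page_number: int, data):
        """Record a page write; flush the block once every page is dirty."""
        block, index = divmod(page_number, self.pages_per_block)
        self.pending[block][index] = data
        if len(self.pending[block]) == self.pages_per_block:
            self._flush(block)

    def _flush(self, block: int):
        """Write one full erase block in a single operation."""
        pages = self.pending.pop(block)
        ordered = [pages[i] for i in range(self.pages_per_block)]
        self.flushed.append((block, ordered))
```

    Writing pages 0 through 3 in any order produces a single full-block write instead of four read-modify-write cycles. Partially filled blocks stay in the cache, which is why a real design would need the capacitor or battery mentioned above.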

  7. Re:Poorly researched article. by Radtoo · · Score: 4, Informative

    I agree that the headline is grossly misleading. Linux does support 4k block sizes just fine. But this is not a distro-specific issue, so you are wrong, too.
    This is simply a matter of fdisk from that version of util-linux-ng (which is clearly named in the article) trusting the hardware vendor to specify correct block sizes. The vendor did not. Thus fdisk does not end up with 4k block sizes, as happens for many programs. And only(?) parted apparently contains a workaround that detects the correct block size.

    It's not that you can't use parted on Gentoo, though; it is just that in the world of user choices that is Gentoo, not everyone will be using that program or that particular option.