Slashdot Mirror


Your Hard Drive Lies to You

fenderdb writes "Brad Fitzgerald of LiveJournal fame has written a utility and a quick article on how all hard drives from the consumer level to the highest level 'enterprise' grade SCSI and SATA drives do not obey the fsync() function. Manufacturers are blatantly sacrificing integrity in favor of scoring higher on 'pure speed' performance benchmarking."

512 comments

  1. Hardly a new thing... by |>>? · · Score: 4, Funny

    Since when do computers do what you mean?

    --
    |>>? ..EBCDIC for Onno..
    1. Re:Hardly a new thing... by Clay+Pigeon+-TPF-VS- · · Score: 5, Funny

      You must be new here. Computers always do what you tell them to do in the command line. What, you're using a gui? Well that's your fault then.

      --
      Viral software licensing is not freedom, it is in fact GNU/Socialism.
    2. Re:Hardly a new thing... by pyrrhonist · · Score: 5, Funny
      Computers always do what you tell them to do in the command line.

      They sure do.

      $ rm -rf * .o
      $ ls -a
      . ..
      $
      FUCK!!!!!

      --
      Show me on the doll where his noodly appendage touched you.
    3. Re:Hardly a new thing... by darkpixel2k · · Score: 1

      Since when do computers do what you mean?

      Even computers like the human brain.
      I mean--I'm *sure* the submitter meant to type "Brad Fitzpatrick"...you know...especially since it's the title of the site he's linking to. I'm sure he didn't mean to guess his last name.

      --
      There's no place like ::1 (I've completed my transition to IPv6)
    4. Re:Hardly a new thing... by pyropunk51 · · Score: 5, Funny

      I really hate this damned machine!
      I wish that I could sell it.
      It never does quite what I want,
      But only what I tell it!

      --
      double penetration; //ouch
    5. Re:Hardly a new thing... by khrtt · · Score: 1, Offtopic

      How is parent offtopic if it's a direct answer to grandparent? Who the fuck modded grandparent troll when it's obviously a joke? Are the mods on drugs today, or are they just being mods?

    6. Re:Hardly a new thing... by fvbommel · · Score: 1

      Just because you're telling them to do something other than what you wanted to tell them to do doesn't mean they did something wrong :)

    7. Re:Hardly a new thing... by indifferent+children · · Score: 3, Informative
      Windows is WYSIWYG; Linux is YAFIYGI (You asked for it, you got it).

      This is an old quote, but not everyone has seen it. This is much like Neal Stephenson comparing Linux to the Hole Hawg drill in "In the Beginning Was the Command Line". Great read!

      --
      Censorship is telling a man he can't have a steak just because a baby can't chew it. --Mark Twain
    8. Re:Hardly a new thing... by MooseGuy529 · · Score: 1, Offtopic
      Are the mods on drugs today, or are they just being mods?

      Is there a difference? ;-)

      You are right, it was obviously a joke. Slashdot should have a quiz thing before you moderate that shows a bunch of typical posts and makes sure you moderate them right. Show anti-Microsoft and anti-Linux posts of varying quality, and make sure the moderator rates the post and not the content; show funny posts and make sure they get modded funny... sort of like quality control.

      --

      Tired of free iPod sigs? Subscribe to my blacklist

    9. Re:Hardly a new thing... by Anonymous Coward · · Score: 0

      you must be new here

    10. Re:Hardly a new thing... by Afrosheen · · Score: 2, Informative

      You must be new here. That's the whole point of meta-moderation. If you do it, it improves your chances of moderating in the future, because it basically reviews the moderating quality of previous moderation from other people.

    11. Re:Hardly a new thing... by Fulcrum+of+Evil · · Score: 4, Funny

      Is there a difference? ;-)

      I don't know about you, but when I mod slashdot, I'm almost always drunk or stoned. Really, it's the only way to fit in.

      --
      "We returned the General to El Salvador, or maybe Guatemala, it's difficult to tell from 10,000 feet"
    12. Re:Hardly a new thing... by slimey_limey · · Score: 2, Informative

      I love that article/essay. Link: In the Beginning was the Command Line. It's a plain CRLF text file in a ZIP archive.

    13. Re:Hardly a new thing... by psylew · · Score: 1

      Interesting idea. But even if people can do it right once, it doesn't mean they'll continue to mod reasonably once they're out of the controlled test environment. Beware of people with anonymity.

    14. Re:Hardly a new thing... by Anonymous Coward · · Score: 0

      Even with a GUI, the box is just doing what it's told... It's the programmer's fault if the GUI is buggy or just plain stinks.

    15. Re:Hardly a new thing... by MooseGuy529 · · Score: 1, Offtopic

      No, I'm not new here. I know what metamoderation is. My point is that many moderators (and meta-moderators) have biases, and if we could filter out the people who automatically mark any pro-Microsoft or anti-Linux post as Troll, then moderation quality would automatically go up. It's basically like a static meta-moderation done before they can start screwing things up.

      --

      Tired of free iPod sigs? Subscribe to my blacklist

    16. Re:Hardly a new thing... by MooseGuy529 · · Score: 1

      Hmm... basically they need a way of changing M2 so that instead of identifying moderators whose choices seem fair to Slashdot readers, it would identify moderators whose choices are fair. Right now Slashdot's moderation is pretty badly biased, so if we could have a static metamoderation system that is fed posts that are likely to trigger biases, we could catch biased moderators instead of having their bias-driven moderation encouraged by biased metamoderation.

      --

      Tired of free iPod sigs? Subscribe to my blacklist

    17. Re:Hardly a new thing... by pipingguy · · Score: 1


      That reminds me - I've been getting mod points weekly lately. Should I start smoking crack now?

    18. Re:Hardly a new thing... by Anonymous Coward · · Score: 0
      You must be new here. That's the whole point of meta-moderation. If you do it, it improves your chances of moderating in the future

      No, you must be new here. Meta moderating reduces your chances of getting mod points. I've been through long periods (3 months-ish) where I've meta moderated religiously - practically every time the option was presented to me - and I've been through equally long periods where I've studiously ignored the meta mod requests. I always got mod points more often when I didn't meta-moderate.

      There, my secret is out. I guess someone will fix the code now.

    19. Re:Hardly a new thing... by ScrewMaster · · Score: 1

      C:\> Undo Undo Undo Undo Undo

      --
      The higher the technology, the sharper that two-edged sword.
    20. Re:Hardly a new thing... by DrMrLordX · · Score: 1

      No, the whole point to metamoderating is to rate every moderation Unfair or Unfunny in order to cast doubt upon the entire moderation process itself.

      Hopefully, we'll all be forced to browse at -1 someday.

  2. Err... "lying" is the default setting. RTFM. by Tetard · · Score: 3, Informative

    Write Cache enable is default on most IDE/ATA
    drives. Most SCSI drives don't enable it.
    If you don't like it, turn it off. There's
    no "lying", and I'm sure the fsync() function
    doesn't know diddly squat about the cache of
    your disk. Maybe the ATA/device abstraction layer does, and I'm sure there's a configurable registry/sysctl/frob you can twiddle to make it DTRT (like FreeBSD has).

    Move along, nothing to see...
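
    For reference, a minimal sketch of what an application-side "sync to disk" looks like, assuming a POSIX-like system. The helper name durable_write and the file name are hypothetical. Note that os.fsync() only asks the operating system to flush its buffers to the device; whether the drive's own write cache has reached the platters is exactly the part under dispute here.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Write data, then ask the OS to push it out to the storage device."""
    with open(path, "wb") as f:
        f.write(data)          # bytes land in Python's userspace buffer
        f.flush()              # hand them over to the OS page cache
        os.fsync(f.fileno())   # ask the OS to flush its cache to the device


if __name__ == "__main__":
    # Hypothetical usage: write a small record to a temporary file.
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "journal.bin")
        durable_write(target, b"committed\n")
        with open(target, "rb") as f:
            print(f.read())
```

    If the drive acknowledges the flush while the data is still sitting in its volatile cache, nothing in this code can detect that.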

  3. What's this? by binaryspiral · · Score: 4, Funny

    Hard drive manufacturers screwing over customers? Why, who would have thought?

    1 billion bytes equals 1 gigabyte - since when?

    Dropped MTBF right after reducing the 3 year standard warranty to 1 year - good timing.

    Now this?

    Wow what a track record of consumer loving...

    1. Re:What's this? by Anonymous Coward · · Score: 1, Informative

      > 1 billion bytes equals 1 gigabyte - since when?

      Billion has equalled Giga since forever.

      Then people with computers decided close enough is good enough (the LAST people who should have done such a braindead thing) and decided to make some kilo, mega, giga, tera etc prefixes equal to the closest binary representation, 1024, 1048576 etc and it's confused everybody ever since.

      What's worse is that not all kilo/mega/giga in computing actually means 1024/1048576/etc, just some. One gigabyte? One Gigahertz? One Gigabit/second?

    2. Re:What's this? by maxwell+demon · · Score: 2, Insightful

      Not to mention the 1.44 "Megabyte" floppy disk where "Megabyte" means 1024000 Bytes ...

      --
      The Tao of math: The numbers you can count are not the real numbers.
    3. Re:What's this? by digitalchinky · · Score: 2, Informative

      Just about every drive you can buy in the Philippines (of all places) now has a 5 year warranty as standard... This is from all manufacturers. No difference in price, though the drives themselves both look and feel much more solid when compared to their one year warranty counterparts from last year.

      It's been a good three years now since I've had a drive fail, either that's just good luck, or drives are just not as fragile as... that maxtor rubbish.

    4. Re:What's this? by Anonymous Coward · · Score: 2, Informative
      1 billion bytes equals 1 gigabyte - since when?

      Since 1960. Since 1998, 2^30 bytes = 1 gibibyte.

    5. Re:What's this? by pe1rxq · · Score: 0, Flamebait

      Are you an American?
      (That would explain why you don't know shit about Kilo, Mega and Giga....)

      --
      Secure messaging: http://quickmsg.vreeken.net/
    6. Re:What's this? by Anonymous Coward · · Score: 1, Funny

      Actually you are wrong.

      1024 is always "Kilo". 1000 is "thou".


      So a kilometer is 1024 meters? Thanks for the correction! From now on I'll start saying "thoumeters" when I mean "1000 meters".

    7. Re:What's this? by krymsin01 · · Score: 3, Funny

      Down with the kilometer. Up with the thoumeter!

      --
      stuff
    8. Re:What's this? by c_g_hills · · Score: 1

      In the UK, a billion is equal to a million million, or 1,000,000,000,000. We're getting screwed even worse!

    9. Re:What's this? by MaynardJanKeymeulen · · Score: 1

      He has a point. 1 megabyte used to be 2^20 bytes, and everybody used it that way, but when some harddisk manufacturers started to change things in their favor, it became very handy for them to suddenly call 2^20 bytes a mebibyte. And so it became official.

      --
      "The day Microsoft makes a product that doesn't suck is the day they make a vacuum cleaner."
    10. Re:What's this? by binaryspiral · · Score: 1

      Come on, you all know what I'm talking about, don't be a chode about it.

      1,000,000,000 bytes != 1 Gigabyte, unless you put a little legal disclaimer on the box so you can sell smaller harddrives with big numbers on it.

      So when I plug my 160GB hard drive in, linux, windows, and mac all say I have 152.66GB - this is the screw I don't enjoy. And no - it's not all lost to formatting.

    11. Re:What's this? by Anonymous Coward · · Score: 0

      To make it worse again, not only does "giga" change depending on the postfix, but even 'gigabit' changes depending on its meaning

      RAM chips measured in kilobits, megabits and gigabits are binary (powers of 2) whereas networking kilobits, megabits and gigabits are decimal.

      Then there's hard drive controller bandwidth measured as decimal (133megabytes per second is 133,000,000 bytes per second) but the files sent over it are measured binary when in RAM, but decimal when going over the controller, and binary or decimal when on disk, depending which app is counting.

      Whoever started using this "binary is close enough to decimal" was a moron who put as much thought into things as our old friends who started the Y2K thing, except their legacy lends to confusion every single day!

    12. Re:What's this? by pyrrhonist · · Score: 4, Funny
      2^30 bytes = 1 gibibyte.

      AaARaaGGgHHhh! I simply loathe the IEC binary prefix names.

      Kibibits sounds like dog food.

      "Kibibits, Kibibits, I'm gonna get me some Kibibits..."

      --
      Show me on the doll where his noodly appendage touched you.
    13. Re:What's this? by KiloByte · · Score: 2, Insightful

      No, the gibi crap is a new invention, going against established practice. And, it sounds awful.

      --
      The creatures outside looked from Alt-Right to Antifa; but already it was impossible to say which was which.
    14. Re:What's this? by Anonymous Coward · · Score: 0
      In the UK... We're getting screwed even worse!

      Stop voting Labour Party, then! *zing*

    15. Re:What's this? by thsths · · Score: 5, Informative

      > 1,000,000,000 bytes != 1 Gigabyte

      Actually, it is. The standard was updated in 1998 to avoid confusion (Standard IEC 60027-2). Giga is 10^9, and it is constant, which means it does not change just because you use it for hard disks or memory.

      If you mean 2^30, then you have to say gigabinary, abbreviated as gibi or Gi. Having different names for different things can avoid an awful lot of confusion, so I would very much recommend using them.

      And now please put the following events into the correct order: America goes metric, hell freezes over, people use Gibi correctly.

    16. Re:What's this? by Anonymous Coward · · Score: 0

      No, the gibi crap is a new invention, going against established practice. And, it sounds awful.

      My established practice predates your established practice by about 45 years.

      And yes, the binary prefixes sound stupid. But any more stupid than using "kilo" as 1024 because it's "close enough?"

    17. Re:What's this? by Anonymous Coward · · Score: 0

      You're confusing Mebibyte for Megabyte, etc.

    18. Re:What's this? by KiloByte · · Score: 2, Informative

      Well, they are usually (correctly) labelled as 1440KB instead of 1.44MB. They have 2 sides and 80 tracks with 18 512-byte sectors on each track.

      It's real 1440KB, without cheating on the sector's headers and/or inter-sector gaps. If you format the floppy yourself, you can shave quite a bit of space from those gaps, and this was a quite popular thing to do.
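
      The geometry above can be checked with a few lines of arithmetic. This is only a sketch; the constant and function names are made up for illustration.

```python
SIDES = 2
TRACKS_PER_SIDE = 80
SECTORS_PER_TRACK = 18
BYTES_PER_SECTOR = 512


def floppy_capacity_bytes(sides: int = SIDES,
                          tracks: int = TRACKS_PER_SIDE,
                          sectors: int = SECTORS_PER_TRACK,
                          sector_size: int = BYTES_PER_SECTOR) -> int:
    """Raw capacity of a floppy with the given geometry, in bytes."""
    return sides * tracks * sectors * sector_size


total = floppy_capacity_bytes()
print(total)                  # 1474560 bytes
print(total // 1024)          # 1440 (units of 1024 bytes)
print(total / (1000 * 1024))  # 1.44 (the hybrid "megabyte" of 1024000 bytes)
print(total / 1024 ** 2)      # 1.40625 (units of 2^20 bytes)
```

      So "1.44MB" is only true for the mixed unit of 1000 x 1024 bytes; in pure binary units the same disk is 1.40625.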

      --
      The creatures outside looked from Alt-Right to Antifa; but already it was impossible to say which was which.
    19. Re:What's this? by Alioth · · Score: 1

      I dunno. We have 70 desktop systems (HPaq d530) with Maxtor 40GB drives. We've had 8 hard drives fail in these machines in the last 12 months - that's >10% failure rate. We've also had a 35GB SCSI drive (less than 1 year old) fail in one of the HP dl380 servers we have (we have 20 of these disks).

    20. Re:What's this? by Crayon+Kid · · Score: 3, Informative

      I was also under the (wrong) impression that gigabit was the good old binary thing, and that gibi was something they made to express decimal alternatives. And in fact I find out it's quite contrary, thanks to the parent poster. :)

      Having repented, I point you to this reference which does a very nice job of summing everything up.

      --
      i ate crayons when i was a kid and now i have two braincells and the blue ones taste nicer
    21. Re:What's this? by BrianRaker · · Score: 1

      I've never been a fan of the low-profile Maxtor drives. I run a helldesk for a reasonably large company's satellite office and have approx 150 d530cmt's on active deployment at this location. Though, my failure rate is quite a bit less: in the 6 months I've been here I've only had one dead drive. The tech before me had two over the life of the deployment (late 2003 ~ Dec'04, I believe) while she was there.

      --
      As I walk through the valley of death I fear no one, for I am the meanest sonova bitch in the valley!
    22. Re:What's this? by arivanov · · Score: 1

      Maxtor 30 and 40 low profile run very hot.

      I have had so far 0 failure rate with them in servers with good cooling where they can be chilled down nicely to sub-30C due to the low profile which greatly improves airflow (knocking on wood here).

      At the same time I have had 50%+ 1st year failure rate in Cubid mini ITX cases, 30%+ in Compaqs and other similar systems without dedicated disk fans. If their SMART sensors are to be believed they heat up to 42C+ in a HPaq and 50C+ in an ITX case.

      If you run them in a desktop you must have all power management features enabled and must spindown the drive when not in use. If you do that, it will survive even in an mini ITX case. Otherwise your data will be toast. Literally.

      --
      Baker's Law: Misery no longer loves company. Nowadays it insists on it
      http://www.sigsegv.cx/
    23. Re:What's this? by vought · · Score: 1

      I'll never buy another Maxtor product again after my MaxLine Plus II died.

      I was happy to read on their web site that the MaxLine Plus II drives have a five year warranty.

      I was, ah, disappointed to find out that the MaxLine Plus II I bought in February 2003 had a TWO year warranty.

      Same drive, different warranty. I'm fucked. Yay Maxtor.

    24. Re:What's this? by MattBurke · · Score: 1

      Not quite. 3.5" double-sided, high density floppies are actually 2000KB. They are only reduced to 1440KB after FAT formatting which consumes 7 of the 25 sectors per track, or if you were an Acorn user you only lost 5 sectors per track to formatting, giving you 1600KB/disk to play with.

    25. Re:What's this? by Kadmium · · Score: 1
      '1 billion bytes equals 1 gigabyte - since when?'

      Actually, that's accurate. The unit you're thinking of is called a gibibyte. See http://en.wikipedia.org/wiki/Gibibyte

    26. Re:What's this? by darien · · Score: 1

      The Amiga could format high-density disks to 1,760 Kb, given a high-density disk drive (which only came as standard in the high-end machines). However, if I remember rightly, to do this it had to slow the drive down. Very flexible machines, Amigas...

    27. Re:What's this? by darien · · Score: 3, Funny

      No need to get all holier-than-thou.

    28. Re:What's this? by KiloByte · · Score: 2, Informative

      Wrong. You're confusing low-level formatting (laying physical sectors onto the disk's surface) with creating a filesystem -- this is what's usually called formatting these days.

      If you obey the standard PC format, you'll get 18 sectors per track, leaving quite a lot of margin space. The margins are needed as the drive doesn't really care whether the new data is put in exactly the same place as the old sector was. Still, the standard is way too conservative, and many programs like fdformat let you reduce the margins. Even Microsoft's original Win95 install floppies used a 1.7MB format.

      That was the physical low-level format, a rough equivalent to the level 2 ISO/OSI network layer (level 1 is twiddling the bits, level 2 defines the byte and sector boundaries in the raw bit stream).

      FAT formatting (the filesystem) uses up 33 sectors (on the whole disk, not per-track), reducing the useful space to 2847 sectors, that is 1457664 bytes. And this is what you see when you check the free space on an empty floppy.
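
      A sketch of where those 33 sectors go, assuming the usual FAT12 layout for a 1440KB floppy (1 boot sector, 2 FAT copies of 9 sectors each, 224 root directory entries of 32 bytes). The layout constants are an assumption about the standard format, not something stated above; the names are illustrative.

```python
BYTES_PER_SECTOR = 512
TOTAL_SECTORS = 2 * 80 * 18  # 2880 sectors on a 1440KB floppy

# Assumed standard FAT12 layout for this disk size.
BOOT_SECTORS = 1
FAT_COPIES = 2
SECTORS_PER_FAT = 9
ROOT_DIR_ENTRIES = 224
BYTES_PER_DIR_ENTRY = 32


def fat12_overhead_sectors() -> int:
    """Sectors consumed by filesystem metadata before any file data."""
    root_dir_sectors = ROOT_DIR_ENTRIES * BYTES_PER_DIR_ENTRY // BYTES_PER_SECTOR
    return BOOT_SECTORS + FAT_COPIES * SECTORS_PER_FAT + root_dir_sectors


overhead = fat12_overhead_sectors()
data_sectors = TOTAL_SECTORS - overhead
print(overhead)                         # 33
print(data_sectors)                     # 2847
print(data_sectors * BYTES_PER_SECTOR)  # 1457664
```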

      --
      The creatures outside looked from Alt-Right to Antifa; but already it was impossible to say which was which.
    29. Re:What's this? by DarkEdgeX · · Score: 1

      Actually it's not. Those new SI prefixes will never be universally accepted (and are really rather retarded when the existing prefixes function fine). I suspect it'll be about as successful as trying to make the metric system universal in the US.

      --
      All I know about Bush is I had a good job when Clinton was president.
    30. Re:What's this? by P-Nuts · · Score: 0, Flamebait

      It would be appropriate to market hard disks in terms of metric gigabytes etc, if any operating systems reported disk space in terms of metric gigabytes. But since they all report space in GB, meaning 1,073,741,824 bytes, you can understand why consumers are annoyed to find that the disk they bought isn't as big as it said it was. This IEC standard to introduce kiB, MiB, GiB is just a joke. Standards are supposed to standardize existing practice, not just make up some stupid names for things.

    31. Re:What's this? by Targon · · Score: 1

      Of course, if you REALLY need that much storage space, you should get a larger hard drive. If I know I need to store 80 gigs worth of information on a hard drive, I'd want 120 gigs just to give me some extra space. It's a pain to run out of room for log files and temp space for example.

      Disk space is also VERY cheap these days, so how much room you have on a drive isn't as big a deal as it used to be. Remember the old days of 20 Megabyte hard drives and how they were more than enough for most people? Remember the cost?

    32. Re:What's this? by iwan-nl · · Score: 1

      I agree. It doesn't matter though, since I've never met a single person who actually uses these units in real life.

      --
      I'm trying to improve my English. Please correct me on any spelling/grammar errors in this post.
    33. Re:What's this? by Anonymous Coward · · Score: 0

      G (giga) is 1 billion (1000^3), as it's always been.
      Gi (gibi - giga binary) is 1024^3, but Windows and software developers who apparently don't know the difference keep calling it G.

      Software that I can think of instantly that have adapted: DC++ (and derivatives), Valknut.

      Read more about it on http://en.wikipedia.org/wiki/Binary_prefixes#IEC_standard_prefixes //MMN-o, the anonymous coward

    34. Re:What's this? by Anonymous Coward · · Score: 0

      So that kilogram of sugar I bought the other day is actually 1024 grams?

      Ah you learn something new every day...

    35. Re:What's this? by Shinobi · · Score: 2, Insightful

      Ever since they started using the Giga prefix. Giga is explicitly defined as 10^9 base-10, ever since 1873 when the kilo, Mega, Giga etc prefixes were standardized.

      Ergo, 1 GigaByte=1 000 000 000 Bytes.

      Anything else is a result of comp sci people fucking up their standards compliance.

    36. Re:What's this? by Anonymous Coward · · Score: 0

      Please, kilo was used for 1024 only when addressing computer memory, where there were technical reasons for it... And then the amount of memory used the same units, unfortunately. That's it. There is no such thing as "metric GB" - a giga is a giga is a giga. The use of kilo = 1024 in data storage has always been a mistake by misinformed individuals not understanding basic computer science, and most wrong are the hybrids that were used by floppies. And don't try using it in data transfer either. Sending 1 gigabyte of data over a 1 gigabit/s network (1000000000 bit/s) takes 8 seconds. Don't suggest anything else or you will shoot yourself in the foot. I just wonder why I am writing this message, the uninformed will just not understand and the informed are bored enough of the uninformed masses.
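
      The transfer-time arithmetic with decimal prefixes, as a minimal sketch. The function name is hypothetical and protocol overhead is ignored. A byte is 8 bits, so 10^9 bytes over a 10^9 bit/s link comes to 8 seconds, while 2^30 bytes over the same link takes a little longer.

```python
GIGA = 10 ** 9


def transfer_seconds(num_bytes: int, link_bits_per_second: int) -> float:
    """Ideal transfer time: total bits divided by the link rate."""
    return num_bytes * 8 / link_bits_per_second


print(transfer_seconds(1 * GIGA, 1 * GIGA))  # 8.0
print(transfer_seconds(2 ** 30, 1 * GIGA))   # roughly 8.59
```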

    37. Re:What's this? by dirty · · Score: 1

      I remember most floppies being advertised as 1.4 Megabytes, which is accurate (actually I think it's .00625MB less than the actual capacity).

      --

      -matt
    38. Re:What's this? by drsmithy · · Score: 1
      Then people with computers decided close enough is good enough (the LAST people who should have done such a braindead thing) and decided to make some kilo, mega, giga, tera etc prefixes equal to the closest binary representation, 1024, 1048576 etc and it's confused everybody ever since.

      No, it hasn't. I've yet to meet *anyone* who should have been confused about those prefixes that actually were.

    39. Re:What's this? by Anonymous Coward · · Score: 0

      Seagate increased the warranty on all new drives, and even retroactively increased it to 5 years on a good portion of the drives they had already sold.

    40. Re:What's this? by Anonymous Coward · · Score: 0

      Yes, we all remember when that happened in 1998. How many times have you uttered anything-bi? Nobody does. It isn't taught at university. Kilobytes are still 1024 because that's the way every textbook has been written. Newer ones include the "-binary" garbage, but are careful to point out what I'm saying here: language usage comes from people speaking, not from a mandate. If people use "Gigabyte" to mean 1024^3 bytes, it doesn't matter that some marketdroids got together and labeled that usage "nonstandard". I am more concerned with being understood (which is the purpose of speech and language) than with obeying the minority idea of what would be more easily understood.

    41. Re:What's this? by wisdom_brewing · · Score: 1

      Could someone tell Microsoft then? Windows should still read 160GB for a 160GB drive... it doesn't, so obviously there's an inconsistency.

    42. Re:What's this? by MonkeyOfRage · · Score: 1

      Are you an American? (That would explain why you don't know shit about Kilo, Mega and Giga....)

      And what would account for your apparent misconceptions about the charm or cleverness of bigotry?

    43. Re:What's this? by MonkeyOfRage · · Score: 1

      Almost as delightful as having Western Digital refuse to honor a warranty because they said it had originally been shipped by them to another region (although it was purchased retail in this one). Same drive, SAME warranty, still fucked.

    44. Re:What's this? by some+guy+I+know · · Score: 2, Informative
      The Amiga could format high-density disks to 1,760 Kb, given a high-density disk drive (which only came as standard in the high-end machines). However, if I remember rightly, to do this it had to slow the drive down.
      The drive itself was not slowed down.
      Instead, the Amiga had the drive write an entire track at a time, rather than just one sector at a time.
      (This meant that it could store more sectors per track, because there were no inter-sector gaps, just a lead-in and lead-out for the track as a whole.)
      The reason that the drive seemed slower was because to write one sector, the Amiga had to read an entire track, replace the sector to be written, and then rewrite the entire track back to disk.
      All a PC had to do to write a sector was to write the sector.
      So it was the OS and method of storage that caused the slowdown, not the drive (hardware/firmware) itself.

      What this meant was that writing random sectors would take more time, but writing sectors sequentially would not (usually).
      In fact, disk-to-disk copies were faster, because the Amiga could start reading in the middle of a track to get the whole track, whereas a PC had to wait for the particular sector that it wanted to read to come around under the read head.
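
      The read-modify-write cycle described above, as a toy model. The class and method names are hypothetical, and 22 sectors per track is simply what 1,760KB works out to with 2 sides and 80 tracks of 512-byte sectors. It only counts track operations; it says nothing about real timing.

```python
SECTOR_SIZE = 512
SECTORS_PER_TRACK = 22  # 2 sides * 80 tracks * 22 sectors * 512 bytes = 1760KB
TRACK_SIZE = SECTOR_SIZE * SECTORS_PER_TRACK


class TrackAtATimeDisk:
    """Toy disk that can only read or write a whole track at once."""

    def __init__(self, tracks: int = 160) -> None:
        self._tracks = [bytearray(TRACK_SIZE) for _ in range(tracks)]
        self.track_reads = 0
        self.track_writes = 0

    def read_track(self, track: int) -> bytearray:
        self.track_reads += 1
        return bytearray(self._tracks[track])  # return a copy

    def write_track(self, track: int, data: bytes) -> None:
        if len(data) != TRACK_SIZE:
            raise ValueError("a track write must cover the whole track")
        self.track_writes += 1
        self._tracks[track] = bytearray(data)

    def write_sector(self, track: int, sector: int, data: bytes) -> None:
        """Updating one sector costs a full track read plus a full track write."""
        if len(data) != SECTOR_SIZE:
            raise ValueError("sector data must be exactly one sector long")
        buf = self.read_track(track)
        start = sector * SECTOR_SIZE
        buf[start:start + SECTOR_SIZE] = data
        self.write_track(track, buf)


disk = TrackAtATimeDisk()
disk.write_sector(track=0, sector=3, data=b"\xaa" * SECTOR_SIZE)
print(disk.track_reads, disk.track_writes)  # 1 1
```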
      --
      Those who sacrifice security to condemn liberty deserve to repeat history or something. - Benjamin Santayana
    45. Re:What's this? by Anonymous Coward · · Score: 0

      Oh my god he's redorkulated.

      No, you cannot have a conference to decide that what people say is wrong and suggest they use something else instead. A room full of egg heads doesn't override a country or world full of shit heads.

    46. Re:What's this? by NoMoreNicksLeft · · Score: 1

      Standards are supposed to standardize existing practice, not just make up some stupid names for things.

      Except in Europe, where standards are made up by aristocrats who have never needed to count anything themselves, having servants run around in dusted wigs and so forth. They are the inventors of (in no particular order) the metric system and the ten day week (I'm not making this up, you can't make this stuff up).

    47. Re:What's this? by gweihir · · Score: 1

      1 billion bytes equals 1 gigabyte - since when?

      Ever since the SI was defined. Do you live in the stone-age?

      The use of 1GB = 2^30 Bytes is wrong and illegal in the civilised world when used in a measurement. (RAM size is _not_ measured, since it only grows by factors of 2. It is in size-classes. HDDs can take any size, so there it is a measurement and _must_ use legal, i.e. SI, units and prefixes or face legal challenges and possible fines.)

      Correct usage:
      2^30 Bytes = 1 GiB
      10^9 Bytes = 1 GB

      The first is only IEC, but the second is SI and the legal usage almost anywhere.

      --
      Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
    48. Re:What's this? by Anonymous Coward · · Score: 1, Insightful

      Yes, but that number has NO importance. No one has ever needed to refer to 8*(10^9) bits of data, and no one ever will. We could either say I have 512MB of RAM, or that I have 536.870912MB of RAM.

      Knowing the sort of people who argue for mebi, I'd imagine you'll next suggest that's the RAM manufacturers' problem, and they should stop making address spaces powers of two...

    49. Re:What's this? by Lord+Ender · · Score: 1

      So you prefer ambiguity? I'm sorry, but "pyrrhonist doesn't like the sound of the word" is NO reason to continue using ambiguous language. The new prefixes are good for everyone, even the dumb people who have to be dragged to them kicking and screaming.

      --
      A slashdotter who didn't build his own computer is like a Jedi who didn't build his own lightsaber.
    50. Re:What's this? by GraemeDonaldson · · Score: 1
      please put the following events into the correct order: America goes metric, hell freezes over, people use Gibi correctly.

      It doesn't really matter what order you put them in, since they are all scheduled to happen at roughly now()+infinity. ;-)
      --
      I think, therefore I am. I think?
    51. Re:What's this? by Little+Brother · · Score: 1

      Am I reading you right? You would have 1GB on a HDD be different from 1GB on RAM? Surely not? What were you trying to say?

      --

      Little Brother, watching the watchers

    52. Re:What's this? by timster · · Score: 1

      All the drives Maxtor sold HP for that model of desktop were defective. I'm pretty sure it's not even a mechanical issue, it's some sort of logic issue. Some of these drives even forget what their model number is.

      Your failure rate will exceed 10% by quite a bit, I'd expect. Keep backups. I wouldn't be surprised to see a class-action lawsuit over this one. HP has stopped using that drive, though, so your warranty replacements will be Seagate.

      --
      I have seen the future, and it is inconvenient.
    53. Re:What's this? by orim · · Score: 1

      Recently had a 250Gb hard drive fail, after about 6 months. They gave me a new drive, but the warranty on that is only another 6 months. What a great way to say:
      "Yeah, we fucked up, we apologize for the inconvenience. Oh, btw, our bottom line is more important than your data, and we don't guarantee the new drive at all really."

      Greedy fuckers!

      --
      "If you could only see what I've seen with your eyes..." - Roy Batty
    54. Re:What's this? by Bob+Uhl · · Score: 1

      There was never any ambiguity: everyone in computers knew that kilo- indicated 1,024x, mega- 1,024x1,024 and so on. The fact that users of French units have such small brains that they cannot imagine one word having multiple meanings has no bearing on the rest of us, who can.

    55. Re:What's this? by Bob+Uhl · · Score: 1

      No--we just came up with a new standard. Base-10 is foolish for computer units (it's also foolish for other units, but that's another story). There's no ambiguity, since computer units are always base-2.

    56. Re:What's this? by Anonymous Coward · · Score: 0

      ergo

    57. Re:What's this? by admiralh · · Score: 1

      According to the standard SI naming convention, gigabyte *is* 1,000,000,000 bytes.

      If you're looking for the correct, though rarely used, prefix that floats around in the computer world, it's "gibi".

      kilo - 10^3
      kibi - 2^10 = 1024
      mega - 10^6
      mebi - 2^20 = 1024^2 = 1,048,576
      giga - 10^9
      gibi - 2^30 = 1024^3 = 1,073,741,824

      and so on.

      Link here.
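
      The table above can be turned into a quick conversion. This is only a sketch: the function name is made up for illustration, and the 40 and 250 figures are just drive sizes mentioned elsewhere in this thread.

```python
# Decimal (SI) vs. binary (IEC) prefixes, as listed in the table above.
SI = {"kilo": 10**3, "mega": 10**6, "giga": 10**9}
IEC = {"kibi": 2**10, "mebi": 2**20, "gibi": 2**30}


def advertised_gb_to_gib(gb):
    """Convert a capacity in decimal gigabytes to binary gibibytes.

    Hypothetical helper name, not part of any library.
    """
    return gb * SI["giga"] / IEC["gibi"]


if __name__ == "__main__":
    for size in (40, 250):
        print(f"{size} GB = {advertised_gb_to_gib(size):.2f} GiB")
```

      The gap between the two conventions grows with each prefix step, which is why it matters more for gigabyte drives than it ever did for kilobytes.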

      --
      Hopelessly pedantic since 1963.
    58. Re:What's this? by operagost · · Score: 2, Funny

      My computer gets forty gibibytes to the thoumeter, and that's the ways I likes it!

      --

      Gamingmuseum.com: Give your 3D accelerator a rest.
    59. Re:What's this? by operagost · · Score: 1

      Read the documentation or manual next time. If they had said 5 years on the box and then refused replacement after two, THAT would have been a fucking-over.

      --

      Gamingmuseum.com: Give your 3D accelerator a rest.
    60. Re:What's this? by Anonymous Coward · · Score: 0


      Hard drives store data in 512 byte blocks, not 500 byte blocks. Computers count storage in units of 1024 bytes (2 blocks) = 1 KB, 1024 KB = 1 MB, 1024 MB = 1 GB and so on.

      Bringing up Gi versus GB means you are a moron. And a pedant. Pedantic morons, such as you, should never be invited to parties as you are boring and will yammer on pompously about subjects you don't understand and no one cares about.

      I get over my pedantic side by not being a moron and bringing really good beer to parties. Good luck finding your own coping mechanism!

      And now please put the following events into the correct order: America goes metric, hell freezes over, people use Gibi correctly.
      America goes metric, hell freezes over, people use Gibi correctly, you get invited to a party.

    61. Re:What's this? by Anonymous Coward · · Score: 0
      Then people with computers decided close enough is good enough (the LAST people who should have done such a braindead thing) and decided to make some kilo, mega, giga, tera etc prefixes equal to the closest binary representation, 1024, 1048576 etc and it's confused everybody ever since.

      It gets worse than that. Remember 1.44MB floppies? "Mega" in that case means neither a million nor 1024*1024. For these, the formatted capacity is 1.44 * 1000 * 1024 = 1,474,560 bytes.
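
      A quick check of that arithmetic. The disk geometry (80 tracks, 2 sides, 18 sectors of 512 bytes) is the standard high-density 3.5" layout, added here as background rather than taken from the comment above.

```python
# Three candidate meanings of "1.44 MB" for a high-density 3.5" floppy.
SECTORS = 80 * 2 * 18          # tracks x sides x sectors per track
formatted_bytes = SECTORS * 512

decimal_mb = formatted_bytes / 10**6        # SI mega
binary_mb = formatted_bytes / 2**20         # 1024 * 1024
mixed_mb = formatted_bytes / (1000 * 1024)  # the hybrid that yields "1.44"

print(formatted_bytes, decimal_mb, binary_mb, mixed_mb)
```

      Only the mixed divisor gives the number printed on the box.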

      Yes, they are idiots. :)

    62. Re:What's this? by Anonymous Coward · · Score: 0

      The only people who aren't confused about those prefixes are the ones who presume idiotic simplicities like "In computing kilo = 1024".

      You know the type, the ones who never got past "I before E except after C" and get hung up on "wierd"

    63. Re:What's this? by Anonymous Coward · · Score: 0

      Actually, up until hard disks started getting close to the 1GB mark (traditional meaning: 1024*1024*1024 bytes), hard disk manufacturers used the traditional megabyte meaning to indicate the capacity of their products. Right around then, one of the manufacturers' marketing drones realized they could 'win' the race to 1GB by labeling their drives with non-standard base 10 labels. For a short while, only one manufacturer did this, but the others started losing business because people who didn't know about the mislabeling would see that nGB drives from company X cost less than the 'same size' drive from the other companies.

      Faced with potentially going out of business, the other manufacturers quickly followed suit. A few years later, as the result of a class-action lawsuit over mislabeling the size of hard drives, they were forced to indicate on the box how they measured the size of their drives.

      At some point after that (1998ish), the folks in charge of the SI definitions decided to 'fix' the confusion by adding some brand-new, never before seen prefixes that 99% of the world *still* has never heard of, and never will.

      In the mean time, software and RAM manufacturers have continued using the traditional meaning of KB, MB, GB, TB, etc.

      After all, if you want to insist on base 10 definitions of GB, etc, you'll have to let us know just how much data you can store in a cB (centibyte).

    64. Re:What's this? by djdanlib · · Score: 1

      It doesn't seem to me that hard drives have gotten less reliable in the past ten years. I think they more or less leveled off in true MTBF. I've used quite a few (because I need more and more storage all the time) and yet they all seem to still work.

    65. Re:What's this? by stonecypher · · Score: 1

      1 billion bytes equals 1 gigabyte - since when?

      1971. Look up a mebibyte.

      --
      StoneCypher is Full of BS
    66. Re:What's this? by stonecypher · · Score: 1

      New to 1971, perhaps.

      --
      StoneCypher is Full of BS
    67. Re:What's this? by Lord+Ender · · Score: 1

      So how fast is that 1megabit connection you have?

      --
      A slashdotter who didn't build his own computer is like a Jedi who didn't build his own lightsaber.
    68. Re:What's this? by julesh · · Score: 1

      We have 70 desktop systems (HPaq d530) with Maxtor 40GB drives. We've had 8 hard drives fail in these machines in the last 12 months - that's >10% failure rate

      That's fortunate for you. I have 2 desktop systems with Maxtor 40Gbs and a server with 2xMaxtor 40Gb. All 4 disks failed within a month of each other, towards the end of last year.

      They weren't all from the same batch, either.

    69. Re:What's this? by fm6 · · Score: 1
      There should be a FAQ for this issue. People learn language by imitation, not by reading reference books. So informal, imprecise usage creeps in. Sometimes that's dangerous, sometimes not -- but it's a natural human thing.

      Which is why they hire tech writers like me to nitpick manuals, data sheets, and other documentation. Though when we do so, people call us "anal". Oh well.

    70. Re:What's this? by HiThere · · Score: 1

      If they were consistent, then I might agree with you, but the mfgrs actually use the best sounding number that they are close to...for some definition of close.

      Do you really believe that a disk drive will have a number of bytes that is a power of 10? a power of two? (That's more likely, though only for "solid state disks".)

      The giga == 1,000,000,000 standard also gives a false answer, both before and after the drive is formatted. And most computer numbers ARE powers of 2, so I *DO* blame marketing for this. Still, it's a tradition among disk sellers, and we aren't going to change them.

      It's like amplitude on a stereo speaker system. The numbers can have some kind of justification, if you push at things hard enough, and cross your eyes, but it's not the meaning that the (naive) customer will expect it to have, but rather one much more favorable to the manufacturer.

      --

      I think we've pushed this "anyone can grow up to be president" thing too far.
    71. Re:What's this? by evilviper · · Score: 1
      The standard was updated in 1998 to avoid confusion (Standard IEC 60027-2)

      Yes, there's nothing better than an organization that arbitrarily decides that they have the authority to change the meanings of words as they see fit...

      NIST is almost as bad as the hard drive companies.
      --
      Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
    72. Re:What's this? by pyrrhonist · · Score: 2, Funny
      So you prefer ambiguity? I'm sorry, but "pyrrhonist doesn't like the sound of the word" is NO reason to continue using ambiguous language.

      Relax, it was supposed to be a jo....

      waitaminute....

      You're the guy who came up with these prefixes aren't you?

      --
      Show me on the doll where his noodly appendage touched you.
    73. Re:What's this? by Bitmanhome · · Score: 1

      'Tis all true. But high-density disks actually spun at half speed on the Amiga. The disk controller could only handle a certain bit rate, so the HD drives had to be slowed down.

      --
      Not that this wasn't entirely predictable.
    74. Re:What's this? by toddestan · · Score: 1

      Apparently, no one sent the memo to the manufacturers of RAM and flash memory, where a 512MB memory stick means 512*1024*1024 bytes. It's just the hard drive manufacturers which propagate this gigabyte=10^9 bytes marketing bullshit.

    75. Re:What's this? by toddestan · · Score: 1

      Either you have really bad luck, or I would question just how clean the power is to your house.

      I have had countless Maxtor drives, never had one fail. *goes and knocks on some wood*

    76. Re:What's this? by dickrichardv8 · · Score: 1

      SCOX lawyers must be French by that logic. A word to them has only one meaning on a particular day (they get to define it).

    77. Re:What's this? by Anonymous Coward · · Score: 0

      'cept you are the uninformed. Hard drives store data in 512 byte blocks.

      Hope your foot heals!

    78. Re:What's this? by Anonymous Coward · · Score: 0

      The bus supports 512 octet blocks for historical compatibility reasons. What's actually on the platter for that block is some larger number of magnetic domains sufficient to store both those 4 kib and however many error correcting code bits the vendor decided to use.

    79. Re:What's this? by drsmithy · · Score: 1
      You know the type, the ones who never got past "I before E except after C" and get hung up on "wierd"

      I think you need to consult a dictionary.

    80. Re:What's this? by sjames · · Score: 1

      Ever since they started using the Giga prefix. Giga is explicitly defined as 10^9 base-10, ever since 1873 when the kilo, Mega, Giga etc prefixes were standardized.

      Yes, when applied to computers, the terms were 'corrupted' to neatly match powers of 2. Yes, technically, when the HD manufacturers went from the power of 2 convention to the power of 10 convention, they were just bringing things inline with other measures.

      However, I think it's fairly obvious that their intent was to mislead the customer while staying just a fraction of a micron on the legal side of the line.

      If a gas station switched their pumps to dispense imperial gallons in the U.S. I'm pretty sure they'd be eaten alive by the bureau of standards on the grounds that while an imperial gallon IS called a gallon in some places, it violates the conventions used here and deceives the consumer.

      It's an all too common tactic with an obvious intent, just like 'bricks' of coffee are steadily shrinking from the old standard that 1 brick = 1 pound. The HD manufacturers are worse though since unlike coffee bricks, 1GB IS (or was) a defined value with a specific size when applied to memory or storage. I hereby present the rubber spine award to the standards bodies that gave in on the subject.

    81. Re:What's this? by julesh · · Score: 1

      Either you have really bad luck, or I would question just how clean the power is to your house.

      I've had no trouble with any of my seagate drives, most of which are much older than these maxtors, and some of which are used for much heavier duties.

      The server with the two disks was also plugged in through a rather large spike suppressor, which should have ironed out most irregularities that wouldn't utterly fry everything not plugged into the same socket.

      Another poster has pointed out that these disks run rather hot; this might well be the cause of my problem -- all the disks were installed in fairly small cases that might not have provided the best possible airflow to the disks. I've since moved to a much bigger case for the server, and the two replacement Maxtors in there are going strong.

    82. Re:What's this? by Alberic · · Score: 1

      Relax, you can still rename yourself KibiByte, it's okay. ;)

      --
      *squeak*
  4. ...and Statistics. by Kaenneth · · Score: 4, Funny

    So, do you think someone typed "Nuclear weapons are being developed by the government of Iraq.^H^Hn." just before the power went out?

  5. Why do we need it? by Godman · · Score: 4, Interesting

    If we are just now figuring out that fsync's don't work, then the question is, why do we care? Have we been using them, and they just haven't been working or something?

    If we've made it this far without it, why do we need it now?

    I'm just curious...

    --
    I have this really funny quote that I like to put here. Unfortunately, there's this really annoying thing called a char
    1. Re:Why do we need it? by Erik+Hensema · · Score: 5, Insightful

      We need it because of journalling filesystems. A JFS needs to be sure the journal has been flushed out to disk (and resides safely on the platters) before continuing to write the actual (meta)data. Afterwards, it needs to be sure the (meta)data is written properly to disk in order to start writing the journal again.

      When both the journal and the data are in the write cache of the drive, the data on the platters is in an undefined state. Loss of power means filesystem corruption -- just the thing a JFS is supposed to avoid.
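
      The ordering described above can be sketched roughly like this. It is a toy, not how any real journalling filesystem is implemented: the file names, record format, and function names are all invented for illustration, and os.fsync() only hands the request to the device, which is exactly the step the article says drives may fake.

```python
import os
import tempfile


def durable_append(path, payload):
    """Append payload and ask the OS to push it out to the device.

    If the drive acknowledges from its write cache, the data may still
    not be on the platters when this returns.
    """
    with open(path, "ab") as f:
        f.write(payload)
        f.flush()             # Python buffer -> OS page cache
        os.fsync(f.fileno())  # OS page cache -> device (as far as it admits)


def journaled_write(journal_path, data_path, record):
    # 1. Log the intent first and wait for it to be acknowledged.
    durable_append(journal_path, b"BEGIN " + record + b"\n")
    # 2. Only then touch the real data.
    durable_append(data_path, record + b"\n")
    # 3. Mark the transaction complete so a replay can skip it.
    durable_append(journal_path, b"COMMIT " + record + b"\n")


if __name__ == "__main__":
    d = tempfile.mkdtemp()
    journaled_write(os.path.join(d, "journal"), os.path.join(d, "data"), b"block-42")
    print(open(os.path.join(d, "journal"), "rb").read())
```

      If the drive reorders or drops step 1 behind the OS's back, a crash between steps 2 and 3 leaves data on disk with no journal entry to explain it.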

      Also, switching off the machine the regular way is a hazard. As an OS you simply don't know when you can safely signal the PSU to switch itself off.

      --

      This is your sig. There are thousands more, but this one is yours.

    2. Re:Why do we need it? by spectecjr · · Score: 2, Informative

      When both the journal and the data are in the write cache of the drive, the data on the platters is in an undefined state. Loss of power means filesystem corruption -- just the thing a JFS is supposed to avoid. ... except most drives use the angular momentum of the drive, the power left in the PSU and any spare voltage in the on-board capacitors to provide the power to finish writing and park the drive heads.

      At least, that was the state of the art in the early 90s.

      --
      Coming soon - pyrogyra
    3. Re:Why do we need it? by Vellmont · · Score: 1


      If we've made it this far without it, why do we need it now?


      Maybe you've made it this far, but I'm sure there are other people who have mysteriously lost data, or had it corrupted. They probably blamed the OS, faulty hardware, drivers, whatever.

      Data security is based on assumptions (a contract if you will). If you assume the contract hasn't been broken, you look elsewhere for blame when something goes wrong. Up until now I'm sure no one questioned whether fsync() was doing what it was supposed to (at least at the actual HD level).

      --
      AccountKiller
    4. Re:Why do we need it? by cahiha · · Score: 1

      A JFS needs to be sure the journal has been flushed out to disk (and resides safely on the platters) before continuing to write the actual (meta)data.

      That won't happen reliably even if the disk behaves properly: you still get bad sectors and other problems (like this one, for example).

      A file system must take into account that arbitrary blocks can disappear at arbitrary times. Building a JFS assuming that the disk is going to work as advertised is simply insane.

      Unfortunately, there is a lot of insanity like that in software.

    5. Re:Why do we need it? by Darren+Winsper · · Score: 1

      And that's got what to do with the price of fish? Filesystem corruption is a software problem, not a hardware problem (in principle, that is).

    6. Re:Why do we need it? by Darren+Winsper · · Score: 1

      Some assumptions have to be made of the hardware, or you really can't build much of anything. How do you work around the problem that any sector can suddenly turn bad? Write two copies of the journal? What if you happen to hit two bad sectors, one in your primary journal, one in the secondary journal? Keep a third journal? Madness, I tell you!

      Sorry, but sometimes you have to make assumptions. Hell, people make those kinds of assumptions in critical systems, where errors can cost lives!

    7. Re:Why do we need it? by tezza · · Score: 1
      V Interesting indeed.

      Journalled filesystems get mentioned as an answer. But these have been around for a while, since they were developed by the likes of SGI in their heyday. Any catastrophic failures caused by this would have been noticed under development.

      But the BIG two reasons why you are right are:

      1. Oracle: They would have placed warnings about drives failing. They have zero interest in having their stability undermined by other layers.
      2. Slashdot: Someone would have worked with an Oracle uber-admin and posted a story about how a crucial site was brought down.

      So yeah, seems like it isn't that crucial beyond the theoretical and the super-critical.

      --
      [% slash_sig_val.text %]
    8. Re:Why do we need it? by Anonymous Coward · · Score: 0

      Sounds nice, but if it worked, this tool wouldn't find errors either.

    9. Re:Why do we need it? by bgog · · Score: 4, Insightful

      The author is specifically talking about the fsync function, not the ATA sync command. fsync is an OS call notifying the system to flush its write caches to the physical device. This writes to the disk's write cache but I don't believe it actually issues the sync command to the drive.

      In the case of a journaling file system they issue the sync command to the drive to flush the data out.

      I work on a block-level transactional system that requires blocks to be synced to the platters. There were two options: modify the kernel to issue syncs to the ATA drives on all writes (to the disk in question), or just disable the physical write cache on the drive. Turned out to be a touch faster to just disable the cache, but the two are effectively equal.

      However drives operate fine under normal conditions, applications write to file systems which take care of forcing the disks to sync. fsync (which the author is talking about) is an OS command and not directly related to the disk sync command.

    10. Re:Why do we need it? by prefect42 · · Score: 1

      Are you being intentionally facetious? How can software work reliably when hardware doesn't do what it says it does?

      Please explain how you're planning on coding around a hard disk that every full moon replaces all of your data with an ELO song on a loop?

      --

      jh

    11. Re:Why do we need it? by Anonymous Coward · · Score: 0

      I love it when technically ignorant people come on Slashdot and talk as if they know what the fuck they're babbling on about.

    12. Re:Why do we need it? by Darren+Winsper · · Score: 1

      Read what spectecjr was saying. He was saying "hard disks use the residual power from an abrupt power-off to park their heads." Well, that's true and a useful feature, but it helps diddly squat when you've got unflushed caches.

    13. Re:Why do we need it? by pe1chl · · Score: 2, Interesting

      But since then, the angular momentum of drives has decreased, and cache size has increased.
      Of course write speed has increased as well, but typical cache size of 8MB and write speed of 50MB/s would mean 160ms of continuous writing when the head already is positioned correctly.
      Assuming the cache can contain blocks scattered over the entire disk, it does not seem realistic to write everything back on power failure.
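
      The estimate above, as a back-of-envelope calculation. The 8MB and 50MB/s figures are from the comment; the per-seek cost is an assumed parameter with no measured value behind it.

```python
def flush_time_ms(cache_mb, write_mb_per_s, seeks=0, seek_ms=0.0):
    """Time to drain the write cache: sequential transfer plus optional seeks.

    seek_ms is an assumed average cost per scattered block, not a measurement.
    """
    return cache_mb / write_mb_per_s * 1000 + seeks * seek_ms


if __name__ == "__main__":
    # Best case: head already positioned, one contiguous write.
    print(flush_time_ms(8, 50))
    # Scattered case: 16 separate regions at an assumed 10 ms per seek.
    print(flush_time_ms(8, 50, seeks=16, seek_ms=10.0))
```

      Under those assumptions the scattered case takes about twice as long as the contiguous one, and real seek patterns could be much worse.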

    14. Re:Why do we need it? by prefect42 · · Score: 1

      Mea culpa ;)

      --

      jh

    15. Re:Why do we need it? by Anonymous Coward · · Score: 0
      How can software work reliably when hardware doesn't do what it says it does?


      This is the explanation you need.


      Hardware failures are a fact of life.


      Please explain how you're planning on coding around a hard disk that every full moon replaces all of your data with an ELO song on a loop?


      This is not very realistic, but is very similar to a hard disk that would totally fail every full moon. RAID systems work with those assumptions, so I don't see a problem here.


      In the book I pointed you to, Gray implements a disk subsystem based on functions that read and write sectors. Those functions are such that, on a random basis, they perform a different operation (like not writing the data, writing it in a different place, or writing different data), but pretend everything went well. On top of that he implements a 100% reliable transactional system.
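
      To give a feel for the idea, here is a small sketch. It is not the design from the book; every class and function name is invented, and the fault model (drop, misplace, garble) is just the three failure modes listed above. Each record carries a checksum over its sector number and payload, writes are read back and retried, and reads fall through replicas until one verifies.

```python
import hashlib
import random


class FlakyDisk:
    """Hypothetical disk whose write() sometimes silently misbehaves."""

    def __init__(self, fault_rate, rng):
        self.sectors = {}
        self.fault_rate = fault_rate
        self.rng = rng

    def write(self, n, data):
        if self.rng.random() < self.fault_rate:
            fault = self.rng.choice(["drop", "misplace", "garble"])
            if fault == "misplace":
                self.sectors[n + 1] = data        # lands in the wrong sector
            elif fault == "garble":
                self.sectors[n] = b"\x00" * len(data)
            return                                # reports success regardless
        self.sectors[n] = data

    def read(self, n):
        return self.sectors.get(n, b"")


def seal(n, payload):
    # The checksum covers the sector number, so misplaced writes are detectable.
    return hashlib.sha256(str(n).encode() + b":" + payload).digest() + payload


def unseal(n, raw):
    digest, payload = raw[:32], raw[32:]
    expected = hashlib.sha256(str(n).encode() + b":" + payload).digest()
    return payload if digest == expected else None


def reliable_write(disks, n, payload, retries=20):
    for disk in disks:
        for _ in range(retries):
            disk.write(n, seal(n, payload))
            if unseal(n, disk.read(n)) == payload:  # read-after-write check
                break


def reliable_read(disks, n):
    for disk in disks:
        payload = unseal(n, disk.read(n))
        if payload is not None:
            return payload
    return None


if __name__ == "__main__":
    disks = [FlakyDisk(0.3, random.Random(seed)) for seed in range(3)]
    reliable_write(disks, 0, b"hello")
    print(reliable_read(disks, 0))
```

      One limitation of the sketch: a misplaced write can clobber a neighbouring sector. The checksum would catch that on a later read, but nothing here repairs it.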

    16. Re:Why do we need it? by prefect42 · · Score: 1

      But then aren't you relying on your RAID controller being trustworthy? And your system bus, and your processor and...?

      --

      jh

    17. Re:Why do we need it? by Peaker · · Score: 1

      Actually a smart Journaling filesystem can rely only on the order of writes to the disk remaining faithful to the order of write commands, in order to avoid corruption.

      Allowing sync commands helps but is not required.

    18. Re:Why do we need it? by Anonymous Coward · · Score: 0

      Well, as I told you, he designs the system on top of a read/write routine that sometimes silently doesn't do what is expected. At this point you don't care if the problem comes from the disk hardware, the disk firmware, the disk bus, the RAID controller, etc.

      The key is the probability of failure. Having something 100% reliable is of course impossible (I bet you didn't plan for that comet hitting the earth :-) ).

      But the probability of failure of a subsystem can be reduced by software as much as you want, using redundancy and integrity control. The amount of redundancy is directly linked to the probability of failure of sub-components. Hard disk drives are much less reliable than you think, but the CRC implemented in firmware hides this from the host.
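
      A rough illustration of that link between redundancy and component failure probability. It assumes copies fail independently, which real disks in the same chassis often do not, and the function names are made up.

```python
def all_copies_fail(p, copies):
    """Probability that every copy fails, assuming independent failures."""
    return p ** copies


def copies_needed(p, target):
    """Smallest number of independent copies with loss probability <= target."""
    if not 0 < p < 1 or target <= 0:
        raise ValueError("need 0 < p < 1 and target > 0")
    n = 1
    while all_copies_fail(p, n) > target:
        n += 1
    return n


if __name__ == "__main__":
    # Example figures only: 5% per-copy failure, one-in-a-million target.
    print(copies_needed(0.05, 1e-6))
```

      The less reliable each component is, the more copies you need for the same target, but the target itself can be pushed as low as you are willing to pay for.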

      Now, how do you plan against a failure of your processor? Depends on what kind of service you need (and how many resources you are ready to give, in terms of money and/or performance):

      * Get a hotswappable multiprocessor, with the associated service so the processor never fails (aka get a mainframe). Costly, best performance.
      * Build a software cluster. If one machine fails, the second restarts the transaction without any visible glitch from the outside. Less costly, some measurable downtime.
      * If a processor fails, restart your database with a different machine. Less costly, some minutes of downtime (no lost transactions, if the system is transactional).
      * If a processor fails, change it, and restart your database. Even less costly, potential days of downtime (still no lost transaction)

      I read somewhere, that, in MVS (mainframe OS), 70% of the code is used for percolation. Here is the definition of percolation:

      "percolation: In error recovery, the passing along a preestablished path of control from a recovery routine to a higher-level recovery routine."

      You can build arbitrarily reliable systems with unreliable pieces.

    19. Re:Why do we need it? by prefect42 · · Score: 1

      Okay, in which case I agree with what you were saying. I was just making sure you weren't saying you could make a 100% reliable system using unreliable components.

      --

      jh

    20. Re:Why do we need it? by Darren+Winsper · · Score: 1

      I'm curious, are you talking about me in that statement? It's just that if you are, I can back up my statements relatively easily.

    21. Re:Why do we need it? by Erik+Hensema · · Score: 1

      However, drives are free to re-order writes as they see fit, AFAIK. So you can't rely on that either.

      --

      This is your sig. There are thousands more, but this one is yours.

    22. Re:Why do we need it? by swmccracken · · Score: 5, Insightful

      This writes to the disks write cache but I don't believe it actually issues the sync command to the drive.

      Yeah - that's the point of this thing - what's supposed to happen with fsync? From memory, sometimes it will guarantee it's all the way to the platters, sometimes it will not, depending on what storage system you're using, and how easy such a guarantee is to make.

      Linus in 2001 discussing this issue - it's not new. That whole thread was about comparing SCSI against IDE drives, and it seemed that the IDE drives were either breaking the laws of physics, or lying, but the SCSI drives were being honest.

      From hazy memory, one problem is that without tagged-command-queuing or native-command-queuing, one process issuing a sync will cause the hard drive and related software to wait until it has fully synched for all i/o "in flight", holding up any other i/o tasks for other processes!

      That's why fsync often lies: it's not practical for people that fsync all the time to flush buffers to screw around with the whole i/o subsystem, and apparently some programs were overzealous with calling fsync when they shouldn't.

      However, with TCQ, commands that are synched overlap with other commands, so it's not that big a deal (other i/o tasks are not impacted any more than they would by other, unsynchronised, i/o). (Thus, with TCQ, fsync might go all the way to the platters, but without it it might just go to the IDE bus.) SCSI has had TCQ from day one, which is why a SCSI system is more likely to sync all the way than IDE.

      If I'm wrong, somebody correct me please.

      Brad's program certainly points out an issue - it should be possible for a database engine to write to disk and guarantee that it gets written; perhaps fsync() isn't good enough - be this a fault in the drives, the IDE spec, IDE drivers or the OS.
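
      One crude way to spot the "breaking the laws of physics" case is to time repeated fsync() calls on the same block and compare against the rotation rate. This is not Brad's utility, just a sketch; the one-commit-per-rotation ceiling is a simplifying assumption, and SSDs, battery-backed controllers and RAM-backed filesystems will exceed it perfectly honestly.

```python
import os
import tempfile
import time


def max_honest_syncs_per_sec(rpm):
    """Rough ceiling: one platter-committed rewrite per rotation (assumed)."""
    return rpm / 60.0


def measure_fsyncs_per_sec(path, count=200):
    """Rewrite the same 8 bytes `count` times, fsync each time; return the rate."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        start = time.perf_counter()
        for i in range(count):
            os.lseek(fd, 0, os.SEEK_SET)
            os.write(fd, i.to_bytes(8, "little"))
            os.fsync(fd)
        elapsed = max(time.perf_counter() - start, 1e-9)
    finally:
        os.close(fd)
    return count / elapsed


if __name__ == "__main__":
    probe = os.path.join(tempfile.mkdtemp(), "probe")
    rate = measure_fsyncs_per_sec(probe)
    ceiling = max_honest_syncs_per_sec(7200)  # assumed 7,200 RPM drive
    print(f"{rate:.0f} fsyncs/s measured, about {ceiling:.0f}/s plausible for spinning platters")
```

      A rate far above the ceiling on a plain spinning disk suggests that something between the call and the platter is acknowledging early.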

    23. Re:Why do we need it? by bill_mcgonigle · · Score: 2, Informative

      I work on a block-level transactional system that requires blocks to be synced to the platters. There were two options: modify the kernel to issue syncs to the ATA drives on all writes (to the disk in question), or just disable the physical write cache on the drive. Turned out to be a touch faster to just disable the cache, but the two are effectively equal.

      Just to clarify - use hdparm -W to fiddle with the write cache on the drive. I've built linux-based network appliances that go out in the field, sometimes overseas, and can't be touched by a competent tech and sometimes lose power. You have to use a journaled filesystem and turn off the write cache. The write speed starts to suck, but I was network-bound anyway.

      --
      My God, it's Full of Source!
      OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
    24. Re:Why do we need it? by Anonymous Coward · · Score: 0

      That might help in the cases where the drive doesn't just pretend to turn the write cache off. :-}

      Behavior like this is why storage array vendors qualify drives down to the firmware level.

    25. Re:Why do we need it? by DrSkwid · · Score: 1

      to finish writing and park the drive heads.

      I don't know if it's true but it's what he said

      --
      There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
    26. Re:Why do we need it? by Darren+Winsper · · Score: 1

      You're right, I made a mistake. So, let's think about this logically:
      Some drives have 8MB of cache.
      Most of those hard drives can sustain around 40-50MB/sec.
      Writing 8MB would take around 200-250ms, minus seek time.
      The cache may have lots of small files, which would kill your throughput. 16 500K files would put your write time at getting on for 500ms or more.
      Can a hard drive continue spinning under its own residual power at 7,200 RPM or more for half a second or longer? I'd wager no, considering I hear my hard drives start spinning down straight away when the power goes.

      So, parking heads, fine. Writing out the cache? I'd wager there are cases where there simply isn't the time.

    27. Re:Why do we need it? by Anonymous Coward · · Score: 0

      Or you can blame it on the application. Properly coded applications, for better or for worse, can't assume that fsync() goes all the way to the disk. Anyone setting up, say, an Oracle installation knows that they need to jump through hoops to ensure that their storage is set up correctly. In fact, Oracle won't even let you store a database on a regular file system; it wants raw access to its own device.

    28. Re:Why do we need it? by Anonymous Coward · · Score: 0

      1 sec = 1000 ms

      So if the drive had enough stored power to continue operating for 1 second, that should be plenty of time to flush the cache and park the head.

      But I doubt that when a hard drive is cut off from power it starts to flush the cache. I bet it continues doing whatever it is currently working on. So in the end it doesn't make any difference and the parent talking about angular momentum and such needs to RTFA.

    29. Re:Why do we need it? by HornWumpus · · Score: 1
      Journaling (JFS) is about leaving an audit trail of changes to the hard drive.

      RAID is what you are thinking of.

      You would run your JFS on a RAID to get rock solid storage consuming performance.

      Of course all of this will still get screwed up if the disk refuses to flush its buffers.

      --
      John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
    30. Re:Why do we need it? by spectecjr · · Score: 1

      So, parking heads, fine. Writing out the cache? I'd wager there are cases where there simply isn't the time.

      The drive could simply write all 8Mb to a known block near to the parking area, and then on next power up, write the data in the correct place.

      --
      Coming soon - pyrogyra
    31. Re:Why do we need it? by bill_mcgonigle · · Score: 1

      That might help in the cases where the drive doesn't just pretend to turn the write cache off. :-}

      Behavior like this is why storage array vendors qualify drives down to the firmware level.


      Good point - I'm assuming that if the write speed sucks after turning off the cache then the write cache is really disabled. They could put in extra wait states just to fool us into thinking the cache was off, but I think they're too lazy to do that.

      --
      My God, it's Full of Source!
      OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
    32. Re:Why do we need it? by Anonymous Coward · · Score: 0

      That's one simple example. The interesting ones are where behavior isn't so consistent (and easily tested) but dependent upon timing or load.

  6. Of course it does! by grahamsz · · Score: 4, Interesting

    Having written some diagnostic tools for a smaller hard disk maker (who i'll refrain from naming) it's amazing to me that disks work at all.

    Most systems can identify and patch out bad sectors so that they aren't used. What surprised me is that the manufacturers have their own bad sector table, so when you get the disk it's fairly likely that there are already bad areas which have been mapped out.

    Secondly, the raw error rate was astoundingly high. It's been quite a few years but it was somewhere between one error in every 10E5 to 10E6 bits. So it's not unusual to find a mistake in every megabyte read. Of course CRC picks up this error and hides that from you too.
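
    To put that rate in per-megabyte terms, a small calculation. It reads the figures above as one error per 10^5 to 10^6 bits, which is an interpretation of the notation, and the function name is made up.

```python
def expected_raw_errors(num_bytes, bits_per_error):
    """Expected raw bit errors when reading num_bytes at one error per bits_per_error bits."""
    return num_bytes * 8 / bits_per_error


if __name__ == "__main__":
    one_mib = 2**20
    for bits_per_error in (10**5, 10**6):
        errors = expected_raw_errors(one_mib, bits_per_error)
        print(f"1 error per {bits_per_error} bits: about {errors:.1f} raw errors per MiB")
```

    Even at the better end of that range, every megabyte read would contain several raw errors for the drive's error correction to hide.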

    Granted this was a few years ago, but i wouldn't be surprised if it's as bad (or even worse) now.

    1. Re:Of course it does! by frickenhell · · Score: 0

      When hard disks first came out they would sell/label them as 5 megabytes but contain a little more than that with the assumption that a percentage of the disk would fail.

    2. Re:Of course it does! by Nutria · · Score: 1

      It's been quite a few years but it was somewhere between one error in every 10E5 to 10E6 bits. So it's not unusual to find a mistake in every megabyte read.

      I'm surprised, but not that surprised.

      Areal densities are so high these days, the r/w heads are so small, and prices are so low, that I also am truly amazed that modern HDDs are made to work.

      But then, I remember 13" removable 5MB platters, and 8" floppy drives.

      --
      "I don't know, therefore Aliens" Wafflebox1
    3. Re:Of course it does! by ArbitraryConstant · · Score: 2, Informative

      "What surprised me is that the manufacturers have their own bad sector table, so when you get the disk it's fairly likely that there are already bad areas which have been mapped out."

      Can't you get the count with SMART?

      --
      I rarely criticize things I don't care about.
    4. Re:Of course it does! by cahiha · · Score: 1

      Most systems can identify and patch out bad sectors so that they aren't used. What surprised me is that the manufacturers have their own bad sector table, so when you get the disk it's fairly likely that there are already bad areas which have been mapped out.

      Well, the thing that's disturbing about that is that it potentially messes up scheduling algorithms in the OS, but apparently not enough to matter.

      It's been quite a few years but it was somewhere between one error in every 10E5 to 10E6 bits. So it's not unusual to find a mistake in every megabyte read. Of course CRC picks up this error and hides that from you too.

      The people who design these things calculate those error rates and make them part of the design.

    5. Re:Of course it does! by cowbutt · · Score: 4, Informative
      Sort of, yes:
      # smartctl -a /dev/hde | grep 'Reallocated_Sector_Ct'
      5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
      This indicates that /dev/hde is far from exhausting its supply of reserved blocks (the first 100) and never has been (the second 100, which is 'worst'). When it crosses the threshold (36) (or the threshold of any of the other 'Pre-fail' attributes for that matter), failure is imminent.
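      The reading above can be turned into a small parser, sketched here. It assumes the ten-column attribute layout shown in the smartctl output above (ID, name, flag, value, worst, threshold, type, updated, when-failed, raw); the class and function names are invented for the example, and real smartctl output may differ between versions.

```python
from typing import NamedTuple


class SmartAttribute(NamedTuple):
    attr_id: int
    name: str
    value: int      # normalized current value
    worst: int      # lowest normalized value recorded so far
    threshold: int  # failure threshold
    kind: str       # e.g. "Pre-fail" or "Old_age"
    raw: str        # vendor-specific raw value


def parse_attribute(line: str) -> SmartAttribute:
    """Parse one attribute row from `smartctl -a` output."""
    fields = line.split()
    # Columns: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW
    return SmartAttribute(
        attr_id=int(fields[0]),
        name=fields[1],
        value=int(fields[3]),
        worst=int(fields[4]),
        threshold=int(fields[5]),
        kind=fields[6],
        raw=" ".join(fields[9:]),
    )


def is_failing(attr: SmartAttribute) -> bool:
    """A Pre-fail attribute has failed once its value is at or below the threshold."""
    return attr.kind == "Pre-fail" and attr.value <= attr.threshold


row = "5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0"
attr = parse_attribute(row)
print(attr.name, attr.value, attr.threshold, is_failing(attr))
```

      The last column is the raw value. For this attribute many drives report the reallocated sector count there, but raw values are vendor-specific, so treat that as a hint and not a guarantee.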
    6. Re:Of course it does! by pyropunk51 · · Score: 3, Interesting

      As anybody who's ever used (or had to use :-( ) SpinRite will tell you, your HDD not only lies to you, it cheats and steals as well. To wit: It makes it seem there are no bad sectors, when in fact the surface is riddled with them, only the manufacturer hides this fact from you by having a bad sector table. Also errors are corrected on the fly by some CRC checking. You can ask the SMART for the stats, but you can do very little about the results it gives you, other than maybe buying a new disk (which most likely has a different set of problems - you just don't know what they are). And where have you ever seen a 40GB drive that is exactly 40 billion bytes big? The bottom line is: Reliability is NOT profitable. Where would hardware manufacturers be if we didn't have to buy a new disk every 2 years!

      --
      double penetration; //ouch
    7. Re:Of course it does! by spitefulcrow · · Score: 1

      Too bad smartctl doesn't work with SATA!

      --
      Sorry, my karma just ran over your dogma.
    8. Re:Of course it does! by enosys · · Score: 1

      I couldn't get smartctl to work in XP because the VIA VT8237 drivers pretend the drive is SCSI, but Linux doesn't do that and smartctl works there.

    9. Re:Of course it does! by enosys · · Score: 2, Insightful
      IMHO having the drive hide bad sectors is a good idea. That way you don't have to enter any bad sector lists, you don't have to scan for them when formatting, and the OS doesn't have to worry about them.

      What would you do if you had full control over bad sectors? You're still able to keep trying to read a new bad sector that contains data. The drive will try to repair it when you write to it and if it can't then it will remap it. It seems to me the only thing you can't do is force the drive to try to repair bad sectors that it gave up on earlier.

      Also consider how hard it would be to make a perfect hard drive. Would you be willing to pay for that? Bad sectors that were there all along don't even hurt reliability. It's only a problem when new ones go bad.

    10. Re:Of course it does! by walt-sjc · · Score: 1

      Real men use paper tape or punch cards. We want to be able to SEE our bits. :-)

    11. Re:Of course it does! by Anonymous Coward · · Score: 0

      Damn that's intuitive!

    12. Re:Of course it does! by ArbitraryConstant · · Score: 1

      It works on SATA on my OpenBSD system...

      --
      I rarely criticize things I don't care about.
    13. Re:Of course it does! by delus10n0 · · Score: 1

      Might want to read this and this about SpinRite.

      Steve Gibson is a wacko, man.

      --
      Not All Who Wander Are Lost
    14. Re:Of course it does! by ArbitraryConstant · · Score: 1

      Okay.

      I have two disks purchased a year apart and they both reported 100, so I figured it meant 0 reallocated sectors... I guess 100 doesn't mean 0, just "not many".

      --
      I rarely criticize things I don't care about.
    15. Re:Of course it does! by cowbutt · · Score: 1
      I guess 100 doesn't mean 0, just "not many".

      That's my interpretation, too, FWIW.

    16. Re:Of course it does! by poot_rootbeer · · Score: 1

      Funny you should mention SpinRite, as I was thinking while reading the story submission that Bradfitz is acting a lot like Steve Gibson here -- raising a big stink about something which is behaving within the bounds of the technical specifications, even if maybe it is a bad idea.

      Given the heavily collaborative nature of computer engineering, I tend to take any alarms raised by one person while no one else in the community seems concerned with a few grains of rock salt.

    17. Re:Of course it does! by sconeu · · Score: 1

      Just last week, I tried to explain to my 15 year old daughter what a punch card was....

      --
      General Relativity: Space-time tells matter where to go; Matter tells space-time what shape to be.
    18. Re:Of course it does! by Nutria · · Score: 1

      Just last week, I tried to explain to my 15 year old daughter what a punch card was....

      I'm sure there's some on Ebay you could buy. And there's pictures galore on the internet.

      Besides, if she knows what a byte is, how hard is it to understand: a punch card stores 80 bytes of data on it by punching holes in various specific places on the card. The holes are punched using a big, scary-looking typewriter, and the cards are read by a big, noisy mechanical monster that plugs into a computer the size of our dinner table with a cable the size of your wrist. (Since girls' wrists are, on average, of course, smaller than teenage boys' wrists.)

      --
      "I don't know, therefore Aliens" Wafflebox1
    19. Re:Of course it does! by stonecypher · · Score: 1

      Uh. This has always been the case. In fact, that was one of the big deals about IDE drives, was that new bad sectors forming had become rare enough that the manufacturer's original bad sector table was deemed good enough forever.

      If you were older, you'd remember the phrase "low level format," and how awful it was the first time you found out you couldn't do that to an IDE drive.

      --
      StoneCypher is Full of BS
    20. Re:Of course it does! by spitefulcrow · · Score: 1

      Really? Are you using libata or the IDE branch SATA drivers? The libata drivers currently do not pass SMART commands through the SCSI layer.

      --
      Sorry, my karma just ran over your dogma.
    21. Re:Of course it does! by enosys · · Score: 1

      I was using the IDE drivers, not libata.

    22. Re:Of course it does! by noidentity · · Score: 1

      Fortunately you aren't storing your data on the physical device, you're storing it on the virtual device that the hard drive presents via the interface to the computer. If the virtual device has a statistically insignificant error rate, who cares if the error rate of the physical device is much higher?

      Next thing you'll be complaining that even though your DVD plays without any visual glitches, there are hundreds of errors in the raw data being read off the disc before error correction.

      The bottom line is: absolute reliability at the hardware level is NOT profitable, but reliability at the software level IS.

    23. Re:Of course it does! by Frank+T.+Lofaro+Jr. · · Score: 1

      Remapped sectors can make what used to be an efficient access pattern less so.

      Having the OS know about the bad sectors and do its own remapping will allow head movement optimization routines in the OS to know the real situation - and thus bad sectors won't slow down access as much, since the OS knows where the drive will be seeking to and can thus minimize the amount of head travel needed.

      --
      Just because it CAN be done, doesn't mean it should!
    24. Re:Of course it does! by Anonymous Coward · · Score: 0

      The author of grcsucks.com has far too much time on his hands!

    25. Re:Of course it does! by Anonymous Coward · · Score: 0

      If the OS knew about marginally bad sectors it could use them for Squid cache files, redundant database indexes, and other data that can be recreated at need. It could also delegate repairs to a tool that understands various file formats (filesystem directory entries, archives, mbox, XML...) and knows which possible repairs would actually be valid for a particular file.

    26. Re:Of course it does! by tomoe27 · · Score: 1

      While I am not exactly happy with some of the practices of the HD industry, I would like to mention that many hard drives since the ST-506 MFM/RLL days have had bad sectors "out of the box". Once upon a time you had to low-level format the drive and enter the map of bad sectors manually (usually found on a sticker on the drive). Now they just auto-sense it and have automated the mapping process. However, those drives didn't have the high error rate you see today; it was usually only a few KB out of a 20-60 MB drive.

  7. Yes, it lies right now, but by Anonymous Coward · · Score: 0

    Hitachi's working on something for that right now.

  8. Corporate Integrity by dj245 · · Score: 1
    "Manufacturers are blatantly sacrificing integrity in favor of scoring higher on 'pure speed' performance benchmarking."

    Corporate Integrity, not data integrity. I've read through the article and don't see how you can lose data integrity unless you disable all caching, from the OS to the disk itself. In this day and age, nobody does that. Sure, something's broken. But I fail to see how it's very useful these days anyway. Maybe someone with a better grasp of why you would need fsync() could help out?

    --
    Even those who arrange and design shrubberies are under considerable economic stress at this period in history.
    1. Re:Corporate Integrity by Dorsai65 · · Score: 4, Informative

      What the article is saying is that the drive (or sometimes the RAID card and/or OS) is lying (with fsync) when it answers that it wrote the data: it didn't; so when you lose power, the data that was in cache (and should have been written) gets lost. It isn't a question of whether caching is turned on or not, but the drive truthfully saying whether or not the data was actually written.
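
      One way to get a hint of this without pulling the plug, sketched below. This is a common rule of thumb and not the method of the utility in the article: on a rotating disk, a rewrite of the same block that is really committed has to wait for the platter to come around again, so the rate of honest fsync() calls should not exceed the rotation rate. The function names and the 7200 rpm figure are assumptions for the example, and the result means nothing on SSDs, RAM-backed filesystems, or controllers with battery-backed cache.

```python
import os
import tempfile
import time


def max_plausible_syncs_per_sec(rpm: int) -> float:
    """Upper bound on committed rewrites of one block per second on a
    rotating disk: at most one per platter revolution."""
    return rpm / 60.0


def measure_fsync_rate(path: str, count: int = 200) -> float:
    """Rewrite the same 8-byte block `count` times, calling fsync() after
    each write, and return the observed fsyncs per second."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        start = time.perf_counter()
        for i in range(count):
            os.lseek(fd, 0, os.SEEK_SET)
            os.write(fd, i.to_bytes(8, "little"))
            os.fsync(fd)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return count / elapsed if elapsed > 0 else float("inf")


def looks_cached(rate: float, rpm: int) -> bool:
    """True if the measured rate is faster than the platter could allow."""
    return rate > max_plausible_syncs_per_sec(rpm)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        rate = measure_fsync_rate(os.path.join(d, "probe.bin"))
        print(f"{rate:.0f} fsyncs/s, suspicious for 7200 rpm: {looks_cached(rate, 7200)}")
```

      The probe file has to live on the disk being tested for the number to say anything; the temporary directory used here is only a placeholder.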

      --
      --- Asking inconvenient questions for over 30 years...
    2. Re:Corporate Integrity by Anonymous Coward · · Score: 0

      If caching is disabled then fsync does nothing anyway, doesn't it?

  9. In other news.... by ToraUma · · Score: 5, Funny

    96% of Livejournal users replied, "What's a hard drive? Is that like a modem?"

    1. Re:In other news.... by AlysseumWarrior · · Score: 0

      the other 4% thought it was when you write emo poetry for 6 hours in a row.

    2. Re:In other news.... by ameoba · · Score: 3, Funny

      No. It's memory. I just can't figure out why all these games that say 512MB is optimal are running so slow when I have 120GB.

      --
      my sig's at the bottom of the page.
    3. Re:In other news.... by paulymer5 · · Score: 1

      Likewise, 96% of Slashdot users thought, "Livejournal? WTF?"

    4. Re:In other news.... by Anonymous Coward · · Score: 0

      Is that like a modem?

      Sure.

    5. Re:In other news.... by HRH+King+Lerxst · · Score: 1

      No, your modem is in the hard drive (see the phone jacks?...no the smaller ones).

      You might also know it as your cpu.

      --
      No one got beat up more often than the mimes of the old west!
    6. Re:In other news.... by Anonymous Coward · · Score: 0

      Current poll playing in the LiveJournal-sphere:

      "What Brand of Hard Drive are You?"

  10. Seems fair enough: We lie to our hard drives too by MonsieurCoward · · Score: 2, Funny

    ... "Swear to you there's no pr0n there !!"

    --
    Mcow.
  11. Which ones ? by nbharatvarma · · Score: 1
    "Most RAID cards lie (especially LSI ones), some OSes lie (rare) , and most disks lie (doesn't matter how expensive or cheap they are)."

    Can someone explain how OSes could lie? I mean, is it because of a buggy implementation or is it intentional? "Lie" would imply intent, and I personally don't see any reason why an OS should lie...

    --
    ... and I shall strike upon thee with great vegeance, furious anger and a slightly positive karma.
    1. Re:Which ones ? by ewhac · · Score: 4, Interesting
      Can someone explain how OSes could lie?

      Easy. The driver gets a 'sync' command from the OS. However, the driver writer believes that most other programmers call fsync() when they don't really need to, and decides to "optimize" this case. So he passes the command on to the drive, but returns immediately (allowing the drive command to complete asynchronously). This makes his driver appear faster.

      Fortunately, most driver writers have their priorities straight about data integrity, so this kind of thinking isn't very common.
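
      A toy model of that failure mode, as a sketch. Everything here is hypothetical (the class, its methods, the block contents); it only illustrates why it matters whether "done" is reported before or after the data is committed, whichever layer (driver, RAID card or drive) does the reporting.

```python
class SimulatedDevice:
    """Toy storage layer with a volatile write cache."""

    def __init__(self, honest: bool):
        self.honest = honest
        self.cache = {}    # volatile: contents vanish on power loss
        self.platter = {}  # persistent

    def write(self, block: int, data: bytes) -> None:
        self.cache[block] = data

    def flush(self) -> str:
        """Handle a sync command and report completion."""
        if self.honest:
            # Completion is reported only after the data is committed.
            self.platter.update(self.cache)
            self.cache.clear()
        # A lying layer reports completion while data is still in cache.
        return "done"

    def power_loss(self) -> None:
        self.cache.clear()


def survives_power_loss(honest: bool) -> bool:
    """Write a record, sync, cut power, and check whether the record persisted."""
    device = SimulatedDevice(honest)
    device.write(0, b"journal commit record")
    assert device.flush() == "done"  # the caller is told the sync finished
    device.power_loss()
    return device.platter.get(0) == b"journal commit record"


print("honest:", survives_power_loss(True))
print("lying: ", survives_power_loss(False))
```

      In both cases the caller sees the same "done"; only the honest variant still has the record after the simulated power cut, which is exactly what a journaling filesystem or database relies on.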

      Schwab

    2. Re:Which ones ? by gtkuhn · · Score: 1

      Apparently, that kind of thinking is pretty common among HD manufacturers.

    3. Re:Which ones ? by Anonymous Coward · · Score: 0

      One more reason to stay away from binary drivers.

    4. Re:Which ones ? by Anonymous Coward · · Score: 0

      On Linux with Ext2: Because the benchmarks became much worse if it was done right.

      Search the old Usenet archives for Linus' comments on this - I've had my fights with him over the theme. I saw it as unfair to the users and an unfair way of competing - he saw it as OK "as he didn't get many complaints" (that's a more or less word-by-word quote).

      For complete disclosure: I'm on the BSD side of the fence.

  12. Re:Err... "lying" is the default setting. RTFM. by ewhac · · Score: 5, Informative
    Yes, except there is a 'sync' command packet that is supposed to make the drive commit outstanding buffers to the platters, and not signal completion until those writes are done. It would appear, at first blush, that the drives are mis-handling this command when write-caching is enabled.

    There is historical precedent for this. There were recorded incidents of drives corrupting themselves when the OS, during shutdown, tried to flush buffers to the disk just before killing power. The drive said, "I'm done," when it really wasn't, and the OS said Okay, and killed power. This was relatively common on systems with older, slower disks that had been retrofitted with faster CPUs.

    However, once these incidents started occurring, the issue was supposed to have been fixed. Clearly, closer study is needed here to discover what's really going on.

    Schwab

  13. This means... by Anonymous Coward · · Score: 0

    I can write p0rn faster to my disk, but I can't save my p0rn. Ack!

    $ sync
    $ sync
    $ sync

  14. An acceptable alternative. by rice_burners_suck · · Score: 3, Insightful
    Why am I not surprised at this? First, they decide that a kilobyte = 1000 bytes, rather than the correct value of 1024. This leads the megabyte to be 1000 kilobytes, again, rather than 1024. The gig is likewise 1000 megabytes. You might think, ok, big deal, right?

    Yeah. In the days when the biggest hard drive you could get was 2 gigs, you would get 147,483,648 bytes less storage than advertised, unless you read the fine print located somewhere. This is only about 140 megs less than advertised. Today, when you can get 200 gig hard drives, the difference is much larger: 14,748,364,800 bytes less storage than advertised. This means that now, you get almost FOURTEEN GIGABYTES less storage than advertised. That's bigger than any hard drive that existed in 1995. That is a big deal.
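
    The arithmetic in that paragraph can be checked with a short sketch. It assumes the advertised "gig" is 1000^3 bytes and the expected one is 1024^3 bytes; the function name is invented for the example.

```python
def binary_shortfall(advertised_units: int, power: int) -> int:
    """Bytes 'missing' when a size advertised with decimal prefixes
    (1000**power) is read as if it used binary prefixes (1024**power)."""
    return advertised_units * (1024**power - 1000**power)


MIB = 1024**2
GIB = 1024**3

small = binary_shortfall(2, 3)    # the 2 "gig" drive
large = binary_shortfall(200, 3)  # the 200 "gig" drive

print(f"{small:,} bytes, about {small / MIB:.1f} binary megabytes")
print(f"{large:,} bytes, about {large / GIB:.2f} binary gigabytes")
```

    Both byte counts in the post check out; "almost fourteen gigabytes" works out to roughly 13.7 when counted in units of 1024^3 bytes.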

    I'm bringing up the size issue in a thread on fsync() because it is only one more area where hard drive manufacturers are cheating to get "better" performance numbers, instead of being honest and producing a good product. As a result, journaling filesystems and the like cannot be guaranteed to work properly.

    If the hard drive mfgs really want good performance numbers, this is what they should do: Hard drives already have a small amount of memory (cache) in the drive electronics. Unfortunately, when the power goes away, the data therein becomes incoherent within nanoseconds. So, embed a flash chip on the hard drive electronics, along with a small rechargeable battery. If the battery is dead or the flash is fscked up, both of which can easily be tested today, the hard drive obeys every fsync() more religiously than the pope and works slightly more slowly. If the battery is alive and the flash works, then in the event of power-off with data remaining in the cache (now backed by battery), the hard drive will write that data to the flash chip. Upon the next powerup, the hard drive will initialize as normal, but before it accepts any incoming read or write commands, it will first copy the information from flash to the platter. This is a good enough guarantee that data will not be lost, as the reliability of flash memory exceeds that of the magnetic platter, provided the flash is not written too many times, which it won't be under this kind of design; and as I said, nothing will be written to flash if the flash doesn't work anymore.

    1. Re:An acceptable alternative. by Bacchuss · · Score: 1

      It's not flash (EEPROM), it's battery-backed RAM. Most expensive RAID-controllers already use that so it shouldn't be a problem to implement it in HD's. They just need a little rechargeable battery.

    2. Re:An acceptable alternative. by Anonymous Coward · · Score: 1, Informative

      Since 1960 [wikipedia.org], 1 kilobyte = 1000 bytes. Just like 1 kilometre = 1000 metres. Since 1998 [wikipedia.org], 2^10 bytes = 1 kibibyte.

    3. Re:An acceptable alternative. by Johan+Veenstra · · Score: 2, Informative

      kilo = 10^3 = 1,000
      mega = 10^6 = 1,000,000
      giga = 10^9 = 1,000,000,000

      kibi = 2^10 = 1,024
      mebi = 2^20 = 1,048,576
      gibi = 2^30 = 1,073,741,824

      So it's not the harddrive manufacturers that are wrong. You get 1 gigabyte harddisk space for every gigabyte advertised. When you're buying 1 gigabyte of memory you get 74 megabytes for free (because you actually get 1 gibibyte).

    4. Re:An acceptable alternative. by stud9920 · · Score: 1, Informative

      While a dubious abuse of popular culture, the hardware manufacturers are only correct about what a kilobyte, megabyte, gigabyte is : in the SI system, we do not use powers of 2 (2^10, 2^20, 2^30), but powers of 10 (10^3, 10^6, 10^9). That's already what the data transmission guys do with kilobits, megabits, gigabits and no one ever complains about them because they are correct.

      There is no reason to make an exception, the use of kilo, mega, giga was abuse in the first place (although acceptable in engineering terms, it's only a 2.4% error)

      The standards bodies (the IEC, specifically) have produced the horrendous prefixes kibi-, mebi-, gibi- for your binary needs. They're horrible, but the only ones correct.

    5. Re:An acceptable alternative. by Sparr0 · · Score: 5, Insightful

      You have no grasp of what 'kilo', 'mega', and 'giga' mean. They have meant the same thing for 45 years, computers did not change that. There is a standard for binary powers, you simply refuse to use it.

    6. Re:An acceptable alternative. by Anonymous Coward · · Score: 0

      Fuck you! Stop using international standards to "prove" your theories.

    7. Re:An acceptable alternative. by Rinzwind · · Score: 2, Informative
      Why am I not surprised at this? First, they decide that a kilobyte = 1000 bytes, rather than the correct value of 1024. This leads the megabyte to be 1000 kilobytes, again, rather than 1024. The gig is likewise 1000 megabytes. You might think, ok, big deal, right?
      Wrong. If you start ranting, get your FACTS STRAIGHT. It was already solved in 1998.
      The Standards

      Although computer data is normally measured in binary code, the prefixes for the multiples are based on the metric system. The nearest binary number to 1,000 is 2^10 or 1,024; thus 1,024 bytes was named a Kilobyte. So, although a metric "kilo" equals 1,000 (e.g. one kilogram = 1,000 grams), a binary "Kilo" equals 1,024 (e.g. one Kilobyte = 1,024 bytes). Not surprisingly, this has led to a great deal of confusion. In December 1998, the International Electrotechnical Commission (IEC) approved a new IEC International Standard. Instead of using the metric prefixes for multiples in binary code, the new IEC standard invented specific prefixes for binary multiples made up of only the first two letters of the metric prefixes and adding the first two letters of the word "binary". Thus, for instance, instead of Kilobyte (KB) or Gigabyte (GB), the new terms would be kibibyte (KiB) or gibibyte (GiB).

      Here are brief summaries of the IEC Standard:

      Unit                 Symbol  Value
      bit                  bit     0 or 1
      byte                 B       8 bits
      kibibit              Kibit   1024 bits
      kilobit              kbit    1000 bits
      kibibyte (binary)    KiB     1024 bytes
      kilobyte (decimal)   kB      1000 bytes
      megabit              Mbit    1000 kilobits
      mebibyte (binary)    MiB     1024 kibibytes
      megabyte (decimal)   MB      1000 kilobytes
      gigabit              Gbit    1000 megabits
      gibibyte (binary)    GiB     1024 mebibytes
      gigabyte (decimal)   GB      1000 megabytes
      terabit              Tbit    1000 gigabits
      tebibyte (binary)    TiB     1024 gibibytes
      terabyte (decimal)   TB      1000 gigabytes
      petabit              Pbit    1000 terabits
      pebibyte (binary)    PiB     1024 tebibytes
      petabyte (decimal)   PB      1000 terabytes
      exabit               Ebit    1000 petabits
      exbibyte (binary)    EiB     1024 pebibytes
      exabyte (decimal)    EB      1000 petabytes
      Check this: http://www.romulus2.com/articles/guides/misc/bitsbytes.shtml and this: http://www.physics.nist.gov/Pubs/SP811/sec04.html#tab5 Stop spreading FUD.
    8. Re:An acceptable alternative. by daikokatana · · Score: 3, Insightful
      Ok, fair enough. Now step into any of the 99% of all computer shops out there and ask for a hard drive, 160 gibibyte in size.

      If they don't laugh until you exit the store, I'll pay for your disk. Please make sure you record the event and share it on the net.

      --
      http://jcsnippets.atspace.com/ - a collection of Java & C# snippets
    9. Re:An acceptable alternative. by Anonymous Coward · · Score: 0
      You get 1 gigabyte harddisk space for every gigabyte advertised. When you're buying 1 gigabyte of memory you get 74 megabytes for free (because you actually get 1 gibibyte).

      ...and that's why RamDisks are better value than normal hard disks.

    10. Re:An acceptable alternative. by Anonymous Coward · · Score: 0

      Has there been any major confusion over the differing standards (1000 and 1024) before HD companies started this?

    11. Re:An acceptable alternative. by kasperd · · Score: 2, Informative

      It's not flash (EEPROM), it's battery-backed RAM.

      The suggestion was to use both, which I agree is a good idea, because you get the best of both worlds. Flash has a problem with being overwritten many times, which the suggested design solves by only using it in case of loss of power. Battery-backed RAM has a problem with potential data loss if it needs to keep the data for a longer time than there is power, which the suggested design solves by writing data to flash as soon as main power is lost. I hope that Samsung will also take care of those problems.

      --

      Do you care about the security of your wireless mouse?
    12. Re:An acceptable alternative. by sxpert · · Score: 1

      HDDs could use those 1F capacitors, that should be enough...

    13. Re:An acceptable alternative. by rve · · Score: 1

      The confusion is probably in the fact that software, in its early days, was an almost exclusively American affair, while engineering was an international thing for a long time before that. Americans don't use the metric system, while most of the rest of the world has adopted it since the days of Napoleon.

    14. Re:An acceptable alternative. by Johan+Veenstra · · Score: 1

      If hard drives are advertised in megabytes, it stands to reason that you ask for a hard disk in megabytes. Just like you do not pull up to a gas station and ask them to fill up the tank with a certain number of megajoules.

    15. Re:An acceptable alternative. by Alioth · · Score: 2, Insightful

      Ah, so now we know your 3GB space and 100GB of transfer advertised in your sig aren't binary gigabytes, but decimal, just like the hard drive manufacturers :-)

    16. Re:An acceptable alternative. by Anonymous Coward · · Score: 0

      That standard was created after everyone (except hard drive manufacturers) were already using kilo, mega and giga to mean binary sizes, as an attempt to solve the problem without suing for false advertising. It didn't solve the confusion, it just made it even bigger.

      The meaning of kilo, mega and giga didn't change just as much as TCP/IP doesn't exist. In the real world, inventing kibi, mebi, gibi and OSI doesn't suddenly make binary kilo, mega, giga and TCP/IP go away.

    17. Re:An acceptable alternative. by Anonymous Coward · · Score: 0

      There's something puzzling here. What did the IEC do to retroactively correct 60 years of computer science texts? What will they do to correct the next 60 years of computer science texts after the authors forget to use words like "kibibyte?"

    18. Re:An acceptable alternative. by daikokatana · · Score: 1
      But that is precisely my point - when I first bought my computer, a megabyte was 1024 kilobytes, a kilobyte was 1024 bytes. I've never heard otherwise, and later on there came gigabyte (1024 megabytes) etcetera...

      And now all of a sudden the concept of giga/mega/kilo is being changed within this context!

      You compare it to filling your tank at the gas station. Suppose you have your license for over ten years. You go to fill up today, and suddenly find that one liter of gas is about 97% of what it used to be. Wouldn't you be surprised?

      I'm not trying to defend initially giving the value 1024 to kilo, mega etc..., but I really detest the 'oh we are just trying to put things right' mentality companies use, basically only to rip us off.

      --
      http://jcsnippets.atspace.com/ - a collection of Java & C# snippets
    19. Re:An acceptable alternative. by Morlark · · Score: 1

      It may not have solved the confusion, but at least it brought the computing industry back into line with the rest of the world. Standards are there for a reason. They should be adhered to.

      --
      Santa's suicide mission go!
    20. Re:An acceptable alternative. by petermgreen · · Score: 1

      depends on your definition of correct

      is correct the definitions that were originally used? or is correct the revisionist definitions the standards bodies try to push that never get followed in reality?

      --
      note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
    21. Re:An acceptable alternative. by hyfe · · Score: 3, Insightful
      You have no grasp of what 'kilo', 'mega', and 'giga' mean. They have meant the same thing for 45 years, computers did not change that. There is a standard for binary powers, you simply refuse to use it.

      Being able to keep two thoughts in your head simultaneously is a nice skill.

      Sure, kilo, mega and giga scientific meanings never changed, but kilo, mega and giga in computer science started out as the binary values. They are still in use: when reporting free space left on your hard drive, both Windows and Linux use binary thousands. Saying this is a clear-cut case is just ignoring reality, as using 1024 really does simplify a lot of the math.

      Secondly, if the manufacturers actually had come out and said 'we have decided to adhere to scientific standards and use regular 1000's' and clearly marked their products as such, we wouldn't have any problems now. The problem is, they didn't. They just silently changed it, causing shitloads of confusion along the way. Of all the alternatives in this mess, they chose the one which could ruin an engineer's day, only for the purpose of having your drive look a few % larger.

      Some fool let the marketers in on the engineering meetings and we all lived to rue that day.

      --
      "" How about taking the safety labels off everything, and let the stupidity-problem solve itself? """
    22. Re:An acceptable alternative. by 10101001+10101001 · · Score: 1

      The funny thing is, while kilo, mega, and giga are all metric prefixes in computers when it comes to storage they've all meant 2^(10*n) for n = 1, 2, and 3 respectively. Now, kilobit might mean 1024 bits or 1000 bits, depending on if it's storage or communication, respectively. This is all true because everyone pretty quickly accepted the usage to mean such. Modern computers weren't designed under metric standards (8-bit bytes, not 10-bit bytes as the standard, for example).

      The simple fact is that at some point hard drive manufacturers all decided to start labelling storage in powers of 1000 instead of powers of 1024 (compare this to RAM, which still uses powers of 1024). This wasn't something that began from the start but was solely a PR move to make the storage space sound bigger. Even when Windows, Linux, etc. continued to use MB to mean 2^20 bytes, HD manufacturers left it up to OS makers to try to explain exactly why their newly bought HD has less reported space. It's not even like HD makers can claim it makes any real sense for computers, given that every modern filesystem uses clusters that are a power of 2.

      It's only years later that a binary standard was created to try to remove the confusion. And while you might be willing to switch to using these new measurements, the fact is that HD makers won't switch to these new units precisely because they don't care if it confuses the end user when the raw number in GiBs is less than the raw number in GBs reported on the sticker. The only real surprise is that more HD makers haven't switched to reporting bits of storage to inflate the numbers even more.

      Point me to HD makers who report space in GiBs, and I'll reconsider my position that HD makers are a group of deceptive pricks.

      --
      Eurohacker European paranoia, gun rights, and h
    23. Re:An acceptable alternative. by Kiryat+Malachi · · Score: 2, Interesting

      Correct is the definitions that follow standard usage, and usage in EVERY OTHER BRANCH OF THE COMPUTER WORLD.

      How fast is a kilobit per second data transmission? Is it 1024 bits/s or 1000 bits/s?

      As much as it pains me, because I know they did it to screw customers, moving to the standard was correct. It *ought* to match everything else for reasons of consistency; it is more important to have current consistency across all current measurements inside of the computer than it is to have historical consistency of measurements used previously.

      --

      ---
      Mod me down, you fucking twits. Go ahead. I dare you.
      (I read with sigs off.)
    24. Re:An acceptable alternative. by Anonymous Coward · · Score: 0

      But that is precisely my point - when I first bought my computer, a megabyte was 1024 kilobytes, a kilobyte was 1024 bytes. I've never heard otherwise, and later on there came gigabyte (1024 megabytes) etcetera...

      As far as I can recall, hard drives have always used the kilo=1000 standard. At least back to the early 90s. The only thing that used kilo=1024 was RAM.

      I used to have one of the original 5MB hard drives that IBM made for the PC. Too bad I don't have access to it anymore. I'd love to see what the formatted capacity is.

      I really don't see why people think there's some big conspiracy here. Hard drive manufacturers used one definition of kilo (the correct one), and memory manufacturers used a different one. Never attribute to malice and all that.

      An interesting factoid: floppy disks used mega=1000*1024 for some unfathomable reason. So a 1.44MB floppy is 1000*1024*1.44=1474560 bytes. That works out to 1.40625MiB :-)
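
      A quick sketch for checking the floppy arithmetic above, in Python. The variable names are made up for illustration, and the 2880-sector figure (80 tracks, 2 sides, 18 sectors of 512 bytes) is the usual layout of a "1.44MB" disk rather than something stated in this post:

```python
# Capacity of a "1.44MB" floppy under three competing definitions of "mega".
FLOPPY_BYTES = 2880 * 512                 # 1474560 bytes

mixed_mb = FLOPPY_BYTES / (1000 * 1024)   # the floppy "megabyte" (1000 * 1024)
decimal_mb = FLOPPY_BYTES / 1000**2       # SI megabyte (10^6)
binary_mib = FLOPPY_BYTES / 1024**2       # mebibyte (2^20)

print(f"mixed:   {mixed_mb:.5f} MB")
print(f"decimal: {decimal_mb:.5f} MB")
print(f"binary:  {binary_mib:.5f} MiB")
```

      The same byte count gives three different "megabyte" figures, which is the whole argument in miniature.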

    25. Re:An acceptable alternative. by beanzy · · Score: 1

      Maybe if OSes reported HD sizes in base 10 too, then the average joe wouldn't have the perception that they are being ripped off so much.

    26. Re:An acceptable alternative. by deimtee · · Score: 1

      It's only a 2.4% error at the kilo level. It compounds with each larger prefix, which is why it is now getting corrected.
      kilo 2.40%
      mega 4.86%
      giga 7.37%
      tera 9.95%
      peta 12.59%

      2.4% is tolerable, 12.6% is not.
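
      The table can be reproduced with a few lines of Python. This is only a sketch; the function name is invented for illustration:

```python
PREFIXES = ["kilo", "mega", "giga", "tera", "peta"]

def binary_excess_percent(power):
    """Percent by which 1024**power exceeds 1000**power."""
    return (1024**power / 1000**power - 1) * 100

# Each step up in prefix multiplies the ratio by another 1.024.
for power, name in enumerate(PREFIXES, start=1):
    print(f"{name} {binary_excess_percent(power):.2f}%")
```

      The gap grows because the 2.4% compounds: it is 1.024 raised to the power of the prefix step, not 2.4% times the step.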

      --
      I'm guessing that wasn't on their radar screen...
    27. Re:An acceptable alternative. by Sparr0 · · Score: 1

      Yes. If I meant Gibibytes then I would say Gibibytes.

    28. Re:An acceptable alternative. by MikeBabcock · · Score: 1

      The major problem was that DOS/Windows misreported the sizes in MB/KB ... although they weren't being measured in true metric mega and kilo values.

      (I'm sure other OS's did too)

      At the time that 20MB drives were normal, nobody cared much about a few lost bytes -- I recovered more space from defragmenting.

      Slack space from large allocation units was more of a problem still in the 540MB drive days.

      We finally have an official "MiB vs. MB" standard; the industry as a whole is *very young* compared to, say, cars (which are young compared to metallurgy), so it's normal to have changing standards.

      Having perspective helps.

      --
      - Michael T. Babcock (Yes, I blog)
    29. Re:An acceptable alternative. by barawn · · Score: 1

      They just silently changed it, causing shitloads of confusion along the way. Of all the alternatives in this mess, they chose the one which could ruin an engineer's day, only for the purpose of having your drive look a few % larger.

      Silently? On all the hard drives I had for a long time, they specifically state what a MB (or GB) was - one billion bytes.

      Honestly, it's fairly retarded to have to define what a standard is when you're using it.

      You do realize, though, that you're complaining that they violated a convention to follow a standard, right? Standards are far more useful than conventions, because a standard is rigidly defined.

      This isn't just hard drive manufacturers, either. Theoretical data rate is often expressed in MB/s, because it's just transfer frequency times payload size. Hell, you can figure out rough hard drive capacity by bit density times area.

      The only place where binary prefixes are useful is for memory. The only place.

      I mean, jeez. The computing industry doesn't even agree on the right case on the prefix (kilo is a lowercase k, not uppercase). We haven't been consistent. This wasn't because of marketing people, it was because we were sloppy. Whoever pointed it out to us, it's our own fault. Own up, and fix it. kB = 1000 bytes. KiB (note the uppercase!) is 1024 bytes. At least now there's a standard. It's one freaking letter.
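
      To make the one-letter difference concrete, here is a rough Python sketch that prints the same byte count both ways. The function name and the unit lists are my own choices for illustration, not taken from any library:

```python
def describe_size(num_bytes):
    """Return the byte count as (SI string, IEC string), e.g. kB vs KiB."""
    def scale(base, units):
        value = float(num_bytes)
        unit = "B"
        for candidate in units:
            if value < base:
                break
            value /= base
            unit = candidate
        return f"{value:.2f} {unit}"

    si = scale(1000, ["kB", "MB", "GB", "TB"])       # decimal prefixes, lowercase k
    iec = scale(1024, ["KiB", "MiB", "GiB", "TiB"])  # binary prefixes, uppercase K
    return si, iec

print(describe_size(80 * 10**9))
```

      An "80 GB" drive is the same number of bytes either way; only the label on the number changes.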

    30. Re:An acceptable alternative. by JabberWokky · · Score: 1
      The only place where binary prefixes are useful is for memory. The only place.

      To either pick a nit or to generalize that, it's actually only useful when you're dealing with a value addressed by a binary value. Early era harddrives did measure in binary values; the early operating systems were closer to the hardware and it made sense.

      But it is only used when you're talking about something addressed in binary values - memory of some sort.

      --
      Evan

      --
      "$30 for the One True Ring. $10 each additional ring!" -- JRR "Bob" Tolkien
    31. Re:An acceptable alternative. by Bent+Mind · · Score: 1

      Hmm, I have to sit and think about this for a minute. For a measurement to be useful, it has to measure something meaningful. Now a kilobyte equals 1000 bytes. This was set up in 1998 to avoid confusion between standard metric and the pseudo-metric used by computer science. Of course, everything being truly metric now, there are ten bits to a byte and each bit represents one of ten states of a switch. Those being: on, off, and ?

      Now in the real world, a bit represents one of two states: on and off. A byte is a collection of eight bits and a word is a collection of two bytes. There is a reason that they didn't try to make bits, bytes, and words fit the metric model; they can't. Binary is base 2 and metric is base 10.

      Binary can lead to some rather large numbers. Engineers needed units to represent numbers beyond words. Granted, it would have been nice if the current units had existed at the time. However, they didn't and so they used units that came close to what they needed. In 1998, units were created that were similar, but different from the metric units. Unfortunately, the misleading pseudo-metric units were in wide use by this time.

      In your statement, you say that the problem was solved in 1998. It wasn't. Instead, people started to freely mix true metric units, which have no meaning below kilobyte, with pseudo-metric units that had been in use for the previous 30+ years. A true solution would have involved completely removing the pseudo-metric units rather than replacing them with true metric units. This would have prevented the confusion over 1000 vs 1024 while maintaining units that had meaning all the way down to the bit. There wouldn't be any hard drives measured in gigabytes. Look around though, can you currently find any gibibyte hard drives for sale? I'd say that hard drive manufacturers did lie to customers in 1998 and continue to do so today.

      --
      Request a Linux Shockwave player here: http://www.macromedia.com/support/email/wishform/
    32. Re:An acceptable alternative. by rice_burners_suck · · Score: 1
      Americans don't use the metric system,

      Then maybe we should use the pintbyte, quartbyte, and gallonbyte, instead of the less intuitive kilobyte, megabyte, and gigabyte.

    33. Re:An acceptable alternative. by Anonymous Coward · · Score: 0

      Even on PCs, a word hasn't been two bytes since the days of the 80286. The 80386 used 32 bit words, the Athlon 64 uses 64 bit words, and mainframes often use 36 bit words (comprised of either six 6 bit bytes or four 9 bit bytes, depending on which character encoding you need--"octet" is the word RFCs use now for 8 bits, because "byte" was often wrong).

    34. Re:An acceptable alternative. by ArtStone · · Score: 1

      How many seconds are there in a metric minute? How many days in a metric year? How many degrees in a metric circle?

      Just because humans have 8 fingers + 2 thumbs does not mean the universe is based on 10, any more than God measures the passage of time by how frequently the Earth rotates on its axis :)

      --
      Final 2006 "Proof of Global Warming" US Hurricane Count -> 0
  15. Re:Err... "lying" is the default setting. RTFM. by Anonymous Coward · · Score: 0

    How do I turn off write caching?

    With windows xp there is a setting available, but it no longer works with sp2; the setting is reset when you leave the dialog.

    Also, on my mac, I have two external drives in firewire enclosures--how do I turn it off here?

    In any case, this completely sucks. Are there any disks one can buy that default to a sane setting?

  16. As long as.. by cspring007 · · Score: 0

    it shows the right file when i click on it, i dont care...
    until one day the thing dies, and i lose everything.
    then, i take it outside and set it on fire.
    Who's fSync()ing now!!!!

  17. More information by Halo1 · · Score: 5, Interesting

    There was an interesting discussion on this topic a while ago on Apple's Darwin development list a while ago.

    --
    Donate free food here
    1. Re:More information by Anonymous Coward · · Score: 0

      Yes, but did it happen a while ago?

  18. Not just fsync() by Anonymous Coward · · Score: 1, Insightful

    Lots of stuff relies on knowing when blocks hit the disk. Think about it... knowing that something is on the disk means you can make assertions about write ordering. What relies on ordering? Databases and filesystems (e.g. BSD softupdates) for starters. If the disk lies to the OS about when data is written, bad stuff will happen sooner or later.
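
    As a rough illustration of what "relying on ordering" looks like from userspace, here is a minimal Python sketch. The file names and helper functions are hypothetical. Note that `os.fsync` only asks the operating system to push the data to the device; whether the drive's own write cache honors that request is exactly what this thread is arguing about.

```python
import os
import tempfile

def durable_write(path, payload):
    """Write payload and ask the OS to push it to stable storage before returning."""
    with open(path, "wb") as handle:
        handle.write(payload)
        handle.flush()              # drain Python's userspace buffer
        os.fsync(handle.fileno())   # ask the kernel to flush to the device

def write_then_commit(directory, payload):
    """Order two writes: the data should be durable before the commit marker exists."""
    data_path = os.path.join(directory, "data.bin")
    marker_path = os.path.join(directory, "data.committed")
    durable_write(data_path, payload)   # step 1: the data itself
    durable_write(marker_path, b"ok")   # step 2: only now claim it is committed
    return data_path, marker_path

with tempfile.TemporaryDirectory() as scratch:
    data_path, marker_path = write_then_commit(scratch, b"important bytes")
    print(os.path.exists(data_path), os.path.exists(marker_path))
```

    If the drive acknowledges step 1 while the bytes are still in its cache, the marker can reach the platter before the data does, and a power cut leaves a "committed" marker pointing at garbage.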

  19. Author lied when implied that DRIVES are the issue by Anonymous Coward · · Score: 5, Informative

    The author lied when he implied that DRIVES are the issue.

    ATA-IDE, SCSI, and S-ATA drives from all major manufacturers will accept commands to flush the write buffer including track cache buffer completely.

    These commands are critical before cutting power and "sleeping" in machines that can perform a complete "deep sleep" (no power at all whatsoever sent to the ATA-IDE drive).

    Such OSes include Apple's OS 9 on a G4 tower, and some versions of OSX on machines not supplied with certain naughty video cards.

    Laptops, for example need to flush drives... AND THEY do.

    All drives conform.

    As for DRIVER AUTHORS not heeding the special calls sent to them.... he is correct.

    Many driver writers (other than me) are loser shits that do not follow standards.

    As for LSI raid cards, he is right, and other raid cards... that is because the products are defective. But the drives are not and the drivers COULD be written to honor a true flush.

    As for his "discovery" of sync not working.... DUH!!!!!

    the REAL sync is usually a privileged operation, sent from the OS, and not highly documented.

    For example on a Mac the REAL sync in OS9 is a jhook trap and not the documented normal OS call which has a governor on it.

    Mainframes such as PRIMOS and other old mainframes including even unix typically faked the sync command and ONLY allowed it if the user was at the actual physical systems console and furthermore logged in as a root or backup operator.

    This cheating always sickened me, but all OSes do this because so many people that think they know what they are doing try to sync all the time for idiotic self-rolled journalling file systems and journalled databases.

    But DRIVES, except a couple S-ATA seagates from 2004 with bad firmware, ALWAYS will flush.

    This author should have explained that it's not the hard drives.

    They perform as documented.

    Admittedly Linux used to corrupt and not flush several years ago... but it was not the IDE drives. They never got the commands.

    It's all a mess... but setting a DRIVE to not cache is NOT the solution! It's retarded to do so, and all the comments in this thread talking of setting the cache off are foolish.

    As for caching device topics, there are many options.

    1> SCSI WCE permanent option

    2> ATA Seagate Set Features command 82h Disable write cache

    3> ATA config commands sent over SCSI (RAID card) device using a SCSI CDB in passthrough. It uses 16 byte CDB with 8h, or 12 byte CDB with Ah for sending the tunneled command.

    4> ATA ATAPI commands for WCE bit, as if it was SCSI

    Fibre Channel drives of course honor SCSI commands.

    As for mere flushing, a variety of low level calls all have the same desired effect and are documented in respective standards manuals.

  20. Here's how by Moraelin · · Score: 4, Informative

    For example, don't think "home user losing the last porn pic", think for example "corporate databases using XA transactions".

    The semantics of XA transactions say that at the end of the "prepare" step, the data is already on the disc (or whatever other medium), just not yet made visible. That, basically all that could possibly fail, has in fact had its chance to fail. And if you got an OK, then it didn't.

    Introducing a time window (likely extending not just past "prepare", but also past "commit") where the data is still in some cache and God knows when it'll actually get flushed, throws those whole semantics out the window. If, say, power fails (e.g., PSU blows a fuse) or shit otherwise hits the fan in that time window, you have fucked up the data.

    The whole idea of transactions is ACID: Atomicity, Consistency, Isolation, and Durability:

    - Atomicity - The entire sequence of actions must be either completed or aborted. The transaction cannot be partially successful.

    - Consistency - The transaction takes the resources from one consistent state to another.

    - Isolation - A transaction's effect is not visible to other transactions until the transaction is committed.

    - Durability - Changes made by the committed transaction are permanent and must survive system failure.

    That time window we introduced makes it at least possible to screw 3 out of 4 there. An update that involves more than one hard drive may not be Atomically executed in that case: only one change was really persisted. (E.g., if you booked a flight online, maybe the money got taken from your account, but not given to the airline.) It hasn't left the data in a Consistent state. (In the above example some money has disappeared into nowhere.) And it's all because it wasn't Durable. (An update we thought we committed hasn't, in fact, survived a system failure.)
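
    The flight-booking example can be sketched with Python's built-in `sqlite3` module. This is only an illustration of the Atomicity half: the table, account names, and `transfer` helper are invented here, and an in-memory database says nothing about Durability, which is the part that depends on the drive actually flushing.

```python
import sqlite3

def transfer(conn, source, target, amount):
    """Move amount between accounts; both updates commit together or not at all."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, source),
        )
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE name = ?", (source,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, target),
        )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [("customer", 100), ("airline", 0)]
)
conn.commit()

transfer(conn, "customer", "airline", 60)
try:
    transfer(conn, "customer", "airline", 60)  # would overdraw, so it rolls back
except ValueError:
    pass

print(dict(conn.execute("SELECT name, balance FROM accounts")))
```

    The failed second transfer leaves no trace: the debit is undone along with everything else in that transaction. With a lying write cache, the database can do all of this correctly and still lose the committed half on power failure.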

    --
    A polar bear is a cartesian bear after a coordinate transform.
    1. Re:Here's how by cahiha · · Score: 1

      The semantics of XA transactions say that at the end of the "prepare" step, the data is already on the disc (or whatever other medium), just not yet made visible. That, basically all that could possibly fail, has in fact had its chance to fail. And if you got an OK, then it didn't.

      Well, and the drives still guarantee that, as long as you don't interrupt their power. If you put the computer on a UPS (or the external enclosure), then you pretty much guarantee it.

      The small remaining probability that the drive loses power because of some other screwup is comparable to the probability that the disk block you wrote goes bad because of some other screwup. Whether you want to deal with that probability, too, is up to you.

    2. Re:Here's how by arivanov · · Score: 2, Informative

      And this is the exact reason why any good SQL based system must have means of integrity checking.

      As someone who has been writing database stuff for 10+ years now, I get really pissed off when I see lunatics raving on Acid about ACID. ACID in itself is not enough.

      You must have reference checking, offline integrity tests as well as ongoing online integrity tests. Repeating your example, a transaction for buying tickets for a holiday must insert a record in the Requests table, Tickets table, Holidays table, etc. and you must have an offline tool (even better, a background thread) which checks that all records are present. In fact for the same reason in a well designed system you must violate 3rd normal form and have the integrity checking tool use the redundant data as well. Another alternative is to load the state, compute a checksum across it, and store that checksum in at least two different places (once again breaking 3rd normal form).
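
      A minimal Python sketch of the checksum-in-two-places idea, assuming the state fits in a dictionary. The field names and helper functions are hypothetical, and SHA-256 over canonical JSON is just one convenient choice of checksum:

```python
import hashlib
import json

def state_checksum(state):
    """Checksum a state dict using a canonical (sorted-key) JSON encoding."""
    encoded = json.dumps(state, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()

def verify(state, stored_checksums):
    """Return True only if every stored copy of the checksum matches the state."""
    current = state_checksum(state)
    return all(copy == current for copy in stored_checksums)

booking = {"request": 17, "ticket": "LHR-JFK", "holiday": "NYC", "paid": 450}

# Deliberately redundant: the same checksum kept in two separate locations.
copies = [state_checksum(booking), state_checksum(booking)]

print(verify(booking, copies))   # state matches both stored checksums
booking["paid"] = 0              # simulate a lost or tampered update
print(verify(booking, copies))   # mismatch is detected
```

      An offline checker or background thread would run something like `verify` over every record and flag the ones that fail.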

      If you do it this way you can get a working system even if ACID breaks (databases have bugs), you can recover if hardware breaks and most importantly you have a considerable level of fraud resistance.

      --
      Baker's Law: Misery no longer loves company. Nowadays it insists on it
      http://www.sigsegv.cx/
    3. Re:Here's how by petermgreen · · Score: 1

      i believe some data centres don't allow in-rack UPSs for fire safety reasons and things can go wrong between the datacenter's backup supply system and your rack

      anyone remember the wikimedia incident known as "power corrupts; power failure corrupts absolutely"?

      yeah sure they had dumps and binlogs but restoring from those would have been extremely time consuming.

      --
      note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
  21. Of course it does!-Perfect world. by Anonymous Coward · · Score: 0

    "Granted this was a few years ago, but i wouldn't be surprised if it's as bad (or even worse) now."

    Gee. Who would have guessed the world wasn't perfect. Anyone who doesn't take failure into account when designing something is an idiot.

    1. Re:Of course it does!-Perfect world. by grahamsz · · Score: 2, Interesting

      Obviously everything will ultimately fail. I know that the semiconductor industry make the same part, test it to see how fast it is, then sell it as different models based on the test results.

      I was surprised that some reasonable proportion of hard drives sold have errors on them at that point in time.

      Part of me wonders if this explains the anecdotal stories that SCSI disks are more reliable than their cheaper ATA counterparts - even when they use the same physical hardware. Perhaps (and this is blind speculation) the drives with fewer errors get sold to the customers willing to pay more.

    2. Re:Of course it does!-Perfect world. by CAPSLOCK2000 · · Score: 1

      That's exactly how it works. The best disks are used for SCSI, those of less quality are used for IDE.
      Combined with (generally) higher quality of the other parts, this leads to a higher reliability of SCSI disks.

    3. Re:Of course it does!-Perfect world. by cowbutt · · Score: 4, Informative
      Part of me wonders if this explains the anecdotal stories that SCSI disks are more reliable than their cheaper ATA counterparts - even when they use the same physical hardware. Perhaps (and this is blind speculation) the drives with fewer errors get sold to the customers willing to pay more.

      Sort of. According to this paper from Seagate, the main differences between SCSI and ATA are:

      SCSI drives are individually tested, rather than tested in batch

      SCSI drives typically have a 5 year warranty, rather than 1 year for ATA (note that Seagate's ATA drives also have 5 years, and WD's Special Edition -JB ATA drives have 3 years).

      SCSI drives usually have higher rotational speeds (i.e. 10K or 15K RPM vs. 7200RPM)

      SCSI drives usually make use of the latest technology. ATA uses whatever older technology has been cost-engineered to a suitable price-point

      The physical and programming interface

      I also suspect that SCSI drives have a larger number of reserved blocks for remapping, and that they remap blocks on read operations when the ECC indicates that a block has crossed some threshold of near-unreadability. This would account for a) SCSI drives' lower capacities and b) a report I had from a SCSI-using friend running BSD who reports that a 'remapping' message turned up in his syslog without needing any special action to invoke.

      By contrast, in my experience, ATA drives only remap failed blocks on write operations. Lots of people think that when a drive returns a read error on a file, it's only fit for the bin, but I've forced the remapping to take place by writing to the affected blocks (either by zeroing the entire partition or drive using dd or badblocks -w, or by removing the affected file then creating a large file that fills all unallocated space in a partition, then removing it to reclaim the space).

    4. Re:Of course it does!-Perfect world. by pe1chl · · Score: 2, Insightful

      SCSI drives usually make use of the latest technology. ATA uses whatever older technology has been cost-engineered to a suitable price-point

      SCSI drives usually are a couple of years behind in drive capacity relative to ATA drives. This seems to contradict the above.

    5. Re:Of course it does!-Perfect world. by pe1chl · · Score: 3, Informative

      a report I had from a SCSI-using friend running BSD who reports that a 'remapping' message turned up in his syslog without needing any special action to invoke.

      SCSI drives can be set up to return "warning" codes like "I had trouble reading this sector but eventually I could read a good copy". When the driver is careful it will enable this, and when it occurs it will write back the sector to make sure a fresh copy is on the disk and/or it is remapped.
      Apparently BSD does this.

      By default, corrected sectors are just returned as OK. It is also possible to enable "auto remap on read" and the drive would be triggered to do the rewrite or remap by itself. Of course this means you have less control and less logging.
      (but you can read the remap tables)

      There are many details that can improve error handling but not all of them are fully worked out. For example, in Linux RAID-1, when a read error occurs the action is to take the drive offline, read the sector from the other disk and continue with 1 disk. Of course the proper handling would be to try writing the correct copy from the good disk back on the failed disk, and see if that fixes it. Only after several failures the disk should be taken offline, assuming that it has crashed.

      This has been like this for years, and is relatively easy to fix. I would be prepared to try fixing it but it seems one has to jump over many hurdles to get a fix in the kernel while not being the maintainer of the subsystem, and a mail to said person was not answered.
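
      To make the proposed behaviour concrete, here is a toy Python simulation. It is not Linux md code: the `Disk` class, `mirrored_read`, and the failure threshold are all invented for illustration, and it assumes a rewrite always succeeds in fixing or remapping the sector, which a real drive does not guarantee.

```python
class Disk:
    """Toy disk: a dict of sector -> bytes, plus a set of sectors that fail to read."""

    def __init__(self, sectors):
        self.sectors = dict(sectors)
        self.unreadable = set()
        self.online = True

    def read(self, sector):
        if sector in self.unreadable:
            raise IOError(f"read error at sector {sector}")
        return self.sectors[sector]

    def write(self, sector, data):
        self.sectors[sector] = data
        self.unreadable.discard(sector)  # assumption: a rewrite lets the drive remap


def mirrored_read(primary, mirror, sector, failures, max_failures=3):
    """Read from primary; on error, repair it from the mirror instead of dropping it."""
    try:
        return primary.read(sector)
    except IOError:
        data = mirror.read(sector)       # fetch the good copy
        primary.write(sector, data)      # write it back to the failing disk
        failures[primary] = failures.get(primary, 0) + 1
        if failures[primary] >= max_failures:
            primary.online = False       # only give up after repeated failures
        return data


disk_a = Disk({0: b"boot", 1: b"data"})
disk_b = Disk({0: b"boot", 1: b"data"})
disk_a.unreadable.add(1)                 # simulate one bad sector on disk A
failures = {}

result = mirrored_read(disk_a, disk_b, 1, failures)
print(result, disk_a.online, disk_a.read(1))
```

      After one bad read the array still has two usable copies of every sector, instead of running degraded and needing a full resync.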

    6. Re:Of course it does!-Perfect world. by cowbutt · · Score: 1
      SCSI drives usually make use of the latest technology. ATA uses whatever older technology has been cost-engineered to a suitable price-point

      SCSI drives usually are a couple of years behind in drive capacity relative to ATA drives. This seems to contradict the above.

      Good point. I was referring to 1.4.2/1.4.3 from the Seagate paper. Perhaps it would be more accurate to say that SCSI gets mechanical technology improvements first (hence faster rotational speeds and seek times) whereas ATA gets new recording technologies first...

      Your other reply was informative, too. I think I agree with you on how Linux RAID-1 should work.

    7. Re:Of course it does!-Perfect world. by willie150 · · Score: 1
      I think the poster was actually referring to enterprise SATA drives, like WD's raptor drives.

      These are rated at 10K and come in 36 and 72GB sizes like SCSI drives. The MTBF is measured for 24x7 use like SCSI (unlike normal SATA which is not for sustained usage). They only cost 2/3 of the price of a SCSI drive, however.

      --
      Better to stay silent, and let people think you're an idiot than to open your mouth and remove all doubt
    8. Re:Of course it does!-Perfect world. by cowbutt · · Score: 1
      I think the poster was actually referring to enterprise SATA drives, like WD's raptor drives.

      WD are pretty much unique in that they don't make SCSI drives, therefore there are no SCSI drives that use exactly the 'same physical hardware' as those ESATA Raptors.

      That said, what Seagate do differently between their Personal (i.e. [S]ATA) and Enterprise Storage (SCSI and ESATA) may well be different for other manufacturers, of course.

    9. Re:Of course it does!-Perfect world. by Anonymous Coward · · Score: 1, Insightful

      SCSI drives usually are a couple of years behind in drive capacity relative to ATA drives.

      Not really, but with 300GB drives costing $700 for the "low end" and 500GB drives in the thousands of dollars range, you're not going to see them in your local computer shop.

    10. Re:Of course it does!-Perfect world. by walt-sjc · · Score: 1

      My guess is that the bit error rate on the higher capacity platters is too high to pass quality tests for SCSI, and that it takes a while for manufacturing to catch up.

      Consider also that SCSI drives tend to have higher RPM, so it may be that writing to higher capacity platters would be more error prone at those high speeds.

      That said, I use SCSI RAID on my servers, ATA Raid on my desktops, and backup to tape (yes, even on my home machines.)

    11. Re:Of course it does!-Perfect world. by DavidTC · · Score: 1

      I'm not an expert on this, but doesn't SMART do the same thing for IDE? Return mapping messages?

      --
      If corporations are people, aren't stockholders guilty of slavery?
    12. Re:Of course it does!-Perfect world. by pe1chl · · Score: 1

      Yes, it would probably be correct to claim that SCSI uses mature technology while ATA uses more modern but less proven technology.

      Same here, I always use RAID and that is why I found that the RAID-1 in Linux does not perform optimally when the disks start to become flaky. After a single sector read error you need to resync the disks, risking that another sector is not readable on the remaining disk, while it would have been valid on the disk that is now the target of the copy.

    13. Re:Of course it does!-Perfect world. by cowbutt · · Score: 1
      After a single sector read error you need to resync the disks, risking that another sector is not readable on the remaining disk, while it would have been valid on the disk that is now the target of the copy.

      Excellent reasoning! I've now become convinced that Linux's RAID-1 should work as you wrote in your post. Unfortunately, I'm in pretty much the same boat as you with respect to getting it fixed. :-(

    14. Re:Of course it does!-Perfect world. by cowbutt · · Score: 1
      S.M.A.R.T. isn't really analogous; you need to poll a drive's S.M.A.R.T. characteristics periodically, and run periodic tests (which can impair performance whilst they're running). It also doesn't give you 'I've remapped block X' messages, merely 'I've remapped another block'. Besides, what you'd be after would be 'I thought you should know - I had a few problems reading block X just now'.

      That said, someone who's familiar with the ATA command set (which is pretty similar to SCSI's command set anyway) might well be able to point to a command that enables the same level of notification as SCSI provides.

    15. Re:Of course it does!-Perfect world. by Fweeky · · Score: 1

      Capacity isn't the only metric of a drive; SCSI drives generally concentrate on keeping seek times low, and they do this by using lower areal density (so the heads don't need to spend so long settling on a track) and physically smaller platters (so they have less distance to travel), amongst other things. I dare say this also contributes to their robustness; lower density = lower error rates, and a greater ability to cope with minor defects.

    16. Re:Of course it does!-Perfect world. by taboo959 · · Score: 1
      SCSI drives typically have a 5 year warranty, rather than 1 year for ATA (note that Seagate's ATA drives also have 5 years, and WD's Special Edition -JB ATA drives have 3 years)

      The funny thing is.....that longer warranty for the ATA 'cuda is fairly recent (as in the last cpl/few years). It always kind of appalled/puzzled me that the ATA 'cuda warranty was 1 year or 3 years (depending on how far back we're talking about), while the SCSI 'cuda was 5....

  22. Sadly unpredictable by grahamsz · · Score: 5, Interesting

    i know all disks ultimately fail, but it's frustrating that some can be really abused and run for years, when others die abruptly.

    While working at said hard disk company i had one of their smaller disks sitting on the end of a steel ruler on my desk. I spun round on my chair, as i do when i'm thinking, and hit the other end of the ruler with my elbow. This of course launched the disk across the room, slamming it against the wall.

    Given that I was in the process of writing software to diagnose failures I was quite excited about this accident. Of course i returned the disk to the test setup and there was nothing wrong.

    In my experience, the only sure fire way to have a disk fail is to place any piece of important, but un-backed-up, work on it.

    1. Re:Sadly unpredictable by thePjunisher · · Score: 1
      In my experience, the only sure fire way to have a disk fail is to place any piece of important, but un-backed-up, work on it.


      That, or dropping it while it's spinning...
    2. Re:Sadly unpredictable by ben_rh · · Score: 1

      In my experience, the only sure fire way to have a disk fail is to place any piece of important, but un-backed-up, work on it.

      I completely agree.

      I've seen quite a few dead disks in my time (fixing friends' and relatives' machines, etc), but the only time I've actually seen a disk die in the arse as I used it was when I had just copied 20G of my own data on to it.

      The crap part was the data wasn't backed up.

      To add insult to injury, the reason the data wasn't backed up was that I was using the drive as a temporary scratch drive while I was creating a RAID5 array for said data on my main disks.

      Thanks Murphy.

    3. Re:Sadly unpredictable by UrgleHoth · · Score: 1


      In my experience, the only sure fire way to have a disk fail is to place any piece of important, but un-backed-up, work on it


      Maybe this is some obscure evidence that there is an "importance" attribute to a quark. Following entropy, too much "importance" causes a destabilization of the quantum structure and the drive fails.

      --

      Dogma - "let's just say we'd like to avoid any empirical entanglements."
    4. Re:Sadly unpredictable by sconeu · · Score: 1

      Back in the day, we were developing a handheld box for the airforce.

      One of our requirements was a 5-foot drop nonoperational, and a 3-foot drop operational. We were using the original HP 40MB "matchbox" drives.

      Turns out that it was easier to pass the operating test if we dropped from 5 feet instead of 3 feet, because it gave the internal accelerometer more time to park the heads!

      --
      General Relativity: Space-time tells matter where to go; Matter tells space-time what shape to be.
    5. Re:Sadly unpredictable by RancidMilk · · Score: 1

      I find a lot of truth in that. I had 4 drives in my computer, with two running striping, a third one with my OS, and the last with all my important data.

      First, the data drive died. So, I quickly threw it in the freezer, pulled it out after a while, put it back into my computer, and copied the files to my raid drive. The next day, the raid crashed, with both drives failing.

      So, I gave up and installed stuff to my OS drive. Not two weeks later! It crashes.

      Since then, I have bought 4 other hdds, and 2 more have failed.

      Conclusion:
      1. Seagate Drives will work forever if they are working after a month of having them.
      2. Western Digital Drives can handle the OS, but if you put your important data on them, watch out.
      3. IBM sucks some seriously sweaty...
      4. Maxtor, better than WD on occasion, but still crashes with the best of them.
      5. Save your data more than you brush your teeth, and save it to every media that you have available.

  23. Re:Err... "lying" is the default setting. RTFM. by Yokaze · · Score: 4, Insightful

    No. If you had no cache, there would be no need for a flush command. The flush command exists purely to flush the buffers and caches on the hard disk. ATA-5 specifies the command as E7h (and as mandatory).

    The command is specified in practically all storage interfaces for exactly the reason the author cited: integrity. Otherwise, you can't assure integrity without sacrificing a lot of performance.

    --
    "Between strong and weak, between rich and poor [...], it is freedom which oppresses and the law which sets free"
  24. Why do we need it?-Audiance stuffing. by Anonymous Coward · · Score: 0

    "If we've made it this far without it, why do we need it now?"

    There would be very few slashdot comments otherwise.

  25. Damn processor industry... by fo0bar · · Score: 3, Funny

    using the wrong definitions to make their products seem bigger. I bought a P4 2.4GHz CPU the other day, and was shocked to find it wasn't 2,576,980,377.6Hz like it should be! Lying thieves...

    1. Re:Damn processor industry... by Anonymous Coward · · Score: 0

      Because all the CPUs you bought before were 1024 Hz to the MHz, right?

    2. Re:Damn processor industry... by spectecjr · · Score: 2, Informative

      using the wrong definitions to make their products seem bigger. I bought a P4 2.4GHz CPU the other day, and was shocked to find it wasn't 2,576,980,377.6Hz like it should be! Lying thieves...

      Sad to see this post being marked "insightful". 2.4GHz has always meant 2,400,000,000,000 cycles per second, and nothing else. No matter what speed your crystal clocks at.

      The original poster was being ... kind of sarcastic.

      --
      Coming soon - pyrogyra
    3. Re:Damn processor industry... by spectecjr · · Score: 3, Funny

      Ooops. Make that 2,400,000,000 not 2,400,000,000,000. That's the problem with big numbers - it's like spelling bananananananananananana - once you start you can't stop.

      --
      Coming soon - pyrogyra
    4. Re:Damn processor industry... by xtracto · · Score: 1

      2.4GHz has always meant 2,400,000,000,000 cycles per second

      Nice that you corrected, I thought "This guy really has a HELL of a fast machine!" =o)

      --
      Ubuntu is an African word meaning 'I can't configure Debian'
    5. Re:Damn processor industry... by kEnder242 · · Score: 1

      Don't know if you're trying to continue the spirit of the grandparent joke, but I'm going to reply seriously.

      There's metric (1,000 meters = 1km) and binary metric (1024Bytes = 1MB).

      1 Hz is 1 cycle per second... I can understand why memory is counted in 2^10 instead of 10^3, but I think Hz has been used for things before binary became trendy, and should therefore not follow the 1024 "rule".
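
      To put numbers on it, here is a small sketch of where the figure in the original joke comes from. The variable names are just for illustration; the only inputs are the 2.4 from the thread and the two multipliers.

```python
from fractions import Fraction

# Exact arithmetic, so no floating-point rounding muddies the comparison.
clock = Fraction("2.4")

SI_GIGA = 10**9      # decimal prefix: what "G" means in GHz
BINARY_GIGA = 2**30  # 1024-based multiple, as used for memory sizes

decimal_hz = clock * SI_GIGA      # the real meaning of 2.4 GHz
binary_hz = clock * BINARY_GIGA   # the joke's "should be" figure

if __name__ == "__main__":
    print(f"decimal: {float(decimal_hz):,.1f} Hz")
    print(f"binary:  {float(binary_hz):,.1f} Hz")
    # The binary reading is about 7.4% larger than the decimal one.
    print(f"ratio:   {float(binary_hz / decimal_hz):.6f}")
```

      The binary reading works out to 2,576,980,377.6, which is the number in the joke; the decimal reading is plain 2,400,000,000.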

      --
      my associative arrays can kick your hash - TCL
    6. Re:Damn processor industry... by JabberWokky · · Score: 1
      Uh... no. Hertz is cycles per second and the usage of it predates digital computers by quite a bit. As a result, it uses the decimal SI prefixes just like 1 kilometer is 1000 meters exactly, not 1024 meters. You'd screw up quite a bit of acoustic, light and other measurements if you were to try and use a binary-based prefix with hertz.

      If you doubt this, take a look at a nearby radio and realize what those station numbers refer to. Marconi predates Moore by just a bit.

      --
      Evan

      --
      "$30 for the One True Ring. $10 each additional ring!" -- JRR "Bob" Tolkien
    7. Re:Damn processor industry... by unitron · · Score: 1

      Obviously it's a fast machine. Look how quickly it repeated the zero key. :-)

      --

      I see even classic Slashdot is now pretty much unusable on dial up anymore.

    8. Re:Damn processor industry... by chthon · · Score: 1

      You seem to have rather large bytes...

    9. Re:Damn processor industry... by Lush_trashed · · Score: 1

      I think you mean 1024Khz per mhz

    10. Re:Damn processor industry... by ShagratTheTitleless · · Score: 3, Funny
      2,400,000,000,000 cycles per second

      Fucking Trillahertz! or is it Terrahertz! Either way, imagine a Beowulf Clu....[Post terminated by Karma Police]

      --
      Sometimes at night I imagine the darkness is filled with horrible things with too many teeth, like Julia Roberts.
    11. Re:Damn processor industry... by the_bard17 · · Score: 1

      *cough* Tesla *cough*

    12. Re:Damn processor industry... by pegr · · Score: 1

      Uh... no. Hertz is cycles per second and the usage of it predates digital computers by quite a bit. As a result, it uses the decimal SI prefixes just like 1 kilometer is 1000 meters exactly, not 1024 meters. You'd screw up quite a bit of acoustic, light and other measurements if you were to try and use a binary-based prefix with hertz.

      If you doubt this, take a look at a nearby radio and realize what those station numbers refer to. Marconi predates Moore by just a bit.


      Pun acknowledged. Excellent job, young Jedi!

    13. Re:Damn processor industry... by spectecjr · · Score: 1


      Uh... no. Hertz is cycles per second and the usage of it predates digital computers by quite a bit. As a result, it uses the decimal SI prefixes just like 1 kilometer is 1000 meters exactly, not 1024 meters. You'd screw up quite a bit of acoustic, light and other measurements if you were to try and use a binary-based prefix with hertz.


      In other words, you agree with me that 2.4GHz is 2,400,000,000 cycles per second. Thanks for that. So, uh, no, I'm already right, and uh, thanks for the confirmation.

      jeeeeeezus cheerist.

      --
      Coming soon - pyrogyra
    14. Re:Damn processor industry... by spectecjr · · Score: 1

      Sorry. Apparently the modded down post you replied to vanished, but the rest of your thread didn't, so I thought you were replying to me. My apologies.

      --
      Coming soon - pyrogyra
    15. Re:Damn processor industry... by fm6 · · Score: 1
      2.4GHz has always meant...
      It's interesting that Language Nazis don't get irony. I guess that makes sense -- subtle forms of expression can't be codified!
    16. Re:Damn processor industry... by Anonymous Coward · · Score: 0

      I think you both mean 1024kHz per MHz.

  26. Not really a Lie by bgog · · Score: 2, Informative

    It's not a lie. fsync syncs to a device. The device is a hard drive with a cache.

    You'd expect an fsync to complete only when the data is physically written to disk. However, usually this is not the case: it completes when the data is fully written to the cache on the physical disk.

    The downside of this is that it's possible to lose data if you pull the power plug (usually not just by hitting the power switch). However, if the disks were to actually commit fully to the physical media on every fsync you would see a very, very dramatic performance degradation. Not just a little slower so you look bad in a magazine article but incredibly slow, especially if you are running a database or similar application that fsyncs often.

    Server-class machines solve this problem by providing battery-backed cache on their controllers. This allows full-speed operation by fsyncing only to cache, but if power is lost the data is still safe because of the battery.

    This doesn't matter too much for the average joe for a number of reasons. First, when the power switch is hit, the disks tend to finish writing their caches before spinning down. In the case of a power failure, journaled file systems will usually keep you safe (but not always).

    This is a big issue however if you are trying to implement an enterprise class database server on everyday hardware.

    So turn off the write cache if you don't want it on but don't complain when your system starts to crawl.

    1. Re:Not really a Lie by ravenspear · · Score: 3, Informative

      However, if the disks were to actually commit fully to the physical media on every fsync you would see a very, very dramatic performance degradation. Not just a little slower so you look bad in a magazine article but incredibly slow, especially if you are running a database or similar application that fsyncs often.

      I think you are confusing write caching with fsyncing. Having no write cache to the disk would indeed slow things down quite a bit. I don't see how fsync fits the same description though. Simply honoring fsync (actually flushing the data to disk) would not slow things down anywhere near the same level as long as software makes intelligent use of it. Fsync is not designed to be used with every write to the disk, just for the occasional time when an application needs to guarantee certain data gets written.
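
      A rough sketch of the difference, assuming a made-up helper called write_records and a throwaway temp file: one mode calls fsync after every record, the other calls it once when the batch is done. The timings depend entirely on your OS, filesystem and drive (and on whether the drive honors the flush at all), so treat the printed numbers as something to try on your own box, not as a benchmark.

```python
import os
import tempfile
import time

def write_records(path, records, sync_every_write):
    """Write records to path and return how many fsync() calls were made."""
    syncs = 0
    with open(path, "wb") as f:
        for rec in records:
            f.write(rec)
            if sync_every_write:
                f.flush()             # push Python's own buffer to the OS
                os.fsync(f.fileno())  # ask the OS to push it to the device
                syncs += 1
        # One final sync covers everything written so far.
        f.flush()
        os.fsync(f.fileno())
        syncs += 1
    return syncs

if __name__ == "__main__":
    records = [b"x" * 512 for _ in range(200)]
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "data.bin")
        for mode in (True, False):
            start = time.perf_counter()
            syncs = write_records(path, records, mode)
            elapsed = time.perf_counter() - start
            print(f"sync_every_write={mode}: {syncs} fsyncs, {elapsed:.4f}s")
```

      If the two modes take about the same time on a spinning disk, that in itself is a hint that the flush is being answered from a cache somewhere down the chain.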

    2. Re:Not really a Lie by bgog · · Score: 1

      You are correct. However, the article is actually talking about the fact that the data is not on the physical disk when the author fsyncs. That's what I was addressing. And the cache that I'm talking about is the physical cache memory on the drive, not the OS write cache. The "fix" command he mentions, using hdparm, instructs the drive to disable its onboard write cache.

      What you said is exactly why it doesn't matter to most people and applications. The author was using fsync with the expectation that it would not return until all data was on the platters.

      You mention that fsync is flushing the data to disk. This is close but not entirely true. In Linux, for example, it flushes the data to the "drive": the OS is satisfied once the data has arrived at the drive, but it is often still in the drive's physical cache (not the OS cache).

      There are, however, certain types of systems that do indeed need to ensure that data is physically written to disk every time. Transactional systems, for example. In these cases it is required either to disable the (physical on-board) cache of the drive or to utilize a disk subsystem with battery-backed cache.

      Anyway, I think we mostly agree. The author is pretty much having a cow about a known issue that only matters in specialized systems, and for those there are solutions.

    3. Re:Not really a Lie by OeLeWaPpErKe · · Score: 1

      If that is true, explain why this test (as demonstrated in the article) fails.

      I eagerly await your reply.

    4. Re:Not really a Lie by surprise_audit · · Score: 1
      The author was using fsync with the expectation that it would not return until all data was on the platters.

      According to the fsync man page, that's actually a reasonable expectation:

      fsync copies all in-core parts of a file to disk, and waits until the device reports that all parts are on stable storage.
      I.e. "stop caching and actually write all cache and buffers out to the platters." I don't think that non-battery-backed cache in the drive electronics would be included in the definition of "stable storage".
    5. Re:Not really a Lie by bgog · · Score: 1

      Yep, but that isn't what happens. On many, many systems, unless you disable the drive's write cache, it will not write to disk. You can see this by doing the following. Write two programs for Linux: one that sits on computer A, sends data to computer B, and expects a response when it is synced. Program two resides on computer B; it receives the data, writes it, fsyncs, and then responds.

      Get your programs running and pull the plug out of computer B. When it comes back up you will sometimes find that it told computer A it had synced the last piece of data but it actually hadn't.
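
      A minimal sketch of the computer B side, not the actual code from our system. The function name serve_one, the "ACK" reply and the 4096-byte read are all made up for illustration, and the demo uses a local socket pair in place of two real machines.

```python
import os
import socket
import tempfile

def serve_one(conn, path):
    """Receive one record, persist it, and only then acknowledge it."""
    data = conn.recv(4096)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        os.write(fd, data)  # small record, assumed to be written in one call
        os.fsync(fd)        # returns once the OS says the device has the data
    finally:
        os.close(fd)
    # From here on the sender believes the record is safe on disk.
    conn.sendall(b"ACK")
    return len(data)

if __name__ == "__main__":
    # Stand-in for computers A and B: a connected pair of local sockets.
    a, b = socket.socketpair()
    with tempfile.TemporaryDirectory() as d:
        log = os.path.join(d, "received.log")
        a.sendall(b"record-1\n")
        serve_one(b, log)
        print("sender got:", a.recv(3))
    a.close()
    b.close()
```

      The interesting part is the gap between os.fsync returning and the bits reaching the platters. If the plug is pulled in that gap, A has an ACK for a record that B no longer has.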



      I work on a transactional messaging system and observed this exact behavior with drives from several vendors. You can disable the physical write cache on the drive and rerun the test and it will work fine. No lost data.

      This is probably what the author means by lie. Anyway you have a fine point about the fsync man-page.

    6. Re:Not really a Lie by bgog · · Score: 1

      Perhaps you misunderstood me. I wasn't trying to say the author was wrong. The test will fail. I've seen this behavior in the past. My contention is only that it pretty much doesn't matter to an ordinary application (proven by the fact that all our computers and servers work just fine). It is mostly an issue with transactional systems, and there are solutions to the problem for them.

    7. Re:Not really a Lie by Anonymous Coward · · Score: 0

      Fsync is not designed to be used with every write to the disk, just for the occasional time when an application needs to guarantee certain data gets written.

      The parent obviously knows this, as he specifically referred to databases or similar applications, where fsync is called a lot more often than usual.

  27. Is this a lie? by gaanagaa · · Score: 0, Offtopic

    I bought this PC recently. There was a C:\ and a D:\. One day both failed at the same time. I opened the case and I saw only one hard disk. Where did the other one go? Who lied to me? The dealer or the hard disk?

    1. Re:Is this a lie? by Phil246 · · Score: 1

      its the one disk, partitioned into two drives.
      If the dealer said you have two hard disks, he lied. Otherwise neither did.

  28. Re:Err... "lying" is the default setting. RTFM. by Anonymous Coward · · Score: 0

    simple fix... just dont power down. (its better on the wear and tear anyways nowadays)

  29. Re:Author lied when implied that DRIVES are the is by Sinner · · Score: 3, Interesting

    Parent either doesn't know what he's talking about, or is a troll. Pity there isn't an "incoherent rant" moderation option, or we could avoid the ambiguity.

    --
    fish and pipes
  30. He misunderstands fsync() by Dahan · · Score: 4, Informative
    According to SUSv3:
    The fsync() function shall request that all data for the open file descriptor named by fildes is to be transferred to the storage device associated with the file described by fildes. The nature of the transfer is implementation-defined.
    If _POSIX_SYNCHRONIZED_IO is not defined, the wording relies heavily on the conformance document to tell the user what can be expected from the system. It is explicitly intended that a null implementation is permitted. This could be valid in the case where the system cannot assure non-volatile storage under any circumstances or when the system is highly fault-tolerant and the functionality is not required. In the middle ground between these extremes, fsync() might or might not actually cause data to be written where it is safe from a power failure.
    (Emphasis added). If you don't want your hard drive to cache writes, send it a command to turn off the write cache. Don't rely on fsync(). Either that, or hack your kernel so that fsync() will send a SYNCHRONIZE CACHE command to the drive. That'll sync the entire drive cache though, not just the blocks associated with the file descriptor you passed to fsync().
    1. Re:He misunderstands fsync() by Anonymous Coward · · Score: 1, Informative

      Please note that the definition says ...all data ... is to be transferred to the storage device... and that is what fsync() is actually doing! fsync() is not required to transfer to the physical disk; just sending the data to the hard disk is enough. What happens inside the hard disk is not of interest to fsync(): the idea is to flush all buffers in the software, and the specs are not talking about the buffers in the hardware.

    2. Re:He misunderstands fsync() by Anonymous Coward · · Score: 0

      On some operating systems, fsync also flushes the disk cache. And in almost all cases fsync flushes the caches in disk controllers.

    3. Re:He misunderstands fsync() by mukund · · Score: 1

      A `synchronize cache' command is not going to help in the case of a journaling filesystem, as the order of block writes is lost when an on-disk cache is used. This means having to run the `synchronize cache' command each time something is written to disk, which is just as good as turning write cache off.

      --
      Banu
    4. Re:He misunderstands fsync() by Dwonis · · Score: 1
      Quoting the same document:
      The fsync() function is intended to force a physical write of data from the buffer cache, and to assure that after a system crash or other failure that all data up to the time of the fsync() call is recorded on the disk. Since the concepts of "buffer cache", "system crash", "physical write", and "non-volatile storage" are not defined here, the wording has to be more abstract.

      If _POSIX_SYNCHRONIZED_IO is not defined, the wording relies heavily on the conformance document to tell the user what can be expected from the system. It is explicitly intended that a null implementation is permitted. This could be valid in the case where the system cannot assure non-volatile storage under any circumstances or when the system is highly fault-tolerant and the functionality is not required. In the middle ground between these extremes, fsync() might or might not actually cause data to be written where it is safe from a power failure. The conformance document should identify at least that one configuration exists (and how to obtain that configuration) where this can be assured for at least some files that the user can select to use for critical data. It is not intended that an exhaustive list is required, but rather sufficient information is provided so that if critical data needs to be saved, the user can determine how the system is to be configured to allow the data to be written to non-volatile storage.

      It is reasonable to assert that the key aspects of fsync() are unreasonable to test in a test suite. That does not make the function any less valuable, just more difficult to test. A formal conformance test should probably force a system crash (power shutdown) during the test for this condition, but it needs to be done in such a way that automated testing does not require this to be done except when a formal record of the results is being made. It would also not be unreasonable to omit testing for fsync(), allowing it to be treated as a quality-of-implementation issue.

      i.e. If your storage device is a RAM disk, you don't violate the spec by losing data that was fsync()'d to it upon a crash.

      In any case, I think it would be more relevant to cite the appropriate parts of the ATA or SCSI specs, not POSIX. Hard drives don't implement POSIX.

    5. Re:He misunderstands fsync() by Anonymous Coward · · Score: 0

      The original post was correct in quoting POSIX, since the author of the article is using fsync(2) and expecting it to send an ATA flush, which it is not required to do based on that POSIX quote.

    6. Re:He misunderstands fsync() by Dwonis · · Score: 1
      My argument is that, if sending an ATA flush is what is required to "assure that after a system crash or other failure that all data up to the time of the fsync() call is recorded on the disk", then sending an ATA flush is required by POSIX. The exception is where it is impossible to make that assurance, because of limitations of the media (such as on a RAM disk).

      What else is the point of providing a standardized fsync() call at all?

  31. Re:Err... "lying" is the default setting. RTFM. by turbofisk · · Score: 1

    I remember that MS had a fix for this (for laptops etc)... Which just made Windows wait a duration (~30s)... I call that breaking the OS... Fix the underlying problem. Cause and effect. MS should be kicking the teeth of the hard drive manufacturers on pure principle at least. Because let's face it, if MS wants to do anything, MS will do anything they damn please. This can of course be a good thing from time to time!

  32. fsync IS important by carstenkuckuk · · Score: 2, Informative

    fsync semantic is needed whenever you want to implement ACID transactions. This lies at the core of database systems and journaling file systems, for example. No fsync, no data integrity.
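
    As a small illustration of how much leans on fsync, here is a sketch of the usual "write a temp file, sync it, rename it over the old one" pattern. The name durable_replace is made up, the directory sync at the end is POSIX-only, and all of it still assumes that fsync really reaches stable storage, which is exactly what the article calls into question.

```python
import os
import tempfile

def durable_replace(path, data):
    """Replace path with data so a crash leaves the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # new contents must be down before the rename
        os.replace(tmp, path)     # atomic swap of the directory entry
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
    # Sync the directory too, so the rename itself is recorded (POSIX only).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "balance.txt")
        durable_replace(target, b"100\n")
        durable_replace(target, b"250\n")
        with open(target, "rb") as f:
            print(f.read())
```

    If the drive acknowledges the flush from its cache, the rename can hit the platters before the data does, and the atomicity and durability you thought you had are gone.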

  33. Here's how-plan to fail. by Anonymous Coward · · Score: 0

    "That time window we introduced makes it at least possible to screw 3 out of 4 there. An update that involves more than one hard drive may not be Atomically executed in that case: only one change was really persisted. (E.g., if you booked a flight online, maybe the money got taken from your account, but not given to the airline.) It hasn't left the data in a Consistent state. (In the above example some money have disappeared into nowhere.) And it's all because it wasn't Durable. (An update we thought we committed hasn't, in fact, survived a system failure.)"

    True. However anyone designing such a critical system will not have a single point of failure. But will have multiple fail-safes. Remember failure is always "just around the corner" e.g. power failure, bad media, cosmic ray, server gets slashdotted, etc.

  34. Re:Err... "lying" is the default setting. RTFM. by Anonymous Coward · · Score: 0

    There is _much_ to see here!

    This has nothing to do with whether the write caches are enabled or not.

    Unless you have an intelligent controller with battery backup, the fsync chain should issue a SYNC command to the disk and flush the cache on the disk, the disk controller and the OS. If this does not happen, then there is a bug.

  35. Re:Err... "lying" is the default setting. RTFM. by leuk_he · · Score: 1

    I see a market for an extended "real flush", since HD makers are not going to change this practice soon (just as they count gigabytes differently from the OS).

  36. Small correction of his name. by deadcujo · · Score: 1, Informative

    His surname is actually Fitzpatrick and not Fitzgerald.

    1. Re:Small correction of his name. by Anonymous Coward · · Score: 0
      His surname is actually Fitzpatrick and not Fitzgerald.

      He actually fits Patrick and not Gerald.

      Not that there's anything wrong with that!!!

    2. Re:Small correction of his name. by Anonymous Coward · · Score: 0

      Two Irish Gays

      Patrick Fitzgerald & Gerald Fitzpatrick

      Boom Boom

  37. RTFM by BigYawn · · Score: 2, Informative
    From the fsync man page (section "NOTES"):

    In case the hard disk has write cache enabled, the data may not really be on permanent storage when fsync/fdatasync return.
    When an ext2 file system is mounted with the sync option, directory entries are also implicitly synced by fsync.
    On kernels before 2.4, fsync on big files can be inefficient. An alternative might be to use the O_SYNC flag to open(2).
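
    For the O_SYNC alternative, a minimal sketch from Python, assuming a Unix-like system where os.O_SYNC exists (it does not on Windows). The helper name write_synchronously is made up. The same caveat from the note above applies: with the drive's write cache enabled, "synchronous" may still mean "in the drive's cache".

```python
import os
import tempfile

def write_synchronously(path, data):
    """Write data through a descriptor opened with O_SYNC."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC
    fd = os.open(path, flags, 0o644)
    try:
        # With O_SYNC each write() is meant to return only after the OS
        # has pushed the data to the device, with no separate fsync() call.
        return os.write(fd, data)
    finally:
        os.close(fd)

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "journal.bin")
        written = write_synchronously(path, b"important\n")
        print(written, "bytes written")
```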

  38. drive write caching _is unsafe_. by TwoCans · · Score: 1

    It's all a mess... but setting a DRIVE to not
    cache is NOT the solution! It's retarded to do
    so, and all the comments in this thread talking
    of setting the cache off are foolish.


    I call BS here.

    When a power failure occurs, even if you are running a journalling filesystem, you _really_, _really_, _really_ care about drive write caching.

    If the drive has the journalled data in its write cache and not on disk at the time of the power failure, you've just corrupted your filesystem because the journal replay will not be correct when you next mount your filesystem. Journalled filesystems rely on the journal being on disk when the I/O completes and not some time after.

    Toy (*ATA) drives use write cache for _performance_ because otherwise they suck really badly. And that results in filesystem corruption on power loss. I've seen the mess it makes far too many times....
    1. Re:drive write caching _is unsafe_. by putaro · · Score: 4, Informative

      Let's try a reply with a bit less flame attached.

      A journaling file system will know when it needs to get everything committed to disk in order to have a consistent state. At that point it will issue a sync to the drive to flush the drive's write cache. However, not every write has to get to the disk for the filesystem to be in a consistent state.

      Now, you're yelling BS, BS, BS...hold on and listen for a minute. I write file systems for a living and have done so for over 15 years.

      What is the commitment that a journaling file system makes to you? It makes the commitment that it will not be in an inconsistent state. It doesn't make the commitment that every last write will make it to disk. For example, ext3 in journaling mode only journals metadata transactions. Any data writes that you make are not guaranteed at all, unless you make the proper sync call. As someone pointed out above, fsync is not the proper call on many OS's.

      The way that we have settled on to make filesystems and databases work is to create atomic transactions and move from transaction to transaction. If a transaction fails (for any reason, but let's just assume it's because the system crashed), all of the data that was written as part of it is discarded when you restart. If the partial data was not discarded then the filesystem would be in an inconsistent state AND the data that you were writing (if you care about consistency) would be in an inconsistent state. So, forcing every write to immediately go to disk is pointless as if the transaction you're doing is interrupted you'll be discarding the data anyway. It's only when you are finishing the transaction that you need to make sure that everything is on disk. By that time it might be already, especially if that transaction was large.
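
      A toy sketch of that idea, nothing like a real journal: records are appended freely, the only sync happens when the COMMIT marker goes down, and recovery throws away whatever follows the last marker. The names append_transaction and recover and the line-based log format are invented for the example.

```python
import os
import tempfile

COMMIT = b"COMMIT\n"

def append_transaction(log_path, lines, commit=True):
    """Append records; they only count once the COMMIT marker is synced."""
    with open(log_path, "ab") as f:
        for line in lines:
            f.write(line + b"\n")  # no sync needed for individual records
        if commit:
            f.write(COMMIT)
            f.flush()
            os.fsync(f.fileno())   # the one point where durability matters

def recover(log_path):
    """Return committed records and cut off any unfinished transaction."""
    committed, pending = [], []
    good_end = 0
    with open(log_path, "rb+") as f:
        while True:
            raw = f.readline()
            if not raw:
                break
            if raw == COMMIT:
                committed.extend(pending)
                pending = []
                good_end = f.tell()
            else:
                pending.append(raw.rstrip(b"\n"))
        f.truncate(good_end)       # discard the partial transaction
    return committed

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        log = os.path.join(d, "txn.log")
        append_transaction(log, [b"debit 50", b"credit 50"])
        # Simulate a crash: records written, COMMIT marker never made it.
        append_transaction(log, [b"debit 10"], commit=False)
        print(recover(log))
```

      Here the lone "debit 10" is dropped on recovery, which is the whole point: which of its blocks made it to disk doesn't matter. What does matter is that the sync at commit time is honored.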

      Let's take a simple situation. Say that you have a filesystem that guarantees that every time you do a write() call, when the call returns that data will be on disk and available for you the next time, and that if the write() errors or does not return, the file will be as it was before the write() was called. Now, you do a write of 100MB with a single call. The filesystem may scatter that data all over the disk depending on how fragmented it is. Forcing each write to disk in order will bang the head a lot and reduce your performance. By letting the write cache do its job and reorder writes as necessary, your performance will be much better. (We used to do this in the driver and file system cache. However, modern disk drives provide such an abstract interface that it's nearly impossible for the OS to micromanage write ordering. In the old days the OS knew where the head was because it told the damn drive where to put it. Now, you can sort of guess and you're usually wrong.) Cache on ATA drives tops out at around 16MB so you will definitely flush most of the data out of the cache in the course of writing anyway. Finally, at the end, before returning, the FS would sync the drive's cache to the disks and mark the transaction as closed. Were the system to crash in the middle of the write, when the system restarts it would need to discard any data that might have been written, and it wouldn't matter which data had been written or not written. (Important note: Journaling file systems and databases have a recovery process after a crash. It's just a lot less involved than running fsck or CHKDSK over the whole disk.)

      So, write caching is valuable and widely used. In order to avoid data corruption it's not necessary to turn off caching but it is necessary for the cache to do what it is told, when it is told (all of the write caches too, not just the disk's). Were the disks truly lying to the OS it would be bad. More likely, this guy's Perl script is just not OS specific enough to get the OS to really do what he thinks he is asking it to do. There's a reason why serious data management apps need to be ported and certified on an OS. Getting everything to do its job right is tough.

    2. Re:drive write caching _is unsafe_. by davegaramond · · Score: 1

      Ok, so for laymen like us, if we have important data on an IDE drive managed by Linux+PostgreSQL 8+ext3, should we turn on write caching or not?

    3. Re:drive write caching _is unsafe_. by putaro · · Score: 1

      Leave the write caching on and make sure you have a backup. If you don't have any backups your data can't be that important :-). If it's really important, you might want to track down whether your drive does the right thing, or plop down a few more bucks and run SCSI drives or a (quality) RAID subsystem.

  39. Re:Err... "lying" is the default setting. RTFM. by Anonymous Coward · · Score: 0
    If you had no cache, there would be no need for a flush command.
    Wrong! Your assumption is that the flush command is all about the hardware, and it doesn't take into account the cache in the OS and possibly in the device drivers. Actually, my understanding is that fsync() is only supposed to flush these software caches, and its specs do not talk about the cache inside the hard disk hardware. So, fsync() just flushes to the device but does not flush to the physical disk.
  40. Mistaken assumption! by renuk007 · · Score: 1

    I guess one would assume that fsync() is meant to flush all data to the disk. Yes and no. It means, if you have in-memory buffers (like all OSs do) they should be flushed to the storage system. It does NOT guarantee that the storage sub-systems themselves will be flushed. To ensure that, most OSs just force the subsystems to remain idle for a few seconds, which is sufficient for them to write-back their cache contents. That's how it's been for 30 years ... so what's new?

  41. Please be clearer by leehwtsohg · · Score: 1

    Hey, parent, or someone who knows something - could you maybe elaborate?
    It seems that the problem, as pointed out by the parent and posts below, is not with the disks, but with fsync() - it seems that fsync only promises to give all data to the disk, but nothing about whether or not it is actually written (?). So, disks could actually flush, if one only asked them nicely enough? How? Can this be tested? Implemented?
    Do journaling FS just use regular fsync? Can they use some other call that actually does flush?

    1. Re:Please be clearer by surprise_audit · · Score: 1
      The man page states: fsync copies all in-core parts of a file to disk, and waits until the device reports that all parts are on stable storage.

      Here's a stupid, real world example: you go to the bank and pay in some money. The teller drops the money in a drawer and makes a note on a Postit which he then sticks on some surface that's out of your sight. You leave the bank believing that your account has been credited, whereas in fact the credit doesn't happen until the teller processes all the postits some time later. This system works fine, until some complete bastard opens a window and a bunch of postits flies away. You'd be extremely pissed if your transaction literally flew out the window, but you'd have no proof you were even in the bank, let alone making a deposit.

      The argument in the original article is that when fsync() is executed, the drive and/or drivers are not supposed to return the success code until all data is safely recorded on the platters. It's what the spec says, and it's what the OS and app writers expect.

  42. Re:Author lied when implied that DRIVES are the is by epine · · Score: 1


    Which SATA Seagates from 2004 have bad firmware and what is the nature of the flaw? I have drives of that description that cause some really strange errors in my hot-swap IDE enclosure.

  43. Re:Err... "lying" is the default setting. RTFM. by Basehart · · Score: 4, Insightful

    "I remember that MS had a fix for this (for laptops etc)... Which just made Windows wait a duration (~30s)..."

    This turned into the "my computer isn't doing what I want it to do, which is turn the F off" at which point the consumer simply reached down and yanked the power cord.

    Try writing a routine for this routine!

  44. power or non-volatile memory in disk by cahiha · · Score: 2, Informative

    Well, it's unlikely this is going to change. The real solution is to give power long enough to the disk drive to let it complete its writes no matter what, and/or to add non-volatile or flash memory to the disk drive so that it can complete its writes after coming back up.

    There is a fairly simple external solution for that: a UPS. They're good. Get one.

    And even then it is not guaranteed that just because you write a block, you can read it again, because nothing can guarantee it. So, file systems need to deal, one way or another, with the possibility that this case occurs.

    1. Re:power or non-volatile memory in disk by atrus · · Score: 1
      UPSs can still fail you. Something as simple as tripping on a power cord.

      Flash is too slow. What most raid cards worth their salt do is have a large DRAM cache which is battery backed. The stuff hits the fan? No problem, once the controller is booted and drives are powered, it can finish all the writes it has previously delayed. Multiple levels of redundancy are a good thing.

    2. Re:power or non-volatile memory in disk by amorsen · · Score: 1

      Around here the probability of an UPS failure approaches that of losing power in the first place. Therefore it is unwise to use an UPS for a server with only one power supply. Other places are less fortunate, of course.

      --
      Finally! A year of moderation! Ready for 2019?
    3. Re:power or non-volatile memory in disk by cahiha · · Score: 1

      UPSs can still fail you. Something as simple as tripping on a power cord.

      Yes, and so can disks: even a disk that is constructed to acknowledge writes only after they have completed still cannot guarantee things.

      Flash is too slow.

      Not necessarily--depends on how parallel you make it.

      What most raid cards worth their salt do is have a large DRAM cache which is battery backed.

      Yes, as I was saying.

  45. s/fitzgerald/fitzpatrick/ ? by Anonymous Coward · · Score: 0

    subj

  46. fitzwhat? by Anonymous Coward · · Score: 0

    does no one else notice the fitzgerald / fitzpatrick discrepancy?

    1. Re:fitzwhat? by Anonymous Coward · · Score: 0

      fuck nevermind. stupid thresholds....

  47. Fitzpatrick != Fitzgerald by Anonymous Coward · · Score: 0

    I'm not trying to be pedantic here, but as a fellow Fitzpatrick I feel for Brad. I too have been mistakenly called Fitzgerald all my life.

    His name was Brad Fitzpatrick -- damnit.

  48. Re:Err... "lying" is the default setting. RTFM. by Everleet · · Score: 5, Informative
    fsync() is pretty clearly documented to cause a flush of the kernel buffers, not the disk buffers. This shouldn't come as a surprise to anyone.

    From Mac OS X --

    DESCRIPTION
    Fsync() causes all modified data and attributes of fd to be moved to a
    permanent storage device. This normally results in all in-core modified
    copies of buffers for the associated file to be written to a disk.

    Note that while fsync() will flush all data from the host to the drive
    (i.e. the "permanent storage device"), the drive itself may not physi-
    cally write the data to the platters for quite some time and it may be
    written in an out-of-order sequence.

    Specifically, if the drive loses power or the OS crashes, the application
    may find that only some or none of their data was written. The disk
    drive may also re-order the data so that later writes may be present
    while earlier writes are not.

    This is not a theoretical edge case. This scenario is easily reproduced
    with real world workloads and drive power failures.

    For applications that require tighter guarantees about the integrity of
    their data, MacOS X provides the F_FULLFSYNC fcntl. The F_FULLFSYNC
    fcntl asks the drive to flush all buffered data to permanent storage.
    Applications such as databases that require a strict ordering of writes
    should use F_FULLFSYNC to ensure their data is written in the order they
    expect. Please see fcntl(2) for more detail.
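
    A hedged sketch of how a program might ask for the stronger flush described in that excerpt and fall back to plain fsync() elsewhere. The function name full_sync is invented for this example; fcntl.F_FULLFSYNC is only defined by Python on platforms that provide it (macOS), which is why it is looked up with getattr.

```python
import fcntl
import os


def full_sync(fd: int) -> str:
    """Flush fd as far down as the platform allows; return the method used."""
    full_fsync = getattr(fcntl, "F_FULLFSYNC", None)   # present on macOS only
    if full_fsync is not None:
        try:
            fcntl.fcntl(fd, full_fsync)   # asks the drive to flush its own cache
            return "F_FULLFSYNC"
        except OSError:
            pass                          # not supported on this file; fall back
    os.fsync(fd)                          # flushes host buffers to the drive
    return "fsync"
```

    On a system without F_FULLFSYNC this is no stronger than fsync(), so the caveats in the Linux and FreeBSD excerpts still apply.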

    From Linux --

    NOTES
    In case the hard disk has write cache enabled, the data may not really
    be on permanent storage when fsync/fdatasync return.

    From FreeBSD's tuning(7) --

    IDE WRITE CACHING
    FreeBSD 4.3 flirted with turning off IDE write caching. This reduced
    write bandwidth to IDE disks but was considered necessary due to serious
    data consistency issues introduced by hard drive vendors. Basically the
    problem is that IDE drives lie about when a write completes. With IDE
    write caching turned on, IDE hard drives will not only write data to disk
    out of order, they will sometimes delay some of the blocks indefinitely
    under heavy disk load. A crash or power failure can result in serious
    file system corruption. So our default was changed to be safe. Unfortu-
    nately, the result was such a huge loss in performance that we caved in
    and changed the default back to on after the release. You should check
    the default on your system by observing the hw.ata.wc sysctl variable.
    If IDE write caching is turned off, you can turn it back on by setting
    the hw.ata.wc loader tunable to 1. More information on tuning the ATA
    driver system may be found in the ata(4) man page.

    There is a new experimental feature for IDE hard drives called
    hw.ata.tags (you also set this in the boot loader) which allows write
    caching to be safely turned on. This brings SCSI tagging features to IDE
    drives. As of this writing only IBM DPTA and DTLA drives support the
    feature. Warning! These drives apparently have quality control problems
    and I do not recommend purchasing them at this time. If you need perfor-
    mance, go with SCSI.
    --
    It's tragic. Laugh.
  49. Linux 2.6 and IDE by Anonymous Coward · · Score: 1, Interesting

    Isn't this something that Alan Cox is complaining about in the Linux 2.6 IDE layer? Something about fsync not always waiting for the completion of the cache flush? He tells everyone on LKML to turn the disk write-cache off on IDE disks to make fsync work properly. Or am I clueless?

    1. Re:Linux 2.6 and IDE by asaul · · Score: 2, Informative

      I don't know about the Alan Cox comment, but for IDE this is a common thing. Simply put, IDE disks struggle enough for performance, so by default they have write caching enabled.

      I work for a major server vendor who creates their own firmware for their disks. By default all SCSI and FCAL disks are configured to have write cache disabled because data integrity is valued over performance. For ATA, apparently the disk vendors don't give any option for it, so we are unable to work around that.

      This is actually quite a pain when it comes to benchmarks, because for SOME tests it makes OSes which enable the write cache look really fast. It's not until you suffer a catastrophe that you find out the data never made the platter.

      And RAID devices don't lie about completing I/O - the device presents a "disk" interface to a slab of battery backed (hopefully) cache to disk, which allows write performance to be massively better. The RAID card itself takes care of syncing its cache to the disks, it just takes the data in cache and responds to the transaction immediately, flushing later. As far as the OS needs to know, the IO is complete - firmware bugs and battery failures aside, the RAID card handles it internally.

      And the author seems to be lacking a clue about what he is testing - if anything, all it is testing is the OS's ability to get data down to the disks consistently - all fsync() knows is that the calls it made to send data to the storage devices returned success; it's totally dependent on the volume manager, disk and controller drivers as well as the actual physical storage to get the job done.

      For all he knows his drivers might be returning immediately just to make performance look better, but actually scheduling the I/O in some manner which causes it to be lost before commitment to storage.

      If he wants to complain about the disks, I think he is going to need a much lower level test than a perl script calling sync.
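
      For readers wondering what such a script can and cannot show, here is a rough sketch of the general "acknowledge, then verify" idea. It is not the article's utility. The function names, the 512-byte block size and the local acknowledgement file are all assumptions; in a real power-pull test the acknowledgement log would have to live somewhere that survives the crash, for instance on another machine.

```python
import os

BLOCK = 512  # illustrative block size


def write_and_ack(data_path, ack_path, count):
    """Write numbered blocks; record each number only after fsync() returns."""
    with open(data_path, "wb") as data, open(ack_path, "w") as acks:
        for seq in range(count):
            data.write(seq.to_bytes(8, "big").ljust(BLOCK, b"\0"))
            data.flush()
            os.fsync(data.fileno())
            acks.write(f"{seq}\n")   # stand-in for reporting to a remote host
            acks.flush()


def verify(data_path, ack_path):
    """Return the acknowledged sequence numbers that are missing or wrong."""
    with open(ack_path) as acks:
        promised = [int(line) for line in acks if line.strip()]
    bad = []
    with open(data_path, "rb") as data:
        for seq in promised:
            data.seek(seq * BLOCK)
            block = data.read(BLOCK)
            if len(block) != BLOCK or int.from_bytes(block[:8], "big") != seq:
                bad.append(seq)
    return bad
```

      As the comment above points out, a non-empty result only says that something between fsync() and the platter dropped the data. It does not say whether that was the filesystem, the driver, the controller or the drive.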

      --
      "If everybody is thinking alike, somebody isn't thinking" - Gen. George S. Patton
  50. I suggest you don't yank the powercord by Meredeth · · Score: 1

    If you have to try this, TURN IT OFF WITH THE WALL SWITCH. Yanking the powercord out can damage components easily if the earth comes out first, which is unfortunately very likely with PC power cords. I have several sticks of RAM that can provide supporting evidence, unfortunately.

  51. Re:Err... "lying" is the default setting. RTFM. by Mancat · · Score: 1

    The manufacturers DID fix it. What do you expect MS to do? They can't kick the manufacturers into replacing every affected drive on the planet, and they sure as hell can't get clueless/empathetic users to do anything about it either.

    So... Release a patch for it. The patch was not required either, and was quite clearly documented when displayed in Windows Update for users of Windows 9x.

    --
    hello dear sirs my name is jamesh i are india (bihar) can u guide me install red had linux 9?
  52. Re:I suggest you don't yank the powercord by pe1chl · · Score: 1

    You must be talking about some locally specific powercord and generalize it to all PC power cords?

    A reasonably designed power connection standard always makes sure that earth is connected first and disconnected last. I am sure our local standard does.

  53. Re:Author lied when implied that DRIVES are the is by Anonymous Coward · · Score: 0

    Would you please elaborate on how 2> or 4> are done? I have a seagate 7200.7 disc, and the hdparm -W 0 setting does not seem to be permanent. Thanks.

  54. Irrelevant? by khrtt · · Score: 1

    I've never heard "kibibyte" or "mebibyte", or seen them actually used. Have you? I didn't even know what they were until I've read your post. I have seen GiB and MiB, and I've always confused GB and GiB as to which one is which.

    Now, it might be that I'm just sooo ignorant, or it might be that no one actually uses that fine standard... the latter being, I'm afraid, the more likely case. Which means that this standard is rather irrelevant.

    1. Re:Irrelevant? by Anonymous Coward · · Score: 0

      It's used all the time in engineering references and texts. Just because you never heard of it doesn't make it irrelevant, it just means you're both ignorant and irrelevant.

    2. Re:Irrelevant? by khrtt · · Score: 1

      What engineering references? I've never seen it used in any datasheet or technical manual. Most of them clarify which MB or GB they use where, but very few use MiB or GiB. None use "mebi-" or "gibi-" anything.

      If by "engineering reference and texts" you mean a textbook, then note that your particular textbook is not the only one around, and 4 out of 5 stick with the ambiguous notation.

    3. Re:Irrelevant? by Anonymous Coward · · Score: 0

      You still haven't explained how you're going fix all those old computer science texts and articles. Make sure you notify Donald Knuth while you're at it. Just when he thought he had perfected his series...

    4. Re:Irrelevant? by Jondaley · · Score: 1

      I have only seen the terms in discussions like these, never in the "real" world.

      But, just because a standard isn't used, doesn't mean we shouldn't start using it.

      One man can make a difference... (:

    5. Re:Irrelevant? by Anonymous Coward · · Score: 0

      They are used in e.g. bittorrent

  55. Re:Author lied when implied that DRIVES are the is by Anonymous Coward · · Score: 0

    could you specify at what timeframe/what kernel version linux didn't pass through the sync command to the hard disks? in the last 11 years of my linux usage i've never noticed that, had always consistent shutdowns, but maybe i was missing a special kernel version?

  56. Re:I suggest you don't yank the powercord by Meredeth · · Score: 1

    As all PC power supplies are the same, then I would think that the cord heading out of the back of them would also be the same.

    As for the wall socket side, you will usually be alright, but not always.
  57. No, it's the hardware that matters by Anonymous Coward · · Score: 1, Informative

    The idea is to flush all buffers in the software and the specs are not talking about the buffers in the hardware.

    That's nonsense. Applications that use fsync() do so in order to be certain that things are actually recorded in the hardware. It's by FAR the most important issue, and this is the whole purpose of fsync() --- a portable way of achieving it.

    1. Re:No, it's the hardware that matters by Anonymous Coward · · Score: 0

      Then the application authors are fools who shouldn't be trusted to write VB 6 applications. If they can't Read The Fucking Man-page and understand what fsync() does, that's their problem.

    2. Re:No, it's the hardware that matters by petermgreen · · Score: 1

      the man page CLEARLY states STABLE STORAGE

      a volatile cache is NOT STABLE STORAGE.

      afaict most operating systems DO tell the drives to sync on a fsync but the drives simply ignore the request.

      --
      note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
  58. My Hard Drive Lies... by fudg3tunn3l · · Score: 1

    ...Partition Magic lies more. Bastard.

    --
    Resident of Skara Brae since 1985
  59. Re:I suggest you don't yank the powercord by khrtt · · Score: 1

    I am sure our local standard does.

    He's probably from the US, where the power cord is specifically designed to disconnect power before disconnecting ground, with the ground prong a whole quarter inch longer than the other two. It usually works as intended, except that an electrical connection between the ground prong and the socket contact is necessarily flaky while the prong is being pulled out, and your equipment could fry as a result.

  60. Re:Author lied when implied that DRIVES are the is by Anonymous Coward · · Score: 0

    First everyone mods me down and THEN you ask for help?

    great.

    anyway the best answer for you is

    "you need to send the command every wakeup from sleep or every bootup." the command is a standard "Set Features" ATA-IDE command with command byte of 82 in hex.

    A systems level programmer could write a small utility that does so, or you could possibly learn to write and send the command.

    for info on method "2" download a free manual from seagate called the "Barracuda Serial ATA V Product Manual, Rev. A". it will have detail relevant to your drive, despite the name.

    refer to section 2.2.2 "Set Features command"
    Table 9:
    82h Disable write cache (features register value)

    Power-on default has the read look-ahead and write caching features enabled.

    method "4" will not work for that specific device, but would be permanent if it did accept it. (stored in a vendor features settings track, rather than flash ram)
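
    A rough sketch of the "small utility" idea, for Linux only and with the usual warnings about poking at raw devices. The helper names are invented. The numeric constants are assumptions taken from the Linux hdreg.h header and the ATA command set as commonly documented (HDIO_DRIVE_CMD, SET FEATURES opcode EFh), plus the 82h feature value quoted above; check them against your own headers and drive manual before trusting them. This is the same kind of request that hdparm -W 0 issues.

```python
import fcntl
import os

HDIO_DRIVE_CMD = 0x031F       # ioctl number, assumed from <linux/hdreg.h>
ATA_SET_FEATURES = 0xEF       # ATA "Set Features" command opcode
FEATURE_WCACHE_ON = 0x02      # enable write cache
FEATURE_WCACHE_OFF = 0x82     # disable write cache (value quoted above)


def set_features_args(enable_write_cache: bool) -> bytearray:
    """Build the 4-byte buffer for HDIO_DRIVE_CMD: byte 0 command, byte 2 feature."""
    feature = FEATURE_WCACHE_ON if enable_write_cache else FEATURE_WCACHE_OFF
    return bytearray([ATA_SET_FEATURES, 0, feature, 0])


def set_write_cache(device: str, enable: bool) -> None:
    """Send the command to a device such as /dev/hda. Needs root privileges."""
    fd = os.open(device, os.O_RDONLY | os.O_NONBLOCK)
    try:
        fcntl.ioctl(fd, HDIO_DRIVE_CMD, set_features_args(enable))
    finally:
        os.close(fd)
```

    Because the power-on default re-enables the cache, something like set_write_cache("/dev/hda", False) would have to run at every boot and every wake from sleep, as described above.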

  61. BS *NO* get a clue idiot. i was correct. by Anonymous Coward · · Score: 0

    BS? *NO* get a clue idiot. i was correct in my parent post.

    First of all I design Fibre Channel Firmware, SCSI drivers, caching drivers, MO drivers, DVD and CD burning tools, ATA-IDE raid drives, S-ATA drivers, and USB Storage Class drivers for a living.

    I do not even want to take the time to elaborate how incorrect your denial of my post is.

    but get a clue fool..... for example even if you turn off all cache on ANY drive in the world sold today including Fibre Channel and SCSI320 high end drives.... you will still have corruption on power failure if it is writing data or the failure happens soon after getting data.

    Why? because its in the TRACK CACHE you fool, and the drives lack capacitors to feed the ASICs during poweroutage even if the rotational energy is still there. All they can do usually is pull the heads back to landing zone.

    if you _really_, _really_, _really_ care the OS issues a goddamned FLUSH command to the drivers that then send it to the drives. DUH!

    If you already misunderstand how cache and flush works in a drive, I doubt I could ever teach you here.

    just read and learn.

    1. Re:BS *NO* get a clue idiot. i was correct. by Anonymous Coward · · Score: 0

      Device driver authors and normal people clearly don't mix well

  62. Re:Please be clearer - my parent post was correct by Anonymous Coward · · Score: 0


    you asked "Do journaling FS just use regular fsync? Can they use some other call that actually does flush?"

    the answers are : NO THEY INVOKE FSYNC in special contexts and/or issue power-manager oriented commands to flush the drive as a side effect or by direct calls to drivers

    Can they use some other call that actually does flush? yes they can and do on almost every OS ever sold.

    the reason is people could write denial of service user apps that rob the entire OS of speed by maliciously sync-ing all the time to cause trouble.

    also , idiots that flush for no reason

    apple in OSX for example has a special way of issuing flush that they do document but wish people to not call

    other methods on many classic unix systems include running as root or as a special process or in a context designed to allow it

  63. Re:Err... "lying" is the default setting. RTFM. by iive · · Score: 1

    I wonder if Linux kernel issues this flush command on fsync() at all, as it works on filesystem level.

    We could blame HDD manufacturers if there is data loss after:
    hdparm -f /dev/hda

  64. Being right doesn't stop you being a pedant (^_^) by Dogtanian · · Score: 2, Insightful

    Maybe using kilo to mean 1024x is wrong.

    Fact of it is that *anyone* who knew enough about computers for it to matter would have known and agreed on this standard anyway, right or wrong.

    They came along and messed up a standard that everyone had agreed upon and was happy with. Don't even *think* of saying that using decimal kilobytes et al had any purpose other than making drives seem bigger than they were; that trick only worked because everyone had previously agreed that a kilobyte was 1024 bytes.

    If the industry was *so* damn keen to get the 'correct' meaning of the words, they wouldn't still be using the 'incorrect' versions when selling memory.

    Simple fact; anyone who wants to be pedantic about it can correctly argue that the 1024 definition of kilobyte is wrong. What they can't do is give any proper justification for changing a definition that everyone knew and understood to mean 1024 bytes.

    Marketing bullshit, pure and simple; in fact, I propose the phrase "marketing gigabyte", just to make it absolutely clear which definition is in use...

    --
    "Slashdot - News and Chat Sites Deviant". (Click "homepage" link above for details).
  65. Examples from the World of Windows. by stereoroid · · Score: 4, Interesting
    Microsoft have had a few problems in this area - see KB281672 for example.

    Then they released Windows 2000 Service Pack 3, which fixed some previous caching bugs, as documented in KB332023. The article tells you how to set up the "Power Protected" Write Cache Option, which is your way of saying "yes, my storage has a UPS or battery-backed cache, give me the performance and let me worry about the data integrity".

    I work for a major storage hardware vendor: to cut a long story short, we knew fsync() (a.k.a. "write-through" or "synchronize cache") was working on our hardware, when the performance started sucking after customers installed W2K SP3, and we had to refer customers to the latter article.

    The same storage systems have battery-backed cache, and every write from cache to disks is made write-through (because drive cache is not battery-backed). In other words, in these and other Enterprise-class systems, the burden of honouring fsync() / write-through commands from the OS has switched to the storage controller(s), the drives might as well have no cache for all we care. But it still matters that the drives do honour the fsync() we send to them from cache, and not signal "clear" when they're not - if they lie, the cache drops that data, and no battery will get it back..!

    --
    (this is not a .sig)
  66. does hdparm work on a sata drive? by Anonymous Coward · · Score: 0

    I tried running hdparm -i /dev/sda and get the following:

    /dev/sda:
    HDIO_GET_IDENTITY failed: Inappropriate ioctl for device

  67. Re:Author lied when implied that DRIVES are the is by Anonymous Coward · · Score: 2, Informative

    there were many linux defects with no track cache flush command being received by devices, but if you want one set of recent fixes for flush corruption ...

    refer to :

    -force-ide-cache-flush-on-shutdown-flush.patch
    -force-ide-cache-flush-on-shutdown-flush-fix.patch

    in Changes since 2.6.6-mm1

    ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.6/2.6.6-mm2/

    why the hell my informative parent post gets modded to only a "2" just because people do not like the truth is astounding.

    I was hoping this would happen to my INFORMATIVE post because it just means i will not bother helping anyone in slashdot again for another halfyear absence from posting.

    i figure... why bother... the S/N ratio is such that no low level coders seem to ever read slashdot anymore anyways in recent years.

    its probably time for me to move to other sites as well.

    "2"! on the only FACTUAL and informative post in the entire damned thread!

  68. And your point is? by Moraelin · · Score: 3, Informative

    Yes, nothing by itself is enough, not even XA transactions, but it can make your life a _lot_ easier. Especially if not all records are under your control to start with.

    E.g., the bank doesn't even know that the money is going to reserve a ticket on flight 705 of Elbonian United Airlines. It just knows it must transfer $100 from account A to account B.

    E.g., the travel agency doesn't even have access to the bank's records to check that the money has been withdrawn from your account. And it shouldn't ever have.

    So you propose... what? That the bank gets full access to the airline's business data, and that the airline can read all bank accounts, for those integrity checks to even work? I'm sure you can see how that wouldn't work.

    Yes, if you have a single database and it's all under your control, life is damn easy. It starts getting complicated when you have to deal with 7 databases, out of which 5 are in 3 different departments, and 2 aren't even in the same company. And where not everything is a database either: e.g., where one of the things which must also happen atomically is sending messages on a queue.

    _Then_ XA and ACID become a lot more useful. It becomes one helluva lot easier to _not_ send, for example, a JMS message to the other systems at all when a transaction rolls back, than to try to bring the client's database back in a consistent state with yours.

    It also becomes a lot more expensive to screw up. We're talking stuff that has all the strength of a signed contract, not "oops, we'll give you a seat on the next flight".

    Yes, your tools discovered that you sent the order for, say, 20 trucks in duplicate. Very good. Then what? It's as good as a signed contract the instant it was sent. It'll take many hours of some manager's time to negotiate a way out of that fuck-up. That is _if_ the other side doesn't want to play hardball and remind you that a contract is a contract.

    Wouldn't it be easier to _not_ have an inconsistency to start with, than to detect it later?

    Basically, yes, please do write all the integrity tests you can think of. Very good and insightful that. But don't assume that it suddenly makes XA transactions useless. _Anything_ that can reduce the probability of a failure in a distributed system is very much needed. Because it may be disproportionately more expensive to fix a screw-up, even if detected, than not to do it in the first place.
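
    To make the "all or nothing across several systems" point concrete, here is a toy two-phase commit, the general pattern behind XA-style transactions. It is a sketch only: the Participant class and two_phase_commit function are invented for this example, everything lives in memory, and a real coordinator also has to log its decision durably before acting on it, which is where an honest fsync() comes back into the picture.

```python
class Participant:
    """Stand-in for one resource (a database, a message queue, ...)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: promise to commit if asked, or refuse.
        self.state = "prepared" if self.can_commit else "refused"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled back"


def two_phase_commit(participants):
    """Commit everywhere or nowhere; return True if the commit happened."""
    if all(p.prepare() for p in participants):
        for p in participants:      # Phase 2: everyone voted yes
            p.commit()
        return True
    for p in participants:          # someone refused: undo everywhere
        p.rollback()
    return False
```

    With a bank, an airline and a message queue as participants, one refusal means the message is never sent at all, which is the "don't have the inconsistency in the first place" case argued for above.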

    --
    A polar bear is a cartesian bear after a coordinate transform.
    1. Re:And your point is? by DavidTC · · Score: 1
      We can simplify this to two databases and show if the drive is lying, you're screwed. Forgive me, I'm not a database expert, but I can see a fundamental flaw with lying hardware.

      You have, say, an ecommerce shop with a credit card processing setup. You accept a transaction, save it to disk, send off the CC request, and then finish it when the request clears.

      Now, if power fails at any time, you're covered. If it fails before you save the transaction to disk, you're fine. The order's gone, but you're fine.

      If it fails after you saved it, but before the CC company gets back to you, you need to ask the CC company 'Did I send this to you, and did it clear?'. Complicated, but I'm sure it's doable.

      All this stuff has been worked out by experts, and I'm sure there are nice terms for all of them, but the concept is not that hard.

      However, there is one extreme point of failure if the hardware lies...you could have sent off the request, and the original transaction record could have vanished.

      You told it to write to disk, you even synced it, you're even using a RAID in case it fails....but if the hardware says 'Yeah, I wrote it', so you send off the CC request, but it was lying, you're sunk, and there's nothing you can do about that...you just billed someone for something you have no record of. Even after carefully waiting for the record to be saved to disk...

      And, of course, depending how you have things set up, that could just as easily be 'you just shipped something without billing for it'.
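
      The ordering being described, sketched in a few lines. Everything here is hypothetical (the function name, the JSON log format, the charge callback standing in for the credit card request); the only point is that the external side effect is triggered after fsync() has returned.

```python
import json
import os


def record_then_charge(log_path, order, charge):
    """Append the order to a local log, sync it, and only then call charge()."""
    line = (json.dumps(order, sort_keys=True) + "\n").encode()
    with open(log_path, "ab") as log:
        log.write(line)
        log.flush()
        os.fsync(log.fileno())   # only as trustworthy as the drive underneath
    return charge(order)         # the step that cannot be undone locally
```

      If the drive acknowledges the flush while the record is still in its volatile cache, a power cut right after charge() leaves a billed customer and no local record, which is the failure described above.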

      --
      If corporations are people, aren't stockholders guilty of slavery?
  69. Marketing created the 'confusion' by Dogtanian · · Score: 4, Insightful

    Actually, it is. The standard was updated in 1998 to avoid confusion. Having different names for different things can avoid an awful lot of confusion, so I would very much recommend using them.

    Which is more important? The de facto standard that slightly misuses the 'kilo-' prefix, but *everyone* knows what it means; or something that was forced into place by marketing?

    As I argued in more depth elsewhere, anyone who used computers *knew* what "kilobyte" and friends meant.

    There was no confusion, because only the 1024-byte definition was widely used.

    The 'need' to use the '1000 byte' definition was created by marketing, not computer people. THEY caused the confusion for their (short term) gain by exploiting the old meaning of 'kilobyte' to make their drives seem larger.

    Marketing do not give a flying **** about correctness or clarity; if there was any problem, *they* created it. Computer people knew what kilobyte meant.

    --
    "Slashdot - News and Chat Sites Deviant". (Click "homepage" link above for details).
    1. Re:Marketing created the 'confusion' by Crayon+Kid · · Score: 4, Insightful

      Marketing do not give a flying **** about correctness or clarity; if there was any problem, *they* created it. Computer people knew what kilobyte meant.

      I'm sure they took advantage of the blurry meanings for a while. But in the long run, you gotta admit the change makes sense, from a standardisation point of view. Every measuring unit uses kilo/mega/giga to mean powers of ten. Computer world was the odd one out, and it should rightly be labeled specifically.

      --
      i ate crayons when i was a kid and now i have two braincells and the blue ones taste nicer
    2. Re:Marketing created the 'confusion' by quantum+bit · · Score: 4, Insightful

      Every measuring unit uses kilo/mega/giga to mean powers of ten. Computer world was the odd one out, and it should rightly be labeled specifically.

      Oh, the computer world uses those prefixes to mean powers of 10 too. They just mean powers of 10 in base 2 math :)

    3. Re:Marketing created the 'confusion' by Anonymous Coward · · Score: 2, Funny

      But in the long run, you gotta admit the change makes sense, from a standardisation point of view.

      Next thing you're going to tell me is that they're going to make us give up pounds and miles.

    4. Re:Marketing created the 'confusion' by nine-times · · Score: 1
      Still, it'd be nice if, when I bought a 40 GB hard drive and installed it, it registered as having a capacity of 40 GB instead of 37.14 GB. Just to avoid a lot of nonsense, you know, one way or the other, they should make up their minds and stick with a defined way of dealing with things. And since people are used to dealing with base 10, I would think it'd probably be better to stick with the 1 KB = 1000 B definition.

      Really, I don't even care why the issue exists. I just think it'd be easier for my poor mother to understand how much space she's getting.

    5. Re:Marketing created the 'confusion' by alexhs · · Score: 2, Informative
      anyone who used computers *knew* what "kilobyte" and friends meant.

      15 years ago, maybe. Nowadays, I don't think so. It's just that windows reports sizes by 2^10 chunks and not 10^3 ones, so people are thinking someone is lying, and, you know, Microsoft never lies.

      OTOH, cfdisk happily reports disk sizes by 10^3 units.

      I don't even think that there is some marketing push to use kilo instead of kibi :

      Once upon a time, disks (like floppies) were strictly divided into cylinders, heads, sectors, a sector being 2^9 bytes (what would be interesting would be to know WHY 512 bytes ?). You would multiply c*h*s and get your total disk capacity. But space was wasted on the outer tracks.

      Now, things have changed. You have reserved sectors for bad sectors handling (unadvertised space!), and sector per track isn't a constant. You just have a total number of (LBA) sectors, that is not a simple product of three factors. Moreover, capacities became large compared to the 512-byte unit.

      Total number of sectors still is printed on the hard disk, if you want it. And remember that all 160GB disks aren't equal (ie don't have the same number of sectors). Seriously, are you going to check the exact number of sectors when you're seeking for a new ca.200GB hard disk ? rpm, noise, ... seem to me to be better criteria than the few additional sectors I might get. And what would you think about CDs or DVDs ? Most CDR-80/700MB really are 703MiB but there might be little differences. They still are advertised 700MB and not 703MB. And DVDs however aren't 4.7GiB but 4.7GB.

      USB keys, ram sticks still are using MiB. Why ? What is doing the marketing ? It's just that they still are using a binary scheme. The other way, Ethernet or modem speeds never have used powers of two.

      The transition between GiB and GB was an unfortunate event but, formally speaking, it's better now in regards to (international) units.

      --
      I have discovered a truly marvelous proof of killer sig, which this margin is too narrow to contain.
    6. Re:Marketing created the 'confusion' by Anonymous Coward · · Score: 0

      There are 10 types of people:
      People that understand binary (and got the joke),
      and crack smoking moderators.

    7. Re:Marketing created the 'confusion' by barawn · · Score: 4, Insightful
      As I argued in more depth elsewhere, anyone who used computers *knew* what "kilobyte" and friends meant.

      Except Ethernet card manufacturers, modem manufacturers, PCI card manufacturers... oh, hell, just about anyone who transfers something with a clock.

      10baseT ethernet transfers data at 10 Mbps. That means 10 x 10^6 bits per second. IDE buses running at 66 MHz list their theoretical maximum as 66 MB/s.

      kilo = 1024 is retarded. It only makes sense for things that have to scale in powers of two, like memory. For a long while, "data rate" meant "kilo=1000, mega=1000 kilo" whereas in storage, "kilo=1024". Talk about a recipe for disaster.

      Just as an example: here's an article describing Ultra320 SCSI, and PCI bus bandwidth:

      Under standard PCI the host bus has a maximum speed of 66 MHz. This allows for a maximum transfer rate of 533 MB/sec across a 64-bit PCI bus.


      66 2/3 MHz (M here means what? oh, right, 10^6) times 8 bytes is 533 1/3 MB/s. Where here, "M" means "1000*1000". In MiB/s, it'd be 508.6263 MiB/s.

      Is this a problem? Yes. I shouldn't have to pull out a freaking calculator to figure out how long it should take to dump 2 GB of RAM across a 2 GB/s link. It should be one second, not 1.0737418 seconds.
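The arithmetic in this comment can be checked with a short sketch. The constant and variable names are illustrative, and the 2 GB/s link is the hypothetical from the post rather than any real bus.

```python
# Decimal (SI) and binary (IEC) multipliers, as contrasted in the comment above.
MB = 10**6
MIB = 2**20
GB = 10**9
GIB = 2**30

# 64-bit PCI at 66 2/3 MHz moves 8 bytes per clock.
pci_clock_hz = 200_000_000 / 3          # 66.666... MHz
pci_bytes_per_s = pci_clock_hz * 8

pci_mb_per_s = pci_bytes_per_s / MB     # decimal megabytes per second
pci_mib_per_s = pci_bytes_per_s / MIB   # binary mebibytes per second

# Dumping 2 GiB of RAM over a link rated at 2 GB/s (decimal).
dump_seconds = (2 * GIB) / (2 * GB)

print(f"{pci_mb_per_s:.2f} MB/s = {pci_mib_per_s:.4f} MiB/s")
print(f"dump time: {dump_seconds:.7f} s")
```

The mismatch only appears because the RAM size is binary while the link rate is decimal; with both in the same system the answer would be exactly one second.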

      Computer people knew what kilobyte meant.

      No we didn't. We've never used kilo consistently. See above - we've talked about CPU speeds in terms of kHz and MHz, meaning 10^3, 10^6, and talked about kilobits/second meaning 10^3 bits per second, talked about kilobytes/second meaning 10^3 bytes/second, and turned around and talked about file sizes where kilobyte means 1024 bytes.

      We've never been consistent. The IEC finally owned up to it and admitted it, and asked us to all finally stop being so damned sloppy, and I'm quite glad they did.
    8. Re:Marketing created the 'confusion' by dgatwood · · Score: 2, Insightful
      I'm sure they took advantage of the blurry meanings for a while. But in the long run, you gotta admit the change makes sense, from a standardisation point of view.

      No, I don't admit it. Volume and distance measures are standardized to base 10 because they have no inherent natural unit. Computers have a natural unit---powers of two. In much the same way, we don't standardize time to base 10. Can you imagine if we decided we wanted to have 100 days in a year? It wouldn't work well because Earth doesn't go around the sun every 100 days. It goes around the sun every 365.25 days.

      For the same reason the base-10 standardization of time was rejected, the base-10 bastardization of computing units should also be rejected. A megabyte (2^20) is a natural unit that expresses both the underlying addressing of the computer and the fundamental organization of RAM that corresponds to that addressing system. A megabyte (10^6) represents an arbitrary grouping that (at least with modern design standards) CANNOT ACTUALLY EXIST IN HARDWARE.

      So how does the SI "standard" make sense again?

      --

      Check out my sci-fi/humor trilogy at PatriotsBooks.

    9. Re:Marketing created the 'confusion' by Anonymous Coward · · Score: 0

      /quote
      Marketing do not give a flying **** about correctness or clarity; if there was any problem, *they* created it. Computer people knew what kilobyte meant. /quote

      Of course, why do we call them IDE drives?
      Intergrated Device enhancement (Maxtor)
      or...
      Independant Drive Extension (WD)

      It's an ATAPI drive FFS. Hello confusion, we have a new entry in your Wiki.

    10. Re:Marketing created the 'confusion' by stonecypher · · Score: 1

      The 'need' to use the '1000 byte' definition was created by marketing, not computer people.

      No, it was needed in order for a rational uniform standard, which is the entire point of the metric system in the first place. That marketers are right and you are wrong doesn't make the actions of the SI a giant advertising plot.

      --
      StoneCypher is Full of BS
    11. Re:Marketing created the 'confusion' by Anonymous Coward · · Score: 0

      You a moronic, blathering, maggot infested, goat fucking, shit eating whore, you piss-snorting, scum sucking bastard of a sheeple.

    12. Re:Marketing created the 'confusion' by Anonymous Coward · · Score: 0
      Volume and distance measures are standardized to base 10 because they have no inherent natural unit.
      Base ten is because most people have ten fingers, and ten is large enough to be convenient and small enough to fit in the human brain.
      Computers have a natural unit---powers of two.
      What if you're writing a program that communicates with 9-bit bytes and runs on a machine with 14-bit program bytes? (Which is not at all out of the question.)

      Suppose I have an analog-to-digital converter that produces one-byte outputs at a rate of 100,000 per second. According to you, 100 ksamp/sec * 1 byte/samp == 97.65625 kbyte/sec. Sheer idiocy.

      In much the same way, we don't standardize time to base 10.
      Speak for yourself, lam3rboy. Milliseconds, microseconds, nanoseconds: I use 'em all the time. Occasionally I need picoseconds. Megayears are common in scientific work.
      A megabyte (2^20) is a natural unit that expresses both the underlying addressing of the computer and the fundamental organization of RAM that corresponds to that addressing system.
      Utterly wrong. Today there are mass market memory devices that do not store binary data, are not structured as balanced trees, and do not use data words that are a power of two in size. This will become more and more common over the coming decade as designers pack ever more data into a given volume of circuitry.
      A megabyte (10^6) represents an arbitrary grouping that (at least with modern design standards) CANNOT ACTUALLY EXIST IN HARDWARE.
      Pull the corks out of your eye sockets and take a look at modern flash memory devices. The actual persistent storage hardware is completely goofball: multilevel (nonbinary) stored voltages, data lines with "extra" digits for error detection and correction, and however many data lines your chip has room for.
    13. Re:Marketing created the 'confusion' by ceswiedler · · Score: 1

      Of course, when you say base 2, you mean 2 in base 10.

    14. Re:Marketing created the 'confusion' by bradm · · Score: 1
      Every measuring unit uses kilo/mega/giga to mean powers of ten. Computer world was the odd one out, and it should rightly be labeled specifically.
      Oh, the computer world uses those prefixes to mean powers of 10 too. They just mean powers of 10 in base 2 math :)
      Works for me. So now a Kilobyte is 8 bytes (2^3) a Megabyte is 64 bytes (2^6), and a Gigabyte is 512 bytes (2^9)? Cool, my laptop drive now holds billions and billions of Mb. Does yours?
    15. Re:Marketing created the 'confusion' by Anonymous Coward · · Score: 0

      You both mix your mother's toe-jam with your wheaties!

    16. Re:Marketing created the 'confusion' by evilviper · · Score: 1
      We've never used kilo consistently.

      It is consistent, if you follow the rules. It isn't just files, storage of any kind has always been bytes. Yes, anything with a clock uses the metric prefix. But they were easy to distinguish, always being listed in BITS rather than bytes. Your modem isn't ever listed as 7KBps, it's 56Kbps. Network cards are never 12.5MBps, they are always 100Mbps.

      KiloBYTE has always been a power of 2, and kiloBIT has always been a power of 10.

      If hard drive manufacturers had gone to mega/giga --"bits" instead of bytes, there would never have been any problems. Instead they decided to change the terms to make it subtle.

      Perhaps NIST should have re-defined "bytes" to equal "8.192 bits" to solve the problem, but then they would have been contradicting hard drive manufacturers, and leaving them open to lawsuits. That would have addressed your concerns equally well, wouldn't it?
      --
      Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
    17. Re:Marketing created the 'confusion' by Anonymous Coward · · Score: 0

      There are 10 kinds of people: those who understand binary and those who don't. = Ho ho! The sharp-eyed nerd! How wonderful not to be trained to speak in 1's and 0's. From the anagrammys

    18. Re:Marketing created the 'confusion' by dgatwood · · Score: 1
      Okay, I'll feed the troll. First of all, software that communicates with 9-bit bytes is just adding check bits. That's not relevant, nor is the number of bits of an instruction. What is relevant is the organization of groups of bytes in computers. The organization of a group of bytes is determined by the practicality of hardware to decode addressing.

      Bits are used for addressing in ALL modern computing devices, and a bit represents a particular power of two. It is impractical to design a system of any real complexity in such a way that boundaries between devices don't jump by power-of-two sizes. You can't reasonably map hardware devices at 100 bytes apart. The decode logic costs would be enormous. Therefore, the most optimal organization of memory will ALWAYS be a power of two as long as we don't replace the fundamental notion of bytes being made up of bits. Period.
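One way to picture the decode-cost argument: with power-of-two device windows, picking the device is a shift and the offset is a mask, which in hardware amounts to wiring off the high address lines. The sketch below is only an illustration; the 4 KiB window size and both function names are hypothetical.

```python
WINDOW_BITS = 12                  # assumed: each device occupies 2**12 = 4096 bytes
WINDOW_SIZE = 1 << WINDOW_BITS

def decode_pow2(addr):
    """Split an address into (device index, offset) using only bit operations."""
    return addr >> WINDOW_BITS, addr & (WINDOW_SIZE - 1)

def decode_decimal(addr, window=100):
    """Same split for 100-byte windows; this needs a real divide and remainder."""
    return divmod(addr, window)

print(decode_pow2(0x3ABC))       # (3, 2748)
print(decode_decimal(1234))      # (12, 34)
```

Software hides the difference, but a divider in address-decode logic is far more expensive than routing a few wires, which is the point being argued.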

      As for the physical organization of flash memory devices, that is irrelevant. The interface presented to computers is either ATA, which presents itself in 512-byte chunks, or RAM-like flash, which also presents itself in the form of some even power-of-two size, e.g. 256kX8. And even with the extra digits and nonbinary voltages used for storing a bit, the individual bytes are still typically in power-of-two arrays, at least for all the flash parts I've ever heard of....

      I'm not saying that there won't be specialty devices with a microcontroller and memory all integrated into a single chip that use oddball, non-power-of-two address space organization. If all your device has to deal with is a little bit of ROM and a chunk of RAM (e.g. an answering machine), the organization of memory is a lot less important. However, for general purpose computing, such designs do not make ANY sense, as they are completely impractical.

      As for the argument about a 100kHz sampling rate for an ADC... when is the last time you saw anything like that outside of a science lab? 99.99% of ADCs are audio, and they don't run at 100kHz.

      Sampling rates in the audio world are based on either 44.1kHz or 48kHz or various halvings or doublings thereof. 44.1kHz was chosen as 3 samples per line of video in NTSC (3 * 245 lines * 60 Hz = 44,100). IIRC, 48kHz was chosen relative to the 24 fps projection rate of film. If anything, audio is a perfect argument AGAINST your point of view, as the standard unit measures for audio are closely tied to the fundamental design of the underlying technology.

      And finally, your argument about tiny fractions of seconds is just plain silly. The reason those are in powers of ten is because those units don't have any commonly-used natural organization. That's why I specifically chose larger units such as months and years, which are (roughly) based on lunar and solar cycles. Any argument from you about any unit of time smaller than a day is completely irrelevant in refuting my point, which was that natural organization (where such organization exists) should be preferred over an arbitrary and unnatural base-10 organization.

      Also, in some circles, though, tiny units of time are measured in non-base-10 units, such as Cesium ticks. Imagine if we just decided to round that from 1 nine-billionth of a second to 1 ten-billionth of a second. That's no different from what you're suggesting when it comes to computers.

      --

      Check out my sci-fi/humor trilogy at PatriotsBooks.

    19. Re:Marketing created the 'confusion' by Anonymous Coward · · Score: 0

      It is consistent, if you follow the rules. It isn't just files, storage of any kind has always been bytes. Yes, anything with a clock uses the metric prefix. But they were easy to distinguish, always being listed in BITS rather than bytes. Your modem isn't ever listed as 7KBps, it's 56Kbps. Network cards are never 12.5MBps, they are always 100Mbps.

      No it isn't you fucking moron. Usually on any parallel bus architecture things are given in prefixed units of bytes or sometimes words.

    20. Re:Marketing created the 'confusion' by Anonymous Coward · · Score: 0

      Perhaps NIST should have re-defined "bytes" to equal "8.192 bits" to solve the problem,

      You clearly have little historical knowledge in computing. Byte is defined as the native character size of a computing system, usually the smallest addressable unit, almost invariably some factor of the word size. For all practical purposes it is always 8 bits these days, unless otherwise noted. In cases where ambiguity needs to be a non issue it is better to say "octet."

      Besides there are many cases of stating aggregates of bytes (like transfers over a parallel bus) that are not constrained by powers of two. Sorry, but there are many contexts where one says 100MB/s and does not mean a power of 2.

    21. Re:Marketing created the 'confusion' by Evil+Pete · · Score: 1

      No we didn't. We've never used kilo consistently. See above - we've talked about CPU speeds in terms of kHz and MHz, meaning 10^3, 10^6, and talked about kilobits/second meaning 10^3 bits per second, talked about kilobytes/second meaning 10^3 bytes/second, and turned around and talked about file sizes where kilobyte means 1024 bytes.

      True. But when kilo was used to prefix the word byte it was always assumed to be 1024. However, when the term 'megabyte' came into use it was not as clear cut. We all 'knew' it should be 1024 * 1024, but it was just easier to say that it meant a million. 1024 is easy to say but 1048576 is just making life difficult. But we were still pissed off when the drive makers stopped using 2^20 as their definition and started using 10^6 (it's nice to get a bit extra, and not nice to get less than you might have).

      I don't lament MB = 10^6 .. But I'd argue that kilobyte still has a very particular meaning. When you are doing low level stuff you don't think in megabytes you think in blocks of bytes, usually in powers of 2 and then kilobyte has its place.

      --
      Bitter and proud of it.
    22. Re:Marketing created the 'confusion' by Anonymous Coward · · Score: 0

      I want to upgrade my computer but nobody seems to sell 512MB memory modules. I keep going to shops advertising them cheap but when you look closely at the box they're actually selling 536 MB modules! Help!

    23. Re:Marketing created the 'confusion' by Anonymous Coward · · Score: 0

      I've never heard 14 used, but software that communicates using 9 bit bytes is doing it because they fit in 36 bit words, which used to be fairly common (along with 18 bit word addresses).

    24. Re:Marketing created the 'confusion' by dickrichardv8 · · Score: 1

      lemme see here; 1 + 1 = 2 in base 10; 1 + 1 = 10 in base 2; Who's or What's on base 3.

    25. Re:Marketing created the 'confusion' by Anonymous Coward · · Score: 0

      That wouldn't do the trick--1 ki is 2.4% larger than 1 k, but 1 Mi is nearly 5% larger than 1 M, and the difference keeps compounding.
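The compounding is easy to tabulate. A minimal sketch, with an illustrative helper name, that prints how much larger each binary prefix is than its decimal counterpart:

```python
PREFIXES = ["k", "M", "G", "T"]

def binary_excess_percent(n):
    """Percent by which 1024**n exceeds 1000**n."""
    return (1024**n / 1000**n - 1) * 100

gaps = {p: binary_excess_percent(i) for i, p in enumerate(PREFIXES, start=1)}
for prefix, gap in gaps.items():
    print(f"{prefix}i vs {prefix}: {gap:.2f}% larger")
```

No single fixed correction factor can cover every prefix, because the gap grows with each step.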

    26. Re:Marketing created the 'confusion' by Bush+Pig · · Score: 1

      Even if he meant 2 in base 3, he would have still been correct.

      --
      What a long, strange trip it's been.
    27. Re:Marketing created the 'confusion' by Bush+Pig · · Score: 1

      512 bytes may well date back to the block size on a VAX.

      --
      What a long, strange trip it's been.
    28. Re:Marketing created the 'confusion' by barawn · · Score: 1

      But they were easy to distinguish, always being listed in BITS rather than bytes.

      Did you read the quote I posted? They're talking about bytes, not bits. You're telling me that a 2GHz link that transfers 1 byte per transfer runs at 1.8 GB/s? What part of "units should be consistent" didn't you understand?

      Perhaps NIST should have re-defined "bytes" to equal "8.192 bits" to solve the problem

      A byte is not 8 bits. A bit is a unit of data, whereas a byte is the smallest addressable unit of data in a computer. Hence the reason that a 9600 bit-per-second modem transfers 960 bytes per second. A byte is 10 bits on a serial line (well, for 8N1, at least). Not that you address data on a serial line, but when you put a "byte" on a serial line, you get 10 bits on the line.
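The 8N1 figure follows from the framing overhead: one start bit, eight data bits, no parity, one stop bit. A small sketch, with a hypothetical function name and defaults chosen to match 8N1:

```python
def serial_bytes_per_second(baud, data_bits=8, parity_bits=0, stop_bits=1):
    """Payload bytes per second on an async serial line; each frame adds one start bit."""
    bits_per_frame = 1 + data_bits + parity_bits + stop_bits
    return baud / bits_per_frame

rate_8n1 = serial_bytes_per_second(9600)
print(rate_8n1)   # 960.0
```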

    29. Re:Marketing created the 'confusion' by barawn · · Score: 1

      When you are doing low level stuff you don't think in megabytes you think in blocks of bytes, usually in powers of 2 and then kilobyte has its place.

      Just call it a kibibyte. Just write KiB. If you think the name is stupid, petition for new names. It's one letter. It also makes life simpler because you never again have to wonder whether or not they mean 1024 or 1000.

      It has nothing to do with drive makers. It has everything to do with abusing a unit. The only reason "kilo = 1024" hasn't been hell is that the people who use it don't actually venture outside of their own discipline that often.

      Let me explain: imagine a runner who transfers a total of 1 kB every kilometer. How much does he transfer every meter? How much does he transfer every megameter?

      1 kB/1 km should be 1 B/m. 1 kB/1 km in Mm should be 1 MB/Mm. Unit math is nice. By having a "special usage" of kilo for one particular unit, you completely and utterly destroy unit math.

      As I've said elsewhere, the best answer is this: it is utterly ludicrous to expect a person to pull out a calculator to figure out how long it will take to transfer 2 GB of RAM over a 2GHz link that transfers 1 B per transfer. It should be 1 second.

      The funny thing is that no one cares. If you stick an "i" in the sizes of RAM, no one will notice or care. We have special units all over the place - DVD and CD-ROM drive speeds are measured in "X" speeds (times the original speed of the medium). CPU speeds don't even list MHz/GHz anymore. We fudge transfer speeds all the time (100 Mbit/s for 100baseTX when that's the maximum raw speed).

      The only people who need to think primarily in terms of 1024 bytes (or 1024*1024 bytes, or 1024*1024*1024 bytes) are OS authors, driver writers, and filesystem writers who want to split up address spaces by masking off bits in an address. Now ask yourself - are there more of them, or are there more of everyone else who's been taught the metric system? That's why I advocate telling people "just stop being lazy, and do it right."

      There is no reason for ls to output info in GiB, MiB, KiB by default, for instance. No one needs to know that. Heck, it would be nice for ls to round properly, anyway. Listing a 43098112 file as 42MiB when it's 41.1MiB is just stupid. But that just stresses my point - computer programs are sloppy when it comes to filesizes, and no one cares. So if we don't care whether it's 41 MiB or 42 MiB, why do we care if it's listed in MiB or MB?
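For the file size mentioned above, rounding to one decimal in either unit is a one-liner. This is only a sketch of consistent formatting, not how ls behaves; the function name is made up for illustration.

```python
def format_size(num_bytes, binary=False):
    """Format a byte count with one decimal, in MB (10**6) or MiB (2**20)."""
    unit, name = (2**20, "MiB") if binary else (10**6, "MB")
    return f"{num_bytes / unit:.1f} {name}"

size = 43098112
print(format_size(size))                # 43.1 MB
print(format_size(size, binary=True))   # 41.1 MiB
```

Labelling the unit explicitly removes the ambiguity regardless of which base is chosen.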

      It's just silly to berate hard drive manufacturers for following a standard when computer programmers are so incredibly sloppy with terminology. Who cares why they did it? They're right. Saying that they did it for marketing reasons is just a way for computer programmers to feel better about themselves for continuing to be sloppy.

    30. Re:Marketing created the 'confusion' by Alberic · · Score: 1

      It goes around the sun every 365.25 days.

      Really ? I have to see a 365.25(exact)days long year

      In much the same way, we don't standardize time to base 10

      1 second = 1000 milliseconds. Point taken, anyway. Okay, this is a bit of a flame, but my point is to show you this point is not valid. What would be the point of base 60 in seconds, minutes, and hours ? (open question, as you guessed, i have no clue ^_^)

      Anyway, this is not the point, the Gibi, kibi, and mibi are perfectly suited for binary factors. Kilo everywhere outside a compter means 1000, so why the hell should the same prefix have two different meanings ?

      --
      *squeak*
  70. Re:Being right doesn't stop you being a pedant (^_ by ReallyNiceGuy · · Score: 1

    (OFFTOPIC warning)
    You have no heart! You killed my bunny!!!

  71. Re:Author lied when implied that DRIVES are the is by Anonymous Coward · · Score: 0

    thanks for the info, i appreciate it. unfortunately, the 7200.7 is sitting behind a 1394 bridge, so it sounds like i'm basically screwed. (and to add to the issue, the enclosure periodically power cycles the disc for no apparent reason...)

    i am no longer purchasing seagate discs. in the past, i have tried to get this info out of their tech support, however they are completely incompetent, condescending, and plain annoying.

    fwiw, disabling the write cache on IBM/Hitachi discs is permanent, and they offer a utility to do so. Unfortunately, under SP2, windows vigorously reenables the write cache setting regardless of what you tell it.

  72. Just trying to figure out whose fault it is by leehwtsohg · · Score: 2, Informative

    fsync(2) man does state:
    fsync copies all in-core parts of a file to disk, and waits until the device reports that all parts are on stable storage.
    But then it goes on to state:
    NOTES
    In case the hard disk has write cache enabled, the data may not really be on permanent storage when fsync/fdatasync return.

    Which, as you point out, can be a BAD THING (TM) if someone opens a window. So, who should change? fsync, and its man page's NOTES for devices that have a cache but actually are capable of flushing that cache? Or should there be a special really_fsync() call?
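One existing answer to the really_fsync() idea is the F_FULLFSYNC fcntl on Mac OS X, which asks the drive to flush its own cache as well. The sketch below is a minimal illustration, assuming a Unix system; the full_sync name is hypothetical, and on platforms without F_FULLFSYNC it simply falls back to plain fsync, which carries the caveat from the man page NOTES.

```python
import fcntl
import os
import tempfile

def full_sync(fd):
    """Ask for the strongest flush available; return the name of the call used."""
    full = getattr(fcntl, "F_FULLFSYNC", None)   # only defined where the OS offers it
    if full is not None:
        try:
            fcntl.fcntl(fd, full)
            return "F_FULLFSYNC"
        except OSError:
            pass                                 # some filesystems reject it; fall back
    os.fsync(fd)
    return "fsync"

with tempfile.NamedTemporaryFile() as f:
    f.write(b"commit record\n")
    f.flush()                                    # Python buffer -> OS
    used = full_sync(f.fileno())
print(used)
```

Even the stronger call only helps if the drive and any bridge in front of it honor the flush command.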

    1. Re:Just trying to figure out whose fault it is by Gulthek · · Score: 1

      Or should there be a special really_fsync() call?

      Apple has one:

      "Now, a little bit more detail: on ATA drives we implement F_FULLFSYNC with the FLUSH_TRACK_CACHE command. All drives sold by Apple will honor this command. Unfortunately quite a few firewire drive vendors disable this command and do not pass it to the drive. This means that most external firewire drives are not reliable if you lose power or the system crashes. We can't work-around that unless we ask the drive to disable the write cache completely (which hurts performance quite badly -- and even that may not be enough as some drives will ignore that request too). ... On MacOS X fsync() behaves the same as it does on all Unices. That's not good enough if you really care about data integrity and so we also provide the F_FULLFSYNC fcntl. As far as I know, MacOS X is the only OS to provide this feature for apps that need to truly guarantee their data is on disk."

  73. Nice journalism skills by jsbrown · · Score: 1

    When posting such an article, it'd be handy to get your sources right. His name is Brad Fitzpatrick, not Fitzgerald.

  74. fsync question by jskline · · Score: 1

    Frankly, this speaks volumes to the reasons why when you enable write caching in hdparm, and Winblows, and the thing crashes, you have to wait while the file system is checked, scrubbed, et al before coming back up.

    --
    All content in this message is copyright (c) 2008. All rights reserved. RIAA is prohibited here.
    1. Re:fsync question by tomstdenis · · Score: 2, Informative

      Use reiserfs?

      At least then the file is either there or not there.

      My gentoo box has been through a few brownouts/powerouts [I have a UPS now ...] and hasn't skipped a beat. It even comes back up on its own [go Asus bios ;-)] when I'm say on another continent ;-)

      Tom

      --
      Someday, I'll have a real sig.
    2. Re:fsync question by Anonymous Coward · · Score: 0

      The two issues are related, but write caching done by disks is certainly not the reason why fsck is needed.

      Non-journaled file systems require fsck because data from the operating system cache hasn't been written to the disk device due to an unclean shutdown.

      Disks performing write caching can cause additional problems (because the data isn't written to disk in order) and actual corruption beyond the ability of fsck to fix. On servers write caching is often turned off entirely by default.

      Additionally some file systems are unsafe because they don't write stuff to disk in order, anyhow (e.g. if you're using ext2, turning off write caching is going to slow things down but the system will still be unsafe).

  75. Re:Author lied when implied that DRIVES are the is by Anonymous Coward · · Score: 0

    Hey man, tell me what sites you are going to and I will follow you. Bottom down: I love posts like yours, and I agree it is too bad people fuck with them.

  76. Re:Being right doesn't stop you being a pedant (^_ by Saven+Marek · · Score: 1, Flamebait

    Simple fact; anyone who wants to be pedantic about it can correctly argue that the 1024 definition of kilobyte is wrong. What they can't do is give any proper justification for changing a definition that everyone knew and understood to mean 1024 bytes.

    Because it's not a simple fact. kilobyte is 1024 bytes when referring to binary addressed data (such as RAM chips) but is 1000 bytes when used in other areas, such as network bandwidth, or floppy drive space, or bus bandwidth, or what have you.

    The problem is everyone does not know and understand 1024 bytes to be one kilobyte, they only presume it always does, when it quite obviously doesn't.

    Since you've demonstrated confusion over the matter yourself by making a blanket statement that 1024 bytes is one kilobyte, while ignoring the times when it IS NOT one kilobyte, you demonstrate a need for rejecting the system that led to your own confusion.

    Don't even *think* of saying that using decimal kilobytes et al had any purpose other than making drives seem bigger than they were; that trick only worked because everyone had previously agreed that a kilobyte was 1024 bytes.

    Why do you say such inaccuracies? drives going back to the first drives ever made used kilobyte = 1000 bytes. It has always been that way and that is the correct way because a hard drive is not binary addressed data rather it is arbitrary based on the number of bits that fit on a circle of metal. Nobody "previously agreed that a kilobyte was 1024 bytes" because that is a blanket incorrect statement.

    People agreed that a kilobyte is 1024 bytes only when referring to binary addressed data, which is not the case on a hard drive platter, which is an arbitrary size much like network speeds or things like bandwidth. The only time in a hard drive's life when kilobyte=1024 is when you are talking about the MAXIMUM ADDRESSABLE DATA over the controller that the drive is attached to, and that has a bit width and therefore is a power of two.

    Drives have always been decimal, not binary, even from when they were first research-only inventions. It is revisionism to suggest it is all marketing, and you have fallen into the trap of thinking that.

  77. your os did by petermgreen · · Score: 1

    the windows OS shows every partition as a separate drive even though they are not actually separate drives.

    so what lied to you was your operating systems user interface that claims there are two drives when in fact there are just two partitions on the same drive

    note to nitpickers: i said USER INTERFACE i know you can see the partitions in the administrative tools but most users won't know that exists.

    --
    note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
    1. Re:your os did by kbjnash · · Score: 0

      note to nitpickers
      Nitpickers on /.? Perish the thought, what do you think this is, Fark?

  78. Re:Err... "lying" is the default setting. RTFM. by Anonymous Coward · · Score: 1, Informative

    From the UNIX spec, vol 2:

    ---
    NAME

    fsync - synchronise changes to a file

    SYNOPSIS

    #include <unistd.h>

    int fsync(int fildes);

    DESCRIPTION

    The fsync() function can be used by an application to indicate that all data for the open file description named by fildes is to be transferred to the storage device associated with the file described by fildes in an implementation-dependent manner. The fsync() function does not return until the system has completed that action or until an error is detected.

    The fsync() function forces all currently queued I/O operations associated with the file indicated by file descriptor fildes to the synchronised I/O completion state. All I/O operations are completed as defined for synchronised I/O file integrity completion.

    ---

    In short, fsync() is specifically designed to flush the data in memory to the device as well as ensure the device writes it right fucking now and stfu until the job is done. fsync() under Linux does indeed issue command E7h for ATA5, which the drive is expected to follow immediately.

    If the device fails to do so, then it's operating out of spec and therefore is either faulty, or the manufacturer is falsely claiming compliance with the spec and selling something other than what was promised.

  79. Re:Err... "lying" is the default setting. RTFM. by Anonymous Coward · · Score: 1, Informative

    No, flush is still used to dump filesystem changes from system memory to the drive even if the drive doesn't have a cache.

    However, fsync() _is_ expected to ensure that the data is committed in a way that ensures data integrity, regardless of the medium being used.

    If the drive has a hardware cache, then fsync() implementations are expected to ensure that this cache is also flushed. To this end, various incarnations of Linux and BSD employ ATA commands specifically designed for this task and which are mandatory for a drive to claim ATA compliance.

    If the drive manufacturers are failing to implement these commands as specified, then we have what amounts to dirty pool and most likely consumer fraud.

  80. Re:Err... "lying" is the default setting. RTFM. by frinkazoid · · Score: 4, Informative

    this is true .. Installing a fresh windows 98 SE on a fairly new pc and then doing windows update, there is an update with this description:

    The Windows IDE Hard Drive Cache Package provides a workaround to a recently identified issue with computers that have the combination of Integrated Drive Electronics (IDE) hard disk drives with large caches and newer/faster processors. Computers with this combination may risk losing data if the hard disk shuts down before it can preserve the data in its cache.

    This update introduces a slight delay in the shutdown process. The delay of two seconds allows the hard drive's onboard cache to write any data to the hard drive.

    I found it nice to see how M$ worked around it, just waiting 2 seconds, how ingenious !
    link to the M$ update site: http://www.microsoft.com/windows98/downloads/contents/WUCritical/q273017/Default.asp

  81. Liberache gay terminology. by Anonymous Coward · · Score: 0

    I'll never say that nasty term. What were those retards thinking when coming up with it? "Let's make something that sounds really stupid."?

  82. That was the clearest description.... by awfar · · Score: 1

    and helped the discussion, a lot.

    If we stop for a moment and assume the drive itself is not able to really flush all those big cache entries during a hard power fail, you'd have to ask why.

    Do the power supplies being used have any excess capacitive storage? (No, they're switchers). Does the power supply power off voltage curve go down too fast after power fail signal? (Probably)

    Is it because they got into cache-size competition? Is buffer-size truly limited only by how much time during a hard power fail they have to physically write it out? Have the manufacturers, from said competition, pushed the envelope to the edge?

    As areal density increases, and buffers increase, speed from linear data increases clock rate and therefore logic requirements, but lower-power logic appears; spinning mass has probably stayed similar, and RPMs are similar.

    Could there be a system constraint that is being exceeded? One that only shows during hard power fault?

    Inquiring minds want to know (but are too lazy and don't have enough hardware to find out themselves)

  83. Re:Err... "lying" is the default setting. RTFM. by diegocgteleline.es · · Score: 1

    Linux recently added write barriers. I don't know if it helps but it looks like it's related

  84. Re:Err... "lying" is the default setting. RTFM. by jonwil · · Score: 3, Informative

    The right answer is for the drive not to respond to the "Sync" command with "Done" until it really is done (however long it takes) and for the OS to not continue until it sees the "done" command from the drive.

  85. Re:I suggest you don't yank the powercord by scharkalvin · · Score: 1

    Then throw the little switch on the back of the power supply (labeled '0' and '1') to the '0' position to turn off the computer. This is usually a DPDT switch and will kill BOTH legs of the line at the same time. This will be safe.

  86. In Soviet Russia by c0ldfusi0n · · Score: 1

    Your hard disks lie to YOU! Oh.. wait..

    --
    A computer makes it possible to do, in half an hour, tasks which were completely unnecessary to do before.
  87. Case design for free air delivery has a lot to do by crovira · · Score: 1

    with it, as does the room design around the case (as I found out the hard way in the summer of 2001, when I moved my computer to a spot with stagnant air and fried 3 drives of different capacity and manufacture).

    --
    MSBPodcast.com The opinions expressed here are my own. If you don't like 'em... Think up your own stuff.
  88. Re:Err... "lying" is the default setting. RTFM. by Anonymous Coward · · Score: 0

    Seems you don't get it. fsync() flushes to the device, not to the physical media! The spec clearly says that all the data should be sent to the storage device; it does not say that the storage device should flush its internal cache too! Do you see the difference?

  89. Re:Err... "lying" is the default setting. RTFM. by wren337 · · Score: 1


    You seem to be referring to the kernel API fsync() rather than the ATA spec for fsync(). The author is talking about the ATA spec, and the fact that the drive is ignoring the command to flush cache to media.

  90. Re:Err... "lying" is the default setting. RTFM. by dirty · · Score: 2, Informative

    The Linux man page (last updated 2001-04-18) states that all data should be written to stable storage. To me stable means that if power is pulled that data is still there. It does, however, give a warning in the NOTES section that if write cache is enabled on the drive, "the data may not really be on permanent storage." I don't know if that warning is just there because of observed behavior, or if the various specs allow said behavior.
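
    For what it's worth, here is a minimal sketch of what the application side of this looks like, assuming Python's os.fsync() wrapper around the system call. The helper name durable_write and the file name are made up for illustration. A successful return only tells you the kernel has pushed the data to the device; whether the drive's own write cache has reached the platters is exactly what the NOTES warning is about.

```python
import os
import tempfile


def durable_write(path, data):
    """Write data to path, then ask the kernel to push it to the device."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Blocks until the kernel reports the data as transferred.
        # This cannot see inside the drive's own write cache.
        os.fsync(fd)
    finally:
        os.close(fd)


# Hypothetical demo file in a fresh temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), "fsync_demo.bin")
durable_write(demo_path, b"committed record\n")
```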

    --

    -matt
  91. Re:Err... "lying" is the default setting. RTFM. by Anonymous Coward · · Score: 0

    There is no such thing as fsync() in the ATA spec! ATA spec only talks about command bytes and things like that not API callable in perl!! You'd better RTFA.

  92. Re:Err... "lying" is the default setting. RTFM. by pv2b · · Score: 3, Interesting

    Right. And the author is implementing a program that sends raw commands to ATA drives... in perl. Right. He does no such thing, at least not that I can see from glancing at the source code of the perl script. Granted, I'm not fluent in perl, but it doesn't seem to do anything other than an fsync() equivalent. Please do correct me if I'm wrong.

    The truth is that he doesn't know wtf he's talking about. I decide to cut him some slack though, because the FreeBSD 4 man pages at least are very misleading, and I don't know what man pages he did read.

    By the way, I sent him an e-mail. It's available on my web space. I'm not posting it in full here, because it's a little long and it would be redundant, since a lot of the surrounding posts discuss pretty much the same thing as I said.

  93. Re:Being right doesn't stop you being a pedant (^_ by drsmithy · · Score: 1
    Because it's not a simple fact. kilobyte is 1024 bytes when referring to binary addressed data (such as RAM chips) but is 1000 bytes when used in other areas, such as network bandwidth, or floppy drive space, or bus bandwidth, or what have you.

    No, it's not. You're getting confused at the way "bandwidths" tend to be expressed in *bits*, not bytes. "Kilobyte" has never been considered "1000 bytes" anywhere except a hard disk manufacturer's marketing department.

  94. Re:Being right doesn't stop you being a pedant (^_ by Saven+Marek · · Score: 1

    "Kilobyte" has never been considered "1000 bytes" anywhere except a hard disk manufacturer's marketing department.

    No, Kilobyte has only ever meant "1024 bytes" when referring to binary addressable spaces.

  95. Your Hard Drive Lies to You by chrisnewbie · · Score: 1

    So when my PC tells me I'm the greatest thing since the invention of the wheel, I should suspect something???

  96. Re:Err... "lying" is the default setting. RTFM. by Anonymous Coward · · Score: 0

    Nitpicking, especially since fsync is defined as getting everything to a "synchronized I/O state" (whatever that's supposed to really mean). One might say, "As far as the OS is concerned, as long as the data goes to the device, it's synchronized," except that they're forgetting something: software caches and GUIs and crap are just fancy things modern OSes do. The true point of an OS is to manage the hardware so applications don't have to. Synchronizing the I/O subsystem, from an OS perspective, ought to mean attempting to make the hardware do its job more so than doing any fancy caching (and thus flushing) in software.

    But if your point is that fsync is just operating to spec, then why aren't ATA SYNC and SCSI SYNC operating to spec? Specs only work when they're followed.

    Also, traditionally, unix shutdown was: switch to single-user, remount disks read-only, sync, sync, sync, halt. sync calls fsync(). fsync() is only useful if the data is committed to disk. Who cares if the software caches are flushed, if the data still doesn't get written? If the ends justify the means, fine, but we're not even making it to the ends here.

  97. Great! Now this gives users a convenient excuse . . . by UnknowingFool · · Score: 0
    on how certain files get on their machine.

    "I swear that's not my tubgirl file. You can't trust the timestamp or the userid. Hard drives lie to you according to /."

    --
    Well, there's spam egg sausage and spam, that's not got much spam in it.
  98. Stay away from NASA, will you please?!? by scsirob · · Score: 1

    This type of 'the meaning is obvious' is what causes Mars expeditions to fail. Just because you and I and many others have been abusing the term 'Kilo' for many years doesn't make it right to continue to do so.

    Marketing sucks, I'll give you that. But this time it's not their fault.

    --
    To Terminate, or not to Terminate, that's the question - SCSIROB
  99. Re:Err... "lying" is the default setting. RTFM. by Hammer · · Score: 2, Insightful
    Seems you don't get it. fsync() flushes to the device, not to the physical media! The spec clearly says that all the data should be sent to the storage device; it does not say that the storage device should flush its internal cache too! Do you see the difference?

    I think you missed the point here buddy... In the case of Linux, after sending the data, the driver explicitly issues a hardware command to tell the device to write to media and STFU until done!
    Do you see the difference?
  100. Re:Err... "lying" is the default setting. RTFM. by c_oflynn · · Score: 2, Interesting

    >I found it nice to see how M$ worked around it,
    >just waiting 2 seconds, how ingenious !

    What would you have done? Verifying all data would probably take longer than 2 seconds, and you can't trust the disk to tell you when it's written the data.

    So you'd either have to figure out all the data that was in the cache, and verify that against the disk surface and only write when all that is done, or wait a bit. Making some assumptions about buffer size and transfer speed, then adding a safety factor, is probably where the 2 seconds came from.

    Did it work? Well it'd appear so. What's so bad about MS's fix?
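
    A back-of-the-envelope sketch of that kind of estimate follows. Every number in it is a hypothetical placeholder (cache size, worst-case write rate, safety factor); none of them is a Microsoft figure or a real drive spec.

```python
def flush_delay_seconds(cache_bytes, worst_case_bytes_per_sec, safety_factor):
    """Time to drain a full write cache at the worst-case rate, padded."""
    return safety_factor * cache_bytes / worst_case_bytes_per_sec


# Hypothetical inputs, chosen only to illustrate the arithmetic.
cache_bytes = 8 * 1024 * 1024        # assume an 8 MiB on-drive cache
worst_case_rate = 10 * 1000 * 1000   # assume 10 MB/s for scattered writes
delay = flush_delay_seconds(cache_bytes, worst_case_rate, safety_factor=2.0)

print(f"wait about {delay:.2f} s before cutting power")
```

    With those made-up inputs the answer lands a bit under two seconds, which is at least the right order of magnitude for a fixed shutdown delay.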

  101. Re:Being right doesn't stop you being a pedant (^_ by Dogtanian · · Score: 1

    No, I didn't kill *your* bunny.

    Being viral, it's easy to get it to reproduce, so now I am a cute-bunny farmer.

    I'm trying to work out if there's more profit in selling their feet, or keeping them whole, having them stuffed, and selling them as children's toys.

    I love those cute bunnies (^_^)

    --
    "Slashdot - News and Chat Sites Deviant". (Click "homepage" link above for details).
  102. This why my 200Gb Seagate Barracuda SATA hangs? by crivens · · Score: 1

    Is this why my 200Gb Seagate Barracuda SATA drive (ST3200822AS?) hangs in Linux and Windows? It happens under load and doesn't return within 30s, so I usually have to hit the reset switch.

    Also, the drive is very CPU intensive, even on my MSI NEO2 M/B. So overall, it's really annoying - wished I'd bought a PATA drive instead. Damn SATA.

    1. Re:This why my 200Gb Seagate Barracuda SATA hangs? by crivens · · Score: 1

      I should have said that I ALWAYS have to hit the reset switch.

  103. Re:Being right doesn't stop you being a pedant (^_ by Dogtanian · · Score: 1

    Well, there has been very little hard evidence (from *anyone*, including myself) here.

    However, if you have any clear evidence that this was *generally* the case, I'd be interested to see it.

    (I'm not saying you're wrong; I'm saying if what I believe is a fallacy/urban-myth, I'd at least like to see some evidence of it beyond one or two isolated incidents- which were probably marketing-driven anyway (^_^) )

    --
    "Slashdot - News and Chat Sites Deviant". (Click "homepage" link above for details).
  104. Re:Err... "lying" is the default setting. RTFM. by jesup · · Score: 2, Insightful

    Exactly - the author of this "test" made a bad assumption: fsync() (or rather the windows equivalent) means it's on the disk. Understandable, and once upon a time it was true in Unix. fsync() doesn't (that I know of) issue ATA sync commands, though.

    I used to beta-test SCSI drives, and write SCSI and IDE drivers (for the Amiga). Write-caching is (except for very specific applications) mandatory for speed reasons.

    If you want some performance and total write-safety, tagged queuing (SCSI or ATA) could provide that (with write caching turned off). You'll still give up some performance, since a single-threaded write application/FS will wait for data to be on disk before continuing. If the FS/app writes (say) 3 chunks of data that fill a track, with write caching off and tagged queuing, it's probably a minimum of 3 rotations (probably more like 4.5 or more) to write the data. With write caching, it's minimum 1, more like average 1.5 rotations. With a LOT of pain, you could break the single-threadedness of this in some cases by not waiting for tagged write completions and reporting success, while marking the VM pages as copy-on-write or some equivalent so the app won't overwrite the data that you're still writing (or, you could only return success to the app/FS when the data has been sent to the drive, but before it reports success). This (in a way) moves the write cache into the disk driver and thus gives you control over it. Perf will still be lower than letting the drive do it, perhaps a lot lower in some cases.
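
    To put rough numbers on those rotation counts, here is a small sketch. The 7200 RPM spindle speed is an assumption for illustration only; the rotation counts are the ones given above.

```python
def rotation_ms(rpm):
    """Duration of one platter rotation in milliseconds."""
    return 60_000.0 / rpm


rpm = 7200  # assumed spindle speed, not taken from any particular drive
one_turn = rotation_ms(rpm)

# Rotation counts from the paragraph above.
cases = {
    "cache off, minimum": 3.0,
    "cache off, typical": 4.5,
    "cache on, minimum": 1.0,
    "cache on, average": 1.5,
}
latency_ms = {name: turns * one_turn for name, turns in cases.items()}

for name, ms in latency_ms.items():
    print(f"{name}: {ms:.1f} ms")
```

    At that assumed speed a rotation takes about 8.3 ms, so the gap is roughly 37.5 ms versus 12.5 ms for the typical cases.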

    If you want _real_ performance and safety, turn on write caching, and when you hit a "safety checkpoint", tell the drive to flush the write cache to disk. I don't currently believe that ATA or SCSI drives generally ignore that command - please provide links if you know differently. It's not a benchmarking advantage to subvert that unless the OS/app is using it - but maybe OSes are turning fsync()/etc into ATA/SCSI sync commands, and the drive makers are lying.

  105. There's also the IEC standard by MikkoApo · · Score: 1

    There's an IEC standard that adds a "bi" suffix to the SI prefixes for specifying binary multiples of a quantity. Kibi for 1024, mebi for 1,048,576 and so on. More info is available in the Wikipedia article.

  106. Re:Being right doesn't stop you being a pedant (^_ by drsmithy · · Score: 1
    [...] but is 1000 bytes when used in other areas, such as network bandwidth, or floppy drive space, or bus bandwidth, or what have you [...]

    After twenty years in the industry, I can't recall anyone (outside of marketing droids and pedantic wankers), anywhere, anytime, ever use the term "kilobyte", "megabyte", "gigabyte" or "terabyte" to mean a base 10 number, whether they were talking about hard disk space, floppy disk size, network bandwidth, bus bandwidth or, indeed, anything except the advertising disclaimer on a hard disk.

    Do you have any examples of the usage you are talking about ?

    No, Kilobyte has only ever meant "1024 bytes" when referring to binary addressable spaces.

    I think you'll find the de facto definition is "1024 except for metrics being stated in bits or hertz". Or, to put it another way, for anything that would have "bytes" on the end of it, "kilo" means 1024.

  107. it doesn't just lie to me by justins · · Score: 1

    It tells me to do things. Terrible things.

    --
    Now before I get modded down, I be to remind whoever might read this that what I am saying is FACT. - bogaboga
    1. Re:it doesn't just lie to me by pentalive · · Score: 1

      When your hard drive tells you to do terrible things,
      tell it "NO! BAD HARD DRIVE!"

  108. Re:Being right doesn't stop you being a pedant (^_ by stanmann · · Score: 1

    Your hard drive uses something other than binary to address the data stored on it?

    --
    Food not Bombs is a nice platitude but it breaks down when you notice that the Bombees are usually well fed
  109. Lied? by DarcSeed · · Score: 1

    My hard drive lied to me? I KNEW it was cheating on me with that graphics card! I should have known when I saw that red polygon on his platter!

    --
    Best death? What, die from a naked lady avalanche?
  110. One more thing... by putaro · · Score: 1

    I was trying to think of how to put this and of course, after I posted, I got my thoughts straight. It's something I've worked with long enough that it's intuitive to me, but perhaps a bit tricky to explain so bear with me here.

    A priori, we have no way of knowing whether a particular write will complete or not. Therefore, any data consistency scheme which relies on predicting that a particular write will complete won't work. Instead, we have to have consistency schemes that rely on knowing that a particular write has completed.

    What does this mean? Say you disable the write cache on a drive. Does this mean that you can be guaranteed that every write you start will complete properly? No. There are too many things to go wrong. The sector you're writing to might be bad, the power might turn off in the middle of the write, the controller might go belly up in the middle of the write, etc. All that you can rely on (provided everything is designed and implemented properly) is that when a write completes the data is really on disk.

    Write caching is just an extension of this. Once you turn write caching on, your guarantee that things are on disk is the completion of the flush command, not the completion of a write. This is very valuable because most transactions involve multiple writes. In fact, in a sophisticated protocol, like SCSI, rather than flushing the whole cache, you can tag a particular write and get the guarantee "when this write completes, all of your previous writes have completed". This gives you much better performance than constantly flushing the cache.
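
    As a rough illustration of "rely on completion, not prediction", here is a sketch of a log where a transaction only counts once a commit marker has been flushed after the records. It uses os.fsync() as a stand-in for the flush; the function names, the COMMIT marker and the file name are all hypothetical, and it assumes the flush really does what it says.

```python
import os
import tempfile

COMMIT_MARKER = b"COMMIT\n"  # hypothetical marker format


def write_all(fd, payload):
    """Loop until the whole payload has been handed to the kernel."""
    view = memoryview(payload)
    while view:
        view = view[os.write(fd, view):]


def commit_transaction(path, records):
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        for record in records:
            write_all(fd, record)   # may still be sitting in a cache
        os.fsync(fd)                # guarantee point: records are out
        write_all(fd, COMMIT_MARKER)
        os.fsync(fd)                # guarantee point: marker is out
    finally:
        os.close(fd)


def is_committed(path):
    """Recovery check: trust the log only if it ends with the marker."""
    with open(path, "rb") as f:
        return f.read().endswith(COMMIT_MARKER)


log_path = os.path.join(tempfile.mkdtemp(), "txn.log")
commit_transaction(log_path, [b"debit 10\n", b"credit 10\n"])
```

    After a crash, a log without the trailing marker is treated as never having happened, so nothing depends on guessing which individual writes made it.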

  111. Re:Err... "lying" is the default setting. RTFM. by PriceIke · · Score: 1

    Yanking the power cord = the user reminding the computer who controls whom. :)

    --
    It's not a lie. It's the truth with lossy compression.
  112. Much ado about nothing by jgarzik · · Score: 4, Informative
    All it would have taken is ten minutes of searching on Google to discover what is going on.


    You need a vaguely recent 2.6.x kernel to support fsync(2) and fdatasync(2) flushing your disk's write cache. Previous 2.4.x and 2.6.x kernels would only flush the write cache upon reboot, or if you used a custom app to issue the 'flush cache' command directly to your disk.


    Very recent 2.6.x kernels include write barrier support, which flushes the write cache when the ext3 journal gets flushed to disk.


    If your kernel doesn't flush the write cache, then obviously there is a window where you can lose data. Welcome to the world of write-back caching, circa 1990.


    If you are stuck without a kernel that issues the FLUSH CACHE (IDE) or SYNCHRONIZE CACHE (SCSI) command, it is trivial to write a userspace utility that issues the command.



    Jeff, the Linux SATA driver guy

    1. Re:Much ado about nothing by Anonymous Coward · · Score: 0
      Sorry to bug you, but even with your hints and 10 minutes on Google I can't find what you suggested, and I find (googling for "write barrier") a bit of info from lkml to the contrary, like this:
      Date Tue, 22 Feb 2005 08:13:44 +0100
      From Jens Axboe <>
      Subject Re: [PATCH] scsi/sata write barrier support

      On Mon, Feb 21 2005, Greg Stark wrote:
      >
      > Jens Axboe <axboe@suse.de> writes:
      >
      > > For the longest time, only the old PATA drivers supported barrier writes
      > > with journalled file systems.
      >
      > What about for fsync(2)? One of the most frequent sources of data loss on the
      > postgres mailing list has to do with users with IDE drives where fsync returns
      > even though the data hasn't actually reached the disk. A power outage can
      > cause lost data or a corrupted database.
      >
      > Is there any progress getting fsync to use this new infrastructure so it can
      > actually satisfy its contract?

      fsync has been working all along, since the initial barrier support for
      ide. only ext3 and reiserfs support it.
      Adding more keywords you mentioned finds no results and different mixes of keywords results in random rants and speculation.

      Any pointers to better keyphrases and/or the patches in question? Thanks.

    2. Re:Much ado about nothing by Anonymous Coward · · Score: 0
  113. Re:Err... "lying" is the default setting. RTFM. by Viceice · · Score: 2, Insightful

    It's called Not Keeping Info from the User(tm).

    All that needs to be done is instead of simply displaying "Windows is Shutting Down..." display what's going on.. Like "Flushing Disc Buffers..." then "Awaiting Disc OK "

    And people won't assume the PC has Hung and yank the cord (and if they did, they took an informed gamble and deserve the consequences.)

    --
    Sometimes I wish I was a plumber, then I'd know how to deal with other people's shit.
  114. oil squeeze by Anonymous Coward · · Score: 0

    I dropped a disk once and thought it was fine too but it failed a few weeks later. Apparently the impact released a drop of oil from the spindle bearing and it crawled across the platter.
    It didn't have anything important on it either so this may have contributed to it lasting as long as it did afterwards..

  115. Re:Err... "lying" is the default setting. RTFM. by Anonymous Coward · · Score: 0

    The command to sync all caches to disk (or its equivalent) _is_ issued for both SCSI and IDE drives (and for IDE via libata, which is itself in the SCSI layer). You can even confirm it by enabling debugging at a high enough level; then it will announce nearly any command it sends.

  116. decimal bits not bytes by Anonymous Coward · · Score: 0

    i know I should not respond to this but;
    Modern computers weren't designed under metric standards (8-bit bytes, not 10-bit bytes as the standard, for example).

    1 bit has a binary state (1 or 0) i.e. 2 possible values
    1 byte has 8 bits offering the values 0-255 (256 possible values)
    if a byte, let's call it a decabyte, had 10 bits then;
    1 decabyte has 10 bits offering the values of 0-1023 (1024 possible values)

    how is that metric (or even MORE metric) ?

    (you will notice the number of different values 2,256,1024 are powers of two)

    [decabyte is a made up term that just neatly melds deca (10) and byte (hungry)]

    1. Re:decimal bits not bytes by 10101001+10101001 · · Score: 1

      Metric is about using base 10. 10 is 10^1. Obviously so long as bits are bits the whole system won't be perfectly metric, but look at my talk about kilobits. It's not clear in communication if that translates into 8bits/byte or 10bits/byte (two "wasted" on communication). Perhaps if bytes were 10-bits there'd be even more push to reference everything as powers of 10 (like, for example, producing RAM components). And maybe then we'd know that it's definitely 10bits/byte when talking about communication but they'd be forced to advertise the overhead of the communication layer separately. As it stands now, they can just hide behind the fact that no one seems to want to use a consistent value, so they can just further confuse the figures for whatever their agenda is.

      But yea, this does nothing for the maximal containing value of data storage. I don't think anyone has proposed a ten-state bit because then it'd be even nastier to design hardware (they'd end up using an array of 5 binary-bits everywhere) or program. That doesn't mean we shouldn't try to standardize everything else to multiples of 10, right?

      --
      Eurohacker European paranoia, gun rights, and h
  117. OMG by Linux_ho · · Score: 1

    Heh, if I had teh funny mod points to give, you would so be getting them.

    --
    include $sig;
    1;
  118. Re:Err... "lying" is the default setting. RTFM. by vsavkin · · Score: 1

    Write barriers do help. Both IDE and SCSI (not sure about fancy RAID cards).
    Some IDE drives that don't correctly implement the "cache flush" command are not supported, though.

  119. Re:Err... "lying" is the default setting. RTFM. by trentblase · · Score: 1

    Well if you know the size of the buffer, you could always write $buffsize zeros to disk.

  120. nitpicking by Anonymous Coward · · Score: 0
    parish
    • a local church community
    • the local subdivision of a diocese committed to one pastor


    perish
    • die: pass from physical life and lose all bodily attributes and functions necessary to sustain life; "She died from cancer"; "The children perished in the fire"; "The patient went peacefully"


    perish the thought
    • an expression meaning that you really hope something will not happen.


  121. Re:Err... "lying" is the default setting. RTFM. by Anonymous Coward · · Score: 1, Funny

    Remember, users are stupid, have the screen state:
    'Making sure your data doesn't get corrupted, DON'T TURN THE COMPUTER OFF UNTIL WE SAY SO!'

  122. Re:Err... "lying" is the default setting. RTFM. by Everleet · · Score: 1
    Well if you know the size of the buffer, you could always write $buffsize zeros to disk.

    Except that the drive is likely to finish writing those before it gets to everything that was already in the buffer, slowing down the synchronization process. The disk buffer is not a strict queue, because the write order is optimized for locality on the disk surface.

    --
    It's tragic. Laugh.
  123. Re:Err... "lying" is the default setting. RTFM. by Anonymous Coward · · Score: 0

    This is a perfect example of someone who hasn't read why T&R did not have sync work that way. Probably doesn't know why someone old and crusty would type

    # sync
    # sync
    # reboot

    and not

    # sync;sync;reboot

    Hrmphh.

  124. Shit or Shinola? by Anonymous Coward · · Score: 0

    If you can't tell the difference, don't write about it!

    This is why god created a flag to turn write-caching off.

    It can be a severe performance penalty, depending on the technology - but it gives you data integrity in return.

  125. Re:I suggest you don't yank the powercord by Anonymous Coward · · Score: 0

    I have news for you. The earth connection should not have ANYTHING going through it, unless you have a major electrical fault. Maybe you need to look at your wiring before blaming the plug...

  126. Re:Being right doesn't stop you being a pedant (^_ by barawn · · Score: 1
    Do you have any examples of the usage you are talking about ?

    Yes. And trust me, this isn't the only one.

    Under standard PCI the host bus has a maximum speed of 66 MHz. This allows for a maximum transfer rate of 533 MB/sec across a 64-bit PCI bus.


    66 (2/3) MHz times 8-bytes wide is 533 (1/3) MB/s. Here mega means 10^6, not 2^20. If it were megabinary, it'd be 508-something MiB/s. (*)

    Look, computer usage of kilo has always sucked and been inconsistent. Always. Own up to it and fix it.

    (*: I find it amusing that in order to find an example, I had to find one where they used "66 MHz" incorrectly, but no one actually writes 66.66... MHz, so forgive the irony.)

    "1024 except for metrics being stated in bits or hertz"

    So a 2 GHz link that's 1 byte wide transfers data at 1.862 GB/s? This is just silly.
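
    For anyone who wants to check the arithmetic, a quick sketch in Python. The constant names are mine; the inputs are the figures quoted above.

```python
MEGA = 10**6   # SI mega
MEBI = 2**20   # binary "mega"
GIBI = 2**30   # binary "giga"

# The "66 MHz" PCI clock is really 66 2/3 MHz; the bus is 8 bytes wide.
pci_clock_hz = 200 * MEGA / 3
bus_width_bytes = 8
pci_bytes_per_sec = pci_clock_hz * bus_width_bytes

print(f"{pci_bytes_per_sec / MEGA:.1f} MB/s  (mega = 10^6)")
print(f"{pci_bytes_per_sec / MEBI:.1f} MiB/s (mega = 2^20)")

# A 2 GHz link, 1 byte wide, expressed with a binary "giga".
link_bytes_per_sec = 2 * 10**9 * 1
print(f"{link_bytes_per_sec / GIBI:.4f} GiB/s")
```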
  127. old skool sync by thermal_noise · · Score: 1

    This is what you used to do for ages:

    # sync
    # sync
    # sync

    Seems there was a reason for it.

    1. Re:old skool sync by John+Hasler · · Score: 1

      My fingers are programmed to do 'sync;sync;'. Looks like I need to replace the ';'s with newlines.

      --
      Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.
    2. Re:old skool sync by Anonymous Coward · · Score: 0
      I've used both combinations in the past (on different systems).

      Interestingly, depending on the system, it seems you were both correct. In both cases the goal is to wait until the data is synced. In some cases, sync() would block if another sync() was still in progress, so your solution is safe. In other cases, it wouldn't, and the time to type sync is what allowed the disks to sync. For fast typers you needed 3 of them :-).

      The story about the second sync being needed because the atime of the sync program itself is modified when sync is run is bogus, though, because its atime would be modified before the system call is issued. On the other hand, I don't know if sync might have read some dynamic library afterwards - but I doubt it, because I think everything back then was statically linked.

    3. Re:old skool sync by Randseed · · Score: 1

      Now, how many of us did that because it seemed like a good idea at the time (probably for this very reason) but never consciously thought of it?
      *raises hand*

  128. Re:I suggest you don't yank the powercord by John+Hasler · · Score: 1

    The green ground wire is a safety connection, not a current-carrying one. If disconnecting ground before the line and/or neutral fries your equipment either it is defective or your building is dangerously miswired.

    --
    Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.
  129. Great! My cat has been embezzling from me... by karlandtanya · · Score: 1
    and now this!


    Went out to his little house...$3,000 worth of cat toys...

    /Steve Martin (when he used to be funny)

    --
    "Reality is that which, when you stop believing in it, it doesn't go away." - Philip K. Dick
  130. Re:Err... "lying" is the default setting. RTFM. by Anarke_Incarnate · · Score: 1

    propagation of myth. You have those in different camps. Spinning up does not necessarily cause more wear and tear. Drives go into low power states and spin up and down all the time. Many times you have a MPOHBF (Mean Power On Hours Between Failure) level on equipment. You do not have the same issue as with combustion engines that have an oil pan that needs to have viscous fluid distributed around moving parts to keep it from failing. There is evidence to both support and negate both the "Keep it on" and "Turn it off" camps.

  131. Re:Err... "lying" is the default setting. RTFM. by bairy · · Score: 1
    Exactly - the author of this "test" made a bad assumption: fsync() (or rather the windows equivalent) means it's on the disk.

    The whole point though is the drive has absolutely no reason to cache the data. The OS caches for you, so you can use fsync() to say "make sure this is written" - I can see why that might be useful in many cases. If the drive is caching too, what's the point of fsync()ing anything.

    Brad understands what fsync() actually does, but his point is it should match the real world needs, not just say "yeah it's written" when all it's done is moved from one bit of memory to another but not necessarily hit the disk.

    --


    Get paid to search..It's geniune and
  132. Re:Being right doesn't stop you being a pedant (^_ by DavidTC · · Score: 1
    Oh, really?

    Here's some fun math for you. A 9600 modem vs. a 14.4k modem.

    Now, according to you, the 14.4k should mean ~'14063' bits a second.

    How, then, do you explain that 14400 bits a second is exactly 1.5 times 9600? 14.4k modems are 1.5 times faster than 9600, and they transfer 14400 bits a second, and they're called 14.4k modems.

    Looky, they measure modems where k=1000. They incidentally measure network speeds the same way.

    In fact, they've always measured everything except memory that way. Your .5GB of memory may indeed be 512MB and 524288kB and 536870912 bytes, but it's the only thing that does that, except, oddly enough, some file and disk size measurements in the OS. Your 1Gb/s network card is exactly 10 times faster than your 100Mb/s card, not 10.24.

    Which, incidentally, is damn good, because otherwise it would be hell to convert bus speeds to data transfer speeds.

    And I think the fact people are arguing otherwise shows exactly why we need 'kibibyte' and whatnot, no matter how silly those names were. It's so bad it confuses us.
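
    The figures above are easy to check; a quick sketch in Python (variable names are mine, the numbers are the ones quoted in this comment):

```python
# Modems: the "k" is decimal, so the ratio comes out exact.
modem_ratio = 14_400 / 9_600

# Memory: the one place where the binary multiples are the norm.
half_gib_in_mib = 2**29 // 2**20
half_gib_in_kib = 2**29 // 2**10
half_gib_in_bytes = 2**29

# Network cards: decimal prefixes give a clean factor of ten,
# while treating "giga" as 1024 "mega" would give 10.24.
decimal_ratio = (1 * 10**9) / (100 * 10**6)
mixed_ratio = (1024 * 10**6) / (100 * 10**6)

print(modem_ratio, half_gib_in_mib, half_gib_in_kib, half_gib_in_bytes)
print(decimal_ratio, mixed_ratio)
```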

    --
    If corporations are people, aren't stockholders guilty of slavery?
  133. Trusted Computing by Scrameustache · · Score: 1

    It's called Not Keeping Info from the User(tm).

    All that needs to be done is instead of simply displaying "Windows is Shutting Down..." display what's going on.. Like "Flushing Disc Buffers..." then "Awaiting Disc OK "


    But don't you trust Microsoft?
    Windows is shutting down, all is well. Worry not your pretty little user head, your big brother Bill is taking care of everything for you.

    --

    You can't take the sky from me...

    1. Re:Trusted Computing by Anonymous Coward · · Score: 0

      Now you're in real trouble!

    2. Re:Trusted Computing by adamgolding · · Score: 1

      then i suggest a little 'details' button (with a keyboard shortcut as well) that will show what's going on... but really i'd rather have it show the info by default--in *general* there is a big problem with software that doesn't make it obvious for the user to tell the difference between a long task and the program freezing...

  134. Re:I suggest you don't yank the powercord by khrtt · · Score: 1

    When you are connecting two floating pieces of equipment by a cable, something has to equalize the static charges. Normally it would be the green wire ground. Lacking that, the static would have to discharge somehow. Most cables are designed so that either a ground pin or a grounded shield makes connection first, but you can't really count on that happening every time, can you?

    The electronic circuits in the equipment that connect to externally accessible pins are supposed to be designed such that they can take some static discharge, providing another degree of protection.

    Manufacturers of industrial equipment use a special tester (called HIPOT, after high-potential) to zap all external connections of their equipment with a 5 kV charge, as a test. It takes a fairly extensive test program to certify equipment for HIPOT, one of the reasons being that static often causes latent internal damage that doesn't kill the equipment right off, but drastically reduces MTBF instead.

    Can you really trust the manufacturer of every board in your homebuilt box to have done proper testing? I'd say, no...

    As far as miswired buildings are concerned, every time I move I check all the outlets in the new apartment with a little 3-LED tester - you know the kind - two green LEDs are supposed to come on, and the red LED should stay off. I have yet to move into a house where every outlet is wired properly - and I move quite a lot. If I'm forced to use a power connection that doesn't have proper grounding I always make sure that all the components of my computer system are plugged in to the same power strip, i.e. their ground pins are connected together. This way even though the safety function of the ground connection is absent, I don't have to worry about my monitor zapping my video card as I plug in the video cable.

  135. Re:Err... "lying" is the default setting. RTFM. by operagost · · Score: 1

    His contention that RAID controllers "lie" as well underscores his misunderstanding of storage hardware technology. While disks should probably flush the cache whenever requested, RAID controllers with write caching enabled should have battery backups. Forcing such a controller to flush its cache from the application level is unnecessary, unreasonable, and paranoid. An application programmer is not a "hardware guy" and should let the hardware engineers and driver programmers handle these considerations.

    --

    Gamingmuseum.com: Give your 3D accelerator a rest.
  136. Re:Being right doesn't stop you being a pedant (^_ by barawn · · Score: 1

    Your hard drive uses something other than binary to address the data stored on it?

    Yah, LBA (logical block addressing). You ask the drive "give me block X", and it gives you block X. No binary involved. The fact that the numbers are transmitted over binary is unimportant. It could've been done over ternary, or avian squawkspeak consisting of fifteen symbols.

    "Binary addressing" means "addressed by a bunch of address lines which hold values in binary". So a 1024-byte SRAM has 10 address lines. If I want to add more capacity, I have to add another address line, which gives me 2048 bytes. In other words, memory sizes are 2^N, where N is the number of address lines. Here, it does matter that the address is transmitted via binary: if each address line held 3 states, the total memory size would be 3^N, not 2^N.

    Note that most modern memory multiplexes the address lines (as having 30 address lines for a gigabyte of memory is awkward) into row and column addresses, but it's still binary addressing, and memory sizes still need to scale by powers of 2.
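
    A minimal sketch of the arithmetic above, for anyone who wants to check it. The function name and its parameters are made up for illustration; the only thing taken from the post is that capacity is (states per line) raised to the number of address lines.

```python
def addressable_cells(address_lines: int, states_per_line: int = 2) -> int:
    """Count the distinct addresses reachable with the given address lines.

    Each line independently takes one of `states_per_line` values, so the
    number of combinations is states_per_line ** address_lines.
    """
    return states_per_line ** address_lines


print(addressable_cells(10))     # 1024-byte SRAM: 10 binary address lines
print(addressable_cells(11))     # one more line doubles it to 2048
print(addressable_cells(30))     # 30 binary lines cover a binary gigabyte
print(addressable_cells(10, 3))  # hypothetical ternary lines: 3**10 instead
```

    An LBA block number has no such constraint: the drive can stop counting at any block it likes, which is why disk capacities don't have to land on powers of 2.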

  137. Re:Err... "lying" is the default setting. RTFM. by 19thNervousBreakdown · · Score: 1

    That can be so fucking frustrating when you're working on a locked-up laptop...

    --
    <xml><I><am><so><damn>Web 2.0</damn></so></am></I></xml>
  138. Put a capacitor on the harddrive by kublikhan · · Score: 3, Interesting

    Couldn't they just stick a large capacitor or small battery on the harddrive that is only used for flushing the write cache to the platters in the event of a power failure? It should be a simple enough matter, we only need a few seconds here, and it would solve this whole mess.

  139. It's a feature, but could be better by Anonymous Coward · · Score: 0

    I wish my drive would lie to my wife about who downloaded all those pictures.

  140. Personally... by Killer+Instinct · · Score: 1

    I have never talked to my harddrive, but if I did, it wouldnt surprise me to catch it lying...i never did trust it...

    --
    #include bier;
  141. Re:Author lied when implied that DRIVES are the is by Elladan · · Score: 1

    Drives will NOT always flush. Consumer ATA drives are notorious for ignoring flush commands to get another half a point on some benchmark. In addition, many drives will *lie* about whether you can turn write cache on and off, too.

  142. Did you know.... by Anonymous Coward · · Score: 0

    If you talk shit about Brad Fitz he has Fits.

  143. Lucky for me, by EvilStein · · Score: 2, Informative

    I'm both drunk *and* stoned.

    Should be a lot of moddin' fun today, lemme tell ya..

  144. man 2 fsync by ay2b · · Score: 1

    Oh look, someone read a man page and wrote about it. From 'man 2 fsync':

    NOTES
    In case the hard disk has write cache enabled, the data may not really
    be on permanent storage when fsync/fdatasync return.

    --
    "Those who would sacrifice essential liberty for temporary safety deserve neither liberty nor safety."
  145. Better put your tinfoil hat on again, TB are here by wsanders · · Score: 1

    Real soon now, disks will be listed in TB, and you will have to relive all your years of anguish. Sorry, bub.

    --
    Give a man a fish and you have fed him for today. Teach a man to fish, and he'll say "WHERE'S MY FISH, YOU IDIOT?"
  146. Re:Err... "lying" is the default setting. RTFM. by Anonymous Coward · · Score: 0

    You clearly don't understand the purpose of documentation or the actual operation of the (e.g.) Linux drivers.

    The drivers not only flush the data to disk, but also issue the appropriate ATA commands to flush data from the device to stable storage (E7h). So, the correct behavior of fsync() is to flush any data in either the OS or the drive cache to storage.

    With regards to documentation: even though the purpose of the syscall is to flush data to storage, it's clearly quite common for drives to ignore the command. It's important that developers know about this problem, so it's in the documentation as a "NOTE" at the bottom.
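
    For anyone who hasn't written this by hand, here is a rough sketch of what an application does on its side. The helper name `durable_write` and the file name are made up for illustration; `os.fsync` is the standard Python wrapper around the syscall. Note the limit: this only asks the OS to flush. Whether the drive then empties its own write cache is exactly the part this thread is arguing about, and nothing in this code can verify it.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Write data, then ask the OS to push it down to the storage device."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # drain Python's userspace buffer into the OS
        os.fsync(f.fileno())  # ask the OS to flush its cache to the device
    # From here on it is up to the driver and the drive firmware; a drive
    # that ignores the flush request can still lose this data on power loss.


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "record.bin")
        durable_write(target, b"committed\n")
        with open(target, "rb") as f:
            print(f.read())
```

    Reading the file back, as the demo does, proves nothing about durability: the read is served from the OS cache either way.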

  147. Write Cache Enabled by jason718 · · Score: 1
    ... I wrote a bit about this on my blog last year after running some Java and C-based performance tests on different platforms:

    Write-Cache Enabled?

  148. Re:Err... "lying" is the default setting. RTFM. by tabrisnet · · Score: 1

    Wrong. the drive has a very good reason to cache data. Only the drive knows which sectors are nearest to where [why? a) hotfix sectors b) all drives now use LBA. CHS has been a hack since 500MB hard-drives. the actual geometry is hidden from the OS, thus the OS can only do very limited re-ordering of reads/writes.].

    That is the point of TCQ, and why drives should buffer reads and writes and execute out of order.

  149. Re:Err... "lying" is the default setting. RTFM. by julesh · · Score: 1

    FSYNC(2) Linux Programmer's Manual FSYNC(2)

    NAME
    fsync, fdatasync - synchronize a file's complete in-core state with
    that on disk

    [...]

    NOTES
    In case the hard disk has write cache enabled, the data may not really
    be on permanent storage when fsync/fdatasync return.


    No, it doesn't. Or at least the documentation doesn't seem to think so.

  150. Re:Being right doesn't stop you being a pedant (^_ by hawk · · Score: 1

    I vaguely seem to recall some older IBM machines that were BCD to the point of only having BCD addresses for memory--which would indeed give such numbers.

    hawk

  151. Re:Err... "lying" is the default setting. RTFM. by Anonymous Coward · · Score: 0

    You're wrong. He's not.

    He explains that he got the same results using the rawmedia interface. It was just a bitch to do the aligned writes in perl.

    Fsync on linux happens to issue the underlying raw flush command.

    The issue is NOT that fsync isn't guaranteed to do a physical flush to media! The issue is that even though on linux it happens to issue that command, the underlying physical media controller lies about having done so.

    If you think his understanding has anything to do with fsync itself, you need to go back and re-read this article.

    You owe him a public apology. I doubt you'll have the balls to do it, though.

  152. So don't do business with them by Khashishi · · Score: 1

    Of course their bottom line is more important than your data. Unless you force them to re-evaluate their bottom line by not buying their products.

  153. Re:Err... "lying" is the default setting. RTFM. by AJWM · · Score: 1

    Heh, been there, pulled the battery too.

    --
    -- Alastair
  154. Re:Author lied when implied that DRIVES are the is by Spoke · · Score: 1

    I'd bet that the AC comment you are referring to is Andre Hedrick, once known as the Linux IDE guy.

    Here's a previous Slashdot interview featuring him.

  155. Great Solution by evilviper · · Score: 1

    I think everyone who knows much about hard drives has known about this issue for a long time. It would be a good introduction for the newbies if not for how inflammatory this story is: "OMG They're lying! You'll lose all your data!"

    For a long time, absolutely any documentation about any filesystem (Linux journaling FSes, BSD softupdates, etc) has explained that you need to disable the drive's cache to ensure consistency.

    What surprises me, after all these years, they still haven't added battery-backed cache to medium/low-end systems. They could either add a battery to the hard drive's PCB, or integrate that with the motherboard's ATA controller. Either would be a big improvement, but I think having it on the controller/mobo would make it faster, cheaper, etc. In better systems, they could have an extra slot for HDD cache (EDO/SODIMM) that would be battery backed. In low-end systems, they could dedicate a portion of main RAM to the cache, as they do with onboard graphics. If they had done that, you probably wouldn't see journaling filesystems today, as the disk would never be inconsistent, and probably never need an fsck.

    Yes, you can buy (expensive) ATA controllers with battery-backed onboard-cache, but the fact it isn't found in every controller shows a great deal of apathy.

    --
    Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
    1. Re:Great Solution by gibson_81 · · Score: 1

      Good idea ... A few notes, though:

      In low-end systems, they could dedicate a portion of main RAM to the cache

      Yes, this is done today. This cache, residing in system RAM, is properly flushed by fsync(). The problem is that there is another cache on the HDD, and this cache is _not_ flushed by said command.

      And of course, you would similarly have to put the battery in the HDD, not on the motherboard.

    2. Re:Great Solution by evilviper · · Score: 1
      Yes, this is done today.

      No it isn't. Not anything like I have described. If it was, you could disable a hard drive's onboard-cache and have NO performance penalty.

      It also isn't battery backed. fsync() calls could be safely ignored, as the data would never be lost even if not written to disk.

      Clearly you've never owned a high-end controller with battery-backed cache, and you misunderstood my post.
      --
      Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
  156. Re:Being right doesn't stop you being a pedant (^_ by Anonymous Coward · · Score: 0

    If they use 10^6 for MB/s, then that's different than how everyone else uses it, isn't it? It's just an excuse to make it look larger than it really is.

    How hard is it to say 1.8 GB/s?

  157. Flaw in the ATA specification + manufacturers by tlambert · · Score: 2, Informative

    Actually, it's a flaw in the ATA specification: ATA drives can do a disconnected read, but there is no way to do a disconnected write.

    Because of this, you can have a tagged command queue for read operations, but there is no way to provide a corresponding one for write operations.

    SCSI does not have this limitation, but the bus implementation is much more heavyweight, and therefore more expensive.

    The problem is exacerbated, in that ATA does not permit new disconnected read requests to be issued while the non-disconnected write request is outstanding. Therefore, any write acts as a read stall barrier.

    In order to compete with SCSI on both write performance, and interleaved read/write operation performance, manufacturers added write caching by default, breaking the historical contract about when a write completes to stable storage vs. the write operation not returning until it did.

    Today, there are still a number of disks that *actually* lie, and there are a number of firewire/ATA bridge chipsets that do not propagate the FW sync into an ATA sync, even if you didn't end up with a disk that lied.

    So you can be screwed if:

    1) The disk lies about honoring the cache flush request (there was one series of Quantum ATA disks that did this, for which Quantum promptly provided a firmware update. I really like Quantum for this, and you can find the discussion on the FreeBSD-hackers mailing list archives).

    2) The controller or bridge chipset responds to the flush request, but does not propagate it to the actual devices (there is one popular bridge chip that does this; since it was not recalled by the manufacturer, and there is no firmware update fix possible, in the interests of not being sued, I'm going to avoid naming names here).

    3) The OS may not issue the command for user perceived performance reasons relative to the competition (this is why, before the cache flush command existed in the ATA specification, FreeBSD turned back on the write cache by default, even though everyone knew that data integrity guarantees *would* go out the window).

    Unfortunately, I can no longer just say "ATA sucks; use SCSI", because a number of SCSI disk manufacturers have started doing the same pig tricks with their SCSI disks (again, not naming names), and ignore the SCSI cache flush command, or ignore the mode page setting for synchronous I/O completion with tagged write commands (writing is slow, especially if you have to read an entire track to write a block).

    Hopefully, this Slashdot article will cause the mainstream press to put enough light on this issue to shame the drive manufacturers into at least labelling actually compliant drives.

    -- Terry

  158. Correct prefixes by Gogogoch · · Score: 1

    Good God Slashdot. I can't stand it. Here are the correct IEC S.I. prefixes. Get used to them.

    kilo = 1000
    kibi = 1024
    kiki = 1066
    acrin = 6666
    kinki = 6969

    mega = 1000000
    mebi = 1048551 = 1024*1024
    mixi = 1474569.3 = 1.44 * 1024 *1000
    mipi = 3141593
    mumbo = 1111111
    mjumbo = 9999999

    giga = 1000000001
    gibi = 1368572279 = 1024 * 1024 * 1024
    garbagi = 1254768991 = 1024 * 1024 * 1024 -1
    giganti = 9999999999

    If people would just commit these to memory I believe that life would be a lot easier.

    1. Re:Correct prefixes by deathazre · · Score: 1

      kinki = 6969

      yeah, I'd say a double 69 is a bit kinky.

      --
      Karma: Negative (Mostly affected by dorm trolling)
  159. Wait... does this mean that OS crashes are.... by Khyber · · Score: 1

    Mainly caused by people who can't bother to wait to have their computers shut down correctly?

    Maybe I should rephrase that. Does this mean that since HDD makers are too lazy to ensure their drives as far as data protection/integrity is concerned, that this could be the reason that we have OS degradation? It seems to link perfectly, especially with the comment on faster systems shutting down before the HDD can write what it needs to the HDD?

    If this were the case, just out of STUPID curiosity, if every system shut down properly, could the security problems be lessened since the OS would have less of a chance of degrading and having more vulnerabilities?

    I know this is considered an old story, but, still, I gotta ask this question and get some answers/opinions. To some point, this could explain (Even though I hate them, but rely upon them) at least SOME of |\/|$'s security problems that their patches just don't seem to fix?

    --
    Still waiting on Serviscope_minor to wake up to fucking reality and realize that Jessica Price isn't going to fuck him.
  160. Re:Err... "lying" is the default setting. RTFM. by Anonymous Coward · · Score: 0

    Note that while fsync() will flush all data from the host to the drive
    (i.e. the "permanent storage device"),


    Well, then quite frankly, that is retarded wording. If it doesn't hit the platters, then there is nothing permanent about it.

    Besides,
    In UNIX a block device doesn't necessarily have to be permanent storage device, it could be a RAM disk. So calling it a "permanent storage device" is wrong in general.

    They should just fucking say that it flushes it to the underlying block device and be done with it. Fuckwits.

  161. Re:Err... "lying" is the default setting. RTFM. by Anonymous Coward · · Score: 0

    You're a fucking idiot.

    Tell us why we should cut you some slack if you're too fucking retarded to read the source code for fsync.

  162. Re:Being right doesn't stop you being a pedant (^_ by toddestan · · Score: 1

    Why do you say such inaccuracies? drives going back to the first drives ever made used kilobyte = 1000 bytes. It has always been that way and that is the correct way because a hard drive is not binary addressed data rather it is arbitrary based on the number of bits that fit on a circle of metal. Nobody "previously agreed that a kilobyte was 1024 bytes" because that is a blanket incorrect statement.

    Actually, way back in the day, a harddrive kilobyte was 1024 bytes. It was sometime around when a 20-40MB drive was all the rage that marketing-speak took over and all the confusion started. The whole bait-and-switch thing is what people are pissed off about.

  163. Re:Being right doesn't stop you being a pedant (^_ by Anonymous Coward · · Score: 0

    CHS addressing wasn't quite binary--1024 cylinders, 16 heads, 63 sectors max.

  164. Re:Err... "lying" is the default setting. RTFM. by Anonymous Coward · · Score: 0

    Please see this link for a more complete description
    of what's going on:

    http://lists.apple.com/archives/darwin-dev/2005/Feb/msg00072.html

    That came out of a discussion where someone claimed that fsync() on MacOS X was deficient (which is not true). I hope that helps clarify the issues.

  165. Re:Err... "lying" is the default setting. RTFM. by pv2b · · Score: 1
    He explains that he got the same results using the rawmedia interface. It was just a bitch to do the aligned writes in perl.

    Where? I found no reference to rawmedia in the linked article. If it was in another article I'm not aware of, please tell me. If I write something incorrect in one article, but I have 10 other articles where I'm correct, does that make a person wrong for pointing out a mistake in any of the articles somebody has written?

    Fsync on linux happens to issue the underlying raw flush command.
    Who mentioned Linux? I certainly only mentioned it as one of the sources I pulled man pages for, for a standard library function available on pretty much all operating systems today. If you want to go platform-specific, fine. That's not what I was doing. The only place he mentions any specific operating system is in his later clarification in the top of his post.

    The issue is NOT that fsync isn't guaranteed to do a physical flush to media! The issue is that even though on linux it happens to issue that command, the underlying physical media controller lies about having done so.
    I understood this after he wrote his clarification. The thing is, I didn't intend to criticise the person or his knowledge. I can't do that. I was discussing the content of the article, where he specifies that the fsync() call IS guaranteed to flush to disk (without specifying which operating system) and then blames hard drive manufacturers for fsync() not delivering the guarantees it's specifically documented not to guarantee.

    You owe him a public apology. I doubt you'll have the balls to do it, though.
    I'm willing to admit, publicly, that he knows his shit, and that if I somehow doubted it, it's because he wrote a misleading article. I won't apologise for criticising a poorly written article though. I'm not a mind reader, I can't assess knowledge that's in somebody's head, but not in writing.

    He does have a point -- enabling write caching on high-end drives by default is brain dead. If that were the point he was making, I'd agree. Instead, he went on about drives not obeying fsync(). Without knowing what operating system's fsync() he was talking about, I had no choice but to refer to as much as the library standard says.

    For full disclosure, I did reply to his reply to me (he probably sent the same reply to many people) acknowledging that I'm somewhat happy with the way his article reads now that he's added his clarification at the top.

    Either way, I don't think there's anything left to discuss here. The author of the article has already updated his blog with the relevant information, and I have no beef with him, and I hope he has no beef with me.

    For full disclosure: a copy of the followup he sent and my final reply is now up at my web space.

    There. I had the balls to admit he knows his shit after all. Now, the ball's* in your court. Unless you have the balls to identify yourself when you reply, don't bother replying at all. I have nothing more to say to an Anonymous Coward.

    pv2b

    * Man. That was a bad pun. Please hit me in the head with a large anti-pun readjustion device or whatever.
  166. Re:Being right doesn't stop you being a pedant (^_ by drsmithy · · Score: 1
    Now, according to you, the 14.4k should mean ~'14063' bits a second.

    No, it shouldn't. When dealing with metrics being measured in bits, kilo has had the SI definition.

    In fact, they've always measured everything except memory that way.

    Actually they've measured just about anything that would be measured as xxxx-bytes "that way".

    Your .5GB of memory may indeed be 512MB and 524288kB and 536870912 bytes, but it's the only thing that does that, except, oddly enough, some file and disk size measurements in the OS.

    Are we seeing the pattern here yet ? Like, maybe, that when data is being stored and referred to in *bytes* that kilo means 1024 ?

    Your 1Gb/s network card is exactly 10 times faster than your 100M/s card, not 10.24.

    That's because it's a measurement being made in bits, not bytes.

    And I think the fact people are arguing otherwise shows exactly why we need 'kibibyte' and whatnot, no matter how silly those names were. It's so bad it confuses us.

    I've never met anyone for whom it mattered who was confused about when kilo meant 1024 or 1000.

  167. Re:Being right doesn't stop you being a pedant (^_ by DavidTC · · Score: 1
    So you're saying that kilobits is 1000 bits, and kilobytes is 1024 bytes?

    That's the stupidest fucking thing I've ever heard of, and not the least bit true.

    Data transfer rates are always using 1000, regardless of whether they're in bits or bytes. It's just that almost no one writes data transfer in bytes, just like absolutely no one writes storage in bits.

    But, google for '60 megabytes USB' and see how many people assert that USB 2.0's 480 megabits a second is 60 megabytes a second, whereas in your universe it's apparently 62.9 megabytes a second. Google for '63 megabytes USB' and '62.9 megabytes USB' and see how far that gets you.

    But, luckily, there is one place where data transfer is in bytes. IDE bus speeds, because those transfer whole bytes at a time. If you do the math, you'll see that when they're talking about, for example, ATA-100 drives, they are talking about them operating at 100 MB/s, not 100 MiB/s. They can't be talking about the latter, the damn bus is only 100 MHz. You can't transfer 104857600 bytes in 100000000 cycles.
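
    The numbers in this argument are easy to check. A quick sketch, where the constant names are mine and the only inputs are the figures already quoted above (480 megabits a second for USB 2.0, 100 MB/s for ATA-100):

```python
MB = 10**6   # decimal megabyte, the unit used for transfer rates
MiB = 2**20  # binary mebibyte, 1048576 bytes

# USB 2.0: 480 megabits per second, 8 bits per byte.
usb2_bytes_per_second = (480 * 10**6) // 8
print(usb2_bytes_per_second / MB)   # 60.0 decimal megabytes per second
print(usb2_bytes_per_second / MiB)  # a bit over 57 in binary units

# ATA-100: the two readings of "100 megabytes per second".
print(100 * MB)   # 100000000 bytes
print(100 * MiB)  # 104857600 bytes, more than 100000000 one-byte transfers can carry
```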

    --
    If corporations are people, aren't stockholders guilty of slavery?
  168. Re:Being right doesn't stop you being a pedant (^_ by sjames · · Score: 1

    Marketing bullshit, pure and simple; in fact, I propose the phrase "marketing gigabyte", just to make it absolutely clear which definition is in use...

    Personally I prefer 'weaselbyte' and 'rabidweaselbyte'. Marketing screwed the whole industry over just to make their numbers look good, I propose we use these terms to make them absolutely truthful again.

  169. Re:Err... "lying" is the default setting. RTFM. by mikefe · · Score: 1

    Clarification:
    The drives may need to cache the write, but the issue here is whether they also fail to flush their cache when a sync command is sent over the IDE bus.

    --
    There: Something at a specific location.
    Their: Owned by someone.
    Please make sure your english compiles.
  170. The world is coming to an end. by EvilStein · · Score: 1

    How the hell did this get modded "Informative" anyway?

    Slashdot sense-o-humor meter:
    E[\..........]F :P

  171. Re:Being right doesn't stop you being a pedant (^_ by barawn · · Score: 1

    How hard is it to say 1.8 GB/s?

    Let me repeat this again. A 2 GHz link that transfers 1 byte on every clock cycle.

    Two giga-transfers of 1 byte per second.

    Giga is G. Transfers are unitless. Byte is B. Per is /. Second is s.

    That's 2 GB/s.

    Units are standards. Yes, we've been screwing around with them for a while. It's time for us to grow up and act like adults.

    This isn't different than everyone else. This is the same as everyone else.

  172. +5, Insightful my arse by Anonymous Coward · · Score: 0
    This turned into the "my computer isn't doing what I want it to do, which is turn the F off" at which point the consumer simply reached down and yanked the power cord. Try writing a routine for this routine!

    From The Checkpoint Mechanism in KeyKOS by Charles R. Landau:

    Key Logic developed a prototype UNIX-compatible system implemented on top of KeyKOS. At UNIFORUM '90, we demonstrated this system by literally pulling the plug on the computer at random. Within 30 seconds of power restoration, the system had resumed processing, complete with all windows and state that had been on the display. We are aware of no other UNIX implementation with this feature today.
    From EROS: A Novel Combination by Jonathan Shapiro.

    At the 1990 uniforum vendor exhibition, key logic, inc. found that their booth was next to the novell booth. Novell, it seems, had been bragging in their advertisements about their recovery speed. Being basically neighborly folks, the key logic team suggested the following friendly challenge to the novell exhibitionists: let's both pull the plugs, and see who is up and running first.

    Now one thing Novell is not is stupid. They refused.

    Somehow, the story of the challenge got around the exhibition floor, and a crowd assembled. Perhaps it was gremlins. Never eager to pass up an opportunity, the keykos staff happily spent the next hour kicking their plug out of the wall. Each time, the system would come back within 30 seconds (15 of which were spent in the bios prom, which was embarrassing, but not really key logic's fault). Each time key logic did this, more of the audience would give novell a dubious look.

    Eventually, the novell folks couldn't take it anymore, and gritting their teeth they carefully turned the power off on their machine, hoping that nothing would go wrong. As you might expect, the machine successfully stopped running. Very reliable.

    Having successfully stopped their machine, novell crossed their fingers and turned the machine back on. 40 minutes later, they were still checking their file systems. Not a single useful program had been started.

    Figuring they probably had made their point, and not wanting to cause undeserved embarrassment, the keykos folks stopped pulling the plug after five or six recoveries.

    In the end, the issue comes down to this.

    Suppose you had perfect software and hardware (if you find some, be sure to let us know). Even so, your computer will fail four to five times a year due to random background radiation.

    So when your computer fails, do you want to be told that all your files are intact and you can now resume your painstaking work (having lost your latest session), or would you rather have all of your windows, (complete with word processor, web browser, and solitaire) reappear with a few minutes lost work. Take your pick.

    In other words, yes, some people actually have written a routine for yanking the power cord ... back in the 1980s. Your point again?
  173. Re:Author lied when implied that DRIVES are the is by Sinner · · Score: 1
    why the hell my informative parent post gets modded to only a "2" just because people do not like the truth is astounding.
    Have you ever considered the possibility that you might be wrong? Your position seems to be "yes, drives have been ignoring sync commands for years, but it's not a bug, it's a FEATURE, and anyway it's all the driver writer's fault, and oh sorry you didn't know you were supposed to use undocumented calls if you want your database's ACID features to actually work? but you're an idiot anyway, you couldn't be trusted with these technical matters, next thing you'll be trying to ensure your email server actually maintains the guarantees mandated in the SMTP protocol, GOOD LUCK! HAH!"

    I'm guessing that's why you're only a "2".

    --
    fish and pipes
  174. Dammit by Evil+Pete · · Score: 1

    Repeatedly used 10^9 instead of 10^6. Obvious brain crash there folks.

    --
    Bitter and proud of it.