Slashdot Mirror


Turing Award Winner On The Future of Storage

weileong writes "Ars Technica highlights an interview at ACM Queue with Jim Gray, a winner of the ACM Turing Award (among other things), conducted by one of the pioneers of RAID (among other things). Many issues are touched upon, including: "programmers have to start thinking of the disk as a sequential device rather than a random access device." "So disks are not random access any more?" "That's one of the things that more or less everybody is gravitating toward. The idea of a log-structured file system is much more attractive. There are many other architectural changes that we'll have to consider in disks with huge capacity and limited bandwidth." The actual interview has MUCH detail, definitely worth reading."

11 of 227 comments

  1. Huge disks by heironymouscoward · · Score: 5, Insightful

    If I look at the trends of the last decades, while disk sizes increase exponentially, the actual number of top-level objects I store on my systems increases only linearly, and quite slowly. True, I still store individual documents, but I also store AVIs, ISOs, entire photo albums that take gigabytes each.

    It's still random access: I can choose and access an object, even individual photos, without scanning through large amounts of unwanted data.

    --
    Ceci n'est pas une signature
  2. Ouch... by cybermace5 · · Score: 4, Insightful

    Frankly the interview was painful every time Dave Patterson said something. How many times does he have to ask questions about the concept of mailing a computer? "We mail computers because transferring over the Internet is too slow for these massive data transfers." "Are they computers?" "Yes." "Do you mail them?" "Yes." "It's like a movie." "Uhh ok." "Is it a whole computer that you mail?" "Yes, it is a computer full of hard drives." "Why don't you just use the Internet?" "Because it is too slow."

    --
    ...
  3. Troll in the article by panurge · · Score: 4, Insightful
    ...semi-seriously. Look at all the stuff about MySQL and Linux in the middle. It's as if a Microsoft Marketoid had suddenly taken over the interview. Or someone who didn't understand the difference between many thousands of developers working on Linux and the smaller number that work on MySQL.

    Apart from speculating as to whether this attempt at FUD was the real payload of the article, did it really say anything that most of us haven't already noticed? Whether Flash or fast SCSI, we could do with an intermediate layer of backing store, with faster random access than current IDE HDDs. And we are fast heading for removable IDE drives to be a better and cheaper tape replacement. And the Internet has limited bandwidth. I'm sorry, but you don't need a Turing prize to work any of that out.

    --
    Panurge has posted for the last time. Thanks for the positive moderations.
    1. Re:Troll in the article by mr_majestyk · · Score: 1, Insightful

      You're the troll. Jim Gray wrote the book on transaction processing. If there is anyone qualified to judge the capabilities of a DB system it is him.
      > Or someone who didn't understand the difference between many thousands of developers working on Linux and the smaller number that work on MySQL.
      WTF??? His comments show a clear understanding of how many developers are working on MySQL ("Twenty-five people can do a pretty full-blown system, and ship it, and support it, and get manuals written, and test it. The Postgress and MySQL teams are on that scale and likely represent the leading open-source DBMSes out there."). It's like you saw the word "Microsoft" and automatically assumed everything following would be Market-speak.

    2. Re:Troll in the article by leandrod · · Score: 2, Insightful

      > Look at all the stuff about MySQL and Linux in the middle. It's as if a Microsoft Marketoid had suddenly taken over the interview. Or someone who didn't understand the difference between many thousands of developers working on Linux and the smaller number that work on MySQL.

      He's correct as far as he goes.

      MySQL and MS SQL Server actually have the same problem, and it is called SQL; both even go downhill from there.

      SQL is simply too complex to implement properly, and it only gets worse when you start with a non-standard implementation. While MySQL benefits from a better OS to run on, it has the more fundamental flaws that its developers don't really understand data in general and the relational model in particular, and it has started with something that wasn't really SQL at all and not a DBMS at all; MS began with something that was a real if weak DBMS, and almost SQL already, and since has hired some pretty good guys and improved impressively.

      Eventually MySQL will reach maturity, and with more ports, a more varied and complicated legacy and less understanding, it will have a rougher time developing the future and supporting the past. Note that MySQL's current idea of the future is SAPdb, which is stuck at Oracle v7 feature parity and less than SQL-92 Entry Level compliance.

      Obviously PostgreSQL is a better base to build on. It was originally even better than SQL (Ingres QUEL was based on Codd's own Alpha); since it joined the SQL cult it has never been as unfaithful as MySQL is now or even SQL Server was; and PostgreSQL was always a real DBMS. Too bad the exposure it gets is so small that Mr Gray can't even spell its name or see its superiority.

      --
      Leandro Guimarães Faria Corcete DUTRA
      DA, DBA, SysAdmin, Data Modeller
      GNU Project, Debian GNU/Lin
  4. The hierarchical object file system by master_p · · Score: 3, Insightful

    One final thing that is even more speculative is what my co-workers at Microsoft are doing. They are replacing the file system with an object store, and using schematized storage to organize information. Gordon Bell calls that project MyLifeBits. It is speculative--a shot at implementing Vannevar Bush's memex [http://www.theatlantic.com/unbound/flashbks/computer/bushf.htm]. If they pull it off, it will be a revolution in the way we use storage.

    I've talked about it before. This guy thinks what Microsoft is doing is revolutionary. Come on, all you people, can't you see the problem with today's file systems? The problem is that the type information is lost!!! We need objects, and we need type information to be stored along with those objects!!! This is the only way lots of problems will go away.

  5. Defending Jim Gray by chrisd · · Score: 3, Insightful
    I didn't really read that as FUD or even invalid criticism of MySQL. Maybe I'm biased because of my previous work with Queue and since I have met Jim, but if you get the impression that Jim doesn't like MySQL (which I did not) then I would actually assume it is because he felt that way, not because of Microsoft. Jim is one of those guys who will never be looking for a job. His early work on databases was pivotal to the development of transactions and his work on fault-tolerant systems is legendary; he really is beyond reproach.

    Chrisd

    --
    Co-Editor, Open Sources
    Open Source Program Manager, Google, Inc.
  6. Three letters: F, U, and D by heironymouscoward · · Score: 4, Insightful

    Take this choice quote from the article:

    My buddies are being killed by supporting all the Linux variants. It is hard to build a product on top of Linux because every other user compiles his own kernel and there are many different species.

    Ain't it sweet? I count five lies:

    (1) people being killed by supporting (gasp) operating systems... gosh, horror and violence, not nice at all!

    (2) all the Linux "variants", are in fact pretty much one standard, LSB, with several skins

    (3) "hard to build a product on top of Linux", rather than, hmmm, Windows? Linux is incredibly easy to build for. I suspect the fact that it's very standard helps.

    (4) "every other user compiles his kernel"... maybe at Microsoft. I suspect less than 1 in 20 Linux users ever compiled a kernel.

    (5) compiling a kernel means you can't support it... WTF? The kernel is incredibly stable, since most changes are in external modules. And I can't remember a single case where a kernel change broke one of my apps.

    (6) (sorry, I was not counting well), "many different species"... well, AFAICS the only difference between the Linux distributions is that they have different packaging methods, different timelines as to their versions, and different UI tools for hardware detection, configuration, etc. Nothing at all that makes life hard.

    Look: I just installed Xandros, which is Debian with a nice face. On two different types of machine, and it installed without asking a single question about my hardware except whether the mouse was left or right-handed. Check my journal...

    Windows never worked this nicely. Where is the support issue?

    In the writing industry we call this "to condemn with faint praise".

    Yeah, Windows kinda works, I mean, it'll run Office without crashing too often, but it's just killing my buddies to have to maintain Win2K, WinXP, and even some older Win98 machines, not to mention we have a whole cupboard simply filled with driver CDs for every PC we have.

    --
    Ceci n'est pas une signature
    1. Re:Three letters: F, U, and D by Junks+Jerzey · · Score: 2, Insightful

      From the Devil's Dictionary:

      FUD: The sound made by someone attempting to wish away inconvenient facts.

      http://www.eod.com/devil/archive/fud.html

  7. He is right, but nothing to do with the kernel by brunes69 · · Score: 4, Insightful

    His basic idea is 100% correct, but the reason is all wrong. It *IS* much harder to develop an app for Linux's myriad flavours, not because of the kernel, but because every distro has its own versions of libraries. I work for a company that makes Linux software, and we only support RedHat, and even then only certain versions of RedHat. While our product would probably compile against any number of distros, and even the BSDs, we just don't have the time and manpower required to build, test, debug, package, and maintain 15 different releases for every sub-release or patchlevel we have in the product. With Windows products, at least (unless you are doing some lower-level stuff), if you build something you can be reasonably assured it will run on Windows 2000, or Windows XP, or Windows 2003. Not the same if you build something with RedHat 9 and try to run it on Debian or Suse, etc. And before you go on about "release a source package": not all companies release everything under the GPL; some want to keep their IP theirs, since they like to put some money on the table at night. It's definitely not FUD to say it is much more effort to develop and release cross-platform binaries on Linux than on Windows.

  8. Re:3 Terrabytes on a credit card? by jetkust · · Score: 2, Insightful

    This article claims some kind of software-based 8:1 compression scheme on binary data. Am I reading this wrong or does this seem a bit like nonsense?
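
    A rough sanity check on that kind of claim (not from the article): a lossless compressor can reach 8:1 on redundant input, but it cannot do so on arbitrary binary data, because high-entropy input is essentially incompressible. The sketch below uses Python's zlib purely as a stand-in, since the article's actual scheme is unknown; the helper name compression_ratio and the 1 MiB sample size are assumptions made for illustration.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Return original size divided by zlib-compressed size (hypothetical helper)."""
    return len(data) / len(zlib.compress(data, 9))


# Assumption: 1 MiB samples; zlib stands in for the unknown scheme.
SIZE = 1 << 20
random_ratio = compression_ratio(os.urandom(SIZE))           # high-entropy input
repetitive_ratio = compression_ratio(b"abcd" * (SIZE // 4))  # highly redundant input

print(f"random data:     {random_ratio:.2f}:1")
print(f"repetitive data: {repetitive_ratio:.2f}:1")
```

    The random sample should come out at roughly 1:1 (slightly worse, in fact, because of container overhead), while the repetitive sample compresses far beyond 8:1. So an 8:1 figure only makes sense as a statement about one particular kind of data, not about binary data in general.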