Slashdot Mirror


Dell Says 90% of Recorded Business Data Is Never Read

Barence writes "According to a Dell briefing given to PC Pro, 90% of company data is written once and never read again. If Dell's observation about dead weight is right, then it could easily turn out that splitting your data between live and old, fast and slow, work-in-progress versus archive, will become the dominant way to price and specify your servers and network architectures in the future. 'The only remaining question will then be: why on earth did we squander so much money by not thinking this way until now?'" As the writer points out, the "90 percent" figure is ambiguous, to put it lightly.

11 of 224 comments

  1. Coincidence? by Hognoxious · · Score: 5, Funny

    90% - just like the percentage of statistics that are made up on the spot.

    --
    Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    1. Re:Coincidence? by dov_0 · · Score: 5, Funny

      Or is Dell about to make a press release about faulty storage in their servers resulting in about 90% data loss?

      --
      sudo mount --milk --sugar /cup/tea /mouth /etc/init.d/relax start
    2. Re:Coincidence? by espiesp · · Score: 5, Funny

      Or having developed a new memory technology.

      "Dell releases a new drive based on their patented WORN architecture. Because this device forgoes the need to read your data they can be made lighter and faster and more power efficient than even the latest SSD drive technology."

  2. Which 90% ? by mbone · · Score: 5, Insightful

    I could believe the 90% number. There is plenty of data sitting around in case it is needed. Some of it will be needed. Much of it won't be. How do you predict which is which?

    1. Re:Which 90% ? by eldavojohn · · Score: 5, Insightful

      I could believe the 90% number. There is plenty of data sitting around in case it is needed. Some of it will be needed. Much of it won't be. How do you predict which is which?

      Yeah, as someone who has implemented a few auditing solutions where I work, I must confess that it seems that 99% of the data we archive is never looked at again. A lot of it is kept because of policies and is only used after something goes dreadfully wrong. If they are well thought out, the metrics can be collected as the data is written instead of needing to search across the data.
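A rough sketch of collecting metrics as the data is written, in Python. The `AuditLog` class and its field names are made up for illustration and are not from any real auditing product; the point is only that the counters are updated on every write, so the archived entries never have to be scanned to answer the common questions.

```python
from collections import Counter


class AuditLog:
    """Hypothetical append-only audit log that keeps running metrics."""

    def __init__(self):
        self.entries = []                 # the archive that is rarely read again
        self.events_per_user = Counter()  # metric maintained at write time
        self.events_per_type = Counter()  # metric maintained at write time

    def write(self, user, event_type, detail):
        # Store the full entry for later investigation.
        self.entries.append((user, event_type, detail))
        # Update the metrics now instead of searching the archive later.
        self.events_per_user[user] += 1
        self.events_per_type[event_type] += 1
```

Reading `log.events_per_type["login"]` is then a dictionary lookup, while the raw entries stay available for the incident nobody predicted.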

      I think their "90% dead-weight rule" is really a misnomer as you could probably claim that 90% of Google's indexing is never read but we all know that it's the potential that data holds that makes it so valuable and necessary. If Google knew every future possible search then they could delete the data they will never use ... but how do they know they will never use it? How do I know that the auditing data will never have a use--by new metric or incident investigation? The truth is simply that you don't.

      --
      My work here is dung.
  3. It's like Office features by drinkypoo · · Score: 5, Informative

    People always bitch that they have to pay for Microsoft (or whatever) Office's features because they only use 5% of its functionality. But you buy all those features at once because you don't know which you will need in the future. Data warehousing is the same way. If you start taking data offline you'll just need that data. That's why analyses of very large data sets are performed before archiving.

    But what is really wanted is a way to cluster the database servers, with old data automatically cycled to the slowest, most remote nodes, and with the most frequently-altered data heavily replicated and aggressively synchronized.
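A minimal sketch of that kind of age-based tiering, in Python. The tier names, the 30-day and 365-day thresholds, and the `Record` type are all assumptions picked for illustration; a real cluster would also weigh write frequency and replication, which this leaves out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical tiers and age limits, chosen only for illustration.
TIERS = [
    (timedelta(days=30), "fast"),   # accessed within the last month
    (timedelta(days=365), "slow"),  # accessed within the last year
]
ARCHIVE_TIER = "archive"            # anything older than every limit above


@dataclass
class Record:
    key: str
    last_access: datetime


def pick_tier(record, now):
    """Return the tier for a record based on the time since its last access."""
    age = now - record.last_access
    for max_age, tier in TIERS:
        if age <= max_age:
            return tier
    return ARCHIVE_TIER


def plan_migration(records, now):
    """Group record keys by the tier they should be moved to."""
    plan = {}
    for record in records:
        plan.setdefault(pick_tier(record, now), []).append(record.key)
    return plan
```

Running `plan_migration` periodically gives a list of keys per tier; actually moving the data between nodes is the hard part and is not shown.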

    --
    "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
  4. This isn't a 'new way of thinking' by sirwired · · Score: 5, Insightful

    Automated Hierarchical Storage Management has literally been around for decades. It may be new-ish on low-end crap x86 servers, but for say, mainframe users, it isn't new at all.

    What is new is the available implementation choices. When your tier choices are between enterprise disk and enterprise tape, you are biased towards keeping data on disk; there are still use cases for HSM with only high-end disk and tape, but they aren't as great. Now with lower-cost disk available, you have a cheap disk choice too, with fairly reasonable access time.

    SirWired

  5. Re:The problem is "Write-only" applications by mikael_j · · Score: 5, Insightful

    The problem is, that IT people dream up all these "write only" applications that record data, without any rational plan for what the data might actually be used for in the business.

    These plans mostly come into being because us "IT people" (read: developers) know that the "business people" love changing the specs and they'll blame us if they want to start using data they didn't ask us to save and we tell them we can't save data retroactively (really, they'll basically blame the developers for not being able to time-travel). This is why we'd rather save everything than not save enough.

    --
    Greylisting is to SMTP as NAT is to IPv4
  6. dell's new line of fire extinguishers coming soon! by drfireman · · Score: 5, Insightful

    Over 92% of fire extinguishers will never be used, we could probably save a bit of space by having the unneeded ones stored off-site, or in less accessible corners of the garage.

    Slightly more seriously, we can certainly answer this question posed by the linked article easily: "why on earth did we squander so much money by not thinking this way until now?" The answer is: because you are a moron. Anyone who has given even a moment's thought to storage has known this, either implicitly or explicitly, for a long time. So whoever's included in your "we," Steve Cassidy, is just profoundly stupid. I think that quite easily explains why you all squandered so much money by not thinking about this. Next question?

  7. Re:which 90% by Koby77 · · Score: 5, Informative

    I worked in a call center, and I can definitely believe that 90% of the data is never read again. However, when a customer is calling back (and is angry!), you don't have time on a live call to wait to see what's up with the account. Also there can be some litigious aspects, and a lot of information was recorded for C.Y.A. purposes. Again, you never know which part is needed for C.Y.A. purposes, but that 10% sure is valuable.

    So yeah, we needed to store ALL the account information, and we needed fast access to ALL of it ALL the time.

  8. Solutions: by drolli · · Score: 5, Interesting

    a) Forbid *unmanaged* storage of documents. If the question "where is the most up-to-date version of this document stored?" can be systematically and easily answered, then people can delete the crap from their laptops.

    b) Forbid in-company attachments to mails. If the last version can be easily found, including the revision history, a link to this revision is worth *more* than the current state of the document. Most of the space in my inbox is taken up by totally useless attached documents.

    c) Forbid the use of formats unsuitable for storing a certain kind of information. (Where I work, they use PowerPoint/Word files for electronic forms.)

    d) Provide a good archiving and backup service. Besides the quality improvement from using a service, this also prevents the 100th copy of some data being made in some unsystematic way (forbid this explicitly).

    e) Thin clients. Store the data on a server. Deduplicate.

    f) I would expect that most of the documents in a company can (and should) be stored in a database.
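
A small sketch of the "deduplicate" step from e), in Python. `DedupStore` is a made-up in-memory class for illustration, not a real product: it keys each document by the SHA-256 digest of its content, so the 100th copy of the same attachment costs one more name entry rather than another copy of the bytes.

```python
import hashlib


class DedupStore:
    """Hypothetical in-memory store that keeps identical content only once."""

    def __init__(self):
        self._blobs = {}  # SHA-256 hex digest -> content bytes
        self._names = {}  # document name -> digest

    def put(self, name, content):
        digest = hashlib.sha256(content).hexdigest()
        # Only store the bytes if this content has not been seen before.
        self._blobs.setdefault(digest, content)
        self._names[name] = digest
        return digest

    def get(self, name):
        return self._blobs[self._names[name]]

    def stored_bytes(self):
        """Total size of the unique content actually kept."""
        return sum(len(blob) for blob in self._blobs.values())
```

Whole-file hashing like this only catches exact duplicates; two revisions of a document that differ by one character are stored twice.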