Slashdot Mirror


Vint Cerf: Data That's Here Today May Be Gone Tomorrow

dcblogs writes "Vinton Cerf is warning that digital things created today — spreadsheets, documents, presentations as well as mountains of scientific data — may not be readable in the years and centuries ahead. Cerf illustrates the problem in a simple way. He runs Microsoft Office 2011 on Macintosh, but it cannot read a 1997 PowerPoint file. 'It doesn't know what it is,' he said. 'I'm not blaming Microsoft,' said Cerf, who is Google's vice president and chief Internet evangelist. 'What I'm saying is that backward compatibility is very hard to preserve over very long periods of time.' He calls it a 'hard problem.'" We're at an interesting spot right now, where we're worried that the internet won't remember everything, and also that it won't forget anything.

6 of 358 comments

  1. We should have listened by Anonymous Coward · Score: 5, Insightful

    We're in a difficult spot right now because for years we ignored the warnings about 'proprietary file formats'.

    I'm not blaming Microsoft either. We let Microsoft do this to us of our own free ignorance.

  2. Re:So? by MrBandersnatch · Score: 5, Insightful

    I think you will find that there's a little-known branch of academia called "history" which sometimes takes a curious interest in even the most trivial of past information.....

  3. Yes, backwards compatibility, blah blah blah... by Narcocide · Score: 5, Insightful

    Yes, you're right I have this ASCII text file created in 1997 and I can't find anything to read it...

    OH WAIT ACTUALLY FUCKING *EVERYTHING* STILL READS IT.

    Stop gargling Microsoft's balls so much and wipe off your chin. Proprietary data formats are THE PROBLEM. Stop trying to redirect public discourse with this thinly veiled bullshit.

    1. Re:Yes, backwards compatibility, blah blah blah... by Anonymous Coward · Score: 5, Insightful

      But your EBCDIC documents are absolute rubbish now and the tools to convert them aren't commonplace any more.

      $ printf "\xC5\xC2\xC3\xC4\xC9\xC3\x25" | iconv -f ebcdic-us -t ascii
      EBCDIC
      $ dpkg -S `which iconv`
      libc-bin: /usr/bin/iconv
      $ apt-cache show libc-bin | grep -e Essential -e Priority
      Essential: yes
      Priority: required

      So we've got a program that can convert from EBCDIC-US to ASCII (or UTF-8 or whatever you want), that program is in an Essential/Required package on any Debian-based system, and for some reason you say the tools "aren't commonplace"?

      Are you on crack?
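
For anyone without iconv to hand, the same conversion can be sketched with nothing but Python's standard library. This is only an illustration, not what the poster ran: it assumes Python's built-in cp037 codec (IBM EBCDIC US/Canada) is a close enough stand-in for iconv's ebcdic-us for the uppercase letters in the sample, and the names EBCDIC_SAMPLE and ebcdic_to_text are made up for this example.

```python
# Sketch: decode the seven bytes from the printf command above.
# Assumption: cp037 stands in for iconv's "ebcdic-us"; the two code pages
# are not identical overall, but should agree on the letters used here.

EBCDIC_SAMPLE = bytes([0xC5, 0xC2, 0xC3, 0xC4, 0xC9, 0xC3, 0x25])


def ebcdic_to_text(data: bytes, codec: str = "cp037") -> str:
    """Decode EBCDIC bytes into a Python str using the named codec."""
    return data.decode(codec)


if __name__ == "__main__":
    # strip() drops the trailing line-end character (byte 0x25 in the sample).
    print(ebcdic_to_text(EBCDIC_SAMPLE).strip())
```

The codec name is a parameter so that a different EBCDIC code page (for example cp500) can be tried if cp037 turns out to be the wrong guess for a given file.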

  4. Re:emulation / virtualization by Anonymous Coward · Score: 5, Funny

    You're very clever, young man, very clever - but it's VMs all the way down!

  5. Re:My data will be readable by Bremic · Score: 5, Insightful

    Until HTML includes DRM and half the stuff you create ends up being unreadable.

    Well, really we are probably good for anything that can be opened in a text editor for a long long while; but the point is there. Anything can be lost to data format shifts.

    As someone who had to re-type an 80-page document because the company stopped using the software the document was created in, didn't have a licence for it, and no converter found online worked - I can say this does happen.

    How many people are going to shell out $600 for software to open something they want to make an edit on? How many are going to just give up and find someone to rekey it, or just give it up as a loss?

    With more and more systems including format locks, in 50+ years historians will likely have a lot of trouble finding out details from today. Kind of like it is now when we go to look at archival film from WWII and find it's all faded into obscurity. We have the same problems, just with different causes. Then it was lack of preservation of a medium with a limited lifespan. Now it's storing stuff in formats that will go away as they are improved upon, blocked, or just forgotten about.

    Sure, if you're in your 20s, or even 30s, you probably haven't realized the copy of your grandfather's photos is sitting on a floppy disk in a proprietary format. But when you get older you may encounter these issues.