Vint Cerf: Data That's Here Today May Be Gone Tomorrow
dcblogs writes "Vinton Cerf is warning that digital things created today — spreadsheets, documents, presentations as well as mountains of scientific data — may not be readable in the years and centuries ahead. Cerf illustrates the problem in a simple way. He runs Microsoft Office 2011 on Macintosh, but it cannot read a 1997 PowerPoint file. 'It doesn't know what it is,' he said. 'I'm not blaming Microsoft,' said Cerf, who is Google's vice president and chief Internet evangelist. 'What I'm saying is that backward compatibility is very hard to preserve over very long periods of time.' He calls it a 'hard problem.'"
We're at an interesting spot right now, where we're worried that the internet won't remember everything, and also that it won't forget anything.
I think that given MS Office and LibreOffice are in XML, it shouldn't be difficult at all to reverse engineer in the future.
Careful with names containing L slashdot.org/~AiphaWolf_HK slashdot.org/~AlphaWoif_HK slashdot.org/~AiphaWoif_HK
My data will be readable because I use bog-standard formats. If I get really froggy I use HTML, and you can just strip the tags and read that.
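For what it's worth, "just strip the tags" really is only a few lines. A minimal sketch using Python's standard-library `html.parser`; the class name `TextExtractor`, the helper `strip_tags`, and the sample markup are made up for illustration:

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text between tags and ignores the tags themselves."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for every run of plain text found between tags.
        self.chunks.append(data)


def strip_tags(html):
    """Return only the text content of an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.chunks)


# Hypothetical sample document, not taken from any real archive.
sample = "<html><body><h1>Notes</h1><p>Still <b>readable</b> later.</p></body></html>"
print(strip_tags(sample))
```

This keeps the words and throws away layout, which is the trade-off being described: the content survives even if the presentation doesn't.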
If his data won't be readable, that's his problem. Anything you want to save for posterity, export it now.
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
Support emulator/VM developers! Encapsulate your entire machine in a VM and you can run the entire software stack if necessary. Anything you need convenient access to, export to CSV, XML or some other standard format.
I run: Windows, OS X, Linux, FreeBSD. Just because you have a hammer, doesn't mean everything is a nail.
We're in a difficult spot right now because for years we ignored the warnings about 'proprietary file formats'.
I'm not blaming Microsoft either. We let Microsoft do this to us of our own free ignorance.
I think you will find that there's a little known branch of academia called "history" which sometimes takes a curious interest in even the most trivial of past information.....
Yes, you're right I have this ASCII text file created in 1997 and I can't find anything to read it...
OH WAIT ACTUALLY FUCKING *EVERYTHING* STILL READS IT.
Stop gargling Microsoft's balls so much and wipe off your chin. Proprietary data formats are THE PROBLEM. Stop trying to redirect public discourse with this thinly veiled bullshit.
If there is a demand to open up and view a certain file type there will always be someone to create an app or website which will either open up the file or convert it to a more compatible format. There are already services out there that convert word to pdf for example oh and I just found an iPhone app for converting files, yay!
Anveto
Man, fuck the future (that's right you historians-not-yet-born). They have all the flying cars and meal-in-a-pill's and immortality clinics and shit. The hell have they done for us to deserve our sympathy? If that means we can make them have to work that much harder to see how life was now, I say do it.
Now back to my zombie virus work. Anybody got a decent time capsule for me to use?
I read TFA and all I got was this lousy cookie
A perfect example of this is basically the issue of old video games. (I may as well bring this up because it's going to come up)
Recently, the Internet Archive stored a whole pile of TOSEC collections of games from various old systems (thanks to their DMCA exemption as an archival repository, which lets them do this legally). That is data and information that would otherwise have been completely lost into a digital black hole, if it weren't for the fans of the systems and the dedicated teams of people collecting and amassing this software as a hobby.... in breach of copyright.
The problem with DRM is that without dedicated crackers and pirates, unless the original rights holders are around long enough to resell old titles for that long (which most aren't), old games will simply disappear into a digital copyright black hole and never be seen again. This happens once the computer/console system is old, not sold anymore, and forgotten about, and the media degrades and isn't backed up in some form (in breach of EULA). If people aren't able to collect the software and hang on to it, preserving/duplicating the media while still in copyright, it's going to vanish. Culturally significant games will be lost forever, and that, if anything, is as much a crime as pirating software in the first place.
It's only due to the efforts of an army of swappers/crackers, etc, that most of the old games on old systems were even preserved.
The steam model on PC is quite good though as it makes a few compromises where you can actually make backups and go offline if you want.
For old computers and consoles, however, this doesn't apply... and with some more restrictive attempts to squash the used game market, and force internet-always-connected authentication on upcoming consoles to even play the game... one has to wonder if the game companies deliberately want to squish all traces of their old work, let it disappear into the ether, and resell you this year's football game which is just like last year's. I fear that this is where we are headed (if we aren't there already).
READY.
PRINT ""+-0
Print Everything!
Problem solved.
Saw info on a book on this topic today, in fact: http://filesthatlast.com/about/ . Looks interesting so far.
We're living in what could well be a future dark age for archeologists / historians. Hardly anything is put into a nice hard format (stone is incredibly rare and metal gets stolen) for someone to find. What's left suffers from incompatible file formats, acid based paper that decomposes, bit rot, cryptography, incompatible technology for data storage and worst of all DRM. With DRM you have active measures that try to prevent something from being usable.
In the old days people stopped use with armed guards, obfuscation and primitive crypto. Today we have servers that are required for operational functionality for many products. With the advent of the cloud you have reasons for storing things where you have a dependency on a third party. How many services that are cloud / server based have come about and gone tits up?
Even having a large well known brand name doesn't protect you from having a server shut down. Just think of Microsoft's play4sure service that lasted less than a decade. Having a license and a physical disk isn't that helpful when the DRM requires an authentication server that doesn't exist. With the movement to put more and more DRM into the cloud or with SSL certificates (again dependent upon servers and naturally time bombed) this is going to be a problem that will only grow worse.
Learning to break DRM is far more critical than file formats which require nothing more than a conversion tool.
Digital archival is one of the HARD problems. Over the last 40 years we have already lost more cultural artifacts than were created in the entirety of human history before that. A great deal of that is useless garbage of course, but the original moon landing tape? 1000s of government emails revealing exactly what was going on at pivotal times in history?
The truth is, we need systems for hardcopy; digital is too transient; emulators are a useful stopgap measure but don't protect against the kinds of catastrophic failures that we will likely see over the longer time frame; and we need indexing because someone at some point will want to wade through our digital detritus.
This has been true of all technology in the past and will continue into the future. Just look at film. How many preserved films from 1915 are still around? Just the ones that were recorded into a new format of film, then a newer format of film, then onto VHS, then onto LaserDisc, then a DVD, then a Blu-ray... (Metropolis, I am looking at you.)
Within arm's reach, I have floppy disks that contain files created in the AMI Pro word processor.... When I say floppy, I am talking about the 5 1/4 inch floppies.
Technology hardware and software is not stagnant... It will always continue to develop and progress (ignore windows 8). Data that is worth keeping will get converted. Data that isn't will get left behind. I would not be surprised that in about 25 years, there will be "classic" software as there is Classic literature...
Too much typing.. going back to drinking.....
~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
"First things first -- but not necessarily in that order"
-- The Doctor, "Doctor
The IRS wants to audit me, going back several years. I kept the records as required but they are unreadable now.
Thanks Microsoft!
Have gnu, will travel.
That people in the far future would be getting smarter to accomplish this - probably a tossup - and apart from it, it's very questionable if a far future for humanity even exists, the way "humanity" is behaving these days/years/decades/centuries/millennia....
Maybe there are smarter robots by then babysitting...
I think you will find that there's a little known branch of academia called "history" which sometimes takes a curious interest in even the most trivial of past information.....
Even if you don't care about the historians, I'm sure the lucky people who have the pleasure of handling property deeds at your local governance hive can tell you a story from within the last week or two about needing to pull some rather seriously dusty documents to allow a present-day transaction to go through without incident.
Many data will, indeed, be of no interest at all, or the same historical interest that neolithic refuse dumps are; but data in the nontrivial-number-of-decades range are still live in more than a few contexts.
I use Github Flavored Markdown. Thousands of years in the future, archaeologists will no doubt work furiously to decode my etchings upon a stone tablet, which will read: "# IF YOU CAN READ THIS YOU'RE A GEEK #" .
XML doesn't magically solve everything in this regard. If there's no good documentation for the format, it's unlikely you'll be able to display everything exactly as intended. Likewise, if the format is hideously complex (see: Microsoft Office Open XML) or there are bugs in the de facto implementation, it's going to be tricky to reverse engineer.
I'd also point out that MS Office spits out compressed XML. I believe it's based on ZIP, which is very well documented, but that's yet another hurdle to cross. And then you have to deal with the character encoding of the XML itself -- ASCII, UTF-8, etc.
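To put a rough size on those two hurdles, here is a minimal sketch that unpacks a ZIP container and parses the XML inside it with nothing but the Python standard library. The package is a tiny made-up stand-in built in memory, not a real Office file; the member name `content/document.xml` and the `<doc>`/`<p>` element names are assumptions for illustration only, and real Office markup is far more involved:

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def build_sample_package():
    """Build a tiny ZIP-with-XML container in memory (a stand-in, not a real .docx)."""
    xml = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        "<doc><p>Hello from 2013</p><p>caf\u00e9</p></doc>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        # Hypothetical member name; real formats define their own layout.
        zf.writestr("content/document.xml", xml.encode("utf-8"))
    return buf.getvalue()


def read_paragraphs(package_bytes, member="content/document.xml"):
    """Decompress one member and return the text of each <p> element."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        raw = zf.read(member)  # hurdle 1: ZIP decompression
    # hurdle 2: passing bytes lets the parser honor the encoding declaration
    root = ET.fromstring(raw)
    return [p.text for p in root.iter("p")]


print(read_paragraphs(build_sample_package()))
```

The container and the encoding are the easy part; knowing what the elements mean and how to render them is where the documentation problem bites.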
There's no -1 for "I don't get it."
For open source. Save your files in open and/or openly defined, standardized formats and there will always be software that can deal with it.
But I guess it's difficult for people to hear you explain that to them with their head up their ass.
I would solve this by installing a Windows XP VM with a copy of Office XP. Now that I solved Google's hard problem they must now see I am qualified to work there. Google is on a FUD rampage of which the likes I haven't seen since the great Microsoft FUD storms.
Ok, so how do you retrieve your photos that you stored on that 8inch floppy disk... 10 years from now?
That is a gross exaggeration but is an analogy to the point of the article. Without proper protections, all the information, notes, white papers, studies, etc. will be useless if there doesn't exist technology that can read it.
In a worst case scenario how would humankind rebuild and not forget what was previously learned (e.g. dark ages we already experienced).
Still haven't found a description of the character set in which octal 222, 223, and 224 are right single quotation mark, left double quotation mark, and right double quotation mark.
Anybody know this one?
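One way to check a candidate is to decode those three bytes and ask Python what the resulting characters are called. A small sketch, assuming the Windows code pages are the suspects (`cp1252` and `cp1250` are Python's names for Windows-1252 and Windows-1250; the helper `describe` is made up here):

```python
import unicodedata

# Octal 222, 223, 224 are the bytes 0x92, 0x93, 0x94.
raw = bytes([0o222, 0o223, 0o224])


def describe(raw_bytes, encoding):
    """Decode the bytes with the given codec and return the Unicode character names."""
    text = raw_bytes.decode(encoding)
    return [unicodedata.name(ch) for ch in text]


for enc in ("cp1252", "cp1250"):
    print(enc, describe(raw, enc))
```

Both code pages map those bytes to the right single, left double, and right double quotation marks, so the Windows-125x family fits the description.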
MS removed the PowerPoint 4.0/95 converters completely with Office 2007 for Windows and later, and disabled them by default in Office 2003 SP3. And the PowerPoint 4.0 converter (but not 95) was disabled by default instead of fixed with MS09-017.
On the Mac, they removed them even earlier, when they ported Office to Carbon.
IMO it would be a good idea for MS to package PP4X32 and PP7X32 from PowerPoint 2003 separately, along with a utility to call the converters of course.
For a supposedly smart guy, he seems a bit silly:
He could've just downloaded MS's Powerpoint 97 viewer
I don't respond to AC's.
I think the user was either using PowerPoint 4.0 for Mac or did not upgrade to Office 97 immediately.
If not, file a bug and send in the document. The power of freedom ...
Quite likely. I had some old Word for Mac documents of scientific papers I wrote in the 90s, and the only way I was able to recover them a few years ago was to install a Windows 3.1-era copy of Word for Windows.
I remember over two decades ago there was talk of making data objects, that is, data that knew how to present an object interface to get at its information. The data would contain its own reader in some ubiquitous language. But wait, we never got a ubiquitous language. Perhaps JavaScript today? But if you want to solve this problem then this is how to solve it. Or perhaps you could just package a converter to convert format XYZ to BSON as being good enough, or at least better than today's breakage.
One thing that really burns me is having my information that I created / entered / caused to be locked up in some proprietary opaque format, especially if owned by one and only one app.
Have you tried disabling the file blocks first? At least Word for Mac 4.x and 5.x can be read this way.
Some are glass plate Daguerreotypes. Somehow, I am not too confident that my digital pictures will be legible 150 years from now, unless I make a good quality print on archival paper. Digital files are too easily corrupted and made totally useless. Media formats will change. 8" floppies anyone?
"Do the Right Thing. It will gratify some people and astound the rest." - Mark Twain
Note they also sometimes drop support for old formats too:
https://bugs.freedesktop.org/show_bug.cgi?id=59902
We're still able to restore cars from the 80s and earlier as the cars were fully mechanical or hydraulic. No computers.
Fast forward to 20 years from now: nobody's going to be carrying the computer boards for a 2004 Toyota Prius or a 2013 Tesla.
However, you'll still be able to restore your grandfather's '57 Chevy...
I presented a solution to this long-standing problem last year to the Denver HTML5 Meetup.
Code should never be separated from data. This is possible with HTML5, JavaScript, and open source.
In the presentation, I steal and repurpose Hofstadter's analogy of DNA to an LP vinyl record, which is an information bearer, but useless without its information retriever (the record player). Like the cell of an animal, which contains both DNA and the means to "play" it, I ask why not the same with software?
My maxim is: data should always carry the code with it to play itself. It was inspired from the field I've spent 50% of my career in: non-destructive testing where, for example, X-Rays and ultrasounds are performed on safety-critical industrial parts with 50-year service lives. If one of those parts fails and kills someone, you're going to want to go back into the old data and find the earliest indication of the flaw or fault and reinspect every other part in the world like it that is still in service. And maybe you need to go back 50 years. Under such a context, not providing the code with the data could be considered an act of gross neglect.
In my presentation, I use the 1990s-era trick of embedding XSL into an XML file, with the addition of the XSL now being able to use HTML5/JavaScript. Sadly, I've only gotten it to work with Firefox -- the other browsers consider it a security violation.
https://en.wikipedia.org/wiki/Windows-1250
Professional Wild-Eyed Visionary
Yields 4 results in Ubuntu. You can search reputable open source archives on the web, too.
How deep are your pockets?
*IBM Consulting*
Um, really???
*spoilers*
"A person is smart. People are dumb, panicky dangerous animals and you know it." - K
I've been part of archival problem planning. We went with DVD. Now that I am not there, I suspect they are thinking DVD sucks and are moving "forward" when the DVD was more than good enough and those plastic discs will last a century. MPEG-2 files will have open source decoders. Now physical readers will still be a problem... the only solution is to wait as long as possible and then switch to the next long-lasting format - but not necessarily the newest one at that time. (Which is why moving to Blu-ray is a waste of money.)
The biggest problem with other formats is the FORMAT; even with something like open office documents, the ODF format will have revisions and new features added and tweaks to the format. version 2, 3 etc. The features and changes that promote the creation of more and more formats is the biggest problem. Just like my above DVD video problem- if you go beyond your needs then you are complicating things with more and more formats.
TEXT? sucks. we need WORD! Word 1.0? the app sucks... we need WORD 20! (and all versions in between to migrate the old docs...plus labor to deal with conversion issues...)
Perhaps we need ARCHIVAL formats, like PDF, which has done fine apart from the stupid additions Adobe has been making to it. Or just TEXT export... a less bloated, output-only format without the feature BS problems.
Thankfully, email remains the same... sort of. although storage of the emails differs greatly; if you want to archive emails you need to pick a close-to-the-source method (and simple storage filesystem-- good luck reading that NTFS formatted disk image in 30 years.)
Democracy Now! - uncensored, anti-establishment news
Seriously, why would Vinton Cerf not blame Microsoft? They have an extremely poor track record with backwards compatibility, and I don't think they even know what forwards compatibility is. If you design the data formats correctly then you can keep things usable for decades (or centuries). Guess what, twenty year old TeX documents still work, and yet Word X won't work with Word X-2. I've pulled runoff documents off of 70's versions of Unix that can still be printed. That says to me that one can deal with compatibility issues.
This is all intentional on Microsoft's part too. They make money when customers buy new copies of software, so it is in their best financial interests to make sure that customers have significant pressure to upgrade. I remember the solution to an acknowledged bug for Word 97 was to make sure that everyone who was going to read your document had the appropriate Word 97 plug in in their older version of Word. I completely blame Microsoft here.
This is not that hard a problem, IF the company pays attention to it and gives it even a small amount of priority.
You don't even have to install Word for Windows from that era. WinWord 2.0 will run as a stand-alone binary. Just the Winword.exe file by itself will run. And it's less than 1.44M in size so you can just have it on a floppy diskette. On any 16 or 32-bit Windows machine, of course. It even includes that era's VBA so you can use the winword.exe binary as a portable 'execution environment' sort of.
He specifically stated that he re-backs up every year. I don't go that far, but I have data going back as far as the early 90s that started on large floppies, migrated to smaller floppies, then to CD-Rs, and now lives on external hard drives. It isn't too hard to keep formats alive. (Also note that on the hard drives I keep VMs with older OSes able to read formats that I have not found a way to convert, which isn't many.)
have you seen my sig? there are many others like it but none that are the same
This problem isn't new to anyone. If it's new to you, then you need to get involved in the digital preservation movement.
http://en.wikipedia.org/wiki/Digital_obsolescence
Kriston
Vint, that's bullshit and you know it. It's nothing more than preserving syntaxes, grammar, file formats. That's not hard, and it only requires someone to create a format conversion ONCE to solve the problem at each stage of the evolution.
The real problem here is proprietary non-public formats and structures. When the structure of data has been a closely guarded secret and requires reverse engineering that may not even yield a perfect result, THAT is hard.
WordBasic, actually. What is fun BTW is to unblock Word 6.0/95 formats in 2010 and later and open a file with WordBasic like SCANPROT.DOT.
No! Fail! You don't get it!
1) Code is data
2) Code is data that is especially hard to interpret
3) One of the main reasons for all this mess is that in all those proprietary formats, data is intermixed with code, and the whole mess is very hard to parse.
Data should be kept completely isolated, as far away from code as possible. That way, if you cannot interpret the code any more, you will still be able to analyze and parse the data. You know, it is not that hard to construct a record player.
AccountKiller
My first LaTeX publications from 20 years back and all my human-readable ASCII scientific data can still be read and used without any problem. Human-readable file formats in the UNIX tradition completely solve this problem.
This problem is only hard if the people making the data formats are either stupid or do not want their formats to be easily accessible to other applications, as Microsoft does. Of course, others are creating just as fundamentally broken formats for either of the same reasons.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
You can get emulators for just about every machine you can imagine: PDP-10, PDP-11, DOS, Atari, Amiga, C64, microcontroller, etc. You can get hardware emulators with FPGAs if you like. Almost any important format is documented or has been reverse engineered. Yes, you can easily read 1997 PowerPoint files, even if his weird choice of Office on Mac can't. And that's only with current technology. Give it a few decades and computers will just automatically perform even the most complicated data conversions behind the scenes. "Computer, scan the 1997 floppy and put the data on screen."
I was guessing I wouldn't find Kool-Aid Man for Atari 2600 but, sure enough, it's out there.
Who hurt you? :-(
Sand's overrated... it's just tiny little rocks.
Backward compatibility is not a hard problem, Vint Cerf just isn't very good at it as evidenced by the IPv6 fiasco.
When all you have is a hammer, every problem starts to look like a thumb.
What's he doing keeping stuff in MS apps for? Then when they don't work 5 years later he's all like OMG THE NET WILL BREAK.
Idiot. He knows better. Or should.
Need Mercedes parts ?
Sure I am sometimes saddened at the thought of the video games of my youth being lost forever, but even if they weren't it wouldn't recapture the joy I felt upon encountering them at the time. Do you think you are more important than that? Think of the current year and then start going back a decade at a time and name one person you know of from that time. How long before you run out of people you know personally? Before you run out of people you have even heard of? I bet most people can't even make it a century. Millions of men fought in the world wars, many of their stories are still recorded. How many people bother to look at even one? My grandfather recounted a story of seeing the first automobiles in his town, how many people even think of a time when they didn't exist, or the time when they were new to the world? Precious few I reckon.
If you want to worry about what history will think of THIS time, perhaps you should be a more careful custodian of previous ones.
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
I have simulation programs trapped in Working Model for Mac format. I have 3D animation projects trapped in Softimage 3D for Windows NT. Neither is easily convertible to anything else. (Worse, they're on DAT tapes.)
Images, video, audio, and text documents are easy to convert because there are modern formats that directly correspond to them. But some things don't translate well.
In my presentation, you'll see that the strategy of embedding XSL in an XML file has the code in the top half and the data in the bottom half, clearly delineated. They are easily separable. But by having them in a single file, they will not get separated by someone copying them.
Old samplers are rather a victim of that. The hardware is often fine and can still crank out some awesome sounds, but they are often diskette based and storage technology has moved on hundreds of times faster than synthesizer technology.
The Ensoniq scene has almost abandoned the EPS series because they used double-density drives and DD 3½" floppies haven't been made for years - and HD floppies aren't reliable in DD drives. Nowadays even HD diskettes are losing their stored bits. *All* the people keen to keep the ASR-10 alive have shifted to SCSI solutions because floppies are just not reliable anymore.
Wade.
Bullshit. You're merely enjoying the consequences of voluntary DRM. If you don't care about your data you'll lose it, just like those pictures you used to draw in crayon that hung on the fridge. If they ARE important then you can keep them and use the data indefinitely.
I still run the GWBASIC programs, and even 16 bit x86 DOS code I wrote as a child to edit images and color palettes via keyboard in (M)CGA video modes which BIOS still emulates, and OSs like FreeDOS can still make use of (Watercolor isn't extinct because Oil paint exists, Platforms are to game makers what Canvas and Paint is to Painters). Hell even my very 1st 386 bootloader can be written to an MBR and booted on a brand new x86-64 system (disable Security Theater Boot). This is NATIVE support. With an emulator, I can even run programs I wrote for my dad's old PDP-8 -- A completely different architecture... 12-bit bytes! I cared enough about the little dinky things I did as a kid to make sure they were preserved across every major storage format change. I can still read the comments my dad thankfully added to some of my code all those years ago -- a valuable lesson indeed; My kids find gramps' snark quite funny. That's several generations of data compatibility for my family's directory tree...
It's not useful to bitch about compatibility by citing programs created by companies that willfully suck at compatibility. MS DOS requires an emulator, but DR DOS can still be installed on my new systems. It doesn't recognize my sound card, but I can still program a driver for it -- just like I did to get my old custom IR transceiver devices to control my new home theater setup (lights, screen, volume, etc) via my aging Osborne-1's serial port.... It's a functional "conversation-piece" to hear that familiar 5.25" drive access as the signal tables are loaded for TV instead of the stereo. That same data format has been in use for decades now and even works on new hardware w/ Linux via LIRC -- thanks to the kids... old Ozy will give out someday. That's future-proof protocol compatibility across several generations of hardware, simultaneously.
There is NOTHING stopping me from converting the palettes and images created in my PAL_EDIT.COM into a GIMP .PAL / indexed .TGA or .TIFF, or .PNG, etc. I can (and do) frequently convert files in both directions, to go from GIMP to PAL_EDIT.COM to get new images and new "mods" into my really old game "engines". That's the thing about open formats and programs with source code available. (Remember the pushback against non-textual network protocols, even in email?) We won this battle already. I wasn't aware anyone had stopped fighting it. This page is written in TEXT. It's JavaScript and HTML... FFS: The 1st damn web page on the Internet still renders.
The authors can ALWAYS create data converters if they want, the problem is giving up that right and not demanding source code access. If my own data formats can survive the transition from kid to teen to adult and even be shared and passed on to my own kids (who love "real" retro games, BTW, such hipsters), then surely multi-billion dollar companies can do it too. Or, are you implying that despite all that money they are more inept than I can even imagine? If so, that's a pretty big dig at Microsoft there Vint... Bravo. Kind of makes me wonder WTF you're paying them for, eh?
I expect this kind of BS from you now Vint. I mean, you don't even realize the usefulness of your own contributions to mankind, Saying that the Internet is not a human right. Look up human: A characteristic of humans; A human being. It is a human right. It's the right to bear technology. That's what the 2nd amendment is really about, they just worded it wrong, they're imperfect. Just because some old farts can't understand the future the way we do now, doesn't make new technology NOT a human right. The Internet is the equivalent of access to spee
There are free/libre software projects with great records in opening up interoperability and keeping backwards compatibility. On the other hand, fashions among proprietary s/w makers seem to change, and about now there is a tendency to stop worrying about existing users and just abandon past formats.
Any number of folk will say things like "shouldn't be difficult at all to reverse engineer", but that doesn't make anything happen. On the other hand, there are plenty of apostles of the latest version ready to heap abuse on anyone bold enough to ask for backwards compatibility, and that attitude is a big source of problems.
Longterm readability is helped when software developers take the trouble to maintain backwards compatibility across different versions of popular tools and across competing applications that have broadly similar uses. That doesn't directly help with hardware barriers, but at least it would be good if the number of needless software barriers is kept down.
[...] Most of these things will be readable just as long as the applications that created them are around, but not longer.
[...]
Incidentally, all my decades old LaTeX documents still compile and can also be read directly. So can my 20 year old ASCII-coded measurement data.
"I'm not blaming Microsoft,' said Cerf,
Let's call a spade a spade. It's 100% a problem due to opaque binary formats. Had the document been written in (clean) HTML or plain text, it would have stayed usable without problems.
a thief for example is, recently i was looking for an Owner's Manual for a Suzuki motorcycle in PDF form, the bike is a few years old so Suzuki does not keep it and the only website that has it downloadable wants me to both sign up for an account with them and wants money for the download, and they did not make the owner's manual so they have no rights to withhold that information either intellectually or materialistically, so i refused to sign up on their lame website and refuse to give them money and i will keep searching for a free copy
Politics is Treachery, Religion is Brainwashing
I guess that makes sense if your data is so complicated that it actually needs XML, but I would still say that for simple data that can be stored in a simple to parse format like csv or tsv, it is better to keep it separate.
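As a rough illustration of how little code a format like that needs on the reading side, here is a sketch using Python's standard `csv` module. The column names and measurement values are invented for the example, and `read_rows` is a hypothetical helper:

```python
import csv
import io

# Invented sample data: a header row plus two tab-separated records.
tsv_text = "sample\ttemperature_c\nA1\t21.5\nA2\t22.1\n"


def read_rows(text, delimiter="\t"):
    """Parse delimited text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [dict(row) for row in reader]


for row in read_rows(tsv_text):
    print(row["sample"], float(row["temperature_c"]))
```

Even with no parser at all, a file like this stays legible in any text editor, which is most of the argument for keeping simple data in simple formats.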
AccountKiller
I was wondering what professions I should keep tabs on, just in case we have that talk about careers with my son-to-be... Being an expert on long-gone and "lost" data formats and collecting their respective tools just seems like a future relic (Oh, and we already keep terabytes of all those myriads of one-time-use programs and utilities we downloaded from 5 years ago, right?)
The best safeguard is the abandonment of all existing proprietary formats to freedom (so anybody can write conversion software) and the proliferation of open formats on an ongoing basis.
"I believe in Karma. That means I can do bad things to people all day long and I assume they deserve it." : Dogbert
*** Yes, and by god, future historians will care about YOUR spreadsheets and YOUR websites! ***
Actually they do. Historians are still trying to (painstakingly) find out how people in the Neolithic lived. So yes, having access to YOUR spreadsheets and YOUR websites will be very valuable for historians in say 3000 years.
*** Egotistical jackass. No one gives a shit about 99.999999% of humanity after they're gone. ***
Projection? That YOU don't give a shit about humanity, doesn't mean nobody else does.
# touch universe # chmod +rwx universe #
Microsoft from day one has been making its data incompatible with everything else. It was a lean and hungry company back then (it is fat and hungry now), and it was compatible with every existing thing on the import side and incompatible with everything on export. It fought a mean campaign against Samba. It played dirty with Netscape and the web standards. Bugs in IE worked around in IIS and vice versa to make it very very hard to stick to a standard.
sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
I'm having a similar problem. My father had started writing a book on Macintosh 512k using Macwrite. He passed away a decade ago, but, recently I uncovered a box of floppies.
Needless to say, even reading a floppy on a modern Macintosh is pretty much impossible, and even then, the older Mac documents had a data and resource fork, and recovering data from those early formats is pretty hairy.
Some of the data can be recovered, but it's unlikely I'll ever be able to completely read the book he was writing -- Unless I find myself a Mac 512 with Macwrite, and then run the text through the serial port to a more modern PC.
If telephones are outlawed, then only outlaws will have telephones.
Same way the TRS-80 fans do.... Take an old drive with an adapter and read it off once and transfer it to new media.
Ira Goldklang's site (trs-80.com) has TONS of old software done that way.
Your thin skin doesn't make me a troll
His position is that the data format is what will prevent data recovery - I postulate that as long as there are bored nerds that perceive a challenge, the old can and will be reverse engineered.
What WILL cause all of our digital data to finally be lost is media degradation. Every piece of data ever created will eventually be lost because the media it's on finally fails and someone forgot to copy it beforehand. (That or the sun engulfs the Earth before we finally figure out that we have to get off this planet.)
Put copies online and see how fast some nerds don't decrypt it....
(Ignore my "don't" in the above sentence. It made sense in my head but not so much in print.)
I recently encountered some bit of data that was encoded in a proprietary format but didn't really need to be. Nothing about the data required the extra features available from the proprietary format.
It turned out that proprietary app X had generated a file that couldn't be properly displayed on other copies of the same app without first being converted to a non-proprietary format.
Some people do really perverse things to avoid giving you data in a reasonable format.
A Pirate and a Puritan look the same on a balance sheet.
The Open Document Format(tm) was intended to ensure that documents have longevity. They looked at what companies like Microsoft were doing, with every version 'incompatible' with prior versions. (It's not a random thing either: Microsoft goes out of its way to make *certain* that new versions are incompatible with old, so that people are *forced* to upgrade.) When the Open Document Format(tm) was created, it had in mind users such as the Vatican Library, which has a large number of documents over 1000 years old, a good number of documents over 1500 years old, a smaller number of documents over 2000 years old, and less than two dozen shelves full of documents more than 2500 years old. Being able to read old data is important to them. Being able to read old data is an abomination to Microsoft. Hence ODF. But Microsoft tried to kill ODF with their OOXML, which has proprietary undocumented containers within the XML, which makes reading anything older than 1 version impossible. Thanks again, Microsoft.
So, the internet never forgets about that time you got drunk and posted stupid photos, but it forgets everything else? God damn.
Great. Now make your solution continue to work 20 years from now when the Windows XP activation service ceases to exist, which is what TFA is actually about.
"...software lifetime is only like 7 or fewer years..." Do you have a source for this, or is this your guess?
I'm not asking to disagree, quite the opposite: for seven years (coincidence) now, I've been arguing for storing grammar data in an XML format precisely because storing it in the programming language of a particular grammar parser means it will be unusable in the not-so-distant future. While I have anecdotes (I once wrote a parser using three programming languages, and all three of them became obsolete within a year or two), I would love to have a study to cite.
Rumor has it the Bible is still readable after a couple thousand years. In Greek, Hebrew and Aramaic if you take the time to learn, else in translation.
And I have email from the 1990s that I canNOT read today. It's called Lotus cc:Mail. (I could read it if I was willing to pay.)
"Digital data lasts forever -- or five years, whichever comes first."
--Jeff Rothenberg, 1997
Windows loader, or an army of lawyers.
My husband and I have been writing roleplaying games for nearly 20 years together. Many of the older games in effect only exist as hardcopy because the softcopies are on outdated media like floppy discs and zip cartridges in old versions of PageMaker or Quark that we can no longer open.
He is keen on storing new games on Google Docs, but I'm reluctant to trust them to an external 3rd party who has a history of killing services. I have much more faith in storing the content as txt or rtf files, moving them from computer to computer as we migrate.
Sara
Designer, Gamer, Macgrrl in an XP World
If you want to keep it, you should probably be laser printing it; inkjet ink fades.
We don't know what this means either.... proprietary format... encrypted... and it cost a lot to send it.... alas it never arrived.
AOAKN HVPKD FNFJU YIDDC
RQXSR DJHFP GoVFN MIAPX
PABUZ WYYNP CMPNW HJRZH
NLXKG MENEK ONOIB AREEQ
UAOTA RBQRH DJoFM TPZEH
LKXGH RGGHT JRZCQ FNKTQ
KLDTS GQIRU AOAKN 27 1525/6
NURP 40 TW 194
NURP 37 DK 76
lib 1625
ToR 1522 copies sent 2
signed W. Stot, S(j/g)T.