Vint Cerf: Data That's Here Today May Be Gone Tomorrow

XML? by AlphaWolf_HK · 2013-06-04 14:09 · Score: 1

I think that given MS office and LibreOffice are in XML, it shouldn't be difficult at all to reverse engineer in the future.

--
Careful with names containing L slashdot.org/~AiphaWolf_HK slashdot.org/~AlphaWoif_HK slashdot.org/~AiphaWoif_HK

Re:XML? by Nerdfest · 2013-06-04 14:34 · Score: 1, Insightful

The same applies to any *open* format.
Re:XML? by cheater512 · 2013-06-04 14:40 · Score: 2

Into a usable document from scratch? Pretty hard. Ever looked at the XML of a moderately complex document?
Re:XML? by ShanghaiBill · 2013-06-04 14:40 · Score: 2

> I think that given MS office and LibreOffice are in XML, it shouldn't be difficult at all to reverse engineer in the future.
Yes, the problem is not "data" but "data in proprietary formats" ... and even that is becoming less of a problem. A converter to/from almost anything is usually just a google search away. With VMs and emulators, even proprietary binary programs are easier than ever to deal with. I can run any CP/M or C64 program on my desktop Linux computer using free emulators. This was indeed a "hard problem", but today it is mostly solved.
Re:XML? by fuzzyfuzzyfungus · 2013-06-04 14:52 · Score: 2

> I think that given MS office and LibreOffice are in XML, it shouldn't be difficult at all to reverse engineer in the future.
Binary formats were standard for everything up through Office 2003. Office 2007 (or 2003 with the optional converter pack and some weird bugs) could output something XML-based, though I have a vague memory from the OpenDocument/Office Open XML slugfest that 2007 produced something that deviated from the theoretical ideal of OOXML in some respects, and that full conformity happened at 2010 or 2013. I might be remembering that wrong, but anything before 2003, and a lot from 2003, was definitely binary.
Re:XML? by Why2K · 2013-06-04 15:45 · Score: 1

They are binary, but at least they are documented: http://msdn.microsoft.com/en-us/library/cc313105(v=office.12).aspx
Re:XML? by belmolis · 2013-06-04 15:49 · Score: 2

Both have published specifications, so reverse engineering shouldn't be necessary. However, Microsoft's XML includes things that are not defined in the specification. That was one of the objections to giving it status as an open standard.
Re:XML? by Hamsterdan · 2013-06-04 15:56 · Score: 1

The problem is not just related to the format, but the medium it's stored on. I can still read C64 floppies because I have some drives, but everything I have for my Apple ][ is considered lost until I find both a drive and a working machine.

--
I've got better things to do tonight than die.
Re:XML? by KGIII · 2013-06-04 15:59 · Score: 1

Hell, even the non-open formats are pretty easy to get to a readable level of functionality. They won't contain the markup necessarily and certain features won't be available but, frankly, if we're able to decode all the other ancient languages I'm pretty sure someone will be able to decode these as well.
Speaking of ancient... Err.. When did Vint go to Google? That's kind of cool that he has but that is, in itself, news to me. I must have missed the announcement as I'm sure there was one.

--
"So long and thanks for all the fish."
Re:XML? by gweihir · 2013-06-04 17:03 · Score: 4, Insightful

Have you seen what some people (and MS) do with XML? And what convoluted structures they use? Coded in binary? With compression and other eminently hard to understand stuff? Most of these things will be readable just as long as the applications that created them are around, but not longer.
Forget XML. Forget Unicode as well. Plain ASCII is the only thing that works. Simple PDF or PostScript will work also, because the standards and open-source tools to read them will still be around. But nothing as complicated as a MS office document will survive. LibreOffice formats may have a chance, because LibreOffice may still be compilable and runnable (being FOSS), but only because of that and I would not bet on it.
Incidentally, all my decades old LeTeX documents still compile and can also be read directly. So can my 20 year old ASCII-coded measurement data.

--
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Re:XML? by Anonymous Coward · 2013-06-04 17:29 · Score: 0

Good point. At least he still has the 1997 PowerPoint file in his possession and can run an emulator or something.
Try that with Google Docs in 2028 and see how it works -- you will almost certainly be fucked.
Re:XML? by Anonymous Coward · 2013-06-04 18:07 · Score: 1, Funny

Holy shit, yeah, you're right - it's totally impossible to strip out the XML tags and be left with readable plain text content!
I bet nobody could ever decode it!
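For what it's worth, the tag-stripping being joked about here really is only a few lines. A minimal sketch in Python, assuming the XML is well-formed; the `strip_tags` helper and the sample memo are made up for illustration and say nothing about how any real office format is laid out:

```python
import xml.etree.ElementTree as ET

def strip_tags(xml_text: str) -> str:
    """Return only the text content of an XML document, markup removed."""
    root = ET.fromstring(xml_text)
    # itertext() walks every element in document order and yields its text.
    chunks = (chunk.strip() for chunk in root.itertext())
    return " ".join(chunk for chunk in chunks if chunk)

# Hypothetical sample document, not a real office file.
sample = "<memo><to>Bob</to><body>Ship it today</body></memo>"
print(strip_tags(sample))
```

Note what this throws away: which words were the recipient and which were the body. For a memo that hardly matters; for anything with more structure it might.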
Re:XML? by Anonymous Coward · 2013-06-04 18:23 · Score: 1

> Hell, even the non-open formats are pretty easy to get to a readable level of functionality
Hell, ya, as if the real world works like the series "24", where their super efficient CTU lab can identify any type of file and, once it's identified, decrypt anything.
Re:XML? by Joce640k · 2013-06-04 19:25 · Score: 1

> I think that given MS office and LibreOffice are in XML, it shouldn't be difficult at all to reverse engineer in the future.
You know how I know you haven't read the OOXML standard?

--
No sig today...
Re:XML? by rvw · 2013-06-04 19:25 · Score: 1

> Holy shit, yeah, you're right - it's totally impossible to strip out the XML tags and be left with readable plain text content!
> I bet nobody could ever decode it!
Well we could of course describe the entire Windows 95 OS, Office 95 and even Mac OS 8 or something in an XML CDATA tag.
Re:XML? by Dr_Barnowl · 2013-06-04 20:13 · Score: 4, Informative

Not even Microsoft can implement their Office XML "standard" ; from examination it's pretty much a direct name-for-name serialization of their internal binary structs, with some of the more obvious gaffes like explicitly saying "do this like this old version of Word" hastily renamed to placate ISO. It needs you to implement a whole bunch of specific behaviours if you want it to work in the MS software (things like "if you update this bit, you also have to update this other bit just so or it won't work"), but these aren't documented.
You've got more of a chance, sure, just because the structs are marked and you don't have to infer where their boundaries are, but it's a far cry from ODF which was designed from the outset to be an open XML format rather than just hastily being bunged together to permit large purchasing bodies (like governments) to tick the "Open format" box on their form.
Re:XML? by Dr_Barnowl · 2013-06-04 20:37 · Score: 1

After they were forced to, for interoperability purposes ; you can see this from the 6-monthly release dates on the documents, even if the formats haven't changed, it's obvious a court order is compelling them to go to the effort of releasing a new document.
The document bundle also has over 6000 pages (6,154) ; Excel accounting for the lion's share at over 2600 pages. Coincidentally I think this is about the same size as the initial MOO-XML format submission.
It's quite a task to re-implement (presuming these documents are clear, concise, and accurate).
Re:XML? by Dr_Barnowl · 2013-06-04 20:43 · Score: 1

MOO-XML is transparently just a serialization of the internal binary formats of Office produced in response to the threat of large buyers (like governments) insisting on open document formats ; unlike ODF which was designed to be an XML format from scratch. It let their government buyers tick the box and push through the procurement order - "Hey, it's what we use already, so it's definitely compatible, and it supports all that open format jazz - so it's the best value for money, even if this other thing is free."
The fact that the XML formats are now the default is just the final piece of delicious irony.
Re:XML? by Dr_Barnowl · 2013-06-04 20:45 · Score: 1

To give him his fair due, he's talking about reverse engineering, presumably in the absence of the standard.
The markup does make it much easier - you do at least get to see the structs, and the names of their elements, instead of just inferring them by poking around in a hex editor.
But I'd lay odds that the guy re-implementing ODF from scratch and a few sample documents would be done long before the guy with the MOO-XML documents had recovered from his first nervous breakdown.
Re:XML? by Anonymous Coward · 2013-06-04 20:49 · Score: 0

I'm too lazy to post properly; whatever this bloke just said, I agree with.
Re:XML? by dkf · 2013-06-04 21:28 · Score: 1

> Both have published specifications, so reverse engineering shouldn't be necessary. However, Microsoft's XML includes things that are not defined in the specification. That was one of the objections to giving it status as an open standard.
Vendor extensions are sometimes a necessary evil, but just how much you object to them depends on how much they impact on the comprehensibility of the document by tools other than the ones by the original vendor. Are they generating those in newly-created documents or are they just there in documents converted from a previous format? The latter, while not nice, would be not a great problem as it would be possible to get them documented as vendor extensions for legacy support (even if it was "guerilla documentation" and not official), but if critical new/current features require lots of vendor extensions then that's highly problematic.
If a tool can read in the document, throw away all the vendor extensions, and still completely understand the document, those extensions cannot be deeply objectionable. (Very few people get worked up about one-pixel layout tweaks, but putting the actual content inside an extension isn't good at all.)

--
"Little does he know, but there is no 'I' in 'Idiot'!"
Re:XML? by gl4ss · 2013-06-04 23:07 · Score: 1

> Into a usable document from scratch? Pretty hard. Ever looked at the XML of a moderately complex document?
it's doable though.
if he really wanted to open his powerpoints.. it would be trivial to find sw to open them.
that's a funny thing now. you can run almost any sw on your modern pc. from practically any system that sold more than 20 000 units.

--
world was created 5 seconds before this post as it is.
Re:XML? by Half-pint+HAL · 2013-06-04 23:30 · Score: 2

> Holy shit, yeah, you're right - it's totally impossible to strip out the XML tags and be left with readable plain text content!
> I bet nobody could ever decode it!
You seem to be assuming a flat-text file with predictable order. Strip the XML out of anything in a tabular format (eg a spreadsheet -- see TFS) and you lose vital data. Blank cells are lost and the tabulated data no longer lines up.
It gets worse in a filetype with unstructured formatting, eg DTP and slideware. You've got a collection of elements that are only ordered by their metadata. The explanatory labels you want to overlay on top of that image? They're no longer linked to it and you've no way of knowing what they're there for. Multiple news stories on the same page merge into one, and have been divorced from their headlines.
Readable != useful.
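To make the spreadsheet case concrete, here's a rough Python sketch. The XML layout is a simplified, hypothetical one (cells carry a position attribute `r` and empty cells are simply omitted), loosely in the spirit of how spreadsheet XML tends to work, not a copy of any real schema. `naive_rows` and `positioned_rows` are names invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sheet: row 2 has no cell for column B, so it is just absent.
SHEET = """
<sheet>
  <row><c r="A1">Name</c><c r="B1">Q1</c><c r="C1">Q2</c></row>
  <row><c r="A2">Alice</c><c r="C2">7</c></row>
</sheet>
"""

def naive_rows(xml_text):
    """Strip the markup and keep only cell text, in document order."""
    root = ET.fromstring(xml_text.strip())
    return [[c.text for c in row.iter("c")] for row in root.iter("row")]

def positioned_rows(xml_text, width=3):
    """Use the position attribute to put each value back in its column."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for row in root.iter("row"):
        cells = [""] * width
        for c in row.iter("c"):
            # Assumption: single-letter column names (A..Z) only.
            col = ord(c.get("r")[0]) - ord("A")
            cells[col] = c.text
        rows.append(cells)
    return rows

print(naive_rows(SHEET))       # Alice's 7 slides under Q1
print(positioned_rows(SHEET))  # the blank cell is kept, 7 stays under Q2
```

The second function only works because somebody knew what `r` meant. That knowledge is exactly the part that goes missing.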

--
Got them moderator blues I blieve I walk out the do', With these mod-points I been gettin', I 'most never post no mo'
Re:XML? by gmack · 2013-06-04 23:36 · Score: 1

It would seem you aren't entirely out of luck. The FC5025 Floppy controller can be combined with the TEAC FD55GFR in order to read Apple II disks.
Re:XML? by DarkOx · 2013-06-04 23:45 · Score: 1

To be really technical about it no data is lost, but information is. The structure of an xml document describes relationships between its elements.

--
Repeal the 17th Amendment TODAY! Also Please Read http://www.gnu.org/philosophy/right-to-read.html
Re:XML? by NJRoadfan · 2013-06-05 00:10 · Score: 1

Apple II hardware is still readily available. All the schematics and ROM code for the disk controller are online too (typical minimalist Woz design). The bigger problem is file formats, not so much the physical media. In many cases you can read the disk, but not decode the files into something usable.
Re:XML? by Anonymous Coward · 2013-06-05 00:32 · Score: 0

If the file format is open, it should be less costly to hire someone to implement a basic converter than to reverse-engineer it.
Re:XML? by cusco · 2013-06-05 01:01 · Score: 1

Look at a file from that time and see if you can tell me whether FileName.doc was created in Word 1, Word 2, Word 3, Word Perfect 3, Word for Mac (which was a different file format), Word Perfect 4, AMI Pro, Pfs First Choice, or any of the other programs from that time period which would/could use that extension. Or maybe a Wang word processor? Or an IBM DisplayWriter word processor? There are companies that can do that, but it's far from "trivial".

--
"Think about how stupid the average person is. Now, realise that half of them are dumber than that." - George Carlin
Re:XML? by drinkypoo · 2013-06-05 01:52 · Score: 1

It's hard to do by eye and by hand, but presumably digital archaeologists of the future will have access to some sort of pattern-mining software which will bring order from chaos for them.

--
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
Re:XML? by Anonymous Coward · 2013-06-05 02:44 · Score: 0

> I can run any CP/M or C64 program on my desktop Linux computer using free emulators
How do you read the five inch floppies they're stored on?
Re:XML? by jedidiah · 2013-06-05 03:52 · Score: 1

Even in 1997, I could run a 1985 OS in emulation and therefore the entire tool chain associated with any file format you would care to name. This problem is not nearly as hard as some people make it out to be. Although it's made artificially difficult by the sort of company Vint is trying to make excuses for here.
If you are really worried about stuff being readable 20 years from now then perhaps you should really act like it.
This problem didn't just magically appear yesterday.

--
A Pirate and a Puritan look the same on a balance sheet.
Re:XML? by jedidiah · 2013-06-05 03:55 · Score: 1

> Look at a file from that time and see if you can tell me whether FileName.doc was created in...
Or I could just use simple tools that can tell me the pedigree of a file regardless of what it's named.

--
A Pirate and a Puritan look the same on a balance sheet.
Re:XML? by Anonymous Coward · 2013-06-05 04:49 · Score: 0

> With compression and other eminently hard to understand stuff?
Compression. Tricky shit.
Re:XML? by gweihir · 2013-06-05 05:48 · Score: 1

Indeed. For MOO-XML, they could have wrapped a Base64 encoding of the old format, it would be about as useful.
I have my misgivings about ODF though. It may still be too complicated.

--
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Re:XML? by gweihir · 2013-06-05 05:50 · Score: 1

When you have just the binary output and need to reverse-engineer it, yes, very much so. May even be infeasible in practice.

--
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Re:XML? by cusco · 2013-06-05 05:57 · Score: 1

Such as?

--
"Think about how stupid the average person is. Now, realise that half of them are dumber than that." - George Carlin
Re:XML? by cheater512 · 2013-06-05 10:05 · Score: 1

Well, technically there is a nifty Linux command called 'file'. It will detect pretty much exactly what format any file is in.
Re:XML? by cusco · 2013-06-05 10:18 · Score: 1

Didn't know such a thing existed. So it would look at FileName.doc and tell me that it was created by Pfs FirstChoice Version 2.1 or Word Perfect 5.1 for AS/400? That's almost worth having a Linux installation around all by itself.

--
"Think about how stupid the average person is. Now, realise that half of them are dumber than that." - George Carlin
Re:XML? by cheater512 · 2013-06-05 10:30 · Score: 1

Yep. Here are a couple of outputs for various files:
$ file index.php
index.php: PHP script, ASCII text, with very long lines, with CRLF line terminators
$ file doc.pdf
doc.pdf: PDF document, version 1.2
$ file doc.docx
doc.docx: Microsoft Word 2007+
$ file archive.zip
archive.zip: Zip archive data, at least v2.0 to extract
It won't tell you the program used to make it, but it will give intricate details on the exact file format.
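The basic trick is matching well-known "magic" byte sequences at the start of the file. A toy Python sketch of the idea, assuming only three signatures; the real `file` command consults a far larger magic database and looks deeper than the first few bytes, so treat `sniff` as an illustration, not a reimplementation:

```python
# (signature bytes at offset 0, human-readable label)
MAGIC = [
    (b"%PDF-", "PDF document"),
    (b"PK\x03\x04", "Zip archive (also the container for .docx/.pptx/.odt)"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 compound file (legacy .doc/.xls/.ppt)"),
]

def sniff(data: bytes) -> str:
    """Guess a file's format from its leading bytes, ignoring its name."""
    for prefix, label in MAGIC:
        if data.startswith(prefix):
            return label
    return "unknown"

def sniff_path(path: str) -> str:
    """Read just the first few bytes of a file on disk and classify them."""
    with open(path, "rb") as handle:
        return sniff(handle.read(16))

print(sniff(b"%PDF-1.2\n..."))
print(sniff(b"some old word processor file"))
```

The second call shows the limit: a format with no distinctive header, or one nobody wrote a signature for, comes back "unknown", which is the situation with a lot of 1980s word processor files.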
Re:XML? by Anonymous Coward · 2013-06-05 15:21 · Score: 0

> Have you seen what some people (and MS) do with XML? And what convoluted structures they use? Coded in binary? With compression and other eminently hard to understand stuff? Most of these things will be readable just as long as the applications that created them are around, but not longer.
> Forget XML. Forget Unicode as well. Plain ASCII is the only thing that works. Simple PDF or PostScript will work also, because the standards and open-source tools to read them will still be around. But nothing as complicated as a MS office document will survive. LibreOffice formats may have a chance, because LibreOffice may still be compilable and runnable (being FOSS), but only because of that and I would not bet on it.
> Incidentally, all my decades old LeTeX documents still compile and can also be read directly. So can my 20 year old ASCII-coded measurement data.
Not if you call it LeTeX - it's LaTeX...
Re:XML? by mcswell · 2013-06-05 15:42 · Score: 1

I don't think you understand "intricate." Reverse engineering a data format from 20 years ago ain't like dustin' crops, boy.
Re:XML? by mcswell · 2013-06-05 15:53 · Score: 1

"Plain ASCII is the only thing that works." Ever try to encode Chinese in ASCII? Sure, you can do it, in the sense that you can encode any 32-bit sequence as a sequence of five 7-bit characters, but that doesn't mean anyone else will be able to figure it out. Including yourself ten years from now.
Re:XML? by Anonymous Coward · 2013-06-05 18:13 · Score: 0

So then write a fucking piece of software that opens the document and preserves the important information contained in the XML, you dumbfuck.
point is there's nothing arcane about opening an XML document and doing something with it's contents. it is, inherently, a structured plain-text representation of your document.
The point I made was in response to somebody saying, "good luck opening an old xml file, that'll never work." Of course it will - you may lose formatting, and contextual information (such as tabular formats - though it'd be pretty fucking trivial to convert empty cells into an extra tab or blank cell in a csv format), but you do not lose the actual CONTENT of your memos, or spreadsheets, or whatever.
Re:XML? by gweihir · 2013-06-05 19:44 · Score: 1

Actually, it is \latex. Sorry about that.

--
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Re:XML? by gweihir · 2013-06-05 19:47 · Score: 1

The Chinese have a larger problem, agreed. But Unicode is not going to solve it. They likely need to create a compact, human-readable transliteration in ASCII, or they need to drop their broken and obsolete system. Yes, they are not going to like that, but it is already happening and there is a price to pay for arriving in the modern world.

--
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Re:XML? by Half-pint+HAL · 2013-06-05 22:51 · Score: 1

Only if you consider "metadata" not to be data, but metadata is data -- data about data.

--
Got them moderator blues I blieve I walk out the do', With these mod-points I been gettin', I 'most never post no mo'
Re:XML? by Half-pint+HAL · 2013-06-05 23:08 · Score: 1

DNA is made up of five elements: carbon, nitrogen, oxygen, hydrogen and phosphorus. If I was to take a single molecule of my DNA and break all the molecular bonds (ie strip out all structural information) and hand you a collection of the resulting atoms, would you have all the "content" of my DNA in any useful sense?
So please dial back the insults.
XML may be a structured plain-text representation of the document, but the structure itself has semantics that are not always trivial to decode. In order to "write a piece of software that opens the document and preserves the important information", I would have to decode the semantic links between elements. The whole point of this discussion is that step in the process. Plaintext certainly makes the job easier, but it doesn't make it easy.
There may be nothing arcane about opening an XML document and doing something with its [NB: no apostrophe] contents, but there may be something very arcane about doing the right thing with its contents: if there's a non-obvious interaction between two elements, for example, as happens in Microsoft's OOXML.
Speaking of which, why don't you go and download a large PowerPoint presentation in PPTX format from slideshare, open it up in a text editor and then come back and call me a "dumbfuck" again when you find it trivially easy to process....

--
Got them moderator blues I blieve I walk out the do', With these mod-points I been gettin', I 'most never post no mo'
Re:XML? by RockDoctor · 2013-06-09 01:23 · Score: 1

> frankly, if we're able to decode all the other ancient languages I'm pretty sure someone will be able to decode these as well.
We're not able to decode all ancient languages. Some of those with a significant corpus of work remain incomprehensible. One example at the borderline of what should be possible is the Phaistos Disc language. Linear-A remains undeciphered, while its descendant Linear-B has been deciphered. There was probably a more-or-less common language amongst the cities of the "Indus Valley Civilisation", but only scattered fragments of its (syllable-based ?) written language have been found. And with that list, I've not even left the Indo-Aryan language group - probably.

> Speaking of ancient... Err.. When did Vint go to Google?
I don't remember exactly ; it was a while ago, after he was doing work for NASA on high-latency networks - i.e. Interplanetary Internet. (Wikipedia says he went to Google in 2005, but work on "Interplanetary Internet" has been wobbling on since the early 1980s.)

--
Birds are not dinosaur descendants;birds are dinosaurs, for all useful meanings of "birds", "are" and "dinosaurs"
Re:XML? by mcswell · 2013-06-09 15:35 · Score: 1

Ok, if you don't like that example, try Hangul (the writing system used for Korean) or Perso-Arabic (the writing system used for Arabic, Persian, Urdu, Punjabi, Pashto, and many other languages), or Tifinagh (used for various Berber languages) or Devanagari (Hindi, Marathi,...) or Bengali (Bangla) or Syriac (for Syrian) or Greek (language left as exercise to the student) or Cyrillic (Russian, Ukrainian, and almost any Slavic language except Polish and Czech) or Hebrew or...
But you get the idea. Even many languages that use a "Latin" writing system have diacritics (accent marks, tildes, etc.) that aren't in ASCII.
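A quick Python illustration of what "putting it in ASCII" looks like in practice. This uses Python's built-in `unicode_escape` codec purely as one example of an ASCII-only encoding; it is an assumption for the sake of the sketch, not a scheme anyone in this thread proposed.

```python
# Two Chinese characters (U+4E2D U+6587), written as escapes so this
# source file itself stays pure ASCII.
text = "\u4e2d\u6587"

# Encode to bytes that contain only 7-bit ASCII characters.
encoded = text.encode("unicode_escape")
print(encoded)            # backslash-u escapes, one per character
print(encoded.isascii())  # every byte is below 128

# Anyone who knows the scheme can get the original text back...
decoded = encoded.decode("unicode_escape")
print(decoded == text)
```

The round trip is lossless, but only for a reader who knows which escaping convention was used. Someone opening the file cold sees a string of backslashes and hex digits, which is the same "you need the spec" problem all over again.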

My data will be readable by drinkypoo · 2013-06-04 14:14 · Score: 3, Informative

My data will be readable because I use bog-standard formats. If I get really froggy I use HTML, and you can just strip the tags and read that.

If his data won't be readable, that's his problem. Anything you want to save for posterity, export it now.

--
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"

Re:My data will be readable by Anonymous Coward · 2013-06-04 14:47 · Score: 1

Export it? Sure. Or.... If he took that 1997 PowerPoint, and opened it with each successive Office version, and re-saved it as the latest version, he'd be fine.
I'm sure there's an automated way to do that with numerous files.
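One way to automate it, sketched in Python. Note this swaps in a different tool from the one described above: instead of stepping through successive Office versions, it shells out to LibreOffice's headless converter (`soffice --headless --convert-to`). The folder names, the `*.ppt` pattern and the choice of `odp` as the target are assumptions for illustration, and how faithfully any given old file converts is not something this sketch can promise.

```python
import subprocess
from pathlib import Path

def build_command(src: Path, out_dir: Path, target: str = "odp") -> list:
    """Build the LibreOffice command line that converts one file."""
    return [
        "soffice", "--headless",
        "--convert-to", target,
        "--outdir", str(out_dir),
        str(src),
    ]

def convert_all(folder: Path, out_dir: Path, pattern: str = "*.ppt") -> None:
    """Convert every matching file in a folder; requires soffice on PATH."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(folder.glob(pattern)):
        # check=True raises if LibreOffice exits with an error status.
        subprocess.run(build_command(src, out_dir), check=True)

# Example (hypothetical paths); left commented out so nothing runs on import:
# convert_all(Path("old_decks"), Path("converted"))
```

Keep the originals either way. A conversion that silently drops an embedded object is worse than an old file you can still throw at an emulator.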
Re:My data will be readable by Bremic · 2013-06-04 14:59 · Score: 5, Insightful

Until HTML includes DRM and half the stuff you create ends up being unreadable.
Well, really we are probably good for anything that can be opened in a text editor for a long long while; but the point is there. Anything can be lost to data format shifts.
As someone who had to re-type an 80-page document because the company stopped using the software the document was created on, didn't have a licence for it, and no converter found online worked - I can say this does happen.
How many people are going to shell out $600 for software to open something they want to make an edit on? How many are going to just give up and find someone to rekey it, or just give it up as a loss?
With more and more systems including format locks, in 50+ years historians will likely have a lot of trouble finding out details from today. Kind of like it is now when we go to look at archival film from WWII and find it's all faded into obscurity. We have the same problems, just with different causes. Then it was lack of preservation of a medium with a limited lifespan. Now it's storing stuff in formats that will go away as they are improved upon, blocked, or just forgotten about.
Sure, if you're in your 20s, or even 30s, you probably haven't realized the copies of your grandfather's photos are sitting on a floppy disk in a proprietary format. But when you get older you may encounter these issues.
Re:My data will be readable by Nutria · 2013-06-04 15:18 · Score: 4, Informative

Or NASA data from deep space probes that's stored in now-unknown formats on mag tapes from long, long, long gone manufacturers.

--
"I don't know, therefore Aliens" Wafflebox1
Re:My data will be readable by starburst · 2013-06-04 15:25 · Score: 2

From a 2002 slashdot story:
mccalli writes :
"Thought people might find this amusing. In 1986, the UK compiled an electronic [copy of the] domesday book. They used BBC Master computers to do it, and the result was put on laserdisc. I actually used this project whilst at school. This article states that nothing can now read these merely 15-year old discs. The original, written approx. 1086, is still doing fine thank you very much."
Sounds like a good candidate for Bruce Sterling's Dead Media Project. (Speaking of Sterling, the "graying cyberpunk" has an interesting article in the Austin Chronicle on the upcoming SXSW Interactive conference called "Information Wants to be Worthless" -- thanks to reader ag3n7.)
Re:My data will be readable by ganjadude · 2013-06-04 15:43 · Score: 2

Why didn't you OCR it and then make the edits? There are numerous OCR options that would have fit that need, no?

--
have you seen my sig? there are many others like it but none that are the same
Re:My data will be readable by geniice · 2013-06-04 16:39 · Score: 2

In fairness they did manage to transfer the stuff off the discs and put the stuff without copyright issues online.
Re:My data will be readable by Concerned+Onlooker · 2013-06-04 17:35 · Score: 2

"How many people are going to shell out $600 for software to open something they want to make an edit on?"
The upside to this is that when somebody wants to update that nifty company Flash web site and discovers that Flash now costs an arm and a leg, the site gets re-written in html.

--
http://www.rootstrikers.org/
Re:My data will be readable by kermidge · 2013-06-04 19:12 · Score: 2

Well, there's the problems with the medium itself, then there's the format, as you say (ought to be right up a cryptanalyst's alley, tho), then there's the real blocker: number of tracks, head design, and the circuitry that goes with it. Unless there are good documents for the machine's design and building, or one can be found in working order in a museum, you're SOL. It's a big problem that doesn't get much exposure.
Re:My data will be readable by thsths · 2013-06-04 20:23 · Score: 1

> If his data won't be readable, that's his problem.
Actually the problem is (as usual) in front of the screen. There are many programs that can read a 1997 PowerPoint file - he just picked the one version of Office that does not.
Using a format that is designed for compatibility would also solve the problem. PDF is pretty good, but there can be issues with embedded objects. PDF/A seems to be a safe bet, maybe missing some nice features, though.
Re:My data will be readable by Anonymous Coward · 2013-06-04 20:52 · Score: 0

A good typist would probably have better luck re-typing things than the painful process of editing something that size.
Re:My data will be readable by Dr_Barnowl · 2013-06-04 20:54 · Score: 1

The main problem was that the project came too early ; it was innovative.
It used very uncommon hardware - you needed a 12" analogue / digital laserdisc reader, and you needed an uncommon add-on CPU unit for the BBC Micro.
Only a few years later, CD-ROM became ubiquitous, with the 700MB basic disc size being more than double one of the single 300MB sides on those 12" discs, although I'm not entirely sure whether the purely digital CD-ROM would have enough storage to cope with encodes of the analogue video tracks. A single DVD-ROM could probably house the whole project, along with a bootable OS and emulator to run it.
Re:My data will be readable by Kjella · 2013-06-04 21:04 · Score: 1

> As someone who had to re-type an 80-page document because the company stopped using the software the document was created on, didn't have a licence for it, and no converter found online worked - I can say this does happen.
I'm rather surprised that with today's VM/emulation solutions companies haven't figured this one out: unless you've sold the licenses or just leased/rented them in the first place, keep at least one license of your old technical platform. That way you should at least be able to get a copy-pasted version or OCR a copy written to PDF.

--
Live today, because you never know what tomorrow brings
Re:My data will be readable by Half-pint+HAL · 2013-06-04 23:42 · Score: 1

> If I get really froggy I use HTML, and you can just strip the tags and read that.
Only if you're very, very careful, because if you strip the tags, you lose the ALT attributes on your IMG tags, which means you're ditching the plaintext fallback for the non-textual information in the page....
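Being careful is not much code, for what it's worth. A minimal Python sketch using the standard library's `html.parser`; the `TextWithAlt` class, the `[image: ...]` placeholder convention and the sample page are all invented for the example:

```python
from html.parser import HTMLParser

class TextWithAlt(HTMLParser):
    """Collect visible text, plus the alt text of any <img> encountered."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                # Keep the textual fallback instead of dropping the image.
                self.parts.append(f"[image: {alt}]")

    def handle_data(self, data):
        if data.strip():
            self.parts.append(data.strip())

def to_text(html: str) -> str:
    parser = TextWithAlt()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)

# Hypothetical page fragment.
page = ('<p>Sales by region:</p>'
        '<img src="chart.png" alt="Bar chart, EMEA leads">'
        '<p>See notes.</p>')
print(to_text(page))
```

A plain regex that deletes everything between angle brackets would return the two sentences and no hint that a chart ever existed.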

--
Got them moderator blues I blieve I walk out the do', With these mod-points I been gettin', I 'most never post no mo'
Re:My data will be readable by Nutria · 2013-06-05 00:11 · Score: 1

> number of tracks, head design, and the circuitry that goes with it
That's exactly what I was thinking when I referred to long-gone manufacturers. Otherwise, you could toss them on "modern" IBM 9-track drives and pull the data onto modern media for decipherment.

--
"I don't know, therefore Aliens" Wafflebox1
Re:My data will be readable by cusco · 2013-06-05 01:14 · Score: 1

That's why volunteers at the Planetary Society had to pull a computer and drive out of (literally) a computer museum to read the Pioneer data tapes. The tapes themselves were actually in fairly good shape, they had been stored in a controlled environment for all that time, but they were unreadable. (The minuscule storage costs were the excuse that the Bush Madministration used to order the data destroyed; not 'disposed of' but 'destroyed', and they were mightily pissed when NASA management handed them over to the Society.)

--
"Think about how stupid the average person is. Now, realise that half of them are dumber than that." - George Carlin
Re:My data will be readable by cusco · 2013-06-05 01:18 · Score: 1

The law firm that my mom worked for bought an AS/400 with Word Perfect as a networked word processing system (yes, the salescritter should have been shot). Good luck getting any of those documents open, even with Word Perfect for DOS or Word Perfect for Mac. They had interns and Manpower temps re-typing documents for half a year when that thing went away.

--
"Think about how stupid the average person is. Now, realise that half of them are dumber than that." - George Carlin
Re:My data will be readable by IndustrialComplex · 2013-06-05 01:23 · Score: 1

As long as we aren't talking about post WWIII levels of tech, I don't see how this will be a problem.
Interpreting the data is trivial compared to preserving the data. As long as it isn't encrypted, getting useful data out of a format, even a long dead format on a long dead piece of equipment will be possible. Potentially hard and expensive, but possible.
Recovering data from formats which have been allowed to deteriorate is a much bigger problem because you aren't dealing with extracting data from a difficult medium, you are dealing with data that is no longer there at all! That's the problem with old tapes and other formats.

--
Out of modpoints but really liked a post? 1BDkF6TtmmeZ3yqXbz9yhdYVqRYnwFoXDj
Re:My data will be readable by Nutria · 2013-06-05 01:54 · Score: 1

Madministration
I see what you did there!
I've seen too many people (a) misinterpret the blatantly obvious, in the zealous surety of their rightness, and (b) lie for political advantage.
Thus, just as I'm withholding judgment on BO's apparent evilness, I withhold judgment on the alleged reason why the tapes were to be destroyed.

--
"I don't know, therefore Aliens" Wafflebox1
Re:My data will be readable by anagama · 2013-06-05 02:29 · Score: 1

Imagine if they didn't have paper copies. They'd have been screwed rather than just annoyed and slightly poorer.
My business partner recently lost all of her baby pictures for the first two years of her first kid. Not from hardware failure, but as best we can figure it relates to an issue in 2010 where updating iPhoto caused data loss. The Time Machine backup does not extend back before 2010 because the drive was replaced at some point (how many non-savvy tech users think to back up their backups?). As a result -- they're totally gone.
In contrast, I have all my baby photos from the 60s and 70s. Some a bit tattered, and instead of thousands you get when people use digital cameras, maybe 50 or so, but because they're on paper I have them. Reading them requires no unavailable technology -- just eyes.
I love technology in general, but I've been bitten by it. If I really want to make sure I have the best chance of keeping something, I print it out. I don't print out everything -- the nature of digital content is that it allows people to store a huge amount of crap (like thousands of photos only slightly different from each other when one would be sufficient) for almost no cost. But that cheapness makes people devalue the little bit in that pile, that they really don't want to lose. And then they lose it and would pay almost anything to get it back.
And yes, I know papers and photos fade, but the process is slower. Typically with a computer, it's working and available one second, and gone the next. You have a lot more time to correct poor storage techniques with physical documents. And yes, papers and photos can get burned up -- but you can make offsite backups of these things too.

--
What changed under Obama? Nothing Good
Re:My data will be readable by Anonymous Coward · 2013-06-05 02:58 · Score: 0

Utter nonsense, stupidity and ignorance of the computer illiterate sheep, brainwashed, etc. by advertising, propaganda, etc. by Microsoft, et al. I have lots of stuff in ASCII that can be read on any computer I've ever used. LOL! ;)
So your problem is using MS junk, as usual. Sigh. Get with the program. Install some flavour of linux today. Problems solved gratis. :)
Re:My data will be readable by jedidiah · 2013-06-05 04:07 · Score: 1

It sounds like he could have just saved it to PS in 1997 and have been done with it.
I have a Project Gutenberg CD from 1994 that's still perfectly usable because the data isn't in some proprietary format. The data migrated off of optical media long ago and resides in various places within backup copies of my media hoard.

--
A Pirate and a Puritan look the same on a balance sheet.
Re:My data will be readable by jedidiah · 2013-06-05 04:13 · Score: 1

The funny thing is that Word Perfect used a markup system and exposed the markup codes to the end user so the markup could be directly manipulated.
A competent typist from back in the day could probably just read the file directly.

--
A Pirate and a Puritan look the same on a balance sheet.
Re:My data will be readable by Anonymous Coward · 2013-06-05 04:48 · Score: 0

This is all so fecking annoying. Facebook comments - stored on a server forever, emails of my junk - stuck in the google compound forever, accessible by not me. Any important production? Decaying rapidly.
Re:My data will be readable by cusco · 2013-06-05 06:03 · Score: 1

I missed the Reveal Codes feature for a long time when I had to change to MS Word. The Word equivalent was primitive in comparison.

The problem apparently was that the AS/400 files couldn't be exported to PC format. (Although knowing IBM it probably could be done, but no one that her office had access to knew how to do it so they were told that it was impossible.) When the AS/400 went away so did all the data.

--
"Think about how stupid the average person is. Now, realise that half of them are dumber than that." - George Carlin
Re:My data will be readable by uninformedLuddite · 2013-06-06 12:27 · Score: 1

Someone should have lost their life let alone their job for that

--
The new right fascists are bilingual. They speak English and Bullshit.
Re:My data will be readable by metaforest · 2013-06-06 19:37 · Score: 1

Kinda funny. some time ago I met a codger that wanted to speel-in some old wire recordings...
It was impossible. Too much cost for too little.... he had no idea if any of the reels were blank...
We could not even determine for sure what mechanical format the reels were. Who made them? (they are unmarked) What system(s) they were compatible with no clear coding on the spools? (Mechanical dimensions are not searchable... and are ambiguous for that tech)
It was a wash. I'm sure those reels are quite readable, but short of engineering a custom machine... they are effectively unreadable.
Re:My data will be readable by kermidge · 2013-06-09 09:42 · Score: 1

Two of my elder cousins had a wire recorder, taped stuff off radio mid- late-'50s. My uncle had brought the recorder back from a lab he'd worked at during the War. It was a commercial model (company, I can't remember). I thought it was an ingenious bit of engineering at the time (I was what, 8yrs. old?)
I don't know the exact kind of stuff needed, but it should be readily possible to differentiate the recorded places on the wire, then on to the scheme for encoding. It would have to be fairly simple given one-dimension to work with. But I can see that to do so usefully would be decidedly un-trivial.
I'm glad y'all at least gave it a try.
Re:My data will be readable by metaforest · 2013-06-09 11:03 · Score: 1

When I first researched the project it became clear that using steel wire like that creates some rather heinous mechanical constraints. The electronics are bonehead simple... The mechanical stuff needed to safely transport the wire and bale it properly onto a reel is fiendishly difficult to get right. Failing to get it just right makes any further attempt to read the wire impossible due to tangling. Anyone who has had to de-tangle a fishing reel knows what I mean. Now to add more difficulty, the line is brittle and a little thinner than human hair.
Re:My data will be readable by petermgreen · 2013-06-17 03:19 · Score: 1

It wasn't an electronic copy of the domesday book, it was a project collecting various stuff including photos and videos from schools that was supposedly in the spirit of the domesday book and putting it into a newfangled computer based system.
The problem with that project was it was ahead of its time and as such needed some pretty esoteric hardware*. Normal computing hardware from that era is still easy enough to find but the esoteric stuff needed for the domesday system is not.
* Specifically it used a BBC master (common) with a 6502 second processor card (fairly rare), a SCSI card (very rare) and a specific model of laserdisk player (very rare)

--
note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register

What's a Macintosh by Anonymous Coward · 2013-06-04 14:15 · Score: 0, Insightful

What's a Macintosh?

What ever it is, I bet if he used LaTeX+Beamer he wouldn't have this problem. Whether it was authored in 1997 or 2011, it almost certainly would still work on a "Macintosh". Maybe he could learn a thing or two from Donald Knuth and Leslie Lamport, and stop playing around with the rugrats at Google.

emulation / virtualization by smash · 2013-06-04 14:17 · Score: 2

Support emulator/VM developers! Encapsulate your entire machine in a VM and you can run the entire software stack if necessary. Anything you need convenient access to, export to CSV, XML or some other standard format.

--
I run: Windows, OS X, Linux, FreeBSD. Just because you have a hammer, doesn't mean everything is a nail.

Re:emulation / virtualization by Anonymous Coward · 2013-06-04 14:34 · Score: 0

It's called TeX. It's a virtual machine for document generation. Originally from the 1970s, it's the only VM* (as well as complete authoring and publishing tool) that will be around after VMWare, Linux, and Windows are long forgotten, and will still work _natively_ on whatever platform predominates. Heck, it's already outlasted the "Macintosh", and it's even managed the transition to the WWW better than everything else: https://www.writelatex.com/
* Well, that and whatever IBM uses on their mainframes.
Re:emulation / virtualization by lister+king+of+smeg · 2013-06-04 14:35 · Score: 1

and when Unicode and ASCII are replaced?

--
---Saying gnome 3 is better than windows 8 not so much a compliment as it is damning with light praise.
Re:emulation / virtualization by cheater512 · 2013-06-04 14:42 · Score: 1

How do you encapsulate the VM so it will still work 20 years in the future?
Re:emulation / virtualization by Anonymous Coward · 2013-06-04 14:46 · Score: 5, Funny

You're very clever, young man, very clever - but it's VMs all the way down!
Re:emulation / virtualization by phantomfive · 2013-06-04 15:01 · Score: 1

There's a pretty good chance LaTeX will still support them. There's a reason the TeX distribution is like 2 Gigs.....

--
"First they came for the slanderers and i said nothing."
Re:emulation / virtualization by fuzzyfuzzyfungus · 2013-06-04 15:05 · Score: 1

Unicode is a bit sprawling; but ASCII is only 128 characters (unless dealing with the wonderful world of nonstandardized non-latin extensions or ad-hoc 8-bit extensions-of-convenience is your problem, in which case I'd advise shirking your duties and drinking heavily), making preserving the whole thing even by chiselling it into stone monuments or other archaic methods potentially viable.
Re:emulation / virtualization by Mitchell314 · 2013-06-04 15:36 · Score: 2

Honestly, reverse engineering ASCII plain text files would be trivial. Not to the average person, but to somebody with a bit of background:
A) We have software that can use something called frequency analysis to decipher something encoded that has a 1-1 correspondence to something we know (i.e. the English alphabet).

B) Ignoring software, frequency analysis is something that could be (and before the days of computers, was) done by hand. Hell, some things could be picked out by eye. For one, all files would have a particular byte character that appears near the end of every (well formed) text file, as well as often appearing periodically through the average file. A key indicator of being a newline/carriage return. Also in the bulk of most documents the new line is followed by a particular other character that also appears in a periodic manner. Being the period. And then another character appearing often every so often (on average around 5-6 characters), a good candidate for the space character. I and A also being somewhat easy to pick out (the whole upper/lower case making it a bit harder, but still doable). With a bit more dedication, you can start guessing common words, such as a common letter followed by a less common letter followed by a very common letter ('the' sounds like a good candidate). And then to figure the rest out, compare the average frequencies of characters across many documents to the average frequencies of letters and punctuation in documents we already know. A decent undergrad senior in computer science could write a program to do this. Hell, I took a sophomore level math class that went over this.
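A rough sketch of the first two guesses described above (space and newline), in Python. Everything here is an illustrative assumption: the "unknown encoding" is just a random byte shuffle standing in for a forgotten character set, and the sample text and function names are made up for the demo. A real attempt would need far more text and would go on to compare letter frequencies.

```python
import random
from collections import Counter

def make_unknown_encoding(seed: int = 0) -> list[int]:
    """Build an arbitrary 1-1 byte substitution (hypothetical stand-in for a lost character set)."""
    table = list(range(256))
    random.Random(seed).shuffle(table)
    return table

def encode(text: str, table: list[int]) -> bytes:
    """Encode ASCII text through the substitution table."""
    return bytes(table[b] for b in text.encode("ascii"))

def guess_space(data: bytes) -> int:
    """In ordinary English prose the most frequent symbol is usually the space."""
    return Counter(data).most_common(1)[0][0]

def guess_newline(data: bytes) -> int:
    """A well-formed text file normally ends with its line terminator."""
    return data[-1]

sample = (
    "the quick brown fox jumps over the lazy dog.\n"
    "the tapes were stored in a controlled environment for years.\n"
    "a converter is usually just a search away.\n"
)
table = make_unknown_encoding()
data = encode(sample, table)

# Compare the guesses against the table we secretly know.
print("space guessed correctly:  ", guess_space(data) == table[ord(" ")])
print("newline guessed correctly:", guess_newline(data) == table[ord("\n")])
```

From there the same `Counter` ranking can be lined up against known English letter frequencies to propose the rest of the mapping, which is the part that needs a larger corpus.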

--
I read TFA and all I got was this lousy cookie
Re:emulation / virtualization by mrsurb · 2013-06-04 16:17 · Score: 1

ASCII is even easier than that - because 0-9, a-z and A-Z are represented by sequential binary numbers.
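To make the "sequential" point concrete, here is a small Python check. The helper name `is_contiguous` is made up for illustration, and the EBCDIC comparison uses Python's `cp037` codec as one representative EBCDIC variant (an assumption; other variants exist).

```python
import string

def is_contiguous(codes) -> bool:
    """True if each code is exactly one more than the previous one."""
    codes = list(codes)
    return all(b - a == 1 for a, b in zip(codes, codes[1:]))

groups = {
    "digits": string.digits,
    "lowercase": string.ascii_lowercase,
    "uppercase": string.ascii_uppercase,
}

for name, chars in groups.items():
    ascii_codes = chars.encode("ascii")   # iterating bytes yields ints
    ebcdic_codes = chars.encode("cp037")  # one EBCDIC variant, for contrast
    print(name,
          "ascii contiguous:", is_contiguous(ascii_codes),
          "| cp037 contiguous:", is_contiguous(ebcdic_codes))
```

In ASCII all three groups are unbroken runs, so once a decoder has pinned down one digit or letter the neighbours follow. In the EBCDIC variant used here the letters are split into several blocks, so that shortcut would not carry over.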
Re:emulation / virtualization by geniice · 2013-06-04 16:41 · Score: 2

There are a few industrial setups where that is pretty much what has happened.
Re:emulation / virtualization by thediv17 · 2013-06-04 16:42 · Score: 1

Indeed, I have run PalmOS and Windows Mobile VM's on a Windows 2000 VM running inside VirtualBox and it worked fine.
Re:emulation / virtualization by Anonymous Coward · 2013-06-04 17:03 · Score: 0

write it in brainfuck
Re:emulation / virtualization by smash · 2013-06-04 18:21 · Score: 1

Virtualbox has an open source edition. If you think x86 VMs are going anywhere you are mistaken.

--
I run: Windows, OS X, Linux, FreeBSD. Just because you have a hammer, doesn't mean everything is a nail.
Re:emulation / virtualization by smash · 2013-06-04 18:23 · Score: 1

Also.... I already have VMs that run software from 20 years ago (well, 1995 - close enough).

--
I run: Windows, OS X, Linux, FreeBSD. Just because you have a hammer, doesn't mean everything is a nail.
Re:emulation / virtualization by smash · 2013-06-04 18:24 · Score: 3, Interesting

err... plus DosBox is running x86 software I have from 198x...which is 30+ years now.

--
I run: Windows, OS X, Linux, FreeBSD. Just because you have a hammer, doesn't mean everything is a nail.
Re:emulation / virtualization by StripedCow · 2013-06-04 21:02 · Score: 1

Encapsulate your entire machine in a VM and you can run the entire software stack if necessary.
Yes, but what about my Google doc stuff?
Can you run Google in a VM?

--
If Pandora's box is destined to be opened, *I* want to be the one to open it.
Re:emulation / virtualization by devman · 2013-06-05 01:10 · Score: 1

You can export all that data in ODF formats.
Re:emulation / virtualization by kfall · 2013-06-05 01:28 · Score: 1

Also related to the "there are people that do history" below, 'VM curator' is the new librarian... e.g., https://olivearchive.org/
Re:emulation / virtualization by mcswell · 2013-06-05 16:00 · Score: 1

And if it's not English? You know, one of those 6999 other languages. Or maybe a programming language.
Re:emulation / virtualization by smash · 2013-06-05 16:28 · Score: 1

This is why i don't trust important stuff to somebody else's cloud.

--
I run: Windows, OS X, Linux, FreeBSD. Just because you have a hammer, doesn't mean everything is a nail.
Re:emulation / virtualization by Mitchell314 · 2013-06-06 00:29 · Score: 1

If what is not English? In practice, in these studies you try to do across many varied documents so outliers don't throw you off. If you are testing a bunch of files that were found in random storage devices in the US, it's safe to assume that English is the majority language. For a different country, a different language. As far as I know, frequency analysis works well with many alphabetical languages.

--
I read TFA and all I got was this lousy cookie
Re:emulation / virtualization by Anonymous Coward · 2013-06-06 04:27 · Score: 0

"If what is not English?": If the text you're deciphering--the one you refer to in your 4 June 11:36 PM posting--isn't in English.
Sure, frequency analysis will work with an alphabetic writing system, providing you already know what language it is. Alternatively, if it's a standard encoding (ASCII, ISO 8859, Unicode, etc.), then you can use data like http://borel.slu.edu/crubadan/stadas.html to figure out what language it is. What's hard is when you don't know the language OR the encoding, i.e. if you had to reverse engineer ASCII without knowing what language you were looking at (or if you didn't even know whether it was a natural language or a programming language).
It's rather like decoding Linear B--only after Michael Ventris realized that the language was ancient Greek was it possible to begin deciphering it.
And yes, if it's in the US it's probably in English. But that's not true in most of the world.
Re:emulation / virtualization by Macgrrl · 2013-06-06 10:29 · Score: 1

*snort*
Best use of this line I've seen in a long time.

--
Sara
Designer, Gamer, Macgrrl in an XP World

We should have listened by Anonymous Coward · 2013-06-04 14:19 · Score: 5, Insightful

We're in a difficult spot right now because for years we ignored the warnings about 'proprietary file formats'.

I'm not blaming Microsoft either. We let Microsoft do this to us of our own free ignorance.

Re:We should have listened by Lehk228 · 2013-06-04 14:52 · Score: 1

what's this "we" shit all my files are odt, ods, html, tex or txt files. they will be just as accessible in 100 years as they are now.

--
Snowden and Manning are heroes.
Re:We should have listened by Tr3vin · 2013-06-04 15:01 · Score: 1

Without developers maintaining editors/viewers, open formats are only slightly more usable than proprietary ones. 100 years is a really long time from now as far as technology goes. I wouldn't be so quick to say that open formats will still be easily accessible.
Re:We should have listened by Wolfling1 · 2013-06-04 15:39 · Score: 1

And we're not doing it now with Apple products?
Re:We should have listened by Anonymous Coward · 2013-06-04 16:19 · Score: 0

Well, I've been thinking the same thing a long time. Heck, language changes a lot in a few centuries. Look at the old king James Bible and things like Romeo and Juliet and compare that to more modern English. They are reasonably different and English is a relatively new language even. That difference doesn't even compare to the difference between modern spoken/colloquial (not standard) Arabic and Classical Arabic. Now compare the difference between the educated language of today with the educated language of a thousand years ago and compare the language spread of what was spoken a thousand years ago with what is spoken today. Many languages have disappeared and we still can't decipher ancient documents. and even within older (not ancient) documents that we can decipher, like Hebrew and Classical Arabic, there is still a lot of discrepancy among linguists and scholars over how various words or sentences should be interpreted and what the meaning of an old word is. and the number of people who can interpret this stuff well are very limited (how much does it pay or how much would it pay if many more people did it?) and trying to document a language to the point where someone can even become fluent based only on existing documentation is very difficult (if it were easy then Google translate wouldn't be so bad at translating since then the problem would simply and easily be resolved into writing a bunch of simple rules to go from one language to another and we can simply document these rules in a few books and everyone can become fluent. It doesn't work that way).
Does anyone think the future is going to be any different unless someone makes the effort to preserve this stuff? Computer stuff still revolves around language, computer language to spoken language, and language changes with time.
Re:We should have listened by Anonymous Coward · 2013-06-04 16:36 · Score: 0

TeX is approaching 40 years. TeX is written in Pascal, which is a very simple language to understand, as well as implement or emulate. In fact, current TeX distributions transform it to C code, and if C ever goes away it'll be transformed to something else. TeX is never going away, and the core of it is pretty much frozen, warts and all.
LaTeX is written using TeX. LaTeX is never going away.
While LaTeX isn't as stable as TeX, its API is immeasurably more stable than just about any other API I can think of, so things like Beamer aren't going away anytime soon (and by soon I mean many decades). Of course, it's all FOSS software, but this stack has remained backward compatible in the sense that source from the 1980s can be processed by the most recent versions of today.
If Vint Cerf has been using PowerPoint, he's lost my respect. Beamer has only been around for 10 years, but I'm sure there were many presentation templates floating around in the 1990s. Starting from source, they would generate identical documents today, although they'd probably look a little better because of rendering improvements.
Re:We should have listened by plover · 2013-06-04 16:39 · Score: 2

Actually, languages have been consolidating and standardizing rapidly with the advent of the printing press, effective and affordable transportation, broadcast media like TV and radio, and the Internet. Diversity of language is rapidly disappearing.
The way things are going now, there will be only a few dozen languages left at the end of this century, and possibly only a handful after the hundred years that follow.
Although it's entirely possible that technology will preserve native languages, too. If machine translation becomes as easy as slipping a Babel fish in your ear, people won't feel the need to drop their mother tongue for English or Mandarin.
No matter what, we'll all still be yelling hateful things at each other, but at least we'll understand the insults the other guy is hurling.

--
John
Re:We should have listened by geniice · 2013-06-04 16:44 · Score: 1

Wrong tense. There are enough converters/emulators around at this point that we can read anything halfway mainstream as long as we can read the hardware it's on.
Re:We should have listened by Anonymous Coward · 2013-06-04 17:44 · Score: 0

A lot of insults and curse words don't translate well. You are right, there are fewer languages but language today is very different than language a few hundred years ago. So we still don't know what language a few hundred years from now will look like and if the past is indicative of the future then it will probably be just as different.
Re: We should have listened by Guspaz · 2013-06-04 17:54 · Score: 1

Even in the same language, cursing doesn't always translate. In Quebec French, all the curse words are church terms. Somebody from France probably wouldn't even recognize it as cursing, they'd just wonder why someone was angrily reciting a church-related vocabulary list.
Re:We should have listened by lgw · 2013-06-05 07:39 · Score: 1

We're in a difficult spot right now because for years we ignored the warnings about 'proprietary file formats'.
I'm not blaming Microsoft either. We let Microsoft do this to us of our own free ignorance.
Early file formats were all proprietary, and Microsoft was far from the worst. In the early mainframe days, you didn't really even have the concept of a "text file" (what an old mainframe called a file is what we'd call a partition). One common approach for individual text files was to keep things in the printer queue. No joke - there was no filesystem way to write a small text file to disk, but you could print a file easy enough, with metadata that kept it around in the queue, and there were common tools to let you read them on your terminal directly from the print queue.
For anything advanced enough that each document is a discrete file on a filesystem, the file converters will outlive the original media. You'll be able to find, say, WordPerfect to Word converters long after the last 5.25" floppy becomes unreadable.

--
Socialism: a lie told by totalitarians and believed by fools.
Re:We should have listened by Lehk228 · 2013-06-05 13:11 · Score: 1

zip is already 24 years old, ASCII is 50 years old. .od* files are zipped ascii XML, it will be parsable as long as anyone is interested in doing so.
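As a sketch of how little tooling that takes, the following Python uses only the standard library to unzip a package and pull paragraph text out of `content.xml`. The package built here is a toy stand-in, not a complete or valid .odt (no manifest, styles, or mimetype entry), and the helper names are invented for the example. The two namespace URIs are the OpenDocument `office` and `text` namespaces. In practice the XML is UTF-8 rather than strict ASCII, which the parser handles the same way.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

def make_toy_package(paragraphs: list[str]) -> bytes:
    """Build a minimal zip holding only a content.xml (hypothetical stand-in for a real .odt)."""
    root = ET.Element(f"{{{OFFICE_NS}}}document-content")
    body = ET.SubElement(root, f"{{{OFFICE_NS}}}body")
    text = ET.SubElement(body, f"{{{OFFICE_NS}}}text")
    for para in paragraphs:
        ET.SubElement(text, f"{{{TEXT_NS}}}p").text = para
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("content.xml", ET.tostring(root, encoding="utf-8"))
    return buf.getvalue()

def extract_paragraphs(package: bytes) -> list[str]:
    """Unzip, parse content.xml, and return the text of every text:p element."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        root = ET.fromstring(zf.read("content.xml"))
    return ["".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p")]

package = make_toy_package(["Data that's here today", "may be gone tomorrow"])
for line in extract_paragraphs(package):
    print(line)
```

This recovers the words only. Layout, styles, and embedded objects are where the real reverse-engineering effort would go.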

--
Snowden and Manning are heroes.

Re:So? by MrBandersnatch · 2013-06-04 14:23 · Score: 5, Insightful

I think you will find that there's a little known branch of academia called "history" which sometimes takes a curious interest in even the most trivial of past information.....

Yes, backwards compatibility, blah blah blah... by Narcocide · 2013-06-04 14:24 · Score: 5, Insightful

Yes, you're right I have this ASCII text file created in 1997 and I can't find anything to read it...

OH WAIT ACTUALLY FUCKING *EVERYTHING* STILL READS IT.

Stop gargling Microsoft's balls so much and wipe off your chin. Proprietary data formats are THE PROBLEM. Stop trying to redirect public discourse with this thinly veiled bullshit.

Re:Yes, backwards compatibility, blah blah blah... by Nerdfest · 2013-06-04 14:36 · Score: 4, Informative

Odds are that you don't need to convince Vint Cerf or Google in general about the advantages of open formats.
Re:Yes, backwards compatibility, blah blah blah... by cheater512 · 2013-06-04 14:43 · Score: 1

But your EBCDIC documents are absolute rubbish now and the tools to convert them aren't commonplace any more.
Re:Yes, backwards compatibility, blah blah blah... by PPH · 2013-06-04 14:50 · Score: 2

Just Googled "ebcdic to ascii converter"
About 123,000 results.

--
Have gnu, will travel.
Re:Yes, backwards compatibility, blah blah blah... by Anonymous Coward · 2013-06-04 14:59 · Score: 1

They've been showing some concerning signs of late, however
Re:Yes, backwards compatibility, blah blah blah... by ImperialSardaukar · 2013-06-04 15:01 · Score: 1

122,999 of those are empty pages full of malware, or porn (often, both).
Re:Yes, backwards compatibility, blah blah blah... by Anonymous Coward · 2013-06-04 15:02 · Score: 0

Not only that, but there were several different EBCDICs.
Re:Yes, backwards compatibility, blah blah blah... by Narcocide · 2013-06-04 15:02 · Score: 1

Yes, you're right and maybe this is the part I am having trouble coming to grips with. He seems like the last guy who should be spouting this line of rubbish. I feel like I'm in a bad B-rate horror movie and the body snatchers just got to the President...
Re:Yes, backwards compatibility, blah blah blah... by fuzzyfuzzyfungus · 2013-06-04 15:06 · Score: 2

But your EBCDIC documents are absolute rubbish now and the tools to convert them aren't commonplace any more.
How deep are your pockets?
*IBM Consulting*
Re:Yes, backwards compatibility, blah blah blah... by Nerdfest · 2013-06-04 15:14 · Score: 1

Yes, the Talk XMPP shutdown and Google Reader are a little disturbing. We're as far as we are with the ubiquity of the internet because of open formats enabling intercommunication and competition between products and services by different providers. That seems to be going away again in favour of platform lock-in with things like iMessage, FaceTime, etc. Google's Hangouts are at least cross platform, but that's really only a mild improvement. You still need to use Google's implementation. I'm just happy I can still use the stuff under Linux for the most part. I'm a little worried about the future, as short sighted greed seems to have taken over.
Re:Yes, backwards compatibility, blah blah blah... by Anonymous Coward · 2013-06-04 15:31 · Score: 5, Insightful

But your EBCDIC documents are absolute rubbish now and the tools to convert them aren't commonplace any more.
$ printf "\xC5\xC2\xC3\xC4\xC9\xC3\x25" | iconv -f ebcdic-us -t ascii
EBCDIC
$ dpkg -S `which iconv`
libc-bin: /usr/bin/iconv
$ apt-cache show libc-bin | grep -e Essential -e Priority
Essential: yes
Priority: required
So we got a program that can convert from EBCDIC-US to ASCII (or UTF-8 or whatever you want) and that program is in an Essential/Required package on any Debian-based system and for some reason you say that "aren't commonplace"?
Are you on crack?
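For anyone without iconv handy, roughly the same conversion can be sketched with Python's built-in codecs. One assumption to flag: Python has no codec named `ebcdic-us`, so `cp037` (IBM EBCDIC US/Canada) is used as a stand-in. It agrees with the iconv example on the letters, but the two are not guaranteed identical for every byte. `cp500` is included only to show that EBCDIC variants exist.

```python
def ebcdic_to_text(data: bytes, codec: str = "cp037") -> str:
    """Decode EBCDIC bytes using one of Python's bundled EBCDIC code pages."""
    return data.decode(codec)

# The letter bytes from the printf example above (trailing 0x25 line terminator left off).
letters = b"\xC5\xC2\xC3\xC4\xC9\xC3"
print(ebcdic_to_text(letters))

# Round trip: text -> EBCDIC -> text.
original = "Data that is here today"
assert ebcdic_to_text(original.encode("cp037")) == original

# Different EBCDIC code pages can map the same byte to different characters,
# so knowing which variant a file used still matters.
probe = b"\x4A"
print(repr(probe.decode("cp037")), repr(probe.decode("cp500")))
```

The codec name is the only thing that has to be right, which is the same situation as picking the `-f` argument for iconv.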
Re:Yes, backwards compatibility, blah blah blah... by aaarrrgggh · 2013-06-04 15:33 · Score: 1

...everything except that Zip Drive I saved it on.
Re:Yes, backwards compatibility, blah blah blah... by Anonymous Coward · 2013-06-04 15:47 · Score: 0

Are you on crack?
Judge Koh, is that you?
Re:Yes, backwards compatibility, blah blah blah... by BrokenHalo · 2013-06-04 16:14 · Score: 1

Just Googled "ebcdic to ascii converter"
I don't even need to do that, because I still have one I wrote in Fortran 4 back in the early '70s when I was converting a suite of banking programs to migrate them off Burroughs mainframes. I'm quite sure it'll go through the GNU Compiler Collection without much modification.
Re:Yes, backwards compatibility, blah blah blah... by Anonymous Coward · 2013-06-04 16:59 · Score: 0

Tools to convert them? Look up the format in wikipedia and spend an hour writing the code to do it.
Re:Yes, backwards compatibility, blah blah blah... by gweihir · 2013-06-04 17:10 · Score: 1

But your EBCDIC documents are absolute rubbish now and the tools to convert them aren't commonplace any more.
Well, they are available in any halfway complete Perl installation as standard. So I would say you have no clue...

--
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Re:Yes, backwards compatibility, blah blah blah... by chrismcb · 2013-06-04 18:29 · Score: 1

Proprietary data formats aren't exactly the problem either. Sure your ASCII text file is readable, and probably always will be. But that doesn't mean 100 years from now you'll be able to understand the format of the ASCII data stored there, or that programs still exist to read it... And that is the issue.
Of course it might be easier to hack an XML file, you still need something to understand the format. Whether it is proprietary, or a standard.
Re:Yes, backwards compatibility, blah blah blah... by RedHackTea · 2013-06-04 19:16 · Score: 1
With an Open Source file format, I think it'd be pretty easy in 100 years to write a program to -- at the very least -- convert it to a new Open Source format. If you're a programmer, you know that this is trivial. The only thing you have to worry about is losing the specification or lack of examples (which is extremely rare depending on the popularity of the format).
Step 1: Pick a modern language; today that may be Ruby
Step 2a: Read the Open Source specification; this should easily be preserved on Wikipedia or within other "libraries"
Step 2b: Or, if Step 2a does not exist, search for an example program with code (I've never studied BASIC, but I can read it; programming languages have a lot of common factors)
Step 3: Convert to new Open Source format (since making a Viewer/Editor is probably pointless); share on modern versioning system (today that would be Git) with F/OSS license and spread amongst the geeks
Anyone that says that "Proprietary Formats" aren't the problem are spreading shill -- either from ignorance or from being force-fed by M$ for so long. Proprietary formats may not be the whole problem, but they are 100% part of the problem. If you're not a coder or a geek, then maybe you have an excuse from lack of understanding. Geeks love to preserve stuff and tinker with old technology; read about Atari recently on /.? How many user forums are there for even the most oddball relics? As long as Wikipedia and our desire to store the past holds up, I don't see any problem. If I could, I'd bet my life on being able to still either read/convert the ODT format or code a program to convert the ODT format to a modern format in 100 years.

The scary part... this isn't even just documents. This deals with audio files (why I use FLAC/OGG), images, videos, etc., etc....
--
The G
Re:Yes, backwards compatibility, blah blah blah... by kermidge · 2013-06-04 19:39 · Score: 1

A little disturbing? I'd say a whole lot disturbing - locking out open protocols for closed ones disturbs and dis-enfranchises participants; disrupts, rejects, and disables open communication for the poor bargain of yet another closed system of sets of walled-off users. Maybe it makes some kind of short-term business sense, otherwise seems fucking stupid to me.
The open exchange of ideas is reduced to walled-off ghettos of gossip.
Re:Yes, backwards compatibility, blah blah blah... by felipekk · 2013-06-04 22:41 · Score: 2

Just Googled "oranges to apples converter"
About 4,780,000 results
Re:Yes, backwards compatibility, blah blah blah... by Anonymous Coward · 2013-06-04 23:35 · Score: 0

That sounds like a dynamite presentation. Do you use ASCII graphics for that? And movies, I bet that's the ol' "hold the arrow key down and see what happens" trick?
You don't just have embedded stuff in Microsoft documents. How about pdfs? They bear little resemblance to pdfs of a few years back. Or are game files in open source formats? Nearly _everybody_ uses proprietary data formats. They generally come first.
Re:Yes, backwards compatibility, blah blah blah... by ideonexus · 2013-06-05 01:40 · Score: 1

I think this is more than just Microsoft. It's crazy the lengths I have to go to sometimes if I want to resurrect a 10-year-old game on my modern PC. Switching to 64-bit Windows also killed a number of old programs I used to run in x86--even though they should run in x86 mode, they don't. I agree with you that the vast majority of issues are with proprietary software, but discontinued open-source projects regularly suffer the same fate.
Kevin Kelly had a good article on this at the Long Now blog, where he makes the argument that the only way to preserve digital data is to perpetually migrate it to new systems and formats. It seems extreme, but I don't know if I see an alternative; otherwise, if not for the work of volunteers, we will lose much of our digital history.

--
i ~ Celebrating Science, Cyberspace, Speculation
Re:Yes, backwards compatibility, blah blah blah... by karmaflux · 2013-06-05 05:14 · Score: 1

XMPP would like to have a word with you.

--
REM Old programmers don't die. They just GOSUB without RETURN.
Re:Yes, backwards compatibility, blah blah blah... by david_thornley · 2013-06-05 08:27 · Score: 1

Yeah, but the .txt files in my old Mac 400K floppies aren't easy to read any more, let alone the 5.25" Radio Shack floppies.

--
"When you have eliminated the unacceptable, whatever is left, however improbable, must be the truthiness" - Holmes
Re:Yes, backwards compatibility, blah blah blah... by Anonymous Coward · 2013-06-05 09:00 · Score: 0

But your EBCDIC documents are absolute rubbish now and the tools to convert them aren't commonplace any more.
man dd - EBCDIC to ascii & back was part of the reason for its existence. It's on most Unixen out there.
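For anyone without dd handy, a rough equivalent in Python looks like the sketch below. It assumes the data is in the cp037 EBCDIC code page, which is one of several variants; real files may use another code page, and dd's own conversion tables differ from cp037 for some punctuation. The function names are invented for the example.

```python
def ebcdic_to_text(data, codec="cp037"):
    """Decode EBCDIC bytes to a Python string (cp037 is an assumption)."""
    return data.decode(codec)


def text_to_ebcdic(text, codec="cp037"):
    """Encode a Python string back to EBCDIC bytes."""
    return text.encode(codec)


if __name__ == "__main__":
    raw = bytes([0xC8, 0x85, 0x93, 0x93, 0x96])  # "Hello" in cp037
    print(ebcdic_to_text(raw))
```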
Re:Yes, backwards compatibility, blah blah blah... by Anonymous Coward · 2013-06-05 09:39 · Score: 0

Yes, but i'm getting help.
Maybe after rehab, i'll understand some of what you've written.
Re:Yes, backwards compatibility, blah blah blah... by uninformedLuddite · 2013-06-06 12:32 · Score: 1

I'd like to party with you

--
The new right fascists are bilingual. They speak English and Bullshit.
Re:Yes, backwards compatibility, blah blah blah... by BrokenHalo · 2013-06-06 14:53 · Score: 1

:-)

Contracting quite often did involve boring jobs, especially at that site. However, that little task at least got me out of COBOL for a day or so (plus time for testing). And they did pay me a lot of money for my time.
Re:Yes, backwards compatibility, blah blah blah... by uninformedLuddite · 2013-06-06 15:46 · Score: 1

I did a job a few years back in AU on a PDP-11 that is still being used by a very large concern. They still used RT-11 and TSX, and did all of their in-house WP on LEX-11. I kid you not. It was like stepping through a time warp with VT-52s and VT-100s running. They never upgraded their stuff as the custom written software was still doing the job required of it after all these years.

--
The new right fascists are bilingual. They speak English and Bullshit.

We have an app for that by uberbrainchild · 2013-06-04 14:25 · Score: 1

If there is a demand to open up and view a certain file type, there will always be someone to create an app or website which will either open up the file or convert it to a more compatible format. There are already services out there that convert Word to PDF, for example. Oh, and I just found an iPhone app for converting files, yay!

--
Anveto

Re:We have an app for that by Prof.Phreak · 2013-06-04 15:37 · Score: 1

By the time YOU care to convert a file and can't... there's no app, and NOBODY but you gives a damn about that file you got.

--
"If anything can go wrong, it will." - Murphy

Re:So? by Mitchell314 · 2013-06-04 14:27 · Score: 1

Man, fuck the future (that's right you historians-not-yet-born). They have all the flying cars and meal-in-a-pill's and immortality clinics and shit. The hell have they done for us to deserve our sympathy? If that means we can make them have to work that much harder to see how life was now, I say do it.

Now back to my zombie virus work. Anybody got a decent time capsule for me to use?

--
I read TFA and all I got was this lousy cookie

DRM and the digital black hole by Neo-Rio-101 · 2013-06-04 14:31 · Score: 4, Interesting

A perfect example of this is basically the issue of old video games. (I may as well bring this up because it's going to come up)

Recently, the Internet Archive stored a whole pile of TOSEC collections of games from various old systems (thanks to their DMCA exemption of being an archival repository so that they can legally do this). Data and information that would have otherwise been completely lost into a digital black hole, if it weren't for the fans of the system, and the dedicated teams of people collecting and amassing this software as a hobby.... in breach of copyright.

The problem with DRM is that without dedicated crackers and pirates, unless the original rights holders are around long enough to resell old titles for that long (which most aren't), old games will simply disappear into a digital copyright black hole and never be seen again. This happens once the computer/console system is old, not sold anymore, and forgotten about, and the media degrades and isn't backed up in some form (in breach of EULA). If people aren't able to collect the software and hang on to it, preserving/duplicating the media while still in copyright, it's going to vanish. Culturally important games of significance will be lost forever, and that, if anything, is as much a crime as it is to pirate software in the first place.
It's only due to the efforts of an army of swappers/crackers, etc, that most of the old games on old systems were even preserved.

The steam model on PC is quite good though as it makes a few compromises where you can actually make backups and go offline if you want.
For old computers and consoles however, this doesn't apply... and with some more restrictive attempts to squash the used game market, and force internet-always-connected authentication on upcoming consoles to even play the game... one has to wonder if the game companies deliberately want to squish all traces of their old work, let it disappear into the ether, and resell you this year's football game which is just like last year's. I fear that this is where we are headed (if we aren't there already).

--
READY.
PRINT ""+-0

Re:DRM and the digital black hole by jeffasselin · 2013-06-04 14:40 · Score: 4, Interesting

What about online-only games? Will historians in 100 years be able to play WoW and see what the game was like?

--
If he explores all forms and substances Straight homeward to their symbol-essences; He shall not die.
Re:DRM and the digital black hole by Mitchell314 · 2013-06-04 14:48 · Score: 2

Luckily for them, no.

--
I read TFA and all I got was this lousy cookie
Re:DRM and the digital black hole by timeOday · 2013-06-04 15:40 · Score: 2

Nor will they be able to join in World War II to see what that was like. However there is more recorded footage of WoW than WWII for future historians to study.
Re:DRM and the digital black hole by Anonymous Coward · 2013-06-04 15:55 · Score: 0

Dear Historians of the future,
WoW is gone. Nothing of value was lost.
Yours truly,
The past.
Re:DRM and the digital black hole by JustOK · 2013-06-04 21:43 · Score: 1

They can just watch the movie

--
rewriting history since 2109
Re:DRM and the digital black hole by Anonymous Coward · 2013-06-04 22:32 · Score: 0

Somebody ought to write a book on a low acidic paper about the experience of playing WoW. That will solve it.
Re:DRM and the digital black hole by Sockatume · 2013-06-04 22:47 · Score: 1

That'd depend on whether Blizzard turns over server code and whatever authentication they use (or a version of the game without such authentication) to archivists.

--
No kidding!!! What do you say at this point?
Re:DRM and the digital black hole by Chris+Mattern · 2013-06-05 02:20 · Score: 1

Will historians in 100 years be able to play WoW and see what the game was like?
Historians can't play WoW *now* and see what the game used to be like.
Re:DRM and the digital black hole by Anonymous Coward · 2013-06-05 03:12 · Score: 0

Will historians in 100 years be able to play WoW and see what the game was like?
That's assuming historians care about an online game a hundred years from now, and need to play it for some reason. How many historians care about MUDs?
Lucky for everybody's great-great grandchildren, private servers already exist. It'll be possible to play WoW forever (any expansion: vanilla, WotLK, BC, etc).
WoW Private Server History:
A long time ago, a few people used a network sniffer to detect all the opcodes exchanged between the WoW client and server, then reverse engineered what the opcodes do (login authentication, player movement, zone background music/ambient SFX, opening dialogs, combat, casting spells, etc). They used that knowledge to create their own mock servers, which are compatible with specific game client versions. By itself, the server only allows clients to connect and float aimlessly through a completely vacant world. To make the server behave like the real thing, world databases were also needed.
Many other groups of people figured out how to extract all the resource identifiers from inside the WoW game client: terrain height, NPCs & monsters, items, crafting recipes, skills, some quests & dialog, etc. The other invisible half of the world's data was server-side only (not stored anywhere in the client), which was recreated through trial and error (or game strategy websites): flight paths, mining/herb/fishing nodes, airship queues, some quests, some dialog, area events, vanity pet & toy behaviors, etc.
When the server is combined with a mostly complete world database, the result is a playable private server unique to a certain time period (vanilla, wrath of the lich king, burning crusade, etc).
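For anyone who hasn't seen this kind of thing, the core of such a mock server is just a table mapping opcodes to handlers. The Python sketch below is purely illustrative: the packet framing (2-byte opcode, 2-byte length, little-endian), the opcode numbers, and the handler names are all made up for the example and are not the real WoW protocol.

```python
import struct

# Maps opcode number -> handler function. Everything here is hypothetical.
HANDLERS = {}


def handles(opcode):
    """Decorator that registers a handler for one opcode."""
    def register(func):
        HANDLERS[opcode] = func
        return func
    return register


@handles(0x0001)  # made-up "login" opcode
def on_login(payload, world):
    name = payload.decode("utf-8")
    world["players"].add(name)
    return f"welcome {name}"


@handles(0x0002)  # made-up "movement" opcode
def on_move(payload, world):
    x, y = struct.unpack("<hh", payload)
    return f"moved to {x},{y}"


def make_packet(opcode, payload):
    """Frame a payload as opcode + length + body (invented framing)."""
    return struct.pack("<HH", opcode, len(payload)) + payload


def dispatch(packet, world):
    """Decode one packet and route it to the matching handler."""
    opcode, length = struct.unpack_from("<HH", packet)
    payload = packet[4:4 + length]
    handler = HANDLERS.get(opcode)
    if handler is None:
        # Unknown opcodes are what the sniffing step has to fill in.
        return f"unknown opcode 0x{opcode:04x}"
    return handler(payload, world)


if __name__ == "__main__":
    world = {"players": set()}
    print(dispatch(make_packet(0x0001, b"Thrall"), world))
    print(dispatch(make_packet(0x0099, b""), world))
```

The "vacant world" stage described above corresponds to having handlers like these but an empty world database behind them.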
To my knowledge, none of the private servers available are 100% complete and bug free. Anyone who runs their own has to implement and fix thousands of things. Despite those problems there are some great possibilities with private servers: changing movement speeds, allowing flying in no-fly zones, modifying spell requirements and behavior, generally increasing or decreasing the game difficulty (drop rates), raising the level cap, and more.
Without client modifications, there are limits to what a private server can do differently, because the proprietary client contains the game resources and the server can't make up entirely new things. What a server can add has to be based on something that already exists (like Fel Reavers wandering in new places). This means adding new spells, races, classes, zones and such is very difficult. For a historian who wants to see the game as it was, this problem is a good thing. For players and people that run private servers, it's a hard problem to work around.
Legality:
1) Server source code is reverse engineered, not copied from Blizzard. A server without a world database is perfectly legal (but unplayable).
2) The WoW client is freely downloadable, but extracting resource IDs to assemble and/or play a private server is a violation of the EULA.
3) World databases that can be found online may be copyright violations, but depending on how the data was gathered it could be a gray area.
Running a private server on a local network doesn't cost Blizzard anything, they don't know it exists, and they have little reason to care. If it's made accessible online, then Blizzard will find out about it and try to crucify the operators to the full extent of the law.
Maybe someday Blizzard will make the earliest servers open source. Or, people might someday build their own game clients from scratch and create a completely legal WoW clone. In the meantime, as long as people keep archives of the servers and world databases around, every variety of WoW will exist forever.
As a footnote, there were some people that once stole the source code from Blizzard (or part of it). It's not available anywhere, it's illegal to have, and Blizzard massacred everyone involved. The server source code that can be found today is made from scratch, otherwise Blizzard would go after the people that maintain it.
Re:DRM and the digital black hole by gmezero · 2013-06-05 05:00 · Score: 1

Actually, it's quite possible. There are frequently dedicated fans for MMOs willing to reverse engineer the servers and set up hosts to keep the clients usable. For instance, consider Phantasy Star Online. There is a free private server distribution called Blue Burst that can be configured (along with some LAN trickery) to allow Dreamcast, Xbox, and early PC versions of the game to still authenticate against a server so they can simply boot!
While PSO fans can get pretty nutty, I'm going to go out on a limb and say WoW fans are even more fanatical and one of them will come up with a solution.
Re:DRM and the digital black hole by Anonymous Coward · 2013-06-05 05:14 · Score: 0

Which WoW? There's servers running various builds, maybe even a couple owned by ActiBlizz.
Actually it didn't sound like you were referring to static data that historians can measure, but the unfixed perception you associate with whatever condition the game will be in for a moving point in time.

Print Everything! by dohzer · 2013-06-04 14:32 · Score: 1

Print Everything!
Problem solved.

"Files that Last" by ddyer-bennet · 2013-06-04 14:33 · Score: 1

Saw info on a book on this topic today, in fact: http://filesthatlast.com/about/ . Looks interesting so far.

Don't forget DRM by onyxruby · 2013-06-04 14:33 · Score: 4, Insightful

We're living in what could well be a future dark age for archeologists / historians. Hardly anything is put into a nice hard format (stone is incredibly rare and metal gets stolen) for someone to find. What's left suffers from incompatible file formats, acid based paper that decomposes, bit rot, cryptography, incompatible technology for data storage and worst of all DRM. With DRM you have active measures that try to prevent something from being usable.

In the old days people stopped use with armed guards, obfuscation and primitive crypto. Today we have servers that are required for operational functionality for many products. With the advent of the cloud you have reasons for storing things where you have a dependency on a third party. How many services that are cloud / server based have come about and gone tits up?

Even having a large well known brand name doesn't protect you from having a server shut down. Just think of Microsoft's play4sure service that lasted less than a decade. Having a license and a physical disk isn't that helpful when the DRM requires an authentication server that doesn't exist. With the movement to put more and more DRM into the cloud or with SSL certificates (again dependent upon servers and naturally time bombed) this is going to be a problem that will only grow worse.

Learning to break DRM is far more critical than file formats which require nothing more than a conversion tool.

Re:Don't forget DRM by phantomfive · 2013-06-04 15:05 · Score: 1

Learning to break DRM is far more critical than file formats which require nothing more than a conversion tool.
That is utterly a waste of time. It makes me sick to think of how much good effort is wasted jailbreaking the iPhone, when Apple could merely write a few lines of code and none of that would have been necessary. The entire jailbreak community around Apple is compensating for a few lines of code.

I say that with complete respect for the jailbreakers, but it could be so much better.......

--
"First they came for the slanderers and i said nothing."
Re:Don't forget DRM by Anonymous Coward · 2013-06-04 15:53 · Score: 0

Do you need me to call the waaahmbulance?
Re:Don't forget DRM by Anonymous Coward · 2013-06-04 18:23 · Score: 0

Historians will always have trouble gleaning the past because it's impossible to store and experience every second of every person's life. Unless they focus on major events, they'll always have an incomplete history of what happened. We're also assuming that the historians we're talking about are focused on traditional political history. Historians of games, science, or say software or hardware historians will never get the information they really need, which is what went on in the heads of developers, computer engineers, and other experts involved in creating modern electronics and software.
The best we can do is to demand that source code and original assets be stored in escrow. This will not be enough, but it's far better than nothing. I'd also like to see hardware designs and documentation, etc. It will be painful to port from Cell (stupid Sony).
Really, I wouldn't worry much about it until we find a medium that stores a huge amount of information flawlessly and will last an extremely long time. We lose so much information all the time. Institutional knowledge, knowledge that resides in the heads of experts, DNA of extinct species, etc. Also, we actually don't want all this information to be stored for future generations.
Re:Don't forget DRM by Anonymous Coward · 2013-06-04 22:17 · Score: 0

We're living in what could well be a future dark age for archeologists / historians. Hardly anything is put into a nice hard format (stone is incredibly rare and metal gets stolen) for someone to find.
Stone inscriptions were incredibly rare back in time and metal got stolen. The vast majority of stuff that the old timers wrote was written on perishable materials and almost all of that has perished.
Re:Don't forget DRM by Anonymous Coward · 2013-06-04 23:52 · Score: 0

This. We too often forget in technology what finance does not. Large companies have junk rates on their bonds because historically the number of companies who survive more than even 10 years is vanishingly small. When Dropbox (who is fantastic right now) disappears in another 5, or 8, or 12 years because some new technology has made their business model irrelevant, and they adapted just six months too slow to it, what happens to the pictures, and letters, and god-knows-what-else that they were hosting? Gone. Vanished. All of it. Case in point - remember Geocities? Even if 90% of what it hosted was trash, that other 10% represents an immense loss of culture & history.
Re:Don't forget DRM by drinkypoo · 2013-06-05 01:48 · Score: 1

It makes me sick to think of how much good effort is wasted jailbreaking the iPhone
If that makes you sick, how do you sleep at night with all the important things going on in the world?
I am a bit confused at why so many people are willing to spend so much effort on a closed platform when there's a substantially less-closed platform next door that does all the same stuff. I don't think the iDevices are particularly bad or Android all that much better, I just don't get why anyone would feed Apple when they are so much more abusive to the user than Google. And don't give me all that guff about Google services, alternatives exist for all of them and you don't even need a Google account. You'll lose very little (for instance, you're definitely not going to get any kind of assisted location service which will integrate with the device, but if you don't want to be tracked by Google you'll have turned that off and rely on GPS anyway) and you'd have to lose all the same stuff to not be tracked by Apple, if not being tracked is your thing.
For the record, I use google login and services, but I don't use assisted location. I don't particularly need to leak information about my APs to Google...

--
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
Re:Don't forget DRM by Anonymous Coward · 2013-06-05 04:00 · Score: 0

We're living in what could well be a future dark age for archeologists / historians.
Bullshit.
Digital media will survive in vastly greater quantities than any media ever before.
Parchment from 2000 BC? What percent of that survived?
Stone carvings and paintings from 10,000 years ago? What percent of those survived?
In addition, 99% of digital data is shit data anyway. We don't need every single fucking post on Facebook for archeologists to know what life was like in the early 2000s. We don't need every fucking Instagram picture of sushi for them to know what we ate. We don't need every port of Tetris to survive, every fucking Mario Kart disk ever made, and every second of every Farmville game ever played for anthropologists to understand life in the 21st century. They won't need to see every damn spreadsheet, every email, and listen to every voicemail ever made. We're not talking about capturing every second of modern life - just the overall idea of it.
We have far less of a snapshot of daily life every 1000 years prior to this than we'll have of this period in time.
Data that's here may be gone tomorrow? That's fine. That's the way it's been for all of history. And as time goes on, less and less of it is going away.
Re:Don't forget DRM by phantomfive · 2013-06-05 04:09 · Score: 1

If that makes you sick, how do you sleep at night with all the important things going on in the world?
Same as everyone else, I ignore the problems I can't deal with while I fall into sweet, deep sleep.

I am a bit confused at why so many people are willing to spend so much effort on a closed platform when there's a substantially less-closed platform next door that does all the same stuff. I don't think the iDevices are particularly bad or Android all that much better, I just don't get why anyone would feed Apple when they are so much more abusive to the user than Google.
Yeah, it's sad. Although some of them have managed to swing $400k jobs from it, so I guess it kind of worked out?

--
"First they came for the slanderers and i said nothing."
Re:Don't forget DRM by Anonymous Coward · 2013-06-05 08:13 · Score: 0

That is utterly a waste of time. It makes me sick to think of how much good effort is wasted jailbreaking the iPhone, when Apple could merely write a few lines of code and none of that would have been necessary. The entire jailbreak community around Apple is compensating for a few lines of code.

Those lines are the scissors that cut the ties. It's not profitable for Apple to let you loose from them and do whatever you wanna do.

*sigh* by MrBandersnatch · 2013-06-04 14:34 · Score: 2

Digital archival is one of the HARD problems. Over the last 40 years we have already lost more cultural artifacts than were created for the entirety of human history. A great deal of that is useless garbage of course, but the original moon landing tape? 1000s of government emails revealing exactly what was going on at pivotal times in history?

The truth is, we need systems for hardcopy; digital is too transient; emulators are a useful stopgap measure but don't protect against the kinds of catastrophic failures that we will likely see over the longer time frame; and we need indexing because someone at some point will want to wade through our digital detritus.

Re: *sigh* by AvitarX · 2013-06-04 16:23 · Score: 1

Over the last 40 years we have already lost more cultural artifacts than were created for the entirety of human history.

If this is true, it stands to reason we are creating cultural artifacts at such an increased rate that even if only a small percentage survive, future generations will have a more detailed picture of now than has existed in the past. I believe I read that half of all photographs ever taken were taken in the last few years. It doesn't take a high keep rate for things to be better preserved than from any other time in history.

--
Wow, sent an e-mail as suggested when clicking on "use classic" banner, and got a fast response that addressed my msg
Re: *sigh* by Dogtanian · 2013-06-05 00:28 · Score: 1

If this is true, it stands to reason we are creating cultural artifacts at such an increased rate that even if only a small percentage survive, future generations will have a more detailed picture of now than has existed in the past.
You hit the nail *right on the head*. I've said exactly the same thing myself in response to the "OMG the present-day is going to be a digital dark age in centuries to come!!!!!!!!!!!!111111111" stories.

Yeah, we're losing more because we're creating and storing ludicrous amounts of information compared to what we used to. Even if we lose a much higher percentage of that than we did with hardcopied information, and even if only a tiny percentage overall survives, we'll still have way more than we have compared to previous ages.

Of course, if you're attached to a *specific* (e.g.) photograph, then yes, there are problems associated with digital storage that mean it might be lost, and you may still have to take steps to preserve it- but that's a different issue. There's so much out there- in general- that enough of it *will* survive to provide a representative portrait of our society.

IMHO we're already at the stage where we're storing too much information (i.e. random crap on Facebook that will be around forever and may bite you on the arse in future rather than being able to healthily move on and leave the past behind like people were able to do in previous generations).

--
"Slashdot - News and Chat Sites Deviant". (Click "homepage" link above for details).

I dont have this problem by Osgeld · 2013-06-04 14:36 · Score: 0

with office 2007 pro, maybe if you were not using the student license you would have power-point as well

What matters? by tim6890 · 2013-06-04 14:36 · Score: 0

What really matters besides photographs? I back mine up on a number of offsite solutions that I control (hard drives) and then re-backup every year on slightly newer hardware. I also rotate through a variety of online cloud solutions and all that stuff to make sure I am backed up on whatever the current popular services are. Okay, I realize more than photographs matter, but that's what matters to me. I don't see it being a huge issue.

Re:What matters? by Anonymous Coward · 2013-06-04 14:54 · Score: 0

I also tend to think that for historians etc the problem will actually be information overload rather than shortage. Before the 19th Century, the historical record can be remarkably scanty. Now there are vast amounts of data being generated about just about everyone. Photos, however, are a problem. Your solution only works while you're around to do the business. Unless one of your children takes up the task, images of you and yours could disappear very quickly, apart from your passport and driver's licence photos. Meanwhile I have a postcard-type photo of my grandmother as a 14-year-old taken in the mid-19th Century that has survived quite happily in various shoe boxes.
Re:What matters? by codepigeon · 2013-06-04 15:05 · Score: 1

Ok, so how do you retrieve your photos that you stored on that 8inch floppy disk... 10 years from now?

That is a gross exaggeration but is an analogy to the point of the article. Without proper protections, all the information, notes, white papers, studies, etc will be useless if there doesn't exist technology that can read it.

In a worst case scenario, how would humankind rebuild and not forget what was previously learned (e.g. the dark ages we already experienced)?
Re:What matters? by Anonymous Coward · 2013-06-04 15:10 · Score: 0

What really matters besides photographs?.
Nothing matters.
Not you, not your photographs, not your children,
nor their grandchildren.
If you were intelligent in any sense that matters you would
already know this.
Re:What matters? by ganjadude · 2013-06-04 16:13 · Score: 1

He specifically stated that he re-backs up every year. I don't go that far, but I have data going back as far as the early 90s that started on large floppies, migrated to smaller floppies, migrated to CD-Rs, and is now on external hard drives. It isn't too hard to keep formats alive. (Also note that on the hard drives I keep VMs with older OSes able to read formats that I have not found a way to convert, which isn't many.)

--
have you seen my sig? there are many others like it but none that are the same
Re:What matters? by Macgrrl · 2013-06-06 10:38 · Score: 1

My husband and I have been writing roleplaying games for nearly 20 years together. Many of the older games in effect only exist as hardcopy because the softcopies are on outdated media like floppy discs and zip cartridges in old versions of PageMaker or Quark that we can no longer open.
He is keen on storing new games on GoogleDocs but I'm reluctant to trust them to an external 3rd party who has a history of killing services. I have much more faith in storing the content as txt or rtf files moving them from computer to computer as we migrate.

--
Sara
Designer, Gamer, Macgrrl in an XP World

Vint Cerf jumped the shark by Anonymous Coward · 2013-06-04 14:37 · Score: 0

Vint Cerf jumped the shark a long time ago - when ICANN became more of a money making venture than something to make the Internet better.

I've got email from the 1990s I can still read today.
The gifs and jpgs from back then are still viewable today, and will likely to be viewable in 20 years.

You have to keep migrating data off your old storage media _hardware_. And that can be a problem if you don't actually have enough bandwidth for your archive size.
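One way to check that a migration actually copied everything intact is to compare checksums between the old and new media. This is just a minimal Python sketch of that idea, not anything from TFA; the function names are invented, and it assumes both copies are mounted as ordinary directories.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives don't need to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare(old_root, new_root):
    """Return (missing, changed): files absent from or different in the new copy."""
    old, new = build_manifest(old_root), build_manifest(new_root)
    missing = sorted(set(old) - set(new))
    changed = sorted(name for name in old.keys() & new.keys()
                     if old[name] != new[name])
    return missing, changed
```

Run compare() with the old drive as the first argument and the new one as the second; two empty lists mean the old media can be retired. It does nothing about the bandwidth problem, it only tells you whether the copy is trustworthy.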

He's mistaken by Anonymous Coward · 2013-06-04 14:42 · Score: 0

I have plenty of old powerpoint 97 presentations ( and 98 since there was no 97 for mac ) and they work just fine with Office 2011.

Maybe if he had some plugins or something that were OS 9 or OS X PPC specific he couldn't load it?

Anyway who cares... fire up SheepShaver in a virtual machine or OS X 10.4 in Virtualbox and launch it from there.
I think he's making a mountain out of a mole hill here... That I can download any Atari 2600 or Commodore 64 product ever made and run it on anything from my Mac to my PC to my Android phone tells me that we're not at risk of losing anything any time soon.

Also, why didn't he write it in LaTeX like he should have in the first place? :P

Re:He's mistaken by yuhong · 2013-06-04 15:10 · Score: 1

I think the user was either using PowerPoint 4.0 for Mac or did not upgrade to Office 97 immediately.
Re:He's mistaken by 0123456 · 2013-06-04 15:16 · Score: 1

Quite likely. I had some old Word for Mac documents of scientific papers I wrote in the 90s, and the only way I was able to recover them a few years ago was to install a Windows 3.1-era copy of Word for Windows.
Re:He's mistaken by yuhong · 2013-06-04 15:21 · Score: 2

Have you tried disabling the file blocks first? At least Word for Mac 4.x and 5.x can be read this way.
Re:He's mistaken by Bing+Tsher+E · 2013-06-04 16:10 · Score: 1

You don't even have to install Word for Windows from that era. WinWord 2.0 will run as a stand-alone binary. Just the Winword.exe file by itself will run. And it's less than 1.44M in size so you can just have it on a floppy diskette. On any 16 or 32-bit Windows machine, of course. It even includes that era's VBA so you can use the winword.exe binary as a portable 'execution environment' sort of.
Re:He's mistaken by yuhong · 2013-06-04 16:41 · Score: 1

WordBasic, actually. What is fun BTW is to unblock Word 6.0/95 formats in 2010 and later and open a file with WordBasic like SCANPROT.DOT.
Re:He's mistaken by Ultracrepidarian · 2013-06-04 16:53 · Score: 1

I was guessing I wouldn't find Kool Aid Man for Atari 2600 but, sure enough, it's out there.

This is news? Nope. Not new... by flogger · 2013-06-04 14:43 · Score: 2

This has been true of all technology in the past and will continue into the future. Just look at film. How many preserved films from 1915 are still around? Just the ones that were recorded into a new format of film, then a newer format of film, then into a VHS, then into a LaserDisc, then a DVD, then a Blu-ray... (Metropolis, I am looking at you.)

Within arm's reach, I have floppy disks that contain files created in the AMI Pro word processor... When I say floppy, I am talking about the 5 1/4 inch floppies.
Technology hardware and software is not stagnant... It will always continue to develop and progress (ignore windows 8). Data that is worth keeping will get converted. Data that isn't will get left behind. I would not be surprised that in about 25 years, there will be "classic" software as there is Classic literature...

Too much typing.. going back to drinking.....

--
~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
"First things first -- but not necessarily in that order"
-- The Doctor, "Doctor

Re:This is news? Nope. Not new... by Anonymous Coward · 2013-06-04 16:21 · Score: 1

This has been true of all technology in the past and will continue into the future. Just look at film. How many preserved films from 1915 are still around? Just the ones that were recorded into a new format of film, then a newer format of film, then into a VHS, then into a LaserDisc, then a DVD, then a Blu-ray... (Metropolis, I am looking at you.)

A lot of those films were purposely destroyed because they weren't seen as useful anymore. Put yourself in the time (well, anytime before 1970 or so): there's no home media. If you go back before 1945-1950, there's no television either. If a film can't be shown theatrically, there's no way for it to be seen. And there's only so much room in the cinema. Theaters have one screen and there's not all that many of them. Then there's all the pre-sound films. Who wants to see those again? (People now have little interest in black and white films from the 1950s. Even people in the 1940s had better things to do than watch silent films from 20 years ago.)
So what do you do with all these old movies? Keep them around and wait for some practical form of home video? Hard to envision that such a thing would be developed--and in fact, it took 50 years from the start of widespread motion pictures for that to happen.
The problem they had then is the same one we have now: no one cares about stuff until it gets really old. There's more interest in really old films (say from the 1920s) than moderately old films (films from the 1950s that have been lost, mostly B movies). So who wants to keep all this crap for 80+ years in the hope that one day, someone will want it?
Re:This is news? Nope. Not new... by geniice · 2013-06-04 17:01 · Score: 1

You don't need most of the things on that list. B&W Cellulose acetate film stored at low temps would still be around. In principle Nitrocellulose stored at low enough temps might have survived but you'd need to get postgrads or other expendable people to handle it.
Re:This is news? Nope. Not new... by AliasMarlowe · 2013-06-04 17:39 · Score: 1

You don't need most of the things on that list. B&W Cellulose acetate film stored at low temps would still be around. In principle Nitrocellulose stored at low enough temps might have survived but you'd need to get postgrads or other expendable people to handle it.
Even some of the earliest movies have been digitized from ancient film. For example there are collections of shorts by Edison (1899-1902), as well as items like The Little Match Seller (1902), The Great Train Robbery (1903) or both halves of the Chicago-Michigan Football Game (1903). Obviously these are silent and monochrome, and in some cases the original was imperfect.

--
Those who can make you believe absurdities can make you commit atrocities. - Voltaire
Re:This is news? Nope. Not new... by NJRoadfan · 2013-06-05 00:35 · Score: 1

The biggest losses in cultural resources weren't from the degradation or the inability to play back the media, but from deliberate wiping of the programs off of video tape for reuse. For example, there is no known surviving copy of the entire broadcast of the first Super Bowl.
Re:This is news? Nope. Not new... by Dogtanian · 2013-06-05 01:24 · Score: 2

You put your finger on it. I'd just add what I had planned on saying- that, in general, it's not always obvious what's going to be "useful" and "of interest" to future generations when it isn't practical to keep everything.

In fact, a lot of things that would be of interest to us- i.e. everyday, mundane life- were never recorded at all, back when film and equipment were quite expensive and the effort and cost would have been saved for documenting "important" occasions. Even at a personal level, if I'd known that something like the Internet would become as important as it has, and that there'd be projects like Wikimedia Commons and the like, I might have photographed more of the things around me in my relatively mundane home town while growing up in the 1980s.

--
"Slashdot - News and Chat Sites Deviant". (Click "homepage" link above for details).
Re:This is news? Nope. Not new... by geniice · 2013-06-05 06:48 · Score: 1

While this happens from time to time there tends to have been an intermediate conversion to something other than nitrocellulose which tends to have exploded by now although a mix of chance and good storage conditions can prevent that.

Tax Records by PPH · 2013-06-04 14:45 · Score: 2

The IRS wants to audit me, going back several years. I kept the records as required but they are unreadable now.

Thanks Microsoft!

--
Have gnu, will travel.

Re:Tax Records by yuhong · 2013-06-04 14:56 · Score: 1

If it is Word/Excel, try disabling the file blocks using the registry or in 2010 or later using the UI in the Trust Center.
See http://support.microsoft.com/kb/922849

Re:So? by Anonymous Coward · 2013-06-04 14:47 · Score: 0

Man, don't be like that. If we're nice to the future, they might give us time machines!

One would think by no-body · 2013-06-04 14:56 · Score: 1

That people in the far future would be getting smarter to accomplish this - probably a tossup - and apart from it, it's very questionable if a far future for humanity even exists, the way "humanity" is behaving these days/years/decades/centuries/millennia....

Maybe there are smarter robots by then babysitting...

LOL! by Narcocide · 2013-06-04 14:58 · Score: 0

+1 Underrated.

Re:So? by fuzzyfuzzyfungus · 2013-06-04 14:59 · Score: 2

I think you will find that there's a little known branch of academia called "history" which sometimes takes a curious interest in even the most trivial of past information.....

Even if you don't care about the historians, I'm sure the lucky people who have the pleasure of handling property deeds at your local governance hive can tell you a story from within the last week or two about needing to pull some rather seriously dusty documents to allow a present-day transaction to go through without incident.

Many data will, indeed, be of no interest at all, or the same historical interest that neolithic refuse dumps are; but data in the nontrivial-number-of-decades range are still live in more than a few contexts.

Github Flavored Markdown by HalcyonBlue · 2013-06-04 15:01 · Score: 1

I use Github Flavored Markdown. Thousands of years in the future, archaeologists will no doubt work furiously to decode my etchings upon a stone tablet, which will read: "# IF YOU CAN READ THIS YOU'RE A GEEK #" .

Re:Github Flavored Markdown by Anonymous Coward · 2013-06-04 16:41 · Score: 0

I thought you said "IF YOU CAN READ THIS YOU'RE A GREEK", and I was intrigued by your thesis and wanted to subscribe to your newsletter.

Maybe. by MrEricSir · 2013-06-04 15:03 · Score: 4, Insightful

XML doesn't magically solve everything in this regard. If there's no good documentation for the format, it's unlikely you'll be able to display everything exactly as intended. Likewise, if the format is hideously complex (see: Microsoft Office Open XML) or there's bugs in the de-facto implementation, it's going to be tricky to reverse engineer.

I'd also point out that MS Office spits out compressed XML. I believe it's based on ZIP, which is very well documented, but that's yet another hurdle to cross. And then you have to deal with the binary format of the XML itself -- ASCII, UTF8, etc.

--
There's no -1 for "I don't get it."

Re:Maybe. by viperidaenz · 2013-06-04 15:59 · Score: 1

The file being in ZIP format is documented. The character encoding of the XML file is specified in the XML file itself, like all XML files should do.
From what I've used so far the Open XML formats aren't hideously complex, although I've only been working with XLSX files.
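To make the two layers concrete, here is a minimal Python sketch using only the standard library. It builds a tiny ZIP package in memory and reads it back, letting the XML parser pick up the encoding from the XML declaration. The member name `content.xml`, the `doc`/`p` element names and both function names are made up for illustration; a real Office Open XML package has many more parts plus namespaces, so treat this as a toy, not a reader for actual XLSX files.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def build_sample_package():
    """Create a small ZIP archive holding one XML part (hypothetical layout)."""
    # The XML declaration names the encoding, so the bytes describe themselves.
    xml_text = '<?xml version="1.0" encoding="ISO-8859-1"?><doc><p>caf\u00e9</p></doc>'
    xml_bytes = xml_text.encode("iso-8859-1")
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("content.xml", xml_bytes)
    return buffer.getvalue()


def read_paragraphs(package, member="content.xml"):
    """Unzip one member and return the text of every <p> element."""
    with zipfile.ZipFile(io.BytesIO(package)) as archive:
        raw = archive.read(member)
    # Passing bytes lets the parser honour the declared encoding.
    root = ET.fromstring(raw)
    return [p.text for p in root.iter("p")]


if __name__ == "__main__":
    print(read_paragraphs(build_sample_package()))
```

Getting the text out is the easy part; knowing what each element means is where the documentation matters.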
Re:Maybe. by Anonymous Coward · 2013-06-04 16:00 · Score: 1

Much of the problem revolves around
A: Hardware is always changing. Newer computers are no longer compatible with Windows 9x, for instance. Five years from now Win XP will no longer be compatible with it.
So old programs written in Basic and Fortran and in MS Dos will no longer run properly. Old games, etc.. that don't support the hardware. Heck, I tried to get Duke Nukem 3d To run on Windows XP using MsDos mode, Dos Box, Freedos, and other simulators and it just won't work with my newer sound card and I can't figure out how to make it work.
B: Intellectual property. Since many of these programs are still under protection for 95+ years and software lifetime is only like 7 or fewer years (often much less) it's not like third parties can take an older operating system or a simulator and modify it to work with newer hardware so that it can run old programs. Not to mention it's hard to get a hold of all these old programs to be able to do the testing required to run them. All the newer hardware is proprietary and so we can't simply do what we want with it either in terms of things like writing drivers for older software, etc... and (hardware and other) patents last 20 years which is like a good three to maybe four generations in terms of technology turnover.
Re:Maybe. by Anonymous Coward · 2013-06-04 17:29 · Score: 0

But XML and metadata can tell you WTF is wrong with the file (i.e. why it doesn't launch in your MSWord 2020).
Basically, it's better than nothing. Of course, some guru will say he can reverse engineer and hack it to work in no time, but that's having faith in the hit by a bus tomorrow scenario--that no one gets hit.
Re:Maybe. by kermidge · 2013-06-04 19:02 · Score: 1

For A. - as I understand it a good emulator can, for instance, contain a complete virtual 6502 sufficient for doing assembly coding. On the off chance, have you poked around in relevant forums to find out if others have the same problem or are able to get Duke running? I don't know from sound cards so only two ideas come to mind; one, again, hit the forums, and also see if you might have success with converting the sound output to a different format, one that the sound card (driver, really) expects to get.
Anyway, I think you raise good points, ones I don't see mentioned very often if at all.
Re:Maybe. by Anonymous Coward · 2013-06-04 20:58 · Score: 0

the character encoding of the XML file is specified in the XML file itself
Even if so, knowing the name of a particular kind of encoding != knowing the method of encoding.
What MrEricSir probably meant is that it's another hurdle to pass: if you cannot find something describing that encoding you still cannot read the document.
Would be funny if the encoding is described in a document like the one you are trying to read: you need to decode the document to be able to decode the document ...
Re:Maybe. by wonkey_monkey · 2013-06-04 21:32 · Score: 2

ZIP format is documented.
Right now it is. What about the ragtag bunch of misfit librarians who are all that's left after the zombie apocalypse?
They burned all the books for warmth and to keep the zombies away.

--
systemd is Roko's Basilisk.
Re:Maybe. by dkf · 2013-06-04 21:42 · Score: 1

XML doesn't magically solve everything in this regard.
There are no silver bullets at all in this area. You can make the container format self-describing, and the tree structure self-describing (which is pretty much what XML gives you), but the hard problem of capturing the actual semantic meanings of the nodes in the tree is just going to remain that. You could attach "semantic meaning descriptors" of course, but who's to say that they're going to be understandable by anyone? (Indeed, where I've seen such things they've typically been less understandable than the original non-semantic nodes and have depended on comprehending a large body of complex documents on the open internet at the same time, which is a total failure mode on many levels.)
But going for XML (or ASN.1 or JSON or YAML or any number of other tree description schemes) is still better than trying to also pick apart the mess from horrible custom binary dump. It's at least one less obstacle, as you at least can read what the original generator thought it was sensible to tag the tree as. (Other layers of encoding that are reasonably standardized and so don't add to the problem are using a defined character encoding such as UTF-8 or ASCII, and using a compressed packaging format like ZIP or gzip.)

--
"Little does he know, but there is no 'I' in 'Idiot'!"
Re:Maybe. by cyborg_zx · 2013-06-04 22:59 · Score: 1

For Duke Nukem 3D there are a plethora of source code modifications I suggest you check out. At least one of them will almost certainly do it for you. Running on the original exe may well be a challenge. Surprised SoundBlaster wouldn't work to be fair though since that at least became the de-facto standard.
Re:Maybe. by Xest · 2013-06-05 01:28 · Score: 1

I think it's more the point that although you may not be able to get a perfect representation back from the dead, you will at least be able to extract worthwhile information if it's XML.
Re:Maybe. by viperidaenz · 2013-06-05 08:12 · Score: 1

If all the books have been burnt, it doesn't matter if it's an open standard or not.

Another argument... by FuzzNugget · 2013-06-04 15:03 · Score: 1

For open source. Save your files in open and/or openly defined, standardized formats and there will always be software that can deal with it.

But I guess it's difficult for people to hear you explain that to them with their head up their ass.

Re:Another argument... by wvmarle · 2013-06-04 15:12 · Score: 1

Argument for open standards, yes. Open source, no. You don't need open source for open standards. And open source does not necessarily mean open standards.
Re:Another argument... by Anonymous Coward · 2013-06-04 18:42 · Score: 0

For open source and proprietary alike, store a program that can access the data together with the data.
If it is open source, even better. The source code will show how to access the data even if it can't be compiled on a modern system. (You still need a program to access the source code, but pure ASCII is pretty portable so it has a high chance of being decodable in the near future.)

Google hire me, I solved this problem in 3 seconds by dicobalt · 2013-06-04 15:04 · Score: 1

I would solve this by installing a Windows XP VM with a copy of Office XP. Now that I solved Google's hard problem they must now see I am qualified to work there. Google is on a FUD rampage of which the likes I haven't seen since the great Microsoft FUD storms.

Who is this Cerf noob ? by Anonymous Coward · 2013-06-04 15:04 · Score: 0

Doesn't he know about the magic of the cloud
that Apple, Photobucket, Flickr, and others who cannot be trusted
any further than they can be tossed by a trebuchet
have promised us ?

Can anyone identify this character set? by Anonymous Coward · 2013-06-04 15:06 · Score: 1

Still haven't found a description of the character set in which octal 222, 223, and 224 are right single quotation mark, left double quotation mark, and right double quotation mark.

Anybody know this one?

Re:Can anyone identify this character set? by Anonymous Coward · 2013-06-04 17:17 · Score: 0

If only you were using XML, you wouldn't have this problem.
Re:Can anyone identify this character set? by Anonymous Coward · 2013-06-04 18:43 · Score: 0

Yeah, Windows character sets, CP-1250, 1251, and 1252, at the very least.
You didn't look very hard if you weren't able to find this.
https://en.wikipedia.org/wiki/Windows-1252
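For anyone who wants to check this rather than take it on faith, a short Python sketch follows. It decodes the three octal values from the question with the standard library's `cp1250`, `cp1251` and `cp1252` codecs; the helper name `decode_quotes` is just an illustrative choice.

```python
# The three byte values from the question, written in octal (0x92, 0x93, 0x94).
QUOTE_BYTES = bytes([0o222, 0o223, 0o224])


def decode_quotes(codec):
    """Decode the three mystery bytes with the named codec."""
    return QUOTE_BYTES.decode(codec)


if __name__ == "__main__":
    for codec in ("cp1250", "cp1251", "cp1252"):
        decoded = decode_quotes(codec)
        # Show the Unicode code points so the result survives any terminal encoding.
        print(codec, [f"U+{ord(ch):04X}" for ch in decoded])
```

All three codecs map these bytes to U+2019, U+201C and U+201D. Decoding the same bytes as ISO-8859-1 yields C1 control characters instead, which is the usual reason such quotes show up as garbage.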
Re:Can anyone identify this character set? by Dr_Barnowl · 2013-06-04 20:16 · Score: 1

Just to note, CP-1252 is the standard Western code page for Windows. I know this because I have to make special efforts on all my build scripts to cope with the fact that Windows has so far failed to join the "Just use UTF-8 like every other modern OS" club.
Re: Can anyone identify this character set? by Anonymous Coward · 2013-06-04 22:02 · Score: 0

Maybe because utf-8 is double byte when only one byte is required for english. Makes a huge difference when you're dealing with large amount of data.
Re: Can anyone identify this character set? by andy.ruddock · 2013-06-05 00:28 · Score: 1

It's not, see https://en.wikipedia.org/wiki/Utf-8.
Characters encode into one to four bytes - the beauty of UTF-8 being that a 'standard' US-ASCII text document is also correctly UTF-8 encoded.

--
God: An invisible friend for grown-ups.
Re:Can anyone identify this character set? by Anonymous Coward · 2013-06-05 18:21 · Score: 0

Indeed, I've run into this too, when developers start writing Java code in Eclipse (with fucked up, non-utf8 encodings selected) on Windows, and then check their shit into scm for a build & deploy from Linux, which defaults to UTF8 for everything.
Annoying as hell, but easy to fix, as long as developers don't KEEP checking in CP1252 encoded files.
Re: Can anyone identify this character set? by voidphoenix · 2013-06-05 23:53 · Score: 1

It's called UTF-8 because it uses a baseline of 8 bits (one byte, not 2) to represent characters.

UTF-8 encodes each of the 1,112,064 code points in the Unicode character set using one to four 8-bit bytes (termed "octets" in the Unicode Standard). Code points with lower numerical values (i.e. earlier code positions in the Unicode character set, which tend to occur more frequently) are encoded using fewer bytes. The first 128 characters of Unicode, which correspond one-to-one with ASCII, are encoded using a single octet with the same binary value as ASCII, making valid ASCII text valid UTF-8-encoded Unicode as well.
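The claims in that paragraph are easy to check. A minimal Python sketch, with sample characters picked arbitrarily and `utf8_length` being an illustrative helper name:

```python
def utf8_length(ch):
    """Number of bytes UTF-8 needs for a single character."""
    return len(ch.encode("utf-8"))


# One sample character per encoded length: ASCII, Latin-1 range, BMP, beyond BMP.
SAMPLES = ["A", "\u00e9", "\u20ac", "\U0001F600"]

# All code points up to U+10FFFF, minus the surrogate range U+D800..U+DFFF.
ENCODABLE_CODE_POINTS = 0x110000 - (0xE000 - 0xD800)

ASCII_TEXT = "plain US-ASCII text"

if __name__ == "__main__":
    for ch in SAMPLES:
        print(f"U+{ord(ch):04X} takes {utf8_length(ch)} byte(s)")
    print("encodable code points:", ENCODABLE_CODE_POINTS)
    # Pure ASCII text produces identical bytes under both encodings.
    print(ASCII_TEXT.encode("ascii") == ASCII_TEXT.encode("utf-8"))
```

So English text costs exactly one byte per character in UTF-8, the same as in ASCII or CP-1252.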

--
Excuse me, wtf r u doin?

On the PowerPoint 4.0/95 converters... by yuhong · 2013-06-04 15:07 · Score: 4, Insightful

MS removed the PowerPoint 4.0/95 converters completely with Office 2007 for Windows and later, and disabled them by default in Office 2003 SP3. And the PowerPoint 4.0 converter (but not 95) was disabled by default instead of fixed with MS09-017.

On the Mac, they removed them even earlier, when they ported Office to Carbon.

IMO it would be a good idea for MS to package PP4X32 and PP7X32 from PowerPoint 2003 separately, along with a utility to call the converters of course.

Re:On the PowerPoint 4.0/95 converters... by Anonymous Coward · 2013-06-04 16:30 · Score: 0

Searching for "powerpoint 4.0 converter" on Google gives me 11.8m hits. I think that Google is in a FUD campaign lately (1-week warning for 0-days and Google employees constantly talking bullshit about other companies). I think we're much better off leaving our documents in Google Docs, right?

Uh, hello? by DogDude · 2013-06-04 15:10 · Score: 4, Funny

For a supposedly smart guy, he seems a bit silly:

He could've just downloaded MS's Powerpoint 97 viewer

--
I don't respond to AC's.

Re:Uh, hello? by Narcocide · 2013-06-04 15:22 · Score: 1

Yes, that's your first clue that schadenfreude is involved here. http://en.wikipedia.org/wiki/Straw_man
Re:Uh, hello? by Charliemopps · 2013-06-04 15:23 · Score: 0, Flamebait

I too am marveling at how dumb his concern is given the fact that you can pretty much convert any file format in existence to any other file format in existence with any number of free conversion applications that the internet is riddled with... and he's talking about MICROSOFT file formats. You could simply open the file in Google Docs for Christ's sake. I wouldn't be surprised if Firefox could open it natively either.
Re:Uh, hello? by Anonymous Coward · 2013-06-04 17:00 · Score: 0

Who would have thought Google was going in FUD mode. Looks like their HR dept is having some troubles with internal culture.
Re:Uh, hello? by Anonymous Coward · 2013-06-04 21:04 · Score: 0

His point is that in an ideal world you would be able to open any document using any editor.
Re:Uh, hello? by Anonymous Coward · 2013-06-05 01:40 · Score: 0

Vint would have to upgrade to Windows XP or better to run that viewer.
Re:Uh, hello? by Yo_mama · 2013-06-05 05:45 · Score: 1

And what does he do X years from now when that link is broken? What do historians do 20, 50, or 100 years from now?
For a supposedly Dog Dude, you seem a bit short sighted.

--
Never underestimate the power of human stupidity -Lazarus Long

libreoffice will open it ! by Anonymous Coward · 2013-06-04 15:10 · Score: 0

If not, file a bug and send in the document. The power of freedom ...

Re:libreoffice will open it ! by yuhong · 2013-06-04 15:32 · Score: 1

Note they also sometimes drop support for old formats too:
https://bugs.freedesktop.org/show_bug.cgi?id=59902

libreoffice will open it ! by mejmeeks · 2013-06-04 15:11 · Score: 1

If not, file a bug and send in the document. The power of freedom ...

Except spoken word changes too by Anonymous Coward · 2013-06-04 15:18 · Score: 0

Even language changes over a few hundred years or so. Might not be a problem for some things like technical documents, but in terms of presenting information of cultural significance most of it still gets lost with time. Slang, flowery speech, idioms, references to current events, jokes, etc., a lot of it loses its "zing" after around 50 years or so if not within a single generation. There may be a few notable exceptions to this (some classical works are still funny, like much of the Canterbury Tales), but even then you sometimes need a dictionary or thesaurus to get the jokes or at least the general gist of them.

Re:Except spoken word changes too by mcswell · 2013-06-05 16:06 · Score: 1

Rumor has it the Bible is still readable after a couple thousand years. In Greek, Hebrew and Aramaic if you take the time to learn, else in translation.

Wasn't this solved ages ago? by samantha · 2013-06-04 15:19 · Score: 1

I remember over two decades ago there was talk of making data objects, that is, data that knew how to present an object interface to get at its information. The data would contain its own reader in some ubiquitous language. But wait, we never got a ubiquitous language. Perhaps JavaScript today? But if you want to solve this problem then this is how to solve it. Or perhaps you could just package a converter to convert format XYZ to BSON as being good enough or at least better than today's breakage.

One thing that really burns me is having my information that I created / entered / caused to be locked up in some proprietary opaque format, especially if owned by one and only one app.

Re:Wasn't this solved ages ago? by gweihir · 2013-06-04 17:18 · Score: 1

There is a ubiquitous language: ANSI-C + the concept of a raster display with x/y coordinates. Nobody cared enough, also because if you use sane formats (ASCII, PostScript, PDF/A), you can already display them everywhere.
But JavaScript would be a complete fail. It is not even really compatible across browsers.

--
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.

I have legible pictures over 150 years old by the_rajah · 2013-06-04 15:21 · Score: 2

Some are glass plate Daguerreotypes. Somehow, I am not too confident that my digital pictures will be legible 150 years from now, unless I make a good quality print on archival paper. Digital files are too easily corrupted and made totally useless. Media formats will change. 8" floppies anyone?

--

"Do the Right Thing. It will gratify some people and astound the rest." - Mark Twain

Re:I have legible pictures over 150 years old by AK+Marc · 2013-06-04 16:20 · Score: 1

I have gifs and jpegs from much older than the document in question, and have no trouble with any of them (and BMPs from before that). I was listening to MP3 in the 1990s. They work fine now.

What digital picture standard (not raw) do you think you'll have any trouble reading, and roughly when do you think you'd have any trouble with it?

Media formats will change. 8" floppies anyone?
So you are worried that your MFM HD from the 1980s won't work, not worried about the .BMP on it not being readable, if you found some way to spin it up? I can't even tell what you are whining about, other than "technology bad."

--
Learn to love Alaska
Re:I have legible pictures over 150 years old by jafac · 2013-06-04 17:00 · Score: 2

yes - this is a real issue - and ARCHIVED data that is important DOES need to be "spun up" and refreshed to new media.
If it's hard drives, yes. If it's optical media. . . well that depends. Because some optical media just plain degrades over time. Some is written in special proprietary formats (like Apple's early implementations of CD+R) that you're going to have a hard time reading with CURRENT equipment.
If your data is archived to tape, and more than 10 years old, I'm afraid you're fucked.

--

These are my friends, See how they glisten. See this one shine, how he smiles in the light.
Re:I have legible pictures over 150 years old by AK+Marc · 2013-06-04 17:41 · Score: 1

Tape backups, as a home solution, are rare, and even in your case, you aren't fucked. But, to keep the Internet nutters from arguing endlessly, yes, once you've saved your bits, they have a shelf life of [insert absurd number here]. I've got an old 3.5" floppy drive running around in a box somewhere, if I had to read a floppy, and the last time I tried, it worked fine, drive and media, and that's at least 10 years old. When I threw my XT out 5 years ago, it was still reading 5.25" without an issue, and I never had a disc go bad that wasn't visibly damaged, even 1980s vintage CD-Rs that were given a 5 year shelf life by a previous generation of nutters.

--
Learn to love Alaska

A Moot Red Herring by Anonymous Coward · 2013-06-04 15:25 · Score: 0

As long as the description of the file format is preserved -- and this can still be done with paper documents -- then we can simply translate or convert the old information into new forms or formats. Nothing ever need be lost.

The concern is only illusory.

No different than cars by HockeyPuck · 2013-06-04 15:34 · Score: 4, Interesting

We're still able to restore cars from the 80s and earlier as the cars were fully mechanical or hydraulic. No computers.

Fast forward to 20yrs from now, nobody's going to be carrying the computer boards for a 2004 Toyota Prius or a 2013 Tesla.

However, you'll still be able to restore your grandfather's '57 Chevy...

Re:No different than cars by AK+Marc · 2013-06-04 16:21 · Score: 3, Informative

You'll just have to take the Prius ROM on an emulator on your phone, and plug in your phone to drive your car. Easy.

--
Learn to love Alaska
Re:No different than cars by Anonymous Coward · 2013-06-04 22:57 · Score: 0

64-68 Mustangs are booming. You can build one from scratch too.
Re:No different than cars by Anonymous Coward · 2013-06-05 03:13 · Score: 0

Mustangs are booming.
You have the Mustang confused with the Pinto...
- T
Re:No different than cars by uninformedLuddite · 2013-06-06 12:44 · Score: 1

We're still able to restore cars from the 80s and earlier as the cars were fully mechanical or hydraulic. No computers.
We won't be allowed to drive it though. Unless of course it is after the EMP and then we will all be rich cab drivers

--
The new right fascists are bilingual. They speak English and Bullshit.

Code should accompany data by michaelmalak · 2013-06-04 15:35 · Score: 4, Interesting

I presented a solution to this long-standing problem last year to the Denver HTML5 Meetup.

Code should never be separated from data. This is possible with HTML5, JavaScript, and open source.

In the presentation, I steal and repurpose Hofstadter's analogy of DNA to an LP vinyl record, which is an information bearer, but useless without its information retriever (the record player). Like the cell of an animal, which contains both DNA and the means to "play" it, I ask why not the same with software?

My maxim is: data should always carry the code with it to play itself. It was inspired from the field I've spent 50% of my career in: non-destructive testing where, for example, X-Rays and ultrasounds are performed on safety-critical industrial parts with 50-year service lives. If one of those parts fails and kills someone, you're going to want to go back into the old data and find the earliest indication of the flaw or fault and reinspect every other part in the world like it that is still in service. And maybe you need to go back 50 years. Under such a context, not providing the code with the data could be considered an act of gross neglect.

In my presentation, I use the 1990's era trick of embedding XSL into an XML file, with the addition of the XSL now being able to use HTML5/JavaScript. Sadly, I've only gotten it to work with Firefox -- the other browsers consider it a security violation.
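For readers who have not seen the embedding trick, here is a rough Python sketch of what such a file can look like. It is not the file from the presentation: the `record` and `reading` elements, the `style` id and the sample value are all invented for illustration. The stylesheet sits inside the data file and the `xml-stylesheet` processing instruction points at it by fragment id. Python's standard library has no XSLT processor, so the sketch only confirms that the combined document is well-formed and that both the data and the embedded stylesheet can be located; whether a given browser will actually apply it is a separate matter, as noted above.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"

# Data and the stylesheet that renders it, carried in one file (hypothetical content).
SELF_RENDERING_DOC = """<?xml-stylesheet type="text/xsl" href="#style"?>
<!DOCTYPE record [
  <!ATTLIST xsl:stylesheet id ID #REQUIRED>
]>
<record>
  <xsl:stylesheet id="style" version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="/">
      <html><body><h1><xsl:value-of select="/record/reading"/></h1></body></html>
    </xsl:template>
  </xsl:stylesheet>
  <reading>flaw depth 0.4 mm</reading>
</record>
"""


def inspect_document(text):
    """Parse the document and return (stylesheet id, data value)."""
    root = ET.fromstring(text)
    stylesheet = root.find(f"{{{XSL_NS}}}stylesheet")
    return stylesheet.get("id"), root.findtext("reading")


if __name__ == "__main__":
    print(inspect_document(SELF_RENDERING_DOC))
```

The internal DTD subset declares `id` as an ID attribute so that the `#style` fragment can resolve to the stylesheet element.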

Re:Code should accompany data by femtobyte · 2013-06-04 16:05 · Score: 1

From a future data recovery standpoint, how is the "code" any more useful than data? You'd still need to be able to figure out how to execute the code itself --- the code is just an especially complex and capable file format (which likely makes it very difficult to figure out if you've lost the execution instructions). Some file formats are already complete programming languages --- like PostScript. Do you think you could make much sense of a PostScript representation of a document if you started without a PostScript interpreter available (or at least a comprehensive PostScript specification, and a heck of a lot of free time)? Why is JavaScript any more likely to be well-known in the year 2100 than PDF? Or any easier to reverse engineer? Your trick of embedding XSL only "worked" because you were lucky enough for XSL to stick around as a commonly used file format --- or did you have a magical future-seeing crystal ball in the '90's (and then, why didn't you warn anyone about 9/11)?
Re:Code should accompany data by Anonymous Coward · 2013-06-04 16:07 · Score: 0

That works fine and dandy until you remember that programming languages disappear and OSs and processor architectures change, so neither the source nor binaries are future-proof.
Re:Code should accompany data by michaelmalak · 2013-06-04 16:35 · Score: 1

It's an issue of installed base.
The installed base of any given NDT system is typically less than Qty. 100, often much less. The installed base of HTML5 interpreters is on the order of a billion. The installed base of PowerPoint 97 at its peak was in the tens of millions. To be honest, I think Vint Cerf is complaining a bit much. Anyone (including him) could download the appropriate VMs, archival operating systems, and archival Microsoft Office systems to read PowerPoint 97 and even convert it to a modern format where the file could even be further edited and modified. In his search to provide an example, I don't think Vint Cerf came up with a good one.
A life-critical system with an installed base of less than 100 where the data must be preserved for 50 years is a better example, and answers your question: why code can be more useful than data. Code from an installed base of 100 is useful if it relies upon software that had an installed base of a billion.
Re:Code should accompany data by femtobyte · 2013-06-04 17:31 · Score: 1

PDF readers have an install base of pretty much every current computer on the planet. In fact, Microsoft Office has an install base at least able to *read* the formats of pretty much every current computer on the planet. So does Adobe Flash. If your argument is security-through-install-base, then why aren't these approximately as "safe" as JavaScript? I'd consider an MS Office document or a Flash application to be a pretty iffy long-term archival format; however, according to an "install base" criterion, they seem like perfectly good choices. In fact, that's the same logic that encouraged so many people to use Office '97 back in '97: "everyone uses it; stop being silly and bothering me about 'open standards.'" I don't disagree that a wide install base is one ingredient in increasing the probability of format longevity; but I think you'll need more sophisticated criteria than that to not get seriously blindsided by widespread future changes --- "everyone switched to Python 5 a decade ago --- the last time anyone would have been running a JavaScript interpreter, they were still using *magnetic* storage media. Good luck getting one of those working!"
Re:Code should accompany data by michaelmalak · 2013-06-04 17:46 · Score: 1

The likely long-term viability of PDF does not discredit the long-term viability of JavaScript.
You agree with Vint Cerf about PowerPoint 97 being an archival format. I disagree with him. As I wrote, I believe software archives will maintain VMs, copies of OS's, and copies of Microsoft Office due to the historical installed base of tens of millions.
The question is whether you would be able to find a JavaScript interpreter on a search engine in 2100. I believe the answer is "yes," because it was an important (by installed base) piece of software. Admittedly, it won't help you in an apocalyptic scenario.
Re: Code should accompany data by Anonymous Coward · 2013-06-04 17:56 · Score: 0

So when distributing an old OpenOffice file I should also put OpenOffice? and maybe also the whole operating system, 'cause a lot of libraries and syscalls are gonna be eradicated 10 years from now....
You need to consider dependencies; your solution does not work for every data format... and that's assuming your system does not depend on the hardware...
Re: Code should accompany data by michaelmalak · 2013-06-04 17:57 · Score: 1

So when distributing an old OpenOffice file I should also put OpenOffice?

OpenOffice should have an option to save to an XML file that includes an embedded XSL/HTML5/JavaScript viewer program as I described in my presentation.
Re:Code should accompany data by femtobyte · 2013-06-04 18:16 · Score: 1

By that logic, though, there's nothing particularly special about your concept of "code accompanying the data." PDF is an "important (by installed base) piece of software" --- and whatever archived VMs that still have a working JavaScript interpreter (web browser) will *also* have a working PDF reader. While you're making a big deal of keeping the "code" (XML/JavaScript) with the data, this is actually entirely irrelevant to what the mechanism you propose relies on: constantly maintaining a working "chain" of VMs for VMs for VMs for whatever systems are (were) commonly in use. This approach might well work; after all, storage is cheap these days. However, it is prone to catastrophic failure: if the "chain" of generating new VMs is ever broken, you're left with an extremely complicated and opaque mess of bits that would require re-inventing entire (dead) operating systems to restore.
An alternate approach to maintaining ever increasing opaque complexity (better not let anything important slip into the cracks) is to try to come up with clever ways to produce formats/archives that would be especially fast/easy to reverse-engineer and bootstrap from first principles if no "living" interpreter could be found. This is a hard problem, if not impossible to find a satisfactory answer, but worth thinking about. If you can popularize such a format, then you'll have both the protection you propose ("eternal" support in nested VMs), *and* a backup plan for recovering information if, over the decades, you misjudge what technological branches will be faithfully preserved for posterity.
Finally, one potential monkey wrench in the works of your plans to always have a chain of older operating systems: operating systems are becoming increasingly dependent not only on self-contained binaries on a machine, but internet connections to gigantic networks of services. What happens when you try to boot up Windows 2038 in the year 2065 (through a few intermediate virtual machines), only to find that the OS needs to connect to several hundred servers/services that were scrapped twenty years ago? So, for your proposal to work, you also need to force operating system designers not to create any external dependencies --- exactly opposite to the current trend of "cloudifying" everything in sight. Increasing integration with "the cloud" would be fatal to your data-preservation method.
Re:Code should accompany data by michaelmalak · 2013-06-04 18:28 · Score: 1

I consider PDF to be powerful because it can contain JavaScript, and even embedded mouse-driven interactive animated 3D. I consider PDF to be a lateral alternative to embedding JavaScript in XML as I presented.
I agree, self-describing formats, such as the Voyager pixel image and the Contact engineering diagrams, are interesting. They solve the extreme of the problem, in a survivalist way. It's the bomb shelter level of planning, whereas it would be reasonable for most people to instead just stock 7-30 days of provisions on a shelf. A shelf is convenient, as is a data file that executes itself against a widely-available interpreter.
Yes, I see the chain of operating systems as the solution just for the closed source world, and I'm hoping those days are behind us or soon will be.
Re:Code should accompany data by Anonymous Coward · 2013-06-05 01:25 · Score: 0

That works fine and dandy until you remember that programming languages disappear
A suitably competent person of the future could almost certainly read the logic of an unknown programming language from the past. A lot of it is similar to maths, which is likely to last as long as the human race.
Re: Code should accompany data by Yo_mama · 2013-06-05 05:48 · Score: 1

Really? A viewer that works on N^X operating systems, including ones we haven't written yet?

--
Never underestimate the power of human stupidity -Lazarus Long
Re: Code should accompany data by michaelmalak · 2013-06-05 06:00 · Score: 1

A viewer that complies with W3C standards for HTML5/JavaScript.
Re:Code should accompany data by yusing · 2013-06-05 12:25 · Score: 1

I can retrieve the -essential- information on an LP vinyl record with a piece of paper and a pin.
I vote for ASCII (or some representation with all-printable characters) for all data that's got to last. That way it can be printed on paper which can potentially last for thousands of years... even buried in a garbage pit. https://en.wikipedia.org/wiki/Oxyrhynchus_Papyri
At least, until someone invents little disks you can spin over a self-powered stonelike table and they're read back to you.
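The all-printable representation suggested above can be sketched roughly as follows. This is only an illustration, not anything the poster described: Base32 is picked here because it avoids lowercase and easily confused symbols, and the 20-byte line size and the per-line CRC-32 column are assumptions added so a transcription error can be pinned to a single line.

```python
import base64
import zlib

LINE_BYTES = 20  # hypothetical chunk size: 20 bytes become 32 Base32 characters


def to_printable(data: bytes) -> list:
    """Encode binary data as printable lines, each followed by a CRC-32 in hex."""
    lines = []
    for i in range(0, len(data), LINE_BYTES):
        chunk = data[i:i + LINE_BYTES]
        text = base64.b32encode(chunk).decode("ascii")
        crc = zlib.crc32(chunk) & 0xFFFFFFFF
        lines.append(f"{text} {crc:08X}")
    return lines


def from_printable(lines: list) -> bytes:
    """Decode the lines again, rejecting any line whose checksum does not match."""
    out = bytearray()
    for number, line in enumerate(lines, start=1):
        text, crc_hex = line.split()
        chunk = base64.b32decode(text)
        if zlib.crc32(chunk) & 0xFFFFFFFF != int(crc_hex, 16):
            raise ValueError(f"checksum mismatch on line {number}")
        out += chunk
    return bytes(out)


if __name__ == "__main__":
    for line in to_printable(b"data that has got to last"):
        print(line)
```

The trade-off is size: Base32 output is about 60% larger than the input before the checksum column is added.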

--
"You must try to forget all you have learned. You must begin to dream." -- Sherwood Anderson

see Windows 1250 and 1251 by Doug+Merritt · 2013-06-04 15:39 · Score: 2

Windows 1250 and 1251 do, and possibly others. It sounds familiar, but my memory is fuzzy, so I just looked around.

https://en.wikipedia.org/wiki/Windows-1250

--
Professional Wild-Eyed Visionary

apt-cache search EBCDIC by Burz · 2013-06-04 15:45 · Score: 1

Yields 4 results in Ubuntu. You can search reputable open source archives on the web, too.
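For what it's worth, a minimal sketch of reading EBCDIC without any extra packages, assuming the data uses code page 037 (one of several EBCDIC variants; which one a given tape or file uses is something you would have to find out). Python's standard codecs include `cp037`:

```python
def ebcdic_to_text(data: bytes, codepage: str = "cp037") -> str:
    """Decode EBCDIC bytes to a Python string (the code page is an assumption)."""
    return data.decode(codepage)


# "HELLO" as it would be stored in EBCDIC code page 037
sample = bytes([0xC8, 0xC5, 0xD3, 0xD3, 0xD6])

if __name__ == "__main__":
    print(ebcdic_to_text(sample))
```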

How deep are your pockets?

*IBM Consulting*

Um, really???

Latex by mnajem9960 · 2013-06-04 15:46 · Score: 0

some said use Latex or VIM/Notepad plain ASCII

Re:So? by ArhcAngel · 2013-06-04 15:47 · Score: 1

*spoilers*

--
"A person is smart. People are dumb, panicky dangerous animals and you know it." - K

real problem is: FEATURE CREEP by bussdriver · 2013-06-04 15:48 · Score: 2

I've been part of archival problem planning. We went with DVD. Now that I am not there, I suspect they are thinking DVD sucks and are moving "forward" when the DVD was more than good enough and those plastic discs will last a century. MPEG-2 files will have open source decoders. Now physical readers will still be a problem... the only solution is to wait as long as possible and then switch to the next long lasting format - but not necessarily the newest one at that time. (which is why moving to Blu-ray is a waste of money.)

The biggest problem with other formats is the FORMAT; even with something like open office documents, the ODF format will have revisions and new features added and tweaks to the format. version 2, 3 etc. The features and changes that promote the creation of more and more formats is the biggest problem. Just like my above DVD video problem- if you go beyond your needs then you are complicating things with more and more formats.

TEXT? sucks. we need WORD! Word 1.0? the app sucks... we need WORD 20! (and all versions in between to migrate the old docs...plus labor to deal with conversion issues...)

Perhaps we need ARCHIVAL formats; like PDF, which has done well apart from the stupid additions Adobe has been making to it. Or just TEXT export... a less bloated output-only format without the feature BS problems.

Thankfully, email remains the same... sort of. although storage of the emails differs greatly; if you want to archive emails you need to pick a close-to-the-source method (and simple storage filesystem-- good luck reading that NTFS formatted disk image in 30 years.)

--
Democracy Now! - uncensored, anti-establishment news

Re:real problem is: FEATURE CREEP by Stuarticus · 2013-06-05 00:59 · Score: 1

I'll gladly bet with you that all those DVDs won't last one hundred years. Unless the archival problem planning you were involved in was more than ten years ago, DVD was an insane option; even then it was sketchy at best.

http://www.thexlab.com/faqs/opticalmedialongevity.html

--
If you think someone isn't free to have a different definition of "freedom" you may be a tyrant.
Re:real problem is: FEATURE CREEP by bussdriver · 2013-06-05 14:21 · Score: 1

DVD-RAM and DVD-R in cartridges, never exposed. Blu-ray wasn't even a name back when we did this. HD storage was a joke and way more expensive at the time. Tapes were expensive but were looked at as an option. Best of all, a long list of devices that could read the things being around for a long time. (Sure you can get PATA to SATA adapters, but will you be able to read an old HD's filesystem? probably.)
Well, I'll bet $1,000,000 they will last 100 years. You can collect the money from my grandchildren :-p
It doesn't matter as long as it lasts a long enough time to avoid upgrading and migrating data multiple times. It'll last beyond Blu-ray and perhaps beyond its replacement. It may be migrating to HDs today as far as I know... knowing them, probably RAID 5 without a backup (they kept thinking RAID 5 had built-in backup way back then... which BTW was hardware only back then.)

--
Democracy Now! - uncensored, anti-establishment news

The Print Button by Anonymous Coward · 2013-06-04 15:48 · Score: 0

Use it. We're way too obsessed with saving everything. If it's not worth paper and ink, we won't miss it in 100 years.

Re:The Print Button by Macgrrl · 2013-06-06 10:44 · Score: 1

If you want to keep it, you should probably be laser printing it, inkjet ink fades.

--
Sara
Designer, Gamer, Macgrrl in an XP World

I do blame Microsoft by Darinbob · 2013-06-04 15:59 · Score: 4, Informative

Seriously, why would Vint Cerf not blame Microsoft? They have an extremely poor track record with backwards compatibility, and I don't think they even know what forwards compatibility is. If you design the data formats correctly then you can keep things usable for decades (or centuries). Guess what, twenty year old TeX documents still work, and yet Word X won't work with Word X-2. I've pulled runoff documents off of 70's versions of Unix that can still be printed. That says to me that one can deal with compatibility issues.

This is all intentional on Microsoft's part too. They make money when customers buy new copies of software, so it is in their best financial interests to make sure that customers have significant pressure to upgrade. I remember the solution to an acknowledged bug for Word 97 was to make sure that everyone who was going to read your document had the appropriate Word 97 plug in in their older version of Word. I completely blame Microsoft here.

This is not that hard a problem, IF the company pays attention to it and gives it even a small amount of priority.

Re:I do blame Microsoft by Narcocide · 2013-06-04 16:19 · Score: 0

Seriously, why would Vint Cerf not blame Microsoft?
+1 Underrated.
Re:I do blame Microsoft by KiwiSurfer · 2013-06-04 17:56 · Score: 1

I think Microsoft is doing very well in terms of backward compatibility when you compare certain products of theirs to those offered by other companies. One example I'm sure many people can relate to is Windows. It is the only OS I'm aware of that can run many apps compiled in the '90's on the current version without requiring a lot of hacking. I still play games from the '90's on Windows 8, many of which work fine without any tweaking. Try that with Apple, Linux, et al. While Microsoft hasn't always done well with all their products (IE, Outlook, et al come to mind), there are several products (such as Windows) where they are doing very well and they should be acknowledged for that.
Re:I do blame Microsoft by mhotchin · 2013-06-04 17:59 · Score: 2

To say that MS has a poor record of backwards compatibility is, well, ridiculous. It's only just about *the* most important thing for them, because the majority of their business is with businesses, and if their FooBar app doesn't run, then they don't upgrade.
No other OS has near the level of compatibility that the MS sequence does.
http://www.youtube.com/watch?v=vPnehDhGa14
http://blogs.msdn.com/b/oldnewthing/archive/2006/11/06/999999.aspx
http://blogs.msdn.com/b/oldnewthing/archive/2003/08/28/54719.aspx
Re:I do blame Microsoft by serviscope_minor · 2013-06-04 19:50 · Score: 2

No other OS has near the level of compatibility that the MS sequence does.
Somebody's been drinking the kool-aid.
There's a small, little known company called IBM selling a type of computer called a "mainframe" which might beg to disagree. You can buy a modern mainframe which will still run your unmodified programs which you wrote on an original System 360. In 1964.
Microsoft have not even existed as long as that chain of backwards compatibility, and you try getting the original digger to run on Windows 8 (or RT! ha! instruction set changes are no barrier to IBM apparently) without Dosbox.

--
SJW n. One who posts facts.
Re:I do blame Microsoft by Darinbob · 2013-06-04 20:12 · Score: 2

The OS stays compatible in some ways (Windows is not at all unique here). However the Microsoft applications have serious problems in this regard. Maybe some of the competition is not so great either but it's no excuse when Word can't even be compatible with itself. They have changed the file format in Word in fundamental ways several times.
Re:I do blame Microsoft by drinkypoo · 2013-06-05 01:44 · Score: 2

No other OS has near the level of compatibility that the MS sequence does.
It's called ANSI C on Unix. Pick up a copy of The UNIX Programming Environment and you can still use the examples verbatim on a Linux machine today. And you can even still use Motif apps, if we're talking about GUI programs. They still work just like they did when they were new, except a hell of a lot faster.
Oh, you want backwards compatibility for closed-source software? Guess what? Plenty of software craps itself when it does anything interesting on the wrong version of windows. In reality, there's only one way to ensure compatibility, and that's to have your hands on the source — and for it to be worth a crap to begin with.

--
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
Re:I do blame Microsoft by Mirar · 2013-06-05 06:04 · Score: 1

I would blame Microsoft as well.
It's their problem to invent a format that is readable in the future and in the future create readers that can read all old formats they invented.
Do that design well and it's an easy problem.
Don't do it well (they didn't) and it's a hard problem.
It doesn't even have to be separated in time for it to become a problem - formulas in Excel used to be in the local language, so you couldn't even exchange sheets to another region (or a computer with a different language in the same region). I haven't heard about it, but I kind of bet there were issues with transferring files between PC and Mac for the same Microsoft Office (in the same language) as well.
Re:I do blame Microsoft by Darinbob · 2013-06-05 06:43 · Score: 1

One thing I found interesting was that the MFC stuff from Microsoft (which I rarely used, a nasty piece of work that is) has some code to automatically save and restore your class states. However all it does is write out a version number and then essentially do a binary dump of the data. If the version number doesn't match when reading it in then it will refuse. There are so many things wrong with that approach I don't know where to start. But having Microsoft present that to their developer customers as an example of how things should be done is telling.
People don't get to see much code written by Microsoft, but what can be seen is very often very naive, poorly written, and violating Microsoft's own standards. Most of this code is samples, though: things to show devs how to use a new DLL, for instance. So part of me wonders if the abysmal quality is only because they assign interns to write this stuff and that the smart developers do more important things, but at other times I wonder if this is a more pervasive style internally, which would explain the MFC stuff.
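To make the complaint concrete, here is a rough sketch of the alternative: a reader that still accepts older versions instead of refusing on any mismatch. This is not MFC code and not how MFC lays out its data; the record layout, the field names `x` and `y`, and the version numbers are all made up for illustration.

```python
import struct

CURRENT_VERSION = 2  # hypothetical: v1 stored only x, v2 added y


def save(x: int, y: int) -> bytes:
    """Write the current layout: uint16 version, int32 x, int32 y (little-endian)."""
    return struct.pack("<Hii", CURRENT_VERSION, x, y)


def load(blob: bytes) -> dict:
    """Read any known version, filling in defaults for fields added later."""
    (version,) = struct.unpack_from("<H", blob, 0)
    if version == 1:
        (x,) = struct.unpack_from("<i", blob, 2)
        return {"x": x, "y": 0}  # y did not exist in v1, so use a default
    if version == 2:
        x, y = struct.unpack_from("<ii", blob, 2)
        return {"x": x, "y": y}
    raise ValueError(f"unknown record version {version}")


if __name__ == "__main__":
    old_file = struct.pack("<Hi", 1, 7)  # a record written by the "old" program
    print(load(old_file))
    print(load(save(3, 4)))
```

The point is only that the version number is used to pick a reader rather than to reject the file; a newer, unknown version still fails, which is the forwards-compatibility half of the problem.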

This problem isn't new to anyone by kriston · 2013-06-04 16:21 · Score: 1

This problem isn't new to anyone. If it's new to you, then you need to get involved in the digital preservation movement.

http://en.wikipedia.org/wiki/Digital_obsolescence

--

Kriston

Re:This problem isn't new to anyone by gweihir · 2013-06-04 17:23 · Score: 1

Indeed. I remember hearing a very good talk about preserving digital images 15 years ago. They were going for TIFF without any compression as writing software that recovers the image becomes very simple then. .pnm formats in ASCII would also do it.
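As a rough illustration of how little code an ASCII .pnm format needs, here is a sketch of a writer and reader for the plain-text graymap variant (magic number `P2`). The function names are mine, and the reader only handles what the writer produces plus `#` comments; it is not a complete Netpbm implementation.

```python
def write_pgm_ascii(pixels, maxval=255) -> str:
    """Serialize a list of rows of gray values as a plain (ASCII) PGM."""
    height = len(pixels)
    width = len(pixels[0])
    lines = ["P2", f"{width} {height}", str(maxval)]
    for row in pixels:
        lines.append(" ".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


def read_pgm_ascii(text: str):
    """Recover the rows of gray values from a plain (ASCII) PGM."""
    tokens = []
    for line in text.splitlines():
        line = line.split("#", 1)[0]  # drop comments
        tokens.extend(line.split())
    if tokens[0] != "P2":
        raise ValueError("not a plain PGM file")
    width, height, _maxval = (int(t) for t in tokens[1:4])
    values = [int(t) for t in tokens[4:4 + width * height]]
    return [values[r * width:(r + 1) * width] for r in range(height)]


if __name__ == "__main__":
    image = [[0, 128, 255], [255, 128, 0]]
    print(write_pgm_ascii(image), end="")
```

Someone with no specification at all could probably still guess the layout from the file itself, which is the property the talk was after.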
This problem is not new, and it is solved. It is solved as long as you look at whether a particular product ignores this problem or not _before_ you decide to standardize on it. If people were not so ignorant, Microsoft could have never pulled this stunt that made them countless billions.

--
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.

"hard problem" by macraig · 2013-06-04 16:29 · Score: 4, Insightful

Vint, that's bullshit and you know it. It's nothing more than preserving syntaxes, grammar, file formats. That's not hard, and it only requires someone to create a format conversion ONCE to solve the problem at each stage of the evolution.

The real problem here is proprietary non-public formats and structures. When the structure of data has been a closely guarded secret and requires reverse engineering that may not even yield a perfect result, THAT is hard.

Re:"hard problem" by gweihir · 2013-06-04 17:24 · Score: 1

Some say that for MS file formats, not even MS has a spec that is usable. Would explain why documents get mangled when you go from one Word version to another. Pathetic.

--
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.

Re:Code should NEVER accompany data! by lahvak · 2013-06-04 16:41 · Score: 4, Insightful

No! Fail! You don't get it!

1) Code is data
2) Code is data that is especially hard to interpret
3) One of the main reasons of all this mess ia that in all those proprietary formats, data is intermixed with code, and the whole mess is very hard to parse.

Data should be kept completely isolated, as far away from code as possible. That way, if you cannot interpret the code any more, you will still be able to analyze and parse the data. You know, it is not that hard to construct a record player.

--
AccountKiller

He should be blaming Microsoft by gweihir · 2013-06-04 16:41 · Score: 2

My first LaTeX publications from 20 years back and all my human-readable ASCII scientific data can still be read and used without any problem. Human-readable file formats in the UNIX tradition completely solve this problem.

This problem is only hard if the people making the data formats are either stupid or do not want their formats to be easily accessible to other applications, as Microsoft does. Of course, others are creating just as fundamentally broken formats for either of the same reasons.

--
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.

Re:He should be blaming Microsoft by hcs_$reboot · 2013-06-04 17:30 · Score: 2

Just hex print the MS 97 file and you have a human readable format:
00007b0 5f00 675f 6f6d 5f6e 7473 7261 5f74 005f
00007c0 696c 6362 732e 2e6f 0036 5f5f 7270 6e69
00007d0 6674 635f 6b68 6500 6978 0074 6573 6c74
00007e0 636f 6c61 0065 626d 7472 776f 0063 706f
00007f0 6974 646e 7300 7274 636e 7970 7000 7475
0000800 0073 6177 6e72 0078 5f5f 7473 6361 5f6b
0000810 6863 5f6b 6166 6c69 6900 7773 7270 6e69
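Going the other way is mechanical, for what it's worth. A small sketch that turns one row of such a dump back into bytes, assuming the layout is an offset column followed by 16-bit words printed in little-endian order (the default look of `od` on x86; that is a guess about how the dump above was made):

```python
def decode_dump_row(row: str) -> bytes:
    """Convert one dump row (offset, then 16-bit hex words) back into raw bytes."""
    fields = row.split()
    words = fields[1:]  # the first column is the file offset
    out = bytearray()
    for word in words:
        out += int(word, 16).to_bytes(2, "little")  # assumed byte order
    return bytes(out)


def printable(data: bytes) -> str:
    """Show ASCII characters as-is and everything else as a dot."""
    return "".join(chr(b) if 32 <= b < 127 else "." for b in data)


if __name__ == "__main__":
    row = "00007b0 5f00 675f 6f6d 5f6e 7473 7261 5f74 005f"
    print(printable(decode_dump_row(row)))
```

Under that byte-order assumption, the first row of the dump decodes to a readable symbol name padded with NUL bytes.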

--
Slashdot, fix the reply notifications... You won't get away with it...
Re:He should be blaming Microsoft by gweihir · 2013-06-04 17:31 · Score: 1

Your standards are too low.
Or maybe you have been exposed to too many corporate BS documents....

--
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.

this is a real problem and has been for a while by Anonymous Coward · 2013-06-04 16:47 · Score: 0

Guys this is NOT Microsoft specific. This has come up with NASA and old telemetry stored on tape and programs stored on punch cards.

The problem is that to store documents for long periods of time you have two methods:
1) photograph on microfiche
2) print on high quality low acid paper.
For pictures you have one method only and that is to get your photograph put on a negative.

That is it
This is because there is no official electronic backup standard that has been proven to last. None, zippo, goose egg.

Don't complain about Microsoft, or any other company. We (the public) have not demanded permanence for our electronic records, so we have none.

So if you want your data to stick around, get it printed.

I have a habit of telling people what they do not want to hear, and people do not really want to hear this. If you do not make your data permanent, by the methods which I have specified, it will be gone very quickly.

Chip

This is a serious problem? by Anonymous Coward · 2013-06-04 16:48 · Score: 0

Try telling a Syrian villager that you are very worried about not being able to open 1997 Powerpoint presentations 100 years from now.

the man is out of touch by stenvar · 2013-06-04 16:49 · Score: 2

You can get emulators for just about every machine you can imagine: PDP-10, PDP-11, DOS, Atari, Amiga, C64, microcontroller, etc. You can get hardware emulators with FPGAs if you like. Almost any important format is documented or has been reverse engineered. Yes, you can easily read 1997 PowerPoint files, even if his weird choice of Office on Mac can't. And that's only with current technology. Give it a few decades and all that can happen behind the scenes and computers will just automatically perform even the most complicated data conversions behind the scenes. "Computer, scan the 1997 floppy and put the data on screen."

Re:the man is out of touch by uninformedLuddite · 2013-06-06 12:51 · Score: 1

Give it a few decades and all that can happen behind the scenes and computers will just automatically perform even the most complicated data conversions behind the scenes. "Computer, scan the 1997 floppy and put the data on screen."
Really? Do I need to insert my credit card so that all the different patent and rights holders get their cut as I pass through their digital territory?

--
The new right fascists are bilingual. They speak English and Bullshit.
Re:the man is out of touch by stenvar · 2013-06-06 19:50 · Score: 1

That is entirely up to you. But if you want to read the data, you can.

Re:So? by wickedskaman · 2013-06-04 17:09 · Score: 2

Who hurt you? :-(

--
Sand's overrated... it's just tiny little rocks.

Not hard by Tough+Love · 2013-06-04 17:25 · Score: 1

Backward compatibility is not a hard problem, Vint Cerf just isn't very good at it as evidenced by the IPv6 fiasco.

--
When all you have is a hammer, every problem starts to look like a thumb.

Darth Cerf is nuts by rs79 · 2013-06-04 17:39 · Score: 1

What's he doing keeping stuff in MS apps for? Then when they don't work 5 years later he's all like OMG THE NET WILL BREAK.

Idiot. He knows better. Or should.

--
Need Mercedes parts ?

Yes Yes by Greyfox · 2013-06-04 17:48 · Score: 1

Blah blah digital dark age blah blah gay sex. Anything worth preserving is still written in books. Do we really need to preserve every byte of information on the internet for posterity? Most of us dildos just aren't that interesting. I'm pretty sure the future will survive just fine if every cat video on YouTube doesn't last to be reviewed by future generations. If every movie for the past couple-three decades went up in smoke, anything of value lost? Really? Granted, I like to imagine some future-NPR blathering on about how that all-digital rendition of Snoop Dogg's "All my Bitches" in D minor was one of the greatest works of the era, but it probably wouldn't really be all that funny. The world doesn't need to remember you or me or that guy over there. Hell we can't even learn from our history from a few decades ago, much less from some guy who got nailed to a cross a couple thousand years ago (Probably not the one you were thinking of.)

Sure I am sometimes saddened at the thought of the video games of my youth being lost forever, but even if they weren't it wouldn't recapture the joy I felt upon encountering them at the time. Do you think you are more important than that? Think of the current year and then start going back a decade at a time and name one person you know of from that time. How long before you run out of people you know personally? Before you run out of people you have even heard of? I bet most people can't even make it a century. Millions of men fought in the world wars, many of their stories are still recorded. How many people bother to look at even one? My grandfather recounted a story of seeing the first automobiles in his town, how many people even think of a time when they didn't exist, or the time when they were new to the world? Precious few I reckon.

If you want to worry about what history will think of THIS time, perhaps you should be a more careful custodian of previous ones.

--

I'm trying to teach myself to set people on fire with my mind... Is it hot in here?

Re:Yes Yes by uninformedLuddite · 2013-06-06 12:53 · Score: 1

If you want to worry about what history will think of THIS time, perhaps you should be a more careful custodian of previous ones.
I dread to think that archives of youtube twitter and facebook survive into the future. We will be looked back upon as a bunch of retarded idiots.

--
The new right fascists are bilingual. They speak English and Bullshit.

It's hard for obscure forms of content by Animats · 2013-06-04 17:53 · Score: 1

I have simulation programs trapped in Working Model for Mac format. I have 3D animation projects trapped in Softimage 3D for Windows NT. Neither is easily convertible to anything else. (Worse, they're on DAT tapes.)

Images, video, audio, and text documents are easy to convert because there are modern formats that directly correspond to them. But some things don't translate well.

Preservation by Anonymous Coward · 2013-06-04 17:55 · Score: 0

Or we could all use The National Archives Digital Preservation tool called Xena, which converts proprietary non-public formats to open source formats; see https://sourceforge.net/projects/dpsp/?source=directory - it is how the Archive is dealing with the problem.

Re:Code should NEVER accompany data! by michaelmalak · 2013-06-04 18:01 · Score: 1

In my presentation, you'll see that the strategy of embedding XSL in an XML file has the code in the top half and the data in the bottom half, clearly delineated. They are easily separable. But by having them in a single file, they will not get separated by someone copying them.
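A rough sketch of what "easily separable" could look like in practice. The document layout below (an `archive` root holding an `xsl:stylesheet` followed by a `data` element, with `reading` entries) is invented for illustration and is not taken from the presentation; whether a given browser will apply a stylesheet referenced as `#viewer` is also not something this sketch checks. It only shows that a generic XML parser can pull the data half out without understanding the code half.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"

# Hypothetical single-file layout: viewer code on top, data below.
DOCUMENT = """
<?xml-stylesheet type="text/xsl" href="#viewer"?>
<archive>
  <xsl:stylesheet id="viewer" version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="/">
      <html><body><xsl:apply-templates select="//reading"/></body></html>
    </xsl:template>
  </xsl:stylesheet>
  <data>
    <reading t="0">1.5</reading>
    <reading t="1">1.7</reading>
  </data>
</archive>
""".strip()


def split_code_and_data(xml_text: str):
    """Return the (stylesheet, data) elements of the combined file."""
    root = ET.fromstring(xml_text)
    stylesheet = root.find(f"{{{XSL_NS}}}stylesheet")
    data = root.find("data")
    return stylesheet, data


def readings(data_element):
    """Extract (time, value) pairs from the data half only."""
    return [(r.get("t"), float(r.text)) for r in data_element.findall("reading")]


if __name__ == "__main__":
    _, data = split_code_and_data(DOCUMENT)
    print(readings(data))
```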

Forced encryption of users data by Anonymous Coward · 2013-06-04 18:20 · Score: 0

Again a reason that closed formats for storage of data should not be legal.
The only way to ensure users the right to their own data, and historians access to old data, is to store data in open formats.
Whether software should be "free" or not can in my opinion be discussed, but I find it hard to understand that the world cannot see that the use of closed formats for storage is similar to encrypting the users' data without asking.
Give us the freedom of plain paper back !

Old floppies are a problem. by Static · 2013-06-04 18:29 · Score: 1

Old samplers are rather a victim of that. The hardware is often fine and can still crank out some awesome sounds, but they are often diskette based and storage technology has moved on hundreds of times faster than synthesizer technology.

The Ensoniq scene has almost abandoned the EPS series because they used double-density drives and DD 3½" floppies haven't been made for years - and HD floppies aren't reliable in DD drives. Nowadays even HD diskettes are losing their stored bits. *All* the people keen to keep the ASR-10 alive have shifted to SCSI solutions because floppies are just not reliable anymore.

Wade.

Re:Old floppies are a problem. by NJRoadfan · 2013-06-05 00:40 · Score: 1

Floppy emulators are becoming a popular alternative for use in synths and older computers. I'm finding out that even HD disks aren't reliable in HD drives. Brand new disks with bad sectors are common as media quality took a nosedive. Many disks I have used for sneakernetting to my 486 that were just months old have become unreadable (it's not the drives), good thing there wasn't anything important on them! Meanwhile disks from years ago still work fine.

Keeping Old Computers by Anonymous Coward · 2013-06-04 18:37 · Score: 0

I guess that justifies the stash of all those old computers I'm hoarding...

CSV by VortexCortex · 2013-06-04 18:39 · Score: 1

Bullshit. You're merely enjoying the consequences of voluntary DRM. If you don't care about your data you'll lose it, just like those pictures you used to draw in crayon that hung on the fridge. If they ARE important then you can keep them and use the data indefinitely.

I still run the GWBASIC programs, and even 16 bit x86 DOS code I wrote as a child to edit images and color palettes via keyboard in (M)CGA video modes which BIOS still emulates, and OSs like FreeDOS can still make use of (Watercolor isn't extinct because Oil paint exists, Platforms are to game makers what Canvas and Paint are to Painters). Hell even my very 1st 386 bootloader can be written to an MBR and booted on a brand new x86-64 system (disable Security Theater Boot). This is NATIVE support. With an emulator, I can even run programs I wrote for my dad's old PDP-8 -- A completely different architecture... 12 bit bytes! I cared enough about the little dinky things I did as a kid to make sure they were preserved across every major storage format change. I can still read the comments my dad thankfully added to some of my code all those years ago -- a valuable lesson indeed; My kids find gramps' snark quite funny. That's several generations of data compatibility for my family's directory tree...

It's not useful to bitch about compatibility by citing programs created by companies that willfully suck at compatibility. MS DOS requires an emulator, but DR DOS can still be installed on my new systems. It doesn't recognize my sound card, but I can still program a driver for it -- just like I did to get my old custom IR transceiver devices to control my new home theater setup (lights, screen, volume, etc) via my aging Osborne-1's serial port.... It's a functional "conversation-piece" to hear that familiar 5.25" drive access as the signal tables are loaded for TV instead of the stereo. That same data format has been in use now for decades and even works on new hardware w/ Linux via LIRC now -- thanks to the kids... old Ozy will give out someday. That's future-proof protocol compatibility across several generations of hardware, simultaneously.

There is NOTHING stopping me from converting the palettes and images created in my PAL_EDIT.COM into a GIMP .PAL / indexed .TGA or .TIFF, or .PNG, etc. I can (and do) frequently convert files in both directions, to go from GIMP to PAL_EDIT.COM to get new images and new "mods" into my really old game "engines". That's the thing about open formats and programs with source code available. (Remember the push back against non-textual network protocols, even in email?) We won this battle already. I wasn't aware anyone had stopped fighting it. This page is written in TEXT. It's JavaScript and HTML... FFS: The 1st damn web page on the Internet still renders.

The authors can ALWAYS create data converters if they want, the problem is giving up that right and not demanding source code access. If my own data formats can survive the transition from kid to teen to adult and even be shared and passed on to my own kids (who love "real" retro games, BTW, such hipsters), then surely multi-billion dollar companies can do it too. Or, are you implying that despite all that money they are more inept than I can even imagine? If so, that's a pretty big dig at Microsoft there Vint... Bravo. Kind of makes me wonder WTF you're paying them for, eh?

I expect this kind of BS from you now Vint. I mean, you don't even realize the usefulness of your own contributions to mankind, Saying that the Internet is not a human right. Look up human: A characteristic of humans; A human being. It is a human right. It's the right to bear technology. That's what the 2nd amendment is really about, they just worded it wrong, they're imperfect. Just because some old farts can't understand the future the way we do now, doesn't make new technology NOT a human right. The Internet is the equivalent of access to spee

Re:CSV by VortexCortex · 2013-06-04 19:16 · Score: 1

Tisk. Tisk. I checked to see. I used the emulator to create an MS Excel file in Office 97 format on Windows 98. I went through each version of Office I have, that is to say: Not every version. I successively pulled up the Excel file, converted and re-saved it. It now works in MS Office 2013. To me this just reinforces the idea that continuous duplication across formats is the answer. I still assert that open source programs and open formats are needed, otherwise you could lose access to a program -- I noticed that XP said it wasn't a legit copy, which I know it is.... DRM fail.

If a company goes out of business and there is no provision for its software to become accessible to others, all the products running that software may become inaccessible, Cerf said. "There are hard, complicated technical and legal problems that will have to be resolved."

The problem is recognized and there are efforts internationally to address it. Cerf said he's been in meetings about this issue attended by 400 people.

"It may be that the cloud computing environment will help a lot. It may be able to emulate older hardware on which we can run operating systems and applications," he said.
Does it take 400 of the finest minds you can muster to simultaneously shout "OPEN SOURCE"? If the company goes out of business, and their code was open source, it really doesn't require any further action to ensure the data will always be usable by the end users, eh? Businesses need to stop using artificial scarcity, stop selling infinitely reproducible bits, and simply Do Work to make money. What happens if the buzzword compliant "Cloud Computing Environment" goes out of business? Why, then you can't even try to reverse engineer your data -- It's really fucking gone then, eh, Genius?

yawn by Anonymous Coward · 2013-06-04 18:40 · Score: 0

Vint Cerf should learn when to shut the fuck up.

Daguerreotypes? by Anonymous Coward · 2013-06-04 19:54 · Score: 0

By definition they're NOT Daguerreotypes.

Daguerreotypes are single, positive images on silvered copper plates. Anything on a glass plate from the late 1850s up to the 1880s will be a negative image produced by the "wet plate" process (unless it's a positive "lantern slide", which is just a positive copy of a glass negative!). Wet plate was a development of the original negative process, the Calotype, which itself in many ways was superior to the contemporaneous and inflexible Daguerreotype.

Of course, 8" floppies are dangerously modern - I see your floppies and raise Mag Tape and Punched media (tape and card). :-)

longterm readability and backwards compatibility by waterbear · 2013-06-04 21:10 · Score: 1

There are free/libre software projects with great records in opening up interoperability and keeping backwards compatibility. On the other hand, fashions among proprietary s/w makers seem to change, and about now there is a tendency to stop worrying about existing users and just abandon past formats.

Any number of folk will say things like "shouldn't be difficult at all to reverse engineer", but that doesn't make anything happen. On the other hand, there are plenty of apostles of the latest version ready to heap abuse on anyone bold enough to ask for backwards compatibility, and that attitude is a big source of problems.

Longterm readability is helped when software developers take the trouble to maintain backwards compatibility across different versions of popular tools and across competing applications that have broadly similar uses. That doesn't directly help with hardware barriers, but at least it would be good if the number of needless software barriers is kept down.

[...] Most of these things will be readable just as long as the applications that created them are around, but not longer.
[...]
Incidentally, all my decades-old LaTeX documents still compile and can also be read directly. So can my 20-year-old ASCII-coded measurement data.

Call a spade a spade by ArsenneLupin · 2013-06-04 21:17 · Score: 1

"I'm not blaming Microsoft," said Cerf,

Let's call a spade a spade. It's 100% a problem due to opaque binary formats. Had the document been written in (clean) HTML or plain text, it would have stayed usable without problems.

the internet is filled with thieves by FudRucker · 2013-06-04 21:17 · Score: 1

A thief, for example: recently I was looking for an owner's manual for a Suzuki motorcycle in PDF form. The bike is a few years old, so Suzuki does not keep it, and the only website that has it for download wants me both to sign up for an account and to pay for the download. They did not make the owner's manual, so they have no right to withhold that information, either intellectually or materially. So I refused to sign up on their lame website, I refuse to give them money, and I will keep searching for a free copy.

--
Politics is Treachery, Religion is Brainwashing

Re:the internet is filled with thieves by Kardos · 2013-06-05 00:34 · Score: 1

Just request a copy from Suzuki
Re:the internet is filled with thieves by drinkypoo · 2013-06-05 01:58 · Score: 1

Just request a copy from Suzuki
GP says Suzuki doesn't have the manual. It's quite plausible. Ford no longer has manuals for pre-Powerstroke diesels, and the printer (Helm) is no longer printing them, so if they run out of backstock for your model year then you need to know which model years are most similar, or you need to go to eBay. When I got the FSM for my 1989 240SX I was able to get it from Nissan, but I heard they ran out a couple years later.
Personally, when I want a FSM, I go straight to eBay anyway, because it's almost always the cheapest source. But I will often google around as well, because that's almost. Of course, you can often get illicit FSM scans on eBay as well, but you get about what you pay for there. The best thing is to get the real, OE hardcopy.

--
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"

Re:So? by Anonymous Coward · 2013-06-04 22:12 · Score: 0

Yes, and by god, future historians will care about YOUR spreadsheets and YOUR websites! Egotistical jackass. No one gives a shit about 99.999999% of humanity after they're gone.

Re:Code should NEVER accompany data! by lahvak · 2013-06-04 22:38 · Score: 1

I guess that makes sense if your data is so complicated that it actually needs XML, but I would still say that for simple data that can be stored in a simple to parse format like csv or tsv, it is better to keep it separate.
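
For what it's worth, a quick sketch of how little code a tsv round trip takes, using only the Python standard library. The column names and records are invented for the example.

```python
import csv
import io

# Hypothetical measurement records; the field names are assumptions.
rows = [
    {"timestamp": "2013-06-04T22:38:00", "sensor": "A", "value": "1.25"},
    {"timestamp": "2013-06-04T22:39:00", "sensor": "B", "value": "3.50"},
]

def to_tsv(records):
    """Serialize records as tab-separated text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["timestamp", "sensor", "value"],
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()

def from_tsv(text):
    """Parse tab-separated text back into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))

tsv_text = to_tsv(rows)
restored = from_tsv(tsv_text)
```

Everything comes back as strings, which is part of the trade-off: the format carries no types, so the reader has to know what the columns mean.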

--
AccountKiller

Re: So? by Anonymous Coward · 2013-06-04 22:40 · Score: 0

To a historian any information is good. Imagine how happy a historian of the future will be to be able to run a map reduce on petabytes of documents from now. No information is not of interest to a historian if it can provide insight. Rummy summed it up well with his known unknowns and unknown unknowns. All data can tell a story; be nice to make it sing.

nasa by Anonymous Coward · 2013-06-04 22:50 · Score: 0

NASA has this same problem. They are unable to interpret most of the data from the Viking missions in the 70s. They have tapes of the data but they lost the documentation on how to interpret the 1s and 0s.

Re:nasa by arfonrg · 2013-06-05 03:05 · Score: 1

Put copies online and see how fast some nerds don't decrypt it....

--
Your thin skin doesn't make me a troll
Re:nasa by arfonrg · 2013-06-05 03:06 · Score: 1

(Ignore my "don't" in the above sentence. It made sense in my head but not so much in print.)

--
Your thin skin doesn't make me a troll

Re:So? by Anonymous Coward · 2013-06-04 22:54 · Score: 0

I'm just a realist without the delusion that somehow someone in the future will care deeply about my digital feces.

Nobody gives a shit about Vint Cerf anymore... by Anonymous Coward · 2013-06-04 23:18 · Score: 0

Another pointless story from a pointless old man... Wake me up when Slashdot posts a good article.

captcha: feebler

Future Profession: Data Archaeologist by nemesisfixx · 2013-06-04 23:22 · Score: 1

I was wondering what professions I should keep tabs on, just in case we have that talk about careers with my son-to-be... Being an expert on long-gone and "lost" data formats and collecting their respective tools just seems like a future relic (Oh, and we already keep terabytes of all those myriads of one-time-use programs and utilities we downloaded from 5 years ago, right?)

best safeguard by hduff · 2013-06-05 00:04 · Score: 2

The best safeguard is the abandonment of all existing proprietary formats to freedom (so anybody can write conversion software) and the proliferation of open formats on an ongoing basis.

--
"I believe in Karma. That means I can do bad things to people all day long and I assume they deserve it." : Dogbert

Re:So? by r_a_trip · 2013-06-05 00:08 · Score: 1

*** Yes, and by god, future historians will care about YOUR spreadsheets and YOUR websites! ***

Actually they do. Historians are still trying to (painstakingly) find out how people in the Neolithic lived. So yes, having access to YOUR spreadsheets and YOUR websites will be very valuable for historians in say 3000 years.

*** Egotistical jackass. No one gives a shit about 99.999999% of humanity after they're gone. ***

Projection? That YOU don't give a shit about humanity, doesn't mean nobody else does.

--
# touch universe # chmod +rwx universe # ./universe

Vint is not blaming Microsoft, but I do. by 140Mandak262Jamuna · 2013-06-05 00:44 · Score: 1

Microsoft from day one has been making its data incompatible with everything else. It was a lean and hungry company back then (it is fat and hungry now), and it was compatible with every existing thing on the import side and incompatible with everything on export. It fought a mean campaign against Samba. It played dirty with Netscape and the web standards. Bugs in IE were worked around in IIS and vice versa, making it very, very hard to stick to a standard.

--
sed -e 's/Chuck Norris/Rajnikant/g' joke > fact

Macwrite... by tekrat · 2013-06-05 01:41 · Score: 1

I'm having a similar problem. My father had started writing a book on a Macintosh 512K using MacWrite. He passed away a decade ago, but recently I uncovered a box of floppies.

Needless to say, even reading a floppy on a modern Macintosh is pretty much impossible, and even then, the older Mac documents had a data and resource fork, and recovering data from those early formats is pretty hairy.

Some of the data can be recovered, but it's unlikely I'll ever be able to completely read the book he was writing -- unless I find myself a Mac 512K with MacWrite, and then run the text through the serial port to a more modern PC.

--
If telephones are outlawed, then only outlaws will have telephones.

ASCII by Anonymous Coward · 2013-06-05 01:42 · Score: 0

It works in the IETF, why shouldn't it work anywhere?
What can't be expressed in ASCII is too complex anyway, so you shouldn't do it.

Own the format by Anonymous Coward · 2013-06-05 02:34 · Score: 0

Stop making Microsoft out to be the technology leaders of the world. Microsoft exists to make money for themselves.

If we can't read our own files then we can only blame ourselves for choosing file formats for which we do not have the documentation for the structure.

How do you read the five inch floppies they're sto by arfonrg · 2013-06-05 02:56 · Score: 1

Same way the TRS-80 fans do.... Take an old drive with an adapter and read it off once and transfer it to new media.

Ira Goldklang's site (trs-80.com) has TONS of old software done that way.

--
Your thin skin doesn't make me a troll

Cerf is wrong... by arfonrg · 2013-06-05 02:59 · Score: 1

His position is that the data format is what will prevent data recovery - I postulate that as long as there are bored nerds that perceive a challenge, the old can and will be reverse engineered.

--
Your thin skin doesn't make me a troll

Re:Cerf is wrong... part II by arfonrg · 2013-06-05 03:03 · Score: 1

What WILL cause all of our digital data to finally be lost is media degradation. Every piece of data ever created will eventually be lost because the media it's on finally fails and someone forgot to copy it beforehand. (That or the sun engulfs the Earth before we finally figure out that we have to get off this planet)
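
The "copy it before the media fails" step can be sketched in a few lines. This is only an illustration, assuming the old media is readable as an ordinary file; the helper names and file names are made up, and comparing SHA-256 digests is one common way to confirm a copy, not the only one.

```python
import hashlib
import os
import shutil
import tempfile

def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def copy_and_verify(src, dst):
    """Copy src to dst and report whether the two digests match."""
    shutil.copyfile(src, dst)
    return sha256_of(src) == sha256_of(dst)

# Stand-in files for the demo; real paths would point at old and new media.
workdir = tempfile.mkdtemp()
old_path = os.path.join(workdir, "old_media.bin")
new_path = os.path.join(workdir, "new_media.bin")
with open(old_path, "wb") as fh:
    fh.write(b"data worth keeping\n")

copy_ok = copy_and_verify(old_path, new_path)
```

Storing the digest next to the copy would also let a later check detect silent degradation of the new media.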

--
Your thin skin doesn't make me a troll

Re:longterm readability and backwards compatibilit by jedidiah · 2013-06-05 04:03 · Score: 1

I recently encountered some bit of data that was encoded in a proprietary format but didn't really need to be. Nothing about the data required the extra features available from the proprietary format.

It turned out that proprietary app X generated a file that couldn't be properly displayed on other copies of the same app without first being converted to a non-proprietary format.

Some people do really perverse things to avoid giving you data in a reasonable format.

--
A Pirate and a Puritan look the same on a balance sheet.

One of the reasons for ODF format by Anonymous Coward · 2013-06-05 04:15 · Score: 1

The Open Document Format(tm) was intended to ensure that documents have longevity. They looked at what companies like microsoft were doing, with every version 'incompatible' with prior versions. (It's not a random thing either, microsoft goes out of its way to make *certain* that new versions are incompatible with old, so that people are *forced* to upgrade.) The Open Document Format(tm) was created with users in mind such as the Vatican Library, who have a large number of documents over 1000 years old, a good number of documents over 1500 years old, a smaller number of documents over 2000 years old, and less than two dozen shelves full of documents more than 2500 years old. Being able to read old data is important to them. Being able to read old data is an abomination to microsoft. Hence ODF. But microsoft tried to kill ODF with their OOXML, which has proprietary undocumented containers within the XML, which makes reading anything older than 1 version impossible. Thanks again microsoft.

Future Tech by Anonymous Coward · 2013-06-05 04:30 · Score: 0

In the future decompiling programs will be easier. It's already possible, although tedious. Having the source code to a program would make backward compatibility, modification and porting to new platforms possible where it wasn't before.

A theoretical tool for decompiling old console games will be known as a triplicator (redundant duplicator). An emulator, debugger, compiler, video recorder and more combined together. Just playing a game would automatically generate a wide variety of metadata, generic labels and identify game data formats. Most of a game is tied to the interface, so being able to glance at snapshots, gamepad input, routine parameters and more while looking at the assembly language would be very useful. And triplicator could generate some pseudo code (or C with inline assembly, or any higher level language), which would be easier to work from than raw assembly language. Recompiling sections on the fly (combined with saved states to avoid lock-ups), to get feedback for identifying variable names and what routines are for. After a certain amount gets decompiled, it can be recompiled with SDL and run natively instead of inside an emulator (playthrough of the game is recorded, so triplicator could use that as a script to playtest the SDL port automatically, finding any differences). The end result is perfect source code, a native port, extracted data and hundreds of megabytes of documentation. All created in a fraction of the amount of time it would take to use separate tools (weeks instead of years, probably).

The internet never forgets by Windwraith · 2013-06-05 06:34 · Score: 1

So, the internet never forgets about that time you got drunk and posted stupid photos, but it forgets everything else? God damn.

Might Help by Anonymous Coward · 2013-06-05 09:34 · Score: 0

If Google themselves gave more than 3 months notice. Years from now? Might not be readable 3 months from now!
Put everything in the cloud Google says! And you got 3 months to get it out...

Blame the user by Anonymous Coward · 2013-06-05 12:16 · Score: 0

If you have a pile of data and you don't keep the tools to read it, then you are the fool. Keeping original tools should have been the smart move, or updating files that were valuable. Sorry, I have old software and hardware to cover my past, why don't more people do this? Sounds like a niche market for someone with old hardware or emulators.

Re:Google hire me, I solved this problem in 3 seco by Ant+P. · 2013-06-05 14:27 · Score: 1

Great. Now make your solution continue to work 20 years from now when the Windows XP activation service ceases to exist, which is what TFA is actually about.

a lifetime ago by mcswell · 2013-06-05 15:49 · Score: 1

"...software lifetime is only like 7 or fewer years..." Do you have a source for this, or is this your guess?

I'm not asking to disagree, quite the opposite: for seven years (coincidence) now, I've been arguing for storing grammar data in an XML format precisely because storing it in the programming language of a particular grammar parser means it will be unusable in the not-so-distant future. While I have anecdotes (I once wrote a parser using three programming languages, and all three of them became obsolete within a year or two), I would love to have a study to cite.
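
To make the idea concrete, here is a rough sketch of grammar rules kept as XML data and loaded by whatever parser happens to exist at the time. The element and attribute names (grammar, rule, lhs, sym) are invented for this example and are not from any real grammar format.

```python
import xml.etree.ElementTree as ET

# Hypothetical grammar file content; the schema is made up for illustration.
GRAMMAR_XML = """<grammar language="example">
  <rule lhs="S"><sym>NP</sym><sym>VP</sym></rule>
  <rule lhs="NP"><sym>Det</sym><sym>N</sym></rule>
  <rule lhs="VP"><sym>V</sym><sym>NP</sym></rule>
</grammar>"""

def load_rules(xml_text):
    """Return the rules as (left-hand side, [right-hand symbols]) tuples."""
    root = ET.fromstring(xml_text)
    return [
        (rule.get("lhs"), [sym.text for sym in rule.findall("sym")])
        for rule in root.findall("rule")
    ]

rules = load_rules(GRAMMAR_XML)
```

The point of the design is that the rules outlive the loader: if this language goes obsolete, the same file can be read by a few lines in whatever replaces it.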

cc:Mail by mcswell · 2013-06-05 16:10 · Score: 1

And I have email from the 1990s that I canNOT read today. It's called Lotus cc:Mail. (I could read it if I was willing to pay.)

"Digital data lasts forever -- or five years, whichever comes first."
--Jeff Rothenberg, 1997

Savegames and archival files by Anonymous Coward · 2013-06-05 16:56 · Score: 0

Office file formats, no matter what office suite or version, were never meant to be archival formats. They were more like savegames, little "memory dumps" allowing you to continue the game where you left off, no more, no less. In fact some early systems even just dumped the memory onto diskette (e.g. the Canon Cat). That's why such formats have non-portable options like OLE objects, which are nearly impossible to open on another computer. If such a file ever moves from one computer to another, you are screwed.

If you want to have something you want to be able to read in a few years or send to someone else, you must use archival formats. Those formats must be as trivially simple as possible. Possible candidates for archiving "printed" documents are TIFF (bitmap format, supports multiple pages) and archival PDF. Be sure to include a dump of the text in a separate text file so it's trivial to search. You don't need to change things in your archive. If you want a newer version re-create it again.

Never ever ever store data in file formats you cannot read yourself. Complex (binary) file formats are acceptable only as long as they don't have to be backed up. That's why SQL-Servers tend to store their dumps as simple text files.
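
As a small illustration of the text-dump point, the sketch below uses SQLite from the Python standard library as a stand-in for a database server (an assumption for the example, not what any particular SQL server does). The table and helper names are made up; `iterdump()` and `executescript()` are the standard `sqlite3` calls.

```python
import sqlite3

def dump_as_text(conn):
    """Return the whole database as plain SQL statements."""
    return "\n".join(conn.iterdump()) + "\n"

def restore_from_text(sql_text):
    """Rebuild an in-memory database from a plain-text SQL dump."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(sql_text)
    return conn

# Hypothetical table used only for the demo.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
source.executemany("INSERT INTO notes (body) VALUES (?)",
                   [("first",), ("second",)])
source.commit()

dump_text = dump_as_text(source)
copy = restore_from_text(dump_text)
restored_rows = copy.execute("SELECT id, body FROM notes ORDER BY id").fetchall()
```

The dump is readable in any text editor, so even without the original database engine the data itself is not locked away.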

Re:Google hire me, I solved this problem in 3 seco by dicobalt · 2013-06-05 23:44 · Score: 1

Windows loader, or an army of lawyers.

no sit serlock by Anonymous Coward · 2013-06-06 05:19 · Score: 0

Ah, a news bulletin from the land of "Where the Hell have you been for the last 30 years, asleep?"

Who knows by metaforest · 2013-06-06 20:07 · Score: 1

We don't know what this means either.... proprietary format... encrypted... and it cost a lot to send it.... alas it never arrived.

AOAKN HVPKD FNFJU YIDDC
RQXSR DJHFP GoVFN MIAPX
PABUZ WYYNP CMPNW HJRZH
NLXKG MENEK ONOIB AREEQ
UAOTA RBQRH DJoFM TPZEH
LKXGH RGGHT JRZCQ FNKTQ
KLDTS GQIRU AOAKN 27 1525/6

NURP 40 TW 194
NURP 37 DK 76

lib 1625
ToR 1522 copies sent 2

signed W. Stot, S(j/g)T.

Slashdot Mirror

Vint Cerf: Data That's Here Today May Be Gone Tomorrow

358 comments