Slashdot Mirror


Software Archaeology

Plug1 writes "Salon (day pass needed) has an article about preserving software for historical purposes. It discusses source code archiving, and the effect the DMCA is having on attempts to catalog and analyze legacy code. It will be a shame if in the future a wealth of information is locked away because knowledge of the underlying technology is lost."

26 of 434 comments

  1. Please understand... by Creepy+Crawler · · Score: 5, Insightful

    That the DMCA DOES NOT APPLY outside the USA. However, hardware Digital Restriction Management DOES.

    I really don't want strong crypto keeping me out of stuff that I OWN, or My CONTENT.

    It'd be a neat experiment to create a Linux driver that emulates TCPA chips so that stupid software thinks you're auth'ed.

    --
    1. Re:Please understand... by Lazar+Dobrescu · · Score: 5, Insightful
      This is not the only problem the article addresses though. As it is now, there are already tons of old file formats for which the software needed to read them is nearly impossible (or totally impossible) to find. Documents written in those file formats could contain useful, or at the least interesting, content, but we can't get to that content.

      We are talking here about file formats 30 years old, or even less. Try to imagine what will happen in 200 years. Most of our history will be written to electronic media, and for people that will live in 200 years, the file format used for that media will very probably be undecipherable.

      What is the solution? Some say that we need to convert all documents to a more recent file format every x years. That will really become a pain in the ass as the number of archives goes higher and higher.

      Another trick could be to describe the file format in full and attach that description to every file. That, of course, brings up the problem of what file format to use for that description... (will even plain ascii files still exist in 200 years? Maybe not, but I think it is reasonable to expect that people will at least still have an idea of how to read them...)

      Comparing this to the problem faced with dead languages gives a good idea of the repercussions... There are already countless documents written in very old ages that we cannot decipher because the language used to write them is lost. People are working all their lives trying to understand a dead language. But with computers, we're not talking about something that happened 4000 years ago, but 30 years ago... That means that in the course of your lifetime, you could see file formats go obsolete 3 times!

      Someone will need to find a solution for this, and preferably before the problem happens for real...
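The "attach a description to every file" idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `SELFDESC1` marker, the header layout, and the `wrap`/`unwrap` names are all made up for this example and do not correspond to any existing archival standard. The description is forced to be plain ASCII, on the bet that ASCII stays readable longest.

```python
from typing import Tuple

# Hypothetical marker for this sketch; not an existing standard.
MAGIC = b"SELFDESC1\n"


def wrap(payload: bytes, description: str) -> bytes:
    """Prefix a payload with a plain-ASCII description of its own format."""
    desc = description.encode("ascii")  # raises UnicodeEncodeError if not ASCII
    # Header line: "<description length> <payload length>" in decimal ASCII.
    header = f"{len(desc)} {len(payload)}\n".encode("ascii")
    return MAGIC + header + desc + b"\n" + payload


def unwrap(blob: bytes) -> Tuple[str, bytes]:
    """Split a wrapped blob back into (description, payload)."""
    if not blob.startswith(MAGIC):
        raise ValueError("not a self-describing container")
    header, _, body = blob[len(MAGIC):].partition(b"\n")
    desc_len, payload_len = (int(n) for n in header.split())
    description = body[:desc_len].decode("ascii")
    # Skip the single newline that separates description and payload.
    payload = body[desc_len + 1 : desc_len + 1 + payload_len]
    if len(payload) != payload_len:
        raise ValueError("truncated payload")
    return description, payload
```

Someone opening such a file in any text viewer would at least see the description before the binary data starts. It does not solve the bootstrapping problem raised above: the reader still has to know ASCII and understand the language the description is written in.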

    2. Re:Please understand... by TheRaven64 · · Score: 4, Informative
      Repeat after me:

      TCPA hardware is not the same as DRM, and is not evil

      The TCPA hardware specifies a cryptography co-processor on the mainboard. This can be used for DRM, but it can also be used for offloading things like SSL from the CPU. Emulating the hardware would be no good. Under *NIX, it would just be mounted at /dev/crypto (or something), and emulated if the hardware were not available. It is the software which manages DRM.

      --
      I am TheRaven on Soylent News
    3. Re:Please understand... by Sique · · Score: 4, Insightful

      The Rosetta Stone contained the message in Ancient Greek (a dead but widely known language at the time of decipherment), Coptic and Hieroglyphic. Even though Coptic was at least known to some specialists and people able to read Ancient Greek were abundant at the time (and still are), it took about 25 years to decipher the hieroglyphic texts.

      And this was with a language which itself was very easily mapped to the letters (every consonant mapped to a letter, vowels omitted).

      The rules which encode a file may be much more complicated. Look just at the most common compression methods (Run Length for instance), how they just add another layer above the already encoded contents. And they remove something very important for decipherment, the redundancy, from the data. Then the subjects that are stored in files are much more diverse. We have not only language, we have music and graphics, 3D data and cryptographic certificates, configuration files and program binaries.

      Just to be able to know what the file is about and thus have an idea how to get started can prove to be more complicated than any decipherment of archaeological texts.
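To make the run-length point concrete, here is a minimal Python sketch of one possible run-length scheme. The (count, value) byte-pair layout and the 255 cap per run are assumptions chosen for this example; real formats use many different variants, which is part of the problem being described.

```python
from itertools import groupby


def rle_encode(data: bytes) -> bytes:
    """Encode each run of identical bytes as a (count, value) byte pair."""
    out = bytearray()
    for value, run in groupby(data):
        remaining = len(list(run))
        # A count has to fit in one byte, so long runs are split into chunks.
        while remaining > 0:
            chunk = min(remaining, 255)
            out += bytes([chunk, value])
            remaining -= chunk
    return bytes(out)


def rle_decode(encoded: bytes) -> bytes:
    """Expand (count, value) byte pairs back into the original bytes."""
    if len(encoded) % 2:
        raise ValueError("encoded data must consist of (count, value) pairs")
    out = bytearray()
    for i in range(0, len(encoded), 2):
        count, value = encoded[i], encoded[i + 1]
        out += bytes([value]) * count
    return bytes(out)
```

With this layout, `rle_encode(b"aaaabbbcc")` yields the six bytes 4, 97, 3, 98, 2, 99. The repetition that would let someone guess at the structure of the original is gone, and the output is only meaningful to a reader who already knows the pair convention.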

      --
      .sig: Sique *sigh*
    4. Re:Please understand... by DiscoDave_25 · · Score: 5, Insightful

      It's not just the file format that will be the problem (although MS aren't helping in that respect) but simply ensuring that the media that the file is written on can be read. Physical media degrade and the hardware to read them becomes obsolete. An example of this was the BBC's Domesday disc, which contained a huge amount of information (for those days) on a laser disc that is today virtually unreadable. Thankfully this has recently been transferred onto DVD before ALL the readers died, but just because someone can understand HOW to read a file doesn't mean they'll be able to access it in the first place.

    5. Re:Please understand... by 4of12 · · Score: 4, Insightful

      an archaeologist of tomorrow can figure out ascii.

      To be sure.

      And will they be able to figure out PowerPoint?

      And how about Secure PowerPoint 2005 with automatic DocuSafe technology that incorporates encryption with a public key that is automatically downloaded over the network from microsoft.com after your VISA card number has been authenticated with citibank.com?

      No, tomorrow's archaeologists will miss out on the whole indecipherable morass that is today's data formats.

      Documents and presentations will look indistinguishable from random noise.

      And, honestly, a lot of what gets attached in those formats looks that way already to me in 2003.

      --
      "Provided by the management for your protection."
  2. Explain the Pyramids? by Yohahn · · Score: 5, Funny

    This would explain the pyramids, if in the past IP laws of ancient cultures prevented sharing of ideas.

    1. Re:Explain the Pyramids? by KalvinB · · Score: 4, Insightful

      There's also the problem of grave robbers and that whole burning of the great library thing.

      The Egyptians could very well have written down the instructions for building them. There have been numerous opportunities for that information to have been destroyed. Or they may have viewed their construction as too sacred and only passed down information on a need-to-know basis.

      Our problem is that we charge for rocks and lack the motivation. We just assume we couldn't build such things as they did but never really bother to try.

      Ben

  3. Central Point Software by havaloc · · Score: 4, Interesting

    Who could ever forget the awesome software company Central Point Software? Their PC Tools and famous Copy2PC were high quality, and very useful products. Anyone that was anybody had Copy2PC, a program that could copy nearly ANY copy protected floppy disk. They even came out with a floppy controller that did the same thing.

    1. Re:Central Point Software by JoeD · · Score: 5, Funny

      Yeah, and every copy of it I ever saw had been pirated.

  4. Preserve the Hardware as Well? by Nom+du+Keyboard · · Score: 4, Interesting

    If you're going to preserve software, doesn't it make sense to preserve the hardware to run it on as well? Emulation is less than perfect.

    --
    "It's the height of ridiculousness to say for those 9 lines you get hundreds of millions."
    1. Re:Preserve the Hardware as Well? by crazyphilman · · Score: 4, Interesting

      There was a great Cowboy Bebop episode in which they received an old Beta tape (keep in mind this was set in the 2070's). They found one beta player in a market, but they managed to demolish it. Then they hunted another down by descending hundreds of meters underground to a defunct "museum of technology" to snatch one of the Beta players there, but not knowing the difference between Beta and VHS, they stole the wrong one. Finally, a beta player was shipped to them in the same way as the tape and they were able to view the tape (a little too "deus ex machina" for my tastes, but still).

      It was fictional, and very tongue in cheek, but it made an interesting point. How the hell will you play your archived media if you don't have a player? And, not just a player, but support equipment as well -- a display that can connect to the player, a power supply that is the right voltage, amperage, and number of cycles, compatible cabling, etc. It could turn out to be quite a trick to get all the requirements together, just to do something as simple as play an old tape.

      Perhaps what's needed is to define a single "data archival standard", and by law require that it be backwards compatible with version 1 of the standard, forever. Then, convert all current data to the version 1 standard, once and for all. We have a good candidate right now: DVD-RW and CD-RW. Preserve those standards, so that all future disk players can at a minimum play current-day CD's and DVD's, and we might be ok. Of course, you'd have to use archival-quality CD's and DVDs, because the cheap ones only last five years (the good ones last a hundred or more, they've got extra coatings to prevent degradation, etc).

      Why not? Current DVD players already accept CDs. Just take the current DVD writer as a standard and design all new devices to be backwards compatible (on physical size, too -- i.e. a current, standard-size CD should be usable).

      --
      Farewell! It's been a fine buncha years!
  5. Knuth is only one foundation that won't be lost by Dancin_Santa · · Score: 4, Interesting

    If the problem is that knowledge of the underlying foundations of technology is being lost, it is because of the concept of abstraction, of which .Net is the latest and greatest incarnation.

    It really all started when some engineers decided that machine code was too hard and invented assembler. Nowadays it's not even necessary to know what a bit is or how an ALU works to make programs. Just point and click and you've got yourself a brand spanking new database app courtesy of VB.

    No one ought to knock VB because it really is the best tool for what it does, but it also lowers the barrier to entry for would-be programmers. This can only lead to worse programs.

    The most fundamental concept in computer science is logic, not algorithms (or worse programming languages). If a 'programmer' hasn't written a program in a low level language like C or assembler, the hiring manager should beware. Without hands-on experience with the fundamentals of computer science that person is lacking at the most basic level, regardless of whether he knows 1 language or 50 languages. He is handicapped.

    It's a good thing to abstract, but it's also important to remember and study the bases of our science.

    1. Re:Knuth is only one foundation that won't be lost by Kaa · · Score: 5, Insightful

      The most fundamental concept in computer science is logic, not algorithms (or worse programming languages). If a 'programmer' hasn't written a program in a low level language like C or assembler, the hiring manager should beware. Without hands-on experience with the fundamentals of computer science that person is lacking at the most basic level, regardless of whether he knows 1 language or 50 languages. He is handicapped.

      Bullshit.

      "Computer science is about computers in the same way astronomy is about telescopes" --Edsger Dijkstra

      Programming isn't about knowing how to twiddle bits in registers or even how to leverage strengths of a particular processor.

      Programming is about dealing with complex problems which can be solved by manipulation of information. I would say that the quality a programmer needs most of all is not logic or math, but just the ability to hold and manipulate large and complicated structures inside his head. And no, it doesn't have anything to do with assembler, low-level languages, ALUs, bits, etc. etc.

      --

      Kaa
      Kaa's Law: In any sufficiently large group of people most are idiots.
  6. Coming Soon... by UncleBiggims · · Score: 4, Funny

    Indiana Jones and the Raiders of the Lost Archive

  7. Just a thought you guys.... by zapp · · Score: 4, Offtopic

    Unless I am mistaken Salon, like most websites trying to make some money, is having financial problems.

    They changed to a registration/fee based model, but allowed 1 day passes for whatever reason.

    Nothing can hurt them more than being slashdotted by a bunch of people using a day pass.

    Someone has already copied the contents of the article into a comment, which is good because it saves them bandwidth, but ... without their permission isn't that plagiarism?

    This is why things like the DMCA and DRM come about - people thoughtlessly violating other people's copyrights/etc, and/or taking their services for granted.

    I'm no better than anyone else, I do the same thing.

    I guess my point is: either support the people who provide services you enjoy (music, video, news, web content, porn, whatever), or quit complaining when they finally start defending themselves.

    --
    no comment
  8. HA HA! by Thud457 · · Score: 5, Funny

    It's the burning of the library of Alexandria all over again. This time, on the fires of corporate profit. Just remember, as we slide into another dark age, you're the ones that used Microsoft Office!

    --

    the preceding comment is my own and in no way reflects the opinion of the Joint Chiefs of Staff

  9. Storage of old data / hardware by CaffeinatedMouse · · Score: 5, Interesting

    So, I should be saving the 200 lbs of DEC VMS manuals, our old VAX, all the tapes, and keep our TU-85 tape drive under service contract? How much is this all worth? Do you have any idea how much it costs to keep that hardware running? If you want to keep the code, what is the point if you don't have hardware to run it on, unless you're going to develop some emulator? Don't get me wrong, I think it's a horrible shame that all those hours of engineering to develop the hardware and software are finally being trashed. There are some amazingly great ideas that were used to make that stuff. But at what cost do you preserve it?

  10. Re:here's an easy howto: by danimrich · · Score: 5, Interesting

    CD's degrade over time, their lifetime is estimated to be 100 years maximum. CD-R's can become unusable after a couple of days of being exposed to mountain sun, and will probably not last more than 15 years. In the meantime, the computer equipment will develop to a point where CD's are not needed any more, because there is better technology available. So it will become necessary to store the devices that were used to read them (i.e. whole computers). But these devices are partly made of stuff that decomposes over time, like rubber in bearings etc. Conserving data is not as easy as it seems. I wonder whether it'd be more efficient to print out the source codes on acid-free paper and store them like books - or perhaps microfiches - in a number of locations around the world.

    --
    where's all that Karma?
  11. Re:full article text, no pass required by mozumder · · Score: 5, Insightful

    You know, it really isn't fair use to repost an entire article from another website.

  12. Other technologies go obsolete too, So what? by G4from128k · · Score: 5, Interesting

    A number of years ago Scientific American had an article lamenting the loss of intellectual assets with the inevitable degradation of old software, documentation, media, computers, and the like. Yet the same issue had another article on changes in the canned-goods industry (the rise of new canning technologies). While the first article bitterly mourned the loss of software-related knowledge and assets, the second article made no such mention of the corresponding loss of canning-related knowledge and assets.

    Why is obsolete software technology worth preserving where obsolete manufacturing technologies are not? In 100 years, will we really need access to the billions of JPEGs that were spewed out by digital cameras everywhere? I am not arguing for ignoring history (even though those that learn from history are also doomed to repeat it), but I am wondering about the double standard. What realms of human knowledge and invention are worth saving, and which are not?

    BTW, for the record, I still have old documents and applications from my Mac 128k and I might even have a paper tape copy of an old APL program that I wrote 25 years ago. But then I am a certified packrat.

    --
    Two wrongs don't make a right, but three lefts do.
  13. A joke by KillerHamster · · Score: 5, Funny

    This article reminds me of a joke one of my CS professors told us (I hope I remember it right):

    The year was 2015. Joe, a programmer, was getting up in years and decided he wanted to have his body frozen after he died. He made the arrangements, and when the time came, he was frozen and placed in a government facility. Time passed, and he was forgotten.

    Jump ahead a few centuries... suddenly Joe finds himself conscious again! He is on a lab table surrounded by strange looking people in uniforms. Their leader, speaking through a translator, welcomes Joe back to life.

    Joe is amazed! There are so many questions he wants to ask, but first he says, "Why did you bring me back to life?"

    The leader answers, "Well, the year is 9999. Y10k is coming up, and your file says you know Cobol."

  14. Another red herring from salon? by poptones · · Score: 4, Insightful
    In one part of the article they mention losing "structure" of programs and talk about source code, then they talk about "losing" old code like the original DOS - for which, so far as I know, there is no publicly available archive of source code. So too of Lotus 123, another piece of code mentioned in the article. This is just more fatalistic nonsense people spew when criticising the DMCA. Yeah, it's a bad law, but this nonsense about "losing old works" is just that.

    If you have the source code for something then you have no cause to fear the DMCA, since you don't need to decrypt it. And if you don't have the source code, where is the value? Is there really any value in running lotus 123 for the Apple//? Perhaps if you have an Apple//, but so what? You cannot "fly over the code" from any height (as was mentioned in the article) because you don't have any code to fly over. You have an executable, and the "structure" there is quite different than looking at source code.

    If you want source code for DOS, hit freedos.org and download it. It's not Microsoft's source, but so what? It does the very same job and, in many cases, it's superior to the original. Works that have value will be replicated and emulated; works that have no value simply have no value - where is the need (or logic) in "preserving" them?

  15. Re:full article text, no pass required by Andrew+Leonard · · Score: 4, Insightful

    At least with jay-walking, no matter how many times you do it, the road will still be there. But if you post the full text of Salon stories without either subscribing or getting the FREE day-pass, eventually we will no longer be able to pay fine writers like Sam Williams and Rachel Chalmers to write the stories that Slashdot readers like to read.

    --

    Editor, Salon Business & Technology

    Salon.com

  16. It's a matter of survival by Andrew+Leonard · · Score: 4, Informative

    I responded to this above once already, but because this is dear to my heart, I'll do it again. Of course Salon isn't going to care if anyone prints out a copy and tapes it to their cube wall. But if a Web site grabs the text and posts it in a place like Slashdot, that deprives us of literally thousands of readers. Many of those readers might otherwise watch an ad and grab the daypass, which is good for our financial health, and some percentage of other readers might even subscribe, which is even better for us.

    Technically, it's copyright infringement, but Salon isn't going to devote resources to suing Slashdot or Slashdot readers. If we were going to go that route, we'd start with the Freerepublic assholes, who actively want us to go bankrupt and do everything they can to help us down that road. To slashdot readers, the best appeal I can make is simple.

    We want to make a living at what we do, so we can keep doing it. I want to keep paying great technology writers like Rachel Chalmers and Sam Williams to do interesting stories. If we convince enough readers to watch our ads or subscribe, we'll pull off this magic trick. So basically, the way I see it, any time a Slashdot reader posts the full text of a story on Slashdot, it's a vote against our survival, which is ironic, since you wouldn't be posting the stories if you didn't think there was some merit in them, right?

    --

    Editor, Salon Business & Technology

    Salon.com

  17. Is this irony? by mblase · · Score: 4, Funny

    After all, in five years Salon.com may be gone from the web, and since neither Google nor the Internet Archive has a paid subscription, this story will be forever lost to the ages.

    So kudos for reposting this valuable information to Slashdot! Without the efforts of others like you, internet surfers in generations to come might never understand the importance of, well, the efforts of others like you.