Slashdot Mirror


Developing a Niche Online-Content Indexing System?

tebee writes "One of my hobbies has benefited for 20 years or so by the existence of an online index to all magazine articles on the subject since the 1930s. It lets you list the articles in any particular magazine or search for an article by keyword, title or author, refining the search if necessary by magazine and/or date. Unfortunately the firm which hosts the index has recently pulled it from their website, citing security worries and incompatibilities with the rest of their e-commerce website: the heart of the system is a 20-year-old DOS program! They have no plans to replace it, as the original data is in an unknown format. So we are talking about putting together a team to build an open-source replacement for this, probably using PHP and MySQL. The governing body for the hobby has agreed to host this and we are in negotiations to try to get the original data. We hope that by having volunteers crowd-source the conversion, we will be able to do what was commercially impossible." Tebee is looking for ideas about the best way to go about this, and for leads to existing approaches; read on for more. tebee continues: "It occurs to me that there could be existing open-source projects that do roughly what we want to do, maybe something indexing academic papers. But two days of trawling through script sites and googling have not produced any results.

Remember that here we only point to the original article; we don't have the text of it online, though it has been suggested that we expand to do this. Unfortunately I think copyright considerations will prevent us from doing it, unless we can get our own version of the Google book agreement!

So does anyone know of anything that will save us the effort of writing our system or at least provide a starting point for us to work on?"
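
As a starting point for discussion, here is a minimal sketch of the index described above: list the articles in a magazine, or search by keyword, title or author and narrow by magazine and/or date. It is written in Python with SQLite only so that it runs without a server; the submission proposes PHP and MySQL, and the same schema and query should carry over with little change. The table names, column names, the `search` helper and the sample rows are all assumptions made up for illustration; nothing here reflects the format of the original data.

```python
import sqlite3

# Hypothetical schema: one row per magazine, one row per indexed article.
SCHEMA = """
CREATE TABLE magazine (
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL UNIQUE
);
CREATE TABLE article (
    id          INTEGER PRIMARY KEY,
    magazine_id INTEGER NOT NULL REFERENCES magazine(id),
    issue_date  TEXT NOT NULL,   -- ISO 8601 date, so string comparison sorts correctly
    page        INTEGER,
    title       TEXT NOT NULL,
    author      TEXT,
    keywords    TEXT             -- space-separated keywords (an assumption)
);
CREATE INDEX article_by_magazine ON article (magazine_id, issue_date);
"""


def search(conn, text=None, author=None, magazine=None,
           date_from=None, date_to=None):
    """Return (magazine, issue_date, page, title, author) rows matching all given filters."""
    clauses, params = [], []
    if text:
        # Keyword or title match; LIKE is a simple stand-in for real full-text search.
        clauses.append("(a.title LIKE ? OR a.keywords LIKE ?)")
        params += [f"%{text}%", f"%{text}%"]
    if author:
        clauses.append("a.author LIKE ?")
        params.append(f"%{author}%")
    if magazine:
        clauses.append("m.title = ?")
        params.append(magazine)
    if date_from:
        clauses.append("a.issue_date >= ?")
        params.append(date_from)
    if date_to:
        clauses.append("a.issue_date <= ?")
        params.append(date_to)
    where = " AND ".join(clauses) if clauses else "1 = 1"
    sql = (
        "SELECT m.title, a.issue_date, a.page, a.title, a.author "
        "FROM article a JOIN magazine m ON m.id = a.magazine_id "
        f"WHERE {where} ORDER BY a.issue_date, a.page"
    )
    # User input only ever travels through bound parameters, never into the SQL text.
    return conn.execute(sql, params).fetchall()


# Invented sample data, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany("INSERT INTO magazine (id, title) VALUES (?, ?)",
                 [(1, "Example Modeller"), (2, "Sample Railway Monthly")])
conn.executemany(
    "INSERT INTO article (magazine_id, issue_date, page, title, author, keywords) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "1987-03-01", 12, "Building a turntable", "A. Author", "turntable scratchbuilding"),
        (1, "1992-11-01", 40, "Wiring a reversing loop", "B. Writer", "wiring electrics"),
        (2, "1990-06-01", 5, "Turntable wiring basics", "A. Author", "turntable wiring"),
    ],
)

for row in search(conn, text="wiring", magazine="Example Modeller"):
    print(row)
```

Calling `search(conn, magazine="Example Modeller")` with no other filters gives the "list the articles in a particular magazine" view. Since only citations are stored, not article text, the copyright concern raised above does not arise in this sketch. On MySQL, a FULLTEXT index would be the natural replacement for the `LIKE` matching once the data set grows.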

6 of 134 comments

  1. Try Ruby on Rails by olyar · · Score: 4, Funny

    I'm sure that Ruby on Rails could have a fully functional web site made from this data in about half an hour.

    The downside is that if more than two people try to access the data, it will display a whale suspended by balloons.

    (Please Note: This post is a joke, and not an attempt to start a flame war).

    --
    Custom, hands-free Linux installs. Instalinux
    1. Re:Try Ruby on Rails by greg1104 · · Score: 4, Funny

      It's data for model railroading magazines, so not only are they used to rails, they already have protocols to serialize access to shared resources and prevent collisions.

  2. Re:It would help by beakerMeep · · Score: 3, Funny

    Maybe it's the type of magazine that people used to read "for the articles"?

    --
    meep
  3. Re:It would help by bsDaemon · · Score: 2, Funny

    I'm pretty sure porn indexing isn't niche... or a hobby. It's the true reason Google exists.

  4. And a thousand Mac Fanbois ... by rueger · · Score: 2, Funny

    ... leap up and shout "FileMaker Pro! 'Cause it's so shiny and pretty!"

    Oh, the number of times that I've heard that refrain... shudder ...

  5. Fancy that by Anonymous Coward · · Score: 1, Funny

    > One of my hobbies has benefited for 20 years or so by the existence of an online index to all magazine articles on the subject since the 1930s. [...] The governing body for the hobby has agreed to host this

    Huh, I didn't realize that porn had a governing body.