Slashdot Mirror


Building and Maintaining Large, Collaborative Databases?

hherb asks: "We are in the process of building and maintaining a free pharmaceutical reference database, in order to liberate medical decision support systems from vendor-driven databases. For that purpose, we need some way of allowing multiple authors to contribute to a large number of data records, most of them small...too small for CVS. We need version tracking as well as authentication of authors. We need to tag every bit of information entered with information about source reference, author, peer review result, and so forth. I had a look at existing version tracking software, like CVS and Subversion, and I did not have the impression that any of them would suit our needs. Does anyone have ideas for *free* software solutions that we can use?"

1 of 32 comments

  1. Use File Loading Processes by justanyone · Score: 2, Interesting

    I work for a large multinational bank in Chicago. We are aggregating data in a data warehouse and have both lots of sources and lots of data.

    The way we cope with this problem is that each data source is given a code (it's usually just the filename). A Perl program parses these files (comma-delimited ASCII, tab-delimited, dBase IV, and so on; some are even human-readable reports) and loads their contents into the database. Each record includes the source ID, for easy attribution and tracking.
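
    A minimal sketch of that loading step, in Python rather than the Perl mentioned above. The function name `load_records`, the column names, and the sample rows are all made up for illustration; the only idea taken from the description is that the filename serves as the source ID and is attached to every record.

```python
import csv
import io
import os

def load_records(path, text, delimiter=","):
    """Parse delimited text and tag every row with its source ID."""
    # Assumption: the filename doubles as the source code, as described above.
    source_id = os.path.basename(path)
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    # Copy each parsed row and add the attribution field.
    return [dict(row, source_id=source_id) for row in reader]

# Hypothetical sample data, not taken from any real source file.
sample = "drug,dose_mg\naspirin,100\nibuprofen,200\n"
records = load_records("/uploads/groupA_2004-01-15.csv", sample)
```

    A tab-delimited source would go through the same function with `delimiter="\t"`; report-style files would need their own parser that still emits the same `source_id` field.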

    We keep each file version for a while. Each file has a business date, so if a source wants to clobber a previous version of that date's data, it can do so by sending a new file.
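
    One way to get that clobbering behaviour is to key every row by source ID and business date, and to delete the old rows in the same transaction that inserts the new ones. The sketch below uses an in-memory SQLite database; the table layout, the `load_batch` name, and the sample values are assumptions for illustration, not the bank's actual schema.

```python
import sqlite3

def load_batch(conn, source_id, business_date, rows):
    """Replace whatever this source previously loaded for this business date."""
    with conn:  # one transaction: the delete and the inserts commit together
        conn.execute(
            "DELETE FROM records WHERE source_id = ? AND business_date = ?",
            (source_id, business_date),
        )
        conn.executemany(
            "INSERT INTO records (source_id, business_date, drug, dose_mg) "
            "VALUES (?, ?, ?, ?)",
            [(source_id, business_date, drug, dose) for drug, dose in rows],
        )

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records "
    "(source_id TEXT, business_date TEXT, drug TEXT, dose_mg INTEGER)"
)
load_batch(conn, "groupA.csv", "2004-01-15", [("aspirin", 100), ("ibuprofen", 200)])
load_batch(conn, "groupA.csv", "2004-01-15", [("aspirin", 75)])  # clobbers the earlier load
load_batch(conn, "groupB.csv", "2004-01-15", [("paracetamol", 500)])  # other source untouched
final_rows = conn.execute(
    "SELECT source_id, drug, dose_mg FROM records ORDER BY source_id"
).fetchall()
```

    Because the delete is scoped to one source and one date, a reload can never remove another contributor's rows.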

    This could keep your troubles to a minimum. Write a parser and set up a file-upload site that lets people upload data. Define groups of people if you want. By the nature of the file, each group should only be able to add, replace, or delete its own data: either each group may only create a certain filename, or the group ID is in the file itself.
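
    The filename rule could be enforced with a check like the following. The `<group>_<anything>` naming convention and the function names `owning_group` and `may_upload` are hypothetical; the comment only says that each group is tied to a certain filename.

```python
def owning_group(filename, known_groups):
    """Return the group whose name prefixes the filename, or None."""
    # Assumed convention: "<group>_<rest of name>", e.g. "cardiology_2004-01-15.csv".
    prefix = filename.split("_", 1)[0]
    return prefix if prefix in known_groups else None

def may_upload(user_group, filename, known_groups):
    """A user may only upload files that belong to their own group."""
    owner = owning_group(filename, known_groups)
    return owner is not None and owner == user_group

groups = {"cardiology", "oncology"}
allowed = may_upload("cardiology", "cardiology_2004-01-15.csv", groups)
denied = may_upload("oncology", "cardiology_2004-01-15.csv", groups)
```

    Here `allowed` is true and `denied` is false. Files whose prefix matches no known group are rejected for everyone, so a typo in the filename cannot create an unowned data set.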

    This way, many contributors can share database updates via a batch run in which updates are tracked and possibly even approved before they are committed.
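
    The approve-before-commit idea can be sketched as a queue of pending batches, where rows only reach the main store once a reviewer signs off. The `Batch` and `BatchQueue` classes and the `reviewer` field are illustrative assumptions, and the `committed` list merely stands in for the real database.

```python
from dataclasses import dataclass

@dataclass
class Batch:
    source_id: str
    rows: list
    status: str = "pending"

class BatchQueue:
    def __init__(self):
        self.pending = []
        self.committed = []  # stand-in for the real database

    def submit(self, source_id, rows):
        """Stage an uploaded batch; nothing is committed yet."""
        batch = Batch(source_id, rows)
        self.pending.append(batch)
        return batch

    def approve(self, batch, reviewer):
        """Commit a staged batch, recording who reviewed it on every row."""
        self.pending.remove(batch)
        batch.status = "approved"
        self.committed.extend(
            dict(row, source_id=batch.source_id, reviewer=reviewer)
            for row in batch.rows
        )

queue = BatchQueue()
batch = queue.submit("groupA.csv", [{"drug": "aspirin", "dose_mg": 100}])
staged_only = list(queue.committed)  # still empty: the batch awaits review
queue.approve(batch, reviewer="reviewer1")
```

    Keeping the reviewer on each committed row gives the peer-review tag the original question asked for, alongside the source ID.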