Slashdot Mirror


Data Migration Between CMS Repositories?

StyleChief asks: "My employer has decided to begin migrating all of the company's documentation-oriented objects and files to a new content management system. The new system seems to have good functionality, robustness, and better usability than our current systems. However, the task of migrating all of the data from 2 or 3 other repositories to the new system seems to be a daunting chore. Automating the process as much as possible is of course my first goal. There are APIs that one can use to do this, but the details quickly become eye-opening. Questions of objects versus files, handling their attributes, authorizations, file type identification, shadowing, build integration, versioning, etc., are just a few of the many issues at hand. Moving perhaps hundreds of thousands of objects from one proprietary repository to another while preserving everything related to each object is the name of the game. I would like to know how others from Slashdot have dealt with similar scenarios. I am particularly interested in the 'lessons learned,' and the problems that you didn't see coming beforehand."

4 of 16 comments

  1. The swiftest way. by Anonymous Coward · · Score: 1, Informative

    The swiftest way to migrate data between repositories.

    Drop the firewall.

  2. A few thoughts by Paul Bain · · Score: 5, Informative
    This is one of the most difficult and important questions that web developers face today. It is important because, in the future, most web content (of businesses, associations, and large institutions, at least) will be managed with content management systems (CMSs), and it is difficult for obvious reasons. I have followed CMS literature for years and have seen only a few articles on this matter, of which this is one of the best, although far too brief and general. See also "Fear of migration."

    Interestingly, none of these "migration articles" on web sites that are explicitly devoted to CMS matters (e.g., CMSwatch.com, cmsReview.com) seems to characterize this problem as one of Extraction, Transformation, and Loading (ETL), raising the possibility that their authors are unaware of the many ETL tools that are available. In the open source world, these tools include Octopus and Jetstream. Of course, Perl programmers do not call this process "ETL," but, rather, simply "data munging."
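
    To make the ETL framing concrete, here is a minimal Python sketch of a repository migration split into extract, transform, and load steps. It is only an illustration under stated assumptions: the SourceObject and TargetObject shapes, the ATTRIBUTE_MAP entries, and the in-memory "repositories" are all hypothetical, and a real migration would call each CMS's own API instead.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List


@dataclass
class SourceObject:
    """One object as exported from the old repository (hypothetical shape)."""
    object_id: str
    attributes: Dict[str, str]
    versions: List[bytes]  # oldest first


@dataclass
class TargetObject:
    """The same object reshaped for the new repository (hypothetical shape)."""
    legacy_id: str
    metadata: Dict[str, str]
    versions: List[bytes]


# Hypothetical mapping from old attribute names to new ones.
ATTRIBUTE_MAP = {"doc_title": "title", "owner_name": "owner"}


def extract(source: Iterable[SourceObject]) -> Iterator[SourceObject]:
    """Extract: yield objects one at a time so large repositories stream."""
    for obj in source:
        yield obj


def transform(obj: SourceObject) -> TargetObject:
    """Transform: rename known attributes, keep unknown ones under a prefix."""
    metadata = {}
    for name, value in obj.attributes.items():
        metadata[ATTRIBUTE_MAP.get(name, "legacy_" + name)] = value
    return TargetObject(obj.object_id, metadata, list(obj.versions))


def load(target: Dict[str, TargetObject], obj: TargetObject) -> None:
    """Load: store the object; a real loader would call the new CMS's API."""
    target[obj.legacy_id] = obj


def migrate(source: Iterable[SourceObject],
            target: Dict[str, TargetObject]) -> List[str]:
    """Run the pipeline and return the IDs that failed, for later review."""
    failed = []
    for obj in extract(source):
        try:
            load(target, transform(obj))
        except Exception:
            failed.append(obj.object_id)
    return failed
```

    Keeping the old object ID and any unmapped attributes, rather than dropping them, is one way to leave a trail for checking afterwards that nothing related to an object was lost.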

    A prior Slashdot story on "Transferring data 'tween databases" (posted 14 April 2003) might interest you. I cannot post a link to it, however, because Slashdot's search engine is currently down.

    Finally, EMC just bought Documentum, the CMS that you are considering. EMC is primarily a storage company, and I cannot help but wonder how CMS fits into their storage strategy.

    --

    A lawyer & digital forensics examiner. Also an expert on open source software (OSS).
    1. Re:A few thoughts by Paul Bain · · Score: 2, Informative

      Here's the link to the prior Slashdot story.

      --

      A lawyer & digital forensics examiner. Also an expert on open source software (OSS).
  3. Data Junction by Anonymous Coward · · Score: 1, Informative

    Data Junction specializes in such work (among other stuff).