Data Migration Between CMS Repositories?
StyleChief asks: "My employer has decided to begin migrating all of the company's documentation-oriented objects and files to a new content management system. The new system seems to have good functionality, robustness, and better usability than our current systems. However, the task of migrating all of the data from 2 or 3 other repositories to the new system seems to be a daunting chore. Automating the process as much as possible is of course my first goal. There are APIs that one can use to do this, but the details quickly become eye-opening. Questions of objects versus files, handling their attributes, authorizations, file type identification, shadowing, build integration, versioning, etc., are just some of the many issues at hand. Moving perhaps hundreds of thousands of objects from one proprietary repository to another while preserving everything related to each object is the name of the game. I would like to know how others from Slashdot have dealt with similar scenarios. I am particularly interested in the 'lessons learned,' and the problems that you didn't see coming beforehand."
The swiftest way to migrate data between repositories.
Drop the firewall.
Interestingly, none of the "migration articles" on web sites that are explicitly devoted to CMS matters (e.g., CMSwatch.com, cmsReview.com) seems to characterize this problem as one of Extraction, Transformation, and Loading (ETL), which suggests that their authors may be unaware of the many ETL tools that are available. In the open source world, these tools include Octopus and Jetstream. Of course, Perl programmers do not call this process "ETL," but, rather, simply "data munging."
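For anyone who has not seen the ETL pattern spelled out, here is a minimal sketch in Python of what the three stages might look like for repository objects. Everything in it is an assumption made for illustration: the row layout, the SourceObject class, the ATTRIBUTE_MAP names, and the use of a plain dict as the "target" are all hypothetical and stand in for whatever export format and API your real systems provide. It is not how Octopus, Jetstream, or any particular CMS does it.

```python
from dataclasses import dataclass, field
import hashlib


@dataclass
class SourceObject:
    """Hypothetical record as exported from the old repository."""
    object_id: str
    name: str
    content: bytes
    attributes: dict = field(default_factory=dict)
    version: str = "1.0"


def extract(source_rows):
    """Extract: turn raw export rows (dicts) into typed records."""
    for row in source_rows:
        yield SourceObject(
            object_id=row["id"],
            name=row["name"],
            content=row["content"],
            attributes=dict(row.get("attrs", {})),
            version=row.get("version", "1.0"),
        )


# Hypothetical mapping from old attribute names to new ones.
ATTRIBUTE_MAP = {"author": "created_by", "dept": "department"}


def transform(obj):
    """Transform: rename mapped attributes, keep the rest, add a checksum."""
    attrs = {ATTRIBUTE_MAP.get(k, k): v for k, v in obj.attributes.items()}
    return {
        "legacy_id": obj.object_id,
        "title": obj.name,
        "content": obj.content,
        "version": obj.version,
        "attributes": attrs,
        "sha256": hashlib.sha256(obj.content).hexdigest(),
    }


def load(records, target):
    """Load: write into the target (a dict here), skipping duplicates."""
    loaded = 0
    for rec in records:
        key = (rec["legacy_id"], rec["version"])
        if key in target:
            continue  # already migrated, so a re-run is harmless
        target[key] = rec
        loaded += 1
    return loaded


def migrate(source_rows, target):
    """Run the three stages end to end and return the number loaded."""
    return load((transform(o) for o in extract(source_rows)), target)
```

Keying the target on the legacy ID plus version means the job can be re-run after a failure without creating duplicates, and the stored checksum gives you something to compare against the source when you audit the result. Authorizations, shadowing, and build integration would each need their own transform rules, which this sketch does not attempt.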
A prior Slashdot story on "Transferring data 'tween databases" (posted 14 April 2003) might interest you. I cannot post a link to it, however, because Slashdot's search engine is currently down.
Finally, EMC just bought Documentum, the CMS that you are considering. EMC is primarily a storage company, and I cannot help but wonder how CMS fits into their storage strategy.
A lawyer & digital forensics examiner. Also an expert on open source software (OSS).
Data Junction specializes in such work (among other stuff).