How Do You Sync Database Schemas?
Rob Sweet asks: "I recently got started coding for a PHP front end to RRDTool. Right now, there are only two developers but we get the impression that once a protocol is in place, we'll have several more. The question has been posed: We can use CVS to keep our code synchronized but how do we go about keeping our database schemas synchronized? The obvious answers involve using mysqldump to keep updated table creation scripts in CVS but I'm wondering if there isn't a better way..." At the very least, a file containing a list of schema changes would be necessary, but what about programs that can take two schemas, look at the differences, and return the commands necessary to make the one mirror the other?
In the past, I have created scripts that populate data. You can do several levels of refresh, refresh just the data, or delete all objects and re-create them as well. This is a bit of a pain to set up, but it works well in simple cases.
Using ant, I just had a task to take a snapshot of the data in the database and save it to cvs. I then had a data refresh as a general part of the setup.
Every once in a while, we would rebuild all objects by dropping all tables and recreating them. This was nice in development, but a pain in production (reloading a 4million row table takes a while, not to mention keeping the data in CVS)
For production usage, we created alter table scripts that got added with the correct TAG. When we installed a build, all alter scripts were run before any code was pushed.
Mike Mangino
mmangino@acm.org
While a metadirectory doesn't solve your problems, it's almost certainly better then scripts populating your databases and helps keep data redundancy to a minimum. Also, realize that 'synching' database schemas is almost impossible since nearly every vendor today uses different standards. You will almost definitly need to brew some homemade software. Good luck.
Check out this product from Red-Gate Software.
I've been working in private for a while on my own ERD-like software (similar in flavor to Erwin and the likes), and I've dealt with this problem to some degree. It's much easier to have a higher-level tool deal with the issue. In my case, abstract schemas are stored in XML, and I have a tool that parses the XML and generates all the sql for "create table blah blah blah" for whatever db vendor you pick. Then there's another tool that can diff between two revisions of the XML schema definition and issue "alter table blah blah" statements to update a database's table layout.
Mine won't be ready for public consumption for some time yet, since I only work on it now again in spare time, and it has big huge unrealistic design goals - but it's not a hard job to build a simple version of the above on your own that's tailored to just your needs.
11*43+456^2
The scripts you use to create and populate the database should go into CVS and then you can run them to create or recreate your database and data.
I have also used CVS to store a file that is a series of database alters or a series of files that are a bunch of db alters that are to applied together to form a kind of DB release or tag.
DDL and load scripts are just code, treat them as code.
Some database shops use CASE tools for data model generation and reverse engineering. Ultimately, these sorts of tools represent a data model with an object model, allow direct editing of the internal representation, can import by examining the data dictionary of a datbase, and can generate SQL DDL as needed to apply the difference or create from scratch.
In proprietary realm, Oracle Designer is pretty good at this sort of thing. You can get a developer licence for free from technet.oracle.com, but it's big $ for production use.
There are some open source tools for this, but they all seem to be are fairly young. I happened to notice one on Freshmeat today called Alzabo.
We perform Nightly backups of the dynamic tables of the database. We can use these to rereate on our staging machine the state of live. We have a separate table (updates) that stores the current update level. We have a script called updateXXXX.pl which contains all the sql code necessary to update the DB. the XXXX is replaced with the update number (update0016.pl is our current). When the script runs, it checks the update table to see what the last script run was. If you are attempting to run update 15, and 14 hasn't been run, it will stop you.
This is part of our QA process. Before a build goes to staging, we wipe out the staging database and replace it with the copy from live. Ten we run the update and push the build code. Once a staging build is blessed, we can push it live.
Open Source Identity Management: FreeIPA.org
If you use MySQL or PostgreSQL you can use Alzabo to synchronize database schemas.
--
Ilya Martynov (http://martynov.org/)
Writing a python script that would read this and output SQL is pretty easy. You could even do this it in XSL. Once you write the script, it should work great. You can tweak the script to output SQL for different DBMSs.
Make sure both developers are using the same dtd (to make sure your XML is valid). And since XML is verbose (is it _ever_), commits have a better chance of not clashing unless you're both hacking the same part of the schema.
My father is a blogger.
we put a mysqldump (with a couple of options) in the cvs! We considered other options, but it's by far the easiest!
If you don't mind using Java, check out the Jakarta project Turbine-Torque at http://jakarta.apache.org/turbine/torque/.a ct schemas are stored in XML and Torque parses the XML and generates the database schema for whatever database vendor you pick. (Torque also generates Java code to easily work with data in the database).
Abstr
Yeah, I know: this is off topic; but it's schemata, not schemas, isn't it?
Either you haven't RTMed, or you didn't explain very well. The documentation has it here. Just make both masters and both slaves, and hope you don't trample on each other.
:-)) wanted to do, and did when I went on vacation! Instead of coding properly, they added tables to make it easier. And then they wonder why the DB slows down on their queries. Or they want to know a concrete method (query or idea) of encapsulating all data of a certain type, and wonder why I can't do it, since they left so many holes in the system.
Now that you have the key to complete disaster, I will warn you since you obviously don't know this. This is a stupid idea. Hmm... that not harsh enough. If you were working for me, I'd reconsider your employment if you came up with this crazy idea.
You should never, ever, ever, ever, have more than one person working on the same schema while coding! Either the database will be driven by your code, which is a quick way to denormalize everthing, and wreck havok, or more than one person will drive the design, and will denormalize and wreck havoc.
Only one person (or group) should make decisions on a schema, and it should be done, *before* any coding is done. The database structure will lend structure to the program itself.
I was a DBA for Oracle at a small company. You wouldn't believe what those idiots (Andrew, you listening?
The database should be designed to handle the system. And that it much more important than coding. Both because of speed and structural reasons. The only time the database should be changed, is when there was a mistake in the original design, or the project it is for has changed dramatically.
So, their is a way to do it, but its on par with logging in as root all the time, so things are easier. Don't do it.
Have you read my journal today?
I've used a system where the database has a version number (in a table).
We then have a script with a list of 'update commands' to run (usually SQL). Each update command has a database version attribute. It runs itself if it's version is newer than the databases. Then it updates the database version.
You could have various permutations on the implementation of this idea (we do!).
The upshot is that you'd have a script that developers could run that would perform whatever SQL DDL or DML changes are needed to get their databases into the right order.
For an opensource solution, TOra on sourceforge will do. Too lazy to link, go there and search.
t id =440
It's compare functionality wasn't quite as robust as I would have liked, but on all platforms but Windows, its free (as in speech and beer). And it supports multiple dbs.
If it's Oracle you're using, TOAD from www.toadsoft.com is currently the best product out there for this. It will compare what you want, and generate the SQL to make one look like the other including stored procs, sequences, etc...
It is also the best available 3rd party tool for Rev-engineering Oracle databases - with DBArtisan second. Erwin is useless for this type of thing, since it doesn't support Index Organized tables. I'm a DBA by trade, and find that most graphical type tools are about 75-80% useful. Trouble is, that remaining 20-25% is where the complexity lies.
Unfortunately, the DBA module that supports this functionality is an add on. The whole thing will probably set your company back about $600 or so.
Another decent comparison Oracle tool is oracompare...
http://www.delphicity.net/download.cfm?componen
Is the actual design of the database going to change so much that anything more the mysqldump is important?
Unlike code, databases generally need to be planned out as fully as you can in advance, for normalization (And maybe other technical buzz words I can't remember) - So while the data will change, the structure should always be the same. (And the data been different shouldnt be a problem)
To make things easy, adding a little script that dumps the database, then updates the cvs in one easy command shouldn't take more then a few moments to russle up.
I saw the light at the end of the tunnel... But it was just someone with a flashlight bringing more work.
Simply hire people who know how to do their jobs.
Why is it that every "Ask Slashdot" question is either (a)"I don't know how to do my job/homework. Can you you do it for me for free?" or (b)"I don't know how to use Google. Give me some answers."
There is a product, called DBExaminer, that will analyze the differences between two schemas and create a script to update one schema to the other. It's not free, it's not Open Source, it's not perfect, but it does work for that task, and it's not horribly expensive (on the order of $2000). It will also analyze your model for Normalization errors...probably a bit too thoroughly, but too much information is better than not enough. You can get a demo at DBE Software's site . No, I'm not an employee nor do I make any money from recommending their stuff, I just have used it at a couple of different jobs and it worked darned well for what we needed it for.
Since you appear to be using MySQL, which is not a database, obviously you don't have a need to keep the "database" schemas in sync.
Maybe you should call back when you are using a real database such as Oracle, or even PostgreSQL.
The main problem is that if you change the dev copy of data, you'll have to rebuild it from scratch unless you can rollback.
This is 101 level stuff: plan first, code later. You don't make things up as you go along.