Evolutionary Database Design
Andre Mermegas writes "Check out this article by everybody's favorite object mentor Martin Fowler on database design. Be sure to take a peek at his wonderful books as well."
← Back to Stories (view on slashdot.org)
I think it's nice that people are starting to get interested in relational databases again. They really are the backbone of information systems in business, despite what the industry rags will have you believe.
The "hype" of object-oriented and XML-driven "databases", although aesthetically prettier, have adverse effects on performance and design. Programmers get lazy, applications become sloppy and performance goes into the toilet.
In the projects I've worked, I often find that the DBAs are older men or women and the developers are young. So the friction lies in the fact that the young-guns are doing .NET or Java or XML queing and so the DBA is really at a loss to help "the developer think of things he may have not thought about". Of course, on the table-design side, this maybe true. Secondly, due to the age-difference, "popping over the cube" is also difficult as the DBAs (being more mature shall I say) are less likely to be excited about a new paradigm.
Case in point, when I read in an Oracle PL/SQL book about Nested Tables, the light bulb in my head went off (or lit up, or whatever). Basically, these nested tables were objects with methods (code behind them), however, could be queried like tables. So, instead of selecting say a person's name, birthdate, and calculating an age, I could select name, birthdate, and age (the age column had code behind it automatically calculating the age). Now the beauty of this is for derived quantities that are only used once, but would be burdensome to store, this was a godsend. However, my DBA completely rejected this idea as too untried and new-fangeled.
This may sound very arrogant, but I think the developer should manage the DBA, often the DBA is a lone-wolf with too much power. Often the poor programmer has to submit changes with about as much hope they'll get done as one might have submitting universe changes to God Almighty.
"This isn't a study in computer science, its a study in human behavior"
A big issue with iterative development is that the QA folks will quickly fall behind and become very anxious. What's the solution to that? Either embrace the QA person to get closer to the real development environment, or if that is impossible, get a new QA person. That's the only way to succeed.
Sex - Find It
While I believe that 'Agile processes' are the best way to develop software, he appears to be advocating anarchy.
When you have a lot of people all making little changed everyone starts to loose sight of the Big picture and you run into a Too many cooks spoil the broth.
I'm sure a lot of people who read this site have seen a lot of code and design, and probably a lot of horrific code and design, well enough said.
thank God the internet isn't a human right.
Just like the actual users the QA folks should be heavily involved in the actual development cycles. Extreme programming states that every development cycles starts with a functional design and the development of tests for each deliverable. Having these available means that QA is able to keep track of the quality of the deliverables very well.
IMHO if QA cannot keep track of the big picture they fail as QA, because that is just an important part of their job. On the other hand perhaps extreme programming should involve relatively more QA people than 'regular' development methods.
giel.y contains 2 shift/reduce conflicts
I'm currently working as a developer, but I used to work as a development DBA. In my opinion this article shows the database and the DBA roles in a project from a developer perspective.
As a general rule, the developers think that the database is there to support their application, which is really the piece that solves the problem. In the other hand DBAs think that the developers are there to support their data model, by supplying an interface with validation and some simple pieces of logic that their store procedures don't cover.
I have worked much longer as a developer than as a DBA, but I still find it funny that the article assumes that the developer should be able to add a column to a table freely and the incorporate the changes to the main database. This is the equivalent of saying the DBA should be able to freely change a class or an interface and then add the changes to the source control repository.
While not wrong in itself, it clearly shows that many developers consider the DBA role secondary to the developer. It goes something like this: I can somehow do some DBA tasks that impact the development like adding tables to the schema, I just don't want a get involved in the boring parts (backups, recovery or replicating schemas).
I think that creating a good data model is as difficult as creating a good application design and doing a decent store procedure as hard as doing an efficient method. While some DBAs can write very good C++/Java code and some developers can design very good data models, no one should be doing each other job unless they really, really, really know what they are doing.
As a general rule of thumb, if you consider that mySQL is a better database for large complex applications as PostgreSQL or Oracle, you should not be doing any database work.
I work on a project of about a dozen developers, some os us geographically diverse. We use an Object-Oriented Database with Java (Database is from Versant).
We don't worry about *any* kind of DB administrator. Each developer has their own instance of the database. We don't worry about schema changes that break the database, because we *also* have a way to import/export the database to an XML file. Thus, if the schema radically changes from the deployed version, we just export to XML and re-import, so there is no complex though about 'schema evolution' necessary,
Of course, with everyone having such free ability to make changes that impacts the format of the data store, we need good unit tests to make sure things don't break unintentionally. This is actually one area we need to improve upon. People can make changes that affect the schema easily, and most times, its not an issue... but a lot of times people make changes that would impact the XML format, and they don't always handle it properly.
Unit Tests are Key, but that's nothing new to the concept of refactoring.
is iterative design. Which is becoming fairly widely accepted in OO circles, and almost universally accepted in Agile circles.
Databases, however, are a lot harder to iterate - the cost of change is higher than with any other code. Martin Fowler is laying down an approach to manage (not reduce - manage) that cost, and it all comes down to a guess we have to make - do we think the overall cost/benefit tradeoff of an iterative process is better than a Big Design Up Front process ?
On the eXtreme Programming mailing list, there's been a lot of discussion about how to deal with databases - some deny the need for databases altogether, some advocate using Mock Objects for testing and even development etc. It all boils down to the cost of change - it's expensive to change a database design because it is very hard to identify the knock-on effects. Some changes are relatively easy to manage - adding a column is unlikely to actually break anything - but others can wreak havoc with existing applications - changing the type or size of a column for instance.
I'd love to think that the next big improvement in software development tools is not going to be yet another language but a sensible way of tying objects to their persisted data. All the solutions I've seen so far are bolted-on - they either force the database into unnatural positions, or make the objects fit into a model that's not quite what they'd be otherwise.
In the meantime, this article is well worth investigating - the idea of evolving the datamodel in tandem with the migration scripts is very powerful.
It's all very well in practice, but it will never work in theory.