Evolutionary Database Design


Posted by Hemos on Sunday January 5, 2003 @06:33AM from the jurassic-database-park dept.

Andre Mermegas writes "Check out this article by everybody's favorite object mentor Martin Fowler on database design. Be sure to take a peek at his wonderful books as well."

171 comments

FP by Anonymous Coward · 2003-01-05 06:36 · Score: 0

FP
IN SOVIET RUSSIA... by Anonymous Coward · 2003-01-05 06:38 · Score: 0

Database designs YOU!
1. Re:IN SOVIET RUSSIA... by I'm+not+a+script,+da · 2003-01-05 10:25 · Score: 0
  
  What a country!
ruined by decimal ? by Anonymous Coward · 2003-01-05 06:38 · Score: 0

Indexing of the database has been ruined by slow decimal to binary conversion routines. Decimal ruins yet another thing.
woohoo by jon787 · 2003-01-05 06:40 · Score: 0, Redundant

I actually get to read an article BEFORE it is knocked off the net by the slashdot effect!

--
X(7): A program for managing terminal windows. See also screen(1).
Why brute force? by The+Z+Master · 2003-01-05 06:42 · Score: 0, Offtopic

Instead of doing a brute-force crack of the private key, why not use an intelligent algorithm for cracking it? As I understand it, the other distributed.net projects used brute force just to show how much time a brute-force attack could take. If this project is really about discovering the key and not about seeing how long it takes to stumble upon it at random, then shouldn't they use a smart algorithm?
1. Re:Why brute force? by TheAntiCrust · 2003-01-05 07:10 · Score: 0, Offtopic
  
  You replied to the wrong article with your brilliant ideas on how to be more efficient?
  Ahhh, the irony.
(-1, replied to wrong story) by Anonymous Coward · 2003-01-05 06:46 · Score: 0

nt
there you go...., not /.ed for me? by Anonymous Coward · 2003-01-05 06:48 · Score: 0

Dealing with Change

One of the primary features of agile methods is their attitude towards change. Most of the thinking about software process is about understanding requirements early, signing off on these requirements, using the requirements as a basis for design, signing off on that, and then proceeding with construction. This is a plan-driven cycle, often referred to (usually with derision) as the waterfall approach.

Such approaches look to minimize changes by doing extensive up-front work. Once the early work is done, changes cause significant problems. As a result such approaches run into trouble if requirements are changing, and requirements churn is a big problem for such processes.

Agile processes approach change differently. They seek to embrace change, allowing changes to occur even late in a development project. Changes are controlled, but the attitude of the process is to enable change as much as possible. Partly this is in response to the inherent instability of requirements in many projects, partly it is to better support dynamic business environments by helping them change with the competitive pressures.

In order to make this work, you need a different attitude to design. Instead of thinking of design as a phase, which is mostly completed before you begin construction, you look at design as an on-going process that is interleaved with construction, testing, and even delivery. This is the contrast between planned and evolutionary design. One of the vital contributions of agile methods is that they have come up with practices that allow evolutionary design to work in a controlled manner. So instead of the common chaos that often happens when design isn't planned up-front, these methods provide techniques to control evolutionary design and make it practical.

An important part of this approach is iterative development, where you run the entire software life-cycle many times during the life of a project. Agile processes run complete life cycles in each iteration, completing the iteration with working, tested, integrated code for a small subset of the requirements of the final product. These iterations are short, usually running between a week and a couple of months, with a preference towards shorter iterations.

While these techniques have grown in use and interest, one of the biggest questions is how to make evolutionary design work for databases. Most people consider that database design is something that absolutely needs up-front planning. Changing the database schema late in the development tends to cause wide-spread breakages in application software. Furthermore, changing a schema after deployment results in painful data migration problems.

Over the course of the last three years we've been involved in a large project (called Atlas) that has used evolutionary database design and made it work. The project involved almost 100 people in multiple sites world-wide (US, Australia, and India). It is around half a million lines of code and has over 200 tables. The database evolved during a year and a half of initial development and continues to evolve even though it's in production for multiple customers. During this project we started with iterations of a month, but after a few months changed to two-week iterations, which worked better. The techniques we describe here are the ones that we (or more accurately Pramod) used to make this work.

Since that project got going we've spread these techniques over more of our projects, gaining more experience from more cases. We've also found inspiration, ideas, and experience from other agile projects.
Limitations

Before we dive into the techniques, it's important to state that we haven't solved all the problems of evolutionary database design. In particular:

* We developed an application database for a single application rather than an integration database that tries to integrate multiple databases.
* We don't have to keep the production databases up 24/7.

We don't consider these problems to be inherently unsolvable; after all, many people believed we couldn't solve this one. But until we do, we won't claim we can solve them either.
The Practices

Our approach to evolutionary database design depends on a handful of important practices.
DBAs collaborate closely with developers

One of the tenets of agile methods is that people with different skills and backgrounds need to collaborate very closely together. They can't communicate mainly through formal meetings and documents. Instead they need to be out talking with each other and working with each other all the time. Everybody is affected by this: analysts, PMs, domain experts, developers... and DBAs.

Every task that a developer works on potentially needs a DBA's help. Both the developers and the DBA need to consider whether a development task is going to make a significant change to the database schema. If so the developer needs to consult with the DBA to decide how to make the change. The developer knows what new functionality is needed, and the DBA has a global view of the data in the application.

To make this happen the DBA has to make himself approachable and available. Make it easy for a developer to just pop over for a few minutes and ask some questions. Make sure the DBAs and developers sit close to each other so they can easily get together. Ensure that application design sessions are known about so the DBA can pop in easily. In many environments we see people erecting barriers between the DBA and application development functions. These barriers must come down for an evolutionary database design process to work.
Everybody gets their own database instance

Evolutionary design recognizes that people learn by trying things out. In programming terms developers experiment with how to implement a certain feature and may make a few attempts before settling down to a preferred alternative. Database design can be like that too. As a result it's important for each developer to have their own sandbox where they can experiment, and not have their changes affect anyone else.

Many DBA experts see multiple databases as anathema, too difficult to work in practice, but we've found that you can easily manage a hundred or so database instances. The vital thing is to have the tools to allow you to manipulate databases much as you would manipulate files.
Developers frequently integrate into a shared master

Although developers can experiment frequently in their own area, it's important to bring the different approaches back together again frequently. An application needs a shared master database that all work flows from. When a developer begins a task they copy the master into their own workspace, manipulate, and then integrate their changes back into the master. As a rule of thumb each developer should integrate once a day.

Let's take an example where Mike starts a development task at 10am (assuming he actually comes in that early). As part of this task he needs to change the database schema. If the change is easy, like adding a column, he just decides how to make the change himself. Mike also makes sure the column he wants to add does not already exist in the database, with the help of the data dictionary (discussed later). If it's more complicated then he grabs the DBA and talks over the likely changes with him.

Once he's ready to begin he takes a copy of the database master and can modify both the database schema and code freely. As he's in a sandbox any changes he makes don't impact anyone else's. At some point, say around 3pm, he's pretty comfortable that he knows what the database change needs to be, even though he's not completely done with his programming task. At that point he grabs the DBA, and tells him about the change. At this point the DBA can raise any issues that Mike hasn't considered. Most of the time all is well and the DBA goes off and makes the change (by applying one or more database refactorings, which we'll come to below). The DBA makes the changes right away (unless they are destructive changes - again more on that below). Mike can continue to work on his task and commit his code any time he likes once the DBA has applied these changes to the master.

You may well recognize this principle as similar to the practice of Continuous Integration, which is applied to source code management. Indeed this is really about treating the database as another piece of source code. As such the master database is kept under configuration management in much the same way as the source code. Whenever we have a successful build, the database is checked into the configuration management system together with the code, so that we have a complete and synchronized version history of both.

With source code, much of the pain of integration is handled by source code control systems. For databases there's a bit more effort involved. Any changes to the database need to be done properly, as automated database refactorings, which we'll discuss shortly. In addition the DBA needs to look at any database changes and ensure that they fit within the overall scheme of the database schema. For this to work smoothly, big changes shouldn't come as surprises at integration time - hence the need for the DBA to collaborate closely with the developers.

We emphasize integrating frequently because we've found that it's much easier to do frequent small integrations rather than infrequent large integrations. It seems that the pain of integration increases exponentially with the size of the integration. As such doing many small changes is much easier in practice, even though it often seems counter-intuitive to many. This same effect's been noticed by people in the Software Configuration Management community for source code.
A database consists of schema and test data

When we talk about a database here, we mean not just the schema of the database, but also a fair amount of data. This data consists of common standing data for the application, such as the inevitable list of all the states in the US, and also sample test data such as a few sample customers.

The data is there for a number of reasons. The main reason is to enable testing. We are great believers in using a large body of automated tests to help stabilize the development of an application. Such a body of tests is a common approach in agile methods. For these tests to work efficiently, it makes sense to work on a database that is seeded with some sample test data, which all tests can assume is in place before they run.

As well as helping test the code, this sample test data also allows us to test our migrations as we alter the schema of the database. By having sample data, we are forced to ensure that any schema changes also handle sample data.

In most projects we've seen this sample data be fictional. However in a few projects we've seen people use real data for the samples. In these cases this data's been extracted from prior legacy systems with automated data migration scripts. Obviously you can't migrate all the data right away, as in early iterations only a small part of the database is actually built. But the idea is to iteratively develop the migration scripts just as the application and the database are developed iteratively. Not only does this help flush out migration problems early, it makes it much easier for domain experts to work with the growing system, as they are familiar with the data they are looking at and can often help to identify cases that may cause problems for the database and application design. As a result we are now of the view that you should try to introduce real data from the very first iteration of your project.
All changes are database refactorings

The technique of refactoring is all about applying disciplined and controlled techniques to changing an existing code base. Similarly we've identified several database refactorings that provide similar control and discipline to changing a database.

One of the big differences about database refactorings is that they involve three different changes that have to be done together:

* Changing the database schema
* Migrating the data in the database
* Changing the database access code

Thus whenever we describe a database refactoring, we have to describe all three aspects of the change and ensure that all three are applied before we apply any other refactorings.
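
As a rough illustration of how the three parts travel together, here is a minimal sketch in Python using the standard sqlite3 module. The customer table, the status column, and the function names are made up for the example and are not taken from the Atlas project; the article itself only says that the schema change, the data migration, and the access code change belong to one refactoring.

```python
import sqlite3

# Hypothetical refactoring: add a "status" column to a "customer" table.
SCHEMA_CHANGE = "ALTER TABLE customer ADD COLUMN status TEXT"  # 1. schema (DDL)
DATA_MIGRATION = "UPDATE customer SET status = 'active' WHERE status IS NULL"  # 2. data (DML)


def apply_refactoring(conn):
    """Run the schema change and the data migration together."""
    # "with conn" commits at the end of the block, or rolls back the
    # open transaction if one of the statements raises.
    with conn:
        conn.execute(SCHEMA_CHANGE)
        conn.execute(DATA_MIGRATION)


def add_customer(conn, name, status="active"):
    """3. Access code, updated so that it knows about the new column."""
    with conn:
        conn.execute(
            "INSERT INTO customer (name, status) VALUES (?, ?)", (name, status)
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO customer (name) VALUES ('Sue'), ('Mike')")  # sample data
apply_refactoring(conn)
add_customer(conn, "Pramod", status="new")
rows = conn.execute("SELECT name, status FROM customer ORDER BY id").fetchall()
```

The seeded rows matter here: because the table already holds sample data, the migration step has something to act on, so a refactoring that forgot the data would show up immediately.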

We are still in the process of documenting the various database refactorings, so we aren't able to go into detail on them yet. However there are a few things we can point out. Like code refactorings, database refactorings are very small. The concept of chaining together a sequence of very small changes is much the same for databases as it is for code. The triple nature of the change makes it all the more important to keep to small changes.

Many database refactorings, such as adding a column, can be done without having to update all the code that accesses the system. If code uses the new schema without being aware of it, the column will just go unused. Many changes, however, don't have this property. We call these destructive changes, an example of which is making an existing nullable column not null.

Destructive changes need a bit more care, the degree of which depends on the degree of destruction involved. An example of a minor destructive change is that of changing a column from nullable to not null. In this case you can probably just go ahead and do it. The refactoring will take care of any data in the database that's null. Usually the only developer who cares about this property is the one who requested the change, and that developer will update the database mapping code. As a result the update won't break anyone else's code and if by some strange chance it does, they find out as soon as they run a build and use their tests. (On our large project we gave ourselves some extra breathing space by waiting a week before making the database change.)

Splitting a heavily used table into two however is a rather more complicated case. In this case it's important to let everyone know that the change is coming up so they can prepare themselves for it. In addition it's worth waiting for a safer moment to make the change. (These kinds of changes we would defer until the start of a new iteration - we like to use iterations of two weeks or less).

The important thing here is to choose a procedure that's appropriate for the kind of change that you're making. If in doubt try to err on the side of making changes easier. Our experience is that we got burned much less frequently than many people would think, and with a strong configuration control of the entire system it's not difficult to revert should the worst happen.
Automate the refactorings

In the world of code we are seeing tools for some languages to automate many of the identified refactorings. Such automation is essential for databases, at least in the areas of schema changes and data migration.

As a result every database refactoring is automated by writing it in the form of SQL DDL (for the schema change) and DML (for the data migration). These changes are never applied manually; instead they are applied to the master by running a small SQL script to perform the changes.

Once done, we keep hold of these script files to produce a complete change log of all the alterations done to the database as a result of database refactorings. We can then update any database instance to the latest master by running the change log of all the changes since we copied the master to produce the older database instance.
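
One way to picture the change log is as an ordered list of scripts plus a record, inside each database instance, of the last script applied. The sketch below is an assumption about how that could look, written in Python with sqlite3; the schema_version table, the version numbers, and the scripts are hypothetical, and the article does not say how the Atlas tooling tracked this.

```python
import sqlite3

# Hypothetical change log: (version, SQL script) pairs, oldest first.
CHANGE_LOG = [
    (1, "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);"),
    (2, "ALTER TABLE customer ADD COLUMN email TEXT;"),
    (3, "ALTER TABLE customer ADD COLUMN status TEXT; "
        "UPDATE customer SET status = 'active';"),
]


def current_version(conn):
    """Return the last change applied to this instance (0 if none)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)"
    )
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    return row[0] or 0


def update_to_master(conn, change_log=CHANGE_LOG):
    """Run every change newer than the instance's recorded version, in order."""
    applied = []
    start = current_version(conn)
    for version, script in sorted(change_log):
        if version > start:
            conn.executescript(script)
            conn.execute(
                "INSERT INTO schema_version (version) VALUES (?)", (version,)
            )
            conn.commit()
            applied.append(version)
    return applied


conn = sqlite3.connect(":memory:")
# The instance was copied when the master had only the first two changes.
first = update_to_master(conn, CHANGE_LOG[:2])
# Later it catches up with everything applied to the master since then.
second = update_to_master(conn)
```

Running the update a second time applies only what is new, which is what lets the same log serve both a developer's sandbox and a production upgrade.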

This ability to sequence automated changes is an essential tool both for the continuous integration process in development, and for migrating production databases to a new release.

For production databases we don't make changes during the usual iteration cycles. Once we do a release, which may occur at the end of any iteration, we apply the full change log of database refactorings since the previous release. This is a big change, and one that so far we've only done by taking the application offline. (We have some ideas for doing this in a 24/7 environment, but we haven't actually had to do it yet.) It's also wise to test this migration schema before applying it to the live database. So far, we've found that this technique has worked remarkably well. By breaking down all the database changes into a sequence of small, simple changes, we've been able to make quite large changes to production data without getting ourselves in trouble.

As well as automating the forward changes, you can consider automating reverse changes for each refactoring. If you do this you'll be able to back out changes to a database in the same automated way. We haven't done this yet, as we've not had much demand for it, but it's the same basic principle.
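
Since the authors say they had not built this, the following is only a sketch of what paired forward and reverse scripts might look like, in Python with sqlite3. The refactorings, version numbers, and the migrate function are all invented for illustration.

```python
import sqlite3

# Hypothetical refactorings, each with a forward and a reverse script.
REFACTORINGS = [
    {"version": 1,
     "forward": "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);",
     "reverse": "DROP TABLE customer;"},
    {"version": 2,
     "forward": "CREATE INDEX idx_customer_name ON customer (name);",
     "reverse": "DROP INDEX idx_customer_name;"},
]


def migrate(conn, current, target):
    """Move an instance from version `current` to `target`, either direction."""
    if target >= current:
        steps = [r["forward"] for r in REFACTORINGS
                 if current < r["version"] <= target]
    else:
        # Back out the newest change first.
        steps = [r["reverse"] for r in reversed(REFACTORINGS)
                 if target < r["version"] <= current]
    for script in steps:
        conn.executescript(script)
    return target


def object_names(conn):
    """List the tables and indexes currently present in the instance."""
    return sorted(r[0] for r in conn.execute("SELECT name FROM sqlite_master"))


conn = sqlite3.connect(":memory:")
migrate(conn, 0, 2)
after_forward = object_names(conn)
migrate(conn, 2, 1)
after_back_out = object_names(conn)
```

A reverse script for a change that discards data (dropping a column, say) would also have to restore that data from somewhere, which this sketch does not attempt.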

(A similar thing that we have done is to support an old version of an application with an updated version of the database. This involved writing a compatibility layer that allowed the application to think it was talking to the older version of the database even though it was actually talking to the newer one.)
Automatically Update all Database Developers

It's all very well for people to make changes and update the master, but how do they find out the master has changed? In a traditional continuous integration environment with source code, developers update to the master before doing a commit. That way they can resolve any build issues on their own machine before committing their changes to the shared master. There's no reason you can't do that with the database, but we found a better way.

We automatically update everyone on the project whenever a change is made to the database master. The same refactoring script that updates the master automatically updates everyone's databases. When we've described this, people are usually concerned that automatically updating developers' databases underneath them will cause a problem, but we found it worked just fine.

This only worked when people were connected to the network. If they worked offline, such as on an airplane, then they had to resync with the master manually once they got back to the office.
Clearly separate all database access code

To understand the consequences of database refactorings, it's important to be able to see how the database is used by the application. If SQL is scattered willy-nilly around the code base, this is very hard to do. As a result it's important to have a clear database access layer to show where the database is being used and how. To do this we suggest following one of the data source architectural patterns from P of EAA.

Having a clear database layer has a number of valuable side benefits. It minimizes the areas of the system where developers need SQL knowledge to manipulate the database, which makes life easier for developers who often are not particularly skilled with SQL. For the DBA it provides a clear section of the code that he can look at to see how the database is being used. This helps in preparing indexes, database optimization, and also looking at the SQL to see how it could be reformulated to perform better. This allows the DBA to get a better understanding of how the database is used.
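
To make the idea concrete, here is a small sketch in the spirit of a table gateway, one of the data source patterns in P of EAA, written in Python with sqlite3. The class name, table, and methods are hypothetical; the point is only that every SQL statement for the customer table sits in one place that a DBA can read.

```python
import sqlite3


class CustomerGateway:
    """All SQL that touches the customer table lives in this one class."""

    def __init__(self, conn):
        self._conn = conn

    def insert(self, name):
        """Add a customer and return the generated id."""
        with self._conn:
            cur = self._conn.execute(
                "INSERT INTO customer (name) VALUES (?)", (name,)
            )
        return cur.lastrowid

    def find(self, customer_id):
        """Return the customer as a dict, or None if there is no such row."""
        row = self._conn.execute(
            "SELECT id, name FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        return None if row is None else {"id": row[0], "name": row[1]}

    def rename(self, customer_id, new_name):
        """Change a customer's name."""
        with self._conn:
            self._conn.execute(
                "UPDATE customer SET name = ? WHERE id = ?",
                (new_name, customer_id),
            )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
gateway = CustomerGateway(conn)
sue_id = gateway.insert("Sue")
gateway.rename(sue_id, "Susan")
found = gateway.find(sue_id)
```

If a refactoring renames or splits a column, only the statements inside the gateway need to change; callers keep using insert, find, and rename.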
Variations

Like any set of practices, these should be varied depending on your specific circumstances. These practices are still pretty new, so we haven't come across that many variations, but here are some we have.
Keeping multiple database lineages

A simple project can survive with just a single database master in the repository. With more complex projects there's a need to support multiple varieties of the project database, which we refer to as database lineages. We may create a new lineage if we have to branch an application that's put into production. In essence creating a new database lineage is much the same as branching the source code on the application, with the added twist that you also make a lineage when you need a different set of sample data, such as if you need a lot of data for performance testing.

When a developer takes a copy of a master they need to register which lineage they are modifying. As the DBA applies updates to a master for a particular lineage the updates propagate to all the developers who are registered for that lineage.
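
The registration and propagation described here could be sketched roughly as below, in Python with sqlite3. The registry dictionary, the lineage names, and the in-memory databases standing in for developer workstations are all assumptions made for the example; the article does not describe how the real registry was stored.

```python
import sqlite3

# Hypothetical registry: lineage name -> developer database connections.
registry = {}


def register(lineage, conn):
    """Record that a developer's instance was copied from this lineage."""
    registry.setdefault(lineage, []).append(conn)


def propagate(lineage, script):
    """Apply a refactoring script to every instance registered for a lineage."""
    updated = 0
    for conn in registry.get(lineage, []):
        conn.executescript(script)
        updated += 1
    return updated


def tables(conn):
    """List the tables present in an instance."""
    return [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]


mike, sue, perf_box = (sqlite3.connect(":memory:") for _ in range(3))
register("main", mike)
register("main", sue)
register("perf", perf_box)
count = propagate(
    "main", "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);"
)
```

Only the two instances registered for the "main" lineage receive the change; the performance-testing instance is left alone until a change is applied to its own lineage.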
You don't need a DBA

All of this sounds like it would be a lot of work, but in fact it doesn't require a huge amount of manpower. On the Atlas project we had thirty-odd developers and a team size (including QA, analysts and management) of close to a hundred. On any given day we would have a hundred or so copies of various lineages out on people's workstations. Yet all this activity needed only one full time DBA (Pramod) with a couple of developers doing some part-time assistance and cover.

On smaller projects even that isn't needed. We've been using these techniques on a number of smaller projects (about a dozen people) and we find these projects don't need a full time DBA. Instead we rely on a couple of developers with an interest in DB issues who handle the DBA tasks part-time.

The reason for this is automation. If you are determined to automate every task, you can handle a lot of work with far fewer people.
Tools to Help

Doing this kind of thing requires a lot of repetitive tasks. The good news is that whenever you run into repetitive tasks in software development you are ideally placed to automate them. As a result we've developed a fair number of often simple tools to help us.

One of the most valuable pieces of automation is a simple set of scripts for common database tasks.

* Bring a user up to date with the current master.
* Create a new user
* Copy a database schema. For example, if Sue finds a bug with her database, Mike can copy Sue's database and try to debug the application.
* Move a database, for example from one workstation to another. This is essentially Copy database and Delete database combined as one.
* Drop a user
* Export a user, so team members can make offline backups of the database that they are working with.
* Import a user, so if the team members have a backup copy of the database, they can import the backup and create a new schema.
* Export a baseline - make a backup copy of the master database. This is a specialized case of Export a User
* Create a difference report of any number of schemas, so that Mike can find out what is different structurally between his database and Sue's.
* Diff a schema against the master, so that developers can compare their local copy against the master.
* List all the users
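
As an example of how small such a tool can be, here is a sketch of the difference report in Python with sqlite3. It is not the authors' script: it compares only table names, column names, and declared types, reads them through SQLite's sqlite_master table and PRAGMA table_info, and the function names and report wording are invented.

```python
import sqlite3


def schema_snapshot(conn):
    """Map each table name to a {column name: declared type} dict."""
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    snapshot = {}
    for table in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        snapshot[table] = {c[1]: c[2] for c in cols}
    return snapshot


def diff_schemas(first, second):
    """Return readable lines describing structural differences."""
    a, b = schema_snapshot(first), schema_snapshot(second)
    report = []
    for table in sorted(set(a) | set(b)):
        if table not in b:
            report.append(f"table {table}: only in first")
        elif table not in a:
            report.append(f"table {table}: only in second")
        else:
            for col in sorted(set(a[table]) | set(b[table])):
                if a[table].get(col) != b[table].get(col):
                    report.append(
                        f"{table}.{col}: {a[table].get(col)} vs {b[table].get(col)}"
                    )
    return report


mike = sqlite3.connect(":memory:")
mike.execute("CREATE TABLE customer (id INTEGER, name TEXT)")
sue = sqlite3.connect(":memory:")
sue.execute("CREATE TABLE customer (id INTEGER, name TEXT, email TEXT)")
sue.execute("CREATE TABLE orders (id INTEGER)")
report = diff_schemas(mike, sue)
```

Diffing a local copy against the master is the same call with the master's connection as the second argument.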

Analysts and QA folks often need to look at the test data in the database and to be able to easily change it. For that we created an Excel application with VBA scripts to pull data down from the database into an Excel file, allow people to edit the file, and send the data back up to the database. Although other tools exist for viewing and editing the contents of a database, Excel works well because so many people are familiar with it.

Everybody on the project needs to be able to explore the database design easily, so that they can find out what tables are available and how they are used. We built an HTML-based tool to do this that used servlets to query database metadata. We did the data modeling using ERwin and pulled data from ERwin into our own metadata tables.
Further Steps and Further Information

This is by no means the last word on the subject of evolutionary database design. We certainly want to see if and how we can extend these techniques to integration databases, 24/7 operation, and other problem areas that we haven't run into yet.

If you'd like to find out more about this, or talk about your own experiences, Pramod has started a yahoo egroup for agile databases. Pramod is also starting to talk about these techniques at various conferences, so you may get a chance to talk to him directly. Naturally we also do consulting on this stuff too.
relational databases, woo hoo by ademko · 2003-01-05 06:48 · Score: 5, Insightful

I think it's nice that people are starting to get interested in relational databases again. They really are the backbone of information systems in business, despite what the industry rags will have you believe.

The "hype" of object-oriented and XML-driven "databases", although aesthetically prettier, have adverse effects on performance and design. Programmers get lazy, applications become sloppy and performance goes into the toilet.
1. Re:relational databases, woo hoo by Anonymous Coward · 2003-01-05 06:52 · Score: 1
  
  I am too a professional for over 30 years, although I keep forgetting in what field.
  
  I agree with your observations except for XML-driven "databases", sloppy applications, toilets and relational databases in general.
  
  Other than that I feel you present valid points.
2. Re:relational databases, woo hoo by Randolpho · 2003-01-05 07:28 · Score: 2
  
  Relational databases are nice for certain aspects, but certain types of object-oriented databases can indeed be just as fast as and in many ways faster than relational databases. If you add in the extra flexibility they can grant you over relational databases, they can be superior for certain applications.
  
  "XML-driven" databases suck, however, so I'll give you that. :)
  
  --
  "Times have not become more violent. They have just become more televised."
  -Marilyn Manson
3. Re:relational databases, woo hoo by esme · 2003-01-05 07:57 · Score: 3, Informative
  
  I agree that relational databases are the best solution for most problems -- that's why they're the backbones of most apps these days.
  That said, there are some cases where they fall down. One example that I'm working on right now is organizing a million or so smallish documents. The relational design to store the documents with the same degree of specificity as the XML format they are in is ridiculously complicated. But storing them in an XML database (we're using Xindice, but have looked at Tamino, and a few others) is a lot simpler.
  
  Another downside of relational dbs is that it's generally pretty difficult to change your schema. A lot of XML databases, on the other hand, can be configured to not enforce a schema at all. So if you're working on a problem that requires experimentation on the basic schema, it can be a lot easier to use an XML database (or even just files on disk) instead.
  
  -Esme
4. Re:relational databases, woo hoo by King+of+the+World · 2003-01-05 08:02 · Score: 3, Interesting
  
  Q. Aside from putting it in as a BLOB, How can anyone put a document such as the one he's written into a relational database?
  
  A. It can't be done.
  
  Q. What do you lose by putting a document in as a BLOB?
  
  A. Granularity. The ability to have the database sort and extract parts of files at the tag level. For example, take a site that has an essay spread over 10 pages. Do you store each section as a database record? That's not clean, as what if we want to break that essay up over 5 pages? It seems rather strange to hardcode into the database presentation logic, so webpage = database record is a workable but inelegant model. Do you store the essay as one BLOB and extract substrings? Extracting substrings is certainly not as fast as an XML database, though it is smarter than 'webpage = database record'. And here we have a scenario where an XML database might suit, and would easily outperform a relational database (yes, Tamino or Excelon do outperform relational databases some of the time! ;).
  
  #######
  
  Here's a general rule of thumb, kids. When you read a post bashing another database model, bashing another operating system, or bashing another programming language, just realise that the poster is a jock who refuses to see where the alternative suits, and where the alternative doesn't suit. They're not about creating understanding, they're bashers.. They're not informative.
  
  --
  --Giving to trolls for the benefit of us all
5. Re:relational databases, woo hoo by VP · 2003-01-05 08:17 · Score: 5, Informative
  
  I think many people are confusing Relational databases with SQL database - they are not one and the same. In fact, this site, one of the most vocal proponents of relational databases, states that none of the existing SQL databases is a true relational database. A quote from one of their articles ( "Little Relationship to Relational"):
  
  "Not only do most practitioners think that SQL DBMSs are relational, but they actually blame the problems due to SQL's violations of, or lack of adherence to relational principles on the relational model itself!"
  
  In my opinion, there is no reason that an object-oriented environment cannot implement the relational model, and thus be a true relational database.
6. Re:relational databases, woo hoo by RetroGeek · 2003-01-05 08:41 · Score: 2
  
  Isn't SQL just a language which converses with a database engine? So the underlying structure of the database engine is not relevant?
  
  You can have an engine that actually stores the information in a fixed length field text file, then uses SQL to extract information from that file. Slow? Yup, but SQL can still be used.
  
  Case in point is FOXPRO. You can use either SQL or the xBase language to get at the information. In the same code file.
  
  --
  
  - - - - - - - - - - -
  I am a programmer. I am paid to produce syntax not grammar. Deal with it.
7. Re:relational databases, woo hoo by Anonymous Coward · 2003-01-05 09:19 · Score: 0
  
  So the underlying structure of the database engine is not relevant?
  Generally true, but you'd have to have a table based structure.
8. Re:relational databases, woo hoo by Tablizer · 2003-01-05 09:19 · Score: 0
  
  That said, there are some cases where they fall down. One example that I'm working on right now is organizing a million or so smallish documents. The relational design to store the documents with the same degree of specificity as the XML format they are in is ridiculously complicated. But storing them in an XML database (we're using Xindice, but have looked at Tamino, and a few others) is a lot simpler.
  
  If you give more specifics, I could perhaps help you design a relational setup that is more flexible.
  
  Another downside of relational dbs is that it's generally pretty difficult to change your schema.
  
  That probably greatly depends on the vendor and tools rather than an in-born fault of relational theory.
  
  --
  Table-ized A.I.
9. Re:relational databases, woo hoo by Anonymous Coward · 2003-01-05 09:24 · Score: 0
  
  They're not about creating understanding, they're bashers.. They're not informative.
  
  Yeah, except when they happen to bash in a way slashdot likes, then they're modded +5 Insightful. The ratings on comments in this place are so full of shit, yet you still get people who refuse to read level 0 posts... I shudder to think what horrible comments people have taken as fact simply because some retarded drones on slashdot modded it up.
10. Re:relational databases, woo hoo by sql*kitten · 2003-01-05 10:07 · Score: 3, Informative
  
  Q. What do you lose by putting a document in as a BLOB?
  
  A. Granularity. The ability to have the database sort and extract parts of files at the tag level.
  
  That's not true. If you've the Oracle documentation to hand, read about interMedia (formerly known as ConText). It gives you extensions to SQL to use XPath-like statements to select from within an XML document in a CLOB.
11. Re:relational databases, woo hoo by King+of+the+World · 2003-01-05 10:22 · Score: 1
  
  That's not true. If you've the Oracle documentation to hand, read about interMedia (formerly known as ConText). It gives you extensions to SQL to use XPath-like statements to select from within an XML document in a CLOB.
  I don't have the documentation, but from your description it's able to use XPath expressions because the relational database can emulate an XML database anyway. So the problems with a relational database remain and are solved by providing an XML interface! :)
  
  I don't see how this makes what I said untrue. I am just saying that XML databases have their place. They can be misused, but so can relational databases.
  
  --
  --Giving to trolls for the benefit of us all
12. Re:relational databases, woo hoo by An+Elephant · 2003-01-05 10:39 · Score: 2, Informative
  
  In my opinion, there is no reason that an object-oriented environment cannot implement the relational model, and thus be a true relational database.
  You seem to have read the site, but missed some important points... Especially, their vehement opposition to objects as they are understood in most OO settings, and inheritance as it is understood in C++ and Java.
  In particular, object Id's (=pointers, references) violate the Information Principle, one of the basic tenets of the relational model.
13. Re:relational databases, woo hoo by NineNine · 2003-01-05 10:45 · Score: 1
  
  It means that what you said about losing granularity isn't true. Personally, I didn't like XPath, and I used the other XML engine that came with 8.x, but the effect is the same. You have the same storage, security, redundancy, etc. of the rest of your data, and you still have whatever granularity you'd like.
  
  mmmmmmm...... clobs....
14. Re:relational databases, woo hoo by Anonymous Coward · 2003-01-05 10:52 · Score: 0
  
  I just wish more people would get interested in multidimensional DBs and OLAP - they're much better.
15. Re:relational databases, woo hoo by King+of+the+World · 2003-01-05 11:30 · Score: 1
  
  If a specific relational database decides to provide some XML database features (XPath, XQuery) it does not make my point about relational databases not true.
  
  --
  --Giving to trolls for the benefit of us all
16. Re:relational databases, woo hoo by Pseudonym · 2003-01-05 11:40 · Score: 2
  I agree that relational databases are the best solution for most problems -- that's why they're the backbones of most apps these days.
  More correctly, they're the least worst solution for most problems. The reasons why they're the backbones of most apps these days are far more likely to be some combination of:
  
  It's a legacy system.
  
  It's all the designers know. (Everyone gets taught relational databases/SQL these days, even Visual Basic script kiddies.)
  
  FUD from a big DBMS vendor. Why would you trust your data with anyone but Oracle, after all?
  
  We already have a licence for Oracle/SQL Server/DB2/whatever.
  --
  sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
17. Re:relational databases, woo hoo by Anonymous Coward · 2003-01-05 12:07 · Score: 0
  
  Not the original poster here,
  
  If you give more specifics, I could perhaps help you design a relational setup that is more flexible.
  Ok, I want to store 1 million documents with metadata and I want to extract sections based on heading levels (ie, H1, H2...). Consider that it's all simple HTML markup.
  
  That probably greatly depends on the vendor and tools rather than an in-born fault of relational theory.
  Well, it depends. Relational databases have tables whose records line up in columns. All records have a value in each column. Change a column and you affect all records. XML, however, defines the structure inline, which is inherently more flexible (whether or not this is a good thing depends on your situation - it wouldn't suit a bank, but it might suit a magazine). In brief, relational database records store values; XML database records can store values and keys too.
  
  Hierarchy in a relational database is done using pointers and parentID and, well, you get the idea. Assembling a simple tree is a hard thing for a database (unless it's frequently done, and then cached). In an XML database simple hierarchy can be expressed more easily (though 'symbolic link' type pointers aren't standardised, and are a pain in the butt with XML databases).
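  
  To make the parentID approach concrete, here is a minimal sketch using Python's standard sqlite3 module. The table name node, its columns, and the sample rows are assumptions made up for illustration. Each row points at its parent, and the tree is assembled in application code after a single fetch, which is the extra work described above.

```python
import sqlite3
from collections import defaultdict

# Hypothetical adjacency-list table: each row points at its parent.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT)"
)
conn.executemany(
    "INSERT INTO node (id, parent_id, name) VALUES (?, ?, ?)",
    [
        (1, None, "root"),
        (2, 1, "chapter A"),
        (3, 1, "chapter B"),
        (4, 2, "section A.1"),
    ],
)


def build_tree(connection):
    """Fetch every row once, then link children to parents in memory."""
    children = defaultdict(list)
    names = {}
    for node_id, parent_id, name in connection.execute(
        "SELECT id, parent_id, name FROM node ORDER BY id"
    ):
        names[node_id] = name
        children[parent_id].append(node_id)

    def subtree(node_id):
        return {
            "name": names[node_id],
            "children": [subtree(child) for child in children[node_id]],
        }

    # Rows whose parent_id is NULL (None in Python) are the roots.
    return [subtree(root_id) for root_id in children[None]]


tree = build_tree(conn)
```

  The database only stores the pointers here; the nesting itself is rebuilt by the caller on every read unless the result is cached.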
18. Re:relational databases, woo hoo by dubl-u · 2003-01-05 12:59 · Score: 2
  relational databases [...] really are the backbone of information systems in business [...] object-oriented [...] "databases", although aesthetically prettier, have adverse effects on performance and design. Programmers get lazy, applications become sloppy and performance goes into the toilet.
  
  For procedural code, you are very right. For object-oriented code, though, relational databases are dangerous. Why?
  
  One problem is that rectangular tables don't map particularly well to hierarchies of objects. This results in either distorted OO models or database layouts that look weird to DBAs (and are therefore hard to optimize).
  Another big problem is that they encourage procedural thinking. There is an awful lot of Java code out there that isn't in any sense object oriented; it is so procedural that it might as well be written in COBOL.
  A third is that they encourage leakage between layers. I don't know how much code I've seen where SQL is scattered fucking everywhere, rather than isolated to a persistence tier. Sure, you can say that programmers shouldn't do that, but when faced with a deadline, an awful lot of developers will cough up some SQL hairball rather than fixing the underlying design issue.
  A fourth is that since the schema duplicates information in your code, this acts as a brake on refactoring. If you use Martin Fowler's tricks for keeping the database in sync (or if you use a good O/R mapper) you can survive this, but it's still a pain.
  And a fifth is that once the database exists, no matter how much the original designers warn against it, people start using the database as an integration layer. Suddenly 14 different apps are munging the same data, making it impossible to change the schema, and nearly as hard to track down a bug. The whole point of OO programming is that data should always be wrapped by the code that goes with it.
  But my biggest gripe about them is that for 90% of the OO apps out there, an SQL engine just isn't necessary, and only serves to slow things at development and again at runtime.
  
  That's not to say that SQL databases aren't sometimes useful, just that they aren't a magic bullet, and they lead an awful lot of people astray.
  For those OO developers out there, check out Prevayler. As long as your dataset fits in RAM (and really, how much does RAM cost these days?) you can simplify your code and improve performance by thousands of times.
  
  Even for those who can't use something like Prevayler because the dataset is too large, it's a valuable thought exercise to demonstrate that databases need not be objects of worship.
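  
  As a thought exercise, the general idea behind keeping everything in RAM can be sketched in a few lines of Python. This is not Prevayler's API (Prevayler is a Java library); it is only an illustration of the pattern as I understand it: all state lives in ordinary objects, every change is written to a journal as a command before it is applied, and a restart rebuilds the state by replaying the journal. The class name PrevalentSystem, the deposit command, and the JSON journal format are all hypothetical, and snapshots are left out.

```python
import json
import os
import tempfile


class PrevalentSystem:
    """Keeps all state in RAM; durability comes from replaying a command log."""

    KNOWN_OPS = {"deposit"}

    def __init__(self, journal_path):
        self.journal_path = journal_path
        self.accounts = {}  # the entire "database" is an ordinary dict
        if os.path.exists(journal_path):
            with open(journal_path) as journal:
                for line in journal:
                    self._apply(json.loads(line))

    def _apply(self, command):
        # Commands must be deterministic so that replay rebuilds the same state.
        name = command["account"]
        self.accounts[name] = self.accounts.get(name, 0) + command["amount"]

    def execute(self, command):
        # Reject unknown commands before they can reach the journal.
        if command["op"] not in self.KNOWN_OPS:
            raise ValueError("unknown command: %r" % command["op"])
        # Write the command to disk first, then change the in-memory state.
        with open(self.journal_path, "a") as journal:
            journal.write(json.dumps(command) + "\n")
            journal.flush()
            os.fsync(journal.fileno())
        self._apply(command)


journal_file = os.path.join(tempfile.mkdtemp(), "journal.log")
system = PrevalentSystem(journal_file)
system.execute({"op": "deposit", "account": "alice", "amount": 50})
system.execute({"op": "deposit", "account": "alice", "amount": 25})

# Simulate a restart: a fresh object rebuilds its state from the journal alone.
restarted = PrevalentSystem(journal_file)
```

  Reads never touch the disk in this arrangement, which is where the speed comes from; the cost is that the journal grows without bound until something like a snapshot is added.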
19. Re:relational databases, woo hoo by Tablizer · 2003-01-05 16:17 · Score: 0, Redundant
  
  Ok, I want to store 1 million documents with metadata and I want to extract sections based on heading levels (ie, H1, H2...). Consider that it's all simple HTML markup.
  
  What do you mean "based on heading levels"? The query gives heading levels or the results? What is an example (English) query?
  
  Hierarchy in a relational database is done using pointers and parentID and, well, you get the idea. Assembling a simple tree is a hard thing for a database
  
  I will agree that if *all* your queries are tree-based, then (existing) RDBMS will probably be slower. However, if queries come in all kinds of shapes and sizes, trees being just one, then things are not so clear cut. RDBMS are decathlon athletes, not necessarily event athletes. I often wish I could do non-tree queries on my files. Trees are easy for users to grok (up front), but often don't reflect the true shape of the world when they grow non-trivial in size, in my observation.
  
  --
  Table-ized A.I.
20. Re:relational databases, woo hoo by Anonymous Coward · 2003-01-05 16:34 · Score: 0
  
  <h1>This is my first level heading</h1>
  <p>Some content here</p>
  <h2>This is my second level heading</h2>
  <p>Some content here</p>
  <p>Some content here</p>
  <h2>This is my second level heading</h2>
  <p>Some content here</p>
  <h1>This is my first level heading</h1>
  <p>Some content here</p>
  
  I have surveyors and website users who want access to land documentation. The surveyors want to be able to get at it from their cellphone/PDA so I would like to break up pages over H2; the website users can just get pages broken up over H1. Normally the content of an H2 section would be about a screen's worth.
  
  I really don't think a relational database can do this elegantly. It's lucky that I don't place much value in elegance :) I am planning to use Xindiche (or however you spell it).
21. Re:relational databases, woo hoo by Tablizer · 2003-01-05 18:12 · Score: 1
  
  You could parse the documents (on H-tags, down to at least H2) and put them into this structure, perhaps with other set-based category tables (not shown):
  
  Table: Chunks
  -----------------
  DocID // FK to Docs table
  Sequence
  HeadingTag
  Heading
  ChunkContent
  
  Table: Docs
  ---------------
  DocID // auto-number
  DocTitle
  Location
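  
  A rough sketch of that two-table layout, using Python's standard sqlite3 module. The column types, the composite key on Chunks, the sample rows, and the pages helper are assumptions added for illustration; only the table and column names come from the layout above. The helper shows how the same stored chunks could be grouped into pages that break on H1 only (website) or on H1 and H2 (phone/PDA).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Docs (
    DocID    INTEGER PRIMARY KEY AUTOINCREMENT,
    DocTitle TEXT,
    Location TEXT
);
CREATE TABLE Chunks (
    DocID        INTEGER REFERENCES Docs(DocID),
    Sequence     INTEGER,
    HeadingTag   TEXT,      -- e.g. 'H1' or 'H2'
    Heading      TEXT,
    ChunkContent TEXT,
    PRIMARY KEY (DocID, Sequence)
);
""")

doc_id = conn.execute(
    "INSERT INTO Docs (DocTitle, Location) VALUES (?, ?)",
    ("Land survey notes", "/docs/survey.html"),
).lastrowid
conn.executemany(
    "INSERT INTO Chunks VALUES (?, ?, ?, ?, ?)",
    [
        (doc_id, 1, "H1", "Overview", "<p>Some content here</p>"),
        (doc_id, 2, "H2", "Boundaries", "<p>Some content here</p>"),
        (doc_id, 3, "H2", "Access", "<p>Some content here</p>"),
        (doc_id, 4, "H1", "Appendix", "<p>Some content here</p>"),
    ],
)


def pages(connection, document_id, split_on):
    """Group a document's chunks into pages, starting a new page at each
    heading whose tag is in split_on."""
    result = []
    rows = connection.execute(
        "SELECT HeadingTag, Heading FROM Chunks WHERE DocID = ? ORDER BY Sequence",
        (document_id,),
    )
    for tag, heading in rows:
        if tag in split_on or not result:
            result.append([])
        result[-1].append(heading)
    return result


web_pages = pages(conn, doc_id, {"H1"})          # website: break on H1 only
phone_pages = pages(conn, doc_id, {"H1", "H2"})  # phone/PDA: break on H1 and H2
```

  Because the heading level is stored as data in HeadingTag, the choice of where to break pages is made at query time, not fixed in the table design.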
  
  --
  Table-ized A.I.
22. Re:relational databases, woo hoo by Anonymous Coward · 2003-01-05 19:16 · Score: 0
  
  And then I've hardcoded down to H2 in the database, so in the future if I want down to H3 then I need to change the database. So effectively I've hardcoded presentation logic in the database, or I'll put each possible segment as individual records.
  
  It's not a clean fit into a relational database, certainly :)
  
  This is the scenario where XML DB's work well. It may be the only one, but documentation and text suit it.
23. Re:relational databases, woo hoo by Malcontent · 2003-01-05 19:48 · Score: 2
  
  I have read many of the articles on dbdebunk and it's obvious that the authors are brilliant people. What puzzles me though is why they have never asked nor answered the question "why is it that despite decades of research and development, millions of dollars spent by IBM, MS, Informix, Oracle and others no database can be called truly relational"?
  
  Maybe it's just not possible to build a truly relational database or perhaps building one has severe consequences that people are not willing to accept.
  
  I think it's a question worth asking and answering, don't you think?
  
  --
  War is necrophilia.
24. Re:relational databases, woo hoo by leandrod · 2003-01-05 20:49 · Score: 2
  
  > So the underlying structure of the database engine is not relevant?
  
  Not quite. SQL adds many arbitrary restrictions to the relational model, and thus fails to implement any significant data independence capabilities. So you are quite limited on how to store SQL data, and on changing the physical schema without changing the logical and user ones.
  
  > You can use either SQL or the xBase language to get at the information. In the same code file.
  
  That is actually a bad example. Not only does SQL itself prevent real data independence, but if one mixes a relational language (which SQL is not) with any non-relational language to access a particular database, one loses the relational data independence, because the non-relational access will make assumptions about the database's physical schema and access plans.
  
  Perhaps one would be able to present a stable user schema to the non-relational parts of the application, but even then performance would suffer mightily because of access path assumptions.
  
  --
  Leandro Guimarães Faria Corcete DUTRA
  DA, DBA, SysAdmin, Data Modeller
  GNU Project, Debian GNU/Lin
25. Re:relational databases, woo hoo by leandrod · 2003-01-05 20:56 · Score: 2
  
  > Maybe it's just not possible to build a truly relational database or perhaps building one has severe consequences that people are not willing to accept.
  
  You are assuming they have no historical knowledge about how the market came to be like it is now.
  
  In fact Hugh Darwen is a current, and EF Codd and Chris J Date are former, employee(s) of IBM, involved in the creation (Codd), refinement (Codd, Date, McGoveran & Darwen) and publicising (Date, Darwen & Pascal) of the relational model and engines.
  
  There are at least three implementations of a relational system: QUEL (as in old Ingres), BS12 (an IBM retired product by Darwen, that IBM didn't push because internal politics favoured SQL as a part of the failed F/S project) and Dataphor Alphora (a .Net translation layer over SQL).
  
  The reason almost everyone uses SQL is herd instincts: SQL is popular, is backed by the database bullies IBM, Oracle (sorta), MS (sorta) & Sybase, and everyone will stick to that no matter what just because everyone else does. Kinda like MS-W32 vs POSIX, or procedural vs functional, or imperative vs declarative.
  
  --
  Leandro Guimarães Faria Corcete DUTRA
  DA, DBA, SysAdmin, Data Modeller
  GNU Project, Debian GNU/Lin
26. Re:relational databases, woo hoo by leandrod · 2003-01-05 22:14 · Score: 2
  
  > relational databases are the best solution for most problems -- that's why they're the backbones of most apps these days.
  
  SQL not being relational, the only apps using relational databases nowadays are the ones based on either QUEL (are there any yet?), IBM BS12 (probably none?) and Dataphor Alphora (those by SoftWise, and some few inhouse ones up to now).
  
  > That said, there are some cases where they fall down.
  
  All the cases you mention are specific failures of SQL, not of the relational model.
  
  --
  Leandro Guimarães Faria Corcete DUTRA
  DA, DBA, SysAdmin, Data Modeller
  GNU Project, Debian GNU/Lin
27. Re:relational databases, woo hoo by esme · 2003-01-06 01:37 · Score: 2
  
  SQL not being relational, the only apps using relational databases nowadays are the ones based on either QUEL (are there any yet?), IBM BS12 (probably none?) and Dataphor Alphora (those by SoftWise, and some few inhouse ones up to now).
  
  This is the kind of argument that gives IT folks a bad rep. Everybody knows that the popular database products that use SQL or some subset of it are commonly referred to as relational databases, regardless of whether they perfectly implement the relational model. The fact is that you can take example queries from Date's Introduction to Database Systems and run them with almost no change on any number of SQL databases. These systems store data in basically the way described and follow most of the guidelines for being relational. Of course, they also let you do all kinds of other wacky stuff, too. But that doesn't alter the fact that the features that are common across database products are the core of a relational system.
  
  All the cases you mention are specific failures of SQL, not of the relational model.
  
  Actually, no, breaking up documents into many discrete units is a requirement of the normalization required by relational theory. It is this fragmentation which makes relational databases awkward for storing documents where retrieving the entire document is usually the desired operation.
  
  Now, if you want to say that these are implementation problems, then I'd agree with you. A relational database could assemble a schema on the fly (just like XML databases parse their contents on the fly when processing XPath or XQuery queries). But they don't, and it's a well-known property of relational databases that they don't.
  
  -Esme
28. Re:relational databases, woo hoo by leandrod · 2003-01-06 03:47 · Score: 2
  
  > relational databases are the best solution for most problems -- that's why they're the backbones of most apps these days.
  
  They aren't. SQL is not relational.
  
  > Everyone gets taught relational databases/SQL these days
  
  No one does. Even universities tend to teach SQL instead of the relational model.
  
  > Why would you trust your data with anyone but Oracle
  
  You can use Oracle as a back-end for Dataphor Alphora, and thus have a truly relational application without endangering your data.
  
  --
  Leandro Guimarães Faria Corcete DUTRA
  DA, DBA, SysAdmin, Data Modeller
  GNU Project, Debian GNU/Lin
29. Re:relational databases, woo hoo by leandrod · 2003-01-06 03:53 · Score: 2
  
  > popular database products that use SQL or some subset of it are commonly referred to as relational databases
  
  So we should call relational databases something else, like X, and leave the relational name for SQL?
  
  Now, what happens when someone comes calling his product X even if it is not conforming to the relational model Y?
  
  Perhaps we should not allow vendors to implement their Newspeak in the first place.
  
  > you can take example queries from Date's Introduction to Database Systems and run them with almost no change on any number of SQL databases
  
  These examples were never intended to be a compliance test, and many are in SQL anyway for convenience.
  
  > breaking up documents into many discrete units is a requirement of the normalization required by relational theory.
  
  Not if document, or a reasonable subdivision of it (section?) is a data type.
  
  --
  Leandro Guimarães Faria Corcete DUTRA
  DA, DBA, SysAdmin, Data Modeller
  GNU Project, Debian GNU/Lin
30. Re:relational databases, woo hoo by leandrod · 2003-01-06 04:05 · Score: 2
  
  > I really don't think a relational database can do this elegantly.
  
  Simple enough:
  
  VAR doc REAL RELATION { element element_id, tag tag_id, content text, containing_el element_id, previous_el element_id } ;
  
  Any relational system should be able to do a recursive query. The previous_el domain needs a special value meaning "none" for the first element in a list.
  
  Just one of many possible designs. Another possible one would make XML documents a domain.
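  
  For readers without access to a system that accepts the declaration above, here is a rough equivalent that swaps in plain SQL (SQLite's WITH RECURSIVE, driven from Python's standard sqlite3 module). It is a sketch under assumptions: integer ids stand in for the element_id domain, 0 is used as the special "none" value, and the sample rows are invented. The first recursive step numbers siblings by following previous_el; the second walks containing_el and builds a sortable path, so the document comes back in reading order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE doc ("
    " element INTEGER PRIMARY KEY,"
    " tag TEXT NOT NULL,"
    " content TEXT NOT NULL,"
    " containing_el INTEGER NOT NULL,"   # 0 means "no container" (the root)
    " previous_el INTEGER NOT NULL)"     # 0 means "none": first in its list
)
# Rows are inserted out of order on purpose; order comes from the query.
conn.executemany(
    "INSERT INTO doc VALUES (?, ?, ?, ?, ?)",
    [
        (6, "p", "Fence line runs north.", 4, 5),
        (1, "doc", "", 0, 0),
        (4, "section", "", 1, 3),
        (2, "h1", "Survey", 1, 0),
        (5, "h2", "Boundaries", 4, 0),
        (3, "p", "Intro text.", 1, 2),
    ],
)

QUERY = """
WITH RECURSIVE
  pos(element, n) AS (
    -- Position of each element among its siblings, via previous_el.
    SELECT element, 1 FROM doc WHERE previous_el = 0
    UNION ALL
    SELECT d.element, pos.n + 1
    FROM doc AS d JOIN pos ON d.previous_el = pos.element
  ),
  walk(element, path) AS (
    -- Descend from the root, extending a zero-padded path at each level.
    SELECT d.element, printf('%04d', p.n)
    FROM doc AS d JOIN pos AS p ON p.element = d.element
    WHERE d.containing_el = 0
    UNION ALL
    SELECT d.element, walk.path || '.' || printf('%04d', p.n)
    FROM doc AS d
    JOIN walk ON d.containing_el = walk.element
    JOIN pos AS p ON p.element = d.element
  )
SELECT d.tag, d.content
FROM walk JOIN doc AS d ON d.element = walk.element
ORDER BY walk.path
"""

in_order = conn.execute(QUERY).fetchall()
```

  The zero-padded path limits each sibling list to 9999 entries in this sketch; a real design would pick the ordering scheme to suit the data.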
  
  --
  Leandro Guimarães Faria Corcete DUTRA
  DA, DBA, SysAdmin, Data Modeller
  GNU Project, Debian GNU/Lin
31. Re:relational databases, woo hoo by leandrod · 2003-01-06 04:11 · Score: 2
  
  > How can anyone put a document such as the one he's written into a relational database?
  
  You can either decompose the document in an ordered tree (even SQL, being subrelational, can make recursive queries to put the document together), or make XML document a supported data type.
  
  Obviously you do not know the relational model well enough to understand its capabilities and how different it is from what you know about SQL, or SQL itself.
  
  --
  Leandro Guimarães Faria Corcete DUTRA
  DA, DBA, SysAdmin, Data Modeller
  GNU Project, Debian GNU/Lin
32. Re:relational databases, woo hoo by leandrod · 2003-01-06 04:15 · Score: 2
  
  > object-oriented databases can indeed be just as fast as and in many ways faster than relational databases
  
  Only for a specific, hand-tuned application, and if the database never changes. For multiple applications, automatic optimising and changing databases, you need a relational system.
  
  But obviously when you said relational you were thinking SQL, not relational.
  
  --
  Leandro Guimarães Faria Corcete DUTRA
  DA, DBA, SysAdmin, Data Modeller
  GNU Project, Debian GNU/Lin
33. Re:relational databases, woo hoo by esme · 2003-01-06 04:28 · Score: 2
  
  > popular database products that use SQL or some subset of it are commonly referred to as relational databases
  
  So we should call relational databases something else, like X, and leave the relational name for SQL?
  Now, what happens when someone comes calling his product X even if it is not conforming to the relational model Y?
  Perhaps we should not allow vendors to implement their Newspeak in the first place.
  
  Whether you agree or like it, the term relational database is used by virtually everyone to mean databases that use SQL. That's the way language works: what people mean when they say something is what the word means, period.
  
  These databases are implementations of the relational model. Not perfect ones, not complete ones. They also have a lot of other junk thrown in to address other needs. If you want to talk about a database that does a better job of implementing the relational model, call it a "True Relational" or "Pure Relational" or something like that. Or just call it a RDBMS and say that Oracle has a crappy impl.
  
  Software vendors are a big source of terminology getting blurred, but I don't think that's what happened here. I think a couple of vendors implemented the relational model, and people started calling them relational databases. Over time the databases added new stuff, and proprietary extensions, and stopped being "just" relational databases. The recent fad of adding XML translation to RDBMS is just the most recent example of this.
  
  >breaking up documents into many discrete units is a requirement of the normalization required by relational theory.
  
  Not if document, or a reasonable subdivision of it (section?) is a data type.
  
  In my particular application, the data is mostly semi-structured content (names, dates, titles of paintings, names of countries, etc.) that isn't formatted consistently (b/c it comes from several different source institutions, some of which have local heterogeneity as well). There are about 15 top-level categories, with most categories having several all-optional, all-repeatable subcategories, with a different type of data for each sub-category. So a traditional relational design would quickly wind up with dozens of tables and some pretty nasty joins to get the whole doc back.
  
  My solution was to use one system for storage (a native XML database - Xindice) and a different system for querying (a fulltext search engine - Lucene). Most of the XML-enabled RDBMS could do basically the same thing, but it seemed like a bad fit to use a RDBMS with a totally non-relational approach.
  
  -Esme
34. Re:relational databases, woo hoo by Malcontent · 2003-01-06 04:42 · Score: 2
  
  You seem to be saying that the nonexistence of purely relational databases is for political reasons only. I guess that's what I find so hard to believe. If purely relational systems could be built and if they offered clear advantages over SQL and other non-relational databases, one would think they would have gained some foothold, no matter how small. I see people selling XML databases, SQL databases, OO databases, hybrid databases, and all kinds of weird and funky stuff, but nobody sells relational databases. Not even to a really small niche market. If Sleepycat can make money selling db then somebody ought to be able to make money selling relational databases, don't you think?
  
  --
  War is necrophilia.
35. Re:relational databases, woo hoo by rycamor · 2003-01-06 04:46 · Score: 2
  
  /leandro dives into the RDBMS fray again ;-)
  
  >>Actually, no, breaking up documents into many discrete units is a requirement of the normalization required by relational theory.
  
  Actually, as Date himself would say, that all just depends on what the logical requirements are. Breaking up data is for logical reasons, to remove redundancy. But, if your database design is to handle documents as discrete entities, then why not?
  
  Date himself argues for the existence of a native XML datatype. This would mean you could have an XML document stored in a single column. The big difference is that this datatype would not be just a blob. It would require a valid XML document, and it would have operators that allow you to interact with the DOM. It would also mean you can change your schema, using a combination of views, for example, to present your data any other way desired.
  
  The whole point that the (serious) relational guys keep making is that there's no need to throw away the relational model in order to get certain additional advantages. Literally any sort of logical operation on data should be able to exist inside the relational model. And, even more importantly, physical storage can be implemented any way desired for performance. The only difference is that the relational model requires complete logical control over data, which no other model truly handles.
  
  You don't have to throw away one to get the other.
36. Re:relational databases, woo hoo by Tablizer · 2003-01-06 05:31 · Score: 1
  
  And then I've hardcoded down to H2 in the database, so in the future if I want down to H3 then I need to change the database.
  
  Fine, divide clear down to Hn. You may want to have a separate table for headers and content then, I would note.
  
  I agree that a commercial (or dedicated) text indexer would probably be the way to go in this case, because users will probably want full-text search also (not just headers). I just wonder how well such can integrate with a RDBMS. Oracle has such a product, but it is expensive. Document indexing and searching (words) is not an area where RDBMS (without addons) have scored well in the past. There are special structures and techniques designed just for document/text indexing. CADD is another area where RDBMS have been slow also, I would note.
  
  The problem is when you need kind of a hybrid approach. A RDBMS at that point may be the best choice.
  
  --
  Table-ized A.I.
37. Re:relational databases, woo hoo by Anonymous Coward · 2003-01-06 05:34 · Score: 0
  
  Redundant compared to what? Fucken moderators!
38. Re:relational databases, woo hoo by leandrod · 2003-01-06 06:23 · Score: 2
  
  > That's the way language works: what people mean when they say something is what the word means
  
  Even when clearly it is vendor Newspeak? Scary your definition of language.
  
  > These databases are implementations of the relational model.
  
  They aren't, because they don't comply with the basic principles of the relational model. If you disagree, either you don't know the relational model principles, or you don't know SQL, or you're dishonest.
  
  > If you want to talk about a database that does a better job of implementing the relational model, call it a "True Relational" or "Pure Relational" or something like that.
  
  I understand your point, but you forget that then I would have to explain what's "True Relational" and why SQL isn't it. So there is more trouble, not less.
  
  > I don't think that's what happened here. I think a couple of vendors implemented the relational model, and people started calling them relational databases.
  
  You should not "think" if you do not know History. What happened was that SQL was created against Codd's orientation, so it was never relational in the first place. Codd even quit IBM because of that.
  
  > Most of the XML-enabled RDBMS could do basically the same thing, but it seemed like a bad fit to use a RDBMS with a totally non-relational approach.
  
  Again, a relational system (like Dataphor) makes it simple to either decompose the tags and contents in a recursive structure, or to define suitable domains. There is nothing non-relational to that. The problem again is SQL not being a RDBMS.
  
  --
  Leandro Guimarães Faria Corcete DUTRA
  DA, DBA, SysAdmin, Data Modeller
  GNU Project, Debian GNU/Lin
39. Re:relational databases, woo hoo by leandrod · 2003-01-06 06:30 · Score: 2
  
  > You seem to be saying that the nonexistence of purely relational databases is for political reasons only.
  
  I actually know of three relational systems: Ingres QUEL (obsolete), IBM BS12 (not available), and Dataphor Alphora.
  
  > I guess that's what I find so hard to believe.
  
  Just look at Intel vs RISC, or MS-WXP vs Unix, or C vs Lisp.
  
  > If sleepycat can make money selling db then somebody ought to be able to make money selling relational databases don't you think?
  
  Sleepycat does a library, not a full DBMS.
  
  --
  Leandro Guimarães Faria Corcete DUTRA
  DA, DBA, SysAdmin, Data Modeller
  GNU Project, Debian GNU/Lin
40. Re:relational databases, woo hoo by leandrod · 2003-01-06 06:45 · Score: 2
  
  > One problem is that rectangular tables don't map particularly well to hierarchies of objects.
  
  Not true. Relational does do hierarchies well. The problem here is SQL.
  
  > they encourage procedural thinking.
  
  OO is also procedural. Relational OTOH enables one to think in sets and declarations, thus saving much more programming than OO could.
  
  > they encourage leakage between layers.
  
  This is a fault of SQL, not of the relational model.
  
  > schema duplicates information in your code
  
  ?!?
  
  > 14 different apps are munging the same data, making it impossible to change the schema
  
  On the contrary, the relational model provides data independence that both SQL and OO deny.
  
  > for 90% of the OO apps out there, an SQL engine just isn't necessary, and only serves to slow things at development and again at runtime.
  
  Then go read The Third Manifesto to learn how much better the relational model is than SQL.
  
  --
  Leandro Guimarães Faria Corcete DUTRA
  DA, DBA, SysAdmin, Data Modeller
  GNU Project, Debian GNU/Lin
41. Re:relational databases, woo hoo by dubl-u · 2003-01-06 07:11 · Score: 2
  
  Then go read The Third Manifesto to learn how much better the relational model is than SQL.
  
  Ah, yes, When I said "relational", I meant the thing that 99% of people think of as a relational database. Sorry for the confusion.
  
  Are there existing products that you feel provide the full power of the relational model? I've never had the opportunity to use a relational DB other than via SQL.
42. Re:relational databases, woo hoo by RetroGeek · 2003-01-06 07:54 · Score: 2
  
  So you are quite limited on how to store SQL data, and on changing the physical schema without changing the logical and user ones.
  
  Well yes, but to be able to change the relationship of the data (ie: table dependencies, column names, keys) WITHOUT the application even knowing that the change had been made means that you need to add an abstraction layer between the application and the data. This already exists in multi-tier models.
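  
  One small-scale version of such a layer is a database view that keeps the old names alive after a rename. The sketch below uses Python's standard sqlite3 module; the table customer_v2, the view customer, and the customer_name function are hypothetical names chosen for illustration. Note that SQLite views are read-only, so this only covers the reading side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The schema after the change: the table and a column have been renamed.
CREATE TABLE customer_v2 (id INTEGER PRIMARY KEY, full_name TEXT);
INSERT INTO customer_v2 (id, full_name) VALUES (1, 'Ada Lovelace');

-- The view preserves the old table and column names for existing callers.
CREATE VIEW customer AS
    SELECT id, full_name AS name FROM customer_v2;
""")


def customer_name(connection, customer_id):
    """Application code written against the old schema; it never sees the rename."""
    row = connection.execute(
        "SELECT name FROM customer WHERE id = ?", (customer_id,)
    ).fetchone()
    return row[0] if row else None
```

  The application function is untouched by the rename; only the view definition had to know about both the old and the new names.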
  
  But you still need to be able to converse with the data base engine at some level. So your data access layer must have an understanding of the data storage schema. Whether you are using an RDBMS, flat file system, or OO system, somewhere along the line the data in the file becomes data in RAM, and becomes organized in such a way that the application can use it.
  
  Using SQL means that you are using a standards-based language to converse with the data base engine. Because computers are stupid, care must be taken to ensure that data integrity is not lost. Whether this happens at the application level, data level, or through some RDBMS engine is only relevant to whoever is responsible for the integrity.
  
  It would be nice to be able to specify ALL of this in the data base engine, but it still needs to be retrieved. And the retrieval system must have an understanding of what it is retrieving so it can package it for the application (and vice versa).
  
  That is actually a bad example. Not only SQL itself prevents real data independence, if one mixed a relational language (that SQL is not) to any non-relational language to access a particular database, one loses the relational data independence because the non-relational access will have assumptions on the database physical schema and access plans.
  
  I was pointing out that SQL is only one way to get at the data. Yes, extreme care must be taken to ensure that data dependencies are not violated (and probably in more places). But then a DBA must do the same when they set up dependencies in the data schema. The main difference is the level at which data dependency is conserved. I agree that it is MUCH better to set up the dependency checking lower down. All higher level code is then forced to comply with it.
  
  I jumped into this discussion because the real problem is the way that we store information, not the particular way we get at it. Columns, rows (collections of columns), tables (collections of rows and columns) are all terribly limiting. You must carefully partition the information, try to figure out the owner of each piece of information, then apply strict rules so someone does not break your view of the information. This is known as a data schema.
  
  Then someone comes along and tells you that you forgot something. Now your schema is wrong, and you must change it, and the application which acts on the information must be changed. Oops!
  
  Until we can store, retrieve, and process information in a homogeneous fashion we will have these problems.
  
  --
  
  - - - - - - - - - - -
  I am a programmer. I am paid to produce syntax not grammar. Deal with it.
43. Re:relational databases, woo hoo by leandrod · 2003-01-06 08:34 · Score: 2
  
  > I meant the thing that 99% of people think of as a relational database.
  
  Unfortunately, that is vendor Newspeak, not reality.
  
  > Are there existing products that you feel provide the full power of the relational model?
  
  Alphora Dataphor does not provide the full power, but at least it does not violate any principles. Thus it offers, say, something like 50% of the relational model instead of the 20% SQL does. Obviously the numbers are just feelings, but the reality behind these feelings is solid.
  
  Even if it is a proprietary and expensive product, you still can try a fully functioning demo of Dataphor.
  
  --
  Leandro Guimarães Faria Corcete DUTRA
  DA, DBA, SysAdmin, Data Modeller
  GNU Project, Debian GNU/Lin
44. Re:relational databases, woo hoo by esme · 2003-01-06 08:58 · Score: 2
  
  First, about language. If a vendor comes out with some terminology that's completely stupid, they'll usually get mocked or ignored, and they'll drop it. On the other hand, if they come out with something that is a mutation of what has come before, people might adopt it. Language changes like this all the time, and there's nothing anyone can do about it. The fact that most people go along with the terminology is a strong sign that they agree with it.
  
  For the theory, my understanding is that the heart of relational theory is organizing data as n-tuples in a system that abstracts data access away from the details of data storage. I'm not dishonest, I do know SQL, and I'm quite sure that relational databases do this. Even if Codd doesn't like the implementation (or consider it to be faithful). Personally, I'm not terribly interested in what Codd thinks of an implementation, since the people who originate theories are notoriously inclined to be poor judges of their implementation, extension and interpretation.
  
  You should not "think" if you do not know History.
  
  And you shouldn't be pedantic if you honestly want to have a discussion.
  
  -Esme
45. Re:relational databases, woo hoo by Anonymous Coward · 2003-01-06 09:21 · Score: 0
  
  Assembling disparate nodes is surely less efficient.
46. Re:relational databases, woo hoo by Malcontent · 2003-01-06 10:43 · Score: 2
  
  "I actually know of three relational systems: Ingres QUEL (obsolete), IBM BS12 (not available), and Dataphor Alphora."
  
  I know. You said that. Two of those are no longer around and one is brand spanking new (and according to their web site it's an "Automated Application Development foundation," whatever that means).
  
  "Just look at Intel vs RISC, or MS-WXP vs Unix, or C vs Lisp."
  
  Again that proves my point. Although RISC, Unix, and Lisp are not popular they do exist in niche areas. I guess it depends on how you look at it but RISC systems are very prevalent in high end workstations and embedded systems, unix is gaining popularity every day (especially if you add linux, freebsd and MacOSX) and lisp is still being used and actively developed. That brings up my question again. If Lisp, unix, and RISC have managed to survive so far why hasn't the "truly relational database"?
  
  "Sleepycat does a library, not a full DBMS."
  
  Yea OK whatever.
  
  --
  War is necrophilia.
47. Re:relational databases, woo hoo by Tony-A · 2003-01-06 20:03 · Score: 2
  
  Ok, I'll bite, seeing as there seem to be no real answers.
  Short answer is performance. Not the small differences shown in benchmarks but differences of several orders of magnitude.
  
  Long answer probably involves representation of data and the nature of theoretical versus real-world. Theoretical, regardless of how complicated, is always a vast oversimplification of the real-world.
  Take something simple like a date. Everybody knows what a date is, right?
  You have a table of people including dates of birth and death.
  Guesstimated dates and partial information cause trouble.
  Bad example, but real data is not as clean and exact as one would like. Putting that data into a system that demands that everything be clean and exact doesn't really work.
  
  You're hungry. You need food. Do you starve because not everything is done to perfection?
48. Re:relational databases, woo hoo by leandrod · 2003-01-06 20:35 · Score: 2
  
  > Two of those are no longer around
  
  Yet the fact that they exist at all shows that the issue is not technological but political, as in market herd instincts.
  
  > it's a "Automated Application Development foundation." whatever that means.
  
  That means that by declaring relational integrity constraints one can implement all business rules without procedural coding. Also that the interface is stored in the database.
  
  But the reason why they are not promoting it as a RDBMS yet is that they are still working on an integrated storage engine. For now their product works as an interface to SQL DBMSs, so they don't get the full performance benefits of which the relational model is capable.
  
  > If Lisp, unix, and RISC have managed to survive so far why hasn't the "truly relational database".
  
  Because all these survived in highly technical environments, or at least under the sponsorship of technically-oriented SysAdmins. OTOH DBMSs are more commercially oriented, so they fell under the spell of IBM.
  
  > Yea OK whatever.
  
  I am sure you realise that a data access library is just a fraction of the effort to make a RDBMS. Alphora for one all but dismissed Sleepycat as a storage engine; what they need to make Dataphor a full, integrated RDBMS is more in the line of InnoDB than Sleepycat.
  
  --
  Leandro Guimarães Faria Corcete DUTRA
  DA, DBA, SysAdmin, Data Modeller
  GNU Project, Debian GNU/Lin
49. Re:relational databases, woo hoo by leandrod · 2003-01-06 21:05 · Score: 2
  
  > if they come out with something that is a mutation of what has come before, people might adopt it.
  
  So they can implement Newspeak as they want, provided they do it gradually. Remember telling the Party that before 1.984.
  
  > The fact that most people go along with the terminology is a strong sign that they agree with it.
  
  Yet this might be confusing and self-defeating, as it is in this case.
  
  > I do know SQL, and I'm quite sure that relational databases do this.
  
  Relational databases are supposed to, but SQL doesn't: it has no proper distinction of physical, logical and user schemas, and updateability of derived relations is arbitrarily limited. Also by way of pointers, row ids (Oracle), OIDs (PostgreSQL), undifferentiated NULLs and the like, it violates the Information Principle, and thus its rows aren't n-tuples and its tables aren't relations. You can work around some of these limitations, but not all.
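  To make the "undifferentiated NULLs" complaint concrete, here is a small sketch, assuming SQLite through Python's sqlite3 module and a made-up person table. One NULL marker ends up standing for two different facts (no death date because the person is alive, and a death date that simply was not recorded), and the query cannot tell them apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, born TEXT, died TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [
        ("alive", "1970-01-01", None),         # no death date: still living
        ("unrecorded", "1850-01-01", None),    # certainly dead, date unknown
        ("recorded", "1900-01-01", "1980-01-01"),
    ],
)

# Both kinds of "missing" come back from the same predicate.
no_death_date = [
    row[0]
    for row in conn.execute("SELECT name FROM person WHERE died IS NULL ORDER BY name")
]

# Comparing NULL with NULL yields NULL (None in Python), not true.
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

print(no_death_date, null_equals_null)
```

  Telling the two cases apart takes an extra column or table, which is the sort of workaround the comment alludes to.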
  
  > you shouldn't be pedantic if you honestly want to have a discussion.
  
  Let me be clear and honest: I want to inform you, because you are misinformed. Now if you try to substitute imagination for information, as you did, thus eliciting my pedantic answer, there is no point in trying to discuss anything.
  
  --
  Leandro Guimarães Faria Corcete DUTRA
  DA, DBA, SysAdmin, Data Modeller
  GNU Project, Debian GNU/Lin
50. Re:relational databases, woo hoo by esme · 2003-01-07 02:00 · Score: 2
  
  Melodrama aside -- language is constantly evolving, and draws new words and new uses from many different places. Despite its effective use in dystopic fiction, totalitarian regimes are generally not any more effective at changing language than other sources. Add to that the fact that the idea that our thoughts are controlled by our vocabulary has been utterly discredited. One look at the parade of newly-coined euphemisms for being disabled or black should be enough to convince you of the futility of trying to control people's thoughts by controlling their vocabulary.
  
  I think it's presumptuous of you to assume that I'm misinformed. We disagree. I see a flawed implementation of relational theory (like any instantiation of an Aristotelian ideal is bound to be). You clearly see something different.
  
  But my original point stands: everybody uses the term relational database the way I do, and it's disingenuous of you to insist otherwise. There hasn't been a massive conspiracy to control people's minds, there has been a natural evolution of language. Loudly denouncing people for using terms the way they are consistently used in the industry is not constructive.
  
  -Esme
51. Re:relational databases, woo hoo by Randolpho · 2003-01-07 02:38 · Score: 2
  
  Indeed, good points. As I said, relational databases are good for some things while object-oriented databases are good for others. Let's not poo-poo one over the other, shall we?
  
  --
  "Times have not become more violent. They have just become more televised."
  -Marilyn Manson
52. Re:relational databases, woo hoo by leandrod · 2003-01-07 06:41 · Score: 2
  
  About language, you are exaggerating. We don't need any new-fangled theories to know that big organisations or charismatic leaders do change words' meanings to suit themselves, and I would say this should be resisted.
  
  > I see a flawed implementation of relational theory (like any instantiation of an Aristotelian ideal is bound to be).
  
  I guess you meant Platonic instead of Aristotelian. Anyway, are set theory and predicate logic ideals that one can dispose of as one wishes? Either you get the operations right or not. SQL gets them wrong.
  
  > There hasn't been a massive conspiracy to control people's minds
  
  Who needs conspiracies, when intellectual laziness and greed cooperate?
  
  > there has been a natural evolution of language
  
  Relations are a mathematical concept in set theory. Predicate logic is also scientific. So if people start saying 2 + 2 = 5 we should assume that's just an evolution too?
  
  C'mon.
  
  --
  Leandro Guimarães Faria Corcete DUTRA
  DA, DBA, SysAdmin, Data Modeller
  GNU Project, Debian GNU/Lin
53. Re:relational databases, woo hoo by leandrod · 2003-01-07 06:44 · Score: 2
  
  > relational databases are good for some things while object-oriented databases are good for others.
  
  The relational model is a powerful, high-level general theory of data, while OO data management is just a mess on all accounts. It can't even decide if objects are values or variables. I dare anyone show something OO database systems can do better than RDBMSs.
  
  --
  Leandro Guimarães Faria Corcete DUTRA
  DA, DBA, SysAdmin, Data Modeller
  GNU Project, Debian GNU/Lin
Working Together... by airrage · 2003-01-05 06:50 · Score: 5, Interesting

In the projects I've worked on, I often find that the DBAs are older men or women and the developers are young. So the friction lies in the fact that the young-guns are doing .NET or Java or XML queuing and so the DBA is really at a loss to help "the developer think of things he may have not thought about". Of course, on the table-design side, this may be true. Secondly, due to the age-difference, "popping over the cube" is also difficult as the DBAs (being more mature shall I say) are less likely to be excited about a new paradigm.

Case in point, when I read in an Oracle PL/SQL book about Nested Tables, the light bulb in my head went off (or lit up, or whatever). Basically, these nested tables were objects with methods (code behind them), however, could be queried like tables. So, instead of selecting say a person's name, birthdate, and calculating an age, I could select name, birthdate, and age (the age column had code behind it automatically calculating the age). Now the beauty of this is for derived quantities that are only used once, but would be burdensome to store, this was a godsend. However, my DBA completely rejected this idea as too untried and newfangled.

This may sound very arrogant, but I think the developer should manage the DBA, often the DBA is a lone-wolf with too much power. Often the poor programmer has to submit changes with about as much hope they'll get done as one might have submitting universe changes to God Almighty.

--
"This isn't a study in computer science, its a study in human behavior"
1. Re:Working Together... by Anonymous Coward · 2003-01-05 07:01 · Score: 1
  
  You should listen to your DBA and also read up more about the theory behind relational databases before you try this. You'll change your mind.
2. Re:Working Together... by ergo98 · 2003-01-05 07:26 · Score: 5, Insightful
  
  This may sound very arrogant, but I think the developer should manage the DBA, often the DBA is a lone-wolf with too much power.
  
  So instead you'd end up with a developer with too much power...
  
  The thing about the separation of database design from "front-end" design (which could be middleware or front-end applications) is that in most cases the database schema and I/O design has a shelf life far longer than most front-ends. i.e. Don't see that new web interface as a new system with a new database, but rather as a "one of many" front ends to the back-end database: i.e. the database is of much greater long term importance than any front-end.
  
  Regarding your particular scenario: Is it possible that you're looking to shoehorn functionality for your particular front-end into a universal back-end where it might not be appropriate? Will every query on the person table suddenly have the overhead of calculating the person's age because one page in one obscure part of one front end needs it?
3. Re:Working Together... by ComputerSlicer23 · 2003-01-05 07:27 · Score: 3
  
  Hmmm, have you ever heard of views? They do specifically this. I've seen nested tables, but they seem to violate all the rules about Relational design. Oh, I'm a young pup developer, who's read enough to be a DBA w/ no experience.
  
  create view view_name as
  select name, birthdate, age_from_bday( birthdate ) as age from base_table_name;
  
  Where age_from_bday is the function used to calculate the number of years.
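  A runnable sketch of the same idea, assuming SQLite through Python's sqlite3 module instead of Oracle. The body of age_from_bday is not shown in the post, so the sketch inlines its own date arithmetic in the view. The one-row clock table is a made-up stand-in for the current date, there only so the result is repeatable; a real view would more likely use the database's own "today".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE base_table_name (name TEXT, birthdate TEXT);
INSERT INTO base_table_name VALUES ('Ann', '1970-01-05'), ('Bob', '1970-01-06');

-- Hypothetical stand-in for the current date, fixed for repeatability.
CREATE TABLE clock (today TEXT);
INSERT INTO clock VALUES ('2003-01-05');

-- age = difference in years, minus one if this year's birthday is still ahead.
CREATE VIEW view_name AS
SELECT name,
       birthdate,
       (strftime('%Y', today) - strftime('%Y', birthdate))
       - (strftime('%m-%d', today) < strftime('%m-%d', birthdate)) AS age
FROM base_table_name, clock;
""")

# Callers query the view like a table; nothing derived is stored.
rows = conn.execute("SELECT name, age FROM view_name ORDER BY name").fetchall()
print(rows)
```

  The base table only holds name and birthdate; the age column exists only in the view, so there is nothing to keep in sync.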
  
  Oracle traditionally has problems with new stuff. Okay, Oracle only has problems with esoteric corner cases with new stuff, but I've run across some of them w/ partitioning. So for production stuff, your DBA might be right on the money. Good DBAs and good SAs get paid big bucks to be ultra-conservative, and say "No". That's because they get paid big bucks so when they say "This will work in a production environment for the next ten years with acceptable downtime", they are correct. This is coming from a developer who's had to do his own DBA, SA and production support work because we can't afford a DBA or an SA. I dream of having another team member to ensure the stability of my production system.
  
  Kirby
4. Re:Working Together... by sql*kitten · 2003-01-05 07:45 · Score: 5, Interesting
  
  Secondly, due to the age-difference, "popping over the cube" is also difficult as the DBAs (being more mature shall I say) are less likely to be excited about a new paradigm.
  
  I guess you haven't been around the industry too long. You see, this is an industry totally driven by fashions and fads (far more so than even the clothing or entertainment industries). Every year there's a slew of new buzzwords and technologies, each of which promises to be the "silver bullet" and a whole new "paradigm" and none of them ever are. So when some bright-eyed bushy-tailed young hotshot announces that he's discovered the solution to the organization's IT ills, all the "old geezers" just roll their eyes, 'cos they've seen it a dozen times before.
  
  However, my DBA completely rejected this idea as too untried and newfangled.
  
  The problem with many developers is that they see a shiny new feature, can't wait to use it, and you end up with an application in which a dozen different people have solved the same problem a dozen different ways.
  
  My attitude is usually that a developer can do anything they want... so long as they're willing to carry a pager that might go off at 3AM, and take responsibility for fixing it before the next business day. Amazing how many times they just wanted to try out a new feature without any real need for it.
  
  In your specific case, you could have done exactly what you wanted to do with a view.
  
  This may sound very arrogant, but I think the developer should manage the DBA, often the DBA is a lone-wolf with too much power. Often the poor programmer has to submit changes with about as much hope they'll get done as one might have submitting universe changes to God Almighty.
  
  Yeah, and the accountants use software, so the developers should manage the accountants! And the salesmen! And the canteen staff! After all, a developer wrote the program that prints their paychecks!
  
  I personally have spent half an hour rewriting a developer's SQL that took the run time down from 15 hours to 9 seconds. Having said that, I don't know all that much about writing, say, MT-safe C++. That's why we have specialists in the first place. I'll bet dollars to donuts that your DBA knows far more about databases than you do, even if you know many more trendy buzzwords than he does.
5. Re:Working Together... by Anonymous Coward · 2003-01-05 08:51 · Score: 0
  
  Someone mod this up. I haven't seen a truer post on /. than this reply in a while.
6. Re:Working Together... by lateral · 2003-01-05 09:07 · Score: 2, Interesting
  
  I personally have spent half an hour rewriting a developer's SQL that took the run time down from 15 hours to 9 seconds.
  Sounds interesting, would you care to explain what you did?
  L.
7. Re:Working Together... by Anonymous Coward · 2003-01-05 09:41 · Score: 0
  
  "This may sound very arrogant, but I think the developer should manage the DBA, often the DBA is a lone-wolf with too much power. Often the poor programmer has to submit changes with about as much hope they'll get done as one might have submitting universe changes to God Almighty."
  
  I disagree with this point, the developer relates to the database in a theory-based logical sense. The DBA has to deal with the physical structure of the DBMS. When most analysts design their schemas they cannot consider all factors that the DBA will have to face in the deployment. I still believe there should be less of a rift between the DBAs and the data analysts. The Object-Relational databases seem to be a step in the right direction by creating a hybrid of the two.
8. Re:Working Together... by MattRog · 2003-01-05 09:58 · Score: 1
  
  I'm just guessing, but what I have seen (and fixed numerous times):
  1) Non-use of indexes
  2) Improper joins
  3) Procedural mindset
  
  Non-use of indexes is easy. Generally they'll run queries with functions on column names, or type-mismatches, etc. (e.g. WHERE colname * 2 = 12000, etc.). Or they'll not request an index when it is necessary.
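  A quick way to see the function-on-a-column problem, sketched with SQLite through Python's sqlite3 module. That is an assumption for the sake of a runnable example; other engines report plans differently, and the orders table and index name are made up. The query plan shows whether the index on colname is considered at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, colname INTEGER, note TEXT)")
conn.execute("CREATE INDEX idx_orders_colname ON orders (colname)")


def plan(sql):
    """Return SQLite's query plan text for a statement (detail is column 3)."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


# Arithmetic on the column hides it from the index.
wrapped = plan("SELECT * FROM orders WHERE colname * 2 = 12000")
# Moving the arithmetic to the constant side leaves the column bare.
bare = plan("SELECT * FROM orders WHERE colname = 6000")

print(wrapped)
print(bare)
```

  In SQLite the first plan is a full table scan and the second is a search using idx_orders_colname; the two queries ask for the same rows.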
  
  Improper joins -- joining on non-indexed columns, non-key columns, etc. Or leaving out a join condition and causing a cartesian product.
  
  Procedural logic should not be in the database. Often you'll see a query such as this:
  SELECT keycol FROM table1
  -- loop
  SELECT *
  FROM table2
  WHERE foreign_key = $keycol
  -- end loop
  
  Obviously a JOIN is needed.
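  For comparison, here is the loop above next to the JOIN that replaces it, as a sketch using SQLite through Python's sqlite3 module. The sample rows and the id and payload columns are made up; table1, table2, keycol and foreign_key are taken from the pseudo-query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (keycol INTEGER PRIMARY KEY);
CREATE TABLE table2 (id INTEGER PRIMARY KEY, foreign_key INTEGER, payload TEXT);
INSERT INTO table1 VALUES (1), (2), (3);
INSERT INTO table2 VALUES (10, 1, 'a'), (11, 1, 'b'), (12, 3, 'c');
""")

# Procedural version: one query for the keys, then one more query per key.
queries = 1
keys = [row[0] for row in conn.execute("SELECT keycol FROM table1 ORDER BY keycol")]
looped = []
for key in keys:
    queries += 1
    looped.extend(
        conn.execute(
            "SELECT id, foreign_key, payload FROM table2 "
            "WHERE foreign_key = ? ORDER BY id",
            (key,),
        ).fetchall()
    )

# Set-based version: a single statement, and the engine picks the strategy.
joined = conn.execute("""
    SELECT t2.id, t2.foreign_key, t2.payload
    FROM table1 AS t1
    JOIN table2 AS t2 ON t2.foreign_key = t1.keycol
    ORDER BY t1.keycol, t2.id
""").fetchall()

print(queries, looped == joined)
```

  With three keys the loop already issues four statements against the JOIN's one; the gap grows with the size of table1.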
  
  --
  
  Thanks,
  --
  Matt
9. Re:Working Together... by sql*kitten · 2003-01-05 10:03 · Score: 2
  
  Sounds interesting, would you care to explain what you did?
  
  Well, a modern relational database isn't designed to be used as a Von Neumann machine - it is massively inefficient to retrieve and process one record at a time. If you can express your query in terms of sets, and let the database worry about how to actually execute it, performance can be orders of magnitude better. And joins are almost always faster than subqueries. And if you want to use a compound index, the query predicates have to be in the same order as the columns were when the index was created.
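  One way to check the compound-index remark on a given engine is to ask for the query plan. The sketch below assumes SQLite through Python's sqlite3 module, with a made-up trades table; optimizers differ, so this says nothing certain about Oracle or Sybase. In SQLite, at least, what seems to matter is whether the leading index column is constrained, not the order in which the predicates are written.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trades (id INTEGER PRIMARY KEY, desk TEXT, day TEXT, note TEXT)"
)
# Compound index: desk is the leading column, day the trailing one.
conn.execute("CREATE INDEX idx_trades_desk_day ON trades (desk, day)")


def plan(sql):
    """Return SQLite's query plan text for a statement (detail is column 3)."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


# Both columns constrained, written in the opposite order to the index.
both_reordered = plan(
    "SELECT * FROM trades WHERE day = '2003-01-05' AND desk = 'fx'"
)
# Only the trailing column constrained.
trailing_only = plan("SELECT * FROM trades WHERE day = '2003-01-05'")

print(both_reordered)
print(trailing_only)
```

  Here the first query still searches via idx_trades_desk_day, while the second falls back to a scan because desk is left unconstrained.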
  
  I'm sure a programmer could tell you plenty of techniques for making a GUI run faster, and as someone with a DBA background, I wouldn't presume to tell him that since it's a GUI onto a database I could tell him how to write it.
10. Re:Working Together... by rollingcalf · 2003-01-05 13:16 · Score: 1
  
  "I'll bet dollars to donuts that your DBA knows far more about databases than you do, even if you know many more trendy buzzwords than he does."
  
  Yes, overall a DBA should know and probably does know more about databases than the developers. However, they can be very resistant to use new features simply because they don't understand them, not because they have knowledge of it and have weighed the pros and cons before deciding against it. Sometimes I do know more about some aspects of the database, because I've actually used the feature(s) successfully in prior projects at another client, while the current DBA hasn't used it or even read up about it.
  
  --
  ---------
  There is inferior bacteria on the interior of your posterior.
11. Re:Working Together... by Anonymous Coward · 2003-01-05 14:56 · Score: 0
  
  Case in point, when I read in an Oracle PL/SQL book about Nested Tables, the light bulb in my head went off (or lit up, or whatever). Basically, these nested tables were objects with methods (code behind them), however, could be queried like tables. So, instead of selecting say a person's name, birthdate, and calculating an age, I could select name, birthdate, and age (the age column had code behind it automatically calculating the age). Now the beauty of this is for derived quantities that are only used once, but would be burdensome to store, this was a godsend. However, my DBA completely rejected this idea as too untried and newfangled.
  
  Geez, man, ever heard of computed columns??? Every single major relational database out there has had them for ages. You can even include those in indexes on some products.
12. Re:Working Together... by petepac · 2003-01-05 15:14 · Score: 1
  
  If you work with a DBA that resists change, get a job at another place. A DBA that doesn't evolve, dies. To not use a feature or function out of ignorance is blatant stupidity.
  
  I've been in this business too long (30+ yrs) to stop learning. The day I do that; make me a manager. I've seen developers balk at using new database features because they refuse to RTFM. The good ones listen and learn. The really good ones use what they learned again. The things I've seen developers do in designs make me laugh out loud.
  
  When I started at my company, the developers did all the work a DBA should do. The systems had no security, ran slow, and the database would get corrupted every 3-4 months. I had to recover from two of these major fuck-ups. It's been over two years and the database designs by the DBA team I'm on run smooth with no corruption. This included moving up a database version for more stability (M$SQL 6.5 sucked wind out a dead cat's Ass). The developers couldn't even figure out how to convert the monster (2 - 50 Gig DBs). It's now split between a dozen servers with no single database problem bringing down every system we have. A developer may know features, but they never know stability, recovery or security. That's what's on the line for the DBA. We're the ones that are accountable for database systems and it's our ass that upper management goes for when systems are down.
  
  There was a comment in this thread on how a DBA rewrote code to speed up a process. That's done in my team every day. I've heard the old Oracle/Sybase/Informix/MSSQL debate for years. What I've seen is that a developer will write code that sucks for any database they use.
  
  As for the mature crack, I'm 51 and you can byte me. I've already seen the fuck-ups that you're going to make, years ago. They never change. Just the size of the systems and the languages developers use. Just remember to learn from them so you don't hurt yourself again. You only have two feet and one ass.
  
  --
  >> Practice Safe Hex
13. Re:Working Together... by Anonymous Coward · 2003-01-05 16:16 · Score: 0
  
  Will every query on the person table suddenly have the overhead of calculating the person's age because one page in one obscure part of one front end needs it?
  
  ? Why would it?
14. Re:Working Together... by Anonymous Coward · 2003-01-05 16:26 · Score: 0
  
  Mod the parent up!
15. Re:Working Together... by ergo98 · 2003-01-06 05:13 · Score: 1
  
  If there's a calculated column that's calculating the age of the person based on the current date, clearly it's going to execute every time the column is selected. As most users select their data as "SELECT * FROM YoMama" this column would be computed needlessly over and over again.
16. Re:Working Together... by aridhol · 2003-01-06 10:23 · Score: 2
  
  So use some intelligence. Only select the data that you need. It'll cut down on resource use and make the code that much cleaner.
  
  --
  I can't say that I don't give a fuck. I've just run out of fuck to give.
17. Re:Working Together... by Tony-A · 2003-01-06 20:27 · Score: 2
  
  Sounds completely plausible, and not at all restricted to SQL.
  The slow run time is due to the way the work is organized.
  
  This is a contrived example to show what is possible.
  You have a document on a screen on one computer that you want to copy to a different computer. To save time and preserve accuracy you will copy this letter by letter. Only thing is the computers are in different buildings and you have to walk back and forth.
  
  What you do is to knock out the incredibly vast amount of (re)wasted motion.
You know... by Tuxinatorium · 2003-01-05 06:53 · Score: 0, Offtopic

You know, by the RIAA/MPAA's logic, if we wanted to stop spam we should just make all database programs illegal...

--
Repeal the DMCA!
evil guy - started to use .NET! by Anonymous Coward · 2003-01-05 06:55 · Score: 0

As it's turned out these have been very valuable as we have started to use .NET in 2002.

Such a person should not be referred to on Slashdot.
This article is short and common sense... by dagg · 2003-01-05 07:16 · Score: 5, Interesting

But it is great to actually read it. Sometimes common sense things need to be written down just to verify that your techniques really do make sense. There are so many great little tidbits in the article, I'm having trouble picking one out to really comment on. Here's one:
An important part of this approach is iterative development, where you run the entire software life-cycle many times during the life of a project. Agile processes run complete life cycles in each iteration, completing the iteration with working, tested, integrated code for a small subset of the requirements of the final product. These iterations are short, usually running between a week and a couple of months, with a preference towards shorter iterations.

A big issue with iterative development is that the QA folks will quickly fall behind and become very anxious. What's the solution to that? Either embrace the QA person to get closer to the real development environment, or if that is impossible, get a new QA person. That's the only way to succeed.

--
Sex - Find It
1. Re:This article is short and common sense... by uweg · 2003-01-05 08:22 · Score: 1
  
  But what are we talking about? We applied common sense and it worked for us, though not 24/7? I still wonder, why so many projects use methodologies without knowing why and for what reason. This article is the other side of it: we still don't know why, so we use just common sense - call it "agile" - and we have at least a reason to write an article. (Or is this already a new methodology?)
2. Re:This article is short and common sense... by Anonymous Coward · 2003-01-05 10:06 · Score: 0
  
  agile development and extreme programming are current new methodologies, yes
The big picture by oliverthered · 2003-01-05 07:20 · Score: 4, Interesting

While I believe that 'Agile processes' are the best way to develop software, he appears to be advocating anarchy.

When you have a lot of people all making little changes, everyone starts to lose sight of the Big picture and you run into a "too many cooks spoil the broth" situation.

I'm sure a lot of people who read this site have seen a lot of code and design, and probably a lot of horrific code and design, well enough said.

--
thank God the internet isn't a human right.
1. Re:The big picture by chromatic · 2003-01-05 09:45 · Score: 3, Insightful
  
  When you have a lot of people all making little changes, everyone starts to lose sight of the Big picture and you run into a "too many cooks spoil the broth" situation.
  
  Why?
  
  He's big on automated testing. He's big on frequent, small integrations. Ideally, every developer integrates with the bleeding-edge source code at least once a day. That's why the changes are so little! No one strays too long from the work of everyone else.
  
  It may seem like anarchy at first glance, but if you read a little closer, you'll see that there are strong behaviors in place to prevent that chaos.
  
  --
  how to invest, a novice's guide
2. Re:The big picture by buttahead · 2003-01-05 10:22 · Score: 2, Interesting
  
  I don't see how you can call it chaos. He specifically mentioned that there were large planning meetings. He mentions that people should work together across the teams (DBA talks to developer, both talk to the network guy...etc.).
  
  Assuming that the planning meetings give everyone a Big Picture overview and assign tasks to the developers, and that the cycles are short so that the meetings happen regularly, then you are calling what most people consider normal work "chaos".
  
  That was a crazy sentence. Basically it seems the people on his team are all on the same page. That means that they can each write their parts without fear of stepping on anyone's toes.
  
  Too many cooks only spoil the broth if they don't know what it is supposed to taste like, and/or fail to taste it after each small change.
3. Re:The big picture by Anonymous Coward · 2003-01-05 11:58 · Score: 0
  
  This definitely is anarchy, but just a little bit more ordered than the anarchy where I work. I think this philosophy would be great applied to an environment like ours: one shared database and dozens of distinct, separate mini-apps that use the database for very specific purposes. In fact, it's more or less what we do, but adding the expanded communication between developers and DBAs leverages the strengths of an anarchic development group, adding some common-sense control without stifling the creativity.
Quality assurance by giel · 2003-01-05 07:45 · Score: 4, Interesting

Just like the actual users, the QA folks should be heavily involved in the actual development cycles. Extreme programming states that every development cycle starts with a functional design and the development of tests for each deliverable. Having these available means that QA is able to keep track of the quality of the deliverables very well.

IMHO if QA cannot keep track of the big picture they fail as QA, because that is just an important part of their job. On the other hand perhaps extreme programming should involve relatively more QA people than 'regular' development methods.

--
giel.y contains 2 shift/reduce conflicts
1. Re:Quality assurance by dubl-u · 2003-01-05 13:45 · Score: 2
  
  IMHO if QA cannot keep track of the big picture they fail as QA, because that is just an important part of their job. On the other hand perhaps extreme programming should involve relatively more QA people than 'regular' development methods.
  
  On an XP project, much more QA work certainly gets done than on your average project. One important component of XP is the customer tests, which are automated tests that are used to decide whether or not programmers are done with a particular feature.
  
  Whether there are special QA people who write those tests is a local decision, but good QA people have the right orientation to write really good customer tests.
Re:The real problem by Stoptional · 2003-01-05 07:50 · Score: 1

Bad day at the office?

--
Stoptional
A developer perspective of the world. by municio · 2003-01-05 07:54 · Score: 5, Insightful

I'm currently working as a developer, but I used to work as a development DBA. In my opinion this article shows the database and the DBA roles in a project from a developer perspective.

As a general rule, the developers think that the database is there to support their application, which is really the piece that solves the problem. On the other hand, DBAs think that the developers are there to support their data model, by supplying an interface with validation and some simple pieces of logic that their stored procedures don't cover.

I have worked much longer as a developer than as a DBA, but I still find it funny that the article assumes that the developer should be able to add a column to a table freely and then incorporate the changes into the main database. This is the equivalent of saying the DBA should be able to freely change a class or an interface and then add the changes to the source control repository.

While not wrong in itself, it clearly shows that many developers consider the DBA role secondary to the developer. It goes something like this: I can somehow do some DBA tasks that impact the development, like adding tables to the schema; I just don't want to get involved in the boring parts (backups, recovery or replicating schemas).

I think that creating a good data model is as difficult as creating a good application design, and writing a decent stored procedure is as hard as writing an efficient method. While some DBAs can write very good C++/Java code and some developers can design very good data models, no one should be doing each other's job unless they really, really, really know what they are doing.

As a general rule of thumb, if you consider MySQL a better database for large complex applications than PostgreSQL or Oracle, you should not be doing any database work.
1. Re:A developer perspective of the world. by Anonymous Coward · 2003-01-05 08:21 · Score: 0
  
  That's based on the idea that the rdb is some kind of point of interoperability between different applications.
  
  It's clear that the points of interoperation, aka the interfaces, must be tightly managed and documented.
  
  In my opinion, however, the rdb is not all too fit as a point of interoperation and cannot function as some kind of glorified API; that's better left to building bricks like components or network APIs.
  
  In such case the network API is the centerpiece of interoperation, and the rdb schema becomes an implementation detail.
2. Re:A developer perspective of the world. by Anonymous Coward · 2003-01-05 08:28 · Score: 0
  
  Good points; however, I would have to say having cross understanding in both disciplines would make for an even stronger team member. That being said, your point about really knowing what you are doing in both is key - good stuff.
3. Re:A developer perspective of the world. by roundand · 2003-01-05 08:56 · Score: 3, Insightful
  
  In my opinion this article shows the database and the DBA roles in a project from a developer perspective.
  
  True but perhaps irrelevant. Sometimes the most important difference between DBAs and developers is not their technical skills but their attitude to change. The nature of the job is that a DBA tends to be a bit like a soccer goalkeeper - he's not rewarded for scoring goals (adding new features that respond to user requirements more rapidly than anyone else) - all he gets is the blame if he lets goals in (lets someone break the database). The nature of the job tends to reward defensiveness.
  
  The result - semantic corruption, with any amount of database re-use, however dirty, preferred to re-factoring. Like my insurance client in 1999, who were using 2000-01-01 as the null value in some of their date columns...
  
  It's a really good article. We're doing a fair amount of the recommendations already, I can confirm the value of the tight DB layer, and having good test data packs from the start. In fact I'd go further - it doesn't matter whether you think you're doing waterfall or iterative, you will have to change the DB and you might as well work out how to do it efficiently.
4. Re:A developer perspective of the world. by tuomoks · 2003-01-05 11:17 · Score: 2, Insightful
  
  Agreed and more. Where is the business and where is the low end like performance management and forecasting of resources, etc? A too common mistake.
  We seem to forget that databases are there only to serve the business and its requirements. It's nice if you can build a fast, reliable and maintainable database, only to find out that your company can't afford it - what then? Or the company is planning to announce new services and the current information structure can't support them because some DBA or developer hasn't included them in the design.
  (IMHO) the database cycle is more like from business analyst to DBA to development project to performance management and back. None of these functions (usually) has enough information on its own to create real business databases. Nothing says that it can't all be one person (I would just have to see such a talented person first to believe that it can be done well).
  Otherwise a nice article (from a developer's perspective).
5. Re:A developer perspective of the world. by dubl-u · 2003-01-05 13:49 · Score: 2
  
  some developers can design very good data models, no one should be doing each other's job unless they really, really, really know what they are doing.
  
  The point of object-oriented analysis and design is to define the data model. So yes, any team doing OO work should have a skilled OO A&D person; they are generally called architects.
  
  Beware, though, of self-styled "architects" who are too busy or too important to write code. They are generally useless. Worse, they are good bullshitters, spewing out designs that sound plausible but are hell to implement.
Re:Protest The Wars On Everything: +1, Patriotic by Anonymous Coward · 2003-01-05 08:10 · Score: 0

warning hidden goatse link above
Works for OODB as well. by bokmann · 2003-01-05 08:16 · Score: 4, Interesting

I work on a project of about a dozen developers, some of us geographically dispersed. We use an Object-Oriented Database with Java (Database is from Versant).

We don't worry about *any* kind of DB administrator. Each developer has their own instance of the database. We don't worry about schema changes that break the database, because we *also* have a way to import/export the database to an XML file. Thus, if the schema radically changes from the deployed version, we just export to XML and re-import, so there is no complex thought about 'schema evolution' necessary.

Of course, with everyone having such free ability to make changes that impact the format of the data store, we need good unit tests to make sure things don't break unintentionally. This is actually one area we need to improve upon. People can make changes that affect the schema easily, and most times, it's not an issue... but a lot of times people make changes that would impact the XML format, and they don't always handle it properly.
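
A minimal sketch of that export/re-import round trip, in Python rather than our Java and without any Versant API (every name here is made up for illustration, and the records are flattened to plain dicts):

```python
import xml.etree.ElementTree as ET

def export_to_xml(records):
    """Serialize a list of flat dicts to an XML string, one <record> per object."""
    root = ET.Element("dump")
    for rec in records:
        node = ET.SubElement(root, "record")
        for field, value in rec.items():
            ET.SubElement(node, field).text = str(value)
    return ET.tostring(root, encoding="unicode")

def import_from_xml(xml_text, new_fields):
    """Rebuild records under a new schema.

    new_fields maps each field of the NEW schema to a default value.
    Fields missing from the dump get the default; fields that the new
    schema dropped are ignored.
    """
    records = []
    for node in ET.fromstring(xml_text).findall("record"):
        old = {child.tag: child.text for child in node}
        records.append({f: old.get(f, default) for f, default in new_fields.items()})
    return records

# Hypothetical data under the old schema.
old_data = [{"id": 1, "name": "Ada", "fax": "555-0100"}]
dump = export_to_xml(old_data)

# New schema: 'fax' was dropped, 'email' was added.
migrated = import_from_xml(dump, {"id": None, "name": None, "email": ""})
```

Note that `id` comes back as the string "1", not the integer 1: the XML dump does not carry type information in this sketch, so the importer has to restore it.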

Unit Tests are Key, but that's nothing new to the concept of refactoring.
1. Re:Works for OODB as well. by Tablizer · 2003-01-05 09:15 · Score: 2, Insightful
  
  I work on a project of about a dozen developers, some of us geographically dispersed. We use an Object-Oriented Database with Java (Database is from Versant).
  
  I would note that "object oriented databases" tend to resemble the hierarchical or "network" databases of the 1960's. They fell out of favor for various reasons... until OODBMS started popping up again. The OO crowd has not been able to solve many of the same problems that were there in the 60's.
  
  It is true that they tend to be more "organic" than RDBs, but they also lack the rigor of relational algebra that makes an RDBMS scalable, manageable, and more queryable. (Although the current crop of RDBMSs also falls short of the theory by some accounts.)
  
  --
  Table-ized A.I.
2. Re:Works for OODB as well. by alkini · 2003-01-06 05:38 · Score: 1
  
  But how well would that work if you worked for, say, a retailer who has millions of "objects" ordered, sold, shipped, returned for credit, etc? How quickly can you generate XML for 10 million orders and 500,000 returns, store that XML while the schema is updated, and re-import the XML? Remember that for every second that it takes to perform this, your company is losing money.
Done similar things by adamy · 2003-01-05 08:41 · Score: 3, Interesting

In a small shop (4 people) we had a similar setup. We were doing J2EE/JBoss/Tomcat work and used PostgreSQL as the back end. We had no full time DBA or Sys Admin. We had to be flexible.

The Good: Database changes were part of development. When our system worked right (58.3% of the time), all changes would go through QA, a small fix cycle, and we would push code and database changes during the evening (we were running a web site for people who used it during the business day only).

The Bad: People tended to develop with live data. The main problem with this was that if something changed, it might break a unit test. This could be a real problem if sensitive data was involved as well.

The Mechanism: We had a script (Perl) that executed a list of SQL statements embedded in it. For each revision, we changed the name of the script (the first script ended in 0001, the next revision's in 0002, etc.). The script checked whether that revision had already been executed and whether all the previous revisions had been executed; script 0008 could only be run after script 0007 had been run, etc. We had a single table that kept track of the current revision.

The nightly backup from live was dumped into the Integration database. The current update script would be run against it, then all functional tests. Every few days we would push code live. Yes, every few days. This was an organisational issue and yes, it caused a lot of headaches.
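
A minimal sketch of that mechanism, in Python with SQLite rather than our Perl and PostgreSQL. The table name `schema_revision`, the `REVISIONS` contents, and the function names are all made up for illustration:

```python
import sqlite3

# Hypothetical revisions; ours were SQL statements embedded in numbered scripts.
REVISIONS = {
    1: ["CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)"],
    2: ["ALTER TABLE person ADD COLUMN email TEXT"],
}

def current_revision(conn):
    """Return the highest revision recorded so far, or 0 for a fresh database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_revision (revision INTEGER NOT NULL)"
    )
    row = conn.execute("SELECT MAX(revision) FROM schema_revision").fetchone()
    return row[0] or 0

def apply_revision(conn, number):
    """Run revision `number` only if it is exactly the next one in sequence."""
    current = current_revision(conn)
    if number <= current:
        return False  # this revision has already been executed
    if number != current + 1:
        raise RuntimeError(f"revision {number} needs revision {number - 1} first")
    for statement in REVISIONS[number]:
        conn.execute(statement)
    # Record the revision only after all of its statements succeeded.
    conn.execute("INSERT INTO schema_revision (revision) VALUES (?)", (number,))
    conn.commit()
    return True
```

Running `apply_revision(conn, 2)` against a fresh database raises, which is the "0008 only after 0007" rule; running an already applied revision is a no-op.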

To improve:
Better QA. We should have had a batch of scripts that could be run against the DB. Instead, our QA person had to run through them manually.

We should have had test data for the tables that were primarily used for collecting and reporting, instead of running against live data.

Instead of a self-executing Perl script being the required mechanism, I would have an executable on a machine that tracked the schema name, schema version number, and currently available script. Scripts would be primarily straight SQL, run through a single program, and targeted against multiple schemas. Hmm, maybe I should explain more.

For a given application, we had 3 databases running. One was for collection, one was for reporting, and one was reference data. Only the collection database was backed up, as the other two could be regenerated from source. However, we still needed to revision control the reference databases' source, and it was faster to send patches than to update the whole thing each time.

We started moving to an Application Service Provider (ASP) setup where we used the same schemas, but with different data in them, for different clients. Since our clients were local governments (counties), we wanted to A) be able to get a new county up and running quickly if we got them as a client, B) be able to dump whatever existing data they had into that county's DB quickly, C) keep the counties' data separate from each other, and D) maintain our versioning. So certain scripts had to be run against County X data of the collection schema. If I had to do it again, with what I know now, I'd have had a revision control database that kept track of the other databases (and was self-maintaining, why not), with schema name, data set name, and current revision for each.

--
Open Source Identity Management: FreeIPA.org
The key point by PinglePongle · 2003-01-05 08:57 · Score: 5, Insightful

is iterative design. Which is becoming fairly widely accepted in OO circles, and almost universally accepted in Agile circles.

Databases, however, are a lot harder to iterate - the cost of change is higher than with any other code. Martin Fowler is laying down an approach to manage (not reduce - manage) that cost, and it all comes down to a guess we have to make - do we think the overall cost/benefit tradeoff of an iterative process is better than a Big Design Up Front process ?

On the eXtreme Programming mailing list, there's been a lot of discussion about how to deal with databases - some deny the need for databases altogether, some advocate using Mock Objects for testing and even development etc. It all boils down to the cost of change - it's expensive to change a database design because it is very hard to identify the knock-on effects. Some changes are relatively easy to manage - adding a column is unlikely to actually break anything - but others can wreak havoc with existing applications - changing the type or size of a column for instance.
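
A small sketch of that difference, in Python with SQLite (the `orders` table and `total_quantity` function are made up for illustration). The same application code survives an added column but breaks when a column's type changes under it:

```python
import sqlite3

def total_quantity(conn):
    """Existing application code: names its columns and expects integers."""
    return sum(qty for (qty,) in conn.execute("SELECT qty FROM orders"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, qty INTEGER)")
conn.execute("INSERT INTO orders (qty) VALUES (3), (4)")
before = total_quantity(conn)

# Change 1: add a column. The existing query never mentions it.
conn.execute("ALTER TABLE orders ADD COLUMN note TEXT")
after_add = total_quantity(conn)

# Change 2: change the column's type by rebuilding the table with qty as TEXT.
conn.execute("CREATE TABLE orders_new (id INTEGER PRIMARY KEY, qty TEXT, note TEXT)")
conn.execute("INSERT INTO orders_new SELECT id, qty, note FROM orders")
conn.execute("DROP TABLE orders")
conn.execute("ALTER TABLE orders_new RENAME TO orders")
try:
    total_quantity(conn)
    type_change_broke_caller = False
except TypeError:
    # qty now comes back as a string, so summing it as a number fails.
    type_change_broke_caller = True
```

The failure only shows up when the caller runs, which is the knock-on effect problem: nothing in the database itself complains about the change.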

I'd love to think that the next big improvement in software development tools is not going to be yet another language but a sensible way of tying objects to their persisted data. All the solutions I've seen so far are bolted-on - they either force the database into unnatural positions, or make the objects fit into a model that's not quite what they'd be otherwise.

In the meantime, this article is well worth investigating - the idea of evolving the datamodel in tandem with the migration scripts is very powerful.

--
It's all very well in practice, but it will never work in theory.
1. Re:The key point by Tablizer · 2003-01-05 09:29 · Score: 1
  
  I'd love to think that the next big improvement in software development tools is not going to be yet another language but a sensible way of tying objects to their persisted data. All the solutions I've seen so far are bolted-on - they either force the database into unnatural positions, or make the objects fit into a model that's not quite what they'd be otherwise.
  
  Like I have described elsewhere, relational thinking and object thinking tend to be at odds. They just simply view the world differently. One or the other will eventually have to give up territory IMO. Existing approaches to co-existence tend to neuter one or the other WRT reach and power. This is part of the reason why OO fans and DBA's tend not to get along. (I tend to favor the relational viewpoint. It fits the way I think better.)
  
  --
  Table-ized A.I.
"scattered willy-nilly" by Tablizer · 2003-01-05 09:01 · Score: 3, Interesting

To understand the consequences of database refactorings, it's important to be able to see how the database is used by the application. If SQL is scattered willy-nilly around the code base, this is very hard to do. As a result it's important to have a clear database access layer to show where the database is being used and how.....Having a clear database layer has a number of valuable side benefits. It minimizes the areas of the system where developers need SQL knowledge to manipulate the database, which makes life easier to developers who often are not particularly skilled with SQL. For the DBA it provides a clear section of the code that he can look at to see how the database is being used. This helps in preparing indexes, database optimization, and also looking at the SQL to see how it could be reformulated to perform better. This allows the DBA to get a better understanding of how the database is used.

I disagree with this more or less. SQL is often too closely related to the application to put in a separate place. You have to go hunting back and forth to see and manage the relationship. If there is duplication, then I agree that it should be factored to a shared spot. However, beyond cleaning duplication, keep the SQL near where it is used.

If you want to be able to track it, then put some kind of comment marker that a grep-like utility can use to find and gather the SQL if need be. Both approaches are conventions anyhow. My suggestion gives the best of both worlds.
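
A rough sketch of what I mean, in Python. The marker names `SQL-BEGIN` and `SQL-END` are just a convention invented for this example; any string that is easy to grep for would do:

```python
import re

BEGIN = re.compile(r"SQL-BEGIN\s+(\S+)")  # marker comment carrying a label
END = "SQL-END"

def gather_sql(source_text):
    """Return {label: region} for every marked region in one file's text."""
    found, label, lines = {}, None, []
    for line in source_text.splitlines():
        if label is None:
            match = BEGIN.search(line)
            if match:
                label, lines = match.group(1), []
        elif END in line:
            found[label] = "\n".join(lines)
            label = None
        else:
            lines.append(line.strip())
    return found

# Hypothetical application source with the SQL kept next to its use.
example = '''
def overdue(cursor, today):
    # SQL-BEGIN overdue_invoices
    query = """
    SELECT id, total FROM invoice
    WHERE due_date < :today
    """
    # SQL-END
    return cursor.execute(query, {"today": today})
'''

regions = gather_sql(example)
```

The gathered region still includes the surrounding assignment and quotes, which is fine for a DBA reading it; the point is only that the SQL can be found without moving it away from the code that uses it.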

Also, OO proponents tend to use simpler SQL, and thus it might be easier to put wrappers around such trivial SQL. However, a more balanced approach is to use the full power of the DBMS rather than re-invent it in your application. Hand-built indexing, joins, filtering, multi-user contention management etc. built into app code is a common sin of OO design IMO, including Fowler's designs, I am sorry to say.

Sometimes I think many OO'ers are motivated by a desire for control, and that is why they would rather reinvent the DB rather than use an existing one. But that is anti-reuse IMO.

Relational thinking and OO tend to be at odds, either way. It is not practical to have both manage the noun models and noun views IMO. Pick one or the other, I say. Thus, either get an OODBMS, or relax the OO design to not duplicate and fight with the RDB and query languages.

--
Table-ized A.I.
1. Re:"scattered willy-nilly" by NineNine · 2003-01-05 10:51 · Score: 3, Interesting
  
  I've seen projects literally ruined by OO zealots who simply refused to use the database for what it was for. They'd do a "select *" and put a wrapper around it, and object(ify?) the whole damn chunk of data. I've seen this happen in several different projects, despite my protesting. Suffice to say each project done like that flopped due to 1. very difficult to maintain code and 2. serious performance problems. It actually caused the collapse of an 80+ person company in one instance. The project didn't make it to the key customers in time due to this shitty architecture.
2. Re:"scattered willy-nilly" by dubl-u · 2003-01-05 13:32 · Score: 2
  
  However, beyond cleaning duplication, keep the SQL near where it is used. [...] However, a more balanced approach is to use the full power of the DBMS rather than re-invent it in your application. [...] Thus, either get an OODBMS, or relax the OO design to not duplicate and fight with the RDB and query languages.
  
  I agree that OO and relational thinking are at odds, and that mixing them produces poor results. But if you do good OO design, then the kind of database you use shouldn't be a big issue.
  
  The trick is to put all your persistence logic in one tier of the code. I have one app that will happily use several different backing stores (java serialization, XML files, and SQL storage) at the change of a config line. I could easily add code to use an OO database or an LDAP database.
  
  As long as you design that way, there's very little "fighting with the RDB" to do.
  
  Sometimes I think many OO'ers are motivated by a desire for control, and that is why they would rather reinvent the DB rather than use an existing one. But that is anti-reuse IMO.
  
  Just because you have a database that can do something doesn't mean that you should do it there.
  
  Java, for example, has handy built-in stuff for indexing, joins, filtering, and multi-user contention management. I can happily reuse that, remaining database neutral. Better, most outfits have many more programmers than DBAs; if the logic is in the code rather than the database, I have higher development concurrency and a better truck factor.
  
  Reuse is swell, but as a developer my responsibility is to efficiently deliver good software and keep it up over the long term. RDBs are one tool I'll use to achieve this, but only when they deliver the best business value.
3. Re:"scattered willy-nilly" by Tablizer · 2003-01-05 18:03 · Score: 1
  
  The trick is to put all your persistence logic in one tier of the code. I have one app that will happily use several different backing stores (java serialization, XML files, and SQL storage) at the change of a config line. I could easily add code to use an OO database or an LDAP database.
  
  Sounds like you are using the DB as *just* persistence, otherwise it would not be that swappable. There is really no way out of that. If you use the *full* potential of a RDBMS, then its functionality would simply NOT be swappable with say XML DB's.
  
  Java, for example, has handy built-in stuff for indexing, joins, filtering, and multi-user contention management.
  
  I do not believe Java makes a better DB than an RDBMS. I would have to see it with my own eyes, and if true, then the network databases of the 1960's have made their comeback and relational is truly dead. Either that, or there is a rotten fish somewhere in the back of your system that you tolerate out of habit or lockage to doctrine.
  
  --
  Table-ized A.I.
4. Re:"scattered willy-nilly" by dubl-u · 2003-01-05 20:28 · Score: 2
  
  Sounds like you are using the DB as *just* persistence, otherwise it would not be that swappable. There is really no way out of that. If you use the *full* potential of a RDBMS, then its functionality would simply NOT be swappable with say XML DB's.
  
  Yes, I am. Although I did make use of some SQLisms when optimizing performance.
  
  We agree: If you use all of an RDBMS's power, it probably won't be much of an OO app. If you really do good object work, you probably won't use all of an RDBMS's power.
  
  I'm willing to use whatever tools it takes to get a client's problem solved. I find that for apps over about 10,000 lines of code, I'm fastest with OO tools. That's especially true when the project will go through many iterations; then the flexibility of OO code, especially using modern refactoring tools, is unmatched.
  
  But if I'm doing something quick and dirty or a little something where needs are stable enough that it will never go past v 1.0, I'm glad to write procedural code that's heavily coupled to an RDBMS.
  
  I do not believe Java makes a better DB than RDBMS.
  
  It doesn't need to be. RDBMSes are undoubtedly a better general-purpose solution for storing, fetching, and manipulating arbitrary data. But 98% of the apps I write perform a few operations on very specific data. For a given application, almost all of the theoretically appealing relational possibilities remain unused in practice. Manually writing the few things that a database would do for me often doesn't outweigh the penalties a RDBMS brings with it.
  
  I find that is especially common for server apps with moderate data sets (under, say, 2 GB). In that case, keeping everything in RAM in Java objects is hundreds of times faster than fetching from a database. For that kind of speed improvement, I'm willing to go to the slight extra effort of populating a java.util.HashMap instead of doing "CREATE INDEX".
  
  The same goes for things where the data just doesn't fit well into the relational style. I'm sure somebody has found a way to cram images, drawings, and HTML structures into ANSI SQL, but I sure wouldn't want to look at the code.
  
  But if somebody wants to run arbitrary queries against a large body of data, I'm always glad to dump the stuff out into an SQL database. Crystal Reports and its cousins are swell when people want to go romping through the data.
  
  then the network databases of the 1960's have made their comeback
  
  And if they have, so what?
  
  As professionals, we should be interested in using the right tools for the job, whatever they might be. The technology of paper is thousands of years old, but if a client comes to me with some overblown set of requirements, I think it's my duty to tell them if they'd be better off with a stack of index cards.
5. Re:"scattered willy-nilly" by Anonymous Coward · 2003-01-05 21:27 · Score: 0
  
  They'd do a "select *" and put a wrapper around it, and object(ify?) the whole damn chunk of data
  
  Sounds like Java Entity Beans -- a technology driven entirely by the facts that monkeys can't write SQL and your Oracle Admin is an asshole.
6. Re:"scattered willy-nilly" by ledestin · 2003-01-06 02:41 · Score: 1
  
  Java, for example, has handy built-in stuff for indexing, joins, filtering, and multi-user contention management.
  Could you point me at APIs that do that?
7. Re:"scattered willy-nilly" by Tablizer · 2003-01-06 05:49 · Score: 1
  
  That's especially true when the project will go through many iterations; then the flexibility of OO code, especially using modern refactoring tools, is unmatched.
  
  What if there were better relational factoring tools (if such is even needed)? OO seems to be growing into a self-fulfilling prophecy because all the tools are being built for it instead of other paradigms.
  
  I find that is especially common for server apps with moderate data sets (under, say, 2 GB). In that case, keeping everything in RAM in Java objects is hundreds of times faster than fetching from a database.
  
  Even with RAM-caching? (Some vendors are currently building RAM-optimized RDBMS). Besides, Java never has been renowned for its speed.
  
  But if somebody wants to run arbitrary queries against a large body of data, I'm always glad to dump the stuff out into an SQL database. Crystal Reports and its cousins are swell when people want to go romping through the data.
  
  But that costs you a dumping step, and risks stuff being out of date.
  
  As professionals, we should be interested in using the right tools for the job, whatever they might be.
  
  My observation is that certain tools just seem to map to people's heads better. I find relational to be a more organized and consistent paradigm than classes of pointers to classes. It is usually easier for me to grok a relational schema than navigating a bunch of intermingled classes. There is more consistency in relational normalization and design techniques than class normalization. OO often lacks consistency from practitioner to practitioner. Each OO shop is a new wheel. I don't know if this is because relational is more mature or just a better paradigm.
  
  --
  Table-ized A.I.
8. Re:"scattered willy-nilly" by Anonymous Coward · 2003-01-06 05:55 · Score: 0
  
  Could you point me at APIs that do that?
  
  It is called SQL :-P
9. Re:"scattered willy-nilly" by dubl-u · 2003-01-06 06:36 · Score: 1
  
  What if there were better relational factoring tools (if such is even needed)? OO seems to be growing into a self-fulfilling prophecy because all the tools are being built for it instead of other paradigms.
  
  I dunno; I only develop with real tools, not hypothetical ones. It could be that OO is a self-fulfilling prophecy. Or it could be that the reason it didn't live up to its early promise was the lack of good OO-specific tools.
  
  But the current crop of OO refactoring tools have all been developed by single people or small teams, so if you think there's a revolutionary tool that needs to be built, you should take a swing at it.
  
  In that case, keeping everything in RAM in Java objects is hundreds of times faster than fetching from a database.
  Even with RAM-caching? (Some vendors are currently building RAM-optimized RDBMS).
  
  You're welcome to try it yourself. Try Prevayler's speed tests against a database of your choosing. Let us know how it works.
  
  My observation is that certain tools just seem to map to people's heads better. I find relational to be a more organized and consistent paradigm [...]
  
  And people doing object work find the opposite. I doubt one side is, in any useful sense of the word, wrong about this. The right tool for the job doesn't just depend on the job, it depends on the people doing the job.
  
  But note that the divide isn't as sharp as it seems at first. In practice, people tend to use a mix of models as appropriate. In the OO world, for example, people often use the Command Pattern when a focus on verbs becomes necessary.
10. Re:"scattered willy-nilly" by dubl-u · 2003-01-06 06:45 · Score: 2
  
  Java, for example, has handy built-in stuff for indexing, joins, filtering, and multi-user contention management.
  Could you point me at APIs that do that?
  
  Sure.
  
  For indexing, the point of which is to speed finding matches, check out the Java Collections classes. For joins, you just establish the proper relationship between the objects. For multi-user contention management, you wrap things in synchronized blocks.
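  
  The same three ideas, sketched in Python rather than Java: a dict plays the role of a java.util.HashMap index, plain object references are the join, and a threading.Lock stands in for a synchronized block. The class and field names are made up for illustration.

```python
import threading
from dataclasses import dataclass, field

@dataclass
class Customer:
    id: int
    name: str
    orders: list = field(default_factory=list)  # the "join": direct references

@dataclass
class Order:
    id: int
    total: float
    customer: Customer

class Store:
    def __init__(self):
        self._lock = threading.Lock()   # contention management
        self.customers_by_name = {}     # the "index": name -> Customer

    def add_customer(self, customer):
        with self._lock:
            self.customers_by_name[customer.name] = customer

    def add_order(self, order):
        with self._lock:
            order.customer.orders.append(order)

    def order_totals_for(self, name):
        """Index lookup followed by a walk over the related objects."""
        with self._lock:
            customer = self.customers_by_name.get(name)
            return [o.total for o in customer.orders] if customer else []

store = Store()
ada = Customer(1, "Ada")
store.add_customer(ada)
store.add_order(Order(10, 25.0, ada))
totals = store.order_totals_for("Ada")
```

  This covers only the simple cases named above; anything like an arbitrary ad hoc query still has to be written by hand.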
  
  If you need more than the basic stuff that Java provides, check out the page on the Prevayler site where they list various object indexing and query libraries.
11. Re:"scattered willy-nilly" by Anonymous Coward · 2003-01-06 08:02 · Score: 0
  
  Just because you can't grok the OO paradigm doesn't mean it's flawed. It simply means you are flawed. Wake up fucknut! OO is THE programming paradigm you unevolved simian meat eater.
Really good comment by SerpentMage · 2003-01-05 09:32 · Score: 2

Good comment.

What I really am not crazy about in the article is that it fails to realize that databases live on. Applications come and go. Databases just keep on living. Hence doing a bad database design comes back and haunts you forever....

The project Atlas that has worked for the past two years is a CPU cycle in terms of databases. I know at large corporations that they have databases that have been running for thirty years non stop. It is called a mainframe!

--

"You can't make a race horse of a pig"
"No," said Samuel, "but you can make very fast pig"
1. Re:Really good comment by dubl-u · 2003-01-05 13:07 · Score: 2
  
  What I really am not crazy about in the article is that it fails to realize that databases live on.
  
  You're close, but you're not quite there yet.
  
  What really should live on is the data, not a particular database or a particular schema. The reasons that applications have to come and go is that they aren't being maintained properly. And one of the big brakes on proper refactoring is database schemas that aren't allowed to change.
  
  If you force your database schemas to evolve along with the application (which in turn is forced to evolve along with your business) then everybody wins. Well, everybody except the Oracle marketroid trying to sell you yet more magic bullets.
2. Re:Really good comment by cerberusti · 2003-01-05 21:56 · Score: 1
  
  Yes, but what happens when that application becomes applications (plural)? Changing a database in that case can have unintended consequences, which is exactly why programmers generally DO NOT get to control database layout (unless of course the database is for a single application, or the company plans to have major IT headaches.)
  
  --
  I'm a signature virus. Please copy me to your signature so I can replicate.
3. Re:Really good comment by SerpentMage · 2003-01-06 00:12 · Score: 2
  
  I agree with your comment, but as the other commenter said, the problem is the "s" in "applications".
  
  Now about applications coming and going because they are not maintained properly? I doubt that one. Some may fall into this category but not all.
  
  The other problem that is not being remembered, and I think this is extremely important, is that you cannot change databases as simply as developers would like.
  
  For example, let's say that you want to keep things flexible and change databases. Can you ensure that all of the data in database A will equal 100% database B? Answer no! I remember at a client once that they wanted to move data from a mainframe to an Oracle database. About 2 terabytes. Oracle would not guarantee data correctness and hence the client did not make the move.
  
  The problem is that developers look at software as a one-off design issue, which is correct. But data is a production issue. It is sort of like the engineer who designs the car that cannot be built because of production constraints. In the industry the production constraints determine which car is built. Likewise in the software world the database needs to define what programs can be built, not the other way around.
  
  And I am not a DBA...
  
  --
  
  "You can't make a race horse of a pig"
  "No," said Samuel, "but you can make a very fast pig"
4. Re:Really good comment by dubl-u · 2003-01-06 06:54 · Score: 2
  
  Can you ensure that all of the data in database A will equal 100% database B? Answer no!
  
  If you don't maintain things properly, that's true. If you don't have good system tests, then moving to a new system is hard, because you are never quite sure that the new system works like the old system.
  
  Fear, as they say, is the mind-killer. For the project I'm working on right now, I'm confident I could migrate it not just to a different database, but to an entirely different database technology (e.g., to LDAP) in short order. Why? Because I have good tests, and I made sure to keep my persistence logic in a single place.
  
  The problem is that developers look at software as a one-off design issue, which is correct. But data is a production issue. It is sort of like the engineer who designs the car that cannot be built because of production constraints. In the industry the production constraints determine which car is built. Likewise in the software world the database needs to define what programs can be built, not the other way around.
  
  All production issues should be development issues. Neither the database nor the code should define what gets built; business needs must determine that.
  
  It sounds like you've worked in an environment where the developers can escape the consequences of their actions. Try doing weekly iterations and monthly releases; the developers will soon be cured of their ignorance about what it takes to put things into production.
5. Re:Really good comment by dubl-u · 2003-01-06 07:07 · Score: 2
  
  Yes, but what happens when that application becomes applications (plural)? Changing a database in that case can have unintended consequences, which is exactly why programmers generally DO NOT get to control database layout (unless of course the database is for a single application, or the company plans to have major IT headaches.)
  
  Then you break things into tiers and have them make requests of one another rather than rummaging around in the raw data.
  
  For example, suppose a previously uncomputerized business decides that the first thing they need is an intranet where they can track the names and addresses of their customers. So you build one.
  
  So then they want to add something to invoice the customers. Since you wrote the contact manager in a clean way, the UI stuff and the domain model were already separate, right? So you just pull the domain model out into its own layer, and now you have two apps talking to the same customer data.
  
  Sure, you can use the database as an integration layer. But that's only one option, and rarely the best one, as you are then forced to either a) have duplicate code all over the place for the same sorts of things, or b) start putting code into your database via stored procedures, triggers, and whatnot.
  
  Now we all know that duplicate code is the way to madness and woe. But putting logic in the database isn't much better. Now your database isn't a database, it's an application server in disguise, one with weird proprietary languages, poor tool support, and limited options for scalability.
  
  If you're going to build an application server (which will happen in some form when you go from singular app to plural apps), you might as well do it in a language that all your developers already know, not one that is a mystery to all but the DBA. Then you get the advantages of portability, a wide variety of tools, and a panoply of ways to scale up as requirements grow.
Perl, not PERL by Anonymous Coward · 2003-01-05 09:33 · Score: 1, Insightful

No, and I do mean no, Perl programmer ever calls it PERL. It is not an acronym, even though people have tried to retrofit one to it on lots of occasions. Unless you are the only person that actually uses the cheap copy with that name that someone tried to promote a long time ago.

Your credibility goes right out the window if you don't even know that. Reading your post it does seem that you are just throwing out some buzzwords and acronyms, hoping for some cheap karma.

If I wanted to be real nitpicky, there are dozens of other such mistakes - things that are called things they are not and misspelled. Sad, sad.
The problem is how developers design! by MattRog · 2003-01-05 10:13 · Score: 3, Interesting

It seems like he's coming up with a solution to a developer-induced problem, not a problem with DBMSs or DBAs in general.

If you normalize you don't have to worry about null-ability of columns. You don't add and drop columns (usually). Pack those tables behind views and your application doesn't need to change a thing:
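CREATE VIEW user_info AS
-- SELECT * is shorthand here; a real view would list its columns,
-- since both tables carry username.
SELECT *
FROM user u
INNER JOIN user_detail ud ON u.username = ud.username
-- etc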

Then your app would simply select from that view.

The problem is that developers don't take the time to properly learn relational theory, instead content to know the basic semantics of the SQL language and call themselves 'fluent in SQL'. They know how to create tables in their GUI of choice (or even code it by hand) and are 'database designers'.

Another benefit is stored procedures. Although abused to include procedural logic in the database, they can help keep database logic out of your application and generally help much along the same lines as views:
CREATE PROCEDURE get_user_info AS
SELECT *
FROM user u
INNER JOIN user_detail ud ON u.username = ud.username
-- etc

In this case the stored procedure would be called in your application.
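
To make the view idea concrete, here is a minimal sketch using Python's built-in sqlite3 module (chosen only so the example runs anywhere; the post does not name a DBMS). The table name `app_user` replaces `user`, which is a reserved word in several DBMSs, and the columns `email` and `full_name` are invented for illustration. Only `user_detail`, `username` and `user_info` come from the SQL above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical normalized tables; the column names are assumptions.
    CREATE TABLE app_user (username TEXT PRIMARY KEY, email TEXT NOT NULL);
    CREATE TABLE user_detail (
        username  TEXT PRIMARY KEY REFERENCES app_user(username),
        full_name TEXT NOT NULL
    );
    -- The view lists its columns explicitly so username appears only once.
    CREATE VIEW user_info AS
    SELECT u.username, u.email, ud.full_name
    FROM app_user u
    INNER JOIN user_detail ud ON u.username = ud.username;

    INSERT INTO app_user VALUES ('matt', 'matt@example.com');
    INSERT INTO user_detail VALUES ('matt', 'Matt R.');
""")


def load_user(conn, username):
    """Application code reads the view and never names the base tables."""
    return conn.execute(
        "SELECT username, email, full_name FROM user_info WHERE username = ?",
        (username,),
    ).fetchone()


user_row = load_user(conn, "matt")
print(user_row)
```

The application only ever sees `user_info`, so how the data is split across tables stays a database-side decision.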

--

Thanks,
--
Matt
1. Re:The problem is how developers design! by Anonymous Coward · 2003-01-05 10:43 · Score: 0
  
  The primary benefit of SQL and most query languages is the ability to perform ad-hoc queries, those that even a user can construct.
  
  Stored procedures negate this benefit and force users into performing the queries the DBA thought they might need.
  
  Views are not yet updatable. Until they are, your view solution does not yield the benefits that you or I feel it should.
2. Re:The problem is how developers design! by NineNine · 2003-01-05 10:55 · Score: 1
  
  Another benefit is stored procedures. Although abused to include procedural logic in the database, they can help keep database logic out of your application and generally help much along the same lines as views:
  
  While I appreciate the SQL 101 in your post (note sarcasm), I disagree. Languages like PL/SQL are largely designed to hold procedural logic and provide excellent performance.
3. Re:The problem is how developers design! by MattRog · 2003-01-05 11:14 · Score: 1
  
  Languages like PL/SQL are largely designed to hold procedural logic and provide excellent performance.
  Correct. However, the relational model has no need for procedural code, hence the 'abuse' comment.
  
  If every developer out there had taken 'SQL 101' perhaps we wouldn't be having this conversation in the first place. :)
  
  --
  
  Thanks,
  --
  Matt
4. Re:The problem is how developers design! by MattRog · 2003-01-05 11:22 · Score: 1
  
  However in the OLTP application context generally you are not running ad hoc SQL, so views and stored procedures are perfect. To test your query certainly you can type random SQL to the DBMS, but once it is in the application it should be encapsulated in a view or stored procedure.
  
  Views are updatable in most DBMSs -- there is a good series of papers here which discusses the topic more fully.
  
  --
  
  Thanks,
  --
  Matt
5. Re:The problem is how developers design! by asfasmcdas · 2003-01-06 02:02 · Score: 1
  
  Views are updatable in Oracle, even views that are based on joins. The only proviso is that only the fields in the key-preserved table in the join view are updatable, i.e., in a one-to-many relationship the 'many' table is the key-preserved table when you join them together.
  
  As for forcing users to perform queries - isn't this the whole basis of OO? You provide a set of public interfaces and keep the implementation private. If users require a different interface - then develop a new view/stored proc.
Re:The real problem by Anonymous Coward · 2003-01-05 10:36 · Score: 0

Right on, bro. I always recommend to all of my derelict friends to become DBAs. The field has not progressed beyond the formalized theory of the 70's. Any idiot can read the Oracle manuals and be billed out at over 100 an hour to gate keep a database from developers.

What really gets me are those "database designers". These are the DBAs who are too embarrassed and arrogant to even do the DBA tasks. (similar to software architects)...

There is never any reason for developers to have to go through a DBA to make a database change. There is no reason ALL database scripts are not managed in source control systems. There is no reason a DBA needs any sort of interaction with developers during the design and implementation of a database. ALL DBAs should be considered system administrators that have more knowledge of certain DBMS products than of the OS.

They are important (I am not saying otherwise) but you don't want to be paying one to review schema changes or come up with a box and line diagram for you... that is the work of a software developer, if they are unable to do this... fire the stupid software engineer, pay more for someone who can tie their own shoe.
"Automatically Update all Database Developers" by NineNine · 2003-01-05 10:41 · Score: 1

As a former DB developer, I can say that if my DBA did this, he'd be eating through a straw. Before my sandbox is updated, I *need* to know what is going to be updated and when. Whether it's a schema change or a data change, a change like that mid-way through development is a serious decision, and shouldn't be undertaken lightly.
Also, I should have some input as to what is updated. A DBA shouldn't have complete control over the schema. The DBA and developers need to work together to work out an ideal schema. Ideally, it's mostly worked out *before* any coding is done. DB objects should not be done on the fly in most cases.
1. Re:"Automatically Update all Database Developers" by dubl-u · 2003-01-05 13:37 · Score: 3, Insightful
  
  Whether it's a schema change or a data change, a change like that mid-way through development is a serious decision, and shouldn't be undertaken lightly.
  
  That's one way to develop.
  
  Or, instead, you could assume that change is a given and tune your development process for that.
  
  The DBA and developers need to work together to work out an ideal schema. Ideally, it's mostly worked out *before* any coding is done. DB objects should not be done on the fly in most cases.
  
  Yes, ideally all requirements are identified, and then all the design is done, and then all of the code is written. Ideally. And ideally, you could remove the backspace key from the keyboard.
  
  Alas, most of us don't get to code in an ideal world. The premise behind the various Agile methods (Extreme Programming, Crystal, Scrum, FDD, etc.) is that since the world isn't ideal, we might as well pick development methods that are tuned for the world we live in.
  
  Interesting notion, eh?
Relational databases, grrr by Pseudonym · 2003-01-05 11:33 · Score: 2
Flamebait time!

Summarising several other replies and adding my own biases...
- The overwhelming majority of SQL-based DBMSes are not true relational databases.
- A lot of data (probably "most") does not fit neatly into the relational model.
- XML-driven databases are close to perfect if you're storing and indexing documents. In particular, trying to index text using a relational system is pretty close to the dictionary definition of "insane". OTOH, non-text data and XML can often be a bad fit.
- Lazy programmers are a problem no matter what you do. Forcing them to stay on their toes by giving them a system they have to fight with is no answer.
- OODBMSes have a lot of promise, but there are no standards which are adhered to. Expect vendor lock-in if you go down this path.
Summary: All data storage solutions suck. For your specific application, there will be one or two which suck the least. That's why they pay you.
--
sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
1. Re:Relational databases, grrr by Malcontent · 2003-01-05 19:59 · Score: 2
  
  Eventually people will realize that the unix filesystem is very close to the ideal database.
  
  --
  War is necrophilia.
2. Re:Relational databases, grrr by Anonymous Coward · 2003-01-06 00:00 · Score: 0
  
  +1: Funny
3. Re:Relational databases, grrr by leandrod · 2003-01-06 05:52 · Score: 2
  
  > The overwhelming majority of SQL-based DBMSes are not true relational databases.
  
  No SQL system is relational, because SQL in and by itself violates the relational principles.
  
  > A lot of data (probably "most") does not fit neatly into the relational model.
  
  All data can fit neatly into the relational model, provided one defines the domains and normalises.
  
  > XML-driven databases are close to perfect if you're storing and indexing documents.
  
  XML DBMSs won't scale, because they are too complex. In fact, they are just a throwback to the hierarchical DBs that were tried and dumped thirty years ago because of complexity.
  
  > trying to index text using a relational system is pretty close to the dictionary definition of "insane".
  
  Why? Just make text a supported data type. In fact, even SQL databases can do that.
  
  > Forcing them to stay on their toes by giving them a system they have to fight with is no answer.
  
  And that's one of the big relational advantages. It can hold any data, optimising access for itself, under a consistent model. Only the DBA has to worry about the physical schema.
  
  > OODBMSes have a lot of promise, but there are no standards which are adhered to.
  
  The problem is actually worse. There is no OO data model to adhere to, so creating a standard would have to cater to too many different goals. Just look at CODASYL, and weep. An OO standard would probably be much more complex than CODASYL ever was.
  
  --
  Leandro Guimarães Faria Corcete DUTRA
  DA, DBA, SysAdmin, Data Modeller
  GNU Project, Debian GNU/Lin
4. Re:Relational databases, grrr by Pseudonym · 2003-01-06 11:39 · Score: 2
  
  All data can fit neatly into the relational model, provided one defines the domains and normalises.
  
  Say what?
  
  Any data can be shoehorned into a relational model if you use enough IDs. For any sufficiently complex model, there comes a point when it goes way beyond "neat".
  
  A few examples that I have worked with spring to mind, most of which would take too long to explain. Consider, however, manipulating a directed acyclic (or, indeed, cyclic) graph where you need to query on "reachability" and propagate information around the network. A relational setup for this would get pretty unwieldy pretty fast.
  
  Just make text a supported data type. In fact, even SQL databases can do that.
  
  I'm yet to see an SQL database which can handle multiple terabytes of textual data, pulling in documents and full-text indexing them in "real time". (Disclaimer: I get paid to work on a product which can do precisely this, amongst many other things.) For this, you need a database optimised for storing and indexing text (e.g. SGML, XML).
  
  I think we might be talking about different things when we say "XML databases", BTW. I'm talking about databases which store and index XML data (i.e. XML is a basic data type), possibly in addition to other kinds of data. I think you're talking about databases where XML is the record model too. Sorry about the confusion.
  
  I do take your point about DBAs worrying about the physical model(s) and application developers working on a more abstract model of the data, but unfortunately I've never worked under those conditions. Small to medium-sized organisations working under budgetary constraints can't afford enough DBAs who have domain-specific knowledge about all of the individual problems that the enterprise applications are trying to solve. A physical database which follows the conceptual model of the data closely is a big boon here, as there are far fewer surprises.
  
  There is no OO data model to adhere to [...]
  
  Guess you haven't heard of ODMG. It very much exists, however most OODBMS vendors support it incompletely.
  
  --
  sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
5. Re:Relational databases, grrr by Pseudonym · 2003-01-06 11:55 · Score: 2
  
  I know you meant it as a joke, but for some applications (squid springs to mind, plus a few specialist web cache-like applications that I've written), it's precisely what you need.
  
  --
  sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
6. Re:Relational databases, grrr by leandrod · 2003-01-07 03:04 · Score: 2
  
  > Any data can be shoehorned into a relational model if you use enough IDs.
  
  Again you are thinking SQL instead of relational, and working around its arbitrary limitations. What you need isn't enough IDs, but a normalised database model.
  
  > Consider, however, manipulating a directed acyclic (or, indeed, cyclic) graph where you need to query on "reachability" and propagate information around the network. A relational setup for this would get pretty unwieldy pretty fast.
  
  Please do yourself a favor and get Fabian Pascal's "Practical Issues in Database Management", if you don't want to explain what was your problem and how you tried to solve it. He has some pretty neat graph examples which are simple and powerful in the relational model but nearly impossible in SQL.
  
  > SQL database which can handle multiple terabytes of textual data
  
  As I said, SQL is too complex and full of arbitrary restrictions. These wouldn't be the case with a relational system.
  
  > XML is a basic data type
  
  Relational datatypes are arbitrary, so there is no reason to deviate from the relational model to store any data whatsoever.
  
  > organisations working under budgetary constraints can't afford enough DBAs
  
  That is just because SQL is so complex. Relational systems would be to SQL as Unix is to MS-Windows: far less administration required.
  
  > A physical database which follows the conceptual model of the data closely is a big boon here, as there are far fewer surprises.
  
  You are assuming the DBMS to be defective, and lots of procedural code to be needed. You are assuming DBAs' work to be needed in addition to, instead of to replace coders' work.
  
  How it should work is that DBAs work with SysAnalysts to design the types and the logical model. Then the DBA creates the physical model according to expected load. Programmers create the declarative integrity constraints as expressions of the business rules, the DBA tunes again for performance. See how much easier it should be?
  
  > Guess you haven't heard of ODMG. It very much exists, however most OODBMS vendors support it incompletely.
  
  Yes, I have, but ODMG does not a data model make. It is a data access standard, sure enough, but based on OO ideas, not on any formal data model. At most, it is an ad-hoc data model, and thus without the power and simplicity of the relational one.
  
  Moreover, ODMG is so disputed that probably either it will never be universal, or it will grow to be as complex as SQL or even as much as CODASYL, or worse.
  
  --
  Leandro Guimarães Faria Corcete DUTRA
  DA, DBA, SysAdmin, Data Modeller
  GNU Project, Debian GNU/Lin
Persistent object model design mastery by plsuh · 2003-01-05 11:58 · Score: 2

One of the best things that you can get out of Fowler and Sadalage's article is some idea of the benefits that an object persistence/database abstraction layer can give you. I work with WebObjects, which comes with the Enterprise Objects Framework, a powerful, mature object persistence layer. Think of it as entity Enterprise Java Beans, except that it's a lot lighter in weight at runtime since everything happens in the same address space, along with about 5 years maturity over entity EJB's and very powerful relationship management.

This gives you the ability to let most of your developers work solely with the object model, and encapsulates things like calculating an age from a birth date and today's date or a full name from first name and last name without polluting the database with miscellanea. Only one or two or a specified small team of people need to be concerned with the design of the database and writing SQL. All of the other developers can just use the Enterprise Objects and not worry about it.
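
As a rough picture of what "derived values live in the object model, not in the database" means, here is a small Python sketch. It is not the Enterprise Objects Framework API; the `Person` class and its field names are hypothetical and only mirror the age and full-name examples above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Person:
    # Only these three fields would be persisted as columns.
    first_name: str
    last_name: str
    birth_date: date

    @property
    def full_name(self) -> str:
        """Derived from stored fields; never written to the database."""
        return f"{self.first_name} {self.last_name}"

    def age_on(self, today: date) -> int:
        """Whole years between birth_date and the given day."""
        had_birthday = (today.month, today.day) >= (
            self.birth_date.month,
            self.birth_date.day,
        )
        return today.year - self.birth_date.year - (0 if had_birthday else 1)


pat = Person("Pat", "Smith", date(1970, 6, 15))
print(pat.full_name, pat.age_on(date(2003, 1, 5)))
```

Developers working against `Person` never need to know which columns exist, and the schema carries no redundant age or full-name column to keep in sync.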

However, this brings me to another point -- all of the successful WebObjects projects that I have seen share one thing in common: a highly skilled architect who has a solid understanding of not only the development process but also how databases work. A good developer teamed with a good DBA is no substitute for one master artiste in this case, as the master can know how to create an object design that meshes precisely with the database to optimize performance and ease of development, as well as making it relatively easy to create the UI.

In fact, a sign of a well-run WebObjects project is that for the first week or two, most of the developers are sitting on their hands or taking long lunches or reading Slashdot. During this time, however, the senior person or two is taking the time to get the underlying data object model right. Only when that is solid does the manager unleash the rest of the troops.

--Paul
1. Re:Persistent object model design mastery by dubl-u · 2003-01-05 13:40 · Score: 2
  
  In fact, a sign of a well-run WebObjects project is that for the first week or two, most of the developers are sitting on their hands or taking long lunches or reading Slashdot. During this time, however, the senior person or two is taking the time to get the underlying data object model right. Only when that is solid does the manager unleash the rest of the troops.
  
  Or not.
  
  Many people use EOF with Agile methods, which generally treat design as something that should be done on a just-in-time basis, rather than in a big up-front spurt. Indeed, EOF works even better with Agile projects than with traditional ones, as it makes continuous schema evolution much easier.
2. Re:Persistent object model design mastery by Anonymous Coward · 2003-01-05 16:05 · Score: 0
  
  No buzzword or methodology is going to allow you to build a solid data model one column at a time. Been there, saw it tried over and over, saw it fail over and over. Don't care if you listen, don't care if you agree, but you sure as hell won't try it in any instance I manage. In case you weren't listening the first time, YOU CAN'T BUILD A SOLID DATA MODEL ONE COLUMN AT A TIME.
3. Re:Persistent object model design mastery by dubl-u · 2003-01-05 16:28 · Score: 2
  
  No buzzword or methodology is going to allow you to build a solid data model one column at a time. Been there, saw it tried over and over, saw it fail over and over. Don't care if you listen, don't care if you agree, but you sure as hell won't try it in any instance I manage. In case you weren't listening the first time, YOU CAN'T BUILD A SOLID DATA MODEL ONE COLUMN AT A TIME.
  
  Well, I've done it. Repeatedly. So have others, including the author mentioned in the article that this story is about.
  
  You're right that it can't be done just any old way; if you try it with traditional methods, you'll get flattened. It takes special techniques, carefully applied. Plus a team who are all professionals. For more details, see Martin Fowler's book "Refactoring", and Kent Beck's book "Extreme Programming Explained".
  
  Shout as much as you like, but just because you can't do it doesn't mean it's impossible; it just means you don't know how.
Actually, it's the data architect's job. by tom+enterprise · 2003-01-05 12:28 · Score: 0

Man, no wonder every enterprise is full of unintegrated kludgy silos. Man, no wonder the universal data model is getting popular, 'cause they came up with stuff to make sure you morons wouldn't fuck things up.
Re:The real problem by tom+enterprise · 2003-01-05 12:35 · Score: 0

There is no reason a DBA needs any sort of interaction with developers during the design and implementation of a database

You're right, that's the data architect's job. Developers can't model crap. All you get is unintegrated kludgy silos. A data model represents your business. I would trust a DBA more than a developer, but it's not even the DBA's job. No wonder big companies are beginning to use Len Silverston's universal data model. Everybody knows you morons can't build a good model, so they built one that anyone can use for any business. And no wonder they pay data warehouse guys a ton of money for cleaning out your shit.
Re:The real problem by Anonymous Coward · 2003-01-05 14:38 · Score: 0

A database is just storage for applications written by programmers. Don't make database administration out to be something difficult. Most programmers know at least a half dozen computer languages with SQL being just one of them. It is really easy for any programmer to optimize database queries and transactions. It's just common sense.
XP with Databases by Leeji · 2003-01-05 15:38 · Score: 2

I find that testing "code with side-effects" (ie: database inserts) is the hardest type of code to test, and I haven't yet found a solution that satisfies me.

On the eXtreme Programming mailing list, there's been a lot of discussion about how to deal with databases - some deny the need for databases altogether, some advocate using Mock Objects for testing and even development etc.

As the XP mailing list talks about, you can always create mock objects and test their state, etc, but it's still not quite legitimate. You end up building these massive meta-models that themselves might have issues. Perhaps the best solution I found was to have a "test" instance of your database that would always contain an appropriate seed of test data. If you keep your "side-effect code" and "test the side-effect code" inside of a transaction that you roll-back, you're pretty well off. Unit tests can start taking a long time, though.
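
The roll-back approach can be sketched in a few lines of Python with the standard sqlite3 module. The in-memory database stands in for the seeded "test" instance, and `add_customer`, `rolled_back` and the `customer` table are made-up names for illustration, not anything from the mailing list discussion.

```python
import sqlite3
from contextlib import contextmanager


def make_seeded_db():
    """Stand-in for the 'test' database instance with known seed data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO customer (name) VALUES ('seed')")
    conn.commit()
    return conn


def add_customer(conn, name):
    """Code under test: it has a side effect (an INSERT)."""
    conn.execute("INSERT INTO customer (name) VALUES (?)", (name,))


def count_customers(conn):
    return conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]


@contextmanager
def rolled_back(conn):
    """Run a test body, then roll back whatever it wrote."""
    try:
        yield conn
    finally:
        conn.rollback()


conn = make_seeded_db()
with rolled_back(conn) as c:
    add_customer(c, "alice")
    inside = count_customers(c)   # the insert is visible inside the transaction
outside = count_customers(conn)   # and gone again after the rollback
print(inside, outside)
```

This only works if the code under test does not commit on its own; anything that commits internally needs a different cleanup strategy.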

Things get worse when you look at Ron Jeffries' Adventures in C# where he starts on the slippery slope of re-implementing the textbox class as a mock object.

Not much of a point here, more of a "me, too" response.

--
It all goes downhill from first post ...
1. Re:XP with Databases by Richard+W.M.+Jones · 2003-01-06 01:11 · Score: 1
  
  I find that testing "code with side-effects" (ie: database inserts) is the hardest type of code to test, and I haven't yet found a solution that satisfies me.
  
  One way to test these things which I've found worked in the past was to put the whole test inside a transaction, and roll back the transaction at the end of the test.
  Rich.
  
  --
  libguestfs - tools for accessing and modifying virtual machine disk images
2. Re:XP with Databases by Leeji · 2003-01-06 03:32 · Score: 2
  
  Yes, as my third paragraph points out :) Don't karma whore from my own reply!
  
  --
  It all goes downhill from first post ...
Right-on Kitty... by bubbha · 2003-01-05 17:11 · Score: 1

The "Silver Bullet" Kitty is referring to here is the seminal work No Silver Bullet - Essence and Accidents of Software Engineering written by Frederick P. Brooks. If you are a "young" developer then reading and understanding this paper will start you on the road to being an "old" developer.

One of the main points is that "coding" the application is actually a small percentage of the overall time and cost of delivering non-trivial applications. Because of this, improvements in the software development process that are directed towards the "coding" aspect will necessarily be somewhat inconsequential - much to the chagrin of tool vendors and the IS managers that willingly believe their vendorspeak.

The much much larger part of your budget and flow-time will be consumed by requirements analysis and system design (leaving out maintenance for now.) So even if some new tool reduces your "coding" time by 50% (fat chance), 50% of - say - 20% is still not THAT big of a deal. And let's face it... switching from ASP/VB6 to .NET will probably not reduce your "coding" time anywhere near 50%.
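
The arithmetic behind that claim, as a tiny Python sketch. The 20% and 50% figures are the illustrative numbers from the paragraph above, not measurements, and `overall_saving` is just a name made up here.

```python
def overall_saving(coding_share, coding_speedup):
    """Fraction of total project time saved when only the coding
    portion shrinks by coding_speedup (both given as fractions)."""
    return coding_share * coding_speedup


saving = overall_saving(coding_share=0.20, coding_speedup=0.50)
print(f"{saving:.0%} of the total schedule")
```

With those inputs the whole project gets about a tenth shorter, which is why the post argues for going after requirements and design first.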

To have a dramatic improvement, you need to go after the "big rocks" first....requirements analysis and system architecture/design.

BTW, Extreme Programming is not about skipping requirements definition. It's about doing meaningful and efficient requirements definition which they believe can best be accomplished by delivering production quality iterations on development projects using evolutionary rapid prototyping (see Structured Rapid Prototyping by John L. Connell and Linda Brice Shafer.)

I find it interesting that with all the work that has been published on XP, that they only NOW are addressing the issues of incorporating database issues into the methodology. And make no mistake about it, if they can not incorporate databases in the way that the article describes - well that pretty much hurts it as a viable methodology for most data intensive business applications. Now, assuming that the inventors of XP intended their methodology to include this most important class of application, it seems almost foolhardy to have gone this far without addressing this database issue much sooner. Hey - but that's what they get for not doing their requirements first!

--
I want to be alone with the sandwich
Evolutionary Database? by s-orbital · 2003-01-05 18:11 · Score: 1

I wonder what it will look like in 100 million years.

--
Patent: from Latin patere, to be open
A different approach - treat database as a class by asfasmcdas · 2003-01-05 23:04 · Score: 1

A different solution that I am experimenting with is the notion of treating the database as another class.
Using this paradigm the database tables are the equivalent of private fields/methods and a combination of views and stored procedures are equivalent to public fields and methods. In this way the DBA is able to update his implementation of the class i.e. the table structure, while maintaining a consistent interface for users of his class through the views and stored procedures.
The problems experienced with typical applications where developers interact directly with the database tables are exactly the type of problems the OO paradigm is designed to prevent.
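
A minimal sketch of the idea, assuming SQLite through Python's sqlite3 module. The table, view, and function names are hypothetical, and since SQLite has no stored procedures, only the view half of the interface is shown. The application reads through the view, the underlying table is restructured, and the application code does not change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# "Private" implementation, version 1: the name lives in a single column.
# The view `customer` is the "public" interface (hypothetical names).
conn.executescript("""
    CREATE TABLE customer_v1 (id INTEGER PRIMARY KEY, full_name TEXT);
    INSERT INTO customer_v1 VALUES (1, 'Ada Lovelace');
    CREATE VIEW customer AS SELECT id, full_name FROM customer_v1;
""")


def customer_names(db):
    # Application code only ever queries the public view.
    rows = db.execute("SELECT full_name FROM customer ORDER BY id")
    return [row[0] for row in rows]


before = customer_names(conn)

# The DBA changes the implementation: split the name into two columns,
# migrate the data, and redefine the view so its shape stays the same.
conn.executescript("""
    CREATE TABLE customer_v2 (
        id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT
    );
    INSERT INTO customer_v2
        SELECT id,
               substr(full_name, 1, instr(full_name, ' ') - 1),
               substr(full_name, instr(full_name, ' ') + 1)
        FROM customer_v1;
    DROP VIEW customer;
    CREATE VIEW customer AS
        SELECT id, first_name || ' ' || last_name AS full_name
        FROM customer_v2;
    DROP TABLE customer_v1;
""")

after = customer_names(conn)
```

Both calls to customer_names go through the same view, so the caller never sees the table split.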
on sharing data by Tablizer · 2003-01-06 06:29 · Score: 1

And a fifth is that once the database exists, no matter how much the original designers warn against it, people start using the database as an integration layer. Suddenly 14 different apps are munging the same data, making it impossible to change the schema, and nearly as hard to track down a bug. The whole point of OO programming is that data should always be wrapped by the code that goes with it.

I am curious how OO philosophy is supposed to handle this. I am not criticizing OO here (I'll save that for elsewhere), I just want to make sense of OO doctrine. The idea of a database is that *multiple* applications, languages, and paradigms can share the same data without mass copying. (This is a good thing, no?) But OO wants to wrap the data or state within a single language interface. If that is done, then how is the data shared with multiple applications, languages, and paradigms?

Also, what kind of 'bug' are you referring to? A bug in the RDBMS engine, or a shop app? Setting up app-level logins and/or transaction logging can help trace who the writer is, although it might slow things down while it is on.

--
Table-ized A.I.
1. Re:on sharing data by dubl-u · 2003-01-06 07:22 · Score: 2
  
  The idea of a database is that *multiple* applications, languages, and paradigms can share the same data without mass copying. (This is a good thing, no?)
  
  In the OO perspective, no, that's not good. The OO notion is that you should think of data and behavior together, not separately. So it's good to provide access to the object (data + behavior), but dangerous to provide universal access to the raw data itself.
  
  But OO wants to wrap the data or state within a single language interface. If that is done, then how is the data shared with multiple applications, languages, and paradigms?
  
  You expose it via some sort of interface, as needed by the situation. There are lots of good solutions to this, from low-level OS stuff to the venerable CORBA to the latest web services fads. The right choice depends on your circumstances.
  
  Also, what kind of 'bug' are you referring to? A bug in the RDBMS engine, or a shop app?
  
  Exactly my point. Since the responsibility for managing a particular datum is not cohesive, it's hard to say where to look. Is it in one of the apps? Is it in some of the code that's been pushed into the database? Is it a bug in the database itself? Or perhaps there are bugs in all of those, interacting in some subtle way. Who knows?
  
  Whereas in good OO designs, all of the manipulation of a given datum goes through the object that contains that datum. If it becomes corrupt, you know where to look.
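  
  A minimal sketch of that claim in Python. The Account class and its rules are hypothetical, chosen only to illustrate: every change to the balance goes through one of two methods, so a corrupt balance can only come from this class.

```python
class Account:
    """Hypothetical example: the balance changes only through methods."""

    def __init__(self, balance=0):
        self._balance = 0
        self.deposit(balance)  # even the initial value is validated

    @property
    def balance(self):
        # Read-only view of the datum; there is no public setter.
        return self._balance

    def deposit(self, amount):
        if amount < 0:
            raise ValueError("deposit must be non-negative")
        self._balance += amount

    def withdraw(self, amount):
        if amount < 0 or amount > self._balance:
            raise ValueError("invalid withdrawal")
        self._balance -= amount
```

  Python does not enforce privacy, so the underscore is only a convention; the point is that callers are expected to use deposit and withdraw rather than touch the field.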
2. Re:on sharing data by Tablizer · 2003-01-06 10:12 · Score: 1
  
  There are lots of good solutions to this, from low-level OS stuff to the venerable CORBA to the latest web services fads. The right choice depends on your circumstances.
  
  But these are not much different than Stored Procedures. (Not that I necessarily recommend such.)
  
  Exactly my point. Since the responsibility for managing a particular datum is not cohesive, it's hard to say where to look.
  
  But just because you have OO accessors does not necessarily mean you know who used those accessors. Global validation can be forced via database triggers, so it is not a validation issue to compare.
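  
  A minimal sketch of trigger-based validation, assuming SQLite through Python's sqlite3 module. The table, trigger, and function names are hypothetical. The rule lives in the database, so it applies to every application that writes to the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The triggers reject any negative balance, whichever app issues the write.
conn.executescript("""
    CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL);

    CREATE TRIGGER account_no_negative_insert
    BEFORE INSERT ON account
    WHEN NEW.balance < 0
    BEGIN
        SELECT RAISE(ABORT, 'balance must not be negative');
    END;

    CREATE TRIGGER account_no_negative_update
    BEFORE UPDATE OF balance ON account
    WHEN NEW.balance < 0
    BEGIN
        SELECT RAISE(ABORT, 'balance must not be negative');
    END;

    INSERT INTO account VALUES (1, 100);
""")


def try_set_balance(db, account_id, value):
    # Returns True if the database accepted the write, False if it refused.
    try:
        db.execute(
            "UPDATE account SET balance = ? WHERE id = ?", (value, account_id)
        )
        db.commit()
        return True
    except sqlite3.DatabaseError:
        db.rollback()
        return False
```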
  
  I still don't get it.
  
  --
  Table-ized A.I.
3. Re:on sharing data by dubl-u · 2003-01-06 12:53 · Score: 1
  
  But just because you have OO accessors does not necessarily mean you know who used those accessors.
  
  Yes, but if somebody is corrupting an object by passing a bad value into an accessor, then my object is poorly written. The responsibility for managing a datum should all be in one place.
  
  Global validation can be forced via database triggers, so it is not a validation issue to compare.
  
  Uh, anything that is Turing-complete can do anything any other Turing machine can. If you find an OO proponent who says "X can only be done with OO," he's an idiot.
  
  So yes, you can do that with triggers if you like. If that's easier for you, you should do it. But if you're doing OO programming, you shouldn't.
  
  But these are not much different than Stored Procedures. (Not that I necessarily recommend such.)
  
  To you, perhaps.
  
  My broader point is that good programmers must master many tools and use them when they are appropriate. Until you've mastered the OO stuff to the extent that it seems like a reasonable choice for some problems, its choices will always seem weird to you.
4. Re:on sharing data by Tablizer · 2003-01-06 14:07 · Score: 1
  
  Until you've mastered the OO stuff to the extent that it seems like a reasonable choice for some problems, its choices will always seem weird to you.
  
  If you get 20 different people who believe they have "mastered OO" together in a room, they will probably disagree wildly where to use OO and where not to.
  
  --
  Table-ized A.I.
5. Re:on sharing data by dubl-u · 2003-01-06 14:25 · Score: 1
  
  If you get 20 different people who believe they have "mastered OO" together in a room, they will probably disagree wildly where to use OO and where not to.
  
  Not when I've done it!
  
  Try coming by the next Extreme Programming conference. Or if you're in the Bay Area, come by the XP users' group.
  
  Certainly, we can have a lot of different opinions on the right way to tackle things. But I've never had a problem getting a team to cohere around a set of choices.
6. Re:on sharing data by Tablizer · 2003-01-06 17:11 · Score: 1
  
  Certainly, we can have a lot of different opinions on the right way to tackle things. But I've never had a problem getting a team to cohere around a set of choices.
  
  Is it possible that the hirer tended to filter by certain ways of seeing things?
  
  If there is such a consensus, then why are there no good books on OO biz modeling?
  
  --
  Table-ized A.I.
7. Re:on sharing data by dubl-u · 2003-01-06 18:23 · Score: 1
  
  Is it possible that the hirer tended to filter by certain ways of seeing things?
  
  Anything's possible. But I doubt it. I often differ wildly from my colleagues on the theoretically optimal way to do something.
  
  But we all recognize that the point is to build useful software, and that there are many paths to the same destination. Generally we just pick a path and see how it goes.
  
  When we have serious disagreements, it's generally because we don't have enough data to decide, so we agree to try an experiment to see which works better.
  
  If there is such a consensus, then why are there no good books on OO biz modeling?
  
  Aren't there? I've heard good things about both Agile Modeling and Domain-Driven Design.
  
  But I think that it's mainly a practical skill, so I'm not sure that a book, however good, is much use. There are a lot of books on cooking, but any chef will tell you that no book will substitute for time in the kitchen.
  
  Say, I'm giving up on this thread now. If you're really interested in trying OOP so that you can see what problems it might be good for, then you should have enough to get started. If not, drop by an XP user group in your area, and they'll point you at the modern crop of tools and books.
8. Re:on sharing data by Tablizer · 2003-01-07 14:40 · Score: 1
  
  I've heard good things about both Agile Modeling and Domain-Driven Design.
  
  I could not find "Domain-Driven Design" on Amazon.
  
  (All design should be "domain driven", BTW.)
  
  --
  Table-ized A.I.