Ask Database Guru Brian Aker

Posted by Roblimo on Monday November 12, 2007 @04:00AM from the earning-a-living-with-open-source-software dept.

Brian Aker is Director of Architecture for MySQL AB. He has also worked on the code (and database) that runs Slashdot, and is well-known in both Apache and Perl circles. Outside of the arcane world of open source "back-end" programming, though, hardly anyone has heard of him. This is your chance to ask Brian (hopefully after looking at his blog and Wikipedia listing) about anything you like, from Perl to database architecture to open source philosophy to upcoming events in Seattle. We'll send Brian 10 of the highest-moderated questions approximately 24 hours after this post appears. His (verbatim) answers will appear late this week or early next week.

Re:Tabular vs hierarchical arrays by DaleGlass · 2007-11-12 07:51 · Score: 2, Informative

How do you do a SELECT DISTINCT name, or an OUTER JOIN in that model? What if you need to search by a non-key column?

Key/value systems have their place, but doing very normal RDBMS things in them is a pain.
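
To make the pain concrete, here is a minimal Python sketch. It assumes a toy key/value store modelled as a plain dict; the record layout and the helper names (`distinct_names`, `find_by_city`, `build_index`) are made up for illustration and do not come from any real product. Both "queries" have to visit every record, and the usual workaround is a secondary index you maintain yourself.

```python
# Hypothetical key/value store: primary key -> record.
# The data and all function names are illustrative assumptions.
store = {
    1: {"name": "alice", "city": "Seattle"},
    2: {"name": "bob", "city": "Boston"},
    3: {"name": "alice", "city": "Austin"},
}

def distinct_names(kv):
    # Rough equivalent of SELECT DISTINCT name: the store only knows
    # about keys, so every record has to be visited.
    return sorted({rec["name"] for rec in kv.values()})

def find_by_city(kv, city):
    # Rough equivalent of WHERE city = ?: city is not the key,
    # so this is a full scan as well.
    return sorted(key for key, rec in kv.items() if rec["city"] == city)

def build_index(kv, column):
    # The usual workaround: a secondary index built (and kept up to
    # date) by the application rather than by the store.
    index = {}
    for key, rec in kv.items():
        index.setdefault(rec[column], []).append(key)
    return index

print(distinct_names(store))
print(find_by_city(store, "Seattle"))
print(build_index(store, "name"))
```

An RDBMS does the equivalent of `build_index` for you when you create an index, and the planner decides when to use it. An outer join would need yet another hand-written loop over two such stores.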
Re:Object databases? by daniel_newton · 2007-11-12 08:42 · Score: 2, Informative

db4o (http://en.wikipedia.org/wiki/Db4o) is an open source object database. Apparently BMW, Boeing, Intel and others think it is "industrial grade".

It has a Java and a .NET version.
Re:Document databases? by Pseudonym · 2007-11-12 11:56 · Score: 2, Informative

Well, one might argue also that hierarchical data is also not a good fit with the relational model.

True enough.

If we already have XML data type in many DBs, what prevents us having Document data type like what's in CouchDB (yes, it's proprietary, but so are many of the features in current RDBMS).

It's the way you interact with it that's critical here. Think about how you work with your favourite SQL DBMS and with Google, and you'll see that they're quite different at a very basic level.

The fundamental result data type in a relational DBMS is the stream of tuples, and tuples contain real data. In other words, querying (i.e. finding what matches some criteria) is essentially the same as presentation (i.e. getting real data out).

The fundamental result data type in a document-based DBMS is the sequence of document numbers. Document numbers do not contain real data. You perform a query, and get a sequence of document numbers. Then you let the user refine the sequence. Maybe you present some metadata, or KWIC information. Maybe you sort on a field. Maybe you add more constraints. Eventually, you get to the point where you present real data.
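
A rough Python sketch of that query/refine/present cycle, under loud assumptions: the corpus is three made-up strings, the index is a plain in-memory inverted index, and the names (`build_inverted_index`, `query`, `refine`, `present`) are hypothetical and not taken from any real engine. The point is only that everything before `present` deals in document numbers, not document content.

```python
# Hypothetical corpus: document number -> text. Illustrative data only.
docs = {
    1: "open source database engine",
    2: "relational database theory",
    3: "open source web server",
}

def build_inverted_index(corpus):
    # Map each term to the set of document numbers containing it.
    index = {}
    for doc_no, text in corpus.items():
        for term in set(text.split()):
            index.setdefault(term, set()).add(doc_no)
    return index

def query(index, term):
    # The result is a sequence of document numbers, not real data.
    return sorted(index.get(term, set()))

def refine(index, doc_nos, term):
    # Adding a constraint narrows the existing sequence; still no
    # document content is touched.
    matching = index.get(term, set())
    return [d for d in doc_nos if d in matching]

def present(corpus, doc_nos):
    # Only at the very end are the documents themselves fetched.
    return [corpus[d] for d in doc_nos]

index = build_inverted_index(docs)
hits = query(index, "database")
hits = refine(index, hits, "open")
print(present(docs, hits))
```

In a relational DBMS the first `SELECT` would already have pulled tuples with real data in them; here that cost is deferred until the user has narrowed things down.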

Yes, you can do this in principle with relational databases, but it costs. It's not uncommon for a relational model for a document database to use 5-10x the disk space of a native document database engine.

Damien Katz has for example said that he might consider supporting SQL syntax, but it's too early to say.

SQL is a poor fit for textual data, too: partly because support for textual querying operators (e.g. phrase or adjacency queries, query-based ranking) is poor, and partly because SQL has poor support for managing result sets on the server.

--
sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
A few options: by einhverfr · 2007-11-12 13:35 · Score: 2, Informative

1) TotalRekall (Python-based)
2) PgAccess (a TCL front-end and form builder for PostgreSQL)
3) Once:Radix (a web-based front-end builder for PostgreSQL).

OnceRadix is quite new and I think it is well thought out in a lot of ways.

--

LedgerSMB: Open source Accounting/ERP