Slashdot Mirror


Are Relational Databases Obsolete?

jpkunst sends us to Computerworld for a look at Michael Stonebraker's opinion that RDBMSs "should be considered legacy technology." Computerworld adds some background and analysis to Stonebraker's comments, which appear in a new blog, The Database Column. Stonebraker co-created the Ingres and Postgres technology while a researcher at UC Berkeley in the early 1970s. He predicts that "column stores will take over the [data] warehouse market over time, completely displacing row stores."

4 of 417 comments (clear)

  1. Re:They're not mutually exclusive. by XenoPhage · · Score: 5, Interesting

    Since when is a column store database and a relational database mutually exclusive concepts? I thought that both column store and row store (i.e. traditional) databases were just different means of storing data, and had nothing to do with whether a database was relational or not. I think the article misinterpreted what he said.

    Agreed. It definitely looks like a storage preference. Though column-based storage has definite benefits over row-based when it comes to store once, read many operations. Kinda like what you'd find in a data warehouse situation...

    Also, I don't think it's news that Michael Stonebraker (a great name, by the way), co-founder and CEO of a company that (surprise!) happens to develop column store database software, thinks that column store databases are going to be the Next Big Thing. Right or wrong, his opinion can't exactly be considered unbiased...

    Hrm.. You must be new here....

    --
    XenoPhage
    Technological Musings
  2. The guy... by AKAImBatman · · Score: 5, Interesting

    ...is duping himself and thus Slashdot is duping the stories by extension.

    Stonebraker has been pushing the concept of column-oriented databases for quite some time now, trying to get someone, ANYONE, to listen that it's superior. While I think he has a point, I'm not sure if he really goes far enough. Our relational databases of today are heavily based on the ISAM files of yesteryear. Far too many products threw foreign keys on top of a collection of ISAMs and called it a day. Which is why we STILL have key integrity issues to this day.

    It would be nice if we could take a step back and re-engineer our databases with more modern technology in mind. e.g. Instead of passing around abstract id numbers, it would be nice if we had reference objects that abstracted programmers away from the temptation of manually managing identifiers. Data storage is another area that can be improved, with Object Databases (really just fancy relational databases with their own access methods) showing how it's possible to store something more complex than integers and varchars.

    The demands on our DBMSes are only going to grow. So there's something to be said for going back and reengineering things. If column-oriented databases are the answer, my opinion is that they're only PART of the answer. Take the redesign to its logical conclusion. Let's see databases that truly store any data, and enforce the integrity of their sets.

  3. Re:Yea, it's all the same. by theGreater · · Score: 5, Interesting

    So it seems to me the -real- money is in integrating an RDBMS which, for usage purposes, is row-oriented; but which, for archival purposes, is column-oriented. This could either be a backup-type thing, or an aging-type thing. Quick, to the Pat(ent)mobile!

    -theGreater

  4. Re:IMS--Hierarchical DB harder to use? by cdn-programmer · · Score: 5, Interesting

    On the contrary.

    From a standard 3rd generation programing language one can read and write into flat files and we can do close to this with a hierachical database.

    We lose this with relational databases because the way the database organises data has no direct mapping to the way it might be set up in a standard programming language.

    What this means is that every transaction to and from the database must go through a literally horrible re-mapping. IE. The language data structures do not correspond to the RDBMS data structures and visa versa.

    As an example - in postgreSQL the last I looked at writing a simple row into a table where there were something like 100 columns in the row...

    In the 3rd generation programming languages this was just a simple structure with 100 entries.

    The data transfer from that structure generated a function call with more than 1000 parameters. This was to be mapped and re-mapped with each call to transfer data, this is even though the structure itself is static and determined at compile time.

    Next: There were about 10 parameters per field (column).

    1: Column name
    2: Column name length
    3: data type
    4: data length
    5: character representation ... etc

    finally 10: Address where the data lives.

    The thing is such a table could be set up very easily and populated with a simple loop that rolls in the required values via say a mapping function with about 10 arguments. This could be done ONCE at run time to prepare for the transfer of data and then the same table could be referenced for each call and simply an address could be sent with the transfer.

    Noooo.. It was dynamic and the data was encoded as parameters on the stack. This means the stack must be build and torn down and rebuilt for each call.

    Next - the implementation was so bad that the program would run in test mode with only a few parameter but it failed when the whole row was to be transfered.

    I gave up on that interface.

    ---------------

    Oracle had pre-compilers. They did the same damn thing. The code generated by the pre-compilers was just awful.

    ---------------

    While there is much good to say about RDBMS's in general. The issue I ran into was the interface from 3rd generation languages took a HUGE step backward. IMHO we should have a high level language statement called DBRead() and DBWrite(). In C this should generally correspond to fread() and fwrite(). If this is too complex then DBWriteStruct() could be implemented with suitable mapping helper function.

    Nooo...

    In the old days one could read and write into a flat file at a given location with a single statement or function call depending on the language. Of course "where" to read and write became a real issue and I do fully understand the complexity of file based tree structures and so forth, especially since I wrote a lot of code to implement these algorithms.

    The thing is now we have RDBMS and other solutions that give us the data organisational abilities we need - and we lose the ease of mapping these structures into a suitable structure or object in the programming language.

    I for one do not think we have stepped forward very far at all.

    -------------

    I'll toss in a case in point made by a good buddy of mine who just happens to be one of the top geophysical programmers in this city.

    One of his clients was running an application hooked to an Oracle database running on a fast SUN. Run times were measured in close to a day.

    Finally they removed the Oracle interface and replaced it with a glorified flat file. They clearly built in some indexing. The result is the run times dropped to under 20 minuets.

    As my buddy says - He will NOT use any RDBMS. He can take 5 views of the data comprising 1000's of seismic lines and the user can click on any trace number, line number, well tie and so forth and in real time he can modify all views of the data on as many as 5 s