The Future of Databases

"Nothing to see here. Please move along..." by codergeek42 · 2005-05-02 12:14 · Score: 0, Funny

Wow! That's a wonderful future! :-P

More like... by Anonymous Coward · 2005-05-02 12:14 · Score: 1, Funny

Turing award winner? by Anonymous Coward · 2005-05-02 12:15 · Score: 5, Funny

As in, he passed the Turing test?

Re:Turing award winner? by Anonymous Coward · 2005-05-02 12:33 · Score: 1, Funny

As in, he passed the Turing test?

That stiff? No, he won the award for interacting with someone who did.
Re:Turing award winner? by Khashishi · 2005-05-02 14:12 · Score: 2, Insightful

He must not be a slashdot user then.
Re:Turing award winner? by Erpo · 2005-05-02 20:04 · Score: 2, Funny

Turing award winner? As in, he passed the Turing test?
--
He must not be a slashdot user then.
--
Why do you say He must not be a slashdot user then.?
Re:Turing award winner? by sydb · 2005-05-02 23:36 · Score: 2, Funny

Do you often feel the need to ask questions like that?

--
Yours Sincerely, Michael.

Umm, Yep! by CypherXero · 2005-05-02 12:15 · Score: 0, Redundant

"...the greatest of these [research challenges] will have to do with the unification of approximate and exact reasoning. Most of us come from the exact-reasoning world -- but most of our clients are now asking questions that require approximate or probabilistic answers."

Just nod your head and agree...umm..yep, I concur!
Disclaimer: I don't know a damn thing about databases.

Re:Umm, Yep! by Sinus0idal · 2005-05-02 12:25 · Score: 4, Interesting

Imagine, for example, you want to use a database to store information about packets flowing through your network. That's all dandy on normal network links, but if we are talking about a multigigabit link, it is likely that your hard disk can't keep up with storing that data, or that the hardware to do so would cost too much. So instead you could take every second packet, look at that, and approximate. This particular example is referring to data stream management systems.
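
The "take every second packet and approximate" idea can be sketched roughly as follows. This is only an illustration in Python, not how any particular data stream management system works: the function name, the synthetic packet sizes, and the 1-in-2 sampling rate are all assumptions made up for the example.

```python
import random

def estimate_total_bytes(packet_sizes, keep_every=2):
    """Estimate the total byte count from a 1-in-N systematic sample.

    Only every `keep_every`-th packet is inspected; the sampled sum is
    scaled back up by the sampling rate to approximate the full total.
    """
    sampled_sum = 0
    for index, size in enumerate(packet_sizes):
        if index % keep_every == 0:   # keep packets 0, N, 2N, ...
            sampled_sum += size
    return sampled_sum * keep_every

# Synthetic traffic: these packet sizes are invented for illustration.
random.seed(0)
packets = [random.randint(64, 1500) for _ in range(100_000)]

exact = sum(packets)
approx = estimate_total_bytes(packets, keep_every=2)
relative_error = abs(approx - exact) / exact
```

The trade-off is the usual one: half the storage and processing, in exchange for an answer that is only approximately right and whose error depends on how uniform the traffic is.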
Re:Umm, Yep! by Anonymous Coward · 2005-05-02 14:47 · Score: 1, Insightful

Funny, I know a lot about databases, and I don't know what the hell he was talking about either. I think he was saying that it would be cool if Java was embedded in the DB, or something.

I guess I've just learned to tune out any article that talks about "XML-enabled object-relational databases" and other nonsense.

Show me the relational model. Show me how you've made a better implementation. If you haven't done either of those, go away. Leave the buzzwords for the vendors..oops, this guy IS a vendor. It all falls into place...

Pretty long by Anonymous Coward · 2005-05-02 12:16 · Score: 1, Funny

Could someone summarize it without using the letter 'e'?

Re:Pretty long by julesh · 2005-05-03 00:18 · Score: 1

Sure, here you go: SQL db's suck. OODBs don't. You can program your DB in Java or C#, isn't that cool?
Re:Pretty long by Woody77 · 2005-05-03 05:35 · Score: 2, Interesting

(this is serious)

Aside from the access mechanism on top, really, what's the difference? I've used both (OODBs heavily), and really, I've always looked at it as a bunch of tables with columns for member variables and rows for objects.

Is it really all that different under the hood? Or is this more marketing hype/spin?
Re:Pretty long by Awful+Truth · 2005-05-09 00:51 · Score: 1

You need Georges Perec but he's dead.
http://en.wikipedia.org/wiki/A_Void

approximate answers.. by ShaniaTwain · 2005-05-02 12:17 · Score: 4, Funny

"most of our clients are now asking questions that require approximate or probabilistic answers."

42..ish

--
Starsucks

Re:approximate answers.. by Tackhead · 2005-05-02 12:29 · Score: 1, Offtopic

> > most of our clients are now asking questions that require approximate or probabilistic answers.
>
> * 42..ish
Larry: Of course, he only had the two arms and the one head, and he called himself Jim Gray.

Melinda: But you must admit, he did turn out to be from another planet.

Larry: By my yacht! Melinda Gates!

it.slashdot.org: Infinity minus 1. Improbability sum now complete.

Larry: What are you doing here?

Melinda: With a degree in human-computer interaction and another in psychology, it was either that or back to refactoring Microsoft Bob into Longhorn::Clippy on Monday.

Bill: Oh God. Ford, this is Melinda. Hi. Melinda, my semi-cousin Ford, who shares three of the same mothers as me. Is this sort of thing going to happen every time we attempt to unify approximate and exact reasoning?

Melinda: Very probably, I'm afraid.

Bill: Bill Gates, this is a very large drink. Hi.
Re:approximate answers.. by djbckr · 2005-05-02 13:27 · Score: 1

Three words: Oracle's "MODEL" clause.
This clause, as part of a SELECT statement, lets you view data, kind of in a spreadsheet format, with the ability to forecast trends based on existing data.

Pretty heady stuff...
Re:approximate answers.. by Anonymous Coward · 2005-05-03 04:37 · Score: 0

Jobs: iLife. Don't talk to me about iLife. It's all sooo depressing.

Why complicate things so much? by bigberk · 2005-05-02 12:20 · Score: 5, Interesting

How many times have we heard of huge sites going down because databases become corrupt or unrecoverable, or of the huge resource strain (memory and CPU) from a large database?

In my opinion, the future of databases is nothing so complicated as pitched here -- but rather a move to simpler, more reliable back ends where the filesystem is the database. This is certainly the vision pitched by Hans Reiser and reiserfs, which aims to put more database like intelligence within the filesystem. So you eliminate extra unnecessary layers that just eat up resources and create fragile databases.

Re:Why complicate things so much? by fireboy1919 · 2005-05-02 12:29 · Score: 3, Informative

Ah yes. Harken back to the earlier days, when databases were just files on a file system, and did not distribute the resources at all.

Certainly that's not going to lead to more crashes.

Certainly it's a better idea than, for example, distributing the databases and using load-balancing and regularly scheduled back-ups to ameliorate the loss of the least reliable portions of a database's design - the hard drives.

When you've only got a hammer, everything seems like a nail...what does Hans Reiser do? He could be right. Microsoft is jumping on the filesystem-database wagon with their new filesystem, and we all know that if anyone knows and cares about reliability it's Microsoft.

--
Mod me down and I will become more powerful than you can possibly imagine!
Re:Why complicate things so much? by dioscaido · 2005-05-02 12:34 · Score: 4, Insightful

This random example just serves to clarify what you mean: how would you implement an airline database that has entries for 1,000,000 customers, 150,000 flights a year, and 12,000,000 reservations a year? And what would a query look like to find an open flight in a particular date range, and register a reservation? And how would doing all this on a ReiserFS be any less prone to data corruption than an often backed up database?
Re:Why complicate things so much? by Anonymous Coward · 2005-05-02 12:34 · Score: 0

Are you complicating things by re-implementing a dbms from scratch embedded within filesystem code? To add complication, such as functionality, is the essence of a new software release. :D
Re:Why complicate things so much? by Anonymous Coward · 2005-05-02 12:46 · Score: 1, Informative

On a journalled and properly transacted database, it could never cause corruption.

I've managed databases that are 150 GB large with hundreds of thousands of records in SQL Server 2000 and I have only ever seen one corruption. That was caused by a schema change on a large table during which the power to the machine was cut. When the machine came back online the database was marked as corrupted and set aside for about an hour while it was being recovered. The incomplete changes from the transaction log were rolled back and the database was back online as if nothing happened.
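
The "incomplete changes were rolled back" behaviour can be seen in miniature with any transactional engine. A minimal sketch, assuming SQLite through Python's sqlite3 module rather than SQL Server 2000; the accounts table and the simulated failure are invented for illustration, and an in-process exception stands in for the power cut (real crash recovery replays the log at restart instead).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` between accounts inside a single transaction."""
    try:
        # The connection context manager commits on success and
        # rolls back if the block raises.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            if fail_midway:
                raise RuntimeError("simulated failure between the two updates")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
        return True
    except RuntimeError:
        return False

ok = transfer(conn, 1, 2, 30, fail_midway=True)
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
```

After the failed transfer the first UPDATE is undone as well, so neither account shows a half-applied change.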
Re:Why complicate things so much? by dioscaido · 2005-05-02 13:08 · Score: 1

I agree. I'm trying to figure out what the original poster meant by their post, which seemed a bit nonsensical.
Re:Why complicate things so much? by Anonymous Coward · 2005-05-02 13:16 · Score: 0

"So you eliminate extra unnecessary layers that just eat up resources and create fragile databases."

Agreed - this manifests itself in IBM DB/2 in fact, where the native OS filecache gets in the way (believe-it-or-not) of DB/2's performance!

That happens because of double-buffering by DB/2 itself & then again by the OS cache!

(Which, imo, in Windows-NT based OS' because it is a logical (vs. block type) cache, which needs contiguous files for read-ahead to benefit it, is the problem... )

Databases, in general, are NOTORIOUS for updates/deletes & temp. files/scratch tables. That lends to file fragmentations, but also, under multi-user use patterns?

The generation of TONS of EXCESSIVE copy-on-write overheads in shared sections of set associative memory/data used on say, common records being simultaneously updated/edited etc. by multiple users...

(Say in a billing system scenario).

The filesystem itself being the db engine might help alleviate this type of thing occurring: This ill effect of double-buffering adverse OS cache conflicts w/ DB engines like IBM DB/2...

To research this problem? Look up the term:

Direct IO

and

DB/2

You'll see exactly what I mean, & it is noted that you have to set this on AIX... or, use another registry hack to the Operating System on NT-based Os', called DB2NTNOCACHE...

APK
Re:Why complicate things so much? by dioscaido · 2005-05-02 13:33 · Score: 1

Sorry, I was a bit unclear in my post. I meant that I'd like the original poster to clarify what they mean. Given the example I pose, how exactly would you 'go simple' and use reiserfs instead of a full SQL system that handles this problem perfectly?
Re:Why complicate things so much? by NineNine · 2005-05-02 13:33 · Score: 0

How many times have we heard of huge sites going down because databases become corrupt or unrecoverable, or of the huge resource strain (memory and CPU) from a large database?

Occasionally. And it's almost always due to user error, misconfiguration, or using the wrong tool for the wrong job. Your use of the word "site" makes it obvious that you think that databases are used primarily for websites, or that websites are a test of a real database. How often do you hear of Visa or Mastercard going down? How about the NYSE? How about banks? I would say that "sites" are rarely used as any kind of litmus test for databases. They're simply NOT "fragile".
Re:Why complicate things so much? by Anonymous Coward · 2005-05-02 13:44 · Score: 0

Hard drives, especially ever since modern technologies like RAID were invented decades ago, are not the least reliable portions of databases.
DBAs are. A great paper on TerraServer by Microsoft, written when they were shooting for some number of 9's uptime, revealed that operator error was the single biggest cause of downtime.

You might have been trying to be sarcastic; but in reality your statements were dead on and very true.
Re:Why complicate things so much? by quinnharris · 2005-05-02 13:45 · Score: 3, Interesting

Hans Reiser's vision is about unifying namespaces (filesystem, relational database, XML, etc...) by providing the functionality in a filesystem to make this reasonable. In other words, making the file system better than current databases.

Do we evolve the file system into a database (Reiser approach) or evolve a database into a file system (Microsoft WinFS approach)?
Re:Why complicate things so much? by jbplou · 2005-05-02 13:49 · Score: 1

Why even go to files, everything should be kept in one large file, that way we don't have to worry about opening multiple files or searching them; this approach is much simpler and will lead to no database crashes. Those extra layers like caching data pages, what did they do but eat up resources? Or the DBMS's annoying layers of backup options, who needs those? Indexes that make loading data quicker? Come on, we all know the b-tree concept was just made up; indexes actually do nothing. Finally, what's up with relationships? What a hassle. Let's just store all the data for all our records on every record and update it everywhere it occurs when we need to change it; that way we don't have the unneeded overhead of joining tables and only storing data once.
Re:Why complicate things so much? by drsmithy · 2005-05-02 13:59 · Score: 3, Informative

Microsoft is jumping on the filesystem-database wagon with their new filesystem, [...]
No, they're not. WinFS is *not* a filesystem, it's a DB layer that sits on top of the filesystem.

And when you consider NTFS *on its own* (like BFS) has the capabilities to do most of what WinFS is supposed to achieve, WinFS just looks sillier and sillier...
Re:Why complicate things so much? by Nutria · 2005-05-02 14:27 · Score: 2, Interesting

databases that are 150 GB large with hundreds of thousands of records

That's not very big. It's down right small, in fact. These figures, on one of many systems I manage, are about 30 minutes old. And they don't include index space, rollforward logs, etc, etc. Names have been changed for privacy, of course.

TABLE_NAME  CARDINALITY        TOT_BYTES
TABLE_1     850,719,662  195,665,522,260
TABLE_2     756,309,106  223,867,495,376
TABLE_3     317,181,446   72,951,732,580
TABLE_4     179,099,344   11,462,358,016
TABLE_5     103,419,546    4,343,620,932
TABLE_6      95,075,479    9,222,321,463
TABLE_7      67,378,918   20,820,085,662
TABLE_8      64,940,525   12,598,461,850

Since I am fully aware that "my" databases are no where near the biggest, this is not the beginning of a pissing contest.

--
"I don't know, therefore Aliens" Wafflebox1
Re:Why complicate things so much? by Tack · 2005-05-02 15:42 · Score: 2, Insightful

That's not very big. It's down right small, in fact. [...] [T]his is not the beginning of a pissing contest.

I must be missing something.

Jason.
Re:Why complicate things so much? by Baki · 2005-05-02 17:05 · Score: 1

Databases such as Oracle used to do this: use the underlying disks directly without an extra layer, i.e. use raw devices. In a sense, the database and filesystem were one. However there has been a move away from this: using a separate filesystem below makes it easier to manage and move around, while costing only about 1% performance.

Note: huge sites don't go down because of 1% performance loss due to an extra filesystem layer.

And I doubt very much that applications that use file-based storage instead of a real database are more reliable. Where do you get that idea from? Concurrently updating and searching complex structured data is hard, and doesn't get easier by doing things the simplistic way.
Re:Why complicate things so much? by Nutria · 2005-05-02 17:09 · Score: 1

I must be missing something.

GP thinks that 150GB is a large database.

--
"I don't know, therefore Aliens" Wafflebox1
Re:Why complicate things so much? by Anonymous Coward · 2005-05-02 17:11 · Score: 0

I thought Jim Gray gave up on dropping ACID at MS...now his mind is gone completely...
Re:Why complicate things so much? by Master+of+Transhuman · 2005-05-02 17:36 · Score: 1

Well, I suppose we could go back to the original UNIX way of doing things:

Everything is a string of bytes on the hard disk.

No file system at all.

Somehow I don't think that's going to take off. Grep is great - but it's not THAT great.

--
Richard Steven Hack - This sig is TOO GODDAMN SHORT TO DO ANYTHING USEFUL WITH! MORONS!
Re:Why complicate things so much? by Nefarious+Wheel · 2005-05-02 17:53 · Score: 1

Why even go to files, everything should be kept in one large file
Works for search, not quite so well for concurrent updates, and the security controls would sort of suck.

You'd be loading all the locks back into the applications layer. Google model works fairly well for lookup, distributed db leafs across multiple (many tuples) systems. However I suspect the lock mechanism behind the search, the population of the database, would be fairly arcane, and not general-purpose at all.

--
Do not mock my vision of impractical footwear
Re:Why complicate things so much? by Anonymous Coward · 2005-05-02 18:05 · Score: 0

The file system already *IS* a database. Albeit a simple one with a very limited set of attributes per tuple.

Oh, you mean in the iFS (Oracle) or WinFS (Microsoft) sense, where embedded attributes (like, the so-unused MS Office document attributes) in tuples (i.e., files), as well as content, is indexed in some fashion?

Well...they just suck in application, and seem to be a solution looking for a problem.

Perhaps, someone at Postgres or MySQL will figure out what kind of conditions qualify for some sort of Bayes-based indexing, both on foreign key relations, intra-table relationships and intra-data qualitative analyses.

Maybe Bayes indexing would provide a case of not requiring foreign keys?

Inserting data into a table would automatically run various Bayes classification/training operations. Then, perhaps "full-text indexing" would become slightly meaningful?

RDBMSs become corrupt sometimes because of data file integrity, but sometimes not.

Do you think that somehow leveraging the things that most database systems do well onto the filesystem will somehow make both better (i.e., database more robust and filesystem faster)?

ReiserFS is a clever way of spreading out file data at the physical layer, especially file description data. It cares nothing really about the human-relevant content of the files.

If some databases are "fragile", it's because they're under serious transactional loads, and perhaps should be on better hardware (i.e., IBM Z-Series). Throw 100000 simultaneous file i/o, create, delete and read-write operations on a Unix box, and you'll see how fragile the file system is, too.

Seriously I/O-bound OLTP databases tend to still run without problems on mainframes, even if they're COBOL or APL-based systems like Burroughs A15's.

Next come the monster 2^n systems from HP and Sun, and perhaps, P-series AS400 boxes.

It's kind of like arguing that there's got to be a better solution than a bicycle. I mean, it's like 100-year old tech! But in reality, no, there isn't really much better in general. The last great innovation was Campagnolo's derailleur, and that came out in the 1930's.

'Bents and other HPVs have their special niches where they out-perform a regular bicycle, and their adherents.

But, in general, it's hard to beat a regular bicycle.
Re:Why complicate things so much? by Anonymous Coward · 2005-05-02 18:13 · Score: 1, Interesting

I think really that WinFS is Microsoft's implementation of Oracle's iFS, the Internet File System. Remember that? (Oracle 8/8i days, but still installable from Oracle 9i).

Again, until there are sort of reliable and meaningful machine-based classification engines that can peep into all file types and extract and index some sort of "meaning", then it's going to always rely on a human to do the classifications.

Since no one uses document properties in MS office documents, even though Office has its own indexing engine, that other COM structured storage docs can do the same thing, as well as XML files, etc., how is a computer system going to keep up with things, especially when a relevant meaning or context today is totally different in 6 months?
Re:Why complicate things so much? by term8or · 2005-05-02 22:49 · Score: 1

That's not very big. It's down right small, in fact. [...] [T]his is not the beginning of a pissing contest.
I must be missing something.

The database dealing with a million passengers and 150,000 flights is not at the edge of database technology in terms of size. It is not small but it is certainly not huge compared to the UK government's new plan to produce a database that contains the details of its entire population of 60 million (i.e. address, biometric information etc) and the inevitable links to the database that contains details of all the cars in the UK, the database that contains all the medical information for citizens of the UK, the credit checking database, the banks' credit card / banking information database, the database that contains all the tax information of citizens in the UK... all of which are small compared to the equivalent databases for the USA, and civilian projects like the internet archive project.

--

"As a writer / novelist you might want to spellcheck your sig. :) " - AC
Re:Why complicate things so much? by Rich0 · 2005-05-02 22:58 · Score: 1

That's why /dev/hda is chmod 777 on my system. Who needs all this filesystem stuff anyway?
Re:Why complicate things so much? by dioscaido · 2005-05-02 23:19 · Score: 1

sigh... i should have written my post more carefully. My example was purposefully very simple. A system like the one I described would be very easy to implement in a regular database, with a handful of tables and a few stored procedures. I would have hoped the poster of the 'go simple' idea would have gone ahead and explained how such a system would have been done directly on a filesystem -- an idea that seems absolutely ridiculous.
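
For what it's worth, the "handful of tables" version might look something like the sketch below. It assumes SQLite through Python's sqlite3 module, with plain functions standing in for stored procedures (SQLite has none); the schema, column names, and sample rows are all made up for illustration and are nowhere near a real reservation system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE flights (
    id      INTEGER PRIMARY KEY,
    departs TEXT NOT NULL,       -- ISO date, so string comparison orders correctly
    seats   INTEGER NOT NULL
);
CREATE TABLE reservations (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    flight_id   INTEGER NOT NULL REFERENCES flights(id)
);
""")
conn.execute("INSERT INTO customers VALUES (1, 'Alice')")
conn.executemany(
    "INSERT INTO flights VALUES (?, ?, ?)",
    [(10, "2005-05-03", 1), (11, "2005-05-04", 2), (12, "2005-06-01", 5)],
)
# Flight 10 has one seat and one reservation, so it is already full.
conn.execute("INSERT INTO reservations (customer_id, flight_id) VALUES (1, 10)")
conn.commit()

OPEN_FLIGHTS = """
SELECT f.id
FROM flights f
LEFT JOIN reservations r ON r.flight_id = f.id
WHERE f.departs BETWEEN ? AND ?
GROUP BY f.id, f.seats, f.departs
HAVING COUNT(r.id) < f.seats
ORDER BY f.departs
"""

def open_flights(conn, start, end):
    """Return ids of flights in the date range that still have free seats."""
    return [row[0] for row in conn.execute(OPEN_FLIGHTS, (start, end))]

def reserve(conn, customer_id, start, end):
    """Book the earliest open flight in the range; return its id or None."""
    with conn:  # commit on success, roll back on error
        candidates = open_flights(conn, start, end)
        if not candidates:
            return None
        conn.execute(
            "INSERT INTO reservations (customer_id, flight_id) VALUES (?, ?)",
            (customer_id, candidates[0]),
        )
        return candidates[0]

before = open_flights(conn, "2005-05-01", "2005-05-31")
booked = reserve(conn, 1, "2005-05-01", "2005-05-31")
```

A production system would also need locking or isolation guarantees so two customers cannot grab the last seat at once, which is exactly the kind of thing the database layer is there to provide.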
Re:Why complicate things so much? by Anonymous Coward · 2005-05-03 00:29 · Score: 0

Compared to: How many times have we heard of huge sites staying up because the databases are properly maintained and recoverable, or the huge efficiencies gained by utilizing the database optimizer.
Re:Why complicate things so much? by mattspammail · 2005-05-03 01:44 · Score: 1

What an excellent point, except that you just described one of the main points of TFA.

One interesting development worth noting, however, has to do with the integration of database systems and file systems. Individuals who keep thousands of e-mail messages, documents, photos, and music files on their own personal systems are hard-pressed to find much of anything anymore. Scale up to the enterprise level, where the number of files is in the billions, and you've got the same problem on steroids. Traditional folder hierarchy schemes and filing practices are simply no match for the information tsunami we all face today. Thus, a fully indexed, semistructured object database is called for to enable search capabilities that offer us decent precision and recall. What does this all signify? Paradoxically enough, it seems that file systems are evolving into database systems--which, if nothing else, goes to show just how fundamental the semistructured data problem really is. Data management architects still have plenty of work ahead of them before they can claim to have wrestled this problem to the mat.

--
Now accepting PayPal donations!
Re:Why complicate things so much? by KagatoLNX · 2005-05-03 02:19 · Score: 1

This does not complicate things. It simplifies them. They are not talking about writing a database application, they are talking about writing DATABASES! So, since your question is about how to write a database-based application when the poster is offering a solution for a database implementation, this is kind of an apples-oranges situation.

To expound on Reiser4, it addresses the database problem-space by allowing individual files to contain extended attributes that may be queried on using a boolean algebra (think a table row with an optional payload).

As the query language used is very Reiser specific, and very general purpose (and not people oriented), I won't bother to construct a query, but it would not be an extraordinary feat of computer science to write an SQL to Reiser4 query converter.

More to the point, the tabular nature of most enterprise databases is the rub here. In Reiser4, you can have path structures, you can have indexed queries across these paths, and you can have it at database speeds. Try to do a hierarchical data structure in SQL. It can be done, but performance for some operations can be...suboptimal.
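
One common way to do a hierarchy in SQL is an adjacency list walked with a recursive query. A minimal sketch, assuming SQLite through Python's sqlite3 module; the nodes table and its rows are invented for illustration, and this says nothing about Reiser4's own query language or about how either approach performs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE nodes (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES nodes(id),  -- NULL for the root
    name      TEXT NOT NULL
)
""")
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?, ?)",
    [
        (1, None, "root"),
        (2, 1, "music"),
        (3, 2, "jazz"),
        (4, 1, "docs"),
        (5, 3, "live"),
    ],
)

# Recursive common table expression: start at one node, then repeatedly
# join in its children, building a slash-separated path as we go.
SUBTREE = """
WITH RECURSIVE subtree(id, path) AS (
    SELECT id, name FROM nodes WHERE id = ?
    UNION ALL
    SELECT n.id, s.path || '/' || n.name
    FROM nodes n JOIN subtree s ON n.parent_id = s.id
)
SELECT path FROM subtree ORDER BY path
"""

def subtree_paths(conn, root_id):
    """Return the path of every node at or below `root_id`."""
    return [row[0] for row in conn.execute(SUBTREE, (root_id,))]

paths = subtree_paths(conn, 2)
```

Each level of depth costs another round of joins, which is one reason a filesystem-style path lookup can look attractive for deeply nested data.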

Furthermore, Reiser4 not only has your standard ACID guarantees available, it also has what Hans calls a transcrash. Transcrashes are pretty much ACI but not so much D. This is very useful in a backend scenario where a cluster of machines provides the Durability (which is where I think databases are going).

Also, hard links beat any foreign key scheme I've ever fought with.

If you read what the poster means, new databases may be based on these filesystems. An SQL translator and distributed transaction processor based on Reiser4 would probably compete well in today's DB performance world. If not, I suspect Hans would make sure it did eventually. :)

I've always fancied writing an XML parser that backended in Reiser4 (containment = directory, attributes = xattrs) just to play with the indexing. When Reiser4 hits a stable tree, I might...

--
I think Mauve has the most RAM. --PHB (Dilbert Comic)
Re:Why complicate things so much? by poot_rootbeer · 2005-05-03 03:13 · Score: 2, Insightful

How many times have we heard of huge sites going down because databases become corrupt or unrecoverable, or of the huge resource strain (memory and CPU) from a large database?

How many times? Not all that many, in my experience. How many times when the sites were running off a hardy RDBMS like Oracle, rather than something in the MySQL range? Even fewer.

Of course, "websites going down" is not exactly the best indicator of database reliability in the first place...

While you're proposing making databases more like filesystems, what Reiser and others are actually doing is trying to make filesystems more like databases. That's an important distinction to note -- databases are the superior design.
Re:Why complicate things so much? by LordMyren · 2005-05-03 03:23 · Score: 1

"Do we evolve the file system into a database (Reiser approach) or evolve a database into a file system (Microsoft WinFS approach)?"

The answer is strikingly similar for both cases; we read a press release 4-15 years ahead of time and begin waiting...

Still waiting.

In the meanwhile, FUSE is available for those wishing to play around with the existing FS.

Myren
Re:Why complicate things so much? by zardo · 2005-05-03 03:34 · Score: 1

While we're at it, unify the database and the filesystem with object oriented technology. The two obviously have a common heritage.
Why are we still using relational databases?

Don't answer that. It's a rhetorical question. :)
Re:Why complicate things so much? by Anonymous Coward · 2005-05-03 03:40 · Score: 1, Insightful

As others have pointed out....you're talking out of your ass.

But...just for kicks...

How many times have we heard of huge sites going down because databases become corrupt or unrecoverable, or of the huge resource strain (memory and CPU) from a large database?

Um...never?

How many times have I seen a small badly coded site not respond to me because of an application error? A few...one this week, and perhaps one 6 months ago.

But the database going down? A REAL database? (Insert my disparaging view of mysql as not counting as a real database here). Umm..never.

Come on...let's get real here. "Fragile databases"...hahaha
Re:Why complicate things so much? by 2short · 2005-05-03 03:42 · Score: 1

If your database is more "fragile" than your file system, you are doing something very wrong. Half the point of a DB is to build on the unstable, corruptible base that is the file system, and achieve something stable and, in particular, incorruptible. The other half of what databases are for is querying power. If the queries you do with a database can be handled by a filesystem, you don't know what databases are for.

Reiser, as far as I can tell, wants to make filesystems more like databases. Because even for doing things handleable by a filesystem, some more DB-like query power and corruption resistance would be nice. But that doesn't mean file systems will now handle all the things DBs do.

In answer to your original question, I can't recall a huge site going down because a DB became corrupt, got an example? As for the resource consumption, the DB is not throwing the memory and CPU time away, it is doing stuff with them. You seem to be proposing "Just don't do that stuff." Um, yeah.
Re:Why complicate things so much? by petermgreen · 2005-05-03 04:03 · Score: 1

does wikipedia meet your definition of huge site?

remember the "power corrupts, power failure corrupts absolutely" incident

luckily there was one slave db server being used for offline queries that wasn't actively replicating at the time of the power failure and it was possible to recover this database server and bring it up to date from the binlogs. From here the data could be restored fairly quickly to the other db servers (it still took more than a few hours) and service restored.

if that non-replicating slave hadn't been there it may have been necessary to restore from a dump and replay a huge number of binlogs which could have taken a day or so.

it is believed that the problem was caused by hard drives claiming data was written to disc when in fact it was still in buffer.

--
note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
Re:Why complicate things so much? by peetola · 2005-05-03 04:16 · Score: 1

if anyone knows and cares about reliability it's Microsoft

I would say the majority of slashdotters know and care about sex, but they aren't really experts on the subject.
Re:Why complicate things so much? by Anonymous Coward · 2005-05-03 08:37 · Score: 1, Funny

"I must be missing something."

The GP was attempting to stress how important it is not to mistake a database for a urinal.
Re:Why complicate things so much? by adamgolding · 2005-05-03 23:40 · Score: 1

please enlighten me then--i am not a programmer, but i am frustrated by the filename-directory organization of the windows filesystem--how can NTFS work like a database? ok, i'm not sure a normal database can even do this, but I want to have music and PDF files and so on sortable/searchable by multiple chronological criteria--i.e. date of premiere, date RANGE of composition, date of first publication, various revision dates, date of recording, release date of recording, release date of the relevant compilations, date of encoding--and so on... don't we need a database integrated into the file system to achieve this kind of interaction with data?
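
As a rough illustration of what "multiple chronological criteria" amounts to on the database side, here is a sketch of a side index that maps file paths to any number of named dates. It assumes SQLite through Python's sqlite3 module and is not how NTFS or WinFS store metadata; the table, the date kinds, the file paths, and the dates themselves are all made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE file_dates (
    path TEXT NOT NULL,
    kind TEXT NOT NULL,   -- e.g. 'premiere', 'recording', 'encoding'
    day  TEXT NOT NULL,   -- ISO date, so string comparison orders correctly
    PRIMARY KEY (path, kind)
);
-- Index so that "all files with a <kind> date in <range>" is a range scan.
CREATE INDEX file_dates_by_kind_day ON file_dates (kind, day);
""")
conn.executemany(
    "INSERT INTO file_dates VALUES (?, ?, ?)",
    [
        ("music/a.mp3", "premiere", "1913-05-29"),
        ("music/a.mp3", "recording", "1960-01-05"),
        ("music/b.mp3", "premiere", "1824-05-07"),
        ("music/b.mp3", "recording", "1963-10-15"),
    ],
)

def files_between(conn, kind, start, end):
    """Return paths whose `kind` date falls in [start, end], oldest first."""
    rows = conn.execute(
        "SELECT path FROM file_dates "
        "WHERE kind = ? AND day BETWEEN ? AND ? ORDER BY day",
        (kind, start, end),
    )
    return [row[0] for row in rows]

old_premieres = files_between(conn, "premiere", "1800-01-01", "1900-12-31")
recordings = files_between(conn, "recording", "1900-01-01", "1999-12-31")
```

Whether this lives in a separate database next to the files or inside the filesystem itself is the question the thread is arguing about; the query side looks much the same either way.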
Re:Why complicate things so much? by hansreiser · 2005-05-04 02:45 · Score: 1

I don't think that whether it is a filesystem or a database will have much effect on reliability. That depends on who wrote it, and more importantly (I wish it were otherwise), how long ago they wrote it.

Reiser4 has some atomicity functionality that will make applications that lack the sophisticated logging features of databases more reliable and secure, and it will improve performance and simplify application code, but one should not imagine that it offers more reliability than a traditional database with sophisticated logging built in to it.

Hans Reiser
Author of ReiserFS
Re:Why complicate things so much? by bigberk · 2005-05-04 04:46 · Score: 1

Thanks for the reply Hans! Quite something seeing your post on slashdot. I guess I was wrong about the reliability being a consideration... but from having implemented a database system based on reiserfs I can say that the performance is spectacular compared to our earlier SQL version which for us was the main consideration.

A real problem comes full circle by Anonymous Coward · 2005-05-02 12:22 · Score: 5, Interesting

Data is data. Just data. Save it, read it, sort it how you like. Efficient results mean having rapid, low-latency access to data.

Add code to it, and you have data+code.

Of course, code is data, and thus data can be treated as code, and handled by other code. LISP does this moderately well.

But you can't avoid the fact that, as it stands, databases are just engines for keeping your data structures outside your code, or when you add code to them, engines for reading your data structure for you so that you don't have to think about how to do it. ... except that you still do, because SQL isn't a way of avoiding logical errors. ... and that they still don't save time. At best, they allow for some parallelism, external access to the data, and a separation of concerns.

I'm getting rather tired of the fad that databases should be tacked on to everything, ranging from a shopping list to guidance systems. When did adding overhead become the mark of skill?

Re:A real problem comes full circle by zappepcs · 2005-05-02 12:52 · Score: 4, Insightful

I think I agree with the parent. Databases are methods of storing and retrieving data. Trying to make queries fuzzy, or less structured is just wrong.

If you want to be able to ask probabilistic type queries of a database, you need to add some code between you and the database.

More to the point, the fuzzier your logic is, the higher the probability that your database will not contain all of the answers on its own, and you will have to cross reference your data to the data owned by someone else or gathered from a different disparate source.

It sounds like M$ is going to try to re-invent data warehousing? and then of course, patent it.

Trying to make the database do everything is not right and simply doesn't make sense. The code that accesses the data for you needs to do the fuzzy probabilistic stuff.

P.S. I have no faith that M$ (no matter who they hire) can effectively provide the code required to make it work in the idealistic manner spoken of... mostly because they would have to patent accessing other people's data before they could do it.

Just my thoughts

--
Support NYCountryLawyer RIAA vs People
Re:A real problem comes full circle by kfg · 2005-05-02 13:04 · Score: 2, Insightful

When did adding overhead become the mark of skill?

The second it became profitable to market it as such.

KFG
Re:A real problem comes full circle by NineNine · 2005-05-02 13:40 · Score: 0

Other than the fact that one of the article's authors works at Microsoft, there is not a single reference to Microsoft. What article are you reading? And if you think that MS is an important player with enterprise-class RDBMS', what are you smoking?
Re:A real problem comes full circle by NineNine · 2005-05-02 13:43 · Score: 1, Informative

When did adding overhead become the mark of skill?

Never. But using databases to encapsulate business logic (PL/SQL, for example) has been a mark of good developers and engineers for years. Apparently, you've missed that mark...
Re:A real problem comes full circle by Anonymous Coward · 2005-05-02 14:41 · Score: 1, Interesting

Thank you for your kind assessment of my abilities; something I had had great difficulty in achieving with the help of professional training.

Courtesies aside, I'm very well aware of the uses of encapsulation. I am in fact in the process of writing just such a system right now (Oracle, Python to handle a CGI frontend, and no, the choice of technologies cannot really be said to be my own).

I have been soul-searching in my effort to establish what, precisely, Oracle brought to the party. I identified two changes effected by its presence:

1) It permits readily scripted queries on accumulated data. This isn't as big a gain as it might seem, since the same could be said of a directory full of flat files. Or files in a format suitable for reading in and analysis by a custom-written program. But I will allow that this benefit does exist.

2) It adds a lot of overhead. Yes, Oracle is a beefy beast. So is MS-SQL, and Informix, and DB2, and Sybase, and many others.

The trouble is that they also add overhead in terms of manpower. All of a sudden your system administrators have more machines to watch. (You do run critical data on redundant systems, don't you? I mean, it is critical, right?)

They also add overhead in terms of cost. The packages themselves are expensive, and the machines to run them are expensive, and the electricity is expensive, and the cooling is expensive, and the floorspace is expensive, and when you need more, simply expanding isn't all that easy any more ...

They add a need for DBAs. It has been amply observed that your DBA can't be a fool; it's not a game for amateurs. Especially not with enterprise systems. Especially not these days.

And so on, and so forth. I'm sure anyone as deeply knowledgeable as you plainly are with the day-to-day realities of Fortune 50 companies will be able to add quite a few more line items in terms of cost, man hours, internal communication needs and so on.

And what did it buy? Anything which can't be achieved with a few well-chosen data structure libraries and routines? Anything which a developer, for all the hair-pulling over SQL's quirks on different platforms (and the quirks are there, rest assured) could not have done for the same trouble?

Maybe, if you're storing terabytes of data and utterly require hard guarantees of retrieval times to any given piece of data. I have worked in such an environment, and the funny thing is that for all the vaunted scalability and power of these enterprise database systems ... they have real trouble scaling to that level. So does everything else, or almost everything, but I find it interesting that at that level even the mighty Google rolls a lot of their own code. Specialised need? Yes, but look how flexible they've been in application of it.

Ultimately, I'm not saying the emperor has no clothes. He just looks rather natty in his polkadotted speedo.
Re:A real problem comes full circle by Anonymous Coward · 2005-05-02 15:09 · Score: 0

SQL isn't a way of avoiding logical errors

That's what constraints are designed to help with.
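
A minimal sketch of that point, assuming SQLite through Python's sqlite3 module. The `orders` table, its columns and the `try_insert` helper are made up for illustration. The CHECK constraints reject rows that break the stated rules, whichever piece of application code sends them:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE orders (
        id       INTEGER PRIMARY KEY,
        quantity INTEGER NOT NULL CHECK (quantity > 0),
        status   TEXT NOT NULL CHECK (status IN ('open', 'shipped', 'cancelled'))
    )
    """
)
conn.execute("INSERT INTO orders (quantity, status) VALUES (3, 'open')")
conn.commit()


def try_insert(quantity, status):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO orders (quantity, status) VALUES (?, ?)",
                (quantity, status),
            )
        return True
    except sqlite3.IntegrityError:
        return False


accepted_bad_quantity = try_insert(-5, "open")   # negative quantity
accepted_bad_status = try_insert(1, "lost")      # status not in the allowed set
row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

Constraints only catch the errors you thought to declare. A query that joins the wrong tables will still run happily.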
Re:A real problem comes full circle by pfafrich · 2005-05-02 21:44 · Score: 1

I think I agree with the parent. Databases are methods of storing and retrieving data. Trying to make queries fuzzy, or less structured is just wrong.
It's not so much a question of the data but more the way the actual data is structured. Databases impose a rigid structure on the data, or a leaky abstraction. If the data you have does not fit a rigid structure then the underlying assumptions of a database do not fit your needs well.

There are certain problems where this sort of rigid structure works well, say a catalogue of car parts. But there are others where the rigid structure can prove to be more of a pain than an advantage.

Take for example representing the medical records of a person. This has the potential to be a very complicated structure. Indeed any structure imposed may well fall down at some point. There will be patients with rare conditions, who will have some specialised diagnostic tests. The database designer will not know a priori how to represent the results from these tests. Fitting such data into the structure could be a maintenance nightmare (maybe this is why so many of the big government IT systems have such a hard time). We could play a fun game here: you propose a data structure, I'll come up with a counterexample which will not fit.

I guess the problem is we are trying to represent the world here. It's known that there is no ontology (classification system) which will serve all purposes. There are philosophical problems about the nature of language and how we can represent grammar and draw inferences. Yet what do databases offer us to represent textual data: a block of text! Fifty years of computers and the best method we've come up with for representing a richly structured piece of writing like a Wikipedia article is: a block of text. OK, there's a bit of markup there but it's all just stuffed together in a text blob. I might as well just dump it in a file for all the good a database does, indeed that's what Yahoo does.

To my mind databases are broken beyond belief.

--
There are four sorts of people in the world: fools, lunatics, idiots and morons. - Umberto Eco, Foucault's Pendulum.
Re:A real problem comes full circle by Anonymous Coward · 2005-05-02 22:18 · Score: 0

The database designer will not know a priori how to represent the results from these tests. Fitting such data into the structure could be a maintenance nightmare

Nope. Database schemas change all the time.
If you aren't going to give any more information than "there will be results", you are right, this can't be designed for.
But, shirley there will be a 'positive' or a 'negative', so we can store that. The test will have a name so we can store that.
Now, what else do you need? Name of Dr. that ran it? A likelihood of false positive? Likelihood of regression?

In fact, even more to the contrary, rigidly structured data is the *worst*: trees go into a database very very badly. Just about every database you'll ever see has some dirty trick to shoehorn a natural tree datatype into a relation. ("well, we'll just add a new client record for each department!")
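
For what it's worth, one common way of shoehorning a tree into a relation is an adjacency list: each row carries a pointer to its parent. This is a sketch only, assuming an engine with recursive common table expressions (SQLite here, via Python's sqlite3). The `department` table and the `subtree` helper are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE department (
        id        INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        parent_id INTEGER REFERENCES department(id)  -- NULL for the root
    )
    """
)
conn.executemany(
    "INSERT INTO department VALUES (?, ?, ?)",
    [
        (1, "Company", None),
        (2, "Engineering", 1),
        (3, "Sales", 1),
        (4, "Databases", 2),
        (5, "Storage", 4),
    ],
)


def subtree(root_id):
    """Return (name, depth) for the given node and everything beneath it."""
    return conn.execute(
        """
        WITH RECURSIVE sub(id, name, depth) AS (
            SELECT id, name, 0 FROM department WHERE id = ?
            UNION ALL
            SELECT d.id, d.name, sub.depth + 1
            FROM department d JOIN sub ON d.parent_id = sub.id
        )
        SELECT name, depth FROM sub ORDER BY depth, name
        """,
        (root_id,),
    ).fetchall()


engineering = subtree(2)
```

It works, but the tree shape lives in the query rather than in the schema, which is arguably the awkwardness being complained about.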
Re:A real problem comes full circle by chthon · 2005-05-02 23:30 · Score: 1

When did adding overhead become the mark of skill?

When Microsoft started dominating the market.
Re:A real problem comes full circle by Ed+Avis · 2005-05-03 00:42 · Score: 1

SQL isn't a way of avoiding logical errors
No language is magic and can prevent all errors, but in general moving from a lower-level to a higher-level language can help avoid some kinds of mistakes. You're less likely to get memory trampling bugs in Lisp than in assembler, though there are other tradeoffs. SQL is a higher-level way of manipulating data than most programming languages: what might take a hundred lines of Java can often be done in ten lines of SQL. That in itself reduces the chance of logic bugs. Then SQL also has strong type checking which also (in my experience) helps catch programming errors.

Most RDBMSes give you more than just SQL: they also offer ACID properties with transactions, which simplify a lot of programming work. If something goes wrong halfway through an operation, you can tell the database to roll back and everything is put cleanly back how it was at the beginning of the transaction. This is often a lot easier than writing your own undo mechanisms to start making changes to a data structure and then unpick them.
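
A minimal sketch of that rollback behaviour, assuming SQLite through Python's sqlite3 module. The `account` table and the `transfer` helper are invented for the example. The credit is applied first, the debit then violates a CHECK constraint, and the rollback undoes the credit without any hand-written undo code:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(src, dst, amount):
    """Move money between accounts; either both updates happen or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances():
    return dict(conn.execute("SELECT name, balance FROM account"))


ok = transfer("alice", "bob", 30)        # fits within alice's balance
failed = transfer("alice", "bob", 500)   # would overdraw alice, so it is undone
final = balances()
```
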
I'm getting rather tired of the fad that databases should be tacked on to everything, ranging from a shopping list to guidance systems.
If it is a fad, it is one that has been around for thirty years or more. If hardware in 1985 was powerful enough to run a relational database and offer full transaction safety, it's powerful enough now. I think the current fad is for people to ignore the past few decades of research and experience and try to do without a database system. Good luck to them, but they should make sure they don't need the useful things an RDBMS gives you.

--
-- Ed Avis ed@membled.com
Re:A real problem comes full circle by Anonymous Coward · 2005-05-03 01:21 · Score: 0

The push to put code in the database is a marketing scheme to get projects deeply coupled to a single vendor solution. e.g. SQL Server. This is why MS loves it.
Re:A real problem comes full circle by Anonymous Coward · 2005-05-03 01:48 · Score: 0

What are opions (sic)?
Re:A real problem comes full circle by Anonymous Coward · 2005-05-03 01:49 · Score: 1, Interesting

Uh, no, you misunderstood the problem. The point of fuzzy database queries is to get queries that return faster or give early partial results. You can't do that efficiently outside the database architecture because you don't have access to all the internal stats the DB is using.

So you can't just "add some code between you and the database."

This is a real-world problem, not just something he made up. Today we have enormous terabyte databases that can take forever to do a query, so finding ways to get faster, but approximate, answers is something I know many people are researching currently.
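
To make "faster, but approximate" concrete, here is a toy sketch of the statistical idea only. It samples on the client side, which is exactly what the comment above says does not scale. A real engine would sample inside the storage layer. The data, the `approximate_mean` function and the fixed seed are all assumptions for illustration:

```python
import math
import random

# Stand-in for a huge table column: 100,000 values cycling through 0..99.
values = [i % 100 for i in range(100_000)]
exact_mean = sum(values) / len(values)  # the slow, exact answer


def approximate_mean(data, sample_size, seed=42):
    """Estimate the mean from a random sample and report a standard error."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    sample = rng.sample(data, sample_size)
    mean = sum(sample) / sample_size
    variance = sum((x - mean) ** 2 for x in sample) / (sample_size - 1)
    std_error = math.sqrt(variance / sample_size)
    return mean, std_error


estimate, std_error = approximate_mean(values, 2_000)
print(f"exact={exact_mean:.2f}  estimate={estimate:.2f} +/- {2 * std_error:.2f}")
```

The estimate reads 2% of the rows and comes with an error bar, which is the kind of answer the article is talking about.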
Re:A real problem comes full circle by Inkieminstrel · 2005-05-03 03:21 · Score: 1

Perhaps he is smoking Microsoft SQL Server?
Re:A real problem comes full circle by Anonymous Coward · 2005-05-03 03:55 · Score: 0

I disagree. A good database is NOT just data. It's data, organized in a logical way, with a set of indexes lurking in the background that let you get to that data in an efficient way.

So what is wrong with having the internal indexing mechanism understand and build fuzzy indexes? If the goal is efficiency, then I should not be forced to build an additional index between my front end and my database. That is slow, and that is wrong.
Re:A real problem comes full circle by WaterBreath · 2005-05-03 04:05 · Score: 2, Insightful

Yet what do databases offer us to represent textual data: a block of text! Fifty years of computers and the best method we've come up with for representing a richly structured piece of writing like a Wikipedia article is: a block of text. OK, there's a bit of markup there but it's all just stuffed together in a text blob. I might as well just dump it in a file for all the good a database does, indeed that's what Yahoo does.
To my mind databases are broken beyond belief.

Well let us know when you think of an alternative to expressing concepts through some sort of language, in a way that simultaneously allows the measure of definition and ambiguity that all conceptual (i.e. ideas, not strict data) communication requires. I'm sure there will be great fanfare, as it would revolutionize life for all humanity.

I propose an alternative complaint that gets more to the source of the issue:

"Thousands of years of communication and the best method the human race has come up with for representing a rich tapestry of ideas and concepts is: words. Aural and written communication. Ok, sure writing has a bit of markup in there and speech has little pauses strewn about, to delimit blocks of thought, but it's all just stuffed together as a stream of letters or phonemes. I might as well just draw a picture in the dirt.

To my mind, languages are broken beyond belief."

Leaky != broken. However, using the improper abstraction = a waste of time, money, and effort. The problem is probably not that databases "only" support text as "blobs" of characters. It is more likely that people insist on applying a data-oriented abstraction (it's right there in the name: DATA-base) to a fluid body of information that requires human language.
Re:A real problem comes full circle by pfafrich · 2005-05-03 05:39 · Score: 1

Good response.
As to alternatives, ideas from the semantic web get somewhere close. This is actually a real problem for me. We're trying to produce an open source plant "database", permaculture.info. We keep running into big problems with data representation: any schema we use seems to be very weak, with holes in it, instantly limiting what can be represented. Flexibility and extensibility are central requirements and, as you say, RDBMSes are probably the wrong tool.

Is language that broken? Language is sort of words plus grammar, both of which have shown great potential for flexibility and extensibility.

Interesting time.

--
There are four sorts of people in the world: fools, lunatics, idiots and morons. - Umberto Eco, Foucault's Pendulum.
Re:A real problem comes full circle by Decibel · 2005-05-03 10:39 · Score: 1

The problem with your reasoning is that it becomes extremely expensive as database size grows. The reason to add analytic capabilities to the database is that it's more efficient. 10 years ago, the same argument could be (and was) made about adding OLAP functions to databases. Yet now they're in all of the 'big 3' in one form or another.

If you only need a database as a way to store and retrieve data then you don't need a database, you need a flat file. :P
Re:A real problem comes full circle by PigleT · 2005-05-03 23:05 · Score: 1

Well yes. The other perspective of this is that if you write for a RDBMS, it takes over your programming life, particularly if you have to optimize the number of queries in order to minimize network-related latency.

OK, now I've just realised what I want most is a generic tie - whatever your data, serialize it (like pickling in python), with random-access updates/retrieval. Hmm :)
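
Something close to that already ships with Python: the standard-library `shelve` module is a dict-like store that pickles arbitrary values and reads or updates them by key. A small sketch follows. The keys, values and temp-file location are made up for the example:

```python
import os
import shelve
import tempfile

# Put the store in a scratch directory so the sketch leaves no clutter behind.
path = os.path.join(tempfile.mkdtemp(), "store")

# Write: any picklable value can be stored under a string key.
with shelve.open(path) as db:
    db["song:42"] = {"title": "Example", "recorded": "1999-04-01", "tags": ["live"]}
    db["doc:7"] = ("a tuple", 3.14)

# Random-access read and update, without loading the whole store.
with shelve.open(path) as db:
    record = db["song:42"]
    record["tags"].append("remastered")
    db["song:42"] = record  # reassign: in-place mutation is not saved by default
    keys = sorted(db.keys())
    tags = db["song:42"]["tags"]
```

What it does not give you is queries, transactions or concurrent writers, so it answers the "generic tie" wish rather than replacing an RDBMS.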

--
~Tim
--
.|` Clouds cross the black moonlight,
Rushing on down to the circle of the turn

Great Article by Spaceman40 · 2005-05-02 12:23 · Score: 5, Insightful

The requirements for a database today aren't too much different from those twenty years ago - except for what we want to get out of them.

Now that data mining is a $[insert large number here]million industry, databases are being asked to do a lot more processing with this data than before. For example: old database query = get these attributes from tuples that match this pattern. New database query = determine how likely a user who has accessed 30 or more times this last month is to subscribe to the second-level pay service within the next ninety days, with or without an email advertising said service.

--
I [may] disapprove of what you say, but I will defend to the death your right to say it.

In other words ... by Daniel+Dvorkin · 2005-05-02 12:26 · Score: 5, Insightful

... MBA's want the magic glowy box to do their thinking for them.

Fortunately, Microsoft will be there to take their money.

--
The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.

Re:In other words ... by Anonymous Coward · 2005-05-02 12:32 · Score: 1, Informative

When all the glowy boxes said sell the stock market crashed.
Re:In other words ... by furchin · 2005-05-02 12:39 · Score: 4, Funny

MBA's want the magic glowy box to do their thinking for them.

If I had to pick between a magic glowy box and an MBA to show signs of intelligence, I'm definitely going with the magic glowy box.
Re:In other words ... by YrWrstNtmr · 2005-05-02 12:40 · Score: 5, Funny

I don't know how many times I've heard that thought process over the years.
[MBA tool]"I want to come in in the morning, push a button, and have the program distribute all my stuff."

[me]"If I could make it do that, I could make it push its own button, and the company wouldn't need you anymore."

[MBA tool]"Oh."
Re:In other words ... by Daniel+Dvorkin · 2005-05-02 12:44 · Score: 1

Good point.

--
The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.
Re:In other words ... by the+eric+conspiracy · 2005-05-02 13:08 · Score: 1

Once they have the glowy box that thinks they will start looking for a way to outsource it.
Re:In other words ... by jallen02 · 2005-05-02 13:39 · Score: 1

So... what would someone with a computer science degree and an MBA represent in your universe?

A business person masquerading as a computer scientist?

A paradox?

The end of the world?

J
Re:In other words ... by Anonymous Coward · 2005-05-02 13:46 · Score: 0

there are plenty of people with CS degrees who couldn't code their way out of a paper bag. I work with some of them.
Re:In other words ... by jallen02 · 2005-05-02 15:32 · Score: 2, Interesting

Tragic really. I have seen it as well. I always held CS people to a higher standard for coding.. what with the hours of courses you spend just learning how to design and implement basic things like data structures. How do people make it out of college without being able to code these things? It always amazes me.

Jeremy
Re:In other words ... by identity0 · 2005-05-02 19:53 · Score: 1

My magic glowy ball says, "All signs point to yes".
Re:In other words ... by Rich0 · 2005-05-02 23:08 · Score: 1

I know somebody who works for a defence department contractor. He passed around the joke that there are only 10 types of people in this world - those who understand binary and those who don't. One person in the room didn't get it. Suddenly everybody understood the QA problems they were having...
Re:In other words ... by jallen02 · 2005-05-03 01:36 · Score: 1

lol yeah.

My co-worker has a think geek shirt with that saying on it. It is quite fun when you are out at lunch and someone exclaims, "Only ten people, why aren't they listed on the shirt?".

J
Re:In other words ... by dykmoby · 2005-05-03 02:58 · Score: 1

Sometimes you can (and it's rare) actually build the program that has its own button and sends out all the stuff.

Managed to build a couple of star schemas that reduced a 2 week, 2 MBA per month process to something like 5 hours.

They were thrilled until they realized they had effectively half the work to do.

A fight to the death between two MBAs is sick, sad yet funny

--
Fear, Uncertainty and Doubt = [citation required]
Re:In other words ... by Anonymous Coward · 2005-05-03 04:01 · Score: 0

Tragic really. I have seen it as well. I always held CS people to a higher standard for coding.. what with the hours of courses you spend just learning how to design and implement basic things like data structures. How do people make it out of college without being able to code these things? It always amazes me.

I don't get it either.. The worst is when I see an "experienced" developer making the most ridiculous rookie mistakes. The kinds of things you are supposed to learn to avoid when you first start learning. I think some people just don't have the passion or drive to be good at what they do and just do the minimum to coast by and get a paycheck. I don't know how these people can keep their jobs; and how they managed to get their degree is a mystery to me as well.

Question by Anonymous Coward · 2005-05-02 12:28 · Score: 1, Funny

Could indeed be useful at Microsoft.

At support desk:
SELECT user, probability(likely_madness_level) FROM caller_queue;
-- to see if you should take the next call or let the person next to you take it.

At beginning of important day:
SELECT probability(crash_counter) FROM computer_log WHERE date=now();
-- to see if you should go on with your important report or just call it a day and play minesweeper.

Re:Question by Anonymous Coward · 2005-05-02 12:33 · Score: 1, Funny

SELECT * FROM funny WHERE post_id = 12414849;

Empty set (0.00 sec)
Re:Question by Anonymous Coward · 2005-05-02 12:56 · Score: 0

SELECT * FROM clever WHERE user_name LIKE '%Anonymous Coward%'

Oh wait....
Re:Question by Anonymous Coward · 2005-05-02 12:58 · Score: 0

SELECT * FROM your_ass WHERE my_foot = TRUE;

$article_title by $blowhard. by Anonymous Coward · 2005-05-02 12:28 · Score: 5, Funny

$technology is dying. It will be replaced with $replacement. Insert 4000 more words sprinkled with $random_buzzwords. I am so smart! The end.

Bioinformaticists (and spies) use this a lot by Dioscorea · 2005-05-02 12:31 · Score: 5, Informative

most of our clients are now asking questions that require approximate or probabilistic answers

Bioinformatics databases are a good example of this. DNA and protein sequence databases are often searched by approximate string-matching algorithms based on "dynamic programming" to hidden Markov models and other stochastic grammars.
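
As a rough illustration of the dynamic programming involved (plain edit distance only, not the hidden Markov model machinery mentioned above), the sketch below finds the smallest number of edits needed to match a pattern against any substring of a longer sequence. The function name and the test sequences are made up for the example:

```python
def best_match_distance(pattern, text):
    """Smallest edit distance between pattern and any substring of text."""
    # prev[j]: cheapest way to match the pattern prefix seen so far against
    # some substring of text that ends at position j.
    prev = [0] * (len(text) + 1)  # an empty pattern matches anywhere for free
    for i, p in enumerate(pattern, start=1):
        curr = [i] + [0] * len(text)  # pattern[:i] against nothing costs i
        for j, t in enumerate(text, start=1):
            curr[j] = min(
                prev[j] + 1,             # drop p from the pattern
                curr[j - 1] + 1,         # skip t in the text
                prev[j - 1] + (p != t),  # match or substitute
            )
        prev = curr
    return min(prev)  # the match may end anywhere in the text


exact_hit = best_match_distance("GATTACA", "CCGATTACACC")
near_hit = best_match_distance("ACGT", "TTACCTTT")
```

The cost is proportional to pattern length times database length, which is why hardware acceleration for these searches found a market.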

Historically, drug target-hunters in Big Pharma created a market for accelerated hardware to facilitate dynamic programming searches, some of which (e.g. Paracel's Fast Data Finder chip) was originally marketed to government agencies who, um, shared an interest in approximate string-matching ;)

Accountable bitemporal DBs by G4from128k · 2005-05-02 12:32 · Score: 3, Informative

The rise of Sarbanes-Oxley highlights a key insecurity in the accountability of enterprise systems. Although the high-level applications can do a good job of tracking who did what to the financial data, the core DB may be open to tampering. If a DB admin with the right password can manually diddle a field in a database, they can change the financials of the company.

In contrast, a secure bitemporal DB would record not only the date of what the data refers to (e.g., the purchase order was entered on March 3rd, 2004) but also the date(s) of any modifications of the data (the quantity and total were changed on December 31, 2004, Uh-Oh!).

This is more than just securing the DB with a hierarchy of privileges, it means that no one can overwrite the old data or change any data without creating an audit trail. This, of course, also means changes in the DB, OS or file system to make critical data only accessible through a secure DB layer that tracks changes (e.g., no accessible plain-text DB data structures). These same concepts could be used (probably are, for all I know) for OSS version control to track who did what and when to the code.
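
A toy sketch of the bitemporal idea, assuming SQLite via Python's sqlite3. The `po_history` table, its triggers and the helper functions are hypothetical. Every change is a new row carrying both the date the fact refers to and the time it was recorded, and triggers refuse UPDATE and DELETE. Note the triggers only guard access through SQL and do nothing against someone editing the file on disk:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE po_history (
        po_id       INTEGER NOT NULL,
        quantity    INTEGER NOT NULL,
        valid_on    TEXT NOT NULL,  -- date the fact refers to
        recorded_at TEXT NOT NULL,  -- when this version was written
        recorded_by TEXT NOT NULL
    );
    CREATE TRIGGER po_no_update BEFORE UPDATE ON po_history
    BEGIN SELECT RAISE(ABORT, 'history is append-only'); END;
    CREATE TRIGGER po_no_delete BEFORE DELETE ON po_history
    BEGIN SELECT RAISE(ABORT, 'history is append-only'); END;
    """
)


def record(po_id, quantity, valid_on, recorded_at, recorded_by):
    """Append a new version; older versions are never touched."""
    with conn:
        conn.execute(
            "INSERT INTO po_history VALUES (?, ?, ?, ?, ?)",
            (po_id, quantity, valid_on, recorded_at, recorded_by),
        )


def quantity_as_known_at(po_id, when):
    """What did the database say about this order at time `when`?"""
    row = conn.execute(
        "SELECT quantity FROM po_history"
        " WHERE po_id = ? AND recorded_at <= ?"
        " ORDER BY recorded_at DESC LIMIT 1",
        (po_id, when),
    ).fetchone()
    return row[0] if row else None


def overwrite_blocked():
    """Return True if an in-place UPDATE is refused by the trigger."""
    try:
        with conn:
            conn.execute("UPDATE po_history SET quantity = 1 WHERE po_id = 1")
        return False
    except sqlite3.DatabaseError:
        return True


record(1, 100, "2004-03-03", "2004-03-03T09:00:00", "clerk")
record(1, 250, "2004-03-03", "2004-12-31T23:50:00", "dba")  # the Uh-Oh change
mid_year = quantity_as_known_at(1, "2004-06-01T00:00:00")
year_end = quantity_as_known_at(1, "2005-01-01T00:00:00")
blocked = overwrite_blocked()
```
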

--
Two wrongs don't make a right, but three lefts do.

Re:Accountable bitemporal DBs by jdhutchins · 2005-05-02 13:05 · Score: 1

As long as the data's there, it can be changed. The only thing you can do about it is turn to verified cryptographic solutions. As long as the data is there, someone will have the password necessary to change it. If you can't do it from within the database program, you can edit the file on disk and change it that way.
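
One cryptographic approach along those lines is a hash chain: each log entry's hash covers the previous entry's hash, so altering an old record breaks every hash after it. This is a sketch under assumptions, with invented names (`append`, `verify`, the payload fields). It only helps if the latest hash is also kept somewhere the editor cannot reach, since anyone who can rewrite the whole log can recompute the chain:

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash, payload):
    """Hash the previous entry's hash together with this entry's content."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + body).hexdigest()


def append(log, payload):
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"payload": payload, "hash": entry_hash(prev, payload)})


def verify(log):
    """Recompute the chain from the start; any edited entry breaks it."""
    prev = GENESIS
    for entry in log:
        if entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True


log = []
append(log, {"po": 1, "quantity": 100, "by": "clerk"})
append(log, {"po": 1, "quantity": 250, "by": "dba"})
intact = verify(log)

log[0]["payload"]["quantity"] = 999  # someone edits the file on disk
tamper_detected = not verify(log)
```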
Re:Accountable bitemporal DBs by Anonymous Coward · 2005-05-02 13:08 · Score: 0

Yeah. Right.

Then I guess we need to encrypt the backup, as well. In case the malicious admin decides to change that and restore. So who's got the keys?

And what about the case where your accounting app screws up and you actually *need* the admin to go change the value in the DB? Don't laugh, a company I worked for had that as a standard operating procedure for software we sold.

Look, I'm all for accountability and all that, but at some point you need to trust *somebody*. Hire a DBA you can trust.
Re:Accountable bitemporal DBs by Anonymous Coward · 2005-05-02 13:29 · Score: 0

And what about the case where your accounting app screws up and you actually *need* the admin to go change the value in the DB? Don't laugh, a company I worked for had that as a standard operating procedure for software we sold.

He's not saying you can't change the data, only that you can't change the data without leaving an auditable trail.
Re:Accountable bitemporal DBs by Anonymous Coward · 2005-05-02 13:46 · Score: 0

If you can edit the data on the disk, you can change anything you want. The only way to make a secure audit trail is to write the logs to a write-once medium like cd-roms.
Re:Accountable bitemporal DBs by TopSpin · 2005-05-02 13:46 · Score: 3, Insightful

The rise of Sarbanes-Oxley highlights a key insecurity in the accountability of enterprise systems.

Yeah, I've heard that one too. Reality has a way of factoring out the ambiguity of such abstract, open-ended claims.

One way to deal with the problem of DBAs and their ability to access/modify financial data is to register them with the exchange, just like the finance and executive types. Now they're Sarbanes-Oxley insider compliant! That's what has been done where I earn my living.

Thus, we may dispense with elaborate schemes of secure data version control using unspecified, hypothetical systems, paid for with budgets that don't exist. Next!

Until some future revision of Sarbanes-Oxley begins to specify the design and implementation of electronic finance systems, no one can claim a database is more or less susceptible to malfeasance than a locked filing cabinet. That's why the auditors stop once they've concluded you're changing your password with adequate frequency.

--
Lurking at the bottom of the gravity well, getting old
Re:Accountable bitemporal DBs by Anonymous Coward · 2005-05-02 14:01 · Score: 0

My bad.

I stand by my point, tho. You can't prevent everyone from not leaving a trail unless you go the whole NGSCB route, and even then, you need to trust somebody with the key. If it's a third party, you're fucked when (not if) you need to restore from backup *now*. If it's in house, then there's no point.

And of course the audit trail is only as good as the weakest password...
Re:Accountable bitemporal DBs by Nutria · 2005-05-02 14:46 · Score: 1

In contrast, a secure bitemporal DB would record not only the date of what the data refers to ... but also the date(s) of any modifications of the data

You mean your RDBMS doesn't have full auditing capabilities?

What are you using, SQL Server?

Any "enterprise" RDBMS worth its salt has had such features for 20 years.

Of course, before you enable full auditing, you'd better double your IO capacity, as well as increase your CPU capacity.

--
"I don't know, therefore Aliens" Wafflebox1
Re:Accountable bitemporal DBs by magarity · 2005-05-02 15:49 · Score: 1

The rise of Sarbanes-Oxley

You misspelled 'unholy birth'.
Re:Accountable bitemporal DBs by Anonymous Coward · 2005-05-02 18:30 · Score: 0

The rise of Sarbanes-Oxley [slashdot.org] highlights a key insecurity in the accountability of enterprise systems. Although the high-level applications can do a good job of tracking who did what to the financial data, the core DB may be open to tampering. If a DB admin with the right password can manually diddle a field in a database, they can change the financials of the company.

I remember arguing about database security with an HP 3000/9000 guy. Apparently, on those systems, there *IS* no root user/sys/sysdba/sa-equivalent. There is no user or privilege level that allows unrestricted, unfettered access to all data.

Audit trails suck performance (i.e., trigger-based data modification logging). And they need to be...well...audited. Again, it all comes down to the human factor. You can only put so much into the computer system, and at some level you have to trust the people operating the system. And anyone can disregard an audit that shows there are problems, either because they're colluding, or they just don't care.

Secure bitemporal DBs can be set up already. It's not some black magic. If you're really sneaky, your DBAs set up a box that has database snapshots stored on a separate machine, and compares the current database version with the snapshot at the time of the snapshot, to discover discrepancies between the systems...

SarbOx really comes down to holding people that have in the past been far less accountable to a level more in line with the rest of their employees. If the CFO signs off on a report, his ass is on the line now. So he's inherently going to do a bit more due diligence before he signs his name.
Re:Accountable bitemporal DBs by fabu10u$ · 2005-05-02 22:49 · Score: 1

This, of course, also means changes in the DB, OS or file system to make critical data only accessible through a secure DB layer that tracks changes (e.g., no accessible plain-text DB data structures).
But wait! There's more! Expect all the DB vendors to do this in their next major releases because it also happens to create lock-in to their product. "Migration tools could be used for tampering! Sarbanes-Oxley made us do it!"

--
They say the mind is the first thing to ... uh, what's that saying again?
Re:Accountable bitemporal DBs by LordMyren · 2005-05-03 03:28 · Score: 1

what's to keep the programmer from backdooring some way to alter the modification date of the file? or even just to keep the accountant from changing the date of the system clock?

time-authenticity is one of the greatest problems remaining in the world.

Generalizing too much? by bogaboga · 2005-05-02 12:33 · Score: 0, Troll

-- but most of our clients are now asking questions that require approximate or probabilistic answers.

Not when my credit rating is at stake! OR, When an airline agent is mistaking me to be a member of Al Qaida, and therefore denying me a seat on the plane.

In these cases I want *exact* answers to everything time related.

Re:Generalizing too much? by Anonymous Coward · 2005-05-02 12:38 · Score: 0

You're a member of Al Qaeda? Report to the CIA at once!
Re:Generalizing too much? by nate+nice · 2005-05-02 12:47 · Score: 1

"Not when my credit rating is at stake!"

Taken out of context that's a pretty funny statement. It makes us sound like pathetic, capitalist pigs. To wonder, the first thing that comes to our minds is our credit rating. I hope it's only a troll.

--
"If you are a dreamer, a wisher, a liar, A hope-er, a pray-er, a magic bean buyer ..."
Re:Generalizing too much? by no+soup+for+you · 2005-05-02 14:05 · Score: 1

Not when my credit rating is at stake!

Your credit rating is a probability rating of your ability to pay your future debts. I mean, I know that you're kidding, but I think both of your examples show an actual probability. It would be thought that there is an X% that you might be Al Qaeda, so therefore you won't be granted the ability to purchase a ticket on this flight.

--
If you blog it...

I predict... by rainman_bc · 2005-05-02 12:38 · Score: 3, Interesting

Better indexing, faster lookups...

That's... about... it...

Object relational was the "new thing" that didn't really take off as well as they'd hoped.

Hell, I work with people who still can't handle compound keys and joins well...

--
09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0

Re:I predict... by superpulpsicle · 2005-05-02 13:49 · Score: 1

All the industry db experts should shift their focus to working with MySQL. You look at oracle and it's just so bloated.

The worst is every oracle db runs on some expensive production environment hardware. Every MySQL db runs on a cheap PC. Until this is changed, oracle is stuck as the industry standard.
Re:I predict... by jbplou · 2005-05-02 13:53 · Score: 1

I've known developers who have almost no test data in their tables develop their app with no indexes and ridiculously inefficient queries that are speedy in development because there is almost no data. Then when a few months of data gets into the database they have no idea why the app mysteriously slowed down.
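
A rough sketch of how to catch this before production, assuming Python's built-in sqlite3 module. The `orders` table, the `customer_id` column and the `plan` helper are all made up for illustration; other databases have their own EXPLAIN variants. The idea is just to load a realistic number of rows and look at the query plan with and without the index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
# Load enough rows that a full scan would actually hurt (hypothetical data).
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 1000, i * 0.5) for i in range(50_000)],
)

def plan(sql):
    """Return the human-readable detail text of SQLite's EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

query = "SELECT total FROM orders WHERE customer_id = 42"

plan_without_index = plan(query)   # expected to be a full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_with_index = plan(query)      # expected to search via the new index

print(plan_without_index)
print(plan_with_index)
```

With ten rows both plans feel instant, which is exactly how the problem hides in development.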
Re:I predict... by drew · 2005-05-02 14:18 · Score: 1

The worst is every oracle db runs on some expensive production environment hardware. Every MySQL db runs on a cheap PC. Until this is changed, oracle is stuck as the industry standard.

What are you smoking? I've seen oracle running on compaq proliant pentium pro systems, and i've worked with mysql installations running on sun enterprise 5000's. Oracle is not the industry standard because it runs on bigger hardware. Oracle is the industry standard because even though it is the biggest pain in the ass to use and administer, it provides a level of reliability, scalability, and functionality not found anywhere else. MySQL doesn't even come close.

That's not to say that MySQL doesn't have its uses, but I think it's a little bit insane to suggest that all the experts should quit working on real databases and work on MySQL instead.

--
If I don't put anything here, will anyone recognize me anymore?
Re:I predict... by Nutria · 2005-05-02 14:56 · Score: 1

it's a little bit insane to suggest that all the experts should quit working on real databases and work on MySQL instead.

You're right. They should use PostgreSQL instead. ;)

But seriously, an Oracle DBA (who's not dependent on GUIs) would feel at home with PostgreSQL 8.0.

--
"I don't know, therefore Aliens" Wafflebox1

I want clustered databases for high-availability by SpecialAgentXXX · 2005-05-02 12:45 · Score: 5, Interesting

The "next great advancement" in databases will be when I can set up 2 or more linux servers and have them act as a single database server. Our database server is the most expensive item in our datacenter because it's an N-way IBM server.

The clowns down the hall by tyates · 2005-05-02 12:46 · Score: 0, Troll

Queues? Workflows? Business logic? Excuse me for thinking that a database should just store data. I guess that makes me a caveman or something.
He didn't mention the biggest problem with databases - it's that the clowns down the hall (the DBAs and sysadmins) own it and you don't. I've seen teams store their data in flat files just because they didn't want to deal with those bozos.

--
Tristan Yates

Re:The clowns down the hall by NineNine · 2005-05-02 13:36 · Score: 0

Queues? Workflows? Business logic? Excuse me for thinking that a database should just store data. I guess that makes me a caveman or something.

Well, either you're a caveman, or you are simply ignorant. Either way, it can be cured. Try talking to some of those DBA's and Database Developers. You may learn something. If all you need is data stored, then use a flat file (or MySQL if you're feeling lucky). If you want data integrity, performance, stability, AND business logic, use a database.
Re:The clowns down the hall by gorfie · 2005-05-02 13:47 · Score: 1

Actually, it's the business unit that owns the database, the DBAs just ensure that the database is available and that clowns like you and me don't end up corrupting/modifying the data in a way that the business unit wouldn't appreciate. Sure the business unit suffers a loss in flexibility in that they can't just arbitrarily go in and change a name here, add a digit there, etc., but they can now have a set of data that can only be changed in certain ways and that can have auditing.
Re:The clowns down the hall by tyates · 2005-05-02 15:18 · Score: 1

The business unit? That's like saying the stockholders own the database. The programmers are the only ones who understand how the data in it is used, and the only ones who have the responsibility of keeping the application going. Yet, if they have to make any changes to the database, they have to crawl on their hands and knees and ask the DBAs for permission. It's insane. Give the people who have the responsibility the access they need.

--
Tristan Yates
Re:The clowns down the hall by Master+of+Transhuman · 2005-05-02 17:55 · Score: 1

Ah, not so fast, cowboy...

Just last week my boss at City College of San Francisco was "fixing" something in the production database and managed to delete a few thousand records he shouldn't have. He got it back from elsewhere okay, but it shouldn't have happened.

There's a reason developers get development and test databases and DBA's don't allow them to touch production databases.

In defense of my boss, we don't have a snapshot of the production database every night - which you need if you expect your systems people to debug problems in the production database apps. You need live data to reproduce problems frequently, and if you don't have it, you have to muck around in production.

--
Richard Steven Hack - This sig is TOO GODDAMN SHORT TO DO ANYTHING USEFUL WITH! MORONS!
Re:The clowns down the hall by Anonymous Coward · 2005-05-02 19:02 · Score: 0

The users own the data, but the DBA is responsible for the data, because the users "don't know, can't know, and never will know" the ins and outs of the data system.

The funny thing is, if the DBA screwed up the production data the way a User could, the DBA would be shitcanned on the spot. But all the User would get is a "just let the DBA do it, OK?"

Do you do your own home electrical, HVAC and plumbing? To code? Probably not. You hire and trust someone who professes to know that stuff to do it.

Most programmers might have high level knowledge of the system and how it's supposed to work, or a deep knowledge of silos of the data how it *does* work, but most programmers (except DB programmers) seem to hate with a passion SQL, and will do anything in their power to isolate themselves from it. So, they rely on the DBA to keep things straight and make it work, too.

Ask the responsible programmer how the AP package works in their financial system. They will know the technical details of how it's implemented, but there will be some finance type they work with who understands the bigger picture of how it's all supposed to work.

The DBA cares about how the data ties together, can look at the tables and structure and say, "if you delete this, you'll delete a bunch of other stuff, maybe you want to try a different approach", even if the programmer insists that it's not supposed to be that way.

The app is an elephant. Each group, user, programmer, project manager, executive, has different feels of the beast, but not too many grasp the entire beast conceptually.

moving past relational model? I thinketh not by Anonymous Coward · 2005-05-02 12:48 · Score: 1, Informative

Slashdotters- please remember that comments are better when the articles are actually read... that being so...

Here are my problems with what Gray et al said.

Under the mounting onslaught, our traditional relational database constructs--always cumbersome at best--are now clearly at risk of collapsing altogether.

In fact, rarely do you find a DBMS anymore that doesn't make provisions for online analytic processing. Decision trees, Bayes nets, clustering, and time-series analysis have also become part of the standard package, with allowances for additional algorithms yet to come.

He seems to be confusing two issues- one is finding what (data) to find and the other is finding that data. A database enables the second...a program, a human or both in combination with the technologies he mentions is responsible for the first. We should never confuse these two. A database is a way to identify unique things by their properties, literally, a way to distinguish "things" from each other. If it's done that in a manner that provides for ACID and retrieval then our work is done. Finding WHAT to find is another completely different concern.

The relational model is not going anywhere- and that's what every database is - an implementation of the relational model; some better than others.

Gray is a great researcher, and maybe he was talking about the need to make current databases PRODUCTS "better".. but it doesn't read like that... it reads like a battle cry for us to move beyond the relational model. ... it's NOT going to happen...

Re:moving past relational model? I thinketh not by Spiked_Three · 2005-05-02 13:56 · Score: 2, Interesting

I doubt if he is "confusing two issues" as he probably knows a lot more about it than you and I. He may indeed have a different opinion, but that is not confusing two issues.
I will admit I was around before relational databases. Back then there were good old hierarchical databases, and they did a damn good job of what a relational database does 50% of the time these days. The problem was the other 50% they couldn't do. So along came relational databases. Now to think that there is nothing beyond relational is like IBM saying no one will ever need more than 16 colors on their PC, shortsighted.
The part I really wish would die is SQL. It was invented as a way end users could enter data queries. It became adapted to be embedded in COBOL programs, and the fact that it's at the center of most enterprise applications today is hideous. I don't care when the next database technology comes along, but please get rid of the SQL dinosaur.
Personally, I'd just as soon get rid of databases, I have already designed my business logic, why must I now design and code ways to store objects? Yes, I know, some technology is already out (and I use it), but it is not mainstream yet. That is what I would like to see sooner, persistent object oriented databases become mainstream.

--
slashdot troll = you make a compelling argument I do not like the implications of.
Re:moving past relational model? I thinketh not by kpharmer · 2005-05-02 14:26 · Score: 2, Insightful

> it reads like a battle cry for us to move beyond the relational model. ... it's NOT going to happen...

a couple of thoughts on that -

1. relational databases are really quite wonderful for analytical apps. Need to store two years of firewall/sales/whatever data - then churn away analysis? Great - no problem. And it's easy enough to do either through hand-written sql or via a tool. There's plenty that requires third-party tools (and data stores), but even in this scenario the staging area is almost always the relational database.

2. a lot of folks who would like to eliminate relational databases fail to account for point #1. They complain of the object/relational mapping problems. Ok, that's fine - but if you put your data in container-managed persistence or an object database - you'll then have to pay someone to pull it out and put it into a separate relational database for analysis. Of course, you might be fired right about that time...

3. java in the database is mostly a pain in the butt: On the performance side you've got optimization complexity, on the manageability side you've got unusual dependencies, build processes, etc, on the availability side you've got the ability to take out the entire server with some bad code (ok, sometimes).

4. two-tiered architectures with a web service driven directly out of the database is more than just a pain in the butt. It's a security disaster. A cobbled together architecture. And Jim Gray shilling for microsoft.

5. the column-store as Jim Gray described it has never really left us. And we don't really need new technology to handle it.

6. users love tables. there are quite a few users out there that truly love tables - they understand them, they look just like spreadsheets, they can query them. This is important: it's fabulous when your users can easily understand your design.

7. however - like Gray said, we now need methods of working with data that go far beyond boolean logic. We need fuzzy logic queries. And we need new types of models - allowing for multiple many-to-many relationships via relationship tables. This breaks codd's rules - but is essential for agile & fast-moving projects.
Re:moving past relational model? I thinketh not by Anonymous Coward · 2005-05-02 14:40 · Score: 1, Interesting

comments are better when the articles are actually read

True, but this article is very difficult to read, it's just a barrage of noise.

He seems to be confusing two issues-

He is confusing *many* issues: Model. Implementation. Language syntax. Application features. He even talks about web browser architecture and query optimizers! It's all over the freakin' map.

A database is a way to identify unique things by their properties, literally, a way to distinguish "things" from each other.

A database is a set of assertions about the real world.

The relational model is not going anywhere- and that's what every database is - an implementation of the relational model; some better than others.

Exactly, the relational model is a theoretical model for data storage, manipulation, and retrieval, perhaps the only one, and any data storage/retrieval system can be described as some subset or sloppy implementation of it.

Gray is a great researcher, and maybe he was talking about the need to make current databases PRODUCTS "better"

I saw nothing in that paper that makes me think he's "great" at anything. And he's definitely not a researcher, he's a VENDOR, no matter what his title. His paper was simply incoherent at worst, and at best, a metaphor for the state of the IT and DB industries today.
Re:moving past relational model? I thinketh not by Nutria · 2005-05-02 15:09 · Score: 1

The relational model is not going anywhere- and that's what every database is - an implementation of the relational model

If you think that the only kind of DBMS is the Relational DBMS, you must have flunked your Database Theory class, or gone to a piss-poor school.

--
"I don't know, therefore Aliens" Wafflebox1
Re:moving past relational model? I thinketh not by electroniceric · 2005-05-02 15:31 · Score: 1

right on, right on. Especially #s 4 and 7. It took people years to realize that durable software is stuff that can withstand people's constant impulse to change things. After finally realizing that doing this means keeping things as distinct as possible, we've essentially arrived at the wisdom of 3-tiered (n-tiered) architectures. There are definitely problems with that architecture, but it's hard to believe that it's not important to have the option to keep some of the logic outside the database.

On #7, I totally agree, but let me see if I understand the basic issue well enough to paraphrase. Databases basically work by using a variety of indexes plus hashing to quickly determine whether data meets a set of (exact) criteria. Different indexes work better for different kinds of exact queries, but you can relatively quickly work out a balanced set of indexes that do the job. In other words, some indexes are great for "field > number" queries, while others are better at "field matches string_pattern" queries.

So in order to have effective heuristic or "fuzzy logic" queries, somebody will need to work out indexes and hashes for each fuzzy logic matching operator, or write an algorithm that figures out how to make those indexes and hashes. And that's ummm... rather more difficult. So until then, we have to catch as catch can with analyzers and aggregators doing underlying exact queries and applying as many optimizations as they can.

So that's my understanding of it - please correct me if I'm wrong.
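
To make the contrast concrete, here is a toy sketch in Python using only the standard library. The name list and the helper names `prefix_range` and `fuzzy` are made up. The exact prefix lookup gets away with two binary searches on sorted data (roughly what a B-tree index buys you), while the approximate lookup, in this naive form, has to score the query against every candidate.

```python
import bisect
import difflib

# Hypothetical data; sorted so it can stand in for an ordered index.
names = sorted(
    ["anderson", "andersen", "johnson", "jonson", "smith", "smyth", "thompson"]
)

def prefix_range(sorted_names, prefix):
    """Exact prefix match: two binary searches, no full scan."""
    lo = bisect.bisect_left(sorted_names, prefix)
    # "\uffff" sorts after any ordinary character, closing the range.
    hi = bisect.bisect_left(sorted_names, prefix + "\uffff")
    return sorted_names[lo:hi]

def fuzzy(all_names, query, cutoff=0.8):
    """Approximate match: difflib compares the query with each candidate."""
    return difflib.get_close_matches(query, all_names, n=5, cutoff=cutoff)

print(prefix_range(names, "sm"))
print(fuzzy(names, "jonsen"))
```

Nothing here is an index for the fuzzy case; that missing piece is the hard part being described.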
Re:moving past relational model? I thinketh not by kpharmer · 2005-05-02 15:57 · Score: 1

> In other words, some indexes are great for "field > number" queries, while others are better at "field matches string_pattern" queries.

yep

> So in order to have effective heuristic or "fuzzy logic" queries, somebody will need to work
> out indexes and hashes for each fuzzy logic matching operator, or write an algorithm that
> figures out how to make those indexes and hashes. And that's ummm... rather more difficult.
> So until then, we have to catch as catch can with analyzers and aggregators doing
> underlying exact queries and applying as many optimizations as they can.

Right - we need new sql operations as well as new optimization capabilities. Some of these might be in the works, I don't know.

But we also need new ways of modeling - right now quite a few of us are creating more flexible models that sit within a relational database - but aren't able to take full advantage of key relational database capabilities like referential integrity, unique constraints, etc. That might sound a little wacky but sometimes we need more flexibility in modeling data than you can get out of 3NF: like the ability to add new types of things, new relationships between existing types, or perhaps attributes (weight, etc) upon the relationship. And do this without schema changes.

Lastly, one of the big obstacles is that we're quickly moving into new terrain here. I remember once discussing an e-commerce catalog with a team of engineers - and they *could not* understand how a hierarchy could be insufficient for describing items in the catalog. The notion of network of items in which there are n-parents caused a lot of sweating!
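
A minimal sketch of the kind of flexible model described above, assuming Python with the built-in sqlite3 module. The `node` and `edge` tables, the `rel` and `weight` columns and the catalog rows are all invented for illustration. Every thing is a row in `node`, every relationship is a row in `edge` with its own attributes, so an item can have n parents and new relationship types need no schema change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE node (
    id   INTEGER PRIMARY KEY,
    kind TEXT NOT NULL,          -- new kinds of things are just new values
    name TEXT NOT NULL
);
CREATE TABLE edge (
    parent_id INTEGER NOT NULL REFERENCES node(id),
    child_id  INTEGER NOT NULL REFERENCES node(id),
    rel       TEXT NOT NULL,     -- new relationship types are just new values
    weight    REAL,              -- an attribute on the relationship itself
    PRIMARY KEY (parent_id, child_id, rel)
);
""")
conn.executemany("INSERT INTO node VALUES (?, ?, ?)", [
    (1, "category", "Camping"),
    (2, "category", "Kitchen"),
    (3, "item", "Portable stove"),
])
# The same item sits under two parents: a network, not a hierarchy.
conn.executemany("INSERT INTO edge VALUES (?, ?, ?, ?)", [
    (1, 3, "contains", 0.7),
    (2, 3, "contains", 0.3),
])

def parents_of(child_name):
    """Return the names of every parent of the named node, sorted."""
    rows = conn.execute(
        """SELECT p.name FROM edge e
           JOIN node p ON p.id = e.parent_id
           JOIN node c ON c.id = e.child_id
           WHERE c.name = ? ORDER BY p.name""",
        (child_name,),
    )
    return [r[0] for r in rows]

print(parents_of("Portable stove"))
```

The cost is the one already mentioned: the database can no longer enforce rules like "only categories may contain items", since that lives in the data rather than the schema.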
Re:moving past relational model? I thinketh not by Anonymous Coward · 2005-05-02 16:03 · Score: 0

One- a DBMS is NOT a model... it's a THEORY around which PRODUCTS, specifically DBMS's are based. If you learn to distinguish the two, as I did, you would go through life less confused.

Two- It's not that I think that the relational model is the only model ever invented. It's that I KNOW the relational model is the only model that's worth a shit. If you don't know THAT then YOU need to READ something beyond whatever you either did or did not learn in school, and by the way, I went to the University of California and did very well indeed.

Three- as other slashdotters have told you, "other" models like the network or hierarchical models are, it turns out, variants and subsets of Codd's relational model. In fact, it can be shown that ANY database is really just a subset of the relational model OR it's not a database.. of course, you have to know WHAT a database is as opposed to a DBMS...

Finally- ALL conceivable databases implement some form of logic exactly equivalent to existential-conjunctive logic. It states "true" facts about the objects in the tables and it makes those things locatable- period. Maybe you think you have something else databases should do and another way they should do them, but you're just wrong. Of course, if I have inspired you to go find that other thing, and you waste a few years out of your life pursuing that until you finally understand the relational model, then I will have done the IT world a service.

Four-
Re:moving past relational model? I thinketh not by Anonymous Coward · 2005-05-02 16:09 · Score: 0

I saw nothing in that paper that makes me think he's "great" at anything. And he's definitely not a researcher, he's a VENDOR, no matter what his title

Yup. 122 papers and an ACM Turing award (the highest honor in computer science) doesn't make anybody great, much less a researcher.
Re:moving past relational model? I thinketh not by Master+of+Transhuman · 2005-05-02 18:01 · Score: 1

Not being an expert on the current interpretation of the relational model, I wouldn't assume that the needs of analysis necessarily "break" that model.

My limited understanding is that the current crop of SQL languages and database implementation fall short (how much I'm not sure) of what the relational model is capable of. Perhaps that's where we need to start looking for improvements.

In any event, as I've said numerous times before, without some adequate simulation of conceptual processing, it's not likely that the shortcomings of database technology are going to be adequately resolved any time soon. The purpose of all that exists in the database field now is to provide a (very poor) simulation of conceptual and physical reality.

--
Richard Steven Hack - This sig is TOO GODDAMN SHORT TO DO ANYTHING USEFUL WITH! MORONS!
Re:moving past relational model? I thinketh not by Master+of+Transhuman · 2005-05-02 18:08 · Score: 1

Not to denigrate the man, his papers go back to 1966 and most of the recent ones have to do with the Sloan Digital Sky Server Data Base (whatever that is).

He has apparently done work in database benchmarking and database modeling, System R, and a variety of other things.

So I'd be inclined to think he knows quite a bit about the history and evolution of current databases - which is what his article is mostly about, as a recap.

But I'm not entirely sure he has any solutions or is even barking up the right tree in his suggestions.

I see no evidence he's on a par with Codd and Date with regard to what is possible with the relational model. I'm not saying he's not, however.

--
Richard Steven Hack - This sig is TOO GODDAMN SHORT TO DO ANYTHING USEFUL WITH! MORONS!
Re:moving past relational model? I thinketh not by Master+of+Transhuman · 2005-05-02 18:14 · Score: 1

I don't think I'd go so far as to say that there CAN'T be any better theory than the relational model. I don't think the human brain works on the relational model (I could be wrong, but I don't think anybody is in a position to prove that at the moment.) I think we need a decent conceptual processing algorithm or model which would exceed the relational model in expressiveness and precision.

However, it does appear correct to say that the relational model has not been fully applied in existing products and therefore it's premature to suggest that we need to dump it for something else at this juncture - at least if that something else doesn't also provide what the relational model provides - and I haven't heard of any such model.

--
Richard Steven Hack - This sig is TOO GODDAMN SHORT TO DO ANYTHING USEFUL WITH! MORONS!
Re:moving past relational model? I thinketh not by Anonymous Coward · 2005-05-02 19:10 · Score: 0

I don't think the human brain works on the relational model

Most people focus on the surface processing, that which we think we're aware of, but find out that this simple surface stuff has a lot of background processing that has to be teased out.

Which is why we can read text in obscured pictures, and machine programs have a hard time (because to the computer it's just a bunch of bits).

Maybe if you have a program that watches Life (the cellular automata system), and can understand gliders, cannons, etc., then we're there. But I don't know if we even have that.
But you or I can look at a Life snapshot, and say, "hey, there's a glider, there's a cannon!"
Re:moving past relational model? I thinketh not by m50d · 2005-05-02 20:34 · Score: 1

By that logic how did we ever get to the relational model? After all, flat-file is good enough for separating things from each other and finding them by their properties

--
I am trolling
Re:moving past relational model? I thinketh not by Anonymous Coward · 2005-05-02 20:52 · Score: 0

Oh, I was wondering when the warring purists would show up, shouting "There is no god but RDBS theory and SQL is NOT its prophet!!!"

OK, there you are.
Re:moving past relational model? I thinketh not by Anonymous Coward · 2005-05-03 05:42 · Score: 0

Actually, The Fine Article (when one discards the meaninglessly vague but ominous assertions) appears to be a statement that all of today's databases are broken, and require improvements in the features that happen to be emphasized in the marketing for the next release of Microsoft SQL Server. Surprise, marketing fluff!

Re:Montreal? by nate+nice · 2005-05-02 12:49 · Score: 0, Offtopic

awesome.

--
"If you are a dreamer, a wisher, a liar, A hope-er, a pray-er, a magic bean buyer ..."

It warms my heart... by Baldrson · 2005-05-02 12:50 · Score: 2, Interesting

To see Bill Gates' organization of "really smart people" spinning their wheels so energetically really warms my heart.

I can see they've no hope of being any competition at all come the real db revolution.

--
Seastead this.

Re:It warms my heart... by Anonymous Coward · 2005-05-02 13:08 · Score: 0

yeah, that turing award winner is a fucking moron

nice geocities site, btw
Re:It warms my heart... by Baldrson · 2005-05-02 13:34 · Score: 1

On the contrary, that Turing Award winner has a very powerful mind -- spinning its wheels unlikely to pop the clutch.
Thanks.

--
Seastead this.
Re:It warms my heart... by radiophonic · 2005-05-03 03:19 · Score: 1

+1 Sarcasm

--
Whenever you read this sig someone's refrigerator light turns on.

LIKE '%approximate answers..%' by otisg · 2005-05-02 12:51 · Score: 1

More seriously, this means something like + Lucene[1] (or, more likely, lucene4c [2])

[1] http://lucene.apache.org/
[2] http://incubator.apache.org/lucene4c/

--
Simpy

Channel 9 by DaHat · 2005-05-02 12:52 · Score: 1

If you want more Jim Gray, head on over to Channel 9 to see a couple of sit downs with him. Personally I found both Part 1 & Part 2 quite interesting and thought provoking.

--
Help Brendan pay off his student loans

One word by Anonymous Coward · 2005-05-02 12:53 · Score: 0

XML.

Re:I want clustered databases for high-availabilit by ankhcraft · 2005-05-02 12:55 · Score: 1

You can already do this with Oracle RAC. We run this at the office. Works great.

--
...

How it this news? by TekGoNos · 2005-05-02 13:05 · Score: 1

Well, I should no longer expect news from Slashdot but :

this is almost the same content as his SIGMOD 2004 speech,
which is available since April 2004 :
http://research.microsoft.com/research/pubs/view.aspx?tr_id=735

How is the refurbishing of a one-year-old article news?

(And, BTW, I find the keynote speech better structured than the refurbishment)

--
I have discovered a truly remarkable proof for my post which this sig is too small to contain.

Re:How it this news? by jayloden · 2005-05-02 13:26 · Score: 1

Yes yes, but, despite the "news for nerds" tagline, Slashdot is really more of a discussion site than a news site. If you really just want the latest in tech news, try someplace like http://freshnews.org/ where you can get everything consolidated into a nice digest for you, including the ancient slashdot headlines. If you want a discussion forum for geek chatting, stupid memes, repetitive jokes and overall fun, then read slashdot.

Personally, despite some of the crap on slashdot, I still stick around, mostly for the commentary and the interesting angles I find other people view things from.

-Jay
Re:How it this news? by Master+of+Transhuman · 2005-05-02 18:18 · Score: 1

"geek chatting, stupid memes, repetitive jokes and overall fun"

You forgot insults, fucking foul language, uneducated morons, and religious/political fanatics.

Oh, yes, and dupes - of both news and the people who believe it.

Oh, wait, you did refer to "crap"...

Never mind.

--
Richard Steven Hack - This sig is TOO GODDAMN SHORT TO DO ANYTHING USEFUL WITH! MORONS!

Atomicity in filestores is a great benefit by Anonymous Coward · 2005-05-02 13:07 · Score: 2, Informative

The parent makes a good point, and it's pretty easy to see why if one holds off the usual anti-Reiser reactions and thinks it through a bit.

Databases require a mechanism for atomicity to create their transactions, and because no common operating system has ever provided such, they need to implement it themselves at application level. It's like the bad old days before PCs provided networking, and you had to run up your own networking stack if your application needed comms.

Well reiserfs has the goal of providing atomic transactions at filestore level, so in principle it will become possible to leave a good chunk of the very hairy rollback processing of conventional RDBMSs to the operating system.

It won't remove the need for proper RDBMSs for power database applications, nor will it in any way obviate the need for database distribution, but it should make professional databases both simpler and more robust. And it should also allow mini-database applications to be coded directly around the filestore with better transactional properties than the traditional flat-file designs.
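
For what it's worth, the single-file version of this can already be approximated from user space: write a temp file next to the target, then rename it over the target. A minimal sketch, assuming Python on a POSIX system; `atomic_write` is a made-up helper name, not anything reiserfs or any OS provides. It covers exactly one file per call, which is the limitation under discussion.

```python
import os
import tempfile

def atomic_write(path, data):
    """Replace the contents of path so readers see the old or the new text,
    never a half-written file (relies on rename being atomic on POSIX)."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem as the target.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())   # push the bytes to disk before the rename
        os.replace(tmp_path, path)   # the "commit" step
    except BaseException:
        # The "rollback" step: discard the temp file, leave the target alone.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Usage is just `atomic_write("settings.conf", new_text)`. Doing the same across several files at once is where application-level tricks run out and something like a database, or a transactional filestore, is needed.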

Re:Atomicity in filestores is a great benefit by dioscaido · 2005-05-02 13:35 · Score: 2, Informative

Does reiserFS support atomicity at the group level? Can I edit a group of 30 files, and only once the modifications are done for those 30 files do we commit it to the file system, and in any other case none of the files change? That is a major feature of a transactional database, where you can modify various tables simultaneously and if at any point there are issues, all the data is easily restored by doing a roll-back.
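
For comparison, here is roughly what that all-or-nothing behavior looks like on the database side. A small sketch using Python's built-in sqlite3 module; the `accounts` and `audit_log` tables and the `transfer` function are invented for the example. Two tables are modified in one transaction, and a constraint failure partway through undoes every change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (
    name    TEXT PRIMARY KEY,
    balance INTEGER NOT NULL CHECK (balance >= 0)
);
CREATE TABLE audit_log (entry TEXT NOT NULL);
INSERT INTO accounts VALUES ('alice', 100), ('bob', 50);
""")
conn.commit()

def transfer(src, dst, amount):
    """Move amount between accounts; touch both tables or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "INSERT INTO audit_log VALUES (?)", (f"{src}->{dst}: {amount}",)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # Violates the CHECK constraint if src cannot cover the amount.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

A failed transfer leaves no audit row and no credited balance behind, even though both statements had already run inside the transaction.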
Re:Atomicity in filestores is a great benefit by Unordained · 2005-05-02 14:11 · Score: 2, Interesting

... oh, and the file system should also verify the integrity of the files, and the system as a whole -- make sure that your changes are "allowed" (both state-constraints and transition-constraints), make sure that everything works together (imagine your FS making sure that your changes to your mail server config match up with your changes to the user list?) ...

... oh, and allowing multiple users to modify files at the same time, and know enough about the file formats to reconcile possible conflicts (not stupidly like CVS does, where everything is either binary or treated as sequences of lines of text delimited by a carriage return) ...

... oh, and maybe we should resolve the issue of putting the type of the file in the filename (variables have names, values have types) ...

... oh, and don't forget support for, say, two-phase commits, nested transactions, and all those other things ... which, by the way, Jim Gray has one of the authoritative books on.
Re:Atomicity in filestores is a great benefit by SPY_jmr1 · 2005-05-02 20:28 · Score: 1

And a Pony too!
Re:Atomicity in filestores is a great benefit by chthon · 2005-05-02 23:28 · Score: 1

Atomicity at the file system level has existed for a long time. It is called indexed-sequential files. This is handled by ISAM/VSAM and similar file types.

All minicomputers had (have (AS/400)) something and mainframes will surely have it.

This is a transactional system implemented at the OS level.

I only have experience with this on the WANG VS minicomputer platform, but there the transactioning was also implemented at the filesystem (or OS) level.

With the advent of Un*x like operating systems, this functionality disappeared from the file system, and third party packages were needed to implement it. And of course the most logical course was to do this through the database.
Re:Atomicity in filestores is a great benefit by bobv-pillars-net · 2005-05-03 05:18 · Score: 1

Does reiserFS support atomicity at the group level? Can I edit a group of 30 files, and only once the modifications are done for those 30 files do we commit it to the file system, and in any other case none of the files change?

Yes; arbitrary-length transaction support is already implemented, though user-level tools don't take advantage of it yet.

... oh, and the file system should also ...(Long laundry list of desired features)

There are hooks for arbitrary user-defined file behaviors. Again, implementation has not trickled down to the user-level, but give it time. IIRC, Hans Reiser estimates the project will take at least another 10 years before reaching most of his goals.

The original article was about where database technology is going in the future, not about where it is today. Obviously, if you want to implement your long list of desired features right now, then you probably need to buy an Oracle license or three. But give Hans credit for having guts and vision enough to start integrating the database and the filesystem into a unified data store. In any discussion of "where database technology is headed" I think that his project deserves honorable mention, at least.

--
The Web is like Usenet, but
the elephants are untrained.

Soundex by NineNine · 2005-05-02 13:08 · Score: 0

There's been a bit of movement in this direction for a while now. Oracle has had soundex() for a while at the least. It's never worked very well, though... :|
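For readers who have not met it, Soundex maps a name to a letter plus three digits so that similar-sounding names collide. The following is a minimal Python sketch of the classic American Soundex rules; it is not claimed to match Oracle's soundex() in every edge case, and the function name is just an illustration.

```python
def soundex(name: str) -> str:
    """Return the 4-character American Soundex code for a name."""
    codes = {}
    for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                           ("l", "4"), ("mn", "5"), ("r", "6")):
        for ch in letters:
            codes[ch] = digit

    letters_only = [c for c in name.lower() if c.isalpha()]
    if not letters_only:
        return ""

    first = letters_only[0]
    result = first.upper()
    prev = codes.get(first, "")
    for ch in letters_only[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        digit = codes.get(ch, "")
        if digit and digit != prev:
            result += digit
        prev = digit  # a vowel resets prev, so a repeated code counts again
    return (result + "000")[:4]
```

"Smith" and "Smyth" get the same code, which is the intended behaviour, but "Catherine" and "Katherine" do not, because the first letter is kept verbatim. That kind of miss is one reason the approach can feel like it never works very well.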

That is what SAS is for... by the+eric+conspiracy · 2005-05-02 13:13 · Score: 3, Insightful

most of our clients are now asking questions that require approximate or probabilistic answers

What are my chances of getting laid tonight...
What are the odds of my winning the lottery...
What are the chances that my boss will find out about that phoney dinner receipt...

Seriously, SAS stat analysis software does exactly what this numbskull is talking about. You don't need a new kind of database, merely somebody with training in stats.

Re:That is what SAS is for... by Anonymous Coward · 2005-05-02 13:19 · Score: 0

Thanks for pointing out what this "numbskull" is talking about. I'm sure you are quite a bit more of an expert than someone who has won the equivalent of the Nobel Prize in computer science (i.e., the Turing Award).

Yes, SAS does this, but SAS is not a database product. SAS is for analysis. There's much more room for improvement in terms of analysis coupled with high-performance data retrieval.
Re:That is what SAS is for... by johnalex · 2005-05-02 13:39 · Score: 1

Back when I was a kid, people used a Magic-8 ball to answer questions like this.

--
JA
http://www.johnalex.org/
Re:That is what SAS is for... by the+eric+conspiracy · 2005-05-02 15:01 · Score: 1

the equivalent of the Nobel Prize in computer science (i.e., the Turing Award).

Over the course of my lifetime I have had the opportunity of working with a number of people who have won Nobel Prizes. Here are some clues:

- The Turing Award is NOT an equivalent to the Nobel Prize. Not even close.

- Computer Science is NOT a science.

- Aliasing two disparate problem domains and then calling for a fusion to combine them results in a design that solves neither problem well.
Re:That is what SAS is for... by Anonymous Coward · 2005-05-02 15:58 · Score: 0

What are my chances of getting laid tonight... What are the odds of my winning the lottery... What are the chances that my boss will find out about that phoney dinner receipt...
Since you're looking for approximate answers: 0, 0, and 1.
Re:That is what SAS is for... by gabbarbhai · 2005-05-02 16:42 · Score: 1

No it's not. Think data security, ease of access, reliability of storage, not having to replicate the code for fetching and putting the data back to the database.
And we haven't even started to talk about very large databases, distributed data, clients not willing to give you flat files of all the records but only the summaries due to obvious privacy concerns, server-side integration of the analysis programs and the stored data for on-line smart analysis, which, by the way, IMHO is overwhelmingly statistical in nature..

Want me to go on, or are we already feeling numb in the skull? :-)
Re:That is what SAS is for... by Anonymous Coward · 2005-05-03 01:36 · Score: 0

What are my chances of getting laid tonight...
What are the odds of my winning the lottery...
What are the chances that my boss will find out about that phoney dinner receipt...

These are not the questions being asked. The questions being asked are:
Will I get laid tonight...
Will I win the lottery...
Will my boss find out about that phoney dinner receipt...

The questioner is only willing to accept probabilities because exact answers are not available.
Re:That is what SAS is for... by agusus · 2005-05-03 01:53 · Score: 1

[mod parent down]....

No, you misunderstood the problem. The point of fuzzy database queries is to get queries that return faster or give early partial results. It doesn't even make sense for you to suggest SAS. We're talking about getting faster query results on data that is *already* in databases. And I don't think you can put a terabyte of data into SAS efficiently.

This is a real-world problem, not just something he made up. Today we have enormous terabyte databases that can take forever to do a query, so finding ways to get faster, but approximate, answers is something I know many people are researching currently.
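A minimal Python sketch of the "faster but approximate" idea: estimate an aggregate from a random sample instead of scanning every row. The function name, the fixed seed, and the use of a plain in-memory list are assumptions for illustration; real systems sample inside the storage engine.

```python
import math
import random
import statistics


def approximate_sum(values, sample_size, seed=0):
    """Estimate sum(values) from a simple random sample.

    Returns (estimate, margin), where margin is a rough 95% error bound
    based on the normal approximation.
    """
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    sample = rng.sample(values, sample_size)
    scale = len(values)
    mean = statistics.fmean(sample)
    stderr = statistics.stdev(sample) / math.sqrt(sample_size)
    return mean * scale, 1.96 * stderr * scale
```

Reading 10% of the rows gives an answer with an explicit error bar, which is the trade the comment above describes: less I/O in exchange for an approximate result.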
Re:That is what SAS is for... by sjwaste · 2005-05-03 04:11 · Score: 1

At first glance I was inclined to agree with you, mostly because it makes sense from a division of labor standpoint. However, have you ever worked with truly large datasets in SAS? Even taking a random or stratified sample on which to base calculations takes forever. For that reason, approximated analytics in the db itself are likely necessary. It's not meant to replace your statisticians, where more rigorous analysis is necessary, but more to approximate answers where approximates are good enough, is the impression I'm getting.
Re:That is what SAS is for... by sjwaste · 2005-05-03 04:13 · Score: 1

Oh, forgot to add, I'll qualify by saying I do earn my living as a SAS programmer and I really don't think these db architectures are in any danger of replacing me in the workforce. What I'm saying is, I agree with the other commenters, they fill a different role altogether.

Re:I want clustered databases for high-availabilit by Anonymous Coward · 2005-05-02 13:17 · Score: 0

Can you do this without shared storage?

I would LOVE that, but I have little experience with Oracle (or any other high end DBs).

Sure... by Anonymous Coward · 2005-05-02 13:19 · Score: 5, Funny

Could someone summarize it without using the letter 'e'?

Sure.

Th Futur of Databass
Postd by timothy on Monday May 02, @08:12PM
from th your-flight-status-is-'mayb' dpt.
gManZboy writs "vr wondr whr databas tchnology is going? This is somthing that Turing award winnr Jim Gray from Microsoft has givn a lot of thought to. H rcntly publishd an articl in which h looks at th many forcs pushing databas tchnologis forward, and what thos nw tchnologis will look lik. Gray writs, 'th gratst of ths [rsarch challngs] will hav to do with th unification of approximat and xact rasoning. Most of us com from th xact-rasoning world -- but most of our clints ar now asking qustions that rquir approximat or probabilistic answrs.'"

Hmmm, I kind of like 'databass'.

Re:Sure... by Rich0 · 2005-05-02 23:11 · Score: 1

The mods are asleep at the wheel. -1 redundant.

Half the posts on this site look like this...
Re:Sure... by smittyoneeach · 2005-05-03 00:24 · Score: 1

Did you RTFA? It looked like a collision between Cal Worthington, his dog Spot, Geek Squad, and a bottle of tequila.
I think Mr. Gray, as Lemmy put it, is "Talkin' to the Devil on the batphone all of the time".

--
Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear
Re:Sure... by Rich0 · 2005-05-03 10:50 · Score: 1

Uh, that was intended as sarcasm...
Re:Sure... by sharkdba · 2005-05-04 05:49 · Score: 1

Could someone summarize it without using the letter 'e'?

Sure.

That should be "Sur."

--
The purpose of life is to find the purpose of life.

Well, NOW I got a decision to make! by Anonymous Coward · 2005-05-02 13:24 · Score: 0

Should I take the Teradata Physical Implementation test? Or just let it slide since databases will eventually disappear anyway?

Re:I want clustered databases for high-availabilit by jjohnson · 2005-05-02 13:25 · Score: 2, Funny

And since Oracle is *way* cheaper than IBM, it's problem solved!

--
Anyone who loves or hates any language, platform, or manufacturer, doesn't know what they're talking about.

MOD PARENT UP! by Anonymous Coward · 2005-05-02 13:31 · Score: 0

funny++

Re:I want clustered databases for high-availabilit by DuctTape · 2005-05-02 13:34 · Score: 1

The "next great advancement" in databases will be when I can setup 2 or more linux servers

I'm sorry, you must have misread the article (MTFA). I think you mean, "The 'next great advancement' in databases will be when I can setup 2 or more Microsoft servers."

DT

--
Is this thing on? Hello?

approximate answers..Pentium Databases. by Anonymous Coward · 2005-05-02 13:36 · Score: 0

"most of our clients are now asking questions that require approximate or probabilistic answers.'"

Fuzzy Logic

Cobol by Anonymous Coward · 2005-05-02 13:44 · Score: 1, Funny

From the article:

The problem starts, of course, with Cobol

Damn those Cylons! Why won't they leave humanity alone?!

A real problem comes full circle-Sounds like. by Anonymous Coward · 2005-05-02 13:47 · Score: 0

"I think I agree with the parent. Databases are methods of storing and retrieving data. Trying to make queries fuzzy, or less structured is just wrong."

The reason we can't is more economic than technological. Now just imagine a terabyte database sitting on top of banks of associative memory.

640K is enough for everybody applies as much to databases as anything else.

Re:I want clustered databases for high-availabilit by iammaxus · 2005-05-02 13:54 · Score: 1

You certainly can do it, it's just not easy. I agree, though, that the first group that can make this easy enough (run program, login to DB network with DB servers, access DB network just as single DB is accessed), will become very popular.

$clever_title by Urusai · 2005-05-02 13:57 · Score: 0

$sarcastic_comment

Slashdot is well on the way to automation of both articles AND commentary.

Re:$clever_title by protohiro1 · 2005-05-02 15:45 · Score: 2, Funny

although both $buzzword and $sarcastic_comment should be stored in some sort of database...

--
Sig removed because it was obnoxious

whoop de do by Anonymous Coward · 2005-05-02 14:01 · Score: 0

Faster computers allow more complex queries.

No fucking shit.

I've got to get off this hamster wheel of doing actual work and become a "computer scientist" or "theoretical physicist". Then I can just state the obvious or make shit up.

Re:I want clustered databases for high-availabilit by kpharmer · 2005-05-02 14:09 · Score: 4, Informative

> The "next great advancement" in databases will be when I can setup 2 or more linux servers and have
>them act as a single database server. Our database server is the most expensive item in our datacenter
>because it's an N-way IBM server.

lol, IBM has supported *exactly* what you are talking about for at least five years.

That is, you can spread your db2 database across 10, 100, or 1000+ linux commodity boxes (ideally blades). Or you can use windows, or aix, or solaris, or hp-ux, etc. Of course, those individual boxes can be SMPs in their own right - so a thousand 8-way aix boxes is certainly possible, if not cheap.

Oracle is now in this game as well - oracle 10g can certainly support 32, and maybe 64 individual linux boxes in a cluster. The techniques are different between the two - oracle might be better at transactional systems. db2 is definitely better at data warehousing, data mining, etc.

Of course, there are still benefits to a big smp: a single P570 16-way will cost you $250k. But each of those 16 cpus is multi-core (and far faster than intel or amd), and with its micro-partitioning - it can run at least 150 linux or aix lpars (logical partitions). These lpars can grow or shrink as they need - so you aren't always over-buying for size, buying new hardly-used hardware, or having to colocate apps on a busy server - when a different os would be preferable. Not to say everyone should go this way - but there are definite benefits.

Hmmm Databases by Chitlenz · 2005-05-02 14:09 · Score: 4, Interesting

As a 15-year DBA, currently we are working with some of the would-be far-reaching (to most people) concepts described in this paper. The idea of a TRUE SQL Debugger is like, so big it's sick. Quest offers some tools that kinda sorta do this for Oracle systems, but a true realtime debugger would save me YEARS of work during my career as a SQL coder. For an idea of scale, the last replication project I wrote for an employer propagated over Oracle DB_LINKS via triggers to synchronize a dataset in two cities, log it, and do something with errors. Because this particular system was a Peoplesoft installation, it was a subset of 6800 tables and 15k lines of code give or take some triggers, with NO debugger. OMG, it's like a "finally" moment to have someone even claim to be fixing this soon in their architecture.

Next, there was some inane reference to reiserfs above, which clearly ignores what a database fundamentally both is, and is becoming. It really began (and I hate to admit this as a former Solaris/Oracle admin) with SQL Server 7 and Oracle 8, and the concept that a database should be object programmable. Reiser is not going to be streaming still frames of image data fast enough to a remote client to rebuild seamlessly into a movie, for instance. Or recalculate all of a company's business logic for point of sale systems so that, for instance, the wrong type of credit card gets rejected, or so a supply chain gets populated, the list is endless. Reiser, and for that matter VFS and the other myriad of database enhanced filesystems, are tools. Good ones, but tools...

It's interesting to note that MS has finally figured out that the "n-tier" was a dumb idea. It's almost like, well you take all this shit, then sell it through a middle man, but expect to not have to pay him anything for brokering. Like, duh. We actively benchmarked this process, in fact, and discovered that it does, not surprisingly, take time to pass data through an extra server.

Workflow is life. It's what makes this page exist (SD is I believe run in MySQL). The idea of publishing-subscribers with atomic transactions is hardly new, but I agree with the authors that this is the direction of the market, simply because businesses now are getting spread all over. Read - If your job just went to India, learn to be a DBA, cuz when all that shit they sent over there comes back, you can bet its going to be a mess (and is a mess actually already, which is why, in particular, people in ERP fields that intertwine with mine (as a DBA) demand and receive very large salaries, $200 US an hour is not unusual). The reason this particular ramble is relevant, is because lots of global companies are either looking at, or are already implementing, the idea of data grids, where all the data servers inside a global network stay in sync. Suzy the secretary checks out a document in Baltimore, and that document flags as in use in Madrid through transactional replication within a kind of database trust-relationship network. It's a very very good way for companies with lots of data to keep it all together, but today it's still a pain in the ass to manage.

Vertical partitioning is pretty much worthless except to data warehousing installations, most of whom are probably running on strong equipment already (to have that much data). Not to mention, I believe (I'd have to check, since it's not a feature I'd really use) Oracle's 10G product allows for this already if you really want it. Materialized views are another point here that raises my hackles. This guy is writing about the wonder of materialized views and column partitions, which ARE a cool performance cheat in large systems, but make no mistake that by the time you get to this point, you are probably rearranging deck chairs on the Titanic anyway. Essentially, materialized views precache SQL resultsets into a temporary table which gets constantly updated so it can always provide a full resultset without having to parse the parent table. This is processor and space expensive. Vertical par

--
Imagination is the silver lining of Intelligence.

Re:Hmmm Databases by kpharmer · 2005-05-02 14:53 · Score: 1

> XML SUCKS. PASS IT ON.

agreed - as a human-readable way of persisting data it stinks. as a way of persisting data it stinks. But it isn't bad as an over-the-wire protocol.

> vertical partitioning...

Hmmm, don't see much vertical partitioning in data warehouses any more. Used to on oracle years ago, but can't even remember why. But I am finding that both range & hash partitioning work well together. The range is cheaper & easier to implement but only gives a performance benefit when you only want a subset of data. The hash comes into its own when you want to query 50% or more of the data.
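A rough Python sketch of the two partitioning schemes contrasted above. The row layout, the column names (`order_date`, `customer_id`), and the month-sized ranges are assumptions for illustration, not how any particular database lays out its partitions.

```python
from collections import defaultdict
from datetime import date


def demo_rows(count=1000):
    """Synthetic rows spread evenly over the months of 2005."""
    return [
        {"order_date": date(2005, 1 + i % 12, 1 + i % 28), "customer_id": i}
        for i in range(count)
    ]


def range_partition(rows, key):
    """Group rows by (year, month) of a date column."""
    parts = defaultdict(list)
    for row in rows:
        d = row[key]
        parts[(d.year, d.month)].append(row)
    return parts


def hash_partition(rows, key, n_parts):
    """Spread rows over n_parts buckets by hashing an integer column."""
    parts = defaultdict(list)
    for row in rows:
        parts[hash(row[key]) % n_parts].append(row)
    return parts
```

A query for May 2005 only has to read `range_partition(rows, "order_date")[(2005, 5)]`, which is the subset benefit. A query over most of the data gains nothing from that pruning, but the evenly sized hash buckets can be scanned in parallel.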

> materialized views...

have to disagree: whether or not to use them depends a lot on the update frequency, query frequency and query cost. But I've got queries running 1000x faster on summary tables (not automated materialized views, but same idea) than they would on base data. *1000x* is a lot faster.
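The summary-table idea can be sketched with Python's built-in sqlite3 module. The `sales` and `sales_summary` tables and the helper names are hypothetical; this is a hand-refreshed summary table, not an automated materialized view, and the tiny data set says nothing about real speedups.

```python
import sqlite3


def refresh_summary(conn):
    """Rebuild the precomputed per-region totals from the base table."""
    conn.execute("DROP TABLE IF EXISTS sales_summary")
    conn.execute(
        "CREATE TABLE sales_summary AS "
        "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"
    )


def build_demo_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("east", 10), ("west", 20), ("east", 5), ("west", 1)],
    )
    refresh_summary(conn)
    return conn


def total_from_base(conn, region):
    # Scans every matching base row on each call.
    row = conn.execute(
        "SELECT SUM(amount) FROM sales WHERE region = ?", (region,)
    ).fetchone()
    return row[0]


def total_from_summary(conn, region):
    # Reads one precomputed row; may be stale until the next refresh.
    row = conn.execute(
        "SELECT total FROM sales_summary WHERE region = ?", (region,)
    ).fetchone()
    return row[0]
```

The trade-off debated in this thread shows up directly: reads from the summary are cheap, but every change to `sales` either triggers a refresh (the cost) or leaves the summary stale.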

> BUT all of this ignores the very real possibility that hardware is getting fast enough
> to not have to do any of this crap.

yeah, but that ignores the very real fact that data is increasing *faster* than Moore's law. So the cheap commodity machines *are* able to blow away the machines of ten years ago in doing the tasks from ten years ago. But today it isn't unusual to find a database loading ten million rows a day. Then someone wants to crunch away at 500 million rows in a query - and wants it to come back in 5 seconds. Commodity hardware is a long way away from doing this. Unless of course, you connect them in clusters. DB2 can do that very well - but Oracle is just starting and doesn't scale so well there yet.
Re:Hmmm Databases by Anonymous Coward · 2005-05-02 15:09 · Score: 0

The idea of a TRUE SQL Debugger is like, so big it's sick.

Personally I'd like a better language first. SQL is so bizarre. Just take the set notation of the relational model and make a language out of it. Why type "SELECT * FROM Customers" when "Customers" will do? Why type "SELECT * FROM Customers JOIN Orders ON...." when "Customers JOIN Orders" will do? Why the abomination of sub-selects, when just a set of parens will do? After all, we don't type "OPERATE ON a WITH 1 USING ADD", we type "a + 1". We don't type "OPERATE ON (OPERATE ON a WITH 1 USING ADD) WITH B USING MULTIPLY", instead we just type "(a+1)*b".
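To make the "Customers JOIN Orders" notation concrete, here is a small Python sketch of a natural join over lists of dicts, joining on whatever column names the two relations share. The function name and the dict-per-row representation are assumptions for illustration only.

```python
def natural_join(left, right):
    """Join two relations (lists of dicts) on their shared column names."""
    if not left or not right:
        return []
    shared = sorted(set(left[0]) & set(right[0]))

    # Index the right-hand relation by its values in the shared columns.
    index = {}
    for row in right:
        index.setdefault(tuple(row[c] for c in shared), []).append(row)

    # Merge every left row with each matching right row.
    return [
        {**l_row, **r_row}
        for l_row in left
        for r_row in index.get(tuple(l_row[c] for c in shared), [])
    ]
```

With a `customers` relation keyed by `cust_id` and an `orders` relation carrying the same column, `natural_join(customers, orders)` is the whole query; no SELECT, FROM, or ON clause is needed because the shared column names carry the join condition.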

Next, there was some inane reference to reiserfs above, which clearly ignores what a database fundamentally both is, and is becoming.

Hmm, let's see definitions. What is a database, and how is reiserfs not a database? It allows for storage, retrieval, and manipulation of data, so to me that's a database. Sure, the model is simplistic, but so is SQL compared to the relational model.

a database should be object programmable

How is this different than just "programmable"?

is a mess actually already

No kidding. Coming from engineering I wonder about the lack of understanding of most people in this industry.

XML SUCKS. PASS IT ON.

Won't argue with that.
Re:Hmmm Databases by Master+of+Transhuman · 2005-05-02 18:33 · Score: 1

As for the syntax, I think when SQL was invented it was around the same time that COBOL was still "cool" - so they borrowed the verbose syntax in the vain hope the result would be "self-documenting".

They failed.

In any event, use a decent SQL IDE that fills in the verbosity - I assume one exists somewhere.

And I think what he meant by "object programmable" (he can correct me if I'm wrong) is like database tables should be objects with embedded methods - which essentially they are with triggers and the like. This is basically what the article said.

My problem with that is OOP languages can be documented. There doesn't appear to be any good way to document an app with a bunch of tables with half the code buried in triggers and the other half buried in GUI forms. I assume some of the UML -type models could handle it but I've never seen anybody actually document a database system this way.

I'm going on the basis of the SCT Banner university management system which is full of Oracle Pro-COBOL and Pro-C code and PL/SQL code tacked onto Oracle Forms full of triggers and retrieval/update behaviors which are totally incomprehensible from an overall system function viewpoint. The documentation is totally written from an end-user viewpoint - how the developers actually manage the system is unknown to me.

I suspect a lot of database stuff is done like that - which means a maintenance nightmare scheduled for the future along with all the whiz-bang buzzwords.

--
Richard Steven Hack - This sig is TOO GODDAMN SHORT TO DO ANYTHING USEFUL WITH! MORONS!
Re:Hmmm Databases by Anonymous Coward · 2005-05-02 18:46 · Score: 0

I'm going on the basis of the SCT Banner university management system which is full of Oracle Pro-COBOL and Pro-C code and PL/SQL code tacked onto Oracle Forms full of triggers and retrieval/update behaviors which are totally incomprehensible from an overall system function viewpoint. The documentation is totally written from an end-user viewpoint - how the developers actually manage the system is unknown to me.

Do you think database-backed application systems/environments like SAP R/3, PeopleSoft, etc., are any different? They're not.
Re:Hmmm Databases by renoX · 2005-05-02 22:45 · Score: 1

> Reiser, and for that matter VFS and the other myriad of database enhanced filesystems, are tools. Good ones, but tools...

While I agree that filesystems are very different from DB, I consider also that DB are tools: even a super-complex distributed DB is a tool, in my book.
Re:Hmmm Databases by renoX · 2005-05-03 00:45 · Score: 1

>But it isn't bad as an over-the-wire protocol.

Depends on your criteria for deciding good or bad, but converting every integer into its ASCII representation does not make a good over-the-wire protocol IMHO.
Not efficient to say the least.
Re:Hmmm Databases by julesh · 2005-05-03 01:33 · Score: 3, Interesting

[XML] isn't bad as an over-the-wire
protocol.

Yes, it is. I've worked on a project that allowed offline modification of a database by replicating a copy to users' PCs, and it originally used XML as the format for data transfer. We got a 30% speedup by switching to tab-separated variables with a line of metadata at the start of each chunk of the stream. Any technology that costs that much in overhead and provides little or no perceivable benefit is a waste of time. (Of course, if your data isn't relational, this is probably not much use to you, but then... what are you storing it in? XML documents?)

The only justification for XML is that there are a lot of tools out there that work with it -- I use it as an intermediate interchange format between different environments because the libraries available make it easy with just about anything I want to access the data with.
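The size overhead being described can be illustrated with a small Python sketch that serializes the same rows both ways. The element names (`rows`, `row`), the column names, and the assumption that values contain no tabs or newlines are all made up for the example; it measures bytes only, not the 30% speedup reported above.

```python
import xml.etree.ElementTree as ET


def to_xml(columns, rows):
    """Serialize rows as one <row> element per record, one child per column."""
    root = ET.Element("rows")
    for row in rows:
        record = ET.SubElement(root, "row")
        for col, value in zip(columns, row):
            ET.SubElement(record, col).text = str(value)
    return ET.tostring(root, encoding="utf-8")


def to_tsv(columns, rows):
    """Serialize rows as tab-separated text with one metadata line on top."""
    lines = ["\t".join(columns)]
    lines += ["\t".join(str(value) for value in row) for row in rows]
    return ("\n".join(lines) + "\n").encode("utf-8")
```

In the XML form every record repeats every column name twice, once per opening and closing tag, while the tab-separated form states the column names once in the metadata line.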
Re:Hmmm Databases by Chitlenz · 2005-05-03 01:55 · Score: 1

SQL makes sense from a DBA Developer point of view, I mean that's kind of the best way I can explain it. It's kind of like how C++ is immediately attunable to some ppl, while say VB is not, etc. I think in a lot of ways it's become a self-fulfilling prophecy in that pro-DBAs I think tend to think of their "flock" of tables in terms of SQL based transactions at this point. Remember that SQL was intended to offset COBOL functions, and that all of the weird crap from there had to be carried over...

Regarding Reiser, I would pose that a filesystem is not truly organized in a way that makes sense from a data point of view. That is, reiser is using a database ideology, albeit an internal one, to organize what are typically very small sets of data. Remember that the concept of relational databases dates to the early 1970s (I think Codd was early 70s?), and that a lot of different 'better ways' have been tried. A good question to illustrate this is, ok reiser can organize a million files on disk, but what about 50 trillion? How fast does it search through them then? Does the 'dancing tree' model still hold up under load? From 10,000 simultaneous users? All this said, Reiser IS actually a killer choice for organizing datafiles used by a database, I've actually implemented several Oracle installs on linux/reiser to much success =)

On a future note, I DO think that Oracle in particular is making strides towards a kind of Database Appliance type thing, where the DB and OS are the same thing. They had a project with Sun a few years ago called like Heavy Iron or Big Iron or somesuch PR BS name that basically was the idea of selling a database/OS hybrid that installed pretuned on Sun gear. That is the future of databases right there. And Ellison knows it, which is why Oracle is so gung-ho for Linux lately. In a serious environment, no one would really consider a database server to perform any other services other than those related directly to database functions, so it's about time someone over at Oracle woke up and released an Oracle/Linux pre-tuned and optimized Distribution.

Object Programmable reflects the idea that we will (someday OMG) eventually be able to easily (and that's the key there) extend the actual cores of the databases themselves to accommodate extraneous OS level functions. It would be really nice, too, if this model actually compiled this time, unlike stored procedures in their early days (where they were basically a JIT script language, and in lots of systems like SQL Server still are). A good for instance here is, I'm both a programmer and a DBA. My particular field is medical visualization for Radiology, so essentially I have to organize huge sets of patient data in a way that I can do things like, well, volumetrically render your skull to see if you have a lesion, etc. Today, I have to pull this to the workstation, organize the dataset, and render the scene from the dataset onto the stage. Because of the flowing nature of our data (that is to say, this isn't like a game where you can pre-cache models on the local workstations since every patient is a different model), I would like a way to tie direct3d to a pre-render engine at the database layer so that all I would have to provide to a client like a web page is the end product. I'm working with MS SQL atm, so I'll use it as an example, a typical MRI image of your chest comes out of a scanner in some stupidly high resolution. That scan typically contains voxel data which is defined by the mm thickness of the slice. Your POV as an end user over the web is, 'all I care about is this one particular diagnostic output', or one image let's say. To actually GET that image may or may not require that a set of transformations be applied to a large subset of slices in any particular study. It would be really nice to not have to add external services (another app), and instead be able to directly and natively be able to access the inner workings of the database engine to do this directly, instead of offloading it to the local OS. Obj

--
Imagination is the silver lining of Intelligence.
Re:Hmmm Databases by Chitlenz · 2005-05-03 02:38 · Score: 1

Regarding materialized views, I think the biggest problem I have with them is the annoyance of persistent erosion of performance. These cost. A LOT. And the more you add, the more it costs, especially if the tables are ERP-sized. Just for a frame of reference here, I will presume that you are dealing with very large sets of data to be getting those kinds of performance gains from this, let's say 100M rows. The MVIEW, if it's not a one off, is going to cycle every time the parent table(s) are modified or addressed to build that subset. Yes, it can be timed, but again the idea is one of robbing Peter to pay Paul. Again, we are dealing with very large datasets here and every one of those is typically an individual entity with its own particular quirks, but in tests with Peoplesoft a few years ago in 9i, we noticed that you will indeed take a hit if you let these get out of control in population. To add to this, the entire concept of MVIEWS kind of defies data normalization rules by duplicating data in the first place, since an MVIEW is of course a physical entity.

To reverse all that tho, these were discussions for a time when databases needed to be tuned at very granular levels because hardware was very expensive, and today it simply isn't. I will stand by the argument that materialized views are an unnecessary evil that can, today, be avoided by buying better hardware, for the most part.

Moore's law is not a law, it's more of a shamanistic prophecy, but I did your example load of 500M+ rows per day on Sunfire v880 machines for 5 years using regular generic StoreEdge racks, with subsecond response times to 250 users (OLTP). Most problems with ANY query can be tuned with either A) better code or B) better indexes. Hardware is a last resort in a way, but we already think of it as almost a disposable commodity. I'm about to implement 2 6TB plus installations running single Sun 40z database servers with attached Netapp storage containers. Building a simple resultset of any size is becoming very trivial on this kind of modern hardware, and will be the death of shortcuts before long.

Clusters are a whole topic unto themselves, but RAC and Grid architectures are actually both more stable and more progressive on Oracle than DB2 atm. The entire idea is becoming more of a sit anywhere and select like it's local phenomenon, where the 'network' of databases are all aware of each other and know how to automatically retrieve pertinent data from wherever it lives and make it all seem like it just magically happened. Clusters are all about redundancy and processing power, but once machine power hits critical mass (I think it's soon) and each node is capable of huge amounts of stable processing, the next step is to stop storing so much redundant trash and just have each piece of data live one place. This is Oracle's Grid.

-- chitlenz

--
Imagination is the silver lining of Intelligence.
Re:Hmmm Databases by FriedTurkey · 2005-05-03 03:01 · Score: 1

For an idea of scale, the last replication project I wrote for an employer propagated over Oracle DB_LINKS via triggers to synchronize a dataset in two cities, log it, and do something with errors. Because this particular system was a Peoplesoft installation, it was a subset of 6800 tables and 15k lines of code give or take some triggers, with NO debugger. OMG, it's like a "finally" moment to have someone even claim to be fixing this soon in their architecture.

If you are trying to sync PeopleSoft instances, PeopleSoft provides OTB tools to natively sync databases using the Integration Broker. Managing the sync of 10 to 20 components is easier than managing the thousands of tables used by the components.
Re:Hmmm Databases by poot_rootbeer · 2005-05-03 03:34 · Score: 1

It's interesting to note that MS has finally figured out that the "n-tier" was a dumb idea.

Except it's not. N-tier without an appropriate form of CACHING, now there's a dumb idea. But a properly designed n-tiered architecture can be more scalable and ultimately more functional than a flat architecture could.
Re:Hmmm Databases by kpharmer · 2005-05-03 03:35 · Score: 0

> Yes, it is. I've worked on a project that allowed offline modification of a database by replicating a
> copy to users' PCs, and it originally used XML as the format for data transfer. We got a 30% speedup
> by switching to tab-separated variables with a line of metadata at the start of each chunk of the
> stream.

Yeah, but I don't think that's a reasonable example - replicating a copy of a database to a user?

You also probably could have gotten a 75% speedup by compressing it before you sent it. But that doesn't mean that plain ascii is a bad thing.
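A quick way to get a feel for the trade-off is to serialize the same rows both ways and compare sizes, with and without compression. This is only a sketch with made-up sample data; byte counts are a proxy for transfer time, not a measurement of the speedups quoted above, and the TSV writer assumes values contain no tabs or newlines.

```python
import zlib
from xml.sax.saxutils import escape

def to_xml(columns, rows):
    """One element per field, so every value carries its column name."""
    parts = ["<rows>"]
    for row in rows:
        parts.append("<row>")
        for col, val in zip(columns, row):
            parts.append(f"<{col}>{escape(str(val))}</{col}>")
        parts.append("</row>")
    parts.append("</rows>")
    return "".join(parts).encode("utf-8")

def to_tsv(columns, rows):
    """A single metadata line with the column names, then tab-separated values."""
    lines = ["\t".join(columns)]
    lines += ["\t".join(str(v) for v in row) for row in rows]
    return "\n".join(lines).encode("utf-8")

# Hypothetical sample data, purely for illustration.
columns = ["id", "name", "city"]
rows = [(i, f"user{i}", "Springfield") for i in range(1000)]

xml_bytes = to_xml(columns, rows)
tsv_bytes = to_tsv(columns, rows)
sizes = {
    "xml": len(xml_bytes),
    "tsv": len(tsv_bytes),
    "xml+zlib": len(zlib.compress(xml_bytes)),
    "tsv+zlib": len(zlib.compress(tsv_bytes)),
}
```

Printing sizes shows how much of the XML overhead is repeated tag names, which is exactly the kind of redundancy a compressor removes.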

And no, I don't enjoy working with xml. I think it's bloated & overhyped. But...in sending data between organizations or systems, its self-documenting nature is a plus. Usually.
Re:Hmmm Databases by Chitlenz · 2005-05-03 10:19 · Score: 1

Yeah this is I believe a relatively new thing for them. We were stuck in 7.52 at the time, with no upgrade path in sight due to legal issues, a problem I might add that is very common in implementations.

Even so, unless I'm mistaken, OTB tools are similar to Data Mover, which used to hardcore suck; I don't know about now.

Our problem was one of replicating subsets of a large dataset that were not necessarily single tables, and the target was not a PeopleSoft install either (in other words, PS -> remote data warehouse project), so it probably wouldn't have applied anyway.

--chitlenz

--
Imagination is the silver lining of Intelligence.

is that why WinFS will require 1GB RAM? by weighn · 2005-05-02 14:18 · Score: 1

the huge resource strain (memory and CPU)
... will come, not from your dbms, but your os. The difference being?

--
Mongrel News all the news that fits and froths

Microsoft is taking over the world by n2networksolutions · 2005-05-02 14:19 · Score: 0, Redundant

Does anyone besides me get the feeling Microsoft is going to take over the world one day? Jeremy MCSE MCSA CCNA http://www.n2networksolutions.com/ Arizona computer consulting

Re:Microsoft is taking over the world by tocs · 2005-05-03 01:27 · Score: 1

No.
I think that MS is at best (best for them) just going to be around for a long long time. It will be the companies that deal with wide ranges of the data these databases are supposed to keep track of that will take over the world. MS will always be trying to be trendy. The companies that know who we are voting for, what we are buying, where we want to live, why we are afraid to travel to certain places, and how to keep track of all that and more are the companies that are going to have a shot at taking over the world.

Personally, I think google is going to take over the world.

Re:I want clustered databases for high-availabilit by Chitlenz · 2005-05-02 14:21 · Score: 1

"The 'next great advancement' in databases will be when I can setup 2 or more Microsoft servers."

This actually works too. Stably too, go figure.

--chitlenz

--
Imagination is the silver lining of Intelligence.

You're thinking of Object Databases by SuperKendall · 2005-05-02 14:22 · Score: 1

Object relational was the "new thing" that didn't really take off as well as they'd hoped.

You're thinking of Object databases, which indeed did not take off at all.

However Object-Relational systems are EVERYWHERE. There's hardly a big database anymore that doesn't have several object-relational mapping systems between it and code...

Object->Relational mappers have taken off in a big way, which is good in a way since the databases can remain the nice placid solid systems they've always been and you can go to them directly when things get inefficient.

--
"There is more worth loving than we have strength to love." - Brian Jay Stanley

Re:You're thinking of Object Databases by Anonymous Coward · 2005-05-02 15:44 · Score: 0

No. He was talking about ORDBMS. You're talking about ORM.

time to retire, perhaps by cahiha · 2005-05-02 14:22 · Score: 1

From a practical point of view, the kind of databases that people like Gray have made a career out of have been very useful up to a point, no question.

On the other hand, those databases have already been pushed far beyond their limits: people have been using them inappropriately in many applications. Much of the "hot" recent stuff Gray mentions is not new technology: smart people have been proposing it and using it for years, only to be beaten down in the market by the relentless push behind relational technologies beyond any reasonable limits.

The research problems aren't new either: approximate, probabilistic, and inferential retrieval have a long tradition, but they have received relatively little funding because most of it has been going to fairly meaningless and incremental improvements in the kinds of traditional databases Gray has made a career out of.

Let bygones be bygones, but perhaps it's time for people like Gray to retire and leave the next generation of databases to people who actually have the background to work on areas like approximate and probabilistic retrieval. Research in those areas can build on the core storage technologies for existing databases, but the kind of expertise required for making new discoveries in those areas is completely different from what someone coming out of a database research group will have learned.

More drivel from people who should know better by Anonymous Coward · 2005-05-02 14:26 · Score: 0

How many times have I read database articles that say the same damn things through a thick and wooly fog of imprecise language and confused concepts, offering absolutely nothing: 1) the world is different now 2) the relational model is dead 3) XML, objects, blah blah blah.

We live in a time of extreme change, much of it precipitated by an avalanche of information that otherwise threatens to swallow us whole. Under the mounting onslaught, our traditional relational database constructs--always cumbersome at best--are now clearly at risk of collapsing altogether.

*sigh* Where to begin? First of all, the relational model is just that: a model. In fact it's safe to say that it's the model for data storage and manipulation, since nobody has presented one that's 1) more general and 2) doesn't reduce to the relational model. Clearly what the author is referring to are current poor implementations of non-relational SQL-based database applications. Normally, a knowledgeable person should stop reading. But let's continue. Perhaps, even with poor wording, he still has something to offer. Even if it's just more quotes of the week.

Aside: has there been any time in modern history when we weren't in a time of extreme change, and there wasn't an onslaught of information? Why were databases invented in the first place?

Accordingly, the modern database system increasingly depends on massive main memory and sequential disk access.

This was a particularly funny quote.. what did they depend on before?? The printer?

now because databases have become the vehicles of choice for delivering integrated application development environments.

Another great quote! Databases are used to deliver IDEs?? What is he selling? Maybe he meant applications, not application development environments, but that doesn't make sense either.. applications have always been about data + algorithms + UIs.

Now ... business logic can run inside the database. Active databases are the result, and they offer tremendous potential--both for good and for ill.

Good lord.. hasn't he ever heard of a constraint? A stored procedure? A user-defined function? Column types?? That's business logic! At least he got some new terminology out of it: "active database". I'm now confident predicting that this article will consist mainly of coming up with new names for old things.
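For the record, here is what "business logic in the database" has looked like for decades, sketched with SQLite from Python. The account table and the withdraw() helper are invented for the example; the point is only that the rule "no negative balances" lives in the schema and the engine enforces it for every client.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# The business rule is part of the table definition, not the application.
db.execute("""
    CREATE TABLE account (
        id      INTEGER PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    )
""")
db.execute("INSERT INTO account VALUES (1, 100)")
db.commit()

def withdraw(conn, account_id, amount):
    """Try a withdrawal; the CHECK constraint rejects overdrafts."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, account_id))
        return True
    except sqlite3.IntegrityError:
        return False
```

A call like withdraw(db, 1, 500) fails no matter which application issues it, which is the whole argument for constraints over client-side checks.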

The really big news here is that these languages have also been fully integrated into the current crop of object-relational databases. The runtimes have actually been added to the database engines themselves such that one can now write database stored procedures (modules), while at the same time defining database objects as classes within these languages.

What does any of this mean? What was I using to write stored procedures before? Why are you calling them "modules" now? What is a "database object" and how can you define an "object" as a "class"? I thought a class was a template for an object?

With data encapsulated in classes, you're suddenly able to actually program and debug SQL

Anybody remember this TV commercial: "When pizza's on a bagel, you can have pizza anytime". I don't know what that meant, and I don't know what he means here either.

Now, fields are objects (values or references); records are vectors of objects (fields); and tables are sequences of record objects.

Yup, I was right... let's rename everything, that makes it new and fresh! Fields are now objects, values, or references.. records are now vectors and tables are sequences (is that different than a vector?). And somewhere in th

Google is a good example... by dantheman82 · 2005-05-02 14:34 · Score: 3, Insightful

I'd personally ask a Google employee where the future of databases is heading. The Google FS really shows where databases are moving...

I give Gray a lot of respect in most cases because he's a really smart guy. But the math and computationally-intensive parts should be focused in the probabilistic searches.

In one sense, though, Gray is quite right. And this is the direction of speech recognition. I might add that the Speech Server beta out by Microsoft is quite good...even at this stage.

--
This sig donated to Pater. Long live /.

Re:Google is a good example... by Anonymous Coward · 2005-05-02 19:16 · Score: 1, Insightful

Google is a fine example of a specific kind of database that is optimized for a very specific kind of query. Google writes everything to support their access patterns.

That's fine for Google, but they don't make general purpose DBs. There's no way that Wal-Mart would want to run their transaction processing or OLAP on anything resembling Google's DBs.

dom

The Future Of Databases? by brian_olsen · 2005-05-02 14:50 · Score: 2, Interesting

I see the future now and it will happen in two phases: getting rid of SQL and then replacing it with something half-way decent (like a properly implemented relational algebra.)

Re:The Future Of Databases? by Anonymous Coward · 2005-05-02 19:52 · Score: 0

Are you joking? Trying to program in the relational model is like trying to program in Pascal -- it contains no real-world necessities like outer joins or null values.

And the whole point of this article is that the relational model is no good for representing the types of data and queries that we need to have. For example, if a hospital were to store patient charts in a database, the thousands of tables needed would quickly give their Oracle DBA a heart attack. Do you want to write the relational algebraic notation for the query "Which patients are most at risk of a heart attack"? Ideally the computer would be able to figure out which parts of patient histories of heart attack victims correlate the best, and would then be able to find the patients with the closest sets of risk factors. In fact, I would like to be able to phrase the question like I did, instead of having to encode it in some special query language.

Sure, you could argue that these sorts of things don't belong in the database proper, but they are all general purpose data manipulation tools. They belong in the database just as much as bulk import/export tools do.

dom
Re:The Future Of Databases? by brian_olsen · 2005-05-03 06:54 · Score: 1

a. Why is it not possible for the relational model to manage data of arbitrary complexity? Current implementations probably can, probably can't, but that says nothing about the model itself.

b. If you can show that natural language research has reached a point where it can be used in production systems, then your argument about asking questions without encoding them as database queries has merit. Until then, I would like to rely on the explicitness and preciseness of a query language, despite the fact that I have to "encode" my question into a query language (it seems like this is done all the time in programming, anyway.)

c. You are asserting something based on something I know nothing about (hospital systems). I don't know how you can really even assert your point about the weaknesses of the relational model without first giving me a explanation of the domain that you are using as an example. Even then, the domain could be open to subjectivity in terms of implementation options.

Hey, use what you want. For me, I enjoy the generality and simplicity that the relational model provides. Making it easier without SQL (which I originally was saying) will make it better, I believe.
Re:The Future Of Databases? by Anonymous Coward · 2005-05-03 07:31 · Score: 0

SQL is extremely useful. SQL is essentially a people-friendly way of using first-order predicate logic. As such, it allows programmers -- most of whom today, unfortunately, have poor mathematics skills -- to create simple yet extremely powerful ways of analyzing and combining data. Imagine how difficult most programmers would find using databases if they used the symbols that mathematicians use.

Granted, most programmers who write SQL statements, write crappy ones. Then again, most programmers who write [insert language of your choice] code, write crappy [repeat language name] code.

That's not an argument against SQL. It's an argument against mediocre programmers. When used properly, SQL is simple, powerful, and elegant. The parts of SQL that are less elegant are the parts that are database specific, i.e. proprietary. Stick to standard SQL and you will find it much more "decent".

Finally, remember that SQL is just a language for performing predicate logic. Regardless of what language you use, you still need to understand the mathematics of set theory in order to effectively manage large data sets. This is not a property of computers or languages, but rather it is a property of the universe. I.e., blame whatever god you believe in for the complexity of managing data.
Re:The Future Of Databases? by brian_olsen · 2005-05-03 09:28 · Score: 1

In many cases, I think a simpler (as in more orthogonal) query language can make life a little easier than with SQL. Otherwise I agree with what you say. SQL or not, it's not imperative programming, but playing with sets. Some don't want to see it that way.

For what it's worth, I think saying something in SQL is much better than saying it with imperative code, given SQL's descriptive properties.
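A small side-by-side makes the difference visible. The question is "which customers have spent at least some minimum in total?", answered once with a loop and once with a query. The orders data and both function names are invented for the example, and SQLite is used only because it ships with Python.

```python
import sqlite3

orders = [("ann", 30), ("bob", 5), ("ann", 20), ("cy", 50)]

def totals_imperative(rows, minimum):
    """Spell out HOW: accumulate, then filter, then sort."""
    totals = {}
    for customer, amount in rows:
        totals[customer] = totals.get(customer, 0) + amount
    return sorted((c, t) for c, t in totals.items() if t >= minimum)

def totals_sql(rows, minimum):
    """State WHAT is wanted and let the engine decide how to compute it."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
    db.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return db.execute(
        """SELECT customer, SUM(amount)
           FROM orders
           GROUP BY customer
           HAVING SUM(amount) >= ?
           ORDER BY customer""",
        (minimum,)).fetchall()
```

Both return the same rows; the SQL version just describes the set it wants instead of the steps to build it.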

The future of databases is... no Database at all!! by vhogemann · 2005-05-02 14:51 · Score: 2, Interesting

Picture this... memory nowadays is a hell of a lot cheaper than a full Oracle Licence. So, instead of investing in a DBMS why not buy massive quantities of ECC memory and keep all instances of your data in-memory for near instant access?

Crazy idea, huh? What if I said that this can be as fast as 8000 times faster than Oracle? And 3000 times faster than MySQL!

Crash recovery? No big deal, keep a serialized version of your in-memory-objects, and a transaction log and you're set!
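For readers wondering what "a serialized version plus a transaction log" amounts to, here is a bare-bones sketch in Python. It is not Prevayler's API (Prevayler is a Java library); the TinyPrevalence class, its file names and its two commands are invented purely to show the recovery idea.

```python
import json
import os

class TinyPrevalence:
    """Keeps a dict in memory; recovers it from a snapshot plus a command log."""

    def __init__(self, directory):
        self.snapshot_path = os.path.join(directory, "snapshot.json")
        self.log_path = os.path.join(directory, "commands.log")
        self.state = {}
        self._recover()

    def _apply(self, command):
        op, key, value = command
        if op == "set":
            self.state[key] = value
        elif op == "del":
            self.state.pop(key, None)

    def _recover(self):
        # Load the last snapshot, then replay every command logged after it.
        if os.path.exists(self.snapshot_path):
            with open(self.snapshot_path) as f:
                self.state = json.load(f)
        if os.path.exists(self.log_path):
            with open(self.log_path) as f:
                for line in f:
                    self._apply(json.loads(line))

    def execute(self, op, key, value=None):
        # Write the command to disk BEFORE changing memory.
        with open(self.log_path, "a") as f:
            f.write(json.dumps([op, key, value]) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self._apply((op, key, value))

    def snapshot(self):
        # Write to a temp file and rename, so a crash never leaves half a snapshot.
        tmp = self.snapshot_path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.state, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.snapshot_path)
        # Replaying old set/del commands is harmless, so truncating last is safe.
        open(self.log_path, "w").close()
```

Note what is missing: ad-hoc queries, concurrency control, and anything that helps once the data outgrows RAM.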

Read more at:
http://www.prevayler.org/

--
---- You know how some doctors have the Messiah complex - they need to save the world? You've got the "Rubik's" complex

Re:The future of databases is... no Database at al by Anonymous Coward · 2005-05-02 14:56 · Score: 1, Insightful

Uhm, keeping your data in RAM with a serialized version on disk is a database, what makes you think it isn't?

But what if you want to access your DB from a different application that has a different serialization format? What if you want to perform arbitrary, ad-hoc queries that have nothing to do with your original object structure? What if my DB grows beyond my RAM? Oops. Welcome to 1970, we're working on solving these problems.

(For the record, the author did talk about memory databases.)

Re:I want clustered databases for high-availabilit by Nutria · 2005-05-02 15:03 · Score: 1

Can you do this without shared storage?

Why would you want to?

With shared storage (hello, VAXcluster 1984!), you still have access to all of your data as long as one of the nodes stays up.

--
"I don't know, therefore Aliens" Wafflebox1

The important question is by The+Big+Ugly · 2005-05-02 15:09 · Score: 0, Offtopic

will this still work in future databases?

1. On a new Worksheet, Press F5

2. Type X97:L97 and hit enter

3. Press the tab key

4. Hold Ctrl-Shift

5. Click on the Chart Wizard toolbar button

6. Use mouse to fly around - Right button forward/ Left button reverse

Yes, i AM aware that excel is a SPREADSHEET. it's my feeble attempt at a joke you vultures.

Google or something like it by Anonymous Coward · 2005-05-02 15:10 · Score: 1, Interesting

Will Google or a post-Google answer questions like:

Show me the cost of airline tickets when The Who were touring during winter and compare that, inflation adjusted to airline tickets today that I can purchase now.

Don't laugh, these people have one goal in mind - answering questions based on data on your disk or on the web.

Database as file system by bananahead · 2005-05-02 15:12 · Score: 2, Interesting

The only force that can change the nature and architecture of current database technology is a fundamental change in the way they are used. Change the requirements and the technology will change to meet the new requirements. Change the requirements in a radical way and you will get radically new technology.

The use of a database as a file system will require radical new technological advances in database theory as the current methods break down under the new requirements. The functionality of the file system will change as the capabilities of an underlying database are realized. The two forces together will create an interesting discontinuity in the industry, the kind the venture capitalists look for.

It's all good. Pray for WinFS.

--
A most overlooked advantage to owning a computer is if they foul up there's no law against wacking them around a bit.

Re:Database as file system by argent · 2005-05-02 16:22 · Score: 1

Pray for WinFS.

Like, one assumes, one would pray for a sick friend?

What is now considered the traditional file system API is not well designed for databases, but there have been other ones that might be better used in the past: an API that does for databases what the UNIX API (after all, virtually all file system APIs these days are based on it) did for files is needed.
Re:Database as file system by Anonymous Coward · 2005-05-02 19:51 · Score: 0

What a crock of shit!

Right now my "requirement" is an FTL drive. No? You mean I won't get radically new technology just because I want?

Technology doesn't advance because you want it to, much less the way you want it to. Technology advances when new discoveries are made, a notoriously unpredictable phenomenon, ESPECIALLY radical ones. Somebody could have come up with relativity in 1800 or 10 years ago. And no amount of funding or "praying" can change that.

Has it occurred to you, that, just perhaps, WinFS was dropped because our present knowledge of DB theory doesn't allow it to work well? And that, Moore's law notwithstanding, perhaps never will?

Or has it occurred to you that WinFS was dropped because, at the end of the day, it's a bad idea. A DB on the side for metadata, a la Spotlight? Sure. A DB inside the OS for DB like content? Yeah... A DB as a file system? Huh? And that helps with what exactly?
Re:Database as file system by bananahead · 2005-05-03 02:23 · Score: 1

I can certainly understand why you would not want to show your real identity with a reply like that.
Just to be clear, I know EXACTLY why WinFS was postponed. It has nothing to do with anything in your rant.

Technology advances out of need. Someone could have come up with relativity 1800 years ago but for the fact that they were more interested in not dying. Survival was the order of the day.

You are a waste of everyone's time.

--
A most overlooked advantage to owning a computer is if they foul up there's no law against wacking them around a bit.

Re:I want clustered databases for high-availabilit by Slashamatic · 2005-05-02 15:27 · Score: 1

Digital did this over ten years ago. One of the things that Oracle inherited when they bought RdB from Digital was the cluster support. However it seems they took a long time to get the technology into their own RDBMS.

Re:I want clustered databases for high-availabilit by afabbro · 2005-05-02 15:37 · Score: 2, Informative

Who moderated this interesting? "-1 Clueless" or "-1 ill-informed" is more like it.

2 servers acting as a single database server has been available for many years...e.g., Oracle 9i RAC, Oracle 10g, DB/2's something or other, etc.

--
Advice: on VPS providers

Re:I want clustered databases for high-availabilit by kpharmer · 2005-05-02 15:47 · Score: 1

> Digital did this over ten years ago. One of the things that Oracle inherited when they bought RdB
> from Digital was the cluster support. However it seems they took a long time to get the technology
> into their own RDBMS.

I've got fond memories of an 800 gbyte billing & customer data warehouse on rdb around 1995 - giving sub-second access and running on a vms quad. That was such a slow system compared to what we've got now - but it sure handled a ton of data well.

On the other hand, I don't remember how much was due to the excellent clustering in vms - or how much was due to rdb...

Burned-out Hippie Speaks of Love, DB Integration by Anonymous Coward · 2005-05-02 15:55 · Score: 0

Man, is there anything he left out? My God, you'd think that everything (my TIVO, my IPOD, XML, streaming data, web servers and my mother's apple pie) was a database. This guy was a stoner, but his brain's too fried to qualify now. Most hippies were a lot brighter and did good theoretical work; this fellow's little more than the burned-out husk of what once passed for a hacker.

What ever happened to OODBs? by elgee · 2005-05-02 15:58 · Score: 2, Interesting

At one time, I thought object oriented databases were going to be the next big thing.

Re:I want clustered databases for high-availabilit by SpecialAgentXXX · 2005-05-02 16:03 · Score: 1

My original post said:

The "next great advancement" in databases will be when I can setup 2 or more linux servers and have them act as a single database server.

i.e. Out of the box, I can setup a database cluster. I'm talking about costs for HA. If I can dump the big iron for 2- or 4-way x86 servers, I'd save money. But if I need to pay a lot for support for Oracle's RAC, or a custom setup/installation, etc., then I'm not saving money. It all comes down to the bottom line.

Add Prolog To the Database by Anonymous Coward · 2005-05-02 16:04 · Score: 0

Prolog is the best relational language to date and should be integrated into databases. Only one more procedural language, Common Lisp, should be added.

Prolog can do everything SQL does and much more. It is the natural language of relational databases.

-xeo_at_thermopylae@yahoo.com

regex by yagu · 2005-05-02 16:24 · Score: 1

Regex! As processors get faster, memory gets cheaper.... I wouldn't be surprised to see better, faster, etc. implementations of regex that allow doing what full-blown databases do today. Of course that's in a read-only context, but I've implemented full-blown "database" applications centered around the regex. And while some will point out that regex doesn't deal with integrity and data management issues, I would point out that many databases are implemented in overkill mode, where data integrity and management are handled sufficiently and nicely by underlying OS mechanisms and the database engine itself becomes unnecessary (sometimes even adds overhead).

Personally, I think so many things are "database" implemented because some glossy brochure somewhere convinced a room full of PHB's they needed a database solution.

Again, let me re-iterate, I wouldn't suggest this replaces and/or solves database issues, and becomes the new direction of database technology, but the increased processor speeds DOES allow for implementations relying solely on "crufty" technologies (e.g., regex, be it perl, awk, python, whatever) instead of databases costing tens of thousands (and more) dollars.
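As a rough illustration of the read-only case, here is a sketch of a regex acting as the "query engine" over a flat file. The log format, the RECORD pattern and the select() helper are all made up for the example; it does a full scan on every query and offers no integrity guarantees.

```python
import re

# A hypothetical flat file: date, user, action, status.
LOG = """\
2005-05-02 alice login ok
2005-05-02 bob login failed
2005-05-03 alice logout ok
"""

# Named groups play the role of column names.
RECORD = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<user>\w+) (?P<action>\w+) (?P<status>\w+)$",
    re.MULTILINE,
)

def select(text, **where):
    """Return every record whose named fields equal the given values."""
    return [
        m.groupdict()
        for m in RECORD.finditer(text)
        if all(m.group(field) == value for field, value in where.items())
    ]
```

A call such as select(LOG, user="alice") behaves like a WHERE clause, which is fine for small files and a poor fit once joins or updates are needed.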

Re:regex by Mant · 2005-05-02 20:57 · Score: 1

I think in larger companies lots of things are implemented as database because IT departments keep getting bitten by things not implemented as databases.

These things have a nasty habit of growing in size, complexity and importance, then someone wants to join the data up with something in your DB. Now you have to reimplement the whole thing.

In a large company you probably have a licence agreement or an existing machine you can use (just put a new schema in) so the money cost isn't much. You probably have tools for developing and writing/running reports on the DB, and in-house expertise in using them. Your databases are probably already set up with backup and failover, so all that is done for you.

So what some people see as overkill, others see as future proofing.

Re:The future of databases is... no Database at al by kpharmer · 2005-05-02 16:27 · Score: 2, Informative

> So, instead of investing on a DBMS why not buy massive quantities of ECC memory and keep all
> instances of your data in-memory for near instant access?

because a *well-tuned* relational database with a 1:4 ratio of memory to disk is almost as fast as an in-memory database - due to efficient caching

because some queries require an enormous amount of temp space. supporting them can easily double your space requirements - which have to be purchased in memory.

because if you just want to run your database in-memory you can already do that with most databases.

because you don't have the same speed requirements for every piece of data in your database. You might have some tables used for session & user management that are often read & written to and must be very fast. But other tables that just hold seldom-accessed historical data. A modern database would allow you to keep the small & fast tables effectively in memory, and the huge 100 gb history table on disk. And you don't have to buy 100 gbytes of memory to do it.
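A loose analogy for the hot/cold split, using SQLite from Python: one connection with the big history table on disk and a small session table in an attached in-memory database. Server databases typically do this with buffer pools and table pinning rather than separate storage, so treat this as an illustration only; open_mixed() and both table names are hypothetical.

```python
import sqlite3

def open_mixed(path):
    """Open a disk database and attach an in-memory one for hot data."""
    db = sqlite3.connect(path)                        # main schema lives on disk
    db.execute("ATTACH DATABASE ':memory:' AS hot")   # 'hot' schema lives in RAM
    db.execute("CREATE TABLE IF NOT EXISTS history (day TEXT, total INTEGER)")
    db.execute("CREATE TABLE hot.session (token TEXT PRIMARY KEY, user TEXT)")
    return db
```

Queries can reference both schemas in one statement, but anything in hot is gone when the connection closes, so it only suits data you can afford to lose or rebuild.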

because...it's just a bad idea.

Maybe there is a light on the horizon by ktija · 2005-05-02 16:29 · Score: 1

www.teramanager.com Teramanager - HRI (Historic Retrieval Interface). In some cases, Customer Care departments need to have historic information online to fulfill customers' requests. These requests force the operator to be online with the master database, generating significant traffic over the network and an extra workload for the host. HRI offers a way to avoid these problems by making a large historic database very easy to handle, distribute and install in remote locations without having to be connected to the "real" database. This module is used to publish information over the internet while avoiding exposure of the main database. The information is secure and is accessed at very high speed. Applications: reduce host traffic and database inquiries; access security (users do not access the host database); internet access.

Re:The future of databases is... no Database at al by rossifer · 2005-05-02 16:53 · Score: 4, Insightful

[misc drivel] Read more at:
http://www.prevayler.org/

Oh my dear god. You've never actually used Prevayler have you? Prevayler isn't nearly as useful on actual data problems as Prevayler's worshippers would have you believe.

I know this because I tried to use it. If you'd ever tried to use it, you'd know how unbelievably poorly it performed when attempting to implement real world queries. You have to implement every query in Java, and Java is a particularly poor implementation choice for creating complex queries.

What if I said that this can be as fast as 8000 times faster than Oracle?

This "performance comparison" that the Prevayler group trots out is particularly funny as their test uses a single ArrayList of objects as in-memory "storage" and then "queries" it by index. Not exactly a realistic problem. Try a query across four classes with a few million instances of each class and you'll quickly discover what relational databases are good for.
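To show why lookup-by-index says little about query performance, here is a sketch contrasting two ways of joining two in-memory collections. The customers/orders data and the function names are invented for the example; this is not the Prevayler benchmark itself.

```python
from collections import defaultdict

def nested_loop_join(customers, orders):
    """Compare every customer with every order: len(customers) * len(orders) checks."""
    return [
        (c["name"], o["item"])
        for c in customers
        for o in orders
        if o["customer_id"] == c["id"]
    ]

def hash_join(customers, orders):
    """Index orders by customer id once, then probe the index per customer."""
    by_customer = defaultdict(list)
    for o in orders:
        by_customer[o["customer_id"]].append(o)
    return [
        (c["name"], o["item"])
        for c in customers
        for o in by_customer.get(c["id"], [])
    ]

customers = [{"id": 1, "name": "ann"}, {"id": 2, "name": "bob"}]
orders = [
    {"customer_id": 1, "item": "disk"},
    {"customer_id": 2, "item": "ram"},
    {"customer_id": 1, "item": "cpu"},
]
```

With a few million rows per collection the nested loop becomes unusable, and choosing between strategies like these is work a relational engine's planner does for you; in hand-written object code you do it yourself for every query.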

Regards,
Ross

Bullshit artist... by Anonymous Coward · 2005-05-02 16:58 · Score: 2, Informative

Prevayler exposed.

Unfortunately... by Craig+Ringer · 2005-05-02 17:05 · Score: 1

there isn't a +6(Interesting).

The big end of the database world has always seemed strange to me. Your post provides some interesting views on that area.

Thank you, Wikipedia by peachpuff · 2005-05-02 17:27 · Score: 1

. . . I no longer have to debunk the sweeping claims of an AC--I can simply provide a link.

--
-- . . ramblin' . . .

Re:Thank you, Wikipedia by Anonymous Coward · 2005-05-02 18:14 · Score: 0

I would also like to take this opportunity to thank Wikipedia.

Anything in that link (which for those who have not followed it, is an explanation of the ACID buzzword for databases) is not database-specific. In point of fact elements of it apply all the way down to assembler, if you take things like the atomicity of interrupt handler behaviour as being significant.

Really, anything in there can be said to apply to the desired behaviour of any multithreaded data structure handling code.

You do not need a database to provide the ACID guarantee, and you should demand the ACID guarantee in the behaviour of code which has to deal with databases as well, otherwise you will find eventually that your code is out of sync with the behaviour of the database. As a trivial example, consider something which generates reports based on the information in a live database, but is naively programmed and in generating two reports in sequence, generates two graphs showing two different revenue rates.

This, for those who are in any doubt, is bad.
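The report problem can be sketched without any database at all. In this hypothetical example, Ledger is live data that other threads keep appending to, and both reports are derived from one snapshot taken under a lock, so they cannot disagree with each other. The class and function names are invented for illustration.

```python
import threading

class Ledger:
    """Live, shared data that writers may append to at any time."""
    def __init__(self):
        self._lock = threading.Lock()
        self._sales = []

    def record(self, amount):
        with self._lock:
            self._sales.append(amount)

    def snapshot(self):
        # Copy under the lock so the reader sees one consistent state.
        with self._lock:
            return tuple(self._sales)

def reports(sales):
    """Derive every report from the SAME snapshot, never from live data twice."""
    total = sum(sales)
    average = total / len(sales) if sales else 0.0
    return {"total": total, "average": average}
```

The naive version reads the live ledger once per report and can end up showing a total and an average that describe two different moments.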

So all that leaves you with is the assertion that somehow ACID is the special sauce which makes databases tingly.

I'm sure it is. But that doesn't, by itself, justify using a database when entirely other programming approaches can do the same thing; usually faster, cheaper, and smaller.
Re:Thank you, Wikipedia by peachpuff · 2005-05-02 19:11 · Score: 1

"So all that leaves you with is the assertion that somehow ACID is the special sauce which makes databases tingly.

"I'm sure it is. But that doesn't, by itself, justify using a database when entirely other programming approaches can do the same thing; usually faster, cheaper, and smaller."

Maybe you should re-read the post I was responding to. The AC (you?) claimed that databases have no "tingle" at all. I pointed out that there is something of value there: it's not just deadweight overhead and generic storage. Thanks for taking my side on that one.

Sure, you can get all the features of a database from something that's not a database, but let's let people make their own price/performance decisions. (There are some free DBMS's, by the way.)

--
-- . . ramblin' . . .

One (At Least) Problem I Have With The Article by Master+of+Transhuman · 2005-05-02 17:43 · Score: 1

This notion of "active databases" seems to me to be interesting but fraught with problems.

Not least of which is the old bugaboo - documentation. How do you document a system composed of myriad triggers scattered on myriad tables in myriad databases communicating over the Net?

All I know from trying to decipher ONE Oracle Forms application at City College of San Francisco is that it is nearly impossible to get a handle on what happens where when. There appears to have been NO effort made by Oracle to enable a coherent method of documenting an application developed with their Forms technology - or of reverse-engineering such an application in order to develop such documentation.

Just printing out a bunch of trigger code and GUI design panels says nothing about how the app actually is supposed to WORK.
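For server-side triggers, at least, the catalog can serve as a starting point for documentation. The sketch below uses SQLite's sqlite_master table from Python; it is not Oracle Forms (whose triggers live in the client application), and the describe_triggers() helper and the demo schema are invented for the example.

```python
import sqlite3

def describe_triggers(db):
    """List (table, trigger name, source) for every trigger in the database."""
    return db.execute(
        """SELECT tbl_name, name, sql
           FROM sqlite_master
           WHERE type = 'trigger'
           ORDER BY tbl_name, name"""
    ).fetchall()

# Hypothetical demo schema with a single audit trigger.
demo = sqlite3.connect(":memory:")
demo.executescript("""
    CREATE TABLE enrollment (student TEXT, course TEXT);
    CREATE TABLE audit (student TEXT, course TEXT);
    CREATE TRIGGER enrollment_audit AFTER INSERT ON enrollment BEGIN
        INSERT INTO audit VALUES (NEW.student, NEW.course);
    END;
""")
trigger_docs = describe_triggers(demo)
```

This tells you what fires on which table, but, as the post says, a listing of trigger code still does not explain how the application is supposed to work as a whole.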

Great for obfuscating your proprietary code, I suppose.

--
Richard Steven Hack - This sig is TOO GODDAMN SHORT TO DO ANYTHING USEFUL WITH! MORONS!

Re:One (At Least) Problem I Have With The Article by anvil+{UK} · 2005-05-02 22:52 · Score: 1

I agree entirely about the issues of documenting (and indeed instrumenting) database code (Though I'd characterize forms as a programming language and client side code).

But, but, but

look at the general standard of documentation of traditional client side and server side code - all those reams of clear lucid documentation of how the EJB links to that EJB to this datastore to this web service and so on. The base problem with documentation is that geeks can't write (or don't understand the app they are building, or both).

Oracle code (the RDBMS I work with) can be documented and apps can be documented. That they aren't generally isn't a technology fault but a design and build fault.

Re:I want clustered databases for high-availabilit by Master+of+Transhuman · 2005-05-02 17:49 · Score: 1

Well, since Microsoft recommends running a separate server for every server function, I imagine they'd say if you want to run two SQL Server databases, you'd best use two SQL Server engines running on two separate Windows Servers on two separate hardware systems - for which of course, you pay for two licenses (and two more for the Windows Servers).

Of course, Oracle with their database layout basically says the same thing - except they want you to put your indexes, your tablespaces, your logs and everything else on at least SEVEN separate servers...

Funny how that works out to mean more licenses to buy...

I view this article as meaning that Microsoft intends to introduce a new "Data Mining Server" - which they will recommend running on yet another Longhorn server running on yet another PC...

--
Richard Steven Hack - This sig is TOO GODDAMN SHORT TO DO ANYTHING USEFUL WITH! MORONS!

Re:I want clustered databases for high-availabilit by Slashamatic · 2005-05-02 18:10 · Score: 1

Three words: Distributed Lock Manager

This is one of the most basic cluster services in OpenVMS. It is fast and scalable (I use the present tense as there are still some big installations knocking around). The main thing about it is the way it allowed you to keep buffer caches synchronised across a cluster. The I/O system underneath Rdb was actually part of Digital's CODASYL product, DBMS-32, which had been clustered quite happily over twenty years ago, so it was well proven.

To be fair, this is probably why Oracle had trouble using Digital's technology as Oracle needed to be platform independent and not many platforms support the multiple functions of DLM in such an elegant way.

A difference between "DBA" and "clown" by Moraelin · 2005-05-02 18:28 · Score: 4, Interesting

Yes, a good DBA and/or Database Developer is a very valuable addition to any team.

The problem is that in a lot of corporations (e.g., the one I work for), they -- and all other admins -- have been taken and put in a different building. And more importantly they don't actually have to cooperate with any team.

Their job's goal is no longer the same as the developers': to get a program done by a deadline. They've been turned into a bureaucracy whose only job is to see that the servers run. No more.

That's an _awful_ job description, because it directly makes the developers their enemy. I'm not even talking "slippery slope", but direct cause-effect. Instead of being "the other half of the team that will make this program work", developers just become "those assholes who crash our servers."

It's not hard to get from that point of view to pathological cases like the admin that limited our production servers to 3 connections per server. He kept his own servers running perfectly (which is his job description) at the expense of making the company's production programs grind to a halt (hey, it's not in his job description to care about those.)

That's the problem with that kind of internal organization. As one BOFH-wannabe once said "The source of the problems on my network are the users. Would you prefer that I cut your access? Then there wouldn't be any problems any more." Another one threw a hissy fit that we dared ask that he does his job, during work hours. Yeah, how dare we bother him by asking if he could please reboot the test server he's managing.

That's the underlying problem. Instead of providing a service _to_ the users, a whole caste has been created whose job is to serve the computer, and the users are just those pesky assholes disturbing his majesty the computer. That's a very unproductive situation to create.

Worse yet, a bunch of companies invented the devastating practice of internal invoices. The admins in one department won't even go to the toilet unless they can send a bill to another department for it.

They won't even talk to each other (e.g., the WebSphere admin telling the DBA and the Unix admin that he needs a Solaris patch and a newer version of Oracle for the "transactionBranchesLooselyCoupled" setting.) No, you have to personally talk to all three of them, because otherwise they can't send three bills for it.

And predictably, they'll do _nothing_ more than the bare minimum that was requested and billed. E.g., you have to tell the DBA explicitly to set this and that, to this and that value, because she won't do that on her own. Which basically means you already need to have all the knowledge of a DBA, and she is just acting as a proxy over the phone... and sending you a bill for it.

Basically if you're not that kind of a DBA, you have my respect. All I'm saying is that when you read about "teams of clowns" or about people who'd rather invent their own storage than deal with a DBA... well, they're not necessarily avoiding _your_ kind, but the kind of clown I've described above.

--
A polar bear is a cartesian bear after a coordinate transform.

Re:A difference between "DBA" and "clown" by NineNine · 2005-05-02 18:48 · Score: 0

You're talking about bureaucracy, which as you describe it, is very common in large organizations (I worked as a hired gun for many large companies). I'm saying that from a technical and architectural standpoint, there are many, many benefits to fully utilizing an RDBMS like Oracle or DB2. The unfortunate fact that your company has locked down the databases too tight (and from what it sounds like, has *no* database developers to handle that middle layer) really doesn't have a lot to do with the fact that RDBMSs are *not* designed to be simple data repositories, but are designed to handle, and are effective at handling, a large amount of business logic and data integrity for many applications.
Re:A difference between "DBA" and "clown" by Moraelin · 2005-05-02 20:38 · Score: 1

Well, I'm not going to argue about that. I was known to argue myself to _not_ reinvent joins in Java, when Oracle or DB2 already do it better.

And, yeah, I've worked in (far smaller) places before where the DBA was a part of the team, and sat about 10m from the team I was in. It really helped the project.

I'm just saying that if someone started directly for a large bureaucratic corporation, they probably never experienced that. They only experienced the "clowns in the admin department" fuck-up. It's a pity, but as you've said, it's very common.

--
A polar bear is a cartesian bear after a coordinate transform.
Re:A difference between "DBA" and "clown" by Chitlenz · 2005-05-03 02:58 · Score: 3, Informative

I live in the middle. I'm a DBA Architect, which means I both design and build the databases our company uses. Add to that, we're a small company, and we design very specialized software in a way that not many people can do, so I also wear the hat of C# coder. I understand both sides of this fence, and have actually been in the odd position of fighting for both points of view. A good DBA is responsible for all of the flexible information that makes a modern corp. run. Think about that. All the paper, all the reports, your payroll, everything worth owning informationwise within a company is in a database somewhere. HELL YES these guys live at corporate hq. That said, in a healthy company, the DBAs and devs are able to debate rather than fight. One particularly obstinate Peoplesoft lead dev in my past and I have become very good friends over the years through this kind of argument, so it's not all bad =)

My sympathy, however, does indeed go out to the poor devs who get stuck with some tool that doesn't really understand, or even want to understand, his position as an admin. Too many people slipped into the field with dollars in their eyes in the 90s, and it's led to some truly spectacular screwups. Essentially, in my mind, almost every single failed ERP implementation could and should be blamed on insufficient database administration, and there are LOTS of flameouts there.

The upside ... hehe maybe.. is that corporate scrutiny of their IT staff is at an all time high! So if they really suck that bad, their days are probably numbered.

--chitlenz

--
Imagination is the silver lining of Intelligence.
Re:A difference between "DBA" and "clown" by Anonymous Coward · 2005-05-03 04:03 · Score: 0

Fire them all and start over. Seriously. The productivity hit for training new people up will be outweighed by dropping the productivity anchor you're carrying around now with the obstructionist assholes. If you don't have hire&fire auth, then just slash their tires repeatedly until they quit.
Re:A difference between "DBA" and "clown" by spoonyfork · 2005-05-03 04:16 · Score: 1

Wow. Do we work at the same company? You just described my last 1.5 years perfectly. Now I have the shivers.

--
Speak truth to power.

Re:I want clustered databases for high-availabilit by Anonymous Coward · 2005-05-02 18:32 · Score: 0

It's called... Oracle 10g.

Re:I want clustered databases for high-availabilit by Anonymous Coward · 2005-05-02 18:35 · Score: 0

Of course, Oracle with their database layout basically says the same thing - except they want you to put your indexes, your tablespaces, your logs and everything else on at least SEVEN separate servers

No, at least in the Oracle books I've read, it's spreading all that crap out across seven separate disk devices (i.e., SCSI, not IDE), with some on separate SCSI controllers. It's increasing that parallelism in data i/o...

Inter-server communications with linked DBs is slow.

The future may be open source databases by Anonymous Coward · 2005-05-02 18:47 · Score: 2, Interesting

There's a lot of talk in database circles about the fact that open source databases may do to commercial databases what linux did to commercial unixes, i.e. wipe them out. Recently LazyDBA, one of the most well known websites for database administrators, started supporting open source databases. Add to that the fact that Oracle is going on an app buying fest (Peoplesoft and now maybe Siebel), and database people see that the commercial database is in danger.

Re:The future may be open source databases by Anonymous Coward · 2005-05-02 22:29 · Score: 0

So are we saying that the reason Oracle is buying ERP software companies is because it sees its cash cow, the core RDBMS, dying and being replaced by open source databases? It could make sense, as it's reported that most revenue is now coming from license renewals and support for commercial database vendors, i.e. not many new licenses.

But did linux wipe microsoft off the desktop? No. And most of its revenues come from license renewals in the form of people replacing their pcs.

Re:The future of databases is... no Database at al by Anonymous Coward · 2005-05-02 18:50 · Score: 0

Even better, add or modify attributes on those tables in the RDBMS, vs doing so in an Object database (even the object layer in Oracle. Typically, it means tearing down the database and rebuilding the schema...er, object hierarchy, with 100% data loss...).

To answer the question... by jim_v2000 · 2005-05-02 18:55 · Score: 2, Funny

"Ever wonder where database technology is going?"

Yeah, all the time.

--
Don't take life so seriously. No one makes it out alive.

Graph-based Databases by Gloggy · 2005-05-02 19:20 · Score: 1

Pity he neglected to mention Graph-based databases (as in DAG). A substantial problem lies in the dynamic nature of information. Relational databases are lousy at storing relationships between data that were thought to be unrelated. Having to change the database structure all the time is a nightmare anyone can do without. Graph databases are able to model knowledge much more accurately with the added benefit of being able to store relationships between nodes without changing the design. Bioinformatics has been a good example of an application area for graph-based databases. Here the masses of information (ontologies, pathways, RNA sequences) need to be related in many different ways. Graph-based databases allow querying information in novel ways that relational databases simply aren't capable of handling. From that aspect, the requirements today are really very different from 20 years ago.
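
For anyone who wants something concrete, here is a rough sketch of the "store relationships between nodes without changing the design" idea. It is only an illustration, not how any particular graph database works: the TripleGraph class and the gene/protein/paper names are made up for the example. Every relationship is a labelled edge, so a new kind of relationship is just another edge rather than a schema change.

```python
from collections import defaultdict


class TripleGraph:
    """Toy labelled graph: each edge is a (source, label, target) triple."""

    def __init__(self):
        # node -> set of (label, target) pairs leaving that node
        self._out = defaultdict(set)

    def relate(self, source, label, target):
        self._out[source].add((label, target))

    def neighbours(self, source, label):
        # Direct neighbours reached over edges carrying the given label
        return sorted(t for (l, t) in self._out.get(source, ()) if l == label)

    def reachable(self, start):
        # Everything reachable from start, following edges of any label
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for _, target in self._out.get(node, ()):
                if target not in seen:
                    seen.add(target)
                    stack.append(target)
        return seen


g = TripleGraph()
g.relate("geneA", "encodes", "proteinX")
g.relate("proteinX", "participates_in", "pathway1")
# A relationship kind nobody planned for: just another labelled edge.
g.relate("geneA", "cited_in", "paper42")
```

Note that the edge set is itself nothing more than a set of (source, label, target) tuples; the part that differs in practice is the traversal (reachable here), which is written as a loop rather than a fixed number of joins.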

Re:Graph-based Databases by plopez · 2005-05-03 01:53 · Score: 1

How is a graph different from a relation?

--
putting the 'B' in LGBTQ+

You mean in the last 25 years? Nowhere. by Qbertino · 2005-05-02 19:38 · Score: 3, Insightful

Honestly, folks, databases are like crutches: Pathetic, but when you need them, there's hardly an alternative. They are the living proof that abstract concepts and computer simulation of those on real world hardware need the strangest type of hacks to be mended together.

On top of that - and this is the worse part - what we call databases today is nothing much more than a historically grown apocalyptic chaos. With one of the crappiest programming languages ever as a cornerstone of its technology. A weedy mumbojumbo of wanna-be virtual machines, wanna-be server daemons, makeshift security layers, obtrusive user management and pseudo operating systems and a bazillion proprietary variants of said programming language. With features bolted on left, right and center. This basically is the case with any current DB in widespread use, be it MySQL, Oracle or anything in between.
And if you look at the core of database technology and how long it has been that way, there isn't much hope that DBs will go anywhere anytime soon.

Then again, if you want to get a glimpse of a possibly brighter future, I'd actually recommend Zope. I consider its object relational DB a working proof of avant-garde "database" concepts and a prototype of what DBs generally could look like in the future if anyone were interested.

--
We suffer more in our imagination than in reality. - Seneca

Re:You mean in the last 25 years? Nowhere. by poot_rootbeer · 2005-05-03 03:38 · Score: 1

nothing much more than a historically grown apocalyptic chaos. With one of the crappiest programming languages ever as a cornerstone of its technology. A weedy mumbojumbo of wanna-be virtual machines, wanna-be server daemons, makeshift security layers, obtrusive user management and pseudo operating systems and a bazillion proprietary variants of said programming language. With features bolted on left, right and center.

I agree. While the concept was a good idea, the sheer amount of historical cruft that now clutters up Unix is... huh? You were talking about databases?

$ /usr/local/bin/color "surprised" > /dev/me

Future or History by Randyj70999 · 2005-05-02 21:18 · Score: 1

This sounds very much like a program I assisted with in Cybernetics nearly 30 years ago. It was based on modeling intuition and pitted two automatons against each other playing a game (tic-tac-toe); one automaton kept a 'database' of failed moves, learned from them and got smarter. The idea was not new even then, and the program, written in FORTRAN IV, was a primitive version of it.

My thought is this, if this is the best microsoft can do, dredging up ideas 30 years old as new thought, no wonder they are constantly reinventing the same wheels in development.

RJ

databases and filesystem by Herve5 · 2005-05-02 21:40 · Score: 1

There is an interesting analysis of databases in filesystems (and metadata...) in the Ars Technica review of OSX: extended attributes managed at system level, an application like Spotlight making (some) use of this, etc. http://arstechnica.com/reviews/os/macosx-10.4.ars/
(this link was already given in the recent OSX Tiger discussion here)

Hervé

--
Herve S.

Databases have no future by jandersen · 2005-05-02 21:41 · Score: 1

Well, of course they do, but I think the database concept has evolved as far as it makes sense. Relational and object oriented databases are more or less the logical limit to what you can meaningfully do about organising data - and don't forget, databases are about organising data, not about how you use it afterwards. You could argue that retrieval methods are part of what a database is, since eg. indexing is a way of retrieving data - but that is bad thinking, in my view. An index is just data organised in a certain way.

That is just my opinion about things; now roll out Wikipedia and Webster dictionary to 'prove' me wrong, I don't really care. But why make a fuss about it at all? Well, I feel there is a trend to muddle the concept, and I think it is a good idea to keep those things clear and simple; otherwise it all just ends up being marketing drivel. Take for example the way 'the internet' has become equivalent with 'web sites' - which is obviously nonsense. The internet is a physical network plus a number of protocols, of which http is one. But we still see from time to time some stupid sod blaring out 'The Demise of The Internet' because of some new virus or other nuisance that affects a large number of web sites. Hey, let us keep our minds clear - certain applications of the internet may go out of use, but the internet continues.

So, back to databases - OK, some wise guy thinks that we will see databases more like so and so, and that it could be cool if whatever. All well and good, but it's not that databases are changing fundamentally, it's just new applications.

Most of our clients... by cardpuncher · 2005-05-02 21:46 · Score: 2, Insightful

... are now asking questions that require approximate or probabilistic answers

I suspect that may translate as "most of our clients want to be given easy answers to difficult questions".

I'm sure there'd be a big market for a database system that stored flight bookings and could answer the question "which of our customers is a terrorist?". You don't address that market with new technology, though, but by developing new sources of snake oil.

Re:Most of our clients... by Mr+Silly · 2005-05-02 22:40 · Score: 0

Unfortunately probabilistic answers will probably be wrong.

Re:I want clustered databases for high-availabilit by Anonymous Coward · 2005-05-02 22:22 · Score: 0

Shared storage hardware is VERY expensive for me (I am in India), so I much prefer a shared-nothing system.

Replication is nice, but a true multi-master setup would be even better.

Object persistence vs. database by Trinition · 2005-05-02 22:33 · Score: 1

Both the parent and grandparent are right. The fact is, you're talking about apples and oranges.

When I look at things like Prevayler and XL2, I see systems designed for making it so your in-memory object graph is persistent in case of power loss. Yet these objects are still traditional objects with complete data encapsulation.

When I look at RDBMSs, I see systems designed to store data, queried in arbitrary ways, and violating object-oriented encapsulation (your data is no longer only accessible through your object).

With an object persistence mechanism, if you want to "Search" through your object graph, you have to build your own indices and lookup mechanisms (HashMap, TreeMap, BTree data structures in memory, etc.). With an RDBMS, you get that for free. There are supposed to be OODBMSs that will let you do what you're used to in an RDBMS, but I've not used one.

But the thing is, often all I need *is* object-graph persistence, not a full-blown RDBMS. And I can throw in a few hand-crafted mechanisms for indexed retrieval of my objects when necessary.
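
To make the "hand-crafted mechanisms for indexed retrieval" point concrete, here is a minimal sketch in Python rather than Java, under my own assumptions: the Customer and CustomerStore names are invented for illustration, and a plain dict stands in for the HashMap. It has nothing to do with how Prevayler itself is implemented.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: int
    name: str
    city: str


class CustomerStore:
    """In-memory object store with a hand-maintained secondary index."""

    def __init__(self):
        self._by_id = {}                   # primary lookup: id -> object
        self._by_city = defaultdict(set)   # secondary index: city -> ids

    def add(self, customer):
        self._by_id[customer.customer_id] = customer
        self._by_city[customer.city].add(customer.customer_id)

    def move(self, customer_id, new_city):
        # Every update has to keep the index in step by hand;
        # this is the bookkeeping an RDBMS does for you.
        customer = self._by_id[customer_id]
        self._by_city[customer.city].discard(customer_id)
        customer.city = new_city
        self._by_city[new_city].add(customer_id)

    def find_by_city(self, city):
        ids = self._by_city.get(city, ())
        return sorted(self._by_id[i].name for i in ids)


store = CustomerStore()
store.add(Customer(1, "Ada", "London"))
store.add(Customer(2, "Grace", "New York"))
store.add(Customer(3, "Alan", "London"))
store.move(3, "Manchester")
```

The trade-off shows up in move(): each new way of searching means another dictionary plus more code that must be kept consistent on every write.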

Re:Object persistence vs. database by topham · 2005-05-03 01:01 · Score: 1

Which is why in Tiger (OS X 10.4), Apple has included Core Data.

An application programmer shouldn't have to worry about the low level details of storage.

(I realize Tiger may mean nothing to you; this not being an Apple forum, but I think it is of interest that an OS vendor has supplied even a basic mechanism to handle the task.)

It isn't full featured, but it has the ability to use SQLite as its repository. I'm now curious as to how easy it would be to add an Oracle implementation to allow for transition from low end to high-end. (Note: Apple expects you to not fiddle with their storage details; I expect it is sub-optimal on fields and value types stored. Don't expect to map it to an existing database.)
Re:Object persistence vs. database by kpharmer · 2005-05-03 01:12 · Score: 1

Actually, I think that using sqlite would be far better than oracle for this purpose: Oracle carries a lot of baggage along with it - and really isn't optimized to use as few resources as possible on a desktop.

SQLite on the other hand, is so small and has so little overhead that it's probably very good for system performance. And with so little code involved it's much less likely to require upgrades. And (unlike another light database) it's highly sql compliant.

Microsoft Access! by Your+Average+Joe · 2005-05-02 23:19 · Score: 1

The future of databases must surely be Microsoft Access.

--
Your Average Joe

A solution looking for a problem? by QuietLagoon · 2005-05-03 00:21 · Score: 1

Microsoft has hired a bunch of pie in the sky forward thinkers from the golden age of computers. The question is - will their knowledge and visions be relevant in the future? How many visionaries manage to remain visionaries through the passage of time?

Brain salad model railroad by Senor_Programmer · 2005-05-03 00:40 · Score: 1

Well that's the essence.

The DB as a big artificial brain.
Associations are weighted. Parts of it 'dream' up new associations. It organizes itself based on assorted 'self conceived' recipes.

It's one very fuzzy wart on the ass of the "AI as a black-box brain" rhinoceros. Kinda cool that folks are thinking about AI in functional rather than structural terms, even if they may not know it.

Mind you this is just a first cup of coffee opinion!

Database technologies sagging to their knees by hey! · 2005-05-03 00:49 · Score: 1

The reason that database technologies are a sticking point, IMO, is that as a relatively mature (although still developing) technology, they don't attract much mindshare, and overall skill levels in using the technology are stagnant or deteriorating.

I've known people with newly minted advanced degrees who could tell you about the finer points of implementing model view controller using web frameworks, talk your ear off about user principals and aspect oriented method interception, yet they get stopped like a deer in the headlights when faced with how to construct a relatively simple join query. Nested queries, outer joins, and null value semantics in aggregate functions are completely exotic to them. They know just enough to get by; they have no art in this area.
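
For the record, here is the sort of thing I mean, as a small sketch using Python's built-in sqlite3 module. The dept and emp tables and their contents are made up for illustration. It shows an outer join keeping a department with no employees, and how COUNT, SUM and AVG treat NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE emp  (id INTEGER PRIMARY KEY, dept_id INTEGER, bonus INTEGER);
    INSERT INTO dept VALUES (1, 'eng'), (2, 'ops'), (3, 'legal');
    INSERT INTO emp  VALUES (1, 1, 100), (2, 1, NULL), (3, 2, NULL);
""")

rows = conn.execute("""
    SELECT d.name,
           COUNT(*)       AS row_count,    -- counts joined rows, NULLs included
           COUNT(e.id)    AS emp_count,    -- counts only rows with a real employee
           COUNT(e.bonus) AS bonus_count,  -- skips NULL bonuses
           SUM(e.bonus)   AS bonus_sum,    -- NULL when every input is NULL
           AVG(e.bonus)   AS bonus_avg     -- averages non-NULL values only
    FROM dept d
    LEFT OUTER JOIN emp e ON e.dept_id = d.id
    GROUP BY d.name
    ORDER BY d.name
""").fetchall()

for row in rows:
    print(row)
```

The 'legal' department survives only because of the outer join, and its COUNT(*) is 1 while COUNT(e.id) is 0. The 'eng' average is taken over the one non-NULL bonus, not over both employees. Those two details are exactly where people who "know just enough to get by" get bitten.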

Yet, I can tell you that twenty-five years ago, the much smaller geek community of the day had tons of people who could argue the finer points of relational calculus.

It's technological fashion. These days, database technology is to informatics what plumbing is to architecture. Unglamorous, but you can't live without it, and when all the toilets are plugged up it doesn't matter how spiffy your new corporate headquarters are.

Of course, now that the supply of database talent has dried up, people want their corporate information not only to flow, but to do tap dances and pull rabbits out of a hat. It seems to me that what Mr. Gray is talking about is the union of two highly useful but passé technologies: database technology and fuzzy logic.

--
Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.

Language by JJ · 2005-05-03 01:51 · Score: 1

Language certainly requires a probabilistic database, at least for the lexicon. Syntax, semantics, morphology and phonology all respond more to the rules of probabilistic databases than relational ones. If such a key function in human life/culture is a probabilistic database system, isn't it logical that higher-level functions would also be?

--
So long and thanks for all the fish . . . !!!

Got Your SQL Server 2005 License Yet? by LifesABeach · 2005-05-03 02:16 · Score: 1

Actually folks, this article is timed with SQL Server 2005's release. One only need do the math to see that Rational Rose has a m$ smack light on it. SSIS is DTS with more "Lower CASE" handling. There was no tie in with the Visio/UML solution; I figure it's just around the corner. Other businesses that make money off of SQL Server were overheard grumbling at the Anaheim conference about the SQL Server 2005 release. M$'s BI (Business Intelligence) interface can do some matching with fuzzy and neural logic, along with 3 other methods of "Best Guessing". I give mySQL/apache/perl/firefox/openOffice about a year to catch up on what the 2005 product is doing.

abstraction by psbrogna · 2005-05-03 02:42 · Score: 1

IMHO, continuing to evolve the db to be as simple as possible is a good direction to go in. If you need fancy fuzziness, can't this just be another layer that enables various types of indexing? It seems to me that a modular approach to db, like mySQL uses, is a solid architecture to invest in. The modular system of back ends (i.e. myisam, innodb) really empowers us to tailor an application infrastructure appropriately for all the different db use cases: primarily a read db in the case of dynamic web sites vs. rd/wr with heavy contention, etc. Doesn't it seem appropriate to take the same approach to indexing functionality? To abstract it into a different layer? There could be modules for the r-tree stuff, full text searching, and all sorts of other fuzzy stuff. Giving the designer/programmer choices seems to be the best way to go; the right tool for the job. Wasn't this the philosophy behind Wirth's Oberon? To have collections of modules instead of distinct applications? It also reminds me of how/why *nix is so powerful and successful - collections of small simple tools that can be put together like building blocks. Abstract modules or layers are obviously a resilient, efficient way to design a system.

$post by $anonymous_coward by Anonymous Coward · 2005-05-03 02:53 · Score: 0

$famous_person is such an idiot. I thought about $hard_problem (without reading the article) for 30 seconds and came up with $obviously_broken_solution. Insert 4 or 5 more misspelled words. The end.

Re:$post by $anonymous_coward by Anonymous Coward · 2005-05-03 11:44 · Score: 0

$troll_about_your_mother

Huh? by radiophonic · 2005-05-03 03:12 · Score: 1

I set up an IMAP server, told the developers it was Oracle, told the boss they loved it, saved $15K and got a 10K raise in salary.

And then...I woke up.

--
Whenever you read this sig someone's refrigerator light turns on.

Ask Chris Date and Hugh Darwen about ... by CodeArt · 2005-05-03 03:23 · Score: 2, Insightful

.. future of Database Technology. Actually you don't need to ask them. Just go to any bookstore and buy one of their books and you will quickly learn that relational doesn't mean SQL. Relational databases are about two-valued predicate logic and set theory, and there is nothing more solid than this to be used as a basis for storing and manipulating information. Future databases will be truly relational truth systems with support for user defined types and temporal data at the logical level and a much better implementation at the physical level. Jim Gray is an authority in the area of transaction processing but not in the area of databases and database languages in general.

Re:Ask Chris Date and Hugh Darwen about ... by psbrogna · 2005-05-03 03:32 · Score: 1

Certainly Date, Darwen, Codd, etc. are the ones that pioneered this stuff, but when you hear them speak they don't seem all that receptive to evolving the seminal work they laid out. Gray does seem to have written and been recognized for his work in the db area. Nothing wrong with new ideas; right or wrong.
Re:Ask Chris Date and Hugh Darwen about ... by CodeArt · 2005-05-03 03:42 · Score: 1

Chris and Hugh are evolving the work very intensively but it looks like nobody in the industry is listening (there is a very strong alliance between application and hardware vendors whose interest is to maintain the current state to be able to sell more obsolete or retrograde stuff). The Web sites that database pioneers are maintaining are pretty much alive and updated on a weekly basis. Check these Web sites: www.dbdebunk.com , www.thethirdmanifesto.com or books like the classic 8th edition of "An Introduction to Database Systems", "Temporal Data and Relational Model" and the upcoming book about database fundamentals from O'Reilly.
Re:Ask Chris Date and Hugh Darwen about ... by psbrogna · 2005-05-03 04:14 · Score: 1

Thanks for the leads, I'll check 'em out. My impression was based solely on their stance during talks.
Re:Ask Chris Date and Hugh Darwen about ... by Randyj70999 · 2005-05-03 04:28 · Score: 1

You should ask Chris Date about Trans-Relational databases. He'll fill your head with it.

RJ

Re:The future of databases is... no Database at al by LordMyren · 2005-05-03 03:33 · Score: 1

Prevayler is useful as an object store, that's about it. As a database, it fails pretty miserably, but I don't think anyone had pretensions of it ever being a database.

There's a very easy way to see that prevayler is not a db: prevayler works by loading the entire db into memory. This is the exact purpose of a db, to hold data which cannot fit into memory and still perform queries on it.

If you just want an easy way to store your objects, prevayler ain't half bad. It's very useful for saving state.

Myren

bathgates nazgul by Anonymous Coward · 2005-05-03 03:52 · Score: 0

These 'distinguished microsoft engineers' are essentially nazgul. Once powerful thinkers, corrupted by greed, they have thus been subverted by the power of the dark lord, and betrayed all the values they may have once stood for. Now they are kept alive by spells.

Vendor Recs: Objectivity, Caché, M$FT??? by mosel-saar-ruwer · 2005-05-03 04:31 · Score: 1

My particular field is medical visualization for Radiology, so essentially I have to organize huge sets of patient data in a way that I can do things like, well, volumetrically render your skull to see if you have a lesion, etc. Today, I have to pull this to the workstation, organize the dataset, and render the scene from the dataset onto the stage. Because of the flowing nature of our data (that is to say, this isn't like a game where you can pre-cache models on the local workstations since every patient is a different model), I would like a way to tie direct3d to a pre-render engine at the database layer so that all I would have to provide to a client like a web page is the end product. I'm working with MS SQL atm, so I'll use it as an example: a typical MRI image of your chest comes out of a scanner in some stupidly high resolution. That scan typically contains voxel data which is defined by the mm thickness of the slice. Your POV as an end user over the web is, 'all I care about is this one particular diagnostic output', or one image let's say. To actually GET that image may or may not require that a set of transformations be applied to a large subset of slices in any particular study. It would be really nice to not have to add external services (another app), and instead be able to directly and natively access the inner workings of the database engine to do this directly, instead of offloading it to the local OS. Object programmability, in .NET for instance, would allow me to actually write all of the above applications directly in SQL, rather than writing them in C# and then using ADO.NET as an interface layer for the database (again another middleman).

We're about to start on a big database backend for scientific and engineering frontends, and I'm having the damnedest time trying to find a product that was designed with an eye towards what I'd call "basic mathematics".

Our short-term needs:

1) True 64-bitness in the access language, so that we can take advantage of our AMD64 hardware & Win64 OSes with an eye towards very large data sets in the future. Java is a no go here, because it will NOT take something as trivial as a 64-bit counter in a "for"-loop. [Recent versions of C++ and C# will, however].
2) A very strong sense of type in the access languages, and preferably in the underlying database itself. For instance, ideally the database would know [inherently] how to deal with primitives such as

a) all variety of 16-bit & 32-bit Unicode characters
b) 96-bit Intel & AMD extended doubles
c) 128-bit Sparc extended doubles
d) 128-bit Altivec extended doubles
e) 128-bit LabVIEW timestamps
etc

Classical business-oriented programming languages, like SQL, are very ASCII oriented, and typically everything gets dumped in the database as strings of ASCII [8-bit] characters, with proprietary logic added afterwards to lend a sense of type to the data. We want the underlying database to understand type, however.
3) A sane, stable, and rather fast transport protocol to move data from client workstations to a centralized repository. Candidates might include DSTP, OPC, SOAP/SOAP+, etc. Preferably the transport protocol would have a strong sense of type as well, so that you wouldn't need to add extra logic on the client end to encode the type, followed by extra logic on the server end to un-encode the type.
4) Solid, stable, and fast replication for redundancy purposes.
5) Good, solid integration with an industry standard user authentication system, such as Novell Directory Services, or Microsoft Active Directory.
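
A minimal sketch of what "a transport with a strong sense of type" in item 3 could mean in practice: each value travels with a one-byte type tag, so the receiver gets back a 64-bit integer, a double, or a Unicode string without extra application logic. The tag values and the `encode`/`decode` names are hypothetical and are not taken from DSTP, OPC, or SOAP. Python's `struct` module has no extended-precision float formats, so the 96-bit and 128-bit types listed above are not covered here.

```python
import struct

# Hypothetical wire format: 1 tag byte, then a fixed- or length-prefixed payload.
TAG_INT64 = b"q"    # signed 64-bit integer, little-endian
TAG_DOUBLE = b"d"   # IEEE 754 64-bit double, little-endian
TAG_TEXT = b"s"     # 32-bit length prefix + UTF-8 bytes


def encode(value):
    """Serialize one value together with its type tag."""
    if isinstance(value, bool):
        raise TypeError("bool is not part of this sketch")
    if isinstance(value, int):
        # struct.error is raised if the value does not fit in 64 bits.
        return TAG_INT64 + struct.pack("<q", value)
    if isinstance(value, float):
        return TAG_DOUBLE + struct.pack("<d", value)
    if isinstance(value, str):
        data = value.encode("utf-8")
        return TAG_TEXT + struct.pack("<I", len(data)) + data
    raise TypeError("unsupported type: %r" % type(value))


def decode(buf):
    """Recover the value and its original type from the tagged bytes."""
    tag, body = buf[:1], buf[1:]
    if tag == TAG_INT64:
        return struct.unpack("<q", body)[0]
    if tag == TAG_DOUBLE:
        return struct.unpack("<d", body)[0]
    if tag == TAG_TEXT:
        (length,) = struct.unpack("<I", body[:4])
        return body[4:4 + length].decode("utf-8")
    raise ValueError("unknown type tag: %r" % tag)
```

The point of the tag is that `decode(encode(2**40))` comes back as an integer rather than as a string that the server has to re-parse.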

Long term, a future interest will be in the area of what I might call a systematized approach to scientific data analysis, and particularly things that go under the guise of e.g.

Re:Vendor Recs: Objectivity, Caché, M$FT??? by Chitlenz · 2005-05-03 08:37 · Score: 1

Whew, that's a lot of ground to cover, but here goes...

C# fits the bill for the strongly typed functions you mentioned, and there are a LOT of code shortcuts you can make via canned components. These typically come inline with whatever your particular standards committee has set forth as the mean, and there are many competent component vendors for .NET that can cut literally years out of your dev cycle.

Almost all the good databases do replication well today, with a side note that this is a seriously underused function of the databases themselves.

I think any of the modern database architectures can deal with 64-bit definitions, and Oracle should allow for strong typing via pl/sql. This approach may require 64-bit native versions of Oracle though, which may or may not exist for your target platform in some current version (is 10g for amd64 out yet?). Since the backend can inherently handle the extended primitive sizing (I'm pretty sure here), and since these primitives can be 'compiled' into pl/sql packages within the Oracle framework to give them some sense of propriety, that'd be the direction I'd choose. Probably on AMD64.

Odds are pretty good you can just use SQLNET for transport. It's pretty much hella fast, so I've never bothered overcomplicating the network layer in my app design. I'm going to go out on a limb and say that SQLNET can indeed probably understand type translation natively. Thus your 32 bit development box under your desk should have no problem establishing a trust relationship with a 64 bit solaris box or whatever, and to allow that node to exchange data seamlessly.

The network tie-ins are an interesting point. I've never been a huge fan of branching too far away from strict IP based network topologies, but granted I don't manage 10k machines either. SQL Server definitely directly integrates with AD, not sure about Novell. Oracle definitely integrates with Novell, not sure about MS =) probably MS too tho, at least on their Oracle for windows servers.

The statistical analysis functions of your design are a classic case for the Data Warehouse. The good news is you can probably make that kind of high-order math function work in any of the current backend vendors, and something like Cognos can help you make great strides quickly (or SAS, etc.). I find that it's important to remember that the simple way is usually the best way in these cases, that is, if you are not in the commercial software business and you have the money for it, there's lots of good packages for sale. The more interesting alternate re: your project (to me) would be to just sit down with pl/sql and write it heh. But then, I code, so I tend to see code based solutions as the end with the most clarity (and the added benefit, ESPECIALLY in math functions, that everyone has to crosscheck each other's code when you write it, and you can by result usually minimize errors and other logic trainwrecks). In math fields, precision is king, and who do ya love baybee if it ain't you?

A last note, post architecture, is that ANY product is only as good as the DBA tuning and designing it. That, in a nutshell, is why most projects fail. I pose that hardware performance needs are equal to the skill of the people tuning your system, so the better the people the cheaper the hardware (not an original idea, but one with truth). Warning, this is NOT a bad thing. In fact, in many companies, bad DBAs become good DBAs over time as they learn the system. Each system is an individual entity, and as entities, a good DBA becomes familiar with the inner workings of his/her charge. There's no such thing as overkill here, there's only having enough, or not having enough capacity as well. Beg borrow or steal the best gear you can, set a sunset target of 5 years if you can. Eventually, if the whole thing's a fiasco, someone's going to need that extra capacity to work in, and if it all goes perfectly you get ... well 5 damn years of uptime heh. Trust m

--
Imagination is the silver lining of Intelligence.

Where did trust go? by Stone316 · 2005-05-03 05:00 · Score: 1

At some point, no matter what the technology, you're going to have to trust someone to manage your data/infrastructure/building access, etc. Whatever happened to trusting your employees who have superuser access? If you can't trust them, then you should hire someone else.

For a while there was a big push here for auditing the DBA team. Every option they put forth I told them how I could get around it. We are the superusers of the data, at some point we have to be trusted.

The funny thing is, SOX was brought on by executives who broke the law... Maybe it's just me but the only people that seem to be affected by SOX are the people who weren't involved in the recent scandals, i.e., my productivity has been impacted by 30-50% because of all the extra process. But when asked, no one is monitoring the effect these processes are having on productivity and if headcount needs to increase because of it.

--
"Thanks to the remote control I have the attention span of a gerbil."

Power Failure should absolutely not corrupt. by 2short · 2005-05-03 06:06 · Score: 1

"You can yank the power cord out of the wall anytime you like, and the database won't get corrupted." - That's how I've generally heard the main advantage of a good DB described to non-technical people. "Power failure won't cause corruption" is the canonical example.

I only vaguely recall the incident you mention; I thought at the time it was a case of "Why MySQL isn't a real DB", but before MySQL fans jump down my throat, I'll admit that I didn't (and don't) really know. It may well have been, as you describe, hardware outright lying to software, in which case, to return to the original discussion, using only a filesystem and not a db would not have helped. Certainly there might be bugs in hardware that a DB can't cover for. But there are many many sorts of bugs and other failures that a DB will prevent from corrupting your data, because that is one of the main design goals of a good DB.
I'll stick with my original point: If using a DB makes your data storage more fragile, you are doing something very wrong.
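
A minimal sketch of the property described above, using Python's standard `sqlite3` module. The `accounts` table and the `transfer` helper are hypothetical, and a raised exception stands in for the yanked power cord; it only exercises rollback of a half-finished transaction. Surviving a real power cut also depends on the engine's journaling and on the hardware honoring flushes, which this does not test.

```python
import sqlite3


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` from src to dst as a single all-or-nothing transaction."""
    try:
        # `with conn` commits if the block finishes, rolls back if it raises.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            if fail_midway:
                # Simulated failure between the two writes.
                raise RuntimeError("simulated crash")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
    except RuntimeError:
        pass  # the debit above has already been rolled back


def balances(conn):
    """Return {name: balance} for every row."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("a", 100), ("b", 0)])
conn.commit()

# The interrupted transfer leaves both rows exactly as they were.
transfer(conn, "a", "b", 30, fail_midway=True)
```

Doing the same two writes against a pair of plain files would leave the debit applied and the credit missing, which is the difference being argued here.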

Re:I want clustered databases for high-availabilit by Decibel · 2005-05-03 10:43 · Score: 1

Actually, Oracle supported this in 9i as well.

Re:I want clustered databases for high-availabilit by Decibel · 2005-05-03 10:45 · Score: 1

DB2's clustering doesn't use shared storage iirc. In Oracle what you're looking for is replication, not clustering.

Re:I want clustered databases for high-availabilit by Anonymous Coward · 2005-05-03 11:03 · Score: 0

and since none of us on this board are into opensource your solution remains informed.

Re:I want clustered databases for high-availabilit by Nutria · 2005-05-03 12:45 · Score: 1

Shared storage hardware is VERY expensive for me (I am in india)

Ah.

Replication is nice, but a true multi-master setup would be even better.

Huh? Multi-master is replication.

Are you confusing replication with clustering?

--
"I don't know, therefore Aliens" Wafflebox1

Computer Science is a Science by Vryl · 2005-05-03 21:16 · Score: 1

But I don't think "IT" is.

CS follows the "scientific method" of 'observe, hypothesise, test' and is falsifiable in a Popper-esque sense.

Hrrrm ... this query is running slow, maybe I should build an index and see if that fixes it... Hrrrm these results have improved nicely with that new index.

Looks like a science to me.

What exactly do you mean when you say it is not a science? What is your basis for that?
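
The slow-query experiment above can be sketched with Python's standard `sqlite3` module: observe the query plan, hypothesise that an index will help, build it, and look at the plan again. The `orders` table, its columns, and the index name are made up for illustration, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3


def plan(conn, sql, params=()):
    """Return the human-readable part of SQLite's EXPLAIN QUERY PLAN output."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the description of that plan step.
    return " ".join(str(row[-1]) for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [(i % 1000, float(i)) for i in range(10000)],
)

query = "SELECT total FROM orders WHERE customer = ?"

# Observe: without an index the planner has to scan the whole table.
before = plan(conn, query, (42,))

# Hypothesise and test: add an index on the filtered column, then re-check.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")
after = plan(conn, query, (42,))

print("before:", before)
print("after: ", after)
```

If the second plan still showed a full scan, the hypothesis would be falsified, which is the Popper-esque part.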

Swallow us whole?!?! by fbg111 · 2005-05-04 08:43 · Score: 1

We live in a time of extreme change, much of it precipitated by an avalanche of information that otherwise threatens to swallow us whole.

EEEEEEKK!!! Run for your lives!!!

--
Flying is easy, just throw yourself at the ground and miss. -Douglas Adams

Re:I want clustered databases for high-availabilit by anvil+{UK} · 2005-05-06 07:13 · Score: 1

Actually Oracle Parallel Server has supported this for years, at least since 8 officially.

315 comments