Beyond Relational Databases
CowboyRobot writes "Relational databases were developed in the 1970s as a way of improving the efficiency of complex systems.
But modern warehousing of data results in terabytes of information that needs to be organized, and the growing prevalence of mobile devices points to the increasing need for intelligent caching on the local hardware.
According to the ACM, the future of database architecture must include more modularity and configuration.
Although no concrete solutions are included, the article is a good overview of the problems with modern data systems."
Some of the biggest problems that "new" database designs have:
1) Overly complex
2) Don't scale
3) Tied to a single platform/implementation
4) Poor performance
It's typical to see all four in a single try!
SQL, on the other hand:
1) Reasonably simple API
2) Scales to very large databases
3) Cross-platform/architecture
4) Performs very well.
Given the insane amount of inertia SQL has, it will extend into an object model, rather than be replaced by one. (EG: C/C++)
I have no problem with your religion until you decide it's reason to deprive others of the truth.
Doesn't make it obsolete. "Databases are old and kludgey. Teh suXX0rs for R0xxng H4XX0rs liek me.
Just because people are too stupid to take the time to read and understand the theory and learn the application doesn't mean the technology is no longer relevant.
Of course no solutions are proposed. There are none because relational theory is correct, and appropriate for real database driven applications. Little crap bulletin boards can use MySQL.
Netcraft confirms relational databases are dead!
People,
Have been crying for the need to replace relational databases since the early nineties at least.
We can all see where that got them.
---- Go ahead, mod me down, I'll just post it again and you lose your mod points.
Funny how they never are, eh?
KFG
The future will not be found in the relational model, object model, or hybrid, but in the comma-delimited list.
I didn't RTFA but for my needs
Or the summary
mySQL suits me quite well.
That's nice. It won't handle a multi-terabyte database, though. That's the domain of Terabase, Oracle, and (blech) DB2. It's also what the article is about.
The power of PHP and mySQL is all I need.
And a moped is all you need to get to work. If you want to haul 300 metric tons of rock from point A to point B, you need a dump truck. Again, that's what this article is about.
Back on topic, this entire article is mostly speculative for the moment. A lot of excellent work has been done in OODB and XMLDB designs, but no singular design has yet emerged to solve all our woes. For example, I love the Prevayler concept. It solves a lot of problems, lowers data access times, and provides for complete data security. It also isn't usable or scalable without a lot more design work.
The future will hold some very interesting things, but for now we'll have to keep inventing until we come up with a consolidated solution.
Javascript + Nintendo DSi = DSiCade
See "COBOL to be replaced...." for an example of just how unlikely that is...sure, the latest hip "Tres Kewl" software for business might be written in something else, but SQL will be around for a long, long time.
Consider just the fact that "Relational Database" technology as laid out by Codd back in the early days specifically says "You don't *HAVE* to do it this way, but it will be more efficient if you do"...realize that SQL handles Denormalized Warehouse and Datamart tables just as well as it does the 5th normal form model of perfection...and relax...it ain't goin nowhere.
but not a real http://it.slashdot.org/article.pl?sid=05/05/02/1944248&tid=221&tid=198&tid=8 dupe.
I don't know what new structure they'll come up with for storing data, but I'm sure someone will try to port Linux to it.
I don't do this for karma, I do it for cash. It's much better.
Nothing builds character like manually searching megabytes of raw, unorganized information for a relevant entry. Except maybe sorting it by hand.
Databases are for sissies.
An Indian-American Hindu committed to non-violent thought/speech/action alarmed by the global explosion of radical Islam
How about just getting filesystems to be relational? Replace the ancient 1960s-era hierarchical inode database that underlies filesystems with a modern relational one. Then distributed databases can provide a more consistent platform for all our distributed apps.
Enough stuffing metadata into filenames. Enough shoehorning all data into a file/folder/cabinet model, now less familiar to people than the networked infosystems that mimic them. Enough fake hierarchies inconsistent with accurate data models, forcing whole technologies like Apple Spotlight, GNU Dashboard, and Google Search just to transact basic relationships buried in the data. Enough reinvention of the wheel with every initial RDBMS schema, just a layer on top of the DB's actual hierarchical filesystem - a shell for an inode database. Enough empty promises of "WinFS" and "OLEDB" vapor - get relational filesystems into developers' hands, and developers will move beyond them, building apps that meet users' actual needs, dragging the database tech along.
--
make install -not war
Local caching data gets ugly. Eventually the connectivity issues will be fixed. Mobile devices will seamlessly connect to databases preventing the need for caching.
"Organizing" issues aren't as important as getting the data faster and handling a large number of users. Databases are getting larger and larger. Query optimizations can only improve queries to a certain extent. Eventually the number of users and the volume of data reach a threshold that more and more hardware doesn't always fix.
s/data security/data safety/g
I just realized that line might be confusing.
I remember the hype surrounding object databases and xml databases. What happened?
I believe there is simply too much existing data+code that depends on traditional RDBMS infrastructure.
To get the best of both worlds, I use PostgreSQL for both traditional RDBMS and the less well-publicized ORDBMS features.
With version 8.0 available as native Windows software, I think it'll start grabbing marketshare at a much faster pace. Having it available natively only on Linux and *nix kept adoption rates at bay.
ps
For many applications, I found sqlite3 to be sufficient. Low overhead and speeds that blow away client/server solutions like PostgreSQL or MySQL. And cdb is even faster than all of these for read-only data.
Quite true. MySQL does very well into the gigabytes. I haven't seen any good evidence of its abilities in handling terabytes of data. Don't get me wrong, I'm a huge fan of MySQL, but I'm a bigger fan of using the right tool for the job. For your web message board, MySQL works fine. For holding product, sales, distribution, etc. information for, say, Levis, it would not.
SQL, on the other hand:
1) Reasonably simple API
2) Scales to very large databases
3) Cross-platform/architecture
4) Performs very well.
Given the insane amount of inertia SQL has, it will extend into an object model, rather than be replaced by one. (EG: C/C++)
SQL is a language for set operations. By itself it isn't a database or storage utility. There are some different versions similar to what you describe. Oracle's PL/SQL allows you to make temporary tables and materialized views. Neither solves the overall problem the article describes.
SQL by itself doesn't perform. Performance depends on the database engine and on how good the developer is. I have gotten SQL queries that took minutes to execute down to seconds by adding indexes, analyzing tables, and totally rewriting inefficient code. It is only "cross-platform" if you follow the ANSI SQL standard. Each database has its own set of handy functions that make the code database-centric.
SQL doesn't really have an API. It is a specification that is sometimes followed by database designers, and sometimes ignored. For example, in Oracle you can either use the ANSI join syntax (LEFT OUTER JOIN) or use the (+) in the where clause.
It scales to large databases only when they are designed properly. I work with 18 terabytes of data. My sql code wouldn't work so hot if the tables weren't designed correctly. Indexing, partitioning, and table structure have more to do with performance at that level than the code. The code can make a large difference too, but if the underlying structure is wrong, even the best SQL won't help you.
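To make the point about indexes concrete, here is a minimal sketch using Python's built-in sqlite3 module. The orders table, its columns, and the index name are made up for illustration, and SQLite is only standing in for whatever engine you actually run. The sketch asks the engine for its query plan before and after an index is added, and the SQL text itself never changes.

```python
import sqlite3

def plan_for(conn, sql, params=()):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]

conn = sqlite3.connect(":memory:")
# Hypothetical table, invented for this example.
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 1000, float(i)) for i in range(50_000)],
)

query = "SELECT total FROM orders WHERE customer_id = ?"

# Without an index the engine has to scan the whole table.
plan_before = plan_for(conn, query, (42,))

# Add an index on the filter column and refresh the planner statistics.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
conn.execute("ANALYZE")

# The same SQL text can now be answered with an index search.
plan_after = plan_for(conn, query, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan wording differs between SQLite versions, and a multi-terabyte system will care about partitioning and physical layout far more than this toy does, but the pattern of checking the plan rather than guessing carries over.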
/. ++
Most mainstream databases support replication. They are designed to be as fast as possible under heavy load.
Synchronization for a mobile device has another main requirement, robustness when the connection to the server is lost. A mobile device has to gracefully handle when the owner runs down into the subway.
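As a rough sketch of what "gracefully handle" could mean, the Python below keeps local changes in a queue and only drops one once the server has accepted it. OfflineQueue and FakeServer are hypothetical names invented for this example, not part of any real sync product, and a real device would also need durable storage and conflict handling.

```python
from collections import deque

class OfflineQueue:
    """Buffers changes locally and replays them in order when a link exists."""

    def __init__(self, send):
        # send is any callable that raises ConnectionError while offline.
        self._send = send
        self._pending = deque()

    def submit(self, change):
        """Record a change locally, then try to push everything pending."""
        self._pending.append(change)
        return self.flush()

    def flush(self):
        """Send pending changes oldest-first; stop at the first failure."""
        sent = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                break  # still offline: keep the change for a later retry
            self._pending.popleft()
            sent += 1
        return sent

    @property
    def pending(self):
        return list(self._pending)


class FakeServer:
    """Stand-in for the remote database, with a switchable connection."""

    def __init__(self):
        self.online = True
        self.received = []

    def receive(self, change):
        if not self.online:
            raise ConnectionError("no signal")
        self.received.append(change)


server = FakeServer()
queue = OfflineQueue(server.receive)

queue.submit("edit-1")      # delivered immediately
server.online = False       # owner walks down into the subway
queue.submit("edit-2")      # kept locally
queue.submit("edit-3")      # kept locally
server.online = True        # back above ground
queue.flush()               # replayed in the original order

print(server.received)
```

A change is removed from the queue only after the send succeeds, so a connection that drops mid-flush loses nothing. The trade-off is that the server may see a change twice if the failure happens after it processed the request, so real sync protocols usually make changes idempotent.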
The Internet is full. Go Away!!!
I'm taking a DB course right now and we've been discussing the future of databases, and the prof seems to think it's object-relational databases.
How widespread are OR DBs in the industry? I've never used one. What's the story with them?
"one size no longer fits all"
This is absolutely correct when referring to databases for different applications. However, why do people always assume that they have to choose among the Oracles, Microsofts and IBMs out there? There are already databases that have been tailored for certain application environments. Take for example http://www.ianywhere.com/ who has databases like SQL Anywhere and UltraLite which are tailored for smaller workgroups and mobile devices.
I don't think the solution to the problem is to build a more complex non-relational system but rather to choose the right tool for the job. Why reinvent the wheel when you don't have to?
Adventure City Tours
Yeah, for those terabytes of data taken up by your mom's recipes and your cd collection, the extreme power of PHP and MySql is all you need, man.
In the real world you get to third, then carefully denormalize for performance.
John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
Designed in the 1970s, the RDBMS has nevertheless proven to be the cornerstone of Web development three decades later. Thanks to systems like MySQL, deployments are surely at record levels.
Essayist Clay Shirky has gone so far as to suggest that MySQL is at the center of a whole new software movement.
In my experience with Web applications the chief problem with the RDBMS seems to be that it does not do text indexing and search very well, so I have to keep a second store of data in something like Lucene.
The other major problem is the level of skill required to tune the database to achieve high-performance SQL queries, so hopefully the RDBMS will evolve with more self-configuration capability.
The article, which I only skimmed, actually addresses these two concerns but seems to pooh-pooh the notion of simply refining the existing RDBMS systems. Instead it says " Old-style database systems solve old-style problems; we need new-style databases to solve new-style problems. "
The paper seems awfully squishy on what this means. The clearest I found was a call to "produce a storage engine that is more configurable so that it can be tuned to the requirements of individual applications."
But this call for new highly modular/configurable storage "engines" seems to me to require at least as much fussy care and feeding as a traditional RDBMS. You're just replacing one DBA with another. And throwing out decades of refinement in the process.
The raison d'etre of the RDBMS is to allow the programmer to treat storage as a black box while gaining nifty ACID features. Extending this to text indexing seems logical.
'nuff said. thekeyboardgoblinsstolemyspacebar
I agree with the sentiments of the posters that SQL is not going anywhere, but I had a question.
As I am designing more and more complex web apps, I am constantly having to think of new, innovative ways to design the tables and databases and am currently making it up as I go. Does anyone have a recommendation for books/sites that talk about good design practices, that is not "How to use SQL" and is relatively agnostic about the specific brand of DB?
Sorry for the OT post, it's just something that has been bugging me for a while.
It's cool. We'll start scanning peoples' consciousnesses into computers, as per yesterday's article, and make them our database-indexing cyberslaves.
IIRC uses DB2 as its 'file system'.
Fabian Pascal trolls, thee also do I summon!
Let the flamage commence!
They have some interesting things to say about databases, and monster truck rallys.
According to the ACM...
No... According to Sleepycat, who have a great name and logo, but an otherwise very annoying data store.
The Third Manifesto came out years ago.
If you want to haul 300 metric tons of rock from point A to point B, you need a dump truck.
I do believe you would need more than one. Perhaps a convoy of them.
http://www.mysqluc.com/cs/mysqluc2005/view/e_sess/6218
Slide 6 shows their MySQL database to be 26TB in size with 600,000 I/O per second.
Just so you know.
by MARGO SELTZER, SLEEPYCAT
Sleepycat? The guys who make a brain-dead key/value database with no data manipulation or integrity capabilities? Who are they to educate others on the topic of relational databases? (Sleepycat's products are useful tools, but they are not true databases).
while data management has become almost synonymous with RDBMS, however, there are an increasing number of applications for which lighter-weight alternatives are more appropriate.
Ahh, so the proper title of this paper should be: "Beneath Relational Databases" or "Below Relational Databases". Because the relational model is a *complete* model for data storage and manipulation, so if you have a subset of this functionality, you are not "beyond" it.
As argued by Stonebraker, the relational vendors have been providing the illusion that an RDBMS is the answer to any data management need. For example, as data warehousing and decision support have emerged as important application domains, the vendors have adapted products to address the specialized needs that arise in these new domains. They do this by hiding fairly different data management implementations behind the familiar SQL front end. This model breaks down, however, as one begins to examine emerging data needs in more depth.
Well, the mention of Stonebraker's name as an authority on databases is generally an indicator of a content-free paper, but let's be sure we're talking about the same thing: the relational *model* is a *complete* model. There is no other more effective model, in fact as far as I know, there are no other complete models!
So if you want to use the relational model as a foundation to build new database products, go right ahead. If you're talking the same old vendor BS about "post relational" or "XML" (hierarchical) or "object" (network and/or hierarchical), then please shut up!!
My feeling is when he says "in depth", he means "less depth".
As more documents are created, transmitted, and operated in XML, these translations become unnecessary, inefficient, and tedious. Surely there must be a better way. Native XML data stores with XQuery and XPath access patterns represent the next wave of storage evolution. While new items are constantly added to and removed from an XML repository, the documents themselves are largely read-only.
Uh, yes there is a better way: create an XML data type in a relational database with a full set of XML operators. The relational model doesn't care about data types.
I have no interest in giving up the general relational model for a hierarchic model (rejected decades ago as not being general enough) based on a TEXT FILE FORMAT.
Stream processing is a bit of an outcast in this laundry list of data-intensive applications.
I smell Stonebraker.. yes, it's an outcast because stream processing has nothing to do with data storage!!!
Some argue that database architecture is in need of a revolution akin to the RISC revolution in computer hardware
Yes, all these people need to study and understand the relational model which was developed 30 years ago and is still the only complete data model. The relational model can be described in half a page, and consists of a small number of core operations from which any possible data storage and manipulation need can be developed. Stop thinking about implementations, think about the *model* and then use that to develop new implementations!!
Old-style database systems solve old-style problems; we need new-style databases to solve new-style problems.
What does this mean exactly? I need to store and manipulate data without limitations. The relational model offers this. What is "old" or "new" here? I'm not going to switch to an ad-hoc subset of the relational model because it's "new".
This "paper" (wasn't there one a couple weeks from some Microsoft dude, which was equally useless?) commits the same old sins: 1) look at existin
One dump truck, 15 trips.
You'll have that sometimes...
I woke up and had an interesting thought...
I can imagine XML documents created in such a manner that they could constitute an object from an OOP (Object Oriented Programming) perspective, containing their own schema, characteristics, relationships and data. Further, I can imagine the ability of accepting such an XML document object into a range of other things, such as a modular program (dynamic program extensibility) or a database (temporary database extension). I can imagine the ability to package up such XML documents so that databases can be built simply by linking the XML documents together.
So (for example), one could send a request to the XML index in the sky, find and link documents containing a subset of known medical facts, and then kick off a data mining process that could discover previously unknown medical relationships. All without needing to know anything other than where to find the XML files.
Now imagine a tool that could convert all of the terabytes of data the world is generating every day into small, linkable, OOP like XML files... Sounds like a great open source project to me...
The NSA: The only part of the US government that actually listens.
When relational systems finally began to appear (and I'm thinking specifically about IBM's System R) they were dog slow, and the extant hierarchical and CODASYL network databases of the day ran rings around them. Still do, unless you throw lots of hardware at the RDBMS.
RDBMS have lots of advantages over older technologies, but performance is not among them.
1) Reasonably simple API
2) Scales to very large databases
3) Cross-platform/architecture
4) Performs very well.
I am proof that SQL will be around for a while. When I first saw Unix back in the late 80s, I thought "this is too hard to use, why would anyone need this?" I have been a Unix/Linux user since about '92.
When I took my first SQL class, I thought "these queries are very cumbersome. SQL is stupid." I still use it today.
In '93 I heard about this thing called the World Wide Web, and thought "This is unnecessary. I can find whatever I need on gopher and ftp sites. Why would I want a gui thrown on top of it?"
As you can see, I am quite the visionary.
My beliefs do not require that you agree with them.
The problem is not that you have terabytes of data in total, because you don't deal with the total data. It's no different from swimming - you're only using the surface. The problem is extracting only that surface, so that it can be used on a local device.
The reason this is a problem is that conventional "intelligent" devices have extremely limited memory - far too limited for even a views-based approach to work. You would need to build devices with between ten and a hundred times their current capacity to realistically handle such a system.
Not all is lost, though. Most of a PDA is empty space. Lots of chips means lots of casing, and casing is largely vacant. If you had wafer-scale memory, or even had a single strip of silicon instead of individual chips, you could pack in far more memory with actually LESS space being taken.
(You only need one case on a wafer, and therefore only have one lot of overhead. A typical SIMM board has 9 chips, therefore 9 times the overhead. To get 100 times the memory, you'd currently need 900 chips, which would be 900 times the overhead. But you'd STILL only need one wafer, because you can get over 900 chips off a single wafer.)
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
Freemasons knew this, so they didn't write anything down. It's funny how "information" is dispersed, and kept. People are fighting over a way to store information so that it is indestructible, but look at what the Egyptians did. They tried, even using pictures, like NASA did when they launched that space pod, in order to express to aliens where Earth was, and what was on Earth. (They had to use pictures due to obvious problems with language.)
My point is that I don't think this problem of finding ways to keep data will ever be solved. There are physical laws that govern the way matter acts. None of them seems to hold data about a certain time or place.
ASCII Flat files are the way to go. If I can't fgrep it...then to hell with it. SQL my ass..
Many of the capabilities that, according to the article, should exist in the next generation of databases exist today in IBM Informix Dynamic Server. Examples: the ability to use indexing technology other than B-Tree+, the ability to describe data beyond traditional types, etc. Furthermore, the article relies heavily on comments by Stonebraker, the father of Informix's Object Relational capabilities.
One of these can carry 290 metric tons (tonnes?). That's pretty close.
Many of the 'problems' and concerns regarding current tools could be avoided if the software development team had a strong set of skills and concepts to begin with. I see that as a much bigger problem than the ones analyzed in most articles on current weaknesses in software development, which talk about problems with the technology. IMO, the problem with humans comes first and is more critical.
It's amusing, but ORDBs are just too complicated for the market.
They should be simple, but they're not. Plus, the implementation tends to tie you to a given vendor. It's not a real problem since most people never move off of their database, but people don't like thinking about it too much.
The relational model is nice because it's easy to understand, and you can always flatten an object graph into tables if you need to. You can also poke around and visually see the structure, which is nice.
Lastly, there's no real benefit to ORDBs compared to RDBMSs. People can do everything they need/want to do today with RDBMS, so there's no reason to move. There hasn't been anything really compelling, I suppose.
As for the future, I think that while OO databases are all the vogue, O databases, where each bit of information is represented by its own object and will have the ability to demonstrate autonomous agency, are a good area for research. Then we can just let the data speak for itself! ;^)
"Every person takes the limits of their own field of vision for the limits of the world."
- Arthur Schopenhauer
"Can there be a Klein bottle that is an efficient and effective beer pitcher?"
Go fuck yourself, Roland.
To me, this entire article reads like a plug for Sleepycat, written, not surprisingly, by a Sleepycat person.
Relational data model is just that -- a data model. It doesn't concern itself much with implementation (and therefore, performance in any particular environment) or with how applications use the data. And that's really the point -- relational databases are application-agnostic. They are designed to store the data that will be possibly accessed by applications that are not yet conceived. That's the reason they put great emphasis on internal correctness of the data. Once you have a database specialized for one application, that's not really a database in the relational sense -- that's a way to persist your application data. And that's where Berkeley DB shines. It doesn't replace relational databases, it just serves a totally different purpose.
My only problem with Microsoft is the severity of bugs in their software.
really implementing a relational model to begin with? Then we can decide if the relational model is broken or just the vendor implementation.
How about... a query language that is fully set operations compliant, i.e., something other than ANSI SQL which is a strange mixture of set and bag operations, and a mixture of relational algebra and relational calculus and some other 'extensions'.
How about... realizing that a major design goal for the relational model was data integrity. Modularity and configurability are also good goals but if you are serious about your data, integrity will be at the top of the list.
The biggest problem I see with databases is that very few people understand how to use them. Here are a few tips:
1) a table is *not* a class or an object. Tables + constraints + user defined types + constraints etc. when used properly can define domains which are close to classes and objects.
2) Learn how to normalize. A badly (or flat out not) normalized database threatens data integrity by violating the once-and-only once rule. As a rule of thumb if the table has more than 20 fields in it you should review your data model and make sure it is properly normalized.
3) Point 2 is often the consequence of mindlessly slurping in spreadsheets or MS Access database tables. Anyone doing this has no business being within 50 feet of an IDE.
4) Ditch RAID 5. 0+1 will give better performance in most cases. Managers like RAID 5 because it is cheap, but you get what you pay for.
5) Have multiple channels for data, transaction logs, large indices and O/S or user applications to reduce bottlenecks. This is expensive but for large databases going cheap will hurt you.
6) Learn a little theory, it won't hurt you. In fact it can save a large amount of time and trouble. Do not be afraid of learning about the technology you are using. After all, technology is what you are good at, right?
7) If it is a read-only database, turn off logging for speed (impossible to do under SQL Server 2000 btw). Also, if a table is on a purge and load paradigm (many reporting and/or data warehouse tables are) turn off logging on the table level if your version of the database engine allows you to do so. Likewise, turning off logging on a hand held or other single user system may be appropriate, just make sure two people do not try to use the database at the same time.
8) Avoid XML. Too much bloat.
9) Learn how to use indices on tables.
10) Learn how to read a performance monitor/top etc.
PostgreSQL is both working hard to become truly relational AND is adding support for geographic objects and objects. The MySQL crew is working hard to improve. Oracle has some nice performance features but I think their 'Object/Relational' implementation is broken. SQL Server is getting 'long in the tooth'. There is also a great need for temporal databases and lightweight engines. But remember, there is no 'silver bullet', no short cuts. Just hard work to be done.
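For tip 2, a small sketch of the once-and-only-once rule, using Python's sqlite3 module. The customer and purchase tables and their columns are invented for this example. The customer's city is stored in exactly one row, so a single UPDATE corrects it for every purchase and no stale copy can be left behind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: each fact about a customer lives in one place only.
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    city        TEXT NOT NULL
);
CREATE TABLE purchase (
    purchase_id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    item        TEXT NOT NULL
);
INSERT INTO customer VALUES (1, 'Ada', 'London');
INSERT INTO purchase (customer_id, item) VALUES (1, 'keyboard'), (1, 'monitor');
""")

# The customer moves: one row changes, and every purchase sees the new city.
conn.execute("UPDATE customer SET city = 'Paris' WHERE customer_id = 1")

rows = conn.execute("""
    SELECT p.item, c.city
    FROM purchase AS p
    JOIN customer AS c USING (customer_id)
    ORDER BY p.item
""").fetchall()

print(rows)
```

Had the city been copied into every purchase row, as it typically is in a slurped-in spreadsheet, the same move would need one update per row, and missing any of them would leave the data contradicting itself.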
putting the 'B' in LGBTQ+
for example, the wheel, which is why I've invented the square.
And also, the computer... fucking things ancient... hello! I use the Plumation Machine instead because it's knew knew new!
I'm pretty sure whatever is next will be put on top of relational databases... rather than screw with decades of optimizations.
but SQL does suck. Really. not because it's old. it always sucked.
-pyrrho
.. like MAOS
Ah, the good days of Carmageddon...
Stupidity is an equal opportunity striker.
Fellow slashdotter Bill Dog
Has anyone noticed that the author of the article is from Sleepycat (which sells commercial licenses for Berkeley DB to embedded systems developers)?
.. it just so happens that Sleepycat's flagship products are Berkeley DB (a flat-file database) and DBXML (an XQuery engine built on top of that).
She puts forth a case against SQL and relational databases in general and claims that many applications (like directory services and search engines) have read-heavy, hierarchical access patterns which favour lighter-weight, non-relational, transaction-optional databases.
And
Comma-delimited lists? YES! Lisp will rise again!
Oracle, and (blech) DB2
"Blech?"
Can you back that up with some real-world examples where DB2 was worse than Oracle?
One of the problems with current databases is that a typical relational database doesn't have enough dimensions. Designing a table to store data is trivial - but what happens when you need to know the intersection of X and Y at time Z?
This is a fairly common question in data warehousing: What is the data today, what did it look like yesterday, last week, and last year?
I have seen it worked around in silly ways (snapshot and rename a table every day/week/month) and more clever ways (use separate transaction tables to record changes), but never in a particularly elegant way.
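For what it's worth, the "separate transaction table" workaround can be sketched in a few lines with Python's sqlite3 module. The price_history table, its columns, and the price_as_of helper are hypothetical names made up for this example. Every change is appended as a new row with the date it took effect, and an "as of" query picks the latest row at or before the date asked about.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical append-only history table: one row per change, never updated.
conn.executescript("""
CREATE TABLE price_history (
    product    TEXT NOT NULL,
    price      REAL NOT NULL,
    valid_from TEXT NOT NULL,  -- ISO-8601 dates sort correctly as strings
    PRIMARY KEY (product, valid_from)
);
INSERT INTO price_history VALUES
    ('widget',  9.99, '2005-01-01'),
    ('widget', 12.50, '2005-03-15'),
    ('widget', 11.00, '2005-05-01');
""")

def price_as_of(conn, product, date):
    """Return the price in effect on the given date, or None if none existed."""
    row = conn.execute(
        """
        SELECT price
        FROM price_history
        WHERE product = ? AND valid_from <= ?
        ORDER BY valid_from DESC
        LIMIT 1
        """,
        (product, date),
    ).fetchone()
    return row[0] if row else None

print(price_as_of(conn, "widget", "2005-02-01"))
print(price_as_of(conn, "widget", "2005-04-01"))
print(price_as_of(conn, "widget", "2004-12-31"))
```

It answers "what did it look like last week" without snapshotting whole tables, though it is still a workaround: every query has to carry the date logic itself, which is arguably the inelegance being complained about.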
Wiser colleagues whispered to me the dirty answer "object relational" and scurried away to their dens of Rob Zombie and J2EE. I never got my head around object relational databases before leaving that world, and so am left to ponder papers from IDC with statements like this one:
"putting object extensions on RDBMSs is tantamount to adding stereo radios and global navigation systems to horse-drawn carriages"
Ouch, is that a swipe at Oracle? Seems that as far back as 1997 pundits have said that the future is in ODBMS, and not RDBMS or ORDBMS. Hmm...
these queries are very cumbersome. SQL is stupid
You had the right idea on this one, the rest of the industry has its head up its ass.
SQL: SELECT * FROM Customer
Relational algebra: Customer
SQL: SELECT first_name, last_name, order_id FROM Customer, Order WHERE Customer.customer_id = Order.customer_id
Relational: Customer JOIN Order { first_name, last_name, order_id }
And that's just simple queries.. try to write a query in SQL that depends on two query results being equal (for instance, "show me a list of all customers who have each bought at least one of every product").
The idea of the relational model is simple: it is based on set theory, which has a strong mathematical model. There is no equivalent model for object databases, nor for tree-based databases like MUMPS. There is no strong mathematical basis by which you can judge the integrity of your data.
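For reference, the usual SQL answer to the "every product" question is a double NOT EXISTS (relational division). The sketch below runs it through Python's sqlite3 module against made-up product and purchase tables. It reads as "customers for whom there is no product that they have not bought", which rather illustrates how roundabout the SQL gets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables and data, invented for this example.
conn.executescript("""
CREATE TABLE product (product_id INTEGER PRIMARY KEY);
CREATE TABLE purchase (
    customer   TEXT NOT NULL,
    product_id INTEGER NOT NULL REFERENCES product(product_id)
);
INSERT INTO product VALUES (1), (2), (3);
INSERT INTO purchase VALUES
    ('alice', 1), ('alice', 2), ('alice', 3), ('alice', 1),
    ('bob',   1), ('bob',   2),
    ('carol', 3);
""")

# Keep a customer only if no product exists that they never purchased.
rows = conn.execute("""
    SELECT DISTINCT p1.customer
    FROM purchase AS p1
    WHERE NOT EXISTS (
        SELECT 1
        FROM product AS pr
        WHERE NOT EXISTS (
            SELECT 1
            FROM purchase AS p2
            WHERE p2.customer = p1.customer
              AND p2.product_id = pr.product_id
        )
    )
    ORDER BY p1.customer
""").fetchall()

print(rows)
```

Only alice has bought all three products in this data, so she is the only row returned. Note that the query draws customers from the purchase table, so someone with no purchases at all never appears, which is the desired answer here.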
Cache', by Intersystems (a Post-Relational Database!) is based on MUMPS. You've seen their adverts here on Slashdot. They claim to be object-relational, but they are no such thing: they are MUMPS. They went on a buying spree and purchased up most of the failing MUMPS vendors (DSM, MSM, etc), and now they are the big guys in the M world.
They have some pretty nifty hacks which compile their "object-oriented MUMPS" programming language (I forget what it's called) into straight M. Fun. Doesn't stop it from sucking hard.
MUMPS is, at best, a fairly bizarre language with persistent storage of global arrays.
MUMPS drives me nuts. It uses whitespace for blocking just like Python, but they had so much trouble with it, they eventually allowed a '.' to replace the whitespace, so you end up with code like this: (I stole that from this discussion.)
(Sidenote: I have to admit, my exposure to MUMPS is one of the primary reasons I despise Python's whitespace-as-blocking. It seems to replace the poor aesthetics of brace-blocking with something more error-prone and stupid-looking, though more aesthetically pleasing. But all that's just opinion. I'm sure Python is a good language, just as I'm sure MUMPS is not.)
Microsoft is to software what Budweiser is to beer.
"Blech?"
:-)
Can you back that up with some real-world examples where DB2 was worse than Oracle?
I could, but it would only start a useless argument over what database everyone prefers. Let's just say that my experience with DB2 has left me with less than stellar feelings toward that database and leave it at that.
FWIW, my experience is with UDB and not the Mainframe DB2. At the end of the day, the two are very different beasts.
Javascript + Nintendo DSi = DSiCade
'fraid not. One dump truck should do the trick.
http://jcsnippets.atspace.com/ - a collection of Java & C# snippets
Another approach to the problem: JSR 170: Content Repository for Java technology API.
Standardizing the interfaces to various data resources (filesystem, database, cache, ...).
The expert group reads like a who's who in data management. And it seems to be very near to the final draft.
I've already got Windows and a girlfriend, I really don't need another irrational database.
Faster! Faster! Faster would be better!
having spent ten years using both, I can tell you that:
Pro-DB2
1. db2 is less than 50% of the price of oracle
2. db2 is much easier to administer than oracle (check out oracle recovery procedures!)
3. db2 is much less vulnerable to being corrupted than oracle
4. db2 scales higher than oracle
5. oracle's Larry Ellison is an offensive nut-case
6. oracle's sales team will screw you
Pro-Oracle
1. oracle's locking methods are easier to develop for
2. oracle has better third-party tool support
3. oracle has more expertise in the marketplace
4. oracle has some features that can sometimes be lifesavers - like compressible partitions. When they aren't they just get in the way tho.
5. db2 fixpacks are sometimes flaky
And I'd usually choose to work with db2 over oracle: much easier to admin, rock solid. You can easily modify the databases or instances right online, restart if necessary (seldom is these days) - all without a worry about backing things up first. With oracle you don't even touch it without doing a backup. That tells you a lot about the differences right there.
Plus, it's easy to train junior IT personnel to become dbas. A year later you've got somebody who's a completely productive prod & dev dba. That never happens with oracle - where you need much more specialization.
Also, until db2 v8 came out two years ago, it was pretty far behind oracle 8 & 9. Now, it's in a great competitive position.
JDBC (probably ODBC too, tho I haven't used it in eight years) helps to standardize key generation (in JDBC 3.0 FINALLY), and Date processing (christ, date functions are so annoying). Most other operations can usually be done by the platform language that is processing the data, so you can avoid the tie-in that results from various SQL dialects' built-in functions. XML databases were a total flop, and so were object databases. I agree, SQL engines are so mature now that I don't see any database tech replacing them for another ten years. By the way, can we get PostgreSQL's (and now Oracle's) support of regular expression LIKEs standardized? And can we please get JDBC to support something like: INSERT $price INTO pricetable where productid = $productid rather than using ?'s and counting which ? to set values to? Hibernate's query language does this, and I really like it.
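For what it's worth, some database APIs outside JDBC already bind by name. A minimal sketch using Python's standard sqlite3 module, which accepts `:name` placeholders with a dict of values; the `pricetable` schema here is assumed for illustration, and the wish above is read as an INSERT followed by an UPDATE.

```python
import sqlite3

# In-memory database with a made-up pricetable (schema is an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pricetable (productid INTEGER PRIMARY KEY, price REAL)")

# Values are bound by name, so there is no counting of '?' positions.
conn.execute(
    "INSERT INTO pricetable (productid, price) VALUES (:productid, :price)",
    {"productid": 7, "price": 9.99},
)
conn.execute(
    "UPDATE pricetable SET price = :price WHERE productid = :productid",
    {"price": 12.5, "productid": 7},
)

row = conn.execute(
    "SELECT price FROM pricetable WHERE productid = :productid",
    {"productid": 7},
).fetchone()
```

The order of keys in the dict does not matter, which is the whole point of the complaint about counting question marks.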
Hey, I'm just your average shit and piss factory.
If any smart DBMS developers are listening, what I'd like is to define a set of queries within the database (for a _simple_ example, "male", "over 60" and "salary x") and then be able to refer to these criteria by name only, having the database build the query based on these rules as I choose to combine them (select xxx from yyy where 'male', or where 'male' and 'over 60').
Sort of like stored procedures in implementation - they could be called stored query definitions.
Because these query definitions would already be parsed, they wouldn't require overhead to re-parse each time the stored query definition is executed.
Please have this feature ready in about 6 months :)
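Until a DBMS ships that, the idea can be approximated in application code. A rough sketch, assuming a hypothetical `person` table and a made-up `CRITERIA` registry: each named criterion maps to a SQL fragment plus its bound values, and a helper combines them with AND or OR. Unlike the wish above, nothing here is pre-parsed by the database; the query is rebuilt as text each time.

```python
import sqlite3

# Hypothetical registry of named criteria: name -> (SQL fragment, bound values).
CRITERIA = {
    "male": ("sex = ?", ["M"]),
    "over 60": ("age > ?", [60]),
}

def build_where(names, joiner="AND"):
    """Combine named criteria into one WHERE clause and its parameter list."""
    if joiner not in ("AND", "OR"):
        raise ValueError("joiner must be AND or OR")
    fragments, params = [], []
    for name in names:
        sql, args = CRITERIA[name]
        fragments.append(f"({sql})")
        params.extend(args)
    return f" {joiner} ".join(fragments), params

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, sex TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [("Al", "M", 72), ("Bea", "F", 65), ("Cy", "M", 30)],
)

def select_names(names, joiner="AND"):
    """Run a query built purely from criterion names."""
    where, params = build_where(names, joiner)
    sql = f"SELECT name FROM person WHERE {where} ORDER BY name"
    return [r[0] for r in conn.execute(sql, params)]

both = select_names(["male", "over 60"])          # 'male' and 'over 60'
either = select_names(["male", "over 60"], "OR")  # 'male' or 'over 60'
```

Only the fragments in the registry ever reach the SQL text; the caller supplies names, so the values stay bound parameters.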
Once I was a four stone apology. Now I am two separate gorillas.
An XML doc is a tree, thus it is hierarchically organized data. There have been hacks to try to extend around this limitation, but relational data still has superior flexibility.
that's why XML databases flopped
Hey, I'm just your average shit and piss factory.
That's nice. It won't handle a multi-terabyte database, though. That's the domain of Terabase, Oracle, and (blech) DB2. It's also what the article is about.
Add postgres to that list, too.
Tough to find examples now. They used to ask people who had DBs over 1TB to post to the list, but they stopped several years ago because they had plenty of them.
I remember Datainfosys's spam rules DB is over 1TB, and the American Chemical Society's historical archives, and there was some genome mapping project. Really they had a couple examples a month, so it's not like there are only 5-10 dbs out there in postgres in these sizes.
rage, rage against the dying of the light
You may be being facetious, but that is something that SQL doesn't do well.
Although since PostgreSQL can do LIKEs with regexps in them, things are better...
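The lack of a standard shows up quickly. PostgreSQL spells regex matching with its `~` operator, while SQLite only parses `X REGEXP Y` and leaves the implementation to the application. A small sketch of the SQLite side using Python's sqlite3 module; the `part` table and the pattern are invented for illustration.

```python
import re
import sqlite3

def regexp(pattern, value):
    """SQLite evaluates `value REGEXP pattern` by calling regexp(pattern, value)."""
    if value is None:
        return 0
    return 1 if re.search(pattern, value) else 0

conn = sqlite3.connect(":memory:")
conn.create_function("REGEXP", 2, regexp)  # register the user-defined function

conn.execute("CREATE TABLE part (code TEXT)")
conn.executemany("INSERT INTO part VALUES (?)", [("AB-123",), ("ab123",), ("XY-9",)])

# Two capital letters, a dash, then digits.
matches = [
    r[0]
    for r in conn.execute(
        "SELECT code FROM part WHERE code REGEXP ? ORDER BY code",
        (r"^[A-Z]{2}-\d+$",),
    )
]
```

The regex dialect here is Python's, which is exactly the portability problem: the same query text means something slightly different on every engine.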
Hey, I'm just your average shit and piss factory.
I think relational databases are overrated, to the detriment of other high-quality open source databases. Some of the most important and fastest databases (all from the same author!) are:
- Gigabase
- FastDB
- Perst
- DyBase
- GOODS
All of them are at Konstantin Knizhnik's home page.
Also, until db2 v8 came out two years ago, it was pretty far behind oracle 8 & 9. Now, it's in a great competitive position.
Last time I did a comparison, it was DB2 7.1 vs. Oracle 8. Most of what bit me in the ass with DB2 was its flaky management tools and a multitude of minor details that needed tweaking from the days when it was a mainframe database. (e.g. Why does it need a buffer large enough to hold the entire blob chunk that's going to be transferred? That's just stupid. It should pull across as much as the buffer can hold, fill the requesting array, then go back for more and repeat.)
As I said, it would only start a pointless discussion on who likes what database.
Javascript + Nintendo DSi = DSiCade
For those who are around Pittsburgh or CMU, an interesting company sprang up building P2P based database technology (Maya Design).
For a good read, check out the following paper:
http://www.maya.com/web/what/papers/maya_universal_database.pdf
It has been around since the late 1960s. In fact, one could argue that PICK (actually, PICK BASIC) was one of the first successful commercial instances of a "virtualized processor" system. That is, the PICK core was a VM that ran PICK assembler p-code (of a sort), and the VM was implemented in software running either as the OS or as part of the OS (i.e., in *nix implementations). PICK BASIC applications were compiled to the p-code, and in theory (which actually worked quite well, IIRC) the compiled objects could be run on any PICK implementation (barring vendor-specific implementation details, always inevitable in this kind of situation; cf. Sun Java vs MS J++). Another point of fact is that some companies (I think Fujitsu was one) created hardware implementations of the PICK VM - in other words, "PICK processors" - which obviously ran the code much faster than the software version.
I know that PICK is still available from various vendors (D3 is one - at least, it was not too long ago) - I also think an open-source version is in the works. It was long used for "green-screen", head-down vertical market type applications, but today there are other interfaces to it beyond a serial terminal (GUI, Web, etc). There are also companies who have created completely different DBMS systems based on the PICK data model, but not using PICK BASIC or all of the other old methods...
Finally, it is possible (though very kludgy, and I wouldn't recommend it except as a way to "play" around) to simulate extra dimensions in a standard relational DB - set up a column as a TEXT or BLOB data type, then store the data in that, separated by a non-keyboard delimiter (i.e., ASCII 254 or something). Parsing, insertions and deletions won't be easy (nor fast), but I would imagine one could set up stored procedures to handle such needs. It isn't pretty, it isn't recommended, but it can be made to work as long as the data being stored isn't too complex...
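A toy version of that kludge, just to show the moving parts. This sketch uses SQLite from Python instead of stored procedures, a made-up `customer` table with a multi-valued `phones` column, and character 254 as the delimiter; every name here is an assumption for illustration.

```python
import sqlite3

DELIM = chr(254)  # non-keyboard delimiter, as suggested above

def pack(values):
    """Join several values into one delimited field."""
    for v in values:
        if DELIM in v:
            raise ValueError("value contains the delimiter")
    return DELIM.join(values)

def unpack(field):
    """Split a delimited field back into its values."""
    return field.split(DELIM) if field else []

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, phones TEXT)")
conn.execute(
    "INSERT INTO customer VALUES (?, ?)",
    (1, pack(["555-0100", "555-0101"])),
)

def append_value(conn, customer_id, value):
    """Read-modify-write: the database cannot update one value in place."""
    (field,) = conn.execute(
        "SELECT phones FROM customer WHERE id = ?", (customer_id,)
    ).fetchone()
    conn.execute(
        "UPDATE customer SET phones = ? WHERE id = ?",
        (pack(unpack(field) + [value]), customer_id),
    )

append_value(conn, 1, "555-0102")
phones = unpack(conn.execute("SELECT phones FROM customer WHERE id = 1").fetchone()[0])
```

Every insertion rewrites the whole field, and the database can neither index nor constrain the individual values, which is why this stays in "play around" territory.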
Reason is the Path to God - Anon
While the XML portion looked interesting, all the other "needed features" were totally useless from my perspective. I'm more concerned with the ability to handle the facts that our biochemical/biological recordings alter as our workflow progresses and we learn new techniques or alter existing techniques. And how we data mine from that changing universe.
So, mostly not very useful from my perspective, and they missed the important things that would actually matter here.
-- Tigger warning: This post may contain tiggers! --
Here is the site for MaVerick, the open source implementation of PICK (crazily enough, running under the Java VM!!!) - which also uses a regular DBMS as the backend (Berkeley DB, MySQL, PostgreSQL) - hmm - makes me wonder if they are simply doing what I said to try, but in a more maintainable manner...?
Reason is the Path to God - Anon
SQL will enjoy a long life. There will probably be another update to the SQL standard, SQL07 or something, just about when everyone gets caught up to the SQL99 standard of today. It does what it does very well. While the syntax is cumbersome, there aren't many better ways to represent the relational complexity. Sure, they could stand to fix the god-awful decisions that were made with regards to quoting and nesting characters and things of that nature, but the essential syntax really needs to be about like it is.
The big thing where I see a real major follow-on to SQL emerging is in the Temporal area. There's some existing work out there, lots of papers and designs and whatnot, for Temporal SQL. SQL just isn't that great at handling time-related data, even though a lot of people coerce it into doing so. The Temporal SQL replacements/extensions attempt to remedy these, and someday whatever comes of the Temporal research will become mainstream.
11*43+456^2
> Most of what bit me in the ass with DB2 was its flakey management tools and multitude of minor
> details that needed tweaking from the days when it was a mainframe database.
yeah, i didn't like those earlier versions of db2. in fact, back in the 90s on a really huge project IBM offered my project db2 for free. We opted to spend $800k on informix instead! That was v5 if I remember, but I don't think that v7 was all that much better really.
Databases are going to fragment. Different types for different data.
You do realize you don't know how big my parent table is nor how infrequently the children change.
All generalizations are false.
IMHO, when the child data changes very rarely or never, update triggers that recalculate parent totals are sometimes the way to go. This violates third normal form and is the most common denormalization I've done. Hell, I've lived without the update triggers and just stored totals. When I was a kid we ran batch jobs to check data validity.
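A minimal sketch of that trigger approach, run against SQLite from Python. The `invoice` and `line_item` tables, the trigger names, and the use of integer cents are all assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE invoice (
    id    INTEGER PRIMARY KEY,
    total INTEGER NOT NULL DEFAULT 0   -- stored (denormalized) total, in cents
);
CREATE TABLE line_item (
    id         INTEGER PRIMARY KEY,
    invoice_id INTEGER NOT NULL REFERENCES invoice(id),
    amount     INTEGER NOT NULL        -- in cents
);

-- Each trigger recomputes the parent's total from its children.
CREATE TRIGGER line_item_ai AFTER INSERT ON line_item BEGIN
    UPDATE invoice
       SET total = (SELECT COALESCE(SUM(amount), 0)
                      FROM line_item WHERE invoice_id = invoice.id)
     WHERE id = NEW.invoice_id;
END;
CREATE TRIGGER line_item_au AFTER UPDATE ON line_item BEGIN
    UPDATE invoice
       SET total = (SELECT COALESCE(SUM(amount), 0)
                      FROM line_item WHERE invoice_id = invoice.id)
     WHERE id IN (OLD.invoice_id, NEW.invoice_id);
END;
CREATE TRIGGER line_item_ad AFTER DELETE ON line_item BEGIN
    UPDATE invoice
       SET total = (SELECT COALESCE(SUM(amount), 0)
                      FROM line_item WHERE invoice_id = invoice.id)
     WHERE id = OLD.invoice_id;
END;
""")

def total(invoice_id):
    """Read the stored total without touching the line items."""
    return conn.execute(
        "SELECT total FROM invoice WHERE id = ?", (invoice_id,)
    ).fetchone()[0]

conn.execute("INSERT INTO invoice (id) VALUES (1)")
conn.execute("INSERT INTO line_item (id, invoice_id, amount) VALUES (1, 1, 500)")
conn.execute("INSERT INTO line_item (id, invoice_id, amount) VALUES (2, 1, 250)")
after_insert = total(1)

conn.execute("UPDATE line_item SET amount = 300 WHERE id = 2")
after_update = total(1)

conn.execute("DELETE FROM line_item WHERE id = 1")
after_delete = total(1)
```

Reads of the total never touch `line_item`; the cost moves to writes, which is the trade being argued for when children rarely change.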
John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
"show me a list of all customers who have each bought at least one of every product"
Congratulations, you've discovered that SQL does not express the full relational algebra. Here's your sticker and decoder ring showing you belong to the Captain Obvious Fan Club!
Speaking of obvious, I don't know how obvious that query is in relational algebra, and hell if I can remember how to express it in SQL. It is possible, but I do guarantee it will be an expensive query, since you'll hit every last row in your orders table to calculate that. A real-world solution would be to just sum up the number of distinct products each customer has ordered and compare that with the total number of products offered. If you haven't removed products, that number will give you an exact answer.
The real irksome thing is that most people wanting to throw out SQL are trying to replace it with something even less expressive.
I am no longer wasting my time with slashdot
The lead-in carelessly claims that the opinions in the ACM Queue article are those of the ACM. This is almost certainly not true; the ACM merely operates an on-line journal where authors can express their own opinions in this case.
Once again, Slashdot manages to bollocks up a story with a careless, inaccurate, flip or overly opinionated lead-in that might have taken about 2 minutes to clean up.
Heh, MySQL...the moped of databases!
Metakit is a radical alternative to conventional RDBMS. Portable, self-contained, on-the-fly restructuring, fast, memory-mapped...
IIRC Apple uses it for MacOS X's address book.
I think it's more because people look at the name, try to figure out how to pronounce it, then give up
Why, given that this FAQ entry clearly states that it's "post-gress-cue-ell"?
All relational databases have a theoretical problem that makes it possible for entrants to a marketplace to succeed.
a) They are based on relational algebra. This is good, but relational algebra will only take you so far. There is an entire domain of relational calculus, and if you cannot implement all of it, you can at least make a go of cherry-picking a piece of it for your own application. I would like to think my own domain is a good stab at this, but since I'm maxed on all my credit cards and drive two cars that I can't afford, I'll assume not!
b) They are ruined by the languages that talk to them. Yes, I hear about the virtues of object oriented programming, but relational algebra, for what it is worth, is much more theoretically complete.
I write my own shareware, specialized, non-relational database and I've come to the conclusion that it is enormously difficult to match the performance of experienced relational database designs when trying to do relational types of things. I can make my database load a certain kind of data many, many times faster than SQL Server can, but when I try to do things that are more to the strength of a relational engine, my stuff looks pretty weak.
Anyone can make a relational database engine that looks really good at a million rows. But jack that up to ten million, or a hundred million, and then see just how well your design stacks up!
This is my sig.
sure, the latest hip "Tres Kewl" software for business might be written in something else, but SQL will be around for a long, long time.
Heck, by popular demand, "Tres Kewl" will probably be extended with an SQL backend called "TreSQL".
order(Orderid, Orderedby),
orderitem(Orderid,Itemid),
item(Ite
itemshipper(Itemid,Shipperid),
ship
and Prolog finds them all for you. IOW you specify a query by specifying the relationships among your tables (5 tables here) and by specifying constraints (e.g., "soap" and "marathon", above).
Learning Prolog shows you how SQL should be used, but also shows you how limited SQL is.
Your advice: don't store totals on invoices and just requery the line items every time?
In the ivory-tower relational design, you're supposed to set up a view on SELECT SUM(...) and then index that. This way, you gain the performance benefits of denormalization, but the DBMS takes responsibility of maintaining consistency on update. Application-level denormalization happens primarily because the current SQL DBMS implementations do a urine-poor job of handling indexed views.
One problem is that the price of a given line item often changes as a function of time, but the amount charged on a given invoice is intended to use a snapshot of the price offers at the time the order is agreed upon. In that case, representing each invoice as a separate negotiated offer in the database might still follow the normal forms. A true relational ivory-towerist would add the effective date of a price change as an additional field.
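The effective-date idea can be sketched as follows: prices carry the date they took effect, and each invoice line picks the latest price on or before its order date. The table and column names are made up for illustration, dates are ISO strings so they compare correctly as text, and amounts are integer cents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE price (
    product_id     INTEGER NOT NULL,
    effective_date TEXT    NOT NULL,   -- ISO 8601, e.g. 2004-06-01
    amount         INTEGER NOT NULL,   -- in cents
    PRIMARY KEY (product_id, effective_date)
);
CREATE TABLE invoice_line (
    invoice_id INTEGER NOT NULL,
    product_id INTEGER NOT NULL,
    order_date TEXT    NOT NULL
);
""")
conn.executemany(
    "INSERT INTO price VALUES (?, ?, ?)",
    [(1, "2004-01-01", 1000), (1, "2004-06-01", 1200)],
)
conn.executemany(
    "INSERT INTO invoice_line VALUES (?, ?, ?)",
    [(100, 1, "2004-03-15"), (101, 1, "2004-07-01")],
)

# For each line, use the most recent price that was in effect on the order date.
charged = conn.execute("""
    SELECT l.invoice_id, p.amount
      FROM invoice_line l
      JOIN price p
        ON p.product_id = l.product_id
       AND p.effective_date = (SELECT MAX(effective_date)
                                 FROM price
                                WHERE product_id = l.product_id
                                  AND effective_date <= l.order_date)
     ORDER BY l.invoice_id
""").fetchall()
```

No amount is copied onto the invoice line, so a later price change leaves old invoices priced as they were on their order date.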
The relational *model* says nothing about performance, which is an implementation detail.
Models have inherent performance limits because each model is only as fast as its best implementation. One thing studied in computational complexity theory is the complexity of some optimal implementation of a given model. If model A's optimal implementation is slower than model B's at a given sequence of operations, then I'd overload the terminology a bit to state that model A itself is slower. How would you disagree?
I know the newest version is slightly less retarded, but I bet most of the client code still uses the DBase III type methods.
John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
I know how you feel. In high school, my friend was like, "Hey, check out this new thing called 'Napster.'" I took a look, but decided that it could never compete with palavista and the other FTP search & trade sites. Napster was kid's stuff. All the real MP3 heads will stick with their FTP sites, thank-you-very-much.
Of course for real vision, no one beats CmdrTaco's "No wireless. Less space than a Nomad. Lame."
The healthcare org I work for (3 hospitals, 3500 users) runs the Cache database - it's supposed to be insanely stable - how does it differ from the traditional oracle/db2? Anybody care to explain in terms your typical slashdotter might understand?
That seems to be the problem. From my reading, databases seem to be one of the most theorized technologies, with a comparatively small set of concrete solutions. Sure, a lot of databases and tools exist, but the industry is having a really hard time moving away from the relational model, and maybe for good reasons. I think that instead of a better database technology coming along, the database will just gradually evolve, and a long time from now somebody will notice, write an article, and lament the way it used to be.
Although relatively new, OLAP (online analytical processing) systems like Essbase and OFA pack a lot of wallop when doing analytical work. I'd like to hear more about OLAP uses for analyzing web click data...
Pokes head out of window and takes in a breath of "real world" air... cough, choke, splutter...
...Pulls head back inside, thinking there must be a better building material than dead elephants.
Hello down there. We don't claim that normalisation is suitable for anything except optimising the amount of space required to hold your data set. For space optimisation nothing has yet been devised to improve upon it.
And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.
What do you find cumbersome today? :)
my blog
Whatever your data is, you can represent it under the relational model. You need cubes to model multidimensional data? You can use a star schema, or a number of other representations embedded within the relational model. Need to keep track of the history of the changes of the values of a property of one entity? Put the value assignments in a historical join table that points back at your "main" table.
When will people learn that SQL was created in a time of RAM scarcity? Oracle was designed back when machines had 4 or 8 MB of RAM __tops__. New developers should experiment with vector-based systems instead of row-based systems. Before anyone boohaahaas this as a toy, I recommend you look at the www.kx.com customer list: all the top I-banks and bond houses in the __world__ use it. Lehman Brothers has a 50 TB bond database that uses this technology.
read up on http://www.kx.com/ they have an almost fully compliant SQL engine written in 200K of code. The interpreter fits in a couple lines of cache.
http://www.kx.com/a/kdb/document/contention.txt
This is the original J interpreter (written by Arthur Whitney). It looks like line noise, but use the 'indent' command and you will see its beauty:
typedef char C;typedef long I;
typedef struct a{I t,r,d[3],p[2];}*A;
#define P printf
#define R return
#define V1(f) A f(w)A w;
#define V2(f) A f(a,w)A a,w;
#define DO(n,x) {I i=0,_n=(n);for(;it=t,z->r=r,mv(z->d,d,r); R z;}
V1(iota){I n=*w->p;A z=ga(0,1,&n);DO(n,z->p[i]=i);R z;}
V2(plus){I r=w->r,*d=w->d,n=tr(r,d);A z=ga(0,r,d); DO(n,z->p[i]=a->p[i]+w->p[i]);R z;}
V2(from){I r=w->r-1,*d=w->d+1,n=tr(r,d); A z=ga(w->t,r,d);mv(z->p,w->p+(n**a->p),n);R z;}
V1(box){A z=ga(1,0,0);*z->p=(I)w;R z;}
V2(cat){I an=tr(a->r,a->d),wn=tr(w->r,w->d),n=an+wn; A z=ga(w->t,1,&n);mv(z->p,a->p,an);mv(z->p+an,w->p,wn);R z;}
V2(find)true
V2(rsh){I r=a->r?*a->d:1,n=tr(r,a->p),wn=tr(w->r,w->d); A z=ga(w->t,r,a->p);mv(z->p,w->p,wn=n>wn?wn:n); if(n-=wn)mv(z->p+wn,z->p,n);R z;}
V1(sha){A z=ga(0,1,&w->r);mv(z->p,w->d,w->r);R z;}
V1(id){R w;}V1(size){A z=ga(0,0,0);*z->p=w->r?*w->d:1;R z;}
pi(i){P("%d ",i);}nl(){P("\n");}
pr(w)A w;{I r=w->r,*d=w->d,n=tr(r,d);DO(r,pi(d[i]));nl(); if(w->t)DO(n,P("p[i]))else DO(n,pi(w->p[i]));nl();}
C vt[]="+{~='a'&&a'9')R 0;z=ga(0,0,0);*z->p=c-'0';R z;}
verb(c){I i=0;for(;vt[i];)if(vt[i++]==c)R i;R 0;}
I *wd(s)C *s;{I a,n=strlen(s),*e=ma(n+1);C c; DO(n,e[i]=(a=noun(c=s[i]))?a:(a=verb(c))?a:c);e[n]=0;R e;}
main(){C s[99];while(gets(s))pr(ex(wd(s)));}
Here is another excellent page:
http://www.kuro5hin.org/story/2002/8/30/175531/76
Unique identifier or index required?
John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
try to write a query in SQL that depends on two query results being equal (for instance, "show me a list of all customers who have each bought at least one of every product")
SELECT Customer.* FROM Customer, (
SELECT CustomerId, COUNT(DISTINCT ProductId) AS NumProducts FROM Purchase GROUP BY CustomerId
) AS DistinctPurchases, (
SELECT COUNT(DISTINCT ProductId) AS NumProducts FROM Product
) AS DistinctProducts
WHERE DistinctPurchases.NumProducts = DistinctProducts.NumProducts
AND DistinctPurchases.CustomerId = Customer.CustomerId;
Or in English:
Count how many different products have been purchased by each customer
Count how many products exist
Show me all the customers where the two numbers above are equal
Not that hard really.
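A quick way to convince yourself is to run that query against a toy data set. The sketch below uses SQLite through Python, with the same table and column names as the query and invented rows. One assumption is worth stating loudly: the counts only prove "bought everything" if every Purchase.ProductId refers to a row that still exists in Product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (CustomerId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Product  (ProductId  INTEGER PRIMARY KEY);
CREATE TABLE Purchase (CustomerId INTEGER, ProductId INTEGER);
""")
conn.executemany("INSERT INTO Customer VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
conn.executemany("INSERT INTO Product VALUES (?)", [(1,), (2,), (3,)])
conn.executemany(
    "INSERT INTO Purchase VALUES (?, ?)",
    # Ann bought all three products (one of them twice); Bob bought only two.
    [(1, 1), (1, 2), (1, 3), (1, 3), (2, 1), (2, 2)],
)

# Assumes every purchased ProductId still exists in Product.
bought_all = conn.execute("""
    SELECT Customer.* FROM Customer, (
        SELECT CustomerId, COUNT(DISTINCT ProductId) AS NumProducts
          FROM Purchase GROUP BY CustomerId
    ) AS DistinctPurchases, (
        SELECT COUNT(DISTINCT ProductId) AS NumProducts FROM Product
    ) AS DistinctProducts
    WHERE DistinctPurchases.NumProducts = DistinctProducts.NumProducts
      AND DistinctPurchases.CustomerId = Customer.CustomerId
    ORDER BY Customer.CustomerId
""").fetchall()
```

The repeated purchase does not inflate Ann's count because of COUNT(DISTINCT ...), and Bob drops out with two of three.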
Dynamic Relational may be a solution to some of the problems raised. Columns and tables could be "created" on the fly. Another improvement is to replace SQL with a better relational query language. Alternatives include Tutorial-D and my pet, SMEQL (originally called TQL but found a name overlap).
Table-ized A.I.
I have often thought about how cool it would be to have a hierarchical database in which you could sort huge amounts of data very easily and then have something like symlinks (symbolic links) that point from one place to another.
That could sort a huge amount of data in a very simple and hierarchical way kind of like XML or a file system.
Imagine storing the information of a brain in such a database.
SQL isn't an implementation of relational algebra. It's an implementation of relational calculus plus a bunch of extra features (sorting, analytics, and whatnot). The idea is that the user should be able to specify what they want (relational calculus) instead of how to get it (relational algebra).
The author was not disputing the concept of declarative techniques (such as constraint-based programming where you ask for what you want instead of how), but rather complaining about SQL as a language for doing such.
I have to agree with the SQL complaints. I even designed a new query language intended to replace SQL (but have no test implementation yet). I will pit it against SQL as far as simplicity and elegance any day (although measuring "elegance" can be subjective).
Table-ized A.I.
Data management does not equal knowledge management.
The biology example focuses on the changing interpretations of data, which, assuming that the data was well modelled, is really a problem for the KM application layer, rather than the data management layer.
Remember -- data itself doesn't change, only interpretations do.
Yeah, for those terabytes of data taken up by your mom's recipes and your cd collection, the extreme power of PHP and MySql is all you need, man.
Well, us nerds need *some* way to justify the cost of a 4-way CPU box to the folks. How else are we gonna run our 3D porn server?
Table-ized A.I.
You discount human factors. If there are no resources and big jobs to do, you're stuck.
I've enjoyed a few 'That's impossible' moments myself, once at the expense of an overeducated 'expert' who didn't know how to read an execution plan. Same deal, 3 minutes turned into 15 seconds. I was the twentieth or so person through the code looking for speed. There sure are a lot of clueless people out there selling themselves as SQL experts.
John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
The full features of RDBMSs are totally overkill for about 90 percent of business applications. Standardized Business Object level data management is all that is needed.
I am surprised that more developers are not thinking this way while most F500 companies are depending on such systems for their core product and lifecycle management.
Relational Object Bridging is nice but still too programmatic of a solution. A Business layer is the way to go: a black box with a language-independent API.
JsD
Most databases do allow you to turn off logging (durability) and allow you to ratchet back the isolation level. I can't think of any sane technical reason to eliminate atomicity and consistency.
-Stu
In recent SQL versions you can at least say
SELECT FIRST_NAME, LAST_NAME, ORDER_ID
FROM CUSTOMER
JOIN ORDER USING (CUSTOMER_ID)
I agree about the 'SELECT * FROM' crap though. It wouldn't be a big extension to SQL to let you just say 'CUSTOMER'.
All customers who have bought at least one of every product? Not too hard, surely:
SELECT *
FROM CUSTOMER C
WHERE NOT EXISTS (
SELECT *
FROM PRODUCT PROD
WHERE NOT EXISTS (
SELECT *
FROM PURCHASE PUR
WHERE PUR.PRODUCT_ID = PROD.PRODUCT_ID
AND PUR.CUSTOMER_ID = C.CUSTOMER_ID
)
)
OK, a bit cumbersome - 'where does not exist any product he didn't buy' - but you get used to it after a while. A true set difference operator would be useful, or perhaps you could rewrite part of the above using outer joins and testing for a null value in the result of the join.
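To see the double negation at work, here is a small sketch that runs that query in SQLite from Python. The table and column names follow the query above; the rows and the NAME column are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CUSTOMER (CUSTOMER_ID INTEGER PRIMARY KEY, NAME TEXT);
CREATE TABLE PRODUCT  (PRODUCT_ID  INTEGER PRIMARY KEY);
CREATE TABLE PURCHASE (CUSTOMER_ID INTEGER, PRODUCT_ID INTEGER);
""")
conn.executemany(
    "INSERT INTO CUSTOMER VALUES (?, ?)", [(1, "Ann"), (2, "Bob"), (3, "Cy")]
)
conn.executemany("INSERT INTO PRODUCT VALUES (?)", [(1,), (2,)])
conn.executemany(
    "INSERT INTO PURCHASE VALUES (?, ?)",
    [(1, 1), (1, 2), (2, 1)],  # Ann bought both, Bob one, Cy nothing
)

QUERY = """
    SELECT C.CUSTOMER_ID
      FROM CUSTOMER C
     WHERE NOT EXISTS (
           SELECT * FROM PRODUCT PROD
            WHERE NOT EXISTS (
                  SELECT * FROM PURCHASE PUR
                   WHERE PUR.PRODUCT_ID = PROD.PRODUCT_ID
                     AND PUR.CUSTOMER_ID = C.CUSTOMER_ID))
     ORDER BY C.CUSTOMER_ID
"""

def customers_with_everything():
    """Customers for whom no product exists that they did not buy."""
    return [r[0] for r in conn.execute(QUERY)]

with_products = customers_with_everything()

# Edge case: with no products at all, "every product" is vacuously satisfied.
conn.execute("DELETE FROM PRODUCT")
without_products = customers_with_everything()
```

The edge case matters when picking a formulation: with an empty PRODUCT table this form returns every customer, including one who never bought anything.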
-- Ed Avis ed@membled.com
Data management does not equal knowledge management.
... ok, that's not my department, but some others here do that.
The biology example focuses on the changing interpretations of data, which, assuming that the data was well modelled, is really a problem for the KM application layer, rather than the data management layer.
Remember -- data itself doesn't change, only interpretations do.
It's not so much the changing interpretations of the data as a combinatorial explosion of altering procedures (even for one set as it progresses thru the labs), altering models (we learn things in biochem, realizing that what we thought was just an ATP process is more complex, as other scientists (or ourselves) discover how things work), and the variation in actual recording between different scientists and assistants.
Most DBMS users exist in a world of facts with static states; here we measure phase transitions in milliseconds and conformational changes of protein domains in varying temperatures and conditions. We need the data, not theories about how we should measure it the way you measure the slow non-biochem world.
But thanks for thinking about it. We'll keep coding in Perl and using Terabytes of storage while designing the next generation of computers that run on these processes
-- Tigger warning: This post may contain tiggers! --
This article speaks of MySQL serving ~1TB in real life, and links to a benchmark which has MySQL scaling as well as Oracle (just one benchmark, but I think it makes the point).
MySQL has a number of features, restrictions and peculiarities which I find irritating, but in terms of raw performance, especially on reads, it doesn't seem to stop when your database gets seriously large. I find PostgreSQL much more pleasant to use, and this article speaks of Fujitsu helping to add Table Spaces to make management of data "into the hundreds of gigabyte" easier, with the implication being that people already have PostgreSQL databases that large, and the feature is basically a bonus. This article also mentions a PostgreSQL database of over a terabyte.
I think you'll find that the limitation is not the software, the limitation is that precious few MySQL DBAs are familiar with databases larger than you can squeeze into a desktop machine (the machine in front of me will take 4x250GB IDE disks for a total of 1TB of storage, for example, and if you had matching SATA drives as many controllers do, put in a new PSU and double that).
The developer.com article mentions that Oracle was harder to tune for larger databases than MySQL, so perhaps this is changing, perhaps we will see more people asking if it's worth spending the extra money for a database that's harder to operate, and no faster. Perhaps it would be cost-effective to spend the money on more servers instead (you can get a pair of jaw-droppingly impressive servers for the price of a single high-end Oracle licence), and rely on redundancy rather than expertise. PostgreSQL supports replication, and there are bolt-ons to do the same for MySQL, kinda-sorta, so it's not an unreasonable proposition and can only get more attractive as these features are improved.
Got time? Spend some of it coding or testing
Now back it up with Real Life(tm) references.
I don't like MySQL (I prefer PostgreSQL or ibFireBird) but at 100 tickets a second it does seem to cut the ice for large applications.
Arjen also routinely mentions "terabyte" databases, although he tends to speak more in terms of "billions of records". If in doubt, email him. You'll get an authoritative answer.
Got time? Spend some of it coding or testing
Some "new" databases exactly meet your requirements. db4o, the open source object database, native to Java and .NET, is:
- very lean and not complex at all (one line of code stores any object)
- able to scale extremely well
- cross-platform (Java and .NET)
- very performant (up to 44x faster than Hibernate+MySQL, for instance)
Chris
http://www.db4o.com/
The article is basically saying data has to get smart.
That's precisely what object-orientation is all about: bringing data and behavior together.
OODBMSs such as http://www.db4o.com/ will be key players in this future of his.
See you, Klaus.
Db4o - The Open Source Object Database
Prevayler - Persistence is Futile
>we measure phase transitions in milliseconds and conformational changes of protein domains in varying temperatures and conditions.
Look up star schema data models, that might give you some ideas.
Think of events with values that occur over time. So it's either sales of blue jeans of size 14 in New York state during the period of Dec 1 to 15th, or the changes of proteins at a certain temp/conditions during the 121 second and 300 second marks.
The surprise isn't how often we make bad choices; the surprise is how seldom they defeat us.
Think of events with values that occur over time. So it's either sales of blue jeans of size 14 in New York state during the period of Dec 1 to 15th, or the changes of proteins at a certain temp/conditions during the 121 second and 300 second marks
I think you mean 121 millisecond to 300 millisecond.
Conformational structural changes in proteins are measured in very short time segments, nowadays using lasers to heat them up fast.
We also freeze them at -80 and -40 C temps, or use 4 C, 20 C, or 40 C temps (hint, 20 C is room temp).
-- Tigger warning: This post may contain tiggers! --