What Capacities Do Databases Have?

← Back to Stories (view on slashdot.org)

What Capacities Do Databases Have?

Posted by Cliff on Friday February 2, 2001 @05:20PM from the my-table-is-bigger-than-yours dept.

razor69 asks: "Having programmed PHP in conjunction with MySQL for some time now, I'm quite happy with the whole setup. However, I'm now working up to some more serious clients, which require larger and larger databases. Looking for database comparisons, I'm stuck with the stats MySQL give me and none of the database manufacturers tell me what the maximum amount of records their system can handle is. Now I know that Oracle is unlimited, but so is their price. Which database program is suitable for what price, how many records can they hold (practically as well as theoretically) and how do they all REALLY compare?"

11 of 30 comments (clear)

Min score:

Reason:

Sort:

Re:Not that easy by Diamon · 2001-02-02 18:08 · Score: 2

Gee DB2 doesn't count as one of the "Big" Databases? C'mon it even runs on Linux.
Oracle price unlimited? by dmorin · 2001-02-02 23:28 · Score: 2

Perhaps somebody on this thread can answer my question, since it seems relevant. Most of the time people speak of Oracle they use expressions like "unlimited price" in the same sentence. But the few people I've talked to that are users (and therefore evangelists) of Oracle have all said $25k. That's not really *that* bad, especially for a company that knows its database needs when they make the budget. My question is, is that really the price? What exactly does one get for that price? If not, what's the more accurate price for Oracle in the wild?

--
www.HearMySoulSpeak.com
1. Re:Oracle price unlimited? by CritterNYC · 2001-02-04 21:33 · Score: 2
  
  Microsoft actually has an accurate price comparison of Oracle, DB2 and SQL server. Granted, this is before you negoatiate "corporate discounts", but in an 8-way 1GHz setup, Oracle costs $800,000 compared to DB2 at $180,800 and SQL at $159,992.
  
  --
  Portable versions of Firefox, GIMP, LibreOffice, etc
2. Re:Oracle price unlimited? by raju1kabir · 2001-02-04 05:10 · Score: 2
  
  what's the more accurate price for Oracle in the wild?
  
  You can go to their web site and price it for yourself. As others have mentioned, their pricing-scheme-of-the-month involves adding up the speed of the processors that will be running it and multiplying by some magic numbers.
  
  The last Oracle installation I shepherded came to about $2000/year. That was for 8i on a single-processor, 600Mhz machine. In fact, that's enough power to handle a fair amount of traffic as back-end to a web site that doesn't do anything hugely complicated. In a case like that, throw money at hardware: fast multi-channel storage in particular.
  
  The trick comes when you want to scale too much beyond that. To use multiple servers you need features of Oracle Enterprise Edition, which is about 5 times as much (per server).
  
  But the fact is, it may well be worth it. The software runs well, is pretty fast these days, can do everything under the sun, and it's the easiest serious database to find qualified staff for.
  
  On the other hand, it's not the sort of thing you install one morning and then start merrily using that afternoon, like MySQL. Factor in your costs that you'll either need to hire experienced people (recommended) or spend many weeks with piles of expensive books making false starts before you start getting productive return.
  
  --
  "Patriotism is your conviction that this country is superior to all other countries because you were born in it." -- GBS
MySQL by lal · 2001-02-03 01:07 · Score: 2

We are read-only ~10 million row database in MySQL on a dual PIII 800 w/1GB RAM. Indexed lookups are fast (sub second). MySQL was created as a data warehousing database, and when you use it for its original design purpose, it is fast.

We found that the limiting factor is the number of indexes on each table. All indexes for each table are stored in one file. As you add indexes, reindex performance degrades; we're guessing that the cause is b-tree rebalancing. Anyway, we addressed that issue by creating and indexing sub tables. It's a query-only application, so we can afford to do things like that.
Interbase/Firebird 6.0 limits by baudtender · 2001-02-02 13:20 · Score: 2

Maximum size of database: 32TB using multiple files; largest recorded InterBase database in production is over 200GB

Maximum size of one file: 4GB on most platforms; 2GB on some platforms

Maximum number of tables: 64K Tables

Maximum size of one table: 32TB

Maximum number of rows per table: 4GB Rows

Maximum row size: 64K

Maximum number of columns per table: Depends on the datatypes you use. (Example: 16,384 INTEGER (4 byte) values per row.)

Maximum number of indexes per table: 64KB indexes

Maximum number of indexes per database: 4G indexes

And don't forget, Interbase/Firebird is absolutely free and open source!
Not that easy by SuiteSisterMary · 2001-02-02 13:19 · Score: 2

1: It's not the record count, it's the data size. 2: Quite a bit depends on the underlying OS, and what it can and cannot do. 3: Quite a bit relies on what you're doing; a database optimized for transactions will have different high water marks than a database optimized for querying. 4: That having been said, the Big Four (Oracle, Sybase, Informix, MS SQL Server) can pretty much handle what you throw at them, so long as you can shell for the hardware.

--
Vintage computer games and RPG books available. Email me if you're interested.
1. Re:Not that easy by SuiteSisterMary · 2001-02-03 03:22 · Score: 2
  
  In my personal experience, no. When I was doing developer support for a web app development platform, I never once saw DB2 in use. Somebody else could probably tell you more about it than I. :-)
  
  --
  Vintage computer games and RPG books available. Email me if you're interested.
PostgeSQL info by Leknor · 2001-02-03 18:02 · Score: 2
since no one has mentioned it I thought I'd dig up a few links I remember reading.
- From the PostgreSQL FAQ: "What is the maximum size for a row, table, database?"
- semi-old article on why not to use mySQL
- a comparison of mSQL, mySQL, and PostgeSQL: part 1, part 2, part 3
- and if your lazy: google search for "linux sql database"
From the PostgreSQL FAQ, linked above:
- Maximum size for a database? unlimited (60GB databases exist)
- Maximum size for a table? 16 TB
- Maximum size for a row? unlimited in 7.1 and later
- Maximum size for a field? 1GB in 7.1 and later
- Maximum number of rows in a table? unlimited
- Maximum number of columns in a table? 250-1600 depending on column types
- Maximum number of indexes on a table? unlimited
If you couldn't tell I like postgres but as a business you should get what you think is best. From what I've heard Oracle on Solaris is where it's at if you can afford it. (I can't)

Leknor
Database talk, but not all about speed. by Anthony+Boyd · 2001-02-02 15:38 · Score: 2

I have practical experience with Oracle on Solaris (while I was at Borland and SST), with Access on Windows NT (while I was at Actuate), with CSV flat-file databases on Solaris, and with MySQL on Linux (at SST). In addition, the employee I found thanks to the /. thread on hiring, has experience with MySQL on Linux as well.

From what I can tell, MySQL is fast, fast, fast for certain limited uses, and can handle databases into the hundreds of thousands of records, although I have not pushed it into millions yet. At somewhere around 250,000 database calls every day, the thing starts to have problems, like the server needs a reboot. I don't know if the same problem happens if those calls are spread out over a long period of time. Throwing more iron at it, or reworking the indexes may help. We haven't upgraded to 3.23 stable yet, though, so that may solve the problem. In addition, we've begun using the "p" functions for databases -- you know, the "persistent" database connections. Use "mysql_pconnect" instead of "mysql_connect" for (sometimes) better response.

One of the nice backups for MySQL (and few people seem to use this, but I like it a lot) is that it can write out every SQL update/insert/delete statement to a file. In the case of database corruption, you can take that file, go though the last few statments, and try to correct it. Then regen the ENTIRE database from that SQL file. In addition, 3.23 has something new -- replication or mirroring, something where it duplicates the database onto backup disks. I haven't used that yet. Perhaps someone reading this could share more.

As for Oracle, yes, it can handle just about anything. I don't like the speed on older versions of the database. 8i is good, but actually not as fast as MySQL for some selects. I haven't tried 9. But it is a workhorse, it is storing hundreds of tables and millions of records for us. My current employer is trying to move away from it, because of the cost issue you cited. One of the things I like about Oracle is triggers, which MySQL doesn't have yet.

I know that the worst database I used was back at Borland in 1996 -- I don't remember what the database was called, it was part of a Lotus Domino system. In any case, with 100,000 member records in the database, it took a good 20 minutes for it to add a new record. It was miserable. We had to put big bold text on the "become a member" page that warned of the wait, and I had to write a JavaScript (onclick) for the submit button, that returned false on subsequent clicks, so that people didn't abort the procedure on accident. I hated that. I felt like we were this technology company that didn't understand Web technology.

As an interesting aside, back then Borland had a product called IntraBuilder. It was one of the first Cold Fusion-like systems, but it's database system was just poor. That's partly why they named it as they did -- they needed to brand it as an intranet tool, because they knew it wasn't fast enough for a full Internet deployment. But once the product was out there, the pressure was on to use it on Borland's site. We tried, Chris Malatesta and I. But I recall showing the IntraBuilder team some pages it served up, and how there was a full-second delay per user. Their response? "That seems acceptable to us." Chris and I tried to talk to them about scalability, about how the seconds accumulate and build up. They had no idea how hammered the product would be on a "real" Internet Web site. The product died, thankfully, when we ran a test with dozens of users, and database corruption was rampant -- one person's password appearing on another person's browser, and so on. Ugh. I don't miss those early days.

--
My Greasemonkey scripts for Digg &
Massive databases... by SClitheroe · 2001-02-02 14:27 · Score: 3

I think any large DB project should be treated as an infrastructure project...

When you're dealing with clients who have (relatively) massive data requirements (ie., data requirements that outstrip the smaller DB services such as MySQL that can be hosted on cheap, commodity hardware), you've got to step back and take a second look at the requirements of their business.

Before you start looking at particular DB packages, and their relative merits, you should really start by identifying the business requirements for the database. For example, what are the uptime and recoverability requirements? If your client requires 24x365 uptime, and also wants a hot standby site, you've got to start spec'ing out the system at the hardware and network level.

Identify the infrastructure that will sustain the requirements of your client, and then begin the process of selecting a DB package that works well within the constraints of that environment. When dealing with larger clients, always keep in mind that you are fundamentally delivering a DB infrastructure, not simply a DB.

A couple things you might want to consider:

Availability - does the system have to be redundant? Does it have to support automatic failover between redundant server?

Scalability - does the DB need to grow beyond the constraints of a single server? Can you forsee this data existing on a large array (ala EMC), and do all nodes of the DB complex need simultaneous access at some level?

Middleware - If it's a large DB, possibly being accessed from multiple physical locations, does it look like a queued middleware (ala IBM MQ) is needed, or perhaps appropriate?

Disaster recovery - If the client is a 24x7 shop, or even if they aren't, are you going to have a sufficient window to back this beast up? Are they going to ask you to architect an off-site recovery strategy? If so, does it have to be a "hot" standby site? If so, does your hardware/network infrastructure support dynamic mirroring over a WAN?

Once you've hammered out these kind of questions (and there are lots more..), then you are in a position to start evaluating DB packages that will work well withing the infrastructure you've designed.

As with any project, getting the requirements defined out front is going to save you a lot of time and agony, and is going to save your client a lot of money and aggravation.