When is Database Muscle Too Much?

← Back to Stories (view on slashdot.org)

When is Database Muscle Too Much?

Posted by Cliff on Tuesday October 29, 2002 @09:35AM from the databases-vs-flat-files dept.

DBOrNotDB asks: "At some of the places I've worked in the past, there have been DBAs who generally insisted that given accurate specifications and enough hardware and software, you could stuff nearly anything you wanted to into a database, manipulate it, and pull it back out again in a reasonable time. The feeling at my current workplace seems to be that very few projects lend themselves to database usage and that a customized one-off data storage solution should be developed for each project. This seems like a violation of many major software engineering principals (e.g. reuse) to me. My question is, what kind of success or horror stories does the community have about trying put different projects into databases? Numbers (# of rows, tables, total data storage, cost, etc) would be nice, but even just anecdotes would be helpful."

2 of 93 comments (clear)

Min score:

Reason:

Sort:

We use the DB for nearly everything by ComputerSlicer23 · 2002-10-29 10:18 · Score: 5, Interesting

I write a screen scrapping application, that downloads lots (100K + web pages a week). We write absolutely everything page into the database. 5GB is enough to hold about two days worth of pages. So we can't keep it in our production database. Especially because 95% of the data doesn't need to be stored except for auditting, but we keep it around in case a mistake is found, or a new tread in the pages happens.
The reason we use the database instead of the filesystem is deathly simple. The database is god-like. I can do point it time recovery, and guarunttee that the database is completely consistant with the recovery point. If I had all that in a filesystem it'd be harder. It means I have hot-rollover capaibility from server to server, without having to duplicate my filesystem from machine to machine, I just let the hot archive logs deal with that. It means I have one backup system, and one failure recovery plan. It means all I have to monitor is the Oracle tablespace to ensure I don't run out of space. It means when I say "commit", I can hold Oracle accountable for ensuring the data is there, rather then having myself held accountable by the management.
If I was a good little boy and swallowed all the kool-aid, I'd use iFS (Oracle's Internet Filesystem) and it'd be all good. However, I don't I just use a huge array of blob's in my Oracle database.
Now that said, I have a remote filesystem that all of this data gets spooled to. Once spooled there it gets written to CD. Once the CD's are written, they are then used to find, compare, and if they match delete the blobs out of the Database. The CD then deletes the files out of the spool. Duplicate the CD, compare the two, send one offsite.
The other reason we use the database, that it's easier to deal with in our application, because writting a join against the filesystem is tricky...
Kirby
1. Re:We use the DB for nearly everything by ComputerSlicer23 · 2002-10-29 19:29 · Score: 5, Interesting
  
  What I thought was interesting, is that you say that you can hold Oracle accountable for missing data. What's to keep Oracle from claiming that you're a dumb ass and messed it up all by yourself? Doesn't the click through on the license basically keep them from being held liable? What do you think management is going to say when Oracle tells them the employee is the one who screwed up?
  I've submitted a request into the Oracle iTAR (Technical Assistance Request) system, at 3:00 AM, gotten a call back by 3:10AM, and a resolution by 3:45AM on a relatively esoteric bug. When I say, I get to hold Oracle accountable, I mean, I can hold Oracle accountable to get me up and running pronto with as much data as I'm going to get. It costs an arm and a leg, but the only people I've ever heard of who have lost their data using Oracle just didn't do backups properly. Oracle is pretty serious about keeping your data around. If you stay with stable tested versions, you'll be fine with Oracle.
  I've had Oracle help me on de-supportted platforms, using non-standard configurations, doing crazyness of my own making. They always help me when I contact them for support. For production machines, I stay on the tried and true, but Oracle has never let me down in weird situations.
  Oracle as a general rule, has *never* told me buzz off it's your fault. They stayed with me, and found it was my fault, and then showed me the doc's where it says I'm doing something that won't work. For all their faults, Oracle has *NEVER* failed me in any way when it comes to support. When it's my fault, I stand there and take it on the chin. However, as a backing store, Oracle will whoop anything I write eight ways to Sunday for speed, reliability, portability, quality, documentation and support. Hand's down. If you can afford it, there is no excuse for writting your own custom storage manager. If you can't afford it, try PostGreSQL. Oracle or PostGreSQL *WILL* be better for 99.9% of the cases out there. Google is one of the few examples of a situation where writting your own is probably a good idea.
  For the record, out of the ~200 million records I've processed, I've lost 1, count'em 1 record using Oracle that was Oracle's fault (I've lost any number of them when Linux crashed, but that's my fault, not Oracle's). Even then Oracle clearly identified which one it was, and when it happened, so it was easy to recover.
  Kirby