30+ GB Databases On Unix?

No, only Microsoft SQL Server can do it. Period. by Anonymous Coward · 2000-07-25 19:22 · Score: 1

Win 2K doesn't have a 2GB file limit, so it is infinitely more scalable than Linux. Also, you get nice GUI tools to help you if you don't know a thing about databases.

Re:Three words: by Anonymous Coward · 2000-07-25 19:47 · Score: 1

The new Oracle pricing model isn't based on users anymore. You basically pay $15 per MHz of CPU speed you have, at least for the standard edition. The enterprise edition rings in for quite a bit more. For instance, if you have Oracle running on a dual PIII at 700MHz, that's 1400 total CPU points.

1400($15) = $21,000

Support cost is an additional 22% of the total list price.

$21,000(1.22) = $25,620 total price.
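If you want to plug in your own box, the arithmetic above can be sketched in a few lines of Python. The $15 per MHz rate and the 22% support figure are just the numbers quoted in this post, not an official price list, and the function name is made up for illustration.

```python
# Figures as quoted in the post; treat them as assumptions, not a price list.
PRICE_PER_MHZ = 15       # dollars per MHz, standard edition
SUPPORT_PERCENT = 22     # support surcharge on the list price

def license_cost(cpu_count, mhz_per_cpu):
    """Return (cpu_points, list_price, total_with_support) in dollars."""
    cpu_points = cpu_count * mhz_per_cpu
    list_price = cpu_points * PRICE_PER_MHZ
    total = list_price * (100 + SUPPORT_PERCENT) / 100
    return cpu_points, list_price, total

# The dual PIII 700MHz example from the post.
points, list_price, total = license_cost(2, 700)
print(points, list_price, total)
```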

That's per machine. It is ironic, that once you license a machine under this model, you can run as many instances on it as you like (barring performance). Seeing that this is a ~30G database, I wouldn't see that happening.

The low-end single-user licenses you're talking about are time-based licenses on a single 500 MHz processor (if I remember correctly). So, with one of those, you could run as many instances as you can squeeze onto a single PIII 500 MHz for two years. I do reserve the right to be wrong on this though. The large sum pricing above is for permanent licenses.

Cheers

Sybase on Linux by Anonymous Coward · 2000-07-25 20:17 · Score: 1

We've been running ASE on linux for almost a year now. We had some initial issues with our RAID array and performance issues that more memory solved. Since then, it has been rock solid.

See the ase-linux-list mailing list archive for more info on large DBs and raw I/O.

However, Replication Server is not supported yet. I think this is going to be a showstopper for you, eh?

Again see the list for more info.

http://www.sybase.com/linux/

michael peppler's home page

Re:Raid 5 for a database? You must be kidding. by Anonymous Coward · 2000-07-25 21:01 · Score: 1

Uhm, actually, seek times are dramatically improved in most (if not all) RAID levels due to the fact that redundant data + multiple drive heads = faster concurrent access. Write times on the other hand suffer terribly under RAID 5 (since you have to write multiple copies of the data).

I've never seen a production DB not run on at least a RAID 5 array (most run on something more serious, like netapp drive arrays, which are basically a RAID 5 type of system).

Who moderated this misinformation up?

it works fine by Anonymous Coward · 2000-07-25 21:10 · Score: 1

Linux and 30GB+ databases work fine. In my company we are using both Oracle on Linux and Oracle on Sun. We have databases running on Linux and Oracle 8i that are more than 200GB, on a dual PII, and it really works flawlessly. We are not using raw partitions but a regular filesystem mounted from a RAID 5 volume on a Mylex adapter. We are extremely happy with that, and we are using it on an extremely important production system for the company. We are not using Sybase for Linux, but I have some friends who are using it and they are very happy with it, so my guess is that it should work fine for you. After that, for the kind of hardware you need, well, it all depends on what kind of traffic you have, but if I were you, I'd definitely go with at least a dual PIII-something, a RAID 5 card like a Mylex, and a couple of 9 or 18GB SCSI hard drives.

64-bit Hardware by Mike+Hicks · 2000-07-25 20:57 · Score: 2

If at all possible, I'd recommend getting some 64-bit hardware. Probably an Alpha-based system. Next, get a decent filesystem like ReiserFS or Global Filesystem.

If you are running on x86 hardware, there's no telling whether the database will be able to read large files (>2GB).
--
Ski-U-Mah!

Re:30Gb databases by Phroggy · 2000-07-25 20:43 · Score: 1

Linux/IA32 probably not, at least under e2fs as you'll likely hit the 2Gb filesize limit, depending on how the database engine involved implements storage (Oracle using its own data partition in "raw iron" style?). Linux on other architectures, specifically the 64bit ones (Alpha, Sparc, Sledgehammer and IA64 before long) would probably be fine.

I hate x86 as much as the next guy, but wouldn't file size limitations be an issue with the operating system or filesystem, rather than the CPU architecture?

--
$x='S24;r)>63/* h@<5+oZ)32"5cz';$me='phroggy'x$];
$x=~y+ -xz+\0-Tx+;print$_^chop$me for split'',$x;

Re:Clarity of Expression by Phroggy · 2000-07-25 21:01 · Score: 1

BTW: Before I get flamed, the Hotmail/FreeBSD thing I remember from somewhere, but I can't remember where. I do know it's NOT on an NT box, which basically leaves UNIX.

My understanding is that, shortly after Microsoft bought Hotmail, they sent in their engineers and tried to convert it to NT. After a while they gave up and left. They tried again several months later, with similar results. NT won't do it.

Here is the NetCraft query.

--
$x='S24;r)>63/* h@<5+oZ)32"5cz';$me='phroggy'x$];
$x=~y+ -xz+\0-Tx+;print$_^chop$me for split'',$x;

Re:30Gb databases by Phroggy · 2000-07-25 21:59 · Score: 1

Linux uses the native word size of the machine for file offsets, so on 32bit architectures, file sizes are limited to 2GB, while on 64bit archs, Linux can handle files up to 8EB (1 exabyte ~= 1 million terabyte).

Shortly after posting, something like this occurred to me. Not something I know much about though; thanks.

I want an exabyte of something.

--
$x='S24;r)>63/* h@<5+oZ)32"5cz';$me='phroggy'x$];
$x=~y+ -xz+\0-Tx+;print$_^chop$me for split'',$x;

Re:You really mean 30 GB Database on Linux by Stefan · 2000-07-25 20:16 · Score: 1

You'd also want RAID 5, preferably hardware which is supported by Linux

What kind of advice is this people are giving???

RAID 5 is real slow for small writes common with a database. You have to first read the whole stripesize (much bigger than the oftentimes single block to write) from all disks, calculate parity and write the small changes back (data + parity). What you want is mirroring, RAID 1, which won't decrease write performance to a crawl but keep your data safe.
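To make the small-write penalty concrete, here is a rough Python sketch of one stripe on a hypothetical 4-disk RAID 5 set. It uses the usual read-modify-write shortcut (read old data and old parity, rather than the whole stripe as described above); the block contents and the function names are invented for illustration, and real controllers differ in detail.

```python
from functools import reduce

def xor_blocks(*blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe: three data blocks plus one parity block (made-up contents).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(*data)

def small_write(data, parity, index, new_block):
    """Update a single data block and return (new_parity, io_count)."""
    old_block = data[index]                                 # read 1: old data
    new_parity = xor_blocks(parity, old_block, new_block)   # read 2: old parity
    data[index] = new_block                                 # write 1: new data
    return new_parity, {"reads": 2, "writes": 2}            # write 2: new parity

parity, io = small_write(data, parity, 1, b"ZZZZ")
# The updated parity still covers the whole stripe.
assert parity == xor_blocks(*data)
print(io)
```

So one logical write turns into two reads plus two writes, where a two-disk mirror would simply issue two writes in parallel.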

Sybase caveats and a new free version by emil · 2000-07-25 20:32 · Score: 1

I've had great luck with 11.0.3.3 on Linux, but I'm not doing anything serious with it yet.

The original 11.0.3.3 could use both files in the file system and raw hard drive partitions to store data. Do not use files in the file system for your large database, because the limit is still 2Gig for a single file. Use fdisk to allocate a large block, then format it with Sybase.

A fresh release of 11.0.3.3 is available at linux.sybase.com that addresses all sorts of bugs and caching issues. I suggest that you start with this product. If you can make it work with a free product, you have lots more options - you can have two backups on the free version, and one on the 11.9 supported version.

Sybase is just plain cleaner by emil · 2000-07-25 20:50 · Score: 1

When you have Sybase configured properly, you can have only a single UNIX process acting as your database server (if you don't run the backupserver when you aren't running backups). If you have SMP, you run one more Sybase server process per processor and they communicate with standard IPC. Installation is tricky for a novice, but the tools work as advertised. Sybase uses the RPM format for their Linux installations.

The Oracle installer (written in Java so you must have the blackdown JRE - it is just sick and wrong) commonly fails when configuring a database instance. Yes, there are workarounds available, but why not write an installer that works properly? When you have Oracle running, it lights up your process table like a Christmas tree - at least 4 server processes, plus some sundry rubbish.

A UNIX admin who admires efficiency will be happier with Sybase.

Re:Sybase is just plain cleaner by myconid · 2000-07-26 03:23 · Score: 1

burton:~ # ps aux | grep mysql
root 232 0.0 0.0 1904 0 ? SW Jul20 0:00 [safe_mysqld]
root 246 0.0 0.1 11244 48 ? SN Jul20 0:00 [mysqld]
root 254 0.0 0.1 11244 48 ? SN Jul20 0:00 [mysqld]
root 255 0.0 0.1 11244 48 ? SN Jul20 0:00 [mysqld]

What's your point again?

--

SB.

Re:Sybase is just plain cleaner by java.bean · 2000-07-25 21:06 · Score: 1

The Oracle installer (written in Java so you must have the blackdown JRE - it is just sick and wrong) commonly fails when configuring a database instance.

The latest version includes a JRE. I've installed it 5+ times and never seen a failure. No argument about earlier versions.

When you have Oracle running, it lights up your process table like a Christmas tree - at least 4 server processes, plus some sundry rubbish. A UNIX admin who admires efficiency will be happier with Sybase.

Yeah, no UNIX admin would ever run something like Apache because it creates too many processes. :-)
--jb

Re:Clarity of Expression by Matthew+Weigel · 2000-07-26 04:56 · Score: 1

Your history is so far out of whack that I can't really address it... suffice to say:

BSD and AT&T UNIX shared stuff from Version 6 through Version 8, IIRC, but SysV was developed afterwards by AT&T in conjunction with Sun
SCO now handles SysV, and the Open Group handles UNIX
BSD and SysV are the two major strands of UNIX, not SysV and "SrV"
Linux is not UNIX and neither are {Net|Open|Free}BSD, but they might as well be

This might help you out a bit...

--
--Matthew

Re:Of course, yes... Wait, of course, it depends. by riffraff · 2000-07-25 19:58 · Score: 1

I worked at Nielsen for a bit, and (IIRC) they had around 10-12 TB databases, using Sybase on Sun hardware (E10000s and others).

Because I had a half-dozen Sybase SA's within a stone's throw of my desk, I used Sybase for my personal database at home (it was only about 160M).

Sybase works really well under Linux. I'm pretty sure you won't need to worry about file size, because I ended up having several files for the database (my database was initially too small, and I just initialized a new disk and added it to the current database to add data and log space).

lance

please clarify by soellman · 2000-07-26 00:36 · Score: 1

so I'm not sure why your jaw is on the floor.. What are the benefits of local storage?

In this scenario, I'd think a gig interconnect or two (on a dedicated storage network) to a NetApp might be fine. Probably faster than local storage (assuming you have a better subsystem on the NetApp), and you contain your storage concerns into a dedicated machine, rather than having to deal with host storage. Adding extra scsi host adapters while the db machine is active is a dance any sysadmin would surely avoid.

And while you're at it, why not have two NetApps clustered?

cheers,
-o

Large databases by nerdin · 2000-07-26 04:56 · Score: 1

Of course that will run on Unix. Most large databases do and only a few clueless DBAs run that kind of DBs on NT.

I'm sure they'll run also on Linux x86, but I'd be concerned about I/O... and other architectures like IBM or Sun can handle that in better ways (and, oh yes, you can still use Linux there).

Re:Large databases by Optic · 2000-07-25 20:09 · Score: 1

yes, you can get Oracle for x86 Solaris.

Large Production Databases by Zachary+Kessin · 2000-07-25 18:54 · Score: 3

Machines designed to deal with very large databases tend to be more expensive than your average desktop, even if it has the same amount of disk. They tend to be built for reliability, stability, and speed, all of which cost money.

My advice: don't skimp on buying the box. You will probably lose anything you save in admin costs on a cheap and not very good box.

The Cure of the ills of Democracy is more Democracy.

--
Erlang Developer and podcaster

Re:Large Production Databases by Tet · 2000-07-25 19:00 · Score: 2

My advice: don't skimp on buying the box. You will probably lose anything you save in admin costs on a cheap and not very good box.

Yep, couldn't agree more. Don't even think of using anything other than hardware RAID for something like that, too. You won't regret it.

--
"The invisible and the non-existent look very much alike." -- Delos B. McKown

Re:Large Production Databases by JonK · 2000-07-25 19:05 · Score: 1

I'd guess it's more about the cost of a commodity x86 server (say a ProLiant 6400) being considerably less than the cost of, say, a Sun E4500. Then again, the original questioner might be wanting to run it off an old 386 that's lying about the office (because Linux runs really well on 386s, right )
--
Cheers

Jon

Re:You really mean 30 GB Database on Linux by tzanger · 2000-07-26 01:54 · Score: 1

It's only "Silly" until your UPS dies (or the card fails or your SCSI bus resets) while there are cached writes.

I would be under the impression that if your UPS is going to die it would let you know through the power control protocols. Unless you mean the UPS explodes unexpectedly, in which case I thought the battery kept cache data, not state data, which was the reason for my "silly" comment. If it keeps state data (a transaction log if you will) then I'm all for it. :-)

Cache will alleviate the performance problem for brief, small transactions.

Which was exactly the context in which I was speaking. The parent to my reply had stated that the bulk of DB transactions were small and that the multiple-write nature of RAID5 made it a performance bottleneck. I had said that a large write cache would alleviate that.

If you're moving more than 256MB through the controller (in either direction, remember that reads consume that cache, too) in less time than the disks can service it, then your I/O's become as slow as the disks. This is unavoidable and unfixable.

Agreed. But then you're back to square one anyway, with the system (usually) being faster than the bulk storage, which is why you have a small but fast disk cache, a slower but bigger controller cache, and a slower yet but bigger filesystem cache on the OS. Each time you step back from the hardware you get a larger cache. System memory is slower than the fast SRAM on the disk cache, but if the memory has it, it's a ton faster than actually waiting to get the drive to give you the data (and waiting to get it over a 16/32-bit bus).

RAID5 is best-suited for read-intensive environments, or cost-sensitive customers. It is not a high-performance solution. As others have said, RAID0+1 (striped mirrors) are the answer if you want fast and safe instead of cheap and safe.

I'll state again that it depends on your situation. No need to spend a pile on 30G SCSI-II UW disks for a database when you're doing many small transactions. Better to get a few smaller SCSI-II UW disks and RAID-5 with a large cache. There's the ultimate, then there's the practical. :-) The lines between which depend on the pocketbook and the application.

Re:You really mean 30 GB Database on Linux by tzanger · 2000-07-25 22:41 · Score: 2

RAID 5 is real slow for small writes common with a database. You have to first read the whole stripesize (much bigger than the oftentimes single block to write) from all disks, calculate parity and write the small changes back (data + parity). What you want is mirroring, RAID 1, which won't decrease write performance to a crawl but keep your data safe.

Personally I don't like having two very large disks around. Give me a half dozen or so smaller ones.

Also, most hardware RAID controllers have a decent amount of cache with them. The DPT controllers I use can have up to (I think) 256M of ECC cache RAM and optionally battery back it up (silly IMO). That'll fix your performance issues on RAID5.

I think that RAID5 is a good idea, but YMMV.

Of course, yes... Wait, of course, it depends. by BadlandZ · 2000-07-25 19:00 · Score: 5

I have absolutely NO experience with Sybase w/ Linux, but Sybase has claimed they support Linux, and are planning on being at LinuxWorld, so it's worth calling them about it. (They seem to be trying to hire Linux techs pretty aggressively!)

SQL database at 30G, sure. I would say call Sybase Inc. first, then VA Linux second, and get the answers straight from the people who are most likely to give you a usable product. Get your prices, then compare.

I'd be more worried about the differences in _how_ you're going to mirror the data (connection speeds, transfer methods, how frequently) and that Sybase doesn't garble things when going from a database on one OS to another (unlikely, but possible).

I'm sure Oracle for Linux will be mentioned, because there are many claims that it will handle such a situation. But your problem there is going from Sybase to Oracle, not from another OS to Linux. Keep in mind, not all "SQL" databases are identical; the SQL may be, but the extensions provided by the manufacturer won't be.

Re:Of course, yes... Wait, of course, it depends. by Cedric+C.+Girouard · 2000-07-25 20:35 · Score: 1

I have absolutely NO experience with Sybase w/ Linux, but Sybase has claimed they support Linux, and are planning on being at LinuxWorld, so it's worth calling them about it. (They seem to be trying to hire Linux techs pretty aggressively!)

I have firsthand experience with this, and one thing I can say: Yes they do Linux, yes they do it well, support is awesome, and prices are very reasonable.

Cons: Their OpenClient is not thread-safe yet, and won't do well on SMP. I'd have to look back in the specs, but I don't think a 30GB DB would be a problem.

One word of caution: As many have said, do not skimp on hardware. _ever_

--
Marriage is considered capital punishment for the theft of a goat in some third world countries...

Re:Of course, yes... Wait, of course, it depends. by sbeitzel · 2000-07-25 22:42 · Score: 3

I'm using Sybase ASE 11.9.2 as my company's database, and running it on Red Hat Linux 6.2. We've found that using raw partitions can work, but with this version of Sybase the largest you can get a partition is 2GB so you have to distribute your database across several devices. That's no big deal, though.
Now, if you wanna talk about performance...get yourself a RAID and use a multiprocessor system. Sybase understands SMP systems and the RAID will help you on your I/O.
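As a rough sizing sketch of what "several devices" means here: with each device capped at 2GB, a database of about 30GB needs at least 15 of them, before counting log or temp space. The helper name and the 30GB figure are only illustrative assumptions.

```python
import math

def devices_needed(database_gb, max_device_gb=2):
    """Minimum number of devices when each one is capped at max_device_gb."""
    return math.ceil(database_gb / max_device_gb)

# Roughly the size asked about in the original question.
print(devices_needed(30))
```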

--
Oh, go on, check out my job.

It's what you do with it that counts! by stephend · 2000-07-25 21:07 · Score: 1

As many other people have said, 30Gb is nothing special volume-wise on 'big' machines, like Sun's.

30Gb is probably at the top end of what you could expect to put on an x86 box and so the question is, what do you want to do with it? If you're just storing the data and doing a few simple queries, you should be okay, although you'll probably want more than a gig of memory.

If you're doing heavy duty processing with many users then forget it. It's not a problem with Linux, but the hardware. (Yes, Linux will run on a mainframe but you can't get Oracle/Sybase/Informix on it.)

The software is less of an issue. Any of the big commercial databases would do the trick (I prefer Oracle, but then I wrote the Oracle on Linux Installation HOWTO -- URL above). MySQL has no transactions or referential integrity, so even if it could handle the volume it wouldn't be appropriate. Don't think I'd trust PostgreSQL, either.

Bottom line, I think you'd come out cheaper with the expensive hardware in the long term.

Dejanews has a Oracle/Linux Implementation by ChiefArcher · 2000-07-25 20:51 · Score: 3

Take a look at Deja.com (aka Dejanews)
All of that is run off of an Oracle database..

The Database is HUGE!
/dev/rd/c0d0p1 71706488 41278452 29710576 58% /v/10

41GIG
As long as you have the right indexes... you're all set..

ChiefArcher

Hardware/Software by citmanual · 2000-07-25 18:59 · Score: 1

I think this really comes down to what you choose as your true goals: cost, speed, redundancy, scalability, or any other factor.

You need to decide what things are important to you and start making choices.

In the end, you will need to trade off all of these things to find your solution.

Personally, I do a lot of work with banks at the moment. They want brand name, proven tech. Not necessarily the latest and greatest. On top of which, they are willing to pay for brand names. As a result, I would spring for a RAID tower coupled with a Sun box running Oracle. But, if I wanted low cost, I would probably pick up a VA box or a custom-built one with a RAID tower and run Linux with Oracle, or maybe try PostgreSQL.

It all ends up being a trade off.

Re:Size is not the issue by Amphigory · 2000-07-25 22:46 · Score: 2

Are you seriously suggesting running Oracle on an NFS filesystem?

*jaw drops*

I would have to recommend against this. By buying hardware RAID and an appropriate filesystem add-on (e.g. Veritas File System) you can get all the benefits of the filer with all the benefits of local disk.

--
-- Slashdot sucks.

Re:Credibility dropping fast by Chang · 2000-07-25 20:10 · Score: 1

While I agree that lately a lot of questions have been pretty brain dead, I think a lot of these questions have been pretty good discussion starters and I get a lot out of the responses from people who either know what they are talking about or know how to look things up and provide a reference URL when they respond. There are usually at least a few of these people replying to most questions.

I do ask people posting replies to avoid posting anything if you don't know for sure or you are too lazy to check your facts before posting. There are far too many people writing uninformed opinions and using phrases like AFAIK and IIRC to forgive themselves for not checking their facts before posting.

Sorry this post turned into a rant.

At work... by Palin · 2000-07-25 20:47 · Score: 1

At work a group implemented an Oracle data warehouse on Sun equipment running Solaris. The database sizes are expected to scale to 1 TB. But if I am remembering correctly, the cost of the equipment and Oracle was about $2.0 million.

--
Palin...

Re:raw partitions by peter · 2000-07-25 21:10 · Score: 1

Using raw disk partitions is not the same thing as Linux's "raw block device" support, which lets you access block devices without going through the buffer-cache layer. Some database programs want to use this so they can do their own caching, etc. However, lots of things already access block devices such as /dev/hda1 directly, for example mkfs and fdisk. The raw block device support is still very new, and was developed by Stephen Tweedie. It uses kiobufs to do zero-copy I/O. You would know about this if you were at the second memory management talk, given by Ben LaHaise, at the Ottawa Linux Symposium last week :)

So, unless it's the database's release notes that say not to use whole disk partitions, there should be no problem. The kernel lets you access a disk partition as a big file very easily.

--
#define X(x,y) x##y
Peter Cordes ; e-mail: X(peter@cordes , .ca)

Re:30Gb databases by peter · 2000-07-25 21:19 · Score: 1

Linux uses the native word size of the machine for file offsets, so on 32bit architectures, file sizes are limited to 2GB, while on 64bit archs, Linux can handle files up to 8EB (1 exabyte ~= 1 million terabyte).

Recently, large file support on 32bit archs has been developed, but it isn't in the main kernel yet, AFAIK.
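The 2GB and 8EB figures fall straight out of the width of a signed file offset. A quick Python check, assuming (as the quoted post says) that the offset is a signed integer of the machine word size; the helper names are made up:

```python
def max_file_size(offset_bits):
    """Largest byte offset a signed integer of this width can hold."""
    return 2 ** (offset_bits - 1) - 1

def human(n_bytes):
    """Format a byte count using binary (1024-based) units."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]
    value = float(n_bytes)
    for unit in units:
        if value < 1024 or unit == units[-1]:
            return f"{value:.1f} {unit}"
        value /= 1024

for bits in (32, 64):
    print(bits, "bit offset ->", max_file_size(bits), "bytes, about", human(max_file_size(bits)))
```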

--
#define X(x,y) x##y
Peter Cordes ; e-mail: X(peter@cordes , .ca)

Re:No remote NT management? wtf? by Mawbid · 2000-07-25 20:40 · Score: 2

If only the US Navy had your attitude. The whole "Gunship dead in the water" episode could have been avoided with a simple "Well, don't enter a zero there!"

I know it must be hard to be a non-windows hater (that's non-"windows hater", not "non-windows" hater) and listen to the crap that's flung about around here, but you've reached an absurd level of defensiveness. You're defending software (an application AND an OS) that crashes (according to one report) when the user makes a simple mistake and placing the blame on the user. An application should not crash when given invalid input. It should notify the user. An OS should not go down when an application misbehaves. It should kill the app, perhaps generate a core file, and keep on chucking.

Now, Linux and other UNIXes are not without their own problems in this regard, but at least the people responsible don't respond with "don't do that" when told about it. Neither does Oracle, I bet, but you do. You and Microsoft.
--
Fuck the system? Nah, you might catch something.

Raid 5 for a database? You must be kidding. by Nicolas+MONNET · 2000-07-25 20:22 · Score: 1

You must be smoking crack. Raid 5 is optimized for throughput at the expense of seek time, and DB don't give a fuck about throughput, but on the contrary live and die because of seek time. NO RAID 5 FOR DATABASES. Repeat 100 times.

Re:Raid 5 for a database? You must be kidding. by sbeitzel · 2000-07-28 03:42 · Score: 1

The original question was about a datamart. For production databases, you are absolutely correct. For loading data in and out, however, RAID 5 is perfect.

--
Oh, go on, check out my job.

Re:Raid 5 for a database? You must be kidding. by InsaneGeek · 2000-07-25 21:46 · Score: 1

Actually you want a combination of raid 0 and 1. Raid 0 will get you the performance and 1 will get you the redundancy you want. Raid 5 is VERY bad for any types of writes, the previous poster stated it very well, for reads it's OK, for writes it completely sucks.

I have yet to see a large database running RAID 5. Small ones, yes; ones that never change, yes; but for anything that gets updated regularly or is big enough to actually require something like Oracle/Sybase/Informix, RAID 5 doesn't cut it.

Another thing: NetApp uses RAID 4 under their WAFL filesystem. Snapshot is kinda cool as long as you aren't doing lots of writes (you can completely overwrite all your snapshot space, and then you're SOL). I still have issues with running a database over NFS even with gig ethernet directly attached. Of course, I still have problems with running most anything production over NFS (burned in the trenches geezer).

Re:Absolutely Raid 5 for Data Warehousing systems by Nicolas+MONNET · 2000-07-25 22:12 · Score: 2

The kind of database you're talking about is a far cry from what most people here will need, including the original poster. I don't doubt that very specific cases require exceptions to the rule; the rule being, no RAID-5 for databases. That's not just for simple web-type databases; actually it's stated in the O'Reilly Oracle DBA book, which addresses much wider needs.

Re:Oracle officially recommends against RAID ... by Nicolas+MONNET · 2000-07-26 05:12 · Score: 2

In the Linux install notes, they claim that for optimal performance you have to split the Oracle install on 4 disks; which implies no RAID.

But hey, am I supposed to have higher journalistic standards than the slashdot editors? Eh eh eh eh.

Re:Oracle officially recommends against RAID ... by Nicolas+MONNET · 2000-07-26 14:14 · Score: 2

And one of the discs could also be a tape drive. Yeah.

Oracle officially recommends against RAID ... by Nicolas+MONNET · 2000-07-25 22:52 · Score: 4

seek times are dramatically improved in most (if not all) RAID levels

Seek time is not going to be any better with mirroring, for one. The two heads reading the same data won't go faster than one head, will they?

Then for striping, this usually won't make any kind of difference since data access will be randomly spread over the disk. So there you go.

NOW smartly organizing the database WITHOUT striping amongst several disks *will* make seek times faster, actually, it will require less seeking. A typical Oracle installation (as recommended by Oracle) will have for example the software on one disk, the indexes on another, and the actual data on a third.

Now, one DB transaction typically requires at least one index lookup and one data retrieval, which are unlikely to reside close to each other on one disk. When they're separated on two disks, subsequent queries will have less seek time.

Now, since I was right, will you give me my karma back? ;)

Re:Oracle officially recommends against RAID ... by Speed+Racer · 2000-07-26 02:41 · Score: 1

Your subject is patently false. I defy you to show me where Oracle officially recommends against RAID.

--
Free Mac Mini. Yes, I'm

Re:Interoperability and limits by Johann · 2000-07-26 04:30 · Score: 1

...[S]ybase will NOT load database dumps made on different platforms...

BTW - This is the case for most (all?) RDBMSes. The short answer is that binary database files (they store the data) are platform dependent.

Some RDBMSes, like Oracle, allow you to specify the block size of your binary database files. But even if you have 2 files with the same block size, they may not be transferable between databases.

--
"You're gonna need a bigger boat." - Chief Brody

Veritas for remote replication by mzito · 2000-07-25 19:47 · Score: 1

Well, when you're at the 30 gigabyte size, your options as far as remote replication goes are a little limited. For example, it's not enough to warrant the sort of large-scale storage array that an EMC Symmetrix would offer that comes with built-in remote replication (SRDF).

What you could use is Veritas Volume Replicator. It runs as a service/daemon on your box and mirrors every write over IP to another box. It can be configured to do it synchronously (the I/O blocks until the remote I/O completes) or asynchronously (higher performance because there's no delay, but you run the risk of data loss when the db server goes down).
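The sync/async trade-off can be shown with a toy model. This is emphatically not the Veritas API, just a made-up Python class that tracks which writes would be missing on the remote box if the primary died right now.

```python
from collections import deque

class ToyReplicator:
    """Toy model of write mirroring; every name here is hypothetical."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.local = []            # writes applied on the primary
        self.remote = []           # writes applied on the replica
        self.in_flight = deque()   # acknowledged locally, not yet shipped

    def write(self, record):
        self.local.append(record)
        if self.synchronous:
            # The caller blocks here until the replica has the record.
            self.remote.append(record)
        else:
            # The caller returns at once; the record ships later.
            self.in_flight.append(record)

    def ship(self):
        """Background transfer of queued writes to the replica."""
        while self.in_flight:
            self.remote.append(self.in_flight.popleft())

    def lost_if_primary_dies_now(self):
        return list(self.in_flight)

sync, async_ = ToyReplicator(True), ToyReplicator(False)
for rec in ("t1", "t2", "t3"):
    sync.write(rec)
    async_.write(rec)

print(sync.lost_if_primary_dies_now())    # nothing outstanding
print(async_.lost_if_primary_dies_now())  # everything not yet shipped
```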

Unfortunately, Veritas VR is not available under linux - I think you said it wasn't under linux anyway, but a lot of people are offering linux solutions.

Also, given that your database is only 30 gigabytes, do you actually do a lot of writes? Realistically, if you only do a couple of hundred inserts an hour, you could just, every hour, manually insert the changed records into the remote db. Heck, do it every 5 minutes. I'm not familiar with Sybase, but on Oracle, you can just run the redo logs on the remote data center. That's going to be the cheapest option, and the most linux compatible.

Anyway, if this is really enterprise-level, spring for Veritas - their stuff is expensive but really good.

Cheers,
Matt
Matthew J Zito, CCNA

--
me@mzi.to

160 Gigabyte database by mzito · 2000-07-25 19:58 · Score: 1

At my work, we run two large databases on Linux, one that's about 95 gigs and the other 160 gigs. Now, we run our production DB on Solaris, but the data warehouse and the ticketing DB are on Linux with Oracle. We've had great results with Linux for the most part. The biggest problem is that the documentation by Oracle is not as good for Linux as it is for Solaris, and it's harder to find DBAs with Linux experience.

I can't vouch for Sybase's stability under linux, but Oracle will do you just fine. Get a dual or quad-cpu box, depending on how much data you need to do, and 2 gigs of RAM either way.

Matt

Matthew J Zito, CCNA

--
me@mzi.to

Re:~30Gb Sybase Database by Doctor+Memory · 2000-07-25 22:47 · Score: 1

I would stick with Sybase
Absolutely. I spent several months last year adding Oracle support to an application designed around SQL Server and cross-vendor development is really something you should avoid if you can. Leverage your existing DBA knowledge and you can probably use one DBA for both sites. If you do go with another vendor, you'll wind up with another DBA either on salary or on retainer.

--
Just junk food for thought...

Re:You really mean 30 GB Database on Linux by Espressoman · 2000-07-25 20:41 · Score: 1

I agree with you. RAID 10 will give a nice combination of safety and performance. If you're crazy (well, for just 30 gigs, perhaps not *that* crazy) there's the new Adaptec UDMA 66 RAID card which I think may support RAID 10. It definitely supports RAID 5. Actually, I wonder how bad RAID 5 would actually be with one of those cards and five UATA 66 Maxtors with the 2MB cache on them. They are pretty fast drives....

I don't know if the Adaptec card has its own caching, but it would be very cool if it did!

Re:Clarity of Expression by Cato · 2000-07-26 02:03 · Score: 2

BSD is not 'derived from System V' - it forked off from Unix earlier, maybe version 7 Unix.

Also, Solaris 2.x is based on SVR4 (System V Release 4) - SVR4 is quite upward compatible with SVR3.x.

And Solaris is not spelt with a 'u'...

Linux was not 'built on Posix' (not a meaningful term, Posix is an API spec) but I believe Linus tried quite hard to conform to the 1003.1 specs, and the bash people have tried to conform to the POSIX shell specs.

Caching, and RAID10 vs. RAID0+1 by ansible · 2000-07-25 21:47 · Score: 3

The problem with a caching controller is that unless it's well engineered (with its own battery backup), you're more likely to run into filesystem corruption in the case of a power failure or OS crash.

A standard filesystem (such as ext2) on top of RAID5 will never be fast for small writes.

NetApps get around this because the WAFL filesystem is explicitly designed to sit atop a RAID4 drive array.

And there is a difference between RAID10 and RAID0+1.

RAID10 is a stripe of mirrors. Each pair of disks stores the same information (RAID1), and a stripe is created over those mirrors. This can tolerate multiple drive failures as long as at least one drive from each mirror is working.

RAID0+1 is a mirror of stripes. Two stripes are created (RAID0), each with half the total of disks. These stripes are then mirrored (RAID1). The problem here is that if a drive goes out, it takes out the entire stripe. If a drive in the other stripe goes out before the rebuild is complete, you're hosed.

Normally RAID systems (like RAID5) can't tolerate more than 1 drive failing at the same time. However, RAID10 provides more protection than RAID0+1, at the same price.
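
To make the comparison concrete, here is a minimal Python sketch that counts which two-drive failures each layout survives. It assumes an 8-drive array and the simplified model described above (a RAID0+1 stripe is lost entirely once any one of its drives fails); the drive numbering and function names are made up for illustration.

```python
from itertools import combinations

def raid10_survives(failed, n_drives):
    # Drives 2k and 2k+1 form mirror pair k; the array survives while
    # every pair still has at least one working drive.
    return all(not ({2 * k, 2 * k + 1} <= failed) for k in range(n_drives // 2))

def raid01_survives(failed, n_drives):
    # Drives 0..n/2-1 form stripe A, the rest form stripe B. A stripe is lost
    # as soon as one of its drives fails; the array needs one intact stripe.
    half = n_drives // 2
    stripe_a_ok = all(d >= half for d in failed)
    stripe_b_ok = all(d < half for d in failed)
    return stripe_a_ok or stripe_b_ok

def survival_rate(survives, n_drives, n_failures=2):
    # Fraction of all possible simultaneous failures the layout survives.
    cases = [set(c) for c in combinations(range(n_drives), n_failures)]
    return sum(survives(c, n_drives) for c in cases) / len(cases)

print(survival_rate(raid10_survives, 8))
print(survival_rate(raid01_survives, 8))
```

With 8 drives there are 28 possible two-drive failures. Under this model RAID10 survives 24 of them (it only dies when both halves of one mirror go), while RAID0+1 survives 12 (only when both failures land in the same stripe).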

Re:Three words: by rnturn · 2000-07-25 20:43 · Score: 2

``With Intel, isn't the limit still 4CPU for the latest generation?''

I don't think so. Didn't the recent benchmark comparing IIS vs. the new webserver from RedHat run on an 8 CPU SMP system? You can get more CPUs... you just don't see them in the advertising aimed at Joe Sixpack. They tend to be just a bit on the pricey side.

Cheers...
--

--
CUR ALLOC 20195.....5804M

Re:Three words:with three words by rnturn · 2000-07-25 20:57 · Score: 2

``Alpha is cheap. A reasonably good alpha is under 5000$.''

I think you're talking about an Alpha-based workstation. No one's going to be hosting a 30+GB database on a workstation. They would be looking at a DS10 or DS20 at a minimum. Expect to pay something in the area of US$20K for a smallishly configured DS20.

``Storage will be a 1000$ more.''

A whopping $1000 for disk space to host a database? Only if you plan on sticking the entire thing on a single 36GB drive which would be an inexcusable performance hit. And that would leave no money for any kind of mirroring.

I guess this $6000 configuration isn't intended for a production system.
--

--
CUR ALLOC 20195.....5804M

Re:The question changed by rnturn · 2000-07-25 22:19 · Score: 2

``Remember, raw devices also means only one file per disk.''

Depends on your UNIX. Under Tru64 and some other Unices, you have storage management tools (under Tru64 there's Logical Storage Manager, for example) that'll let you slice up a disk into as many pieces as you like. You then access the disks through either /dev/vol/... or /dev/rvol/... (if you really want to use raw data partitions). Striping across SCSI adapters for better I/O performance is quite easy.

``P.S. $2K is way low for both the hardware and the database!''

Agreed. Some of the dollar estimates that people are throwing around are fairly humorous.
--

--
CUR ALLOC 20195.....5804M

Re:raw partitions by rnturn · 2000-07-25 21:46 · Score: 5

``oracle uses its own raw partitions/filesystem to store its data. this speeds up oracle''

It doesn't have to manage its own disk space. And it may, under certain conditions, provide better performance. We have been moving away from raw data partitions. This after running some benchmarks of a large table residing on raw partitions vs. the same data residing in tables in a filesystem. The performance was actually better while accessing the data in the filesystem. We're talking 10+% better performance not just a few percent. Our experience, based on our benchmarks, and discussions with Oracle technical people, is that the preference for using raw data partitions was based on performance tests using older versions of UNIX and less capable filesystems. Of course, your mileage may vary.

Aside from performance, if your database changes frequently, adding and deleting tablespaces is a major pain (with long downtime) when you're using raw data partitions but is a snap when you're using filesystems for data. If your database is fairly static raw partitions might buy some little bit of performance but, again, at the expense of manageability. IMHO, raw data partitions just aren't worth it. Even if comparative performance were a wash, the easier means of managing the database weighs in favor of filesystems.

--

--
CUR ALLOC 20195.....5804M

running Oracle on NFS by jonbrewer · 2000-07-25 23:26 · Score: 1

Yes, using a Network Appliance Filer is using Oracle with NFS - but this is a solution developed by NetApp in conjunction with Oracle, and is AFAIK the only NFS solution Oracle recommends.

(No I don't work for NetApp.)

We're considering Filers to replace local disk on some of our Sun 450s (running Oracle 8.1.6) at the place I work.

Re:Wouldn't go with Linux myself - fallacy by Lumpy · 2000-07-25 19:44 · Score: 1

We here at Giganto Communications (name concealed for my protection) use Intel based multi-processor machines exclusively for all our important database needs. You obviously are in the minority when the largest communication companies in the world will do what you will not.

As for Linux, no. We suffer with NT crashes, but we are sneaking Linux in the door, one server at a time, until.... well you get the picture :-)

--
Do not look at laser with remaining good eye.

Re:Three words:with three words by arivanov · 2000-07-26 00:25 · Score: 2

I think you're talking about an Alpha-based workstation. No one's going to be hosting a 30+GB database on a workstation. They would be looking at a DS10 or DS20 at a minimum. Expect to pay something in the area of US$20K for a smallishly configured DS20.

This is UK price for DS10L. You do not need the expandability of a DS10 or DS20. Also AXP has even cheaper machines. Sold in the UK by evolution.

A whopping $1000 for disk space to host a database? Only if you plan on sticking the entire thing on a single 36GB drive which would be an inexcusable performance hit. And that would leave no money for any kind of mirroring.

You are right. Off by 2-3 times. Was thinking of an external IDE RAID to SCSI box. Works fast enough. Is cheap enough. If necessary mirror two or more at RAID0.

--
Baker's Law: Misery no longer loves company. Nowadays it insists on it
http://www.sigsegv.cx/

Re:Three words:with three words by arivanov · 2000-07-25 20:30 · Score: 4

Very bad idea. Or maybe even "Stupidity is limitless"

1. If you have not noted Oracle legal has walked around every single site that had Oracle vs X benchmarks (X=mysql, sybase, informix) and made them drop them. This is actually possible under the 8.0x EULA. Actually just read the EULA. It is a masterpiece in itself. You are not allowed to benchmark the product and not allowed to question the fact that it is fscking slow and not ANSI compliant. That is besides the fact that if I was you I would not buy something where the manufacturer intentionally disallows fair comparison with other products. It is enough to say fsck this at least for me...

2. The original database is on Sybase. Sybase is at least more or less syntactically ANSI SQL compliant. Oracle is as far from ANSI as it gets. It will be a good guess that it will take you ages to port the bloody thing. And porting it will be more expensive than the "expensive" hardware.

3. I would see if the database design is implementable under PostgreSQL or MySQL on an Alpha. Alpha is cheap. A reasonably good alpha is under 5000$. Storage will be a 1000$ more. This is as much as an appropriate x86 box. Postgres does not have a 2GB database limit anyway as it splits database files. MySQL does not have this limit on alpha because the platform is 64 bit. Your problems are in the key limitation/lob interface for Postgres and transactions for MySQL.

4. If neither of the solutions in 3 is implementable you have to open your wallet wide and buy Informix for Intel or DB2 for Intel. Both of them work and are ANSI compliant. BTW, DB2 for Intel linux developer edition is free. Free period. No expiration. So you can actually see if the database will work. And they match Oracle on some benchmarks and DB2 beats the crap out of it when it comes to real scalability and clustering.

--
Baker's Law: Misery no longer loves company. Nowadays it insists on it
http://www.sigsegv.cx/

Re:Three words: by lcase · 2000-07-25 20:53 · Score: 1

Oracle on Linux may be the best choice. DO NOT I repeat DO NOT skimp on hardware. Some of the Linux platforms advertised in Linux Journal would be your best bet for industrial strength hardware.

Much Success!!!!

--
lcase - @home in cyberspace

kdb by muchandr · 2000-07-25 20:03 · Score: 1

will do your mirror asynchronously, with no impact
on performance. Will your obese monster of a
database do it? You can download from www.kx.com

Ever considered Adabas/D? by Paranoid · 2000-07-26 10:27 · Score: 1

I would skip the filesystem layer and run off of partitions (or /dev/md devices?) directly. Although I've only had 10Gb at a time going under it, and on actual partition devices, not RAID devices, (I am resource-underprivileged, unfortunately), Adabas/D handles this very well. I'm sure other commercial databases would as well, if they are truly supporting linux rather than just paying it lip service =)

I see absolutely no issues with pushing Adabas/D farther than I have, it has not had any issues with it whatsoever. Of course, if you used files on ext2 or reiserfs or whatever, you would have unnecessary slowdown and potential instability, use disk partitions.
--
Paranoid

--
Paranoid
Bwaahahahahaa.

wrong question by jetson123 · 2000-07-25 19:54 · Score: 4

Of course, UNIX can handle it, probably better than just about anything else out there. Linux isn't UNIX, of course, and whether Linux can handle it is a different question. It probably can if you find the right software (I'd give DB2 a try).

But why ever would you replicate a database to a different kind of server? If the original database runs on Sybase SQL on whatever, then the obvious answer is to replicate it to an identical setup. Anything else, whether mission critical or not, is just going to be a lot more work, training, and maintenance.

two words for you: by um...+Lucas · 2000-07-25 20:42 · Score: 1

Is Unix capable of handling a database of this size and what other terrible pitfalls do you foresee?

MacOS and Filemaker.

Honestly, though, of course "Unix" can handle a database this size... it all depends on what hardware your "unix" is running on. Obviously Linux or *BSD on a 386, 486, or Pentium system won't cut it, but if you up your ante to a dual P-II or P-III system, or even a Quad P-II Xeon system (which should be relatively "cheap" compared to offerings from Sun and Compaq), you'll be well on your way...

Multi CPU by Hammer · 2000-07-25 21:55 · Score: 1

In an interview with Linus himself he stated that he has an 8 CPU Intel....

Re:Multi CPU by fsck · 2000-07-25 23:08 · Score: 1

The last interview I saw with his system named in it was that he had a quad Xeon 400. It built kernels in less than a minute. Of course that was quite a while ago..

--

Lars - ...I could always phone Linus when I had a problem.

Re:No, only Microsoft SQL Server can do it. Period by bruceg · 2000-07-25 22:49 · Score: 1

Linux 2.4 will take care of the 2GB limitation.

Am I missing something here? by dreamt · 2000-07-25 19:09 · Score: 1

This question seems to be asking whether Unix can handle a 30GB database? Should this be asking if Linux can handle it, or any general Unix? I would guess that the answer to either of these questions is yes. I certainly know that Solaris is more than capable, but can't see why Linux wouldn't (hey, even Sybase under NT can handle it, although not as well as Solaris).

As far as the 2GB file system limit, this is something that is easy to get around. Up until Solaris 7, when Solaris became 64 bit and supported files/partitions > 2GB, all you needed to do was to create multiple Sybase database devices and span the database across them.
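
As a rough illustration of the sizing, here is a small Python sketch that works out how many database devices are needed when no single file may exceed the 2GB limit. It assumes the limit is 2^31 - 1 bytes (a signed 32-bit file offset) and that "GB" means 2^30 bytes; `devices_needed` is a made-up helper, not a Sybase tool.

```python
MAX_FILE_BYTES = 2**31 - 1  # assumed limit: largest signed 32-bit file offset

def devices_needed(db_bytes, max_file_bytes=MAX_FILE_BYTES):
    # Integer ceiling division: each device file must stay at or under the limit.
    return (db_bytes + max_file_bytes - 1) // max_file_bytes

print(devices_needed(30 * 2**30))
```

For a 30GB database this comes out to 16 devices rather than 15, because 15 files of 2^31 - 1 bytes fall 15 bytes short of 30GB. In practice you would probably pick a rounder device size below the limit and leave room for growth.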

Re:Three words: by Sensor · 2000-07-25 19:18 · Score: 1

does anyone actually know what the license costs for Oracle on Linux actually is? If this is a datawarehouse application then I'm assuming that you can't use one of the low end single user licenses.

Whenever I have looked at Oracle the software costs have dwarfed the hardware costs.

Re:No, only Microsoft SQL Server can do it. Period by Dissenter · 2000-07-26 01:16 · Score: 1

Who cares about a file size limit. Oracle databases are designed to span multiple files. Do you really think that no Oracle/Linux system goes over 2GB? LOL
Dissenter

--

Dissenter
"There is no knowledge that is not power."

Re:Clarity of Expression by Felinoid · 2000-07-25 21:26 · Score: 2

True :)
Above that... Linux is really a Unix clone. (People who call it a "work-alike" are just being cute... it's a clone, that's all)
Sun entered the market and really got its fame with SunOS (a BSD based Unix clone).

Linux is usually called a *nix not a Unix because it is not licensed from AT&T or SCO. (Or anyone else who held the trademark)

It should be noted that BSD and Solarus are Unix forks. The BSD dev group and Sun must maintain compatibility by relying on documented standards just like Linux.
[noted because some Unix people who dislike Linux will attack Linux because it is built on standards not on the actual code. The idea being that Solarus and BSD are the same code and by default compatible. This is false for the above reason. The below is just to extend the point nothing more.]

Solarus and BSD are forked from different code. BSD is from the original AT&T code later known as SysV. Solarus is from a total rewrite in the 1990s known as SrV. SysV and SrV are not compatible.

So in reality Linux, BSD and Solarus are three totally unique (and mutually compatible) operating systems. Linux being the only one of the three with no license to the name Unix.

Over time many Unix clones were incorrectly called Unix. However this fact was less than noticeable as forks and clones had no standards to follow and ended up pretty much being mutually incompatible.

Linux was built on Posix, the first effort to correct this issue.

On a side note... Linux disoriented me because I learned Unix on an AT&T 3B2/300. But Linux didn't throw me much.

One gripe people have about Linux is that it is possible to write Linux only code that does not work on BSD or Solarus.
While true it is equally possible to write BSD or Solarus only code.
It is an effort of the programmer to maintain portability. Failing that it does not matter what operating system the code was made on.

--
I don't actually exist.

Re:Clarity of Expression by Felinoid · 2000-07-25 21:30 · Score: 2

BSD is licensed from AT&T and thus is a Unix.

Small issue :)
I believe BSD was never certified... It simply is by age alone

--
I don't actually exist.

Re:No remote NT management? wtf? by fcw · 2000-07-25 21:21 · Score: 1

So you don't consider field names to be critical data?

Critical to the correct behaviour of the Oracle application? Probably.

Critical to the stability of the server OS? Absolutely not.

Silly question! by MeanGene · 2000-07-25 20:36 · Score: 1

This story should not have been posted - boo to the maintainers!

How much is your data worth? Will your cheap box have all the servery things? Here're just some:

RAID
Hardware monitoring
Hardware redundancy
ECC low latency RAM
Over-engineered cooling

I bet when you add all of those, your x86 box will become much pricier.

Re:Silly question! by johnlcallaway · 2000-07-25 21:33 · Score: 1
Pricing comparison --
- Sun E450 dual CPU w/1 GB memory 2 18GB drives - around 30K (mirroring supported by OS)
- HP LC2000 dual CPU (700MHz) w/1GB memory 4 18GB drives and HP Netraid board - $25K
Yep -- big boxes, whether Sun or Intel based, cost big bucks.....

BTW - Linux runs great on the HP server (install time - 15 minutes, 1 reboot). Haven't tried it on the Sun box...
--
I rarely read replies, it's my opinion and if you thought about your opinion a little more, I'm OK with that.

Re:Absolutely Raid 5 for Data Warehousing systems by Surak · 2000-07-25 23:59 · Score: 2

As to the original question, can Linux handle a 30 GB database, my answer would be "Yes, but it will hurt". Ever try staging more than 2GB of data on ext2? Ever try moving more than 1GB of data on ext2 with less than a 4KB block size? It hurts!

My understanding is that Oracle can now use its own filesystem on Linux...I don't know very much about Oracle's proprietary FS... but my thinking is that it would make life easier. I dunno. Anyone else know?

--
My journal has hot /. gossip.

Re:No remote NT management? wtf? by rm+-rf+/etc/* · 2000-07-25 21:28 · Score: 1

As an NT engineer, I can do ANYTHING from my laptop, from ANYWHERE in the world.

Really? Can you explain the procedure for rebuilding the kernel remotely for an NT machine?

Re:No, only Microsoft SQL Server can do it. Period by CristianoMonteiro · 2000-07-26 01:13 · Score: 1

People, this is a joke !

How can a joke (and this is not a funny one) be marked as "insightful" ???

--
-------------------------------------------- Se você consegue ler aqui então fala português. Óbvio

More info by _Spirit · 2000-07-25 19:02 · Score: 1

It's hard to give sound advice without a little more info. Is this mirror going to serve as a backup-server in case the main server is not available ? How many users are we talking here ? If there's only a few users, I can't think of anything against an x86 based server, other than maybe supporting a linux/insert_db_engine_here.

Message on our company Intranet:
"You have a sticker in your private area"

--

beauty is only a light switch away

Re:Custom built machines by _Spirit · 2000-07-25 19:07 · Score: 1

Funny thing, could you refer me to a shop that can sell me an x86 based machine that will outperform a fully loaded RS/6000 S80 on large databases ?

Message on our company Intranet:
"You have a sticker in your private area"

--

beauty is only a light switch away

Re:No remote NT management? wtf? by FascDot+Killed+My+Pr · 2000-07-25 20:13 · Score: 1

Oracle on NT a. '~crash by mistyping...' The answer to that of course, is to not mistype mission critical data, you should be using scripts for bulk transfer anyway.

Who said anything about typing "mission critical data"? As I said originally, I was typing field names into a GUI. My point was not "I'm going to need to type field names all the time so it better be robust". My point was "if something so simple can go so wrong, what ELSE is broken".

"As an NT engineer, I can do ANYTHING from my laptop, from ANYWHERE in the world. Using only MS tools and a few scripts I wrote in vbscript. I concede that sometimes it would be nice to have a 'true' terminal connection to the server, but you don't 'need' it."

Oracle provides no facility for remotely starting a "local" bulk load (that I could find, anyway). This means that you must be running locally to load from a local disk. On Linux this is easy: telnet. On NT this requires time and/or money (which is what "MS tools and scripts I wrote" translates to).

"Only 20mb a minute? bwahahahaha I can reload data into my NT, MS SQL server at over 150MB PER MINUTE."

Different hardware. I was using a simple desktop for benchmarking (to get comparisons, not absolute numbers). In any case I wasn't using bulk loading to restore--Oracle has an actual backup/restore mechanism that doesn't require reinsertion of data.
--
Give us our karma back! Punish Karma Whores through meta-mod!

--
Linux MAPI Server!
http://www.openone.com/software/MailOne/
(Exchange Migration HOWTO coming soon)

Re:No remote NT management? wtf? by FascDot+Killed+My+Pr · 2000-07-25 20:52 · Score: 1

'~no local bulk load in Oracle'

RTFM. Need I really say more?

No, you need to read more. I said "no facility for remotely starting a local bulk load". If I am on machine A, I have no way to tell Oracle on machine B to bulk load a file directly from B's harddrive. If you wish to claim that is possible, you are going to have to provide a URL for proof.

"...600 mhz p3, 128 mb ram Dell inspiron 3800. Your desktop is probably about as powerful, yes?"

Nope. I finished testing a year ago (started 18 months ago) with a spare desktop. If I recall, it was a PII 300 with 64 MB. Also, totally unoptimized (i.e. no kernel tweaks, etc). Just a straight RedHat install with a straight Oracle install on top.
--
Give us our karma back! Punish Karma Whores through meta-mod!

--
Linux MAPI Server!
http://www.openone.com/software/MailOne/
(Exchange Migration HOWTO coming soon)

The question changed by FascDot+Killed+My+Pr · 2000-07-25 19:04 · Score: 5

The title and summary say "Can Unix handle it?" while the "below the fold" area asks "Can Linux/Intel handle it?".

I'd say the answer to the first question is a resounding "duh!". The answer to the second is a resounding "probably".

I found Oracle on Linux to be quite usable and nice (except for lame non-readline-enabled interactive tools) and fairly fast. But there is something...incongruous about spending $2000 on hardware, $2000 on Oracle and then using a free OS (that you WILL have to tweak to optimize).

Other tidbits:
1) Do NOT, I repeat NOT NOT NOT use Oracle on NT. The (evaluation) version I tried sucked BIG TIME. The bulk loader didn't properly support all the file formats it was supposed to and I was able to repeatedly crash the box by mistyping field names into the table creator GUI. Add all the problems of NT (no real remote management, etc) and you have yourselves the makings of a nightmare.

2) Raw devices are for more than recovery. They also help in the speed department. If you are going to be loading 30+ GB of data multiple times (this is a backup, right?) you are going to want speed. IIRC, ~100MB took about 5 minutes to bulk load (raw, not insert) on Oracle for Linux. That's 25 hours of load time for 30 GB.

3) Can't you take the backups from your primary DB and load them as restores to the backup DB? That would save tons of time and effort (up front AND ongoing).
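
The arithmetic behind item 2, as a quick Python sketch. It assumes the load rate measured on the ~100MB sample scales linearly, and it treats 30 GB as 30,000 MB; `load_hours` is just an illustrative name.

```python
def load_hours(total_mb, sample_mb=100, sample_minutes=5):
    # Scale the measured sample linearly to the full data set.
    minutes = total_mb / sample_mb * sample_minutes
    return minutes / 60

print(load_hours(30 * 1000))
```

That works out to 25 hours at roughly 20 MB a minute, and the same function shows how quickly the number drops if faster hardware improves the per-sample time.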
--
Give us our karma back! Punish Karma Whores through meta-mod!

--
Linux MAPI Server!
http://www.openone.com/software/MailOne/
(Exchange Migration HOWTO coming soon)

Re:The question changed by thing12 · 2000-07-25 19:24 · Score: 1

They changed the licensing structure: http://oraclestore.oracle.com -- you can now get Oracle 8i Enterprise edition licensed for multiple servers for $750 - add on support and you're probably in the 2k range.

Re:The question changed by thing12 · 2000-07-25 19:37 · Score: 1

Er... not support - but add on their per client license which at the minimums for Oracle 8i Enterprise (that's buying 1 named user license for every UPU, which for Intel is 1x the MHz x the number of CPUs).... so it would be about 2k+ for a 450 MHz, single processor Intel box. Of course it's much cheaper to buy a 2 year license (35% of the cost), or to buy Standard Edition if you don't need the added functionality of Enterprise.

Re:The question changed by thing12 · 2000-07-25 23:50 · Score: 1

right, and 17 * 100 = 1700 + 750 for the software = $2450 (using that 500 MHz CPU).

Re:The question changed by CigarBuff · 2000-07-25 20:45 · Score: 1

1. 30GB is nothing - even Oracle on NT can handle that fine. I've got bigger databases than that on NT without a problem. A 30GB database doesn't mean 30GB files. And why would you intentionally mistype table names?

2. Raw devices are a thing of the past. The speed increase you're looking at nowadays for raw devices is almost negligible, and is nothing compared with the increased management capabilities and flexibility you get from working with a filesystem (esp. in the area of backup tools). Remember, raw devices also means only one file per disk. Considering today's drives are likely to be at least 9GB, it's easy to get a lot of waste there, especially when you want to split your files among multiple drives/controllers for performance reasons (that's where you'll get the biggest bang for the buck in terms of performance - remember I/O is almost always the culprit when it comes to database bottlenecks).

3. If you use Oracle, you will not be required to reload the database regularly just to keep a remote hot site. Oracle has standby-database functionality that allows you to take the archived redo logs (a copy of all the transactions issued to the database) from the primary, and simply ship those to the backup and apply them. That way, both databases are completely in sync while the data you're shipping is only as large as your transactions. And, considering this is a data warehouse, transactions are generally read-only, so you're talking virtually no traffic during normal use. That said, when you do your data loads (daily? weekly?), you're talking considerable traffic, but you can spread out the shipment of the logs to compensate for that.

CigarBuff (Oracle/SAP DBA)

P.S. $2K is way low for both the hardware and the database!

Re:The question changed by jaclu · 2000-07-26 01:44 · Score: 1

>2.Backup Solution
There is at least one decent alternative for linux backup, I use Arkeia (www.arkeia.com), massively parallel, and does most other OSes as well.

Re:The question changed by twisteddk · 2000-07-25 19:56 · Score: 4

And You also quite nicely manage to change the question Yourself.
Nobody said anything about Oracle. No wait.. I take that back... But the person posing the QUESTION didn't say anything about Using oracle for a DB. The question actually stipulates a Sybase DB !

But anyway, to answer the question posed in the first place: Yes You COULD probably handle a UX/NT transition of the data, but try not to change database as this often screws with the data. Not all tables are stored identically in all databases (probably one of the reasons there is more than one supplier of databases). So for God's sake.. Even IF You want to have a backup/mirror on the UX box, make sure You run the same DB.

But still, it sounds like you want to "exchange" the UX box for an Intel machine running Linux.. Am I right ?
If this is indeed the case, yes even a 30 (or 50 gig for that matter) DB is possible. The major pitfalls in this scenario are (I've been there myself):
1. Physical space for disks.
If you go buy an Intel machine You limit Yourself to say about 3-4 SCSI controllers, and unless You go and buy a shitload of External connectivity (cabinets and such), which can be a pain, You're often limited to about 8-10 disk drives, so size Your DB with some future expansion in mind.

2. Backup solution
Make Sure to have a decent and FAST backup. I've not yet been able to run parallel backups on Linux (maybe I'm just not very good at configuring it), and it DOES take a while to backup 30 Gigs, even on a DLT, so if the client wants high-availability, take this into consideration. However, in Your situation, this might be redundant, since this DB WILL be the mirror (but the point should be handled otherwise).

3. High Availability.
Your client might want the DB to be accessible at ALL times, and we all know that when a PSU or CPU goes in an NT box, the machine is pretty darned worthless. And getting a decent service level on an Intel box is almost impossible (usually 24 hours is as good as it gets). Also You should consider if this mirror should be used as a fail-over in case of whatever.

4. Remote access.
Remote servicing is a bit easier though, as You can easily set up Telnet or whatever. However, You can get some good remote programs for Windows machines also, just not AS good. But this should only factor in if You need to access the machine frequently. If the choice is between UNIX/Linux, it's the same diff. But if it's UX/NT, then think about it for a while.

5. Maintenance
Maintenance is a BIT heavier, especially when the machine gets older, but the first year or two, who gives a S***. Also, whatever people might say of UNIX hard drives, they're EXACTLY the same as the ones sold for Windows machines. They're just formatted differently. So You will save a bundle on the costs of acquisition, which should cover for the added maintenance of trading in old components that can no longer hack it (MB's, Networking cards, SCSI controllers, RAM etc.), All of the components which are NOT the same :)

6. Choosing the right hardware.
You might want to make sure to spend a few more dollars on the right hardware. Whatever people might say, the UNIX boxes are most often put together with the best of hardware, ECC ram, Redundancy controllers, and hot-swap drives (and sometimes also other pieces can be swapped whilst power is still on). DON'T save more money here than absolutely necessary. A good point to make would be: It's basically the same hardware, only the software is different.

These are my thoughts/experiences on this matter. As for "FascDot Killed My Pr". I REALLY have to say: I've been running an Oracle DB (8.04) on an NT for over two years now, not a single glitch yet. And YES, it's a development DB, so there ARE active users. And installation was as sweet as pie. Only major flaw in my opinion is the inability of older Oracles to "bundle together", You could not have more than one major release DB installed on one system, You have to add another logical DB to the existing one, or install a different major release version of oracle as the second DB. But that's SUPPOSEDLY done away with in version 8 and up (not that I'm not having problems with it anyway)

--
--- To err is human... Am I more human than most ?

Yes by Dacta · 2000-07-25 19:04 · Score: 2

It's not really a platform problem. You might have to partition the DB into multiple files to get around the 2GB file size limit on Linux (I think Sybase can do that), but I doubt there would be any other real problem.

Sybase runs on Linux, of course, so there is no problem there.

I'd ask in the Sybase newsgroups about the biggest database they have seen on Linux - they have a good reputation for quick answers. (About the only good thing I have to say about Sybase, but still....)

I'd be surprised if there aren't quite a few Linux DB's bigger than 30GB anyway.

Re:Yes by bero-rh · 2000-07-25 19:07 · Score: 3

You might have to partition the DB into multiple files to get around the 2GB file size limit on Linux

Or patch the kernel so it doesn't have the limit.
Patches for this are available; if you don't want to build your own kernel, get the Red Hat Linux Enterprise Edition, which has this patch by default.

--
This message is provided under the terms outlined at http://www.bero.org/terms.html

How I'd do it. by Genady · 2000-07-25 21:09 · Score: 2

I looked at this sort of thing over a year ago, and previous posters are right, 30 GB is kinda puny. Here are the headaches I had that I hope you can avoid.

1. First and foremost, stack the box with as many SCSI adapters as you can. I/O quickly becomes a bottleneck on large DB systems. Also if you're doing Linux go with Linux's built in RAID, I hear it's faster than the hardware RAID cards you can buy out there. That said be sure to get more than one of some hot processors, you're going to be using a goodly portion of one of them to do your RAID.

2. A journaling filesystem would be good. I don't know of any available for Linux (except maybe XFS, what's the status of that?) you really really don't want to fsck your Raid 5+1 (Yes I said 5+1)

3. Unless you have the funds to implement a slightly lower performance box, expect to be developing on a separate instance on this same server. That means worst case another 30 GB of space for the new instance, which will also require a kernel re-compile to get the shared memory and semaphore settings right. (You are using Oracle aren't you? ;0)

4. Better yet, get your requirements up front for number of instances and design the hardware for that number + 2, and tune the kernel appropriately. Whatever Oracle gives you for kernel parameters, multiply them by the number of instances.

5. Don't sweat the raw devices stuff. It's generally more trouble than it's worth. It makes backups harder, makes restores harder, and makes RAID harder. It's just not worth the headache.

6. Invest in a nice DLT library that is supported up front. Get your backup scheme in place, even if it's just your DBA's writing dump files nightly. A good DBA can restore from a dump in a few hours, AND they can restore a dump of production to your development database, making those refreshes from production a fairly painless task (and management/developers/DBA's *WILL* ask for refreshes from production).

7. DON'T consider RAID 5, unless it's 5+1. RAID 5 can be murder on DB performance, especially in a VLDB where you perform inserts (it's a little less bad on data warehouses). Think 1+0 or 0+1, and span the + across multiple controllers/disk arrays.

8. Don't skimp on your DBA. In reality most any competent SA can administer a DB *system*; sink any payroll money into a very good DBA, it will save you in downtime and calls to Oracle later (You are using Oracle aren't you ;0)
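Taking point 4 literally, the sizing arithmetic can be sketched in a few lines of Python. The parameter names and per-instance figures below are hypothetical placeholders, not real kernel tunables or Oracle recommendations; the only thing taken from the list is "plan for the required instances + 2 and multiply".

```python
def scale_kernel_params(per_instance, planned_instances, headroom=2):
    """Multiply each per-instance setting by (planned_instances + headroom)."""
    total_instances = planned_instances + headroom
    return {name: value * total_instances for name, value in per_instance.items()}

# Hypothetical per-instance figures; the real ones come from the vendor's install notes.
per_instance = {"shared_memory_bytes": 512 * 1024 * 1024, "semaphores": 200}

print(scale_kernel_params(per_instance, planned_instances=2))
```

Whether every setting really scales linearly with the instance count depends on the parameter, so treat the result as a starting point for tuning rather than a final value.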

g:wq

--

What if it is just turtles all the way down?

try a clean design by FonkiE · 2000-07-25 19:36 · Score: 2

do the indices fit into main memory? not close - a good fit is needed. so you need at least 50% more memory than the indices to be on the safe side. (you wanna do joins, etc ...)

are there big tables, each one >memory/2 ? or are there 1000 small ones. (we talk about real mem here not virtual)

the rules are:

1) design
2) choose hardware and software on the details of 1.

sometimes a little redesign makes it possible to have more freedom on hard/software ...

(the 50% i mention above is a value from experience. the more flexibility you need the larger the real memory needs to be. having indices in ram and 50% of the memory to work gives you a fair amount of flexibility. driven by the needs of the application the % can be 20% too, it depends on how often you create entries, what and how often you look up fields and what joins are necessary to do that ... this gets really complex ... ;-)
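the rule of thumb above boils down to one multiplication. a minimal sketch in Python, assuming sizes are given in GB and the working fraction is the 50% (or 20%) mentioned; the function names are made up for illustration:

```python
def ram_needed_gb(index_gb, working_fraction=0.5):
    """Real memory needed: the indices plus a working area on top of them."""
    return index_gb * (1 + working_fraction)

def indices_fit(index_gb, ram_gb, working_fraction=0.5):
    """True when real (not virtual) memory covers the indices plus the working area."""
    return ram_gb >= ram_needed_gb(index_gb, working_fraction)

print(ram_needed_gb(4))     # 4 GB of indices with the default 50% working area
print(indices_fit(4, 5))    # is 5 GB of real memory enough for them?
```

with 4 GB of indices this asks for 6 GB of real memory, so a 5 GB box falls short under the 50% rule.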

Re:try a clean design by FonkiE · 2000-07-25 21:35 · Score: 2

you are totally right, i wanted to avoid your answer by my last addendum. i personally use very big tables with a few indices, because the data is structured like that, therefore the 50%.

if thats the RARE application im sorry. i thought mine is pretty common ;-)

my mail was mainly about thinking about the design, how you could use the memory you have for faster access and not about the exact %. (leaving everything to oracle is a non-hackers choice :-)

but i stay with the rule, that the indices used by common queries should if possible stay in memory ... (complex situations make this of course impossible)

ask slashdot: does anybody now have >1000 tables with >3 indices each ?

:-)
Re:try a clean design by twisteddk · 2000-07-25 20:25 · Score: 1

you need at least 50% more memory than the indices
I HOPE TO GOD You don't really mean that !

I often run Oracle DBs (also MS-SQL, Informix, DB2 and others). Most of our DB's are more than 30 GB, and almost ALL of them have at least 30% of that being indices. Some even go as high as 60%. Reason being: For complex searches You DON'T want to do an index scan, You just want to use an index that holds the information You are looking for. This means that You often will have 5, 6 or even more indices per table.
yeah, ok, some DB's know how to search through several indices and compare the info, but that's downright RARE to find. Most DB's either run rule-based optimised searches, or cost-based optimised searches. And neither of these will be any good if You only have ONE index per table

--
--- To err is human... Am I more human than most ?

IBM can handle it! by raffe · 2000-07-25 19:13 · Score: 1

DB2 is a great product and DB2 Universal Database runs on AIX, HP-UX, Linux, NUMA-Q, OS/2, OS/390, OS/400, Solaris, Windows 2000, Windows 95 & Windows 98 and Windows NT. Check it out.

You really mean 30 GB Database on Linux by DamageBoy · 2000-07-25 19:03 · Score: 5

Unix systems handle the largest databases known to mankind
as we speak.
Databases managed by unix systems have been known to be in
the vicinity of around 2-6TB.

Your question seems to refer to Unix on x86 databases that
have that size.
Of course, running Unix on x86 systems usually boils
down to running Linux...

Linux is officially supported by Oracle, Informix and,
I think, even Sybase, although I'm not completely sure
about that.

Obviously, running it on the same RDBMS would be easier
to accomplish, so you'd probably want Sybase to support Linux.

You'd also want RAID 5, preferably hardware which is supported
by Linux.

You'd probably want to use some sort of journaling file system.
I myself have no problem trusting the beta versions of ReiserFS.
I've also run Oracle on them without any problem.

If you feel reluctant about using bleeding edge kernel patches
in a production environment, I can only recommend that you use
SMALL ext2 partitions to avoid catastrophic FSCK times, and let
Oracle / the RDBMS do its magic in managing a single 30GB database
over smaller files...

Re:You really mean 30 GB Database on Linux by Tower · 2000-07-25 20:25 · Score: 1

RAID 5 with a decent controller (non-software RAID) is a very good performer. You won't get quite as much write advantage as RAID 0, but it's better for writing than RAID 1. Reading back is very fast, especially with a decent number of arms.

If you are planning on running a 30GB+ database, I'd hope you could shell out a few bucks for a halfway decent RAID controller... Even the ones with only a small cache perform admirably.

--
"It's tough to be bilingual when you get hit in the head."
Re:You really mean 30 GB Database on Linux by nakaduct · 2000-07-26 00:21 · Score: 1

[DPT controllers support] 256M of ECC cache RAM and optionally battery back it up (silly IMO).

It's only "Silly" until your UPS dies (or the card fails or your SCSI bus resets) while there are cached writes.
This is especially pernicious if another write is performed after the cached one, but is committed to disk (maybe on a different controller?). This can leave the database in an inconsistent state, rendering it unusable.
That'll fix your performance issues on RAID5.

No, it won't. If cache could fix arbitrary performance problems, then we'd all be using 1200rpm 15GB/platter drives in 100-member RAID5 sets which last forever and are almost free.
Cache will alleviate the performance problem for brief, small transactions. If you're moving more than 256MB through the controller (in either direction, remember that reads consume that cache, too) in less time than the disks can service it, then your I/O's become as slow as the disks. This is unavoidable and unfixable.
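A back-of-the-envelope sketch of that point, in Python. The 256MB cache size is from the post; the incoming and disk throughput figures are invented purely for illustration, and the model ignores reads competing for the same cache.

```python
def seconds_until_cache_full(cache_mb, incoming_mb_s, disk_mb_s):
    """Time a controller cache can absorb a burst before I/O drops to disk speed.

    Returns None when the disks keep up and the cache never fills.
    """
    surplus = incoming_mb_s - disk_mb_s  # MB/s piling up in the cache
    if surplus <= 0:
        return None
    return cache_mb / surplus

# Hypothetical rates: 60 MB/s arriving, disks draining 20 MB/s.
print(seconds_until_cache_full(256, incoming_mb_s=60, disk_mb_s=20))
```

With those made-up rates the cache is full after 6.4 seconds, and from then on every I/O waits on the disks, which is why more cache only helps with short bursts.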
RAID5 is best-suited for read-intensive environments, or cost-sensitive customers. It is not a high-performance solution. As others have said, RAID0+1 (striped mirrors) are the answer if you want fast and safe instead of cheap and safe.
cheers,
mike
Re:You really mean 30 GB Database on Linux by theonetruekeebler · 2000-07-25 21:43 · Score: 3

Whenever you build a database, you must look at how it will be used before you make physical layout decisions. The Asker here specified a data warehouse, which to me implies a DB which will be written to once then read hundreds of times afterwards. With an R/W ratio that high, write performance is only a minor consideration compared to read performance. While RAID-10 would give great all-around performance, for read access it won't do an awful lot better than RAID-5, which comes in at just over half the hardware cost.
So for a data warehouse I would not hesitate to do RAID-5.
As for mirroring, I can't speak for Sybase, but Oracle supports a wide variety of mirroring and networked DB options. I would look into something akin to snapshots, which are read-only copies of a master database. Designate one copy of the DB as the write-to master, and snapshot it over. Of course, this all depends on why you're mirroring. If you are doing this for redundancy in the event of catastrophe, look at your loss tolerance and acceptable downtime. You could do something as simple as making a copy of the database remotely, then copying over your redo logs at every log switch. Then if your database fails, use the redo logs to roll your remote database forward, and bring it on line.
World of possibilities.

--
This is not my sandwich.
Re:You really mean 30 GB Database on Linux by ostiguy · 2000-07-25 22:23 · Score: 2

You can throw up to a 64 mb EDO Dimm on them.
Re:You really mean 30 GB Database on Linux by e.+boaz · 2000-07-26 00:30 · Score: 1

DamageBoy wrote:

> You'd also want RAID 5, preferably hardware which is supported by Linux.

Actually - RAID5 can slow things down (even hardware), especially for databases. Check with your database software vendor for recommended settings. You are much better off wasting the disk space and using mirroring or mirroring and striping (which we do for ~100GB databases.)

I highly recommend using raw space for your database - it helps the performance of your machine greatly. Also, it will avoid caching the data twice, which is a waste of memory that could be put to better use.

Keep your raw spaces small, 512m - 1g should be your max depending on your disk size. Find out which tables are the most heavily hit (both read and write, but most especially writes) and spread them out among several disks.

If you can, make use of a volume manager.. it especially helps manage the 100's of small spaces...

--eli
Re:You really mean 30 GB Database on Linux by JonK · 2000-07-25 19:57 · Score: 1

If the biggest Unix RDBMSs are only in the single figure TB range (which I very much doubt) then they're by no means the biggest databases in the world: a banking system I worked on a few years ago had individual tables in the >terabyte range.
--
Cheers

Jon
Re:You really mean 30 GB Database on Linux by Omega996 · 2000-07-25 19:29 · Score: 1

Sybase ASE is available for Linux (in fact, it was one of the first, if not the first, commercial RDBMS available for Linux)...
I think RAID 10 (or 0+1, however you want to count it) would be a better choice for performance... RAID 5 is not a great performer, unless you're going to spend bucks on a caching controller that'll let you get around the small-write problem...
Re:You really mean 30 GB Database on Linux by Omega996 · 2000-07-28 03:54 · Score: 1

my bad - the poser of the question did ask about DW...
Re:You really mean 30 GB Database on Linux by mccrohan · 2000-07-25 21:15 · Score: 1

Unix systems handle the largest databases known to mankind as we speak.
Mmm. Possible, but I'm dubious. The 'largest databases known to mankind' have generally gotten that size by accumulating over many, MANY years; and they live on mainframes. For totally massive applications of brute force, there's still no beating Big Iron.
Re:You really mean 30 GB Database on Linux by duffbeer703 · 2000-07-26 07:05 · Score: 1

Pinching pennies on hardware is a bad idea. Linux does not yet support raw disk volumes, and the lack of enterprise-class volume managers like Veritas makes Linux a poor choice for enterprise DB's. In our shop, we use Linux or OpenBSD everywhere else, even for a few smaller databases.

Spend the bucks on Sun boxes, you will not regret it, especially when disaster strikes and your company is losing $6000 / hour.

As far as RAID goes, use RAID 0+1, if you cannot afford a few extra disks, you shouldn't be doing this in the first place. Basically, using RAID 5 for databases is very dumb. With RAID 5, controller failures can slowly corrupt your data before anybody notices. If you want to learn why RAID 5 and databases do not mix, go to www.iiug.org and search comp.databases.informix for "Art Kagel" and "RAID", you will find several excellent explanations about why RAID 5 is bad; Art is a guru and explains this topic very well.

If your goal is increased performance for read operations, try database mirroring. I am an Informix DBA, and we have the database engine mirror data chunks on our decision support databases. When the database takes care of mirroring, some read ops are offloaded to the mirror chunk. I do not know whether or not Sybase supports this, but it would not hurt to check.

--
Conformity is the jailer of freedom and enemy of growth. -JFK

Re:Size is not the issue by dublin · 2000-07-26 03:19 · Score: 2

Oracle over NFS to a NetApp Filer would work fine on a Sun or such. But despite HJ's valiant efforts, Linux's NFS isn't there yet. Linux is getting there, but there are still real good reasons to go buy those Suns if you've got a big, mission-critical problem.

--
"The future's good and the present is nothing to sneeze at." - Roblimo's last ./ post

Of course you can do it by sbeitzel · 2000-07-25 22:36 · Score: 2

You can absolutely do this. Now, depending on which version of Sybase ASE you use, you may run into some dumb limitations. For instance, version 11.9.2 has a 2GB limit for the size of a device, so you have to partition your disks into 2GB slices, and distribute your database across multiple devices. I think they increased the size in ASE 12, but I haven't worked with that yet so I don't know what the limit is there.

--
Oh, go on, check out my job.

Re:Why not Sybase on Linux? by jackmama · 2000-07-25 19:17 · Score: 1

Perhaps this is the link you were looking for?

Unless, of course, you wanted us to go to that Digital Couple website.

Re:Size is not the issue by ajs · 2000-07-26 19:38 · Score: 2

Are you seriously suggesting running Oracle on an NFS filesystem?

Not only am I suggesting it, but in timing tests, Oracle is performing a little better over the Filer! We were previously using a locally-attached diff-SCSI Sun A1000.

Oh, also, you can write your Oracle redo logs to the Filer, even though they recommend against doing so to anything other than flat disk or RAID 0/1. Why? Because the Filer uses a journaling filesystem in NVRAM, so the writes happen as fast as the wire (GigE in our case) can run.

would have to recommend against this. By buying hardware RAID and an appropriate filesystem add/on (e.g. Veritas File System) you can get all the benefits of the filer with all the benefits of local disk.

How long does it take to back up that local disk? For us, it takes about 2 seconds, and takes up almost NO STORAGE SPACE! The Filer has a feature called "snapshot" which is basically a copy-on-write filesystem. You tell it to snap and it comes back after a second or two. After that, you can always go back to that point in time and recover files on-line, without any sort of programmatic interface (just filesystem access). There is even an add-on package called snap restore that will instantly restore the entire filesystem to that previous state....

So, get this, our Oracle backup is: put all of our Oracle tablespaces in locked/suspended mode; tell the Filer to snap; unlock the tablespaces. Now if we ever need to restore, we just bring Oracle down, swap in the old data files, bring Oracle up. We can also do tape backups this way, as the Filer backup program uses snapshots. Thus, as soon as a backup is started, you can start writing to the data again safely!
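The ordering of those three steps is the whole trick, so here is a minimal Python sketch of it. It assumes the "locked/suspended mode" maps onto Oracle's ALTER TABLESPACE ... BEGIN BACKUP / END BACKUP statements; run_sql and take_snapshot are hypothetical callables standing in for whatever actually talks to the database and to the Filer.

```python
def snapshot_backup(tablespaces, run_sql, take_snapshot):
    """Hold tablespaces in backup mode only for as long as the snapshot takes."""
    started = []
    try:
        for name in tablespaces:
            run_sql(f"ALTER TABLESPACE {name} BEGIN BACKUP")
            started.append(name)
        take_snapshot()  # hypothetical hook: ask the storage box to snap
    finally:
        # Always release the tablespaces, even if the snapshot step fails.
        for name in started:
            run_sql(f"ALTER TABLESPACE {name} END BACKUP")

# Dry run that only records what would be executed.
log = []
snapshot_backup(["USERS", "DATA"], log.append, lambda: log.append("SNAP"))
print("\n".join(log))
```

The try/finally is a design choice of the sketch, not something described above: it keeps a failed snapshot from leaving tablespaces stuck in backup mode.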

Size is not the issue by ajs · 2000-07-25 19:14 · Score: 3

I run a database of this size, and it's not a challenge. Cost is very high, but that's mostly because a database of that size is one that you cannot afford to have to restore.

I currently use a Sun architecture, but I know of sites that use Intel/Linux, HP PA/RISC and even (may all the little gods help you) Intel/MS/SQL server which does have its place in non-mission-critical places where you're never going to have a good DBA.

I can seriously recommend the Network Appliance Filer for back-end storage. Their claim that their network-attached storage array is faster than local disk sounds silly on the face of it, but there are good and valid reasons that it's true (mostly due to their journaling and caching strategy which is highly optimized for NFS). The Filer makes databases a lot easier to manage. For example, the Filer can make an online backup in less than 5 seconds, no matter how much data you have!

Back to your original point: 30GB is small, don't sweat it. But, don't cut corners either!

Re:Custom built machines by Tower · 2000-07-25 20:34 · Score: 1

Heck, even a 'low-end' F80 (1-6 500MHz copper Power-III CPUs with up to 16GB of RAM) would be able to take on PC hardware...

--
"It's tough to be bilingual when you get hit in the head."

2GB filesize limit by antiher0 · 2000-07-26 18:54 · Score: 1

According to Joe Pranevich's Wonderful World of Linux 2.4 (Final Draft) under the heading "Linux Internals"...

In addition, support for more powerful hardware is provided in the new kernel which now supports 64 gigabytes of RAM on Intel hardware, up to 16 ethernet cards, 10 IDE controllers, multiple IO-APICs, and other pointless abuses of good hardware. The 2 gigabyte file size restriction has also been lifted.

Re: 30 gig no problem in HP/UX at least by thing12 · 2000-07-25 19:15 · Score: 1

No need to even think about this one - on HP/UX at least Oracle can handle db's in the TERABYTE range. Oracle is usually configured to use raw partitions, so you just add more table space by creating more partitions until your disk is gone... and then you add more disk!

Re:Custom built machines by thing12 · 2000-07-25 19:18 · Score: 1

mysql is not a real database.... it's a filesystem on drugs. don't get me wrong, I use it! It has its place and a 30 gig database is NOT it.

Re:Interoperability and limits by Starselbrg · 2000-07-26 11:31 · Score: 2

Is that 2 GB limit a 32-bit limitation or is that limitation also present on 64-bit machines?
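As a rough illustration of where a 2 GB figure would come from: if file offsets are held in a signed 32-bit integer, the largest addressable byte is one short of 2 GiB, while a signed 64-bit offset is far beyond any disk. The Python sketch below only does that arithmetic; it assumes the limit is an offset-width issue and says nothing about what a particular kernel or architecture actually uses.

```python
def max_file_offset(offset_bits):
    """Largest value representable in a signed integer of offset_bits bits."""
    return 2 ** (offset_bits - 1) - 1

for bits in (32, 64):
    print(f"{bits}-bit signed offset: {max_file_offset(bits)} bytes")
```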

--
Got HTML? Want LaTeX? Try html2latex

Re:Clarity of Expression by superlame · 2000-07-25 20:17 · Score: 1

The question of course is what industrial strength databases are supported on FreeBSD? I don't think Oracle, Sybase, or DB2 are. That just leaves interbase, and I don't really hear too much about interbase.

--
-- Superlame http://catpro.dragonfire.net/joshua/

Tiny! by RallyDriver · 2000-07-26 11:43 · Score: 1

30GB is pretty modest for a database; you can get home PC hard drives double that size. A big database is measured in Terabytes.

A wee Pentium server with one of those little hot plug SCSI trays is fine - just be sure not to use RAID-5 on your log drives :-)

Run Oracle on Linux or even En-Tee - beware that on Linux there is a 2GB per file limit which may constrain your layout.

30GB...Try 2 Terabytes by MooseMunch · 2000-07-25 19:50 · Score: 1

yes, you read it right....
We run an Oracle DB on Sun equipment (the DB is on 1, uno, one, singular, machine). Yes it's a Sun enterprise level 64way with 40GB of RAM, but our DB is over 2 Terabytes.

So to answer your question...yeah, 30GB, no problem. When we have ext3, you could even do it on linux :)

Wait for next kernel release by La+Camiseta · 2000-07-26 01:46 · Score: 1

I would wait until the next version of the kernel came out unless you're planning on using raw file devices. Linux kernel 2.2 and below have a 2 GB filesize limit which will be removed in kernel 2.4.

30gig is SMALL by CountZer0 · 2000-07-25 22:15 · Score: 2

I work for France Telecom, as the SysAdmin for Voila.com

We use Linux exclusively on our servers. (Well, except for one lil box running NT to interface with Reuters, because they refuse to make their proprietary client for Linux)

Our current database is around 4 Terabytes. It sits on about 80 servers all running Linux.

Admittedly, we use a custom database package, developed in house, and not an RDBMS, but when you're dealing with such a specific dataset (we index web pages... that's it...) you don't need the flexibility of Sybase.

Then there's Google... How many thousands of Linux boxes are they running? How huge is their database?

So yes, Linux is more than capable of handling a puny 30 gig database. Heck, I have more than 30gigs of data indexed on my HOME machine. (30gigs of MP3's all indexed and cataloged with Postgres) not quite the same as a "30 gig database" but similar.

clarification by theonetruekeebler · 2000-07-26 22:19 · Score: 1

This will teach me to use the preview button--and not to skip my morning coffee.

With the cost of the RAID cabinets being equal, RAID-5 requires half plus one as many disks as RAID-10. Eight 9GB SCSI drives at RAID-10 will yield 36GB storage striped across four mirrored pairs of drives; to get 36GB RAID-5 storage, you need five 9GB disks with each stripe's parity information alternating among the drives.

A good RAID-10 setup will be able to read different data from each mirrored drive simultaneously, creating a potential 100% read performance advantage over RAID-5 or simple striping--200% for three-drive mirroring if you're that rich. Realistically, though, it comes out to a lower number whose upper limit is defined by the SCSI channel's throughput, and insert-your-bus-architecture-here's bandwidth, and your computer's general ability to keep its shit together.

Probably the best advantage RAID-10 has is that you will probably put each RAID-0 on its own controller, which means that in addition to being able to survive a drive failure, you could live through a controller failure as well. Redundancy is your friend.

Okay enough rambling. This was supposed to be a simple clarification that said "RAID-5 costs less than RAID-10, not the other way around."
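The disk counts are easy to check with a short Python sketch. It assumes identical disks, a single RAID-5 parity disk's worth of overhead, and two-way mirroring for RAID-10; the function names are made up for illustration.

```python
def data_disks(usable_gb, disk_gb):
    """Disks needed to hold usable_gb with no redundancy (ceiling division)."""
    return -(-usable_gb // disk_gb)

def raid10_disks(usable_gb, disk_gb):
    """RAID-10: every data disk gets a mirror."""
    return 2 * data_disks(usable_gb, disk_gb)

def raid5_disks(usable_gb, disk_gb):
    """RAID-5: one extra disk's worth of capacity goes to parity."""
    return data_disks(usable_gb, disk_gb) + 1

print(raid10_disks(36, 9), raid5_disks(36, 9))
```

For 36GB on 9GB drives that gives 8 disks for RAID-10 against 5 for RAID-5, which is the "half plus one" ratio.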

--
This is not my sandwich.

Re:"Not mission critical?" by be-fan · 2000-07-26 00:49 · Score: 2

That's not what "not mission critical" means. Not mission critical means that it is okay if you have to do a reboot and the server is down for 10 minutes. It means that you don't have to have a redundant cluster to make sure that if one goes down you don't need to take the database offline. Losing an entire 30GB database and having to reload it is unacceptable under any circumstances. Treating it as mission critical usually means cost takes a back seat to having 100% uptime. I don't think that's what he needs.

--
A deep unwavering belief is a sure sign you're missing something...

Yes, but by strombrg · 2000-07-25 22:28 · Score: 1

Of course unix can handle it - a LOT of people do this sort of thing on Solaris/sparc, for example.

I expect Linux would be able to handle it too, but don't expect the same throughput per MIPS from Linux/x86 as you'd get from Linux/sparc or Linux/alpha. Intel and AMD have great CPU performance for the price, but they aren't that much of a server architecture.

Solution Found! by Danborg · 2000-07-25 20:09 · Score: 1

You really need to talk to EMC.
They have a high performance disk storage array called Symmetrix, which is pretty cool in its own right. However, what makes it REALLY REALLY cool is that they sell it with a software package called Symmetrix Remote Data Facility (SRDF). SRDF allows you to copy/mirror data to an offsite Symmetrix array that can be located anywhere in the world! This is the software that all the large companies use to provide their "disaster recovery" site at another geographical location.

Re:Solution Found! by Danborg · 2000-07-25 20:45 · Score: 1

"The problem is, EMC is expensive. I don't think you can get anything from them for under six digits, and I'd be surprised if it was much under seven."

Actually Richard, that's no longer true. EMC purchased Data General back in October of 1999 and thus acquired their CLARiiON line of storage products. These products are aimed at distributed environments and are actually quite affordable.

Not to mention the fact that EMC will let you lease their equipment if you absolutely have to avoid the upfront acquisition costs.
Re:Solution Found! by richardbowers · 2000-07-25 20:34 · Score: 1

I'll second this. One of my first jobs was at a large telecom company, which was doing a 6TB database (back when that was a lot). They tried various storage solutions, but nothing worked well. They finally settled on EMC, and their problems went away... At my last job, we had 15 boxes, each box sharing about 1.5 tb of storage, all on EMC. The boxes went down often - the EMC only had one problem, and the EMC box used a spare phone line to call EMC and ask for help before the problem became critical. The problem is, EMC is expensive. I don't think you can get anything from them for under six digits, and I'd be surprised if it was much under seven.

--
Law is whatever is boldly asserted and plausibly maintained. -- Aaron Burr

Re:raw partitions by java.bean · 2000-07-25 19:45 · Score: 1

Oracle can use raw partitions, it doesn't have to. Last I heard, the use of raw devices wasn't recommended under Linux...I'll check the release notes again.

--jb

Re:30Gb databases by java.bean · 2000-07-25 19:07 · Score: 2

Just FYI: a 30GB database doesn't imply one file. I have a 10GB+ Oracle on Linux database right now; Oracle organizes data into tablespaces which contain one or more data files. The data files can be spread over any number of partitions; in fact for performance it's better to spread them over multiple disks.
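Spreading data files over disks is basically a round-robin assignment. A minimal Python sketch, with made-up file names and mount points (nothing here is Oracle syntax or an Oracle tool):

```python
from itertools import cycle

def place_data_files(file_names, mount_points):
    """Assign each data file to a mount point in round-robin order."""
    placement = {}
    for name, mount in zip(file_names, cycle(mount_points)):
        placement[name] = f"{mount}/{name}"
    return placement

# Hypothetical layout: three data files over two disks.
for name, path in place_data_files(
    ["users01.dbf", "users02.dbf", "users03.dbf"], ["/u01", "/u02"]
).items():
    print(name, "->", path)
```

A real layout would also weigh which tablespaces are hit hardest, but an even spread is a reasonable starting point.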

Now what I'm doing isn't mission critical, so I can't comment on that aspect of it, but I will say this: a 30GB database will certainly require more than 1GB of memory.

--jb

Re:Some minor problems to look out for... by matthead · 2000-07-25 20:35 · Score: 1

With respect to journalling filesystems...

At my site, we've been using ext3 on production NFS servers for almost a month now, with no trouble in terms of stability. Disk I/O has suffered a big performance hit (less than 1/2 previous performance), but for the better filesystem reliability, it's worth it.

The 2GB file size limit is your biggest problem. If you're going to go with Linux for sure, look into DEC- I mean, Compaq- Alpha hardware. It's a 64-bit architecture, so that limit shouldn't exist there. I haven't ever actually used Linux on Alpha though, so I cannot guarantee that.

As far as the software RAID bit...

The ext3 patch is against kernel 2.2.17-pre9, so we're sticking with that for now. No development kernels for us, here. Once Stephen Tweedie's ported it to a moderately stable 2.4-test, I'll look at giving that a shot. From what I understand NFS performance has increased significantly there. Don't know about RAID. Don't know about databases, either, but if that will put a moderate load on your computer, you ought to look at hardware RAID in any case- you'll get better performance by far, if you take the RAID load off the main CPU.

--

-Matthead

Re:No problem by matthead · 2000-07-25 20:41 · Score: 1

Not a flame, I'm serious:

People in this thread who are talking about 2 gig limits and storing database tables in files have no fscking clue how to help you with your problem...

Then please, tell us how it is. I understand that high availability and outright performance is probably going to be more a concern than cost.

I don't know much at all about working with databases of any significance. Is the data stored in separate files, or when he says a 30GB database, does that mean a 30GB file?

How much is in RAM? If the database server were to crash, would all the changes made be sync'ed to disk? Would you have lost data when you do get the computer back up? How does this depend on the platform the database runs on?

What other concerns are there? Can you give us a "fscking clue?"

--

-Matthead

Re:Three words:with three words by bradleyjg · 2000-07-26 10:32 · Score: 1

"4. If Neither of the solutions in 3 is implementable you have to open wide you wallet and buy informix for Intel or DB2 for intel. Both of them work and are ANSI compliant. In btw DB2 for Intel linux developer edition is free. Free period. No expiration. So you can actually see if the database will work. And they match Oracle on some benchmarks and DB2 beats the crap out of it when it comes to real scalability and clustering."

Ah slight modification - Personal Developer's Edition is free. This lets you develop when you want to deploy on Personal Edition. DB2 Universal Developer's Edition is $499 (currently on sale - normally $999) not a ton of money but the point needs to be made.

Re:Three words: by Paranoid+Diatribe · 2000-07-26 00:16 · Score: 1

The new oracle pricing model isn't based upon users anymore. You basically pay $15 per MHz of cpu speed you have, at least for the standard edition. The enterprise edition rings in for quite more. For instance, if you have oracle running on a dual PIII at 700MHz, that's 1400 total cpu points.

Somebody please tell me that this is complete bullshit. Firstly, I can't even fathom a company being this arrogant about its own product. I thought that Microsoft's "per seat or per head, whichever is greater" Client Access Licensing was absurd. But more importantly, I can't believe that people would actually buy into a license like that.

Have software vendors stooped that low? (Well, I guess they have if MS wants us to "rent" its software for a monthly fee in the near future...)

PC hardware does this, easy by rlglende · 2000-07-25 22:07 · Score: 1

I am a consultant - programming, sysadmin, ...

I know of several large web sites built entirely with PC hardware. (Walk around above.net in SJ and most cages don't have any SUN, SGI, ... equipment, only PCs.)

Largest is 2M unique visitors per month, 20+M pages per month. 30GB database.

Hardware is dual 450MHz Pentium IIs, 2GB ECC DRAM,
Mylex external RAID controllers for 2 chassis of
9GB IBM SCSI disks.

Software is Solaris/Oracle. Runs in 'recovery mode' (I am not a DBA) with log files copied to
another system between DB backups.

Uptime is good. Main problems in the last 2 years have been Mylex controllers and a failed
system disk in the PC chassis. Solaris provides
software mirroring to avoid this kind of problem
next time.

Disk I/O is the bottle-neck. More DRAM for caching is first improvement to be done, followed by next generation of RAID controllers with lots more cache, followed by more disks/heads.

Lew

--
"The Constitution, the WHOLE Constitution, and nothing but the CONSTITUTION."

Of course Linux/Unix can handle 30 GB by fence · 2000-07-26 03:24 · Score: 1

In the recent past, I've worked on Sybase databases that were in the hundreds of gigabytes on unix.

Currently, I work on small databases in the 8 to 20 GB range.

I've got a dual processor box at home with 512 meg of memory running Sybase on Linux and I've got a couple of 10+ GB databases loaded there.

So, don't sweat the small stuff.

my advice, get as much memory as you can afford/use. RDBMSes love memory!
--
Interested in the Colorado Lottery or Powerball games?
check out http://colotto.com

Look better for a decent UNIX offer by Baki · 2000-07-25 20:40 · Score: 1

How can the UNIX offer be 5 times more expensive?!?

I work in a large bank, on an 80GB data warehouse (mirrored, so 160GB of disk space). An internal competitor uses NT (Compaq) with SQL Server for a similar (but smaller) application; we use Solaris/Sparc with Oracle.

We are constantly being judged on cost/performance by others. Recent comparisons showed that an Intel solution (Compaq) would be 30% cheaper. In return you get CPUs with smaller cache and generally less reliability, and it is questionable whether our app could run at all on Compaq.

Note that the 30% difference only accounts for the hardware cost Sun/Sparc vs. Compaq/Intel.

Taking into account the OS cost (NT versus Solaris) it is sure that NT would become much more expensive, since Solaris is included with the hardware, and NT licenses for such large applications are extremely expensive. Not to mention the extra system administration costs that NT would cause.

As for Linux/Intel? I would not do it. As mentioned you can gain maybe 30% on HW cost, but for that you can be sure that Linux cannot handle load and scale like Solaris/Sparc can.

Re:Journaling File System by n3bulous · 2000-07-25 20:50 · Score: 1

There really shouldn't be a need for a journaling file system since the sql server basically does this already through the transaction log.

That may be so, but after a FS crash a 30GB EXT2 fsck will take what seems like forever.

Actually, time seems to stop after a crash as you s**t yourself worrying about a successful recovery and catching it from your manager and everyone else relying on that DB.

The quicker it comes back, the better you'll feel. A JFS will help very much. But in this case, since it is used as a backup DB, EXT2 will be fine.

--
"The area of penetration will no doubt be sensitive." ~ Spock

The biggest data warehouse today... by Cushman · 2000-07-25 22:17 · Score: 2

Wal-Mart's Teradata data warehouse is one of the biggest (if not the biggest) data warehouses in the world. You can read about it at NCR's website. In the article, they say it is 7.5 terabytes, but from what I have heard, they now have two warehouses that total 110 terabytes.

It runs on NCR's 5200 system, which is based on Intel architecture. It scales up to 512 nodes, with 1-4 Intel processors per node.

The operating system is NCR's MP-RAS (a flavor of UNIX that runs on Intel architecture). I'm not sure if it runs Linux ;-)

*disclaimer*
I _do_ work for NCR, but I just thought this was some neat information. I don't work in our data warehousing department. The system above would cost many millions of dollars, so it's out of the range of the average /. reader, and if you are going to spend that money on a data warehouse, chances are you are talking with NCR anyway.

Re:Of course it can by MindOpen2 · 2000-07-25 22:01 · Score: 1

Sybase uses a concept called "Devices". Sybase supports up to 256 devices per server. If each "device" was really a 2gig disk or a 2gig file, then that would be 2gig * 250+ (you have to subtract for certain system devices that are already used) of available space. To support a 30-40gig DB, you would only need about 15-20 devices (or 2 gig files). We have done this on Linux already and it is quite easy. Of course, all the usual stuff about administration is in effect (i.e., backups, dbcc's, etc.) but with Sybase it's extremely easy for a single admin to administer many DBs (and I won't even get going on the Database "Holy War" between Sybase and the other BIG relational DB companies).
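
The device arithmetic above can be sketched in a few lines. The 256-device limit and the 2 GB device size come from the post; the number of devices reserved for system use is a made-up placeholder, since the post only says "250+" remain.

```python
import math

DEVICE_SIZE_GB = 2       # from the post: one device per 2 GB disk or file
MAX_DEVICES = 256        # from the post: device limit per server
RESERVED_DEVICES = 6     # assumption: placeholder for system devices


def devices_needed(db_size_gb, device_size_gb=DEVICE_SIZE_GB):
    """Number of fixed-size devices required to hold a database."""
    return math.ceil(db_size_gb / device_size_gb)


def max_capacity_gb(max_devices=MAX_DEVICES, reserved=RESERVED_DEVICES,
                    device_size_gb=DEVICE_SIZE_GB):
    """Upper bound on space if every remaining device is 2 GB."""
    return (max_devices - reserved) * device_size_gb


if __name__ == "__main__":
    for size in (30, 40):
        print(f"{size} GB database -> {devices_needed(size)} devices")
    print(f"ceiling with 2 GB devices: {max_capacity_gb()} GB")
```

Under these assumptions a 30-40 GB database uses well under a tenth of the available devices, which matches the "15-20 devices" estimate.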

--
-- Even racing cars don't crash as much as windows. --

Why not Sybase on Linux? by JonK · 2000-07-25 18:55 · Score: 1

As far as I know, Sybase have been doing Sybase AS (what SQL Server morphed into) on Linux for a good year or so now: wouldn't the obvious call be to move to that?

To any OSS/Free Software advocates: trying to do this on MySQL is a Bad Idea.

Oh, and as a sidenote, 30GB is a Very Small database: I've had SQL Servers with terabyte-sized databases.
--
Cheers

Jon

Re:Why not Sybase on Linux? by steelhawk · 2000-07-25 19:12 · Score: 1

To any OSS/Free Software advocates: trying to do this on MySQL is a Bad Idea.

Then, how well would PostgreSQL handle this? ;)

--
Ner lbh sebz gur HFN? Gura lbh'ir whfg ivbyngrq gur QZPN!

Re:Why not Sybase on Linux? by steelhawk · 2000-07-25 20:05 · Score: 1

LOL... oops, sorry...
I accidentally typed .com instead of .org...
Won't happen again.. I almost promise!! =)

--
Ner lbh sebz gur HFN? Gura lbh'ir whfg ivbyngrq gur QZPN!

Re:Why not Sybase on Linux? by -brazil- · 2000-07-25 18:58 · Score: 1

To any OSS/Free Software advocates: trying to do this on MySQL is a Bad Idea.

Namely because MySQL stores its tables in files, and 64bit file pointers (necessary for files larger than 4GB) are still a hack job.
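
A rough sketch of where the file size ceiling comes from: the largest file an offset can address depends on the pointer width and on whether the offset type is signed. The function name is made up for illustration. A signed 32-bit offset tops out at 2 GB (the limit mentioned elsewhere in this thread) and an unsigned one at 4 GB.

```python
GIB = 2 ** 30


def max_file_size_bytes(bits, signed=True):
    """Largest file size addressable by a file offset of the given width."""
    return 2 ** (bits - 1) if signed else 2 ** bits


if __name__ == "__main__":
    table_bytes = 30 * GIB  # a single 30 GB table file, as in the question
    for bits, signed in ((32, True), (32, False), (64, True)):
        limit = max_file_size_bytes(bits, signed)
        kind = "signed" if signed else "unsigned"
        fits = "fits" if table_bytes <= limit else "does NOT fit"
        print(f"{bits}-bit {kind}: limit {limit // GIB} GiB, 30 GB file {fits}")
```

Either way, a single 30 GB file needs 64-bit offsets, so a database that keeps each table in one file hits the limit long before one that spreads data over many files or raw devices.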

--
The illegal we do immediately. The unconstitutional takes a little longer.
--Henry Kissinger

Re:Custom built machines by JonK · 2000-07-25 19:08 · Score: 1

*cough*Real databases*cough*

Here's a nickel - go and get yourself a clue.
--
Cheers

Jon

Re:Wouldn't go with Linux myself by JonK · 2000-07-25 19:15 · Score: 1

Not necessarily: if you don't need to have both servers absolutely lock-step, you can ship logs from the live one to the fail-over one. While this may cost you your last few transactions in the case of a crash, it has the advantage that you can keep two servers reasonably closely sync'ed down a kilostream link (64kbps).

Now, if you want proper two-way transactional replication (multiple publisher/subscriber model) then that's gonna cost you. And it's also a bitch to keep running on anything less than a dedicated cross-over cable between two fast NICs (been there, got the t-shirt AND the ulcers AND the hair-loss)
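
To put a number on the log-shipping option above, here is a small sketch of how much transaction log a 64kbps link can carry. The efficiency factor is an assumption (1.0 means no protocol overhead, which is optimistic), and the function name is made up for illustration.

```python
LINK_BITS_PER_SECOND = 64_000   # the 64kbps kilostream link from the post
SECONDS_PER_DAY = 24 * 3600


def max_log_bytes(seconds, link_bps=LINK_BITS_PER_SECOND, efficiency=1.0):
    """Upper bound on log bytes that can be shipped in the given time.

    efficiency is an assumed fraction of raw bandwidth left after
    protocol overhead; 1.0 is the theoretical best case.
    """
    return link_bps / 8 * seconds * efficiency


if __name__ == "__main__":
    per_day_mb = max_log_bytes(SECONDS_PER_DAY) / 1_000_000
    print(f"best case: {per_day_mb:.1f} MB of log per day")
```

If the live server generates log faster than that on a sustained basis, the fail-over copy falls steadily further behind.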
--
Cheers

Jon

Re:No.. Three words by JonK · 2000-07-25 20:09 · Score: 1

It's the difference between

knowing what you're talking about and

being aware that Oracle is an RDBMS and is available on Linux, and hence feeling the urge to say something - anything - however irrelevant.
--
Cheers

Jon

Re:not totally irrelevant by JonK · 2000-07-25 22:32 · Score: 1

True, but let's face it: shifting the data over's the least of your worries.

Once you've got a big data puddle on your new server, you're going to have to recreate all the TSQL stored procedures as packages. You're quite possibly going to have to rewrite significant amounts of either the clients (if it's a C/S system) or the middle tier (tiers?). You may need to roll out new data-access libraries across all your clients (not an undertaking to be dismissed lightly on anything but the smallest of LANs). You're going to find that unless the whole thing's been put together without using a single vendor-proprietary extension to ANSI SQL (probably the 89 version, 'cos SQL 92 support isn't ubiquitous) you're SOL.

And, finally, once you've done all that you're going to find that performance optimisations which worked well on one platform turn your database into a dead dog on a second (for an example, compare the performance of Informix and MS SQL Server on cursors: Informix screams, MS SQL Server runs like a geriatric full of Largactil) - this is the big problem with point releases: they tend to break your carefully-honed performance optimisations, which is why you run them on a testbed first, then roll them out only when you've worked the kinks out about six months after release. Remember kids, release-early-release-often doesn't work in the world of databases - no DBA worth his or her salt wraps anything even vaguely unstable around his or her data (or at least if they do, they'll be looking for a new job immediately afterwards if I've got anything to do with it).
--
Cheers

Jon

Re:Interbase? by JonK · 2000-07-25 22:46 · Score: 1

Who mentioned Access?

I've had 1.8 TB of data in a DSS database on MS SQL Server: this was indexed up the wazoo and loaded in batch from the sister OLTP system every night (and the load process was deeply fun...). The devices all lived away out on the (rather big) SAN and the backup hardware was a sight to behold...

Obviously my experience is no match for your opinion <g> - oh, and the day Oracle open their source is the day I see pigs flying. C'mon, this is Larry Ellison we're talking about.
--
Cheers

Jon

Re:Excuse me! by JonK · 2000-07-26 17:39 · Score: 1

You said:

I've heard several stories about Microsoft crashing when serving more than 30 clients on a Microsoft Access 97 database

Now, what I was discussing was TB-sized SQL Server databases (both MS and Sybase, BTW). In this context Access was a red herring. And FYI, I've had 400 - 500 users backended onto a MS SQL Server database from all across Europe (Smalltalk client, SQL Server 6.5, NT 4 Server) connecting over everything from 64KB frame relay upwards. The problems we had were exclusively with the (extremely badly-written) clients: the servers were stable.

Like I said in my previous post, go away and find out about what you're talking about. I'm posting from experience of running big SQL Server installations, you're posting your (unfounded) opinions. You've worked with big databases, you say: well, so've I (mail me privately for the full list if you're interested) and since you're so DBaware, you'll also be perfectly aware that what's true of one big database isn't true of another.

Oh, and try and relax a bit.
--
Cheers

Jon

Two words by JonK · 2000-07-25 19:23 · Score: 4

Bad Idea.

Changing RDBMSs is a Really Painful Experience and one to be avoided at all costs if possible: it makes changing OSes look trivial (hell, even upgrading from one point release to the next can be a world of pain). If the data's already on Sybase then for god's sake keep it on Sybase. Go for Sybase on Linux, Sybase on SCO, Sybase on NT or whatever but remember: it's an RDBMS and the underlying platform is effectively irrelevant (pauses for flames as thousands of enraged Slashdotters start to spout off and steam at the ears)
--
Cheers

Jon

Re:Two words by john_many_jars · 2000-07-25 21:38 · Score: 1

Oh yes, Sybase has these. Believe me, they work differently and the migration turns out to be a complete rewrite. Stick within the same DBMS and the migration is easier. Since Sybase for Linux is free to develop on, it is possible to test it before purchasing to see if it will handle the data.

Re:Two words by stephenbooth · 2000-07-25 20:17 · Score: 1

As I see it there are 3 main issues with migrating from one RDBMS to another:
- Migrating data.
- Migrating packages (ie processes such as triggers that run within the database).
- Skills transfer.

On the first one you really have two choices. If you can extract the data as Comma-Separated-Variable or fixed field width files then you can use SQL*Loader to perform the upload. Alternatively Oracle do supply a Migration Workbench product that can help semi-automate the process; more details can be found here.
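
The extract step of the first choice can be sketched with Python's standard csv module. The rows here are made-up sample data, and the loader side (the SQL*Loader control file) is not shown. The point is that embedded commas and quotes need consistent quoting, or the load on the other side will misparse them.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize rows to CSV text, quoting fields only where needed."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical rows: a comma inside a name, an embedded quote, a NULL.
    sample = [
        (1, "Smith, J.", "2000-07-25"),
        (2, 'O"Brien', None),
    ]
    print(rows_to_csv(sample), end="")
```

Note that a NULL comes out as an empty field here, so the loading side has to be told how to treat empty fields.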

Packages running within the database (I must admit I don't know if Sybase has these) will probably need to be rewritten for the new RDBMS. In favour of Oracle, it is now possible to write these in Java, as a JVM is now included as part of the basic install of the server. I believe that it is Java 2 as of Oracle 8.1.6 but you would have to confirm this with Oracle themselves. Release 3 of Oracle 8i definitely supports the Java 2 API and includes XML support and Apache bundled within the database according to this page. Try searching the Oracle Corporate Website for further details.

Oracle uses SQL-92 (ie ANSI) SQL for those areas that it covers; as has been quite rightly pointed out, the extensions will differ from RDBMS to RDBMS. There are a lot of very good books available for Oracle which cover everything from introducing a total newbie up to assisting someone skilled in another RDBMS to transfer to Oracle. Try O'Reilly or Amazon for some good examples.

I hope that this is helpful

Stephen
--
"Don't write down to your readers, the only people less intelligent than you can't read" - Sign on Newspaper Office Wall

SUN SPARC E3500 by $nyper · 2000-07-26 05:40 · Score: 1

Currently I have a 58GB Oracle database running on a Sun SPARC E3500. The drives sit in an external Sun A1000 storage unit and they are configured in a mirrored RAID 5 array. Our volume is currently capable of holding up to around 85GB.

During peak load of 425 concurrent users all slamming the system at once I am only using about 35% of the system's resources. This server has been up since birth with no down time for 8 months now. A properly configured Unix server is more than capable of handling your data size and work load for SQL databases.

--
"Help me Obi-/.-Kenobi,your my only hope!" -$

MySQL thinks very highly of itself.... by .havoc · 2000-07-25 19:58 · Score: 1

http://www.mysql.com

tcx claims to be running some giga-huge db on their linux based computer and having never had a problem.....

I dun'know....

--
Don't you think it's time to start communicating?

Re:MySQL and data warehousing don't go together by .havoc · 2000-07-25 23:02 · Score: 1

I'm not going to argue with you about using an open source DB for any given application -- that's a waste of bandwidth.

However, Chapter 15 of the MySQL manual explains how to add new functions and procedures....

--
Don't you think it's time to start communicating?

As I remember from old documentation by mr · 2000-07-25 20:24 · Score: 1

SETI was using postgres for data storage. 50+ terabytes of data.

--
If it was said on slashdot, it MUST be true!

I've got customers running .5TB DBs on FreeBSD by ericr · 2000-07-25 21:11 · Score: 1

without even straining the box. $15k for all the h/w and installation, setup, etc. Performance is pretty good, too. FWIW, we tried this using red hat first, but it wouldn't deal with the i/o very well. Still, the important thing is that the project stayed in the open source community.

--
It was Judge Woodlock, in the US District Court for Massachusetts, with a gavel.

Re:More on Partition Sizes by bluetoad · 2000-07-26 06:44 · Score: 1

I would also recommend that you adjust the parameter for the maximal mount count on each of the partitions so that they are staggered. That is, so that the maximal mount count is not reached at the same time on all of the large partitions. (I think it's tune2fs)
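
A small sketch of the staggering idea: generate one tune2fs command per partition, each with a different maximum mount count (the -c option), so the forced fscks do not all land on the same reboot. The device names, base count and step are made-up values, and the script only prints the commands rather than running them.

```python
def staggered_tune2fs_commands(devices, base_count=20, step=3):
    """Return tune2fs commands giving each device a different max mount count.

    base_count and step are arbitrary illustration values.
    """
    return [
        f"tune2fs -c {base_count + index * step} {device}"
        for index, device in enumerate(devices)
    ]


if __name__ == "__main__":
    # Hypothetical partitions holding the database files.
    partitions = ["/dev/sdb1", "/dev/sdc1", "/dev/sdd1"]
    for command in staggered_tune2fs_commands(partitions):
        print(command)
```

Review the printed commands before running them as root; the right counts depend on how often the box is rebooted.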

When you create your partitions you might also want to adjust the number of possible inodes.

30G is nothing by wharfrat · 2000-07-25 19:37 · Score: 1

30G is nothing. There are hundreds of thousands of SAP installations on Unix with 300G (yes three zeros) and more. You know what DB they run. Oracle.
SQL 7.0 is based off of the single user PC environment and has been scaled up to the enterprise server. I am not saying it is not a good DB. It has not outperformed Oracle yet. That is a fact.
Oracle was designed for enterprise DB and scaled down to PC servers.
If anyone is interested in the fastest platform for enterprise ERP go to http://sap.com. I was surprised to find it is Oracle on Linux. It beat out Oracle on HPUX, Oracle on NT and SQL 7 on NT.

Re:30G is nothing by mrfiddlehead · 2000-07-25 20:55 · Score: 1

... SAP installations on Unix with 300G (yes three zeros

I see four zeros, oh wait, never mind, I'm still drunk.

--
:wq

For real power... by TrailerTrash · 2000-07-25 21:56 · Score: 1

Use DB2. My company's system is 35TB running DB2 on AIX. I realize it's not Linux/Intel, but the same truly industrial strength DBMS is available for Linux, from an early proponent of Linux (IBM).

Custom built machines by Relic · 2000-07-25 19:02 · Score: 1

We have found that a high spec x86 machine can be built a lot cheaper, and in many cases will outperform the large brand name servers *cough*RS6000*cough*. A 4 unit high rack case will easily accommodate a good quality server motherboard (Intel or maybe Tyan) and will allow for a DUAL PIII configuration and, depending on your board, between one and two gig of RAM. My db program of choice would be mySQL.

Re:Custom built machines by Pinball+Wizard · 2000-07-25 22:34 · Score: 2

I see several Compaq machines outperforming RS/6000s on tpc.org.
My recommendations if you are on a budget: Stick with Linux and Sybase and get some vendor support. Definitely stick with x86 hardware since you are on a budget. The size of the database is less important than the actual design. How much data is going to be used at any one given time? Figuring that out will tell you if you need to add another gig or three of RAM. A good dual-processor machine should be sufficient, perhaps a quad if there are lots of simultaneous users. Bottlenecks in a database are rarely at the CPU.
IMHO, you should concentrate on your RAID setup. Get ~20 4GB disks and set them up with RAID 10 (full mirroring + striping). That alone is going to give you much, much better performance than a solution with, say, 4 20GB disks. At $200 per disk this will run you about $4,000. Paying careful attention to this will get you your best database performance while still spending a hell of a lot less than you would with an RS/6000.
You need the performance, but obviously you can't fit the whole database in RAM. So get a good RAID controller and buy as many small disks for it as you can.
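
The disk arithmetic above, as a rough sketch. It assumes plain RAID 10, where half the disks hold mirror copies, and uses the $200 per disk figure from the post for both layouts purely for comparison (larger disks would not really cost the same). The function name is made up for illustration.

```python
def raid10_plan(disk_count, disk_size_gb, price_per_disk):
    """Return (usable_gb, spindles, total_cost) for a RAID 10 array."""
    if disk_count < 2 or disk_count % 2:
        raise ValueError("RAID 10 needs an even number of disks, at least 2")
    usable_gb = disk_count // 2 * disk_size_gb   # half the disks are mirrors
    return usable_gb, disk_count, disk_count * price_per_disk


if __name__ == "__main__":
    for count, size in ((20, 4), (4, 20)):
        usable, spindles, cost = raid10_plan(count, size, 200)
        print(f"{count} x {size}GB: {usable}GB usable, "
              f"{spindles} spindles, ${cost}")
```

Both layouts give the same usable space; the many-small-disks one simply has five times as many heads to spread the I/O across, which is the whole argument.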

--
No, Thursday's out. How about never - is never good for you?

Re:Clarity of Expression by debaere · 2000-07-25 19:40 · Score: 2

This was my thought when I saw the "Linux = x86 UNIX" posts.

For those people, a lot of *NIXes are available for the x86 platform:

Most Linux distros
Free/Open/NetBSD, and BSDi
Solaris
SCO
etc...

Linux may be the most publicized of the x86 *NIXes, but there are others. In fact, I would recommend that the DB mentioned in this question be run on a BSD. If FreeBSD can handle Hotmail, it can handle almost anything IMHO.

Dave

BTW: Before I get flamed, the Hotmail/FreeBSD thing I remember from somewhere, but I can't remember where. I do know it's NOT on an NT box, which basically leaves UNIX.

--

DOS is dead, and no one cares...
If there's a Bourne Shell, I'll see you there

30GB is a no-brainer by Ora*DBA · 2000-07-25 20:49 · Score: 1

I have run 30GB Oracle databases on 8.0.5 using RH 6.1.

The big problem here is the SMP performance. Generally when one is using a database of that size one wishes to take advantage of parallel processing, or at least use SMP to support a certain number of users.

I believe any performance problems you encounter will be in that area. Sun, HP, IBM et al just kick Linux' butt in SMP performance, and when you are supporting large queries and/or large numbers of users, all the memory in the world will only carry you to a certain point - beyond that you need fast, robust SMP support.

Other than that, you can certainly set up an Intel-based system to handle 30GB databases, be they Oracle, Sybase, Postgres or whatever.

hth -

Regards,
jh

I personally see two issues here... by PromethiumInfrmation · 2000-07-25 20:40 · Score: 1

1 - ability to run a 'big' database on linux
2 - replication of said database

IMHO, issue 1 is the stated problem; yet issue 2 is sitting quietly in the background, impacting the actual engine selection. No one seems to be saying outright that they need an engine that can mirror a large database quickly, but they are taking this fact into account and it is greatly impacting the decision. As we move closer to a more distributed, three-tier computing paradigm, this will become an even greater issue.

My suggestion is the following:

1 - Look for a database that performs well on unix. Period. This includes DB2, Sybase, Oracle, or open source (mySQL, postgres, etc...)
2 - Find a good replication engine that is able to directly access the database.

UNIX is all about a modular approach to solving problems. However, people seem to be taking an all-or-nothing stance on this database problem. I'd recommend PeerDirect as a good replication engine. www.peerdirect.com

Linux vs commercial unix vs GatesWare by lehmann · 2000-07-25 19:58 · Score: 1

I agree with all the people saying Linux/Intel can do the job! Sure it can, but running Linux in a corp. demands a high level of Linux knowledge in-house. My experience is that any unix system can/will fail at some point if the experience to administer such a system is not available in-house. Do not depend on external support for such things.

If this corp. is of a smaller size and has a small system admin group, go for the easy solution (but not the best!): Win2K and Oracle/DB2/Sybase/Informix.

In a perfect world everyone runs unix! (Please note that I didn't write Linux ;-) though I am a Linux advocate.) Personal favorite prof. unix btw: Solaris

--
Never trust a windows system manager

Re:Clarity of Expression by katarn · 2000-07-25 22:58 · Score: 1

BTW: Before I get flamed, the Hotmail/FreeBSD thing I remember from somewhere, but I can't remember where. I do know its NOT on an NT box, which basically leaves UNIX.

Huh. The last I had heard, they had converted the entire front end from BSD to NT, but it cost them having to double the number of servers needed. Evidently even with twice the hardware they still couldn't get it to work right under NT, since (as pointed out) Netcraft shows them running BSD. BUT, regardless of what Hotmail's front end is running (the front end = the part the user sees as he/she logs in), the real work of processing and storing the email is done under a major *NIX. In this case Hotmail uses Suns, though other large installations such as Netscape and AOL use SGIs. I'm not putting down BSD or Linux, but currently machines from Sun, SGI or the other big players handle these truly huge loads much better than BSD or Linux.

Fast Oracle on Linux by Rebar · 2000-07-25 22:06 · Score: 1

Your Oracle on Linux will be faster if you change the default block size on ext2 to 4KB or use some other FS, and use 4KB db blocks in Oracle (depending on what you are doing of course - smaller for OLTP, larger for DSS).
And add more disk controllers. If you are serious about it, check out Mylex's line of SCSI RAID controllers.
And add more RAM.
And add another CPU; preferably 4 of them running Oracle EE which has the parallel query option.

Then of course you will have spent more on your Oracle license than on your hardware, with Oracle's new per-CPU-MHz licensing racket.

Re:Absolutely Raid 5 for Data Warehousing systems by Rebar · 2000-07-25 22:56 · Score: 1

The original poster needs a backup mirror of a 30GB database, which is largish but I agree not large enough to make RAID-1 cost prohibitive, and the advantage of RAID-5 over RAID-1 is cost.

Most DBA books assume a smallish transaction oriented database, with advice like "use indexes to your tables to speed query times", which can be very bad advice in some situations.

Point is, there is no "rule" when it comes to databases, which must be tuned very differently depending on their size and the intended use.

OK, there is ONE rule - test your backups!

Re:raw partitions by Rebar · 2000-07-26 08:51 · Score: 1

Did you test just one query, like select count(*) from large_table on an unloaded box, once with filesystem files, and once with raw partitions?

We did, with the data all nice and striped across controllers, and while we did NOT see the performance increase we expected with raw partitions, we saw something unexpected that causes us to use raw devices exclusively: CPU usage during the benchmark.

Raw partition benchmark CPU usage was a third of what the filesystem benchmark was, and on the filesystem just a table scan was consuming 60% of available CPU. We'll be doing more with the data than reading and discarding it like our benchmark. CPU usage on raw partitions was under 20% CPU utilization (give or take a few percent - it's been a couple of years now - hardware was a Digital Alpha 4100 with 4 533MHz EV5? processors and 4 Mylex DAC960 RAID controllers and Oracle 8.0.5)

We presume that getting rid of the extra layer of filesystem buffering gets rid of the excess CPU usage. Since the box we are on is not I/O bound when doing real work, and as stated elsewhere in this discussion "there is always a bottleneck", we found that we could get more work done per unit of time on raw partitions than on filesystem files. This was after having turned down the kernel filesystem buffer cache since Oracle does its own caching and doing all the normal tuning one would expect on a fresh database.

Just another $0.02 on the raw/filesystem debate... I'll admit that filesystem files are easier for the fresh DBA, but once you've taken the plunge and discovered their quirks, raw devices are no harder to manage than filesystem files, save for the fact that you have a finite number of disk slices without an LVM, but most Unices come with those anyhow.

Re:Absolutely Raid 5 for Data Warehousing systems by Rebar · 2000-07-26 20:19 · Score: 1

Nope, I'm talking about Oracle hash joins.
Try joining two tables of over a million rows, with a large answer set, like this:

create table baz as select foo.*, bar.* from
foo, bar where foo.key = bar.key;

In Oracle, which is NOT the topic I realize, if you use an index with a query like this you are screwing yourself.

If you use a hash join (Oracle 8 && up), you will be blown away. Of course you have to use the cost based optimizer and analyze your tables first so it has some data to work with, and drop your indexes. I don't mind if you don't believe me; I had been a DBA for years before I believed indexes could be anything but great, but I am a convert now!

There are other things indexes kill performance on, namely inserts and deletes to a table with indexes on columns other than the ones in your where clause. It's often better to drop your indexes, do the update, and recreate the index instead of waiting on Oracle and wondering why it is taking so darned long.
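
For anyone who hasn't met the technique, here is a toy sketch of a hash join in general: build an in-memory hash table on one input, then make a single pass over the other. This illustrates the idea only, not how Oracle implements it, and the table contents are made up. The foo, bar and key names follow the query above.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, key):
    """Equi-join two lists of dict rows on `key` using a hash table."""
    buckets = defaultdict(list)
    for row in build_rows:                 # build phase: one pass over foo
        buckets[row[key]].append(row)

    joined = []
    for probe in probe_rows:               # probe phase: one pass over bar
        for match in buckets.get(probe[key], []):
            joined.append({**match, **probe})
    return joined


if __name__ == "__main__":
    foo = [{"key": 1, "a": "x"}, {"key": 2, "a": "y"}, {"key": 2, "a": "z"}]
    bar = [{"key": 2, "b": 10}, {"key": 3, "b": 20}]
    for row in hash_join(foo, bar, "key"):
        print(row)
```

Each input is read once and sequentially, which is why it can beat an index lookup per row when the answer set is a large fraction of both tables.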

TIA

Absolutely Raid 5 for Data Warehousing systems by Rebar · 2000-07-25 21:42 · Score: 5

Yes Raid 5, in hardware thankyouverymuch.

Like most everyone else, you are assuming all databases are OLTP systems. Data warehousing or data analysis on the other hand requires MASSIVE data transfer rates (mostly read activity), and Raid 5 with large stripe sizes and multiple arrays works really well for this type of database. Most queries against the roughly 3TB database I currently work on run in several minutes passing somewhere under 100GB of data each, and if we had used OLTP tactics (indexes to join everything, small block size for low latency reads, etc) to tune the database, they would run in days or hours instead of minutes. Aggregate I/O rates on this monster can exceed 500MBytes/second.
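
A quick sanity check on those numbers, as a sketch. It assumes the query is purely I/O bound and sustains the full aggregate rate, which is the best case; the function name is made up for illustration.

```python
def scan_seconds(data_bytes, bytes_per_second):
    """Time to stream the given volume at a sustained transfer rate."""
    return data_bytes / bytes_per_second


if __name__ == "__main__":
    data = 100 * 10**9    # "somewhere under 100GB" per query, from the post
    rate = 500 * 10**6    # "500MBytes/second" aggregate, from the post
    seconds = scan_seconds(data, rate)
    print(f"{seconds:.0f} seconds, about {seconds / 60:.1f} minutes")
```

That lands in the "several minutes" range the post describes, so the quoted throughput and query times hang together.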

As to the original question, can Linux handle a 30 GB database, my answer would be "Yes, but it will hurt". Ever try staging more than 2GB of data on ext2? Ever try moving more than 1GB of data on ext2 with less than a 4KB block size? It hurts!

Someone please tell me that I will be able to use large files painlessly on Linux sometime. Until then, run large databases on name brand UNIX servers with name brand UNIX. Linux on x86 is good at a lot of things, but a large database isn't one of them YET.

SQL> select sum(bytes) from dba_data_files;

SUM(BYTES)
----------
2.9003E+12

And every byte is on RAID 5.
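
For readers not used to SQL*Plus scientific notation, a small sketch converting that byte count to more familiar units. Which unit the "roughly 3TB" figure refers to is not stated, so both decimal and binary are shown.

```python
def bytes_to_tb(n_bytes):
    """Decimal terabytes (10**12 bytes)."""
    return n_bytes / 10**12


def bytes_to_tib(n_bytes):
    """Binary tebibytes (2**40 bytes)."""
    return n_bytes / 2**40


if __name__ == "__main__":
    total = 2.9003e12     # SUM(BYTES) from the query output above
    print(f"{bytes_to_tb(total):.2f} TB, {bytes_to_tib(total):.2f} TiB")
```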

Re:No.. Three words by -brazil- · 2000-07-26 15:45 · Score: 1

The question was formulated quite broadly: "Is Unix capable of handling a database of this size".

--

The illegal we do immediately. The unconstitutional takes a little longer.
--Henry Kissinger

Three words: by -brazil- · 2000-07-25 18:52 · Score: 2

Oracle for Linux.

Should easily be able to handle this. And considering the size, anything less would probably not be a good idea.

--

The illegal we do immediately. The unconstitutional takes a little longer.
--Henry Kissinger

Re:Three words: by professionalGeek · 2000-07-25 19:03 · Score: 1

The only problem I could foresee with an Oracle 8i on Linux solution is that x86 boxes don't scale very well. And the best way to scale your database is on one big box (or a pair of redundant boxes). As such, a Sun box like a 4500 or 6500 will grow (14+ cpus, 4Gb memory =) much more easily if the database is used intensively (sucking more CPU than the machine can handle). With Intel, isn't the limit still 4 CPUs for the latest generation? Scaling Oracle across multiple machines is usually *very* painful. What size box is the database running on now?

--
Stripes:Making Java web development easy like it should be.

Re:Three words: by RedFang · 2000-07-26 04:34 · Score: 1

Somebody please tell me that this is complete bullshit. Firstly, I can't even fathom a company being this arrogant about its own product. I thought that Microsoft's "per seat or per head, whichever is greater" Client Access Licensing was absurd. But more importantly, I can't believe that people would actually buy into a license like that.
Actually this is quite sane pricing from an enterprise model. Charging based on machine size has been the norm on mainframes for decades. It's actually a lot more flexible than per seat or per user licensing since you can have an infinite amount of users per installation. Well, as many users as your hardware will support. ;) Oddly enough, this is what industry wanted CPU serial numbers for in the first place. They could check the serial number and verify the CPU speed and count to catch licensing violations. The only problem being that PC hardware is changed almost continuously when compared to mainframes.

Re:Three words: by bebopkim · 2000-07-26 07:55 · Score: 1

A Compaq Proliant 8000 can be equipped with up to 8 Intel Xeons. A Unisys ES7000 can be equipped with up to 32 Intel Xeons. They're cheaper than the Sun and HP boxes, and x86 Linux can be installed on them. There may be more machines like them.

raw partitions by emir · 2000-07-25 19:41 · Score: 1

he is probably not going to use ext2 anyway. oracle uses its own raw partitions/filesystem to store its data. this speeds up oracle and avoids 2gb ext2 filesize limit.

so gnu/linux + x86 can be good choice :)

--
-- http://electronicintifada.net --

Re:Clarity of Expression by emir · 2000-07-25 23:46 · Score: 1

original BSD was UNIX, however nbsd/fbsd doesnt contain any code from last BSD release (4.4 ???? ) thus its not UNIX.

--
-- http://electronicintifada.net --

Re:Clarity of Expression by emir · 2000-07-25 20:20 · Score: 2

not any of the free unices (gnu/linux , nbsd, fbsd, gnu/hurd, obsd) is "real" unix because none of them is certified as UNIX(r). general misunderstanding among *bsd ppl is that *bsd is unix while linux is not.

btw i think (not sure) that there is some group working on certifying gnu/linux as UNIX(r). i believe it costs at least 10 000$ to get this certification....

--
-- http://electronicintifada.net --

not totally irrelevant by GodOfHellfire · 2000-07-25 21:33 · Score: 1

having just come back from sybase dba class i can say that you cannot load a sybase *nix database on nt. you could bcp everything out to flat-file and load that, but that's your only unix -> nt option.

Not a problem... by digitalhermit · 2000-07-25 21:53 · Score: 1

It certainly is within the capabilities of any modern SQL engine. The reason for the high cost is that a database of 30 gigs probably contains lots of critical data. Likely, your enterprise may need guaranteed uptime which includes good hardware, stable software, constant power, and the ability to upgrade or fix without downtime. Here are some of the advantages and disadvantages of Linux when it comes to databases of this size:
Raw disks - The current 2.2 kernel does not support raw disks, i.e. the ability of the database engine to manage the disks itself rather than going through an OS filesystem layer. Raw disks give added speed and reliability. I believe that the newer kernel will support raw disks but it may take a short while for the major vendors to support it.

Hot swap ability/redundancy - Lots of good stuff, some bad. Various clustering solutions are being developed that can work with large databases. Linux may be a little weak when it comes to support for hot swap drives (don't know the current state).

In any case, 30 gigs doesn't really say a whole lot about what sort of data you're storing. To be really optimal, you'll need to know how you will be accessing it, estimated number of hits, etc..

30GB Possible? Damn right it's possible! by tjwhaynes · 2000-07-25 19:32 · Score: 3

Look - 30GB database? Let's just look at the necessities first and then we'll get down to a choice of vendor (because you are going to want a reasonably heavy weight database server for this).

30GB of data. Okay - so you aren't mission critical. Even so, with that amount of data, you probably want a hot-swappable redundant system such as RAID if availability means anything to you. But these days you have lots of choices for RAID, including software RAID under Linux. I'd probably still go for a hardware solution for RAID, but that is because I'm not clued up on how robust and failure-proof the Linux RAID is when one of the disks dies. If you don't care about redundancy, 40GB drives are easily found. For performance reasons you might want to find four drives of say 15GB each so that random access to the drives can be done in near parallel, especially if you stripe the drives, but that is yet another option.

Accessing 30GB of data is going to require some reasonable memory space - think 512MB minimum and work up from there. Of course, you could run it on far far less (say 80MB) but you will pay a performance penalty - the database products I know about have plenty of tricks up their sleeves if they have spare memory to play with, and resort to paging out to disk when things get tight.

The choice of software is important too. I'll declare my biases up front and say go for DB2 Universal Database, partly because I work on it and I like it. Your other choices are Oracle, obviously, and there are a host of other database vendors out there for Unix systems across the board. DB2 UDB is easier to administer and looks to be faster than Oracle, as well as generally being cheaper to deploy. As far as functionality goes, everybody nowadays assures SQL92 conformance. SQL99 core conformance isn't too much to hoot about, as it's basically SQL92. The SQL99 spec is far more modular than the SQL92 spec, so it's easier to match the base functionality and then add on SQL99 conformance for, say, the multimedia extensions, later.

So the answer to your question is yes - it is possible to deploy a 30GB database on Unix. And it is definitely possible to deploy the same database on Linux - both IBM and Oracle have versions of their databases on Linux.

Cheers,

Toby Haynes

--
Anything I post is strictly my own thoughts and doesn't necessarily have anything to do with the opinions of IBM.

"Not mission critical?" by rob_from_ca · 2000-07-26 00:08 · Score: 1

That's kind of a loaded statement. Sure, people probably don't die if this particular database goes down, but I'm assuming there are going to be aspects of the business depending on it (otherwise what's the point). The customer may say it doesn't have to be mission critical, but rebuilding a 30GB database is not a trivial or quick task. How mad will the customer be when faced with 24-48 hours of downtime? This means you have to have reliable hardware and good system administration practices anyway. Basically, you have to treat it as pretty mission critical.

So of course, the answer (I think) is, if you haven't done it before, and no one in your group/business has, and no one's sure if you can or not...you probably shouldn't. Or more accurately, hire a consultant to do it; although there's a good chance that when you tell him/her that you want a 30GB, reliable database with good performance that they're going to tell you to go buy an E4500 with 4-8 CPU's and Oracle.

Credibility dropping fast by gammatron · 2000-07-25 19:42 · Score: 1

Is Unix capable of handling a database of this size?

I don't know which is more pathetic...

1) that the dofus asking the question actually typed that, or

2) that the "editor" didn't actually "read" what he posted.

either way, these ask slashdot questions are getting really lame. Come on, guys, all it takes to raise your "standards" is to hit the "delete" button when you get these brain-dead questions.

(-1 Redundant)
--
http://gammatron.weblogger.com

Re:Credibility dropping fast by gammatron · 2000-07-26 00:15 · Score: 1

The "fact checking" part is important... most of the questions posted lately have been of the strictly factual type, and could easily be answered by consulting a search engine. It's the subjective questions that are interesting and get the good discussions started.
--
http://gammatron.weblogger.com

30Gb databases by LinuxGrrl · 2000-07-25 18:56 · Score: 1

Linux/IA32 probably not, at least under e2fs as you'll likely hit the 2Gb filesize limit, depending on how the database engine involved implements storage (Oracle using its own data partition in "raw iron" style?). Linux on other architectures, specifically the 64bit ones (Alpha, Sparc, Sledgehammer and IA64 before long) would probably be fine.

Re:30Gb databases by Moderation+abuser · 2000-07-25 18:59 · Score: 2

So you add multiple data files. It's really no big deal.

It's handy to use smaller data files anyway. It can be useful for load balancing.
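
A back-of-the-envelope sketch of the multiple-data-file approach, treating the per-file ceiling as a round 2 GB and using the 30 GB figure from the question. The function name is made up for illustration and no particular database engine is assumed:

```python
import math


def data_files_needed(db_size_gb, max_file_gb=2.0):
    """Smallest number of data files that can hold db_size_gb
    when no single file may exceed max_file_gb."""
    return math.ceil(db_size_gb / max_file_gb)


# 30 GB split into files of at most 2 GB each
print(data_files_needed(30))
```

Keeping the files a bit under the ceiling (say 1.5 GB) only raises the count slightly, and leaves more room to spread the files across disks.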

--
Government of the people, by corporate executives, for corporate profits.

Re:No, only Microsoft SQL Server can do it. Period by SuiteSisterMary · 2000-07-25 20:08 · Score: 1

Also, you get nice GUI tools to help you if you don't know a heck about databases.

If you don't know 'a heck about databases', which I assume means 'a heck of a lot about databases', you shouldn't be implementing databases. That's like saying 'This medical textbook has lots of pictures, so it's great if you need to operate, but don't know a heck of a lot about surgery.' And as far as a corporation is concerned, a database is far more important than you are. ;-)

--
Vintage computer games and RPG books available. Email me if you're interested.

Re:Excuse me! by Denix · 2000-07-26 18:05 · Score: 1

SQL Server and NT are stable enough to do TB sized databases. I think that SQL Server is actually MS's best product and I can't think of one that is more stable.

The poorest thing about MS SQL Server 7 is the fact that its admin console uses IE (the most unstable creature in the Universe.)

--
"Simple words such as 'better' or 'faster' are best used by simpletons. Life [...] is more complicated." - TMC

30G no problem. Heres some advice by grantsucceeded · 2000-07-26 11:34 · Score: 1

I have been an Oracle dba for many systems that were larger than this, both OLTP and DSS/Warehouse. but most of this should apply to sybase too, which would presumably be a gentler migration, if you want to go to linux.

- avoid raid 5, go with plain old stripe/mirror. raid 5 is horribly slow for writes, and in DSS, you do a lot of disk writes as part of queries, because the queries build temp tables/segments transparently to do the large sorts/merges involved.

- get more than 1g ram if you can. Oracle will make good use of this, by increasing memory sort area sizes, and caching database blocks more intelligently than the filesystem (gives preference to index blocks basically) hopefully sybase too.

- the mirror system at a remote site can be accomplished by using redo logs/transaction logs. Restore the database from backups to a remote location. rdist, rcp or scp the transaction logs from the primary database to the mirror, and roll the database forward with each successive log. This is called a "standby database" in oracle parlance.
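
A minimal sketch of the bookkeeping behind that log-shipping scheme, assuming each transaction log carries an increasing sequence number (an assumption for illustration; nothing here calls a real Oracle or Sybase API). The point is that the standby may only roll forward through logs that arrived with no gap:

```python
def logs_ready_to_apply(last_applied, shipped):
    """Return the shipped log sequence numbers that can be applied now.

    Logs must be applied in order with no gaps, so collection stops at
    the first sequence number after last_applied that has not arrived.
    """
    available = set(shipped)
    ready = []
    next_seq = last_applied + 1
    while next_seq in available:
        ready.append(next_seq)
        next_seq += 1
    return ready


# The standby has applied log 41; logs 42, 43 and 45 have been copied
# over so far, while 44 is still in transit.
print(logs_ready_to_apply(41, [45, 42, 43]))
```

Here only 42 and 43 are applied; 45 waits until 44 shows up. How often the primary closes a log decides how far the standby lags behind.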

Re:No remote NT management? wtf? by evilgrin · 2000-07-25 20:29 · Score: 1

So you don't consider field names to be critical data?

I'll leave that one alone, as I'm not your manager...

'~ms tools translates into time and money'
Actually, all the tools I need and use are provided free with my operating system, gratis from MS. If you ever feel the need to double-check this astounding revelation, go to some productive person you know who runs NT workstation and ask them to display the Administrative Tools folder in the start menu.
Then, if you're feeling really punchy, download the free resource kit which is just chock full of goodies to manage an NT enterprise.
Watch out though, you may be required to fire up notepad and read a few text files to figure out what they do.

'~no local bulk load in Oracle'

RTFM. Need I really say more?

'~different hardware'

Well, I just reloaded to my laptop here at just over 110MB per minute. MS SQL 7.0 on Win2k, 600 mhz p3, 128 mb ram Dell inspiron 3800.
Your desktop is probably about as powerful, yes?
btw, I was running Outlook 2k, Word2k, 8 internet explorer windows, my firewall monitoring/logging utility, 2 command prompts, an ssh shell, dns administrator, ws-ftp client, norton antivirus autoprotect agent, and SQL server desktop edition when I just ran my restore of a 1.2 gb database off my network server to my laptop...not even a local copy of the data.

EvilGrin

have a nice day

Re:No remote NT management? wtf? by evilgrin · 2000-07-26 03:09 · Score: 1

rm -f -r /* oops, did my typo just affect the stability of my server? EvilGrin

No remote NT management? wtf? by evilgrin · 2000-07-25 19:59 · Score: 2

I have a few beefs with your post...

1. Oracle on NT

a. '~crash by mistyping...' The answer to that, of course, is to not mistype mission critical data; you should be using scripts for bulk transfer anyway.

b. '~no remote management on NT' Where the hell does this come from? Oh, I know. You have no clue how to manage an NT enterprise, you're just talking out your a$$. As an NT engineer, I can do ANYTHING from my laptop, from ANYWHERE in the world, using only MS tools and a few scripts I wrote in vbscript. I concede that sometimes it would be nice to have a 'true' terminal connection to the server, but you don't 'need' it.

2. '~Bulk import of Oracle for Linux.' Only 20mb a minute? bwahahahaha I can reload data into my NT, MS SQL server at over 150MB PER MINUTE. So, after a 3.5 hour restore, I can go to the bar and get a few brews while you sweat away for almost ANOTHER ENTIRE DAY. Oh, and get this. I can buy RAID tape devices that are supported under NT, and restore at up to 600MB per minute. Anybody want to restore a 30GB db in less than an hour? I do.

3. '~use restores on backup db' Why not use any one of many excellent database mirroring/synchronization products on the market? Set up both servers, specify replication partners, and don't worry about it anymore.

You may want to learn how to manage enterprise applications before spouting off on them. And before anyone flames me as a M$ booster or something, let me say that I do actually have and use *nix systems in my work. However, they do not run my enterprise messaging applications, databases, etc; they are for development processes because my company's clients require us to test on compliant systems. (I also use one of them for security / penetration testing. The tools developed for the *nix platform are better than on NT. However, I believe this is due to a loyal fanbase of long-time *nix users, and not because of any perceived flaws or inequalities of other os's.)

EvilGrin Fighting misinformation wherever it can be found.
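
The restore times quoted in this post can be checked with a few lines of arithmetic. This is only a sketch: it assumes decimal units (1 GB = 1000 MB) and a perfectly sustained transfer rate, and the function name is invented for the example:

```python
def restore_hours(db_size_gb, rate_mb_per_min, mb_per_gb=1000):
    """Hours needed to reload db_size_gb at a sustained rate_mb_per_min."""
    return db_size_gb * mb_per_gb / rate_mb_per_min / 60


# The three rates mentioned in the post, applied to a 30 GB database
for rate in (20, 150, 600):
    print(f"{rate:>3} MB/min -> {restore_hours(30, rate):.1f} h")
```

Under those assumptions, 20 MB/min works out to about 25 hours, 150 MB/min to a bit over 3 hours, and 600 MB/min to roughly 50 minutes, which is in line with the figures claimed above.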

Kernel enhancements in 2.4 by mauryisland · 2000-07-25 21:00 · Score: 1

It seems that anyone terribly interested in running a large database on a Linux platform may wish to wait for the 2.4 kernel to arrive, as it adds support for raw devices, file sizes over 2G, tons of additional ram... It generally scales better for this type of work. Check out this link for a listing of the new stuff.

Are you kidding? by dilyard · 2000-07-25 20:42 · Score: 1

What else are you thinking? NT? I suppose you could put a small database like this on NT, but if you have the option to use Linux instead, do it. Any flavor of Unix would be preferred, but Linux x86 will give you the most bang for the buck. As for Sybase... If you can afford Sybase, then you can certainly afford Oracle. Put Oracle 8i on Linux and enjoy. Why settle for the beetle when you can have the Cadillac?

Re:Are you kidding? by mheaney · 2000-07-25 21:53 · Score: 1

Sure, I'll "settle" for the small, nimble and fun-to-use Sybase against the large and ponderous Oracle. Sybase's flagship 11.9.2 enterprise server is available for Linux and all major flavors of Unix. I'm running a 26GB Sybase server on a dual-CPU Dell box with four SCSI disks and 512MB of RAM. Performance is outstanding, and it requires practically zero maintenance. Stick with Sybase.

hardware raid5 by ArchieBunker · 2000-07-25 21:50 · Score: 1

Unfortunately the industry standard adaptec cards are not supported by linux. You are stuck with 3 year old UW scsi hardware. Check what raid cards linux can use; there's only a handful.

--
Only the State obtains its revenue by coercion. - Murray Rothbard

IRS by Dungeon+Dweller · 2000-07-25 22:45 · Score: 2

Yes, yes it is. I believe that UNIX is the OS that the IRS uses for their database, which is many many many terabytes in size. My friend's father works for them, and we discussed this in my file and data structures class.

--
Eh...

Sybase/Solaris still affordable... by ironduke-particle · 2000-07-26 03:00 · Score: 1

My employer, whom I shall not identify, is a reseller. We were once asked to quote on a set of big Sun systems for hosting big Sybase databases, as part of a bidding competition run by a vendor on behalf of the customer. My MD, being the sort of guy he is, read the spec, reverse-engineered the *actual* customer requirement, wrote a new spec, and submitted it with a quote. Significant features of the new spec: 60% less cost to the customer; substantially more functionality; higher margins for us.

Presently, the bidding competition was restarted by the vendor, with a spec remarkably like the one my MD had written, but priced with substantially lower margins.

Conclude: [1] Sun can be even worse than Apple at shafting their strategic partners. [2] Take your spec and the quote you got given to someone who actually knows what they're doing.

Re:30G??? Try 10T...or 130TB by gowdy · 2000-07-26 09:36 · Score: 1

Things change with scale. My experiment, BaBar, has about 130TB in our Objectivity object databases at the moment. It grows at about 10MB/s.
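
To put that growth rate in perspective, a quick sketch of the arithmetic, assuming decimal units (1 GB = 1000 MB) and a constant 10 MB/s; the function name is made up for illustration:

```python
SECONDS_PER_DAY = 24 * 60 * 60


def growth_gb_per_day(rate_mb_per_s):
    """Data added per day, in GB, at a constant ingest rate in MB/s."""
    return rate_mb_per_s * SECONDS_PER_DAY / 1000


print(growth_gb_per_day(10))
```

That is 864 GB per day, so at this rate the 30 GB under discussion accumulates in about 50 minutes.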

Most of our servers are on Solaris, although we also support Compaq TruUNIX64 and Linux. There is a HPSS backend as we only have a few TB of disk.

We've had some problems bringing up sites which use Linux servers, but I don't think any of these are different than the problems we had to solve for Solaris (we gave up on HP a long time ago).

Re:Uuups, a few clarifications by Tassach · 2000-07-28 04:28 · Score: 2

You seem like a pretty clueful DBA so I won't reiterate anything you can easily pick up by reading the documentation.

I'm in the middle of doing a feasibility study of migrating our flagship database (~30GB ASE 11.5) from big-iron AIX boxen to commodity x86 boxen running Linux / ASE 11.9.2

I have not found the dump/load incompatibility to be a major hassle. If you tune your Linux box for fast BCP the load shouldn't be too painful. As an alternative, you might try using DBArtisan from Embarcadero Technologies. It has a migration feature that makes moving data and schemas between servers very painless. It is well worth the price ($5000, IIRC) - it will pay for itself quickly in time savings alone.

In my test setup, I was able to move our 30GB database from the AIX box to the Linux box in about 10 hours, which fits within our normal scheduled maintenance window. The AIX box is a 4-way RS/6000 box w/ 1 GB of RAM and all the storage allocated as virtual partitions on a RAID-5 array (I didn't set this up). The Linux box is a quad Xeon w/ 1 GB of RAM and 8 drives; I'm using raw partitions and doing my mirroring manually from within Sybase. DBArtisan runs on an Athlon 550 w/ 128MB under NT Workstation.

The AIX box is a little simpler to manage, because the old DBA had all the tables on the default segment. Even though it's more work, I prefer to hand-tune the database and place the big and/or active tables on their own segments & devices. Needless to say, you need to be comfortable using sp_placeobject & sp_partition to take this approach. I find that the extra effort setting up the server pays off in the long term in performance and reliability. Barring the difference in the physical storage strategy, I don't see any factor that makes ASE on Linux more difficult to administer than ASE on any other flavor of Unix. Actually, the OS-level administration is simpler in Linux than in AIX, IMHO.

Since you say this is going to be a data warehouse system, you REALLY want to use partitioning so you can take advantage of parallelism. Re-read chapters 13, 14, 15, and 17 of the Performance & Tuning Guide before you start, you'll be glad you did.

I don't know what your uptime requirements are, so I can't say if Linux is robust enough for you. If you need rock-solid 24x7 availability, I'd say stick with big iron and commercial Unix. If you don't need to be bulletproof Linux should be fine. For us, the cost savings are worth the slightly higher risk. As I write this, our Linux test server has 63 days uptime and has survived several stress-tests with no problems, so reliability hasn't been an issue so far. Linux performance seems to be on par with the AIX box so far -- but the database is not the bottleneck in our system.

"The axiom 'An honest man has nothing to fear from the police'

--
Why is it that the proponents of "one nation under God" are so eager to get rid of "liberty and justice for all"?

Interoperability and limits by Gruturo · 2000-07-26 03:35 · Score: 1

Well well..... I've been using Sybase on pretty hefty (100+GB) databases, and see no problem in this.

I'd rather be worried about interoperability and limits: first of all, sybase will NOT load database dumps made on different platforms (Sybase on linux won't accept a dump from Sybase on Aix. Actually, it won't even work between Winnt/Intel and Winnt/Alpha, AFAIK).

This is pretty *bad*. If you only have to get a few tables mirrored, you could use BCP (its Bulk Copy Utility) to periodically dump those to plain text files.

Otherwise, you could also try Replication Server, but that's an unknown animal to me.

The other limit is size: if you want reliability, you HAVE to use raw devices, otherwise you risk corruption in case of server crash (believe me, it DOES happen).
Under Linux you are currently limited to 2GB per raw device, so, with sybase's limit of max. 256 devices (with 6 already used), you have up to 500GB for an (unmirrored) database. Seems a lot, but it's not enough for today's biz needs... I keep seeing more and more multi-terabyte DBS (ahem.... most of them Oracle on Sparc Solaris.)
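
The 500GB ceiling follows directly from the numbers in this post. A small sketch of the calculation, taking the 256-device limit, the 6 devices already in use and the 2GB-per-device cap at face value (the function and parameter names are invented for the example):

```python
def max_db_size_gb(max_devices=256, devices_in_use=6, device_gb=2):
    """Upper bound on database size when every remaining device slot
    is filled with a device of the maximum allowed size."""
    return (max_devices - devices_in_use) * device_gb


print(max_db_size_gb())
```

With 250 free device slots at 2GB each that gives 500GB; raising the per-device cap is what moves the ceiling, since the device count is fixed.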

--

Vacuum cleaners suck. Kings rule.

Re:Interoperability and limits by Gruturo · 2000-07-26 22:24 · Score: 1

That's not a Sybase limit (although I remember a nasty bug in 11.5/AIX which prevented using devices over 2GB). It's a current Linux limit; I hope it will go away real soon.

--

Vacuum cleaners suck. Kings rule.

30Gb+ by drfrog · 2000-07-26 01:32 · Score: 1

this was discussed on the postgres sql list about a year ago. synopsis: postgres is scalable, and the best os for this would probably be freebsd. i've run a db over the 1Gb range and it was smooth, as long as your indexes and sql statements are well made.

--
back in the day we didnt have no old school

Database Mirror Utility by Sherman+Peabody · 2000-07-25 20:42 · Score: 1

I agree with the posters above that say:

- Yes, Unix can handle it, and Linux, too.
- Don't skimp on the hardware.
- Your main costs long term will be admin related.
- Changing databases is a pain.

I don't know Sybase at all. However, I know Oracle. Oracle has a utility that will automatically mirror a database on another machine placed anywhere you like. As the master database changes, the mirror database takes the archive logs (logs of every change to the db) and automatically applies it to the remote database. The remote DB constantly acts as if it were recovering from a crash and applies the archive logs. This way the remote database is an exact copy of the master, with a slight time lag depending on how often you create an archive log file.

I don't know if Sybase has anything like this, but I bet they do. Ask your Sybase rep, you'll make her day.

Clarity of Expression by Gothmolly · 2000-07-25 18:59 · Score: 2

Linux is not all of Unix, just as RedHat is not all of Linux.

--
I want to delete my account but Slashdot doesn't allow it.

Second on the Filer by The+Big+Bopper · 2000-07-25 20:15 · Score: 1

I can seriously recommend the Network Appliance Filer for back-end storage.

I can second that enthusiastically. The Filers have NFS performance that can often exceed local disk performance (on gigabit ethernet). The NFS performance of a Sun E450 with 280GB RAID 5 array pales next to a similarly configured Netapp Filer.

--
Screw Micro$oft.

Re:Three words:with three words by Ian-K · 2000-07-25 21:40 · Score: 1

I'll agree on that.

Yet, as Alpha hardware isn't that cheap if you want to build a decent server, I could also suggest a parallel system of cheaper IA32/64 boxes.

A friend of mine was building something similar, and since these people aren't going to be modelling fluid flows (which would more or less require a cluster of Alphas), a beowulf cluster of some Athlons should do the trick. (I don't know how FPU-intensive database applications are. I guess they shouldn't be very, so even Cyrix chips should work well.)

The node machines don't have to contain much: 1-2 processors and 1/2 to 1GB of RAM (depending on number of machines) and some (preferably) fast network card (Ethernet for cheap, Myrinet or similar if you're serious about it).

I have no experience with such database deployments, but a cluster after all might not be as bad an idea as some here have suggested.

Trian

--
I'm no longer fed up with MS Windows: I go rid of them :)

Re:Of course it can by john_many_jars · 2000-07-25 21:34 · Score: 2

Sybase takes control of the HDC so it is not a filesystem file. Further, only a dunderhead would want 1 segment that is 30G in size. Sybase is smart and I have had no problems developing on Sybase for Linux and have been using it to play with for several years--never put too much strain on it.. most I have put in at one time was 6 G. Seems to handle it, though.

2Gb filesize == old information by cthulhubob · 2000-07-25 21:14 · Score: 1

The limit is several Terabytes now. I know the large-file patch is in the 2.3 series development kernels, and I think it was back-ported into 2.2.14 and up.

--

In post-9/11 America, the CIA interrogates YOU!

what about other databases? by Thu+Anon+Coward · 2000-07-26 01:59 · Score: 1

Scanning thru the posts, I noticed that no one even mentioned Pervasive (aka Btrieve). It is a solid database system that cannot be beat for performance. File sizes scale to 64gb for now and will scale even higher in the future.
-It ships with RedHat and is a DAMN sight cheaper than Oracle.
-The engine is built into Netware 5.1 and runs NDS, client access licenses, etc.
-Eleven of the top 10 accounting packages use it (Peachtree, ACCPAC, Macola, Platinum,Sage,DAC Easy,etc). ARC Serve for Netware uses it.
-A 10-user license runs less than $1000!

Before porting to all those other expensive packages, look at Pervasive.SQL first and then make a judgement.

--

I'm good with numbers - .45, 7.62, 9.....

Re:what about other databases? by Thu+Anon+Coward · 2000-07-27 03:20 · Score: 1

Why would I use the PCC at all? Tango has nothing to do with this, and most packages as mentioned in my previous post are NOT moving from the PSQL engine.
As for being an embedded vs. non-embedded engine, so? if it does the job, does it matter?

--

I'm good with numbers - .45, 7.62, 9.....

30GB on the x86 platform is doable. by iamabot · 2000-07-25 23:03 · Score: 1

As many of the other posts reflect, you get what you pay for. For mission critical apps (read -> your database) you want redundancy throughout the system architecture. Disk drives are not the only thing to worry about: the architecture for x86 may not be able to support the data transfer rates you are looking for, and there may very well be abysmal support from the vendor for an x86 implementation. My advice is to spend on the hardware; a poor db implementation can not only cripple your operations team but also make your career shorter than you may have otherwise planned.

/bot

Size doesn't matter - IO does by johnlcallaway · 2000-07-25 19:59 · Score: 4

I hear this question a lot, and I am really tired of it. It doesn't matter how big the database is, but how much it is going to be used. If I created a multi-terabyte database (can you say p0rn?) that only had one user, sure Linux/Intel could handle it. But throw it up onto a network with millions of requests per hour, and the equation shifts. Could you build a Lintel box to support it?? Let's see....

Here are the priority items for any database box --

Memory. Databases love memory for cache, logs, etc. If you can keep your entire database in memory, disk speed becomes irrelevant after the first data access and for writes. If your box only supports a couple of gigs of memory, move on. We have boxes with 4GB of memory, and our DBA wants more.
Disk Bandwidth. The more disk bandwidth the better. Several little disks scattered about multiple SCSI controllers will usually perform better than comparable aggregated large disks. Don't even think about using IDE/EIDE.
IO Speed. The faster your disks, the better (Duh...) Again, disk size can play second fiddle to disk access times. I would rather have many small, fast disk drives than one large, slow one.
CPU speed. Did you notice this was last??? Face it, if you can't keep it in memory and your disks aren't fast enough for your processor(s), then the CPU speed isn't as relevant.
Network bandwidth. Most computers do not have issues here. However, there is overhead pushing data over a network, and the more data you push, the more network bandwidth you need to respond to requests.

It is also a good idea to separate application/web servers from database servers. All modern databases support the ability to service database requests over a network. Providing a unique network solely for database activity that is separate from the user network is common in most shops now to support the data movement from database servers to the app servers.

The game all sys admins and DBAs play is finding the current bottleneck. There is always a limiting factor for performance, and it can usually be tied to one of the above items.

Determining a configuration to support a database is not easy. You need to gather usage predictions, such as number of concurrent users, read rates, update rates, log projections, and make a guess. You also need to know your target audience and how they access it. A million requests spread over 24 hours is not the same as a million requests in a short period.
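
The last point is easy to see with numbers. A rough sketch, assuming the "short period" is one hour (an arbitrary choice for illustration) and that requests arrive evenly within the window; the function name is made up:

```python
def avg_requests_per_second(total_requests, window_hours):
    """Average request rate when total_requests arrive evenly
    over window_hours."""
    return total_requests / (window_hours * 3600)


spread_out = avg_requests_per_second(1_000_000, 24)
bursty = avg_requests_per_second(1_000_000, 1)
print(f"over 24 hours: {spread_out:.1f} req/s")
print(f"over 1 hour:   {bursty:.1f} req/s")
```

The same million requests means about 12 per second in the first case and about 278 per second in the second, a 24-fold difference in the load the box has to be sized for.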

This is only a sig, this is only a sig.....

--
I rarely read replies, it's my opinion and if you thought about your opinion a little more, I'm OK with that.

BSD by Britz · 2000-07-25 19:54 · Score: 1

I don't want to argue if Linux or BSD is better, I just heard that Linux has its strength in supporting a larger variety of hardware and there are more applications out there written for Linux, but BSD still has an edge when it comes to bigger loads. So when you just want to run a database and still didn't purchase any hardware you might want to look at the possibility of using BSD.

http://www.openbsd.org/ focussing on security
http://www.freebsd.org/ fast and reliable
http://www.netbsd.org/ most inter platform

Of course, the GPL rules and BSD is not GPL ;-)

Storage Considerations by ccGecko · 2000-07-26 05:49 · Score: 1

Before I get to the storage, yes Sybase works on Linux, and yes, cross-OS data migration is possible (and actually not that hard) with Sybase. Where I work we replicate a production Sybase database from AIX to a reporting server running HP-UX. Multi-hosted, network-connected databases are one of Sybase's strengths.

Anyway, on to the storage. Sybase works best when you give it raw devices, which if I remember correctly Linux doesn't support (yet). So, you're stuck with a filesystem. I'll let other, more competent linux fs folks advise you there.

Databases stress two things hardest: memory bandwidth and disk I/O. Memory bandwidth can be best dealt with on x86 boxes by getting Xeon-based systems with as much L2 cache as you can afford in addition to as much main memory as you can afford.

As for disk, forget IDE. Go SCSI or Fibre Channel all the way. Definitely use RAID, but before you choose which RAID level, consider your usage of the database. If 80% or more of your transactions are read-only, then RAID 5 is okay. If more than 20% are writes, DO NOT USE RAID 5. You will regret it. Every write on a RAID 5 volume requires 2 reads and 2 writes to the physical disks. You will notice this big time once the write mix passes 20%. In this case use RAID 1+0 aka RAID 10. This is different from (and significantly better than) RAID 0+1 for reasons I won't go into. Use hardware RAID. Without a ballpark on your budget, I have no idea what is realistic, but get a hardware RAID system with as much cache as possible. Spread the RAID volume across as many physical drives as possible.

One last thing: spend some time developing a solid backup strategy. This step is so often overlooked because it doesn't affect you until you have a problem. Don't make that mistake, and most other problems can be recovered from. Good luck.
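
To put rough numbers on the RAID 5 write penalty described above, here is a small sketch. It takes the post's figure of 2 reads plus 2 writes per logical write for RAID 5, and assumes 2 physical writes per logical write for RAID 10 (one per mirror side). It ignores controller cache and full-stripe writes, so treat it as a worst-case illustration, not a benchmark:

```python
RAID5_WRITE_COST = 4   # 2 reads + 2 writes per logical write (from the post)
RAID10_WRITE_COST = 2  # assumption: one write to each side of the mirror


def physical_ios(reads, writes, write_cost):
    """Physical disk I/Os generated by a mix of logical reads and writes."""
    return reads + writes * write_cost


# Physical I/Os per 100 logical requests at different write percentages
for write_pct in (10, 20, 50):
    reads, writes = 100 - write_pct, write_pct
    r5 = physical_ios(reads, writes, RAID5_WRITE_COST)
    r10 = physical_ios(reads, writes, RAID10_WRITE_COST)
    print(f"{write_pct}% writes: RAID 5 = {r5}, RAID 10 = {r10}")
```

At 20% writes RAID 5 already does 160 physical I/Os per 100 requests against 120 for RAID 10, and the gap widens quickly as the write share grows.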

Re:two more words: by stephenbooth · 2000-07-25 19:17 · Score: 1

If you want to know more about Oracle on Linux then check out Oracle Technet. You will need to set up a login but then can view documentation and download development versions to try out.

It does have a fairly hefty disk foot print (about 600Mb IIRC).

Oracle should be able to handle, in terms of size, whatever the hardware can handle. It also supports raw volumes.

Stephen

--
"Don't write down to your readers, the only people less intelligent than you can't read" - Sign on Newspaper Office Wall

Not a hard questions to answer by xtheunknown · 2000-07-25 23:33 · Score: 1

The quick answer is yes. 30GB is not an overly large db. In fact it is puny compared to some I have seen, namely an IBM customer test database that spanned 6 large AS/400s and totalled 6TB, or 200 times the size in question. So, a reasonably hefty x86 system, say 4xPIII 850MHz with 2GB memory and enough disk should do, and not be that expensive.

The db software is your problem and I'm not sure MySQL on Linux could handle it, but Oracle could, or you could use Solaris on x86 where db products are much farther along.

--

They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety.

Re:Interbase? by j-pimp · 2000-07-26 13:11 · Score: 1

I know of companies who rewrite the NT kernel so it won't crash, so I guess anything's possible.
I'm perfectly happy with linux and FreeBSD but where can I get one of these NT kernels? It's good to know because sometimes outside forces tainted by evil force you to use microsoft products.

--
--- Justin Dearing http://www.justaprogrammer.net/ We're just programmers.

Informix for Sinix by romanm · 2000-07-25 20:14 · Score: 1

First you have to forget about Linux and Intel. You may like your PC, but I don't think availability will be good.

That said ... We've been using Informix Dynamic Server on Sinix (UNIX for 16-processor R10000) with a database of a little more than 30 GB on EMC disk arrays and a *lot* of traffic. It works for us (knock knock).

Uuups, a few clarifications by CaptainZapp · 2000-07-25 22:47 · Score: 1

First for the sanity check: Of course the original question was if it's capable of running on Linux/Intel. I've managed telco billing systems, which are rather active and rather large by definition, on HP/UX 10.20 and that worked just fine. So, yeah: The question is Linux of course.

Getting in another RDBMS is certainly not an option. It's a full, worldwide distributed Sybase shop with a lot of replication going on between the sites (this database would not be within the scope of a distributed system). Even the dump / load incompatibility between HP/UX and Linux (tested) might be the show stopper.

The 2 GB file limit is not an issue with Sybase. The storage architecture is such that you create devices (which reside on files or raw partitions). A database can theoretically span 253 such devices (2 are reserved for system databases). So file size is not an issue.
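
To put rough numbers on the device argument, here is a minimal Python sketch. It assumes every device is file-backed and capped at 2 GB, and it takes the 253-device figure from this post at face value; the function name and the 30 GB input are only illustrative.

```python
import math

FILE_LIMIT_GB = 2        # per-file limit discussed in this thread
MAX_USER_DEVICES = 253   # figure quoted in the post, not checked against Sybase docs

def devices_needed(db_size_gb, device_size_gb=FILE_LIMIT_GB):
    """Number of file-backed devices needed to hold a database of db_size_gb."""
    return math.ceil(db_size_gb / device_size_gb)

needed = devices_needed(30)                    # devices for the database in question
ceiling_gb = MAX_USER_DEVICES * FILE_LIMIT_GB  # upper bound with 2 GB devices
print(needed, ceiling_gb)
```

Under those assumptions a 30 GB database needs only a small fraction of the available devices, and raw partitions are not bound by the file limit at all.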

Lastly, there's so much extremely interesting stuff to draw on, and I want to thank everybody who contributed. The gist for me is to advise against doing it, since it just doesn't seem to be worth the headaches (especially from a manageability POV).

I still think that Sybase on Linux is very viable (we implemented a few test DBs) for smaller scale databases and for non-mission critical data. Initial tests have not identified significant glitches, and the whole HP/UX support environment can be used almost 1:1 (provided ksh is used).

--
ich bin der musikant

mit taschenrechner in der hand

kraftwerk

Re:two more words: by oliverthered · 2000-07-25 19:03 · Score: 2

Redhat Oracle distribution.

Though I haven't tried it personally, Redhat do a very good Oracle-tailored distribution. It gives Oracle its own partition and is set up for performance & co. The support is meant to be quite good as well.

I have worked with quite a few DB systems (M$ sqeeel, Sybase, Interbase, as well as the less server-based DBs: Postgres, Paradox, Access & co.) and have an Oracle training course coming up soon. It has lots of info on Oracle for Linux, but as I haven't been on it yet I can't go into any details (but that is another story).

I believe Oracle will also run on other Unix platforms, and may be supported on Linux distributions other than Redhat.

--
thank God the internet isn't a human right.

Journaling File System by FlyingElvis · 2000-07-25 19:31 · Score: 1

There really shouldn't be a need for a journaling file system since the sql server basically does this already through the transaction log.

You said it... by mirko · 2000-07-25 19:12 · Score: 1

> Further, and even more important, this is a major chance to
> convince a global player of the capabilities of Linux.
Show them something big. I believe they won't really suffer if they have to pay for a machine that is *only* 5 times more expensive than a supermarket box.
They won't suffer because, with a 1k$ box on the one hand and the 5k$ box that your U/X reseller advises you to take on the other, you are still far from the 10k$ Sun stations.
You are also far from a consultant's weekly bill.
Also, you won't convince "global players" of Linux's capabilities by showing them something cheap (even if it is reliable, sufficient and competitive).
If they see that they can manage really big boxes using Linux, they are more likely to be convinced of this opportunity.
So, accept the 5k$ proposition (sounds like multiple x86 processors along with a RAID-5 array?) and show your boss that Linux is not a toy.
(Linux or BSD, of course...)
--
Trolling using another account since 2005.

Interbase? by Cliffton+Watermore · 2000-07-25 20:38 · Score: 1

Interbase is now OSS/Free Software, too. You left it out of the equation.

As for your claim about Microsoft SQL Server running terabytes of data, I'm very skeptical. I'm not an avid Microsoft user, but I've heard several stories about Microsoft crashing when serving more than 30 clients on a Microsoft Access 97 database. So, how NT will handle terabyte sized SQL databases is beyond me, really.

Open|Free|NetBSD/Linux and Interbase will probably be the OSS database combination for the next few months, until Oracle opens the source due to OSS software eating up their market share.

--
"A few atoms won't even light a match" - Dr Jones, 1933

Re:Interbase? by ScuzzMonkey · 2000-07-25 21:49 · Score: 1

Access and SQL Server are two very different animals, the MS connection notwithstanding. There's no real relation between the two, and frankly, anyone trying to run 30+ simultaneous users off Access was asking for trouble in the first place.

--
No relation to Happy Monkey

Excuse me! by Cliffton+Watermore · 2000-07-26 00:44 · Score: 1

Excuse me, lad!!

I am highly annoyed at being addressed in this manner. You have no idea of my qualifications, obviously. Please read my bio.

I've worked at Dow Chemical and contracted at various academic and biotechnology corporations in Europe and the U.S. I have been exposed to large-scale computing equipment, including high-end hardware running DB2.

Although I haven't used Microsoft products avidly, as I've said, I have had some experiences, and they weren't favourable. Although this hasn't included Microsoft SQL Server, my personal opinion - yes, my professional opinion based on my dealings with NT Server - is that NT Server itself would not be stable enough to cope with the loads mentioned.

If someone had to try and do half the stuff that I've had to do at various institutions on NT - I actually shiver when I think about the consequences, considering the platform wasn't even stable enough to compile some mid-intensity FORTRAN code my team and I were writing for chemical analysis at one firm I was working with. This was a while back, late '98 or so. The machines were dual PII Xeon 400s, if I remember correctly.

Before you question my "Experience", lad, please read my Bio. I'm not putting down your experience - we all have different experiences, and we should share them with each other in order to build on each other's knowledge. But please, before you comment, do read my Bio.

--
"A few atoms won't even light a match" - Dr Jones, 1933

Re:Excuse me! by Cliffton+Watermore · 2000-07-26 14:47 · Score: 1

Ummm, can you please point out where I said that MS Access == SQL Server? What I was saying was that NT Server is NOT stable enough to handle too many users. Please produce an affidavit stating that I said that MS Access == SQL. Otherwise, shut up until you have something useful to say.

--
"A few atoms won't even light a match" - Dr Jones, 1933

No.. Three words by twisteddk · 2000-07-25 19:59 · Score: 1

My sentiments Exactly.

I've just finished posting the exact same argument... Oh well, At last I know SOMEONE is on my side (or am I on Yours ?)

--
--- To err is human... Am I more human than most ?

Works by rxmd · 2000-07-25 20:08 · Score: 1

We tested Sybase Adaptive Server Anywhere and Enterprise as well as Oracle (the Linux versions) on SuSE Linux and FreeBSD 4.0, and it worked fine. FreeBSD was slightly superior in terms of performance.

File size limitations depend on your file systems. For both OS's, filesystems are available that handle 30G files easily.

Hardware was an HP NetServer 4 (LX Pro), dual Xeon 400, 512 MB RAM, 5x16GB RAID disk array on an Adaptec RAID controller.

--
As a state gets corrupt, its laws multiply; the most corrupt states have the most numerous laws. (Tacitus, Annales 3:27)

More details: Size, Users, Purpose by rxmd · 2000-07-25 20:15 · Score: 1

Forgot some details. Database size was 24.7 GB, but that shouldn't matter. We tested it with about thirty to fifty users in an ASP environment (requests were done from a Citrix server). The general setting was mostly data warehousing. This probably explains why FreeBSD performed better due to its comparatively good responsiveness under higher loads.

--
As a state gets corrupt, its laws multiply; the most corrupt states have the most numerous laws. (Tacitus, Annales 3:27)

MySQL and data warehousing don't go together by rxmd · 2000-07-25 20:18 · Score: 1

I wouldn't use MySQL in a data warehousing environment because its features are too limited (no stored procedures or triggers, no subqueries). If you want to do data warehousing, open source DBs are not an option (sad but true).

--
As a state gets corrupt, its laws multiply; the most corrupt states have the most numerous laws. (Tacitus, Annales 3:27)

HOWTO: Oracle on FreeBSD by rxmd · 2000-07-25 21:21 · Score: 1

Well, we mainly used the one in the FreeBSD handbook. It comes with FreeBSD, but it's also available on the web sites, for example here in the online handbook or on one of the mirrors. It works fairly well.

--
As a state gets corrupt, its laws multiply; the most corrupt states have the most numerous laws. (Tacitus, Annales 3:27)

Re:HOWTO: Oracle on FreeBSD by rxmd · 2000-07-26 05:10 · Score: 1

No. We upgraded it later on, more or less manually. I wouldn't write a HOWTO on that, though, because it was rather informal and not overly systematic. It was only for evaluation, after all. :-)

--
As a state gets corrupt, its laws multiply; the most corrupt states have the most numerous laws. (Tacitus, Annales 3:27)

Re:No, only Microsoft SQL Server can do it. Period by dodo-lodo · 2000-07-25 20:06 · Score: 1

So if you don't know how to work with a database, you should keep your hands off. Even if there is a nice GUI. Period.

DATAFLEX by freediver211 · 2000-07-25 19:42 · Score: 1

A Dataflex DB running on a UNIX OS would have no problem handling 30GB. See www.dataaccess.com ... This really is not a big DB compared to some of the other dataflex / UNIX sites out there...

Unix or Linux? by photon317 · 2000-07-25 23:59 · Score: 1

You ask if "Unix" can handle it, then later if Linux can. Of course, commercial unices can easily handle things as much as 1000x that size (yes, I've got one that big running about 30 feet from me), given enough money.

30GB is really a trivial size for a database in the modern age. I think that Linux/Oracle can _do_ it today, but probably not as well as a commercial solution (this is not flamebait: I love linux and can't wait for linux to be the top DB platform, but I don't think we're there yet).

Of course, if you're looking down a long road of development, testing, deployment, and then maintenance... by the time you get any significant distance down that road, there should be packaged-up ready-to-go Linux-2.4/Oracle machines that could really blow a competitor away. In which case I'd definitely start working with what you can in the Linux/Oracle field today.

--
11*43+456^2

Re: 30 gig no problem in HP/UX at least by HP-UX'er · 2000-07-25 19:56 · Score: 1

w3rd!

Re:50GB on Linux? by TwoFlower69 · 2000-07-25 18:52 · Score: 1

But I am not sure if it will work for long enough. Has anybody experienced fast Oracle under Linux? What kind of hardware did you use? Greetz Two

~30Gb Sybase Database by .foreward · 2000-07-25 20:20 · Score: 2

I would stick with Sybase. There are no major advantages to going over to Oracle, and 30 GB is not that big an issue. My advice would be to baseline this as a dual CPU box with at least 1 GB of memory. ASE 12 does nice things like "companion" mode to provide you with the other copy. You could also use Rep-Server to provide replication as the means to have a second copy. That is the nice thing about Sybase: there are many flexible options to solve a given problem. Like many of the other comments in this thread, I would stick with Sun on the box side, as you pay for the reliability. I would also suggest that you get professional help on the design and architecture side.

Quick Other-side... by Pyre · 2000-07-25 22:30 · Score: 2

Just a thought - you could deploy something that is, in fact, mission critical on Linux/ix86 - and it wouldn't even cost an arm and a leg. (Just a leg, perhaps.)

See:
Mission Critical Linux
Oracle

Re:No, only Microsoft SQL Server can do it. Period by j_skillz · 2000-07-25 19:31 · Score: 1

MS SQL Server is not the only system that can accomplish this task. MS SQL Server is one of the weaker db products. Oracle is 20 times more powerful and can run on Windows, Unix, or Linux.

PostgreSQL certainly will by Karora · 2000-07-26 04:56 · Score: 1

PostgreSQL will certainly handle databases this big. On Linux there is a 2 GB file size limitation (being removed in 2.4, I believe), but PostgreSQL will split its files at around 1 GB anyway to get around this.
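
As a rough illustration of the splitting described above, this Python sketch counts how many files a table would occupy, assuming a split size of exactly 1 GB (the post only says "around"). The function name and sizes are made up for the example.

```python
import math

SEGMENT_GB = 1     # approximate split size mentioned in the post
FILE_LIMIT_GB = 2  # Linux file size limit discussed in this thread

def segment_files(table_size_gb, segment_gb=SEGMENT_GB):
    """Number of on-disk files a table of the given size is split into."""
    # Assumption for this sketch: even a tiny table occupies one file.
    return max(1, math.ceil(table_size_gb / segment_gb))

# A single 30 GB table would be spread over this many files,
# each one staying below the file size limit.
print(segment_files(30))
```

The point is only that no single file comes near the limit; the total size is then bounded by the partition, not by the file system's per-file cap.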

There are other filesystem limitations that may have to be worked around with various Unixes, but managing to get a partition of 100GB or greater should be achievable. In Linux you would probably use logical volumes, but you could simply do it with links if you wanted.

--

...heellpppp! I've been captured by little green penguins!

sybase ase on linux, considerations. by scroe · 2000-07-25 23:01 · Score: 2

there are currently two versions of sybase for linux to concern yourself with. 11.0.3.3 and 11.9.2. one of the biggest differences is that 11.0.3.3 does not support raw partitions while 11.9.2 does. you will get much better performance from 11.9.2 using raw devices, also it lends towards better data integrity.
as for your system, you'll be amazed at how much you can accomplish with linux/intel. there are only two components that you really need to worry about, CPU busy and IO busy. if the system that currently houses your database is running sql, then you can run this to get an idea of how to set up a like system:

declare @loop_var int
select @loop_var = 0
while @loop_var < x
begin
    exec sp_monitor
    select @loop_var = @loop_var + 1
    waitfor delay 'yy:yy:yy'
end
where x = iterations and y = the delay in hr:min:sec.

run this during a "peak usage time", have the results dump to a file using the -o param and then take a look at the CPU and IO. you'll get something like this:

cpu_busy    io_busy    idle
----------  ---------  --------------
3(0)-0%     0(0)-0%    13863(5)-100%

this is a sybase ase running on red hat at idle. during production you will want cpu_busy to be in the range of 60-70% as this allows for some growth, if you hit 80% or more start planning for more cpu power. conversely, if your io_busy is getting hit hard it may indicate problems with your network configuration, or that your device configuration needs tweaking. poor performance from a sybase server is not always cpu related.
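
The rule of thumb above can be sketched in Python. This assumes fields shaped like the sample output (a number, a number in parentheses, then a percentage); the function names and the wording of the advice strings are hypothetical, and the 70-80% band is an interpolation not spelled out in the post.

```python
import re

def busy_percent(field):
    """Extract the trailing percentage from a field such as '3(0)-0%'."""
    match = re.fullmatch(r"(\d+)\((\d+)\)-(\d+)%", field.strip())
    if match is None:
        raise ValueError(f"unrecognised sp_monitor field: {field!r}")
    return int(match.group(3))

def cpu_advice(cpu_busy_field):
    """Map a cpu_busy field onto the thresholds described in the post."""
    pct = busy_percent(cpu_busy_field)
    if pct >= 80:
        return "plan for more cpu"
    if pct > 70:
        return "above target, keep an eye on it"  # assumption: band not covered by the post
    if pct >= 60:
        return "in the target range"
    return "plenty of headroom"

print(cpu_advice("3(0)-0%"))
```

Feeding it the cpu_busy column from a file captured with -o gives a quick first pass; io_busy still needs a human look, since the post ties it to network and device configuration rather than to a fixed threshold.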
i run a 10GB DB on an intel pIII 600(ish) with a 1/4 GB RAM and my cpu_busy sits around 65% most of the time. except when users try to dump the contents of their windows "c" drive into the database...grr.
hope that helps, ymmv of course.

-scroe

Nobody heard of Progress RDBMS? by LinuxBuddha · 2000-07-25 23:41 · Score: 1

Progress runs on many platforms, including SCO Openserver and Uniserver, DG-UX, RedHat, and HP-UX. We are talking MULTI-VOLUME databases. They can be split amongst many machines and hard drives. Up to 30 TB! Clustering anyone?

Re:Three words:with three words by jaraco · 2000-07-26 06:18 · Score: 1

Are you suggesting they mirror their database with mySQL, which doesn't even support transactions?

249 comments