Slashdot Mirror


Linux Databases with Huge Tables?

Eugene writes "I am working on a project to create an expansive network monitoring and all encompassing database solution. With archival and ageing of all data we are looking at a database that grows by 40Gb a year, with some tables being well over the 2Gb file limit in Linux/x86. We have narrowed ourselves down to Oracle 8i (for it's stalwart referencial integrity checking, and PL/SQL for tracing across aged route information) and PostgreSQL (for it's object polymorphism and CIDR data types for IP addresses). We are concerned with the robustness of Postgres compared to "grown up" Oracle, as well as file size limitations. Can anyone give us any advice on these issues? "

220 comments

  1. Oracle 8i on FreeBSD 3.3? by Anonymous Coward · · Score: 0

    Has anyone done this yet?

    1. Re:Oracle 8i on FreeBSD 3.3? by Anonymous Coward · · Score: 0

      FreeBSD for their network appliance. Last I heard, they were using NetBSD as the reference OS for their appliance.

    2. Re:Oracle 8i on FreeBSD 3.3? by Anonymous Coward · · Score: 0

      Your information is incorrect. NCI's OS is based on NetBSD. NetBSD already had the port to the SA110. Just think about it, which OS would be more easily ported to a new arch?

    3. Re:Oracle 8i on FreeBSD 3.3? by bifrost · · Score: 1

      They are using FreeBSD, I have it from a reliable source inside of NCI. We've talked about this at the BAFUG (Bay Area FreeBSD Users Group) Meetings quite a bit.

    4. Re:Oracle 8i on FreeBSD 3.3? by bifrost · · Score: 1

      Chances are they have had more than one NC at NCI.
      I wouldn't be suprised if they had StrongARM based units, but I believe the inital run was done with FreeBSD. When I was first in the know about this, it was late 1997/1998, so things may have changed quite a bit. Its quite likely that NCI switched to NetBSD to switch architechtures easily.

    5. Re:Oracle 8i on FreeBSD 3.3? by dennisp · · Score: 1

      this may help. I haven't tried it myself yet though, because I run Oracle on Solaris. I heard that they haven't released it as a native binary yet because they were having performance problems that were only recently fixed in the latest release and -CURRENT. I have no idea how one would get ahold of them if one wanted a native FreeBSD version.

      It's also rumoured that Oracle's Network Computer division is using FreeBSD for their Oracle 8i appliance.
      ----------

  2. IBM DB2 for Linux? by Anonymous Coward · · Score: 0
    Have you considered DB2 for Linux?

    The current version is 6.1 and I beleive you can download a version for personal use from www.software.ibm.com, that you can use for comparison.

    1. Re:IBM DB2 for Linux? by Anonymous Coward · · Score: 1

      I would agree. Of course, I like DB2, but I can point to one major advantage that DB2 has over Oracle -- when something goes wrong, DB2 just gets slower and slower. Oracle will commit hari-kari and cheezle your data but good. And yes, I have a few years with both, and Informix as well (which isn't bad, better than Oracle, but without IBM's wonderful documentation). I have been very pleased with the DB2 betas, although I have been playing with 60GB of data and working mostly with huge numbers of small (1.5MB) files. I have had no issues at all. I will have four large FC-AL disk stacks to play with next week (8x36.4) and plan to grab stuff off of the EMCs feeding the S70As and see how far I can push it with decent performance off of a small Netfinity with 1GB RAM.

      I completely understand the position of this guy. We have several TB of old data, some of which we are having to rent MVS time to process, a lot of which comes from companies that we have taken over, and all of which really needs to be sorted and folded into our setup (AIX, DB2, SAP) within the next few years while people still remember what they were doing and what the *gotchas* were with their accounting and all. I think that Linux/DB2 will do the job as long as people are really using it lightly (relatively) and Linux DB2 is "real" DB2 -- from Linux to AIX without a hiccup.

      Of course, with a less insane load (or a nice four way box), I am sure that things would run better, but they aren't doing to badly now for me.

      If I can get another 1GB RAM, I will be able to add more users, but for the time being Linux DB2 is running like a typical IBM product -- slowly, but with the reliability of a Cummins deisel.

    2. Re:IBM DB2 for Linux? by the+eric+conspiracy · · Score: 2

      DB2 has a big advantage over Oracle - the documentation is human readable. It is also a mature product, which is very important ofr a large scale mission critical application. IMHO, I would choose between DB2 and Oracle if I really cared about my data.

  3. 4D by Anonymous Coward · · Score: 0

    4D is very nice, though I don't know what platform you are planning to use. I'd have to say that Oracle 8 on a Sun Enterprise running Solaris is the best you can get, but it will cost you an arm and a leg.

    1. Re:4D by HiThere · · Score: 1

      They may have fixed this by now, but a couple of years ago there was no way to delete a variable from a table. You could add them, you could rename them, but you could not delete them. I found this incomprehensible.

      --

      I think we've pushed this "anyone can grow up to be president" thing too far.
    2. Re:4D by hey! · · Score: 3

      Except that from what I can see, four out of five 4D projects end in a bloody train wreck. I know, I've been involved in a number of rescue missions for 4D users.

      4D has a number of major problems.

      First, it is not relational in a technical sense; that is it does not allow for non-procedural programming. It uses a kind of foreign key structure to "relate" (ugh) one table to another, but the programming style is based on traversing links, and it implements that badly (imagine programming in a system which only allowed you one cursor). The programming style this enforces is much more like a hierarchical or network database. The lack of cursors means that 4D programs show poor modularity; whether any piece of code works depends on the global state of the system. This results in code that behaves inconsistently. Furthermore the weak object model tends to encourage use of global variables to control system behavior.

      4D was a database designed by people with only a vague understanding of database technology, and this shows by their confusing tedency to use technical terms like "project" (not as in project management but as in "project an image") but in ways completely unrelated to their accepted meaning.

      The 4D file format is at once highly fragile and stunningly slow. Past work I've done with recovering crashed 4D database indicate that it depends critically on internal file pointers, which tend to get corrupted. The internals are not documented, and the tools ACI gives you are utterly inadequate. You absolutely _must_ buy third party tools from people who have reverse engineered the structure, and you absolutely must run these any time the database has terminated abnormally. These tools are very slow. So after any abnormal termination, you can count on much down time. Whats worse is that the combination of slowness, and the lack of anything like SQL, means that designs almost always have to carry acucmulator fields in tables that must be updated (to keep monthly totals etc). So in addition to running the third party tools, you have to gin up your own tools to fix these fields.

      The 4D developer environment is one of the most poorly thought out I've ever seen. You MUST lay out your database in a graphical layout window, and "relate" your tables using a connection tool in this window. This works fine for three or four tables, but in a system with twenty, thirty or more tables it often takes minutes of painstaking mousing to figure out the "relationship" between two or three tables.

      This points out 4D's real weakness. Things like the "current record" model of programming and the graphical table layout make it fairly easy to get started with a small system, but unless that system stays tiny (in complexity AND number of records) you are bound for trouble.

      I could go on and on about the shortcomings of 4D; but I know some developers like it. Different strokes, I guess, but one thing is absolutely certain: 4D is a very BAD choice for large, mission critical applications.

      --
      Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
  4. Re:SQL 7 by Anonymous Coward · · Score: 0

    Even Sybase Anywhere 5 is faster than MSSQL7. My company is switching over some clients and some of the queries that produce very large results sets take 30 seconds (two on Sybase 5) using the same computer and ANSI Sql.

  5. Useless by Anonymous Coward · · Score: 0

    Perhaps the fact that mysql has no transaction support to speak of is the limiting factor. C'mon, they obviously need that sort of thing, to suggest a database without it is ludicrous.

  6. The topic is Linux databases... by Anonymous Coward · · Score: 0

    And M$ released SQL Server 7 for Linux when?

  7. One word: Transactions... by Anonymous Coward · · Score: 0

    And MySql doesn't have them. PostgreSQL
    does.

    -- cary

  8. Two Comments... by Anonymous Coward · · Score: 0

    1) The place to go for the scoop on how PostgreSQL
    works is the pgsql-hackers mailing list at
    http://www.postgresql.org.

    2) Have you *priced* your possible Oracle
    solution? Remember, that nice free Oracle8i
    CD they mailed to us all earlier this year
    was for evaluation only. I'd be interested
    to know how much it would cost.

    -- cary

    1. Re:Two Comments... by Speed+Racer · · Score: 2

      We just finished talking to Oracle and a VAR for Sun. A simple setup to support ~30 users was going to be anywhere from $500,000 to $2,000,000 depending on the hardware configuration. The actual hardware price was well under $100,000 so you do the math.

      --
      Free Mac Mini. Yes, I'm
    2. Re:Two Comments... by Loof · · Score: 2

      Oracle 8i Enterprise is $200 per Mhz. Dual P2-450 = $180,000 Support is extra, so is a bunch of other useful to necessary options. You need to get at least it, support and 'Programmer' to do anything useful. Fun fun fun. Other databases are also stratospherically priced. Informix is around $50K per CPU (for the new 2K version) And Sybase is pretty close to that, maybe a little more expensive. DB2 is $12K per CPU last time I looked. None of them are by any measure close to what would be considered cheap. Or, in most books, affordable.

  9. I hear it's faster. by Anonymous Coward · · Score: 0

    Raw disk I/O is faster, isn't it? That ought to make it worth the price of addmission.

  10. LANdb by Anonymous Coward · · Score: 0

    (Ack, forgot my /. acct info...) I'm currently working on a similar project, which is primarily based on MySQL, but easily portable to other RDBMS's. Totally web-based, all Perl, etc. Coincidentally, I released v0.70 this morning. Perhaps some of what we've done can help... Here's the site, if you're interested.

  11. Digital UNIX does so rock. by Anonymous Coward · · Score: 0

    The big boys _HAVE_ released databases on Alpha, just not Linux on Alpha. If you like Linux, you will love Digital UNIX, er, Compaq Tru64 UNIX that is. I would LOVE to see AdvFS on Linux, but until there's a solid journalling online-resizable filesystem, Tru64 is the server OS for me. With AdvFS, I can rearrange the disks under filesystems with the filesystems online and Oracle running.

    I run Oracle on Alpha every day, and I am so very rarely CPU bound. If you can afford Oracle and the box to put it on, you can spend the extra $$ and get Tru64 UNIX to go along with it. You will thank yourself when your Linux box goes into the fsck-boot-fsck-boot death spiral. Your favorite CD ripper might not work on Tru64, but that's not what a production database server is for anyway.

    OBTW: If you don't need a feature unique to Oracle 8i, you would do better to install 8.0.5.1; it's much less buggy right now.

    On a side note, the 2G filesize limit on 32-bit Linux, while annoying for dealing with large media files, isn't a limitation for Oracle on Linux.

    And before the flames heat up, know that I'm writing this on a laptop running Linux, with Oracle 8i installed. Linux is great for many things including web servers and development database servers, but putting 40+ gigs of Oracle data on ext2 sounds risky to me.

  12. Book(s) about MySQL??? by Anonymous Coward · · Score: 0

    I like what I've seen of MySQL and have started developing an application using it that is expected to generate similar volumes of data.

    However, I have not been pleased with the only book I've been able to find about cgi - MySQL programming ("mSQL and MySQL", reviewed positively on Slashdot recently).

    The only things that have made it possible to proceed are: considerable experience with SQL, considerable experience designing / developing database applications and the MySQL reference manual. Morris

    1. Re:Book(s) about MySQL??? by 2sheds · · Score: 1

      I have to agree with this completely; the O'Reilly book was very disappointing. Not a patch on the on-line docs.

      james

      --

      Absit Invidia
  13. Re:Let the database do it! by Anonymous Coward · · Score: 0
    Last I heard, maybe it's changed, Linux didn't support raw partitions.

    Technically, it doesn't support "uncached" partitions. I don't know if the buffer cache gets in the way of DB, performance-wise.

  14. Solaris is the answer. kernel 2.2 will have ...... by Anonymous Coward · · Score: 0

    I am not a linux basher but after reading the docs about kernel 2.4 I realised that 2.2 still has major scalability problems. linux can't even handle more then 1,000 processes if I remember correctly. Even if its not 1,000 the limit makes anything liek a huge database totally inapropriate for linux.

    I also heard that microsoft next fud attack will involve sql server on NT vs oracle on linux. Sql server could easily beat the pants off of oracle/linux. Oracle/unix is a totally different subject matter though.

    Wait until kernel 2.4 or use solaris x86. It can run linux binaries unmoddified. If your bussiness has the dough buy a sun box with solaris because the solaris native version of Oracle will probably run faster.


    Again I stress that I use linux at home and its a great OS but the limits on processes and the current resource manager in the kernel is quite bad for huge machines with many processors. This wont effect workstations and small servers like snedmail so linux is great for those tasks. I want to wait it out when 2.4 hits the streets.

  15. Flat files and hash tables! by Anonymous Coward · · Score: 0

    Wouldn't the true hacker just slap down the data sequentially in a file or disk and come up with his own access schemes?

    1. Re:Flat files and hash tables! by Anonymous Coward · · Score: 0

      Yes, if he's being paid by the hour. If on the other hand (1) he is selling a solution instead of a long drawn-out maintenance nightmare, or (2) there is beer to be consumed after successful implementation, then a prefab RDBMS wins every time.

      Not to say hash tables aren't cool.

    2. Re:Flat files and hash tables! by Anonymous Coward · · Score: 0

      Only if s/he was a piss-poor hacker and didn't know any better. Nobody else would be stupid enough to reinvent the wheel.

    3. Re:Flat files and hash tables! by Anonymous Coward · · Score: 0

      And you are a visual basic monkey from the local high school, right? The last time you had to manage a b-tree or build a hash table you quit your job and bitched about your manager for six months afterwards, right? You are the greatest programmer on earth as long as there is a VBX control for what you want to do, right? When people reinvent the wheel you get nervous because it might be different then the wheel you know how to use, right? Even the best SQL engine doesn't have the understanding of the database content that the programmer does. Your approach is the bloated Microsoft coder approach. Real programmers, the old school types that have needed to count how many cycles are in the path of a critical piece of code at least once in their life, can make the best judgement wheather flat files/hash tables are sufficient or if something more extream is needed.

    4. Re:Flat files and hash tables! by Anonymous Coward · · Score: 0

      SQL databases provide complete functionality for everything, whereas someone who reinvints the wheel can provide limited functionalty. For example, if your keeping transactions, and your only gonna look them up by date, you can shuffle your data in a hash table that will be faster than any SQL database.

  16. Re:Be aware of the complexity of Oracle by Anonymous Coward · · Score: 0
    'How to stop defragmenting and start living'

    Is a URL for this document available? Nothing on http://technet.oracle.com/ or http://www.google.org.

    Regards, Dirk

  17. ITASCA by Anonymous Coward · · Score: 0

    www.ibex.ch

    Available now for Linux. The DB of choice for `impossible' requirements [not talking about size here].


    17.5 Size Limits in release 2.3.5
    17.5.1 Classes
    The maximum number of classes in a single distributed database is 4.3 billion.
    17.5.2 Attributes per Class
    The maximum number of attributes per class is 16.7 million.
    17.5.3 Instances Per Class
    The maximum number of instances per class is 4.3 billion multiplied by the number of
    private databases in the system.
    17.5.4 Sites
    The maximum number of sites in a single distributed database is 32,767.
    17.5.5 Private Databases
    The maximum number of private databases on all sites in a single distributed database is
    32,767.
    17.5.6 Long-Data Object
    The maximum size of a long-data object (audio, image, text and others) is 8 megabytes.

  18. Check out interbase by Anonymous Coward · · Score: 0

    I've tried a lot of databases over time, and interbase stood out as excelent. Also it does a lot of the work for you, where in many other systems you would normally need an oracle, ingress or whatever DBA. I haven't tried it under the circumstances you mention, but I would recommend looking at it. And remember, the platform the oracle run's best on is the 35mm slide projector. Cheers

  19. better get a mainframe by Anonymous Coward · · Score: 0

    How many users are you going to support? If you want good response time, you'd better consider getting at a minimum AS400 or a OS/390 mainframe running DB2. I'm all for unix, but a 2 Gig file limit is going to be quite a restraint for a 40gig table. About $50million should get you the hardware thats needed.

    1. Re:better get a mainframe by Anonymous Coward · · Score: 0

      Actually, I should have said:

      $1,200,000 (max) initial cost (depreciating)
      $300,000 recurring

      Of course, you could actually get a used one for less and you don't really need all of that disk at once, so you could look at the $1.2kk as $600,000 right away, with the rest over three or more years. Depends on your rate of growth, cost of capital, and so on.

      Do you have a data center? Do you need one?

      There is always the option of a lease, as well, even of the disk, from IBM. That could cut your costs by a lot, but these days there isn't much of a way to structure and operating lease as a capital lease, so you would lose on the tax end there.

      Out of curiousity, what do your load projections look like? What do your usage projections look like? How fast can people need the data (i.e., will you ever have a "CEO needs this in 20 minutes" situation)? i86 is porky and slow, but it is cheap, and you have a hard time beating the cost if no one minds it taking a while.

      Been doing this a while, can you tell? ;)

    2. Re:better get a mainframe by Anonymous Coward · · Score: 1

      Well, I don't think so. A basic Multiprise with RAID would run you about $400,000, a few decent disk stacks would be $800,000, and that would be enough for 1.1TB of data on SSA and/or FICON, with the power of a low end G5 box (one MCM). Figure in licencing and upkeep at $300,000, it is air cooled, and you are only looking at $1,500,000. And that would be enough power to run a small Fortune 500 company.

      If you needed to access all of that data at once, day in and day out, while running multiple jobs sorting that same data with maximum uptime, that would be ideal. But if you needed the uptime, UNIX would be cheaper (HACMP with two S80s at $350,000 each with disk and 8CPUs and 4+GB RAM)(and a RAMAC or a 7133 stack on two SSA loops) for that application.

      Both UNIX and mainframes have their place (even AS400s have their place in large environments)(not to say that they don't do fine in small ones either), but I would say that this is a toss-up between a commercial or a free UNIX and a commercial or a free database. I would vote for Linux and DB2, probably, as long as the load wasn't too heavy.

      And don't assume that you still need to spend $50,000,000 on a mainframe. You can get really nice mainframe clusters (!) for less that $15,000,000 these days!

    3. Re:better get a mainframe by HiThere · · Score: 2

      Mainframes have their points, but only a few applications require them. Without knowing the environment proposed for the database it's hard to be specific about the hardware. Still, the earlier recommendation of an Alpha-based processor might be a good one. Or a mainframe might be the best. Or...

      For that size of file growing that fast, I doubt that an I*86 would be the right answer. But it depends... given low use and tolerance for waits, it could be.

      Backups could be a real drag though!

      --

      I think we've pushed this "anyone can grow up to be president" thing too far.
  20. Re:Some Oracle features to consider by Anonymous Coward · · Score: 0

    Last I heard though Oracle themselves recommend a cold backup once a week or so.... I suspect there's a fairly good reason. -- Frode

  21. DB2 is the only choice for large tables by Anonymous Coward · · Score: 0

    DB2 is a good choice if you're looking for large tables because they just got the "clustered edition" (a.k.a. DB2 EEE) prototyped on Linux. That means you can spread your database across several machines, up to 999. So whatever limit you were worried about: data size, number of rows, etc. DB2 can handle it easily. I've actually run a database on DB2 where I used 10 Terabytes of disk (1 Terabyte of raw data, 3 Terabytes of indexes, and 1 Terabyte of temporary space). The rest was used for backups, input file, etc.

    1. Re:DB2 is the only choice for large tables by Anonymous Coward · · Score: 0

      Was this on Linux or AIX? If it was, that would make me feel a lot better, although I only have about 2.5TB right now.

      You are probably going to say "MVS" though...

  22. Re:Solaris is the answer. kernel 2.2 will have ... by Anonymous Coward · · Score: 0

    So increase the numer of Processes allowed by the #define in the Linux kernel source and recompile. The default is probably 1024 like any other Unix OS. Of course the limit will be based on the data type size used in the kernel of int, long ....

  23. Re:Choose Life, Choose Oracle? by Anonymous Coward · · Score: 0
    Sorry, I didn't mean to say the PL/SQL has NO place in a database environment.

    What I meant was that java inside Oracle 8i has more power than PL/SQL:

    • OO code inside the DB
    • OO data-types inside the DB can be accomplished only with Java in Oracle 8i
    • Networking (TCP/IP), CORBA from inside the DB (have you ever tried that with PL/SQL?, it's possible, but try to imagine the problems...)
    Cheers
  24. You get what you pay for by Anonymous Coward · · Score: 0
    nuf said

  25. Re:Database choices... by Anonymous Coward · · Score: 0

    Around the mid to late 70's RDBMS' where coming out as commericial products. RDBMS' were invented by IBM. Postgres was started at Univ of Berkely as you said under this Professor as you mentioned using a DARPA contract. Thus it always had to be free since public funds were used. The good professor then took the buggy code and fixed it started a commericial company and product call Ingres. He then sold it to Computer Associates. The good professor [knowing money talks] then took the postgres DBMS and started another DBMS company with the goal of being an OODBMS so that pictures and videos can be queried. This DBMS company was known Montage(Meaning a collection of pictures) around 1990. For some unknown reason I don't know about the company was renamed from Montage to Illustra which he then sold to Informix. I forget what Informix calls it now. Around 1994 when the internet took off and a DBMS was need to interface to by the free world as a free DBMS, he was asked to change the Berkeley rules so that the old code from the '70s could be made open-source which then was worked on by the masses and called Postgres. During this time, Postgres has been improved now to what it is today and I believe is OORDBMS (this is different from OODBMS). I don't know if it has oql (object query language) yet. All we need now, it all the tools and capabilities that all the commericial DBMS' have. Moreover, the capability to have a frontend that makes it look like Oracle, or Informix, or Sybase etc..... I'm looking for the Oracle Designer/Developer tool that's free and can do what EasyCase or ErWin can do.

  26. Informix is your answer. by Anonymous Coward · · Score: 0

    I'd advise using Informix Online/DSA/IDS/whatever_their_name-of-the-day-is, (not SE) here. Informix for SCO runs perfectly on Linux with iBCS module loaded in kernel. Use raw disk partitions as your dbspace "chunks" and you need not worry about filesize limit for "cooked files", You are still bound to a 2GB max chunk size, but you can have boatloads of them, and the engine can spread a table out across many chunks within a dbspace. Performance is also about 40% faster when using raw partitions over cooked files, and there is also a built-in mirroring support within the database engine itself for raw partition chunks!!! so there's your fault-tolerance handled within the engine!!! I can't believe nobody hasn't mentioned Informix here yet, check out http://www.iiug.org/techinfo/linux/linux_top.html for more info.

  27. Re: theory vs practice by Anonymous Coward · · Score: 0

    > In theory, theory and practice are the same,
    >but in practice they're different

    for those who don't understand what theory means.

    (btw (a shameless plug), my company is hiring. See my home page for more info.)

    Xah
    xah@best.com
    http://www.best.com/~xah/PageTwo_dir/more.html

  28. Re:Oracle 8.0.5 not 8i! by Anonymous Coward · · Score: 0

    YES! 8i is somewhat buggy!
    Use 8.0.5.1.x instead, unless you really need some feature of 8i. For most cases, I'd bet you don't.

    Try this in sqlplus under 8i for a quick example:

    set linesize 172 (or anything much larger than 132)
    select * from user_tables;

    BOOM! This blows chow on both the Digital UNIX and the Linux version.

    This isn't the only one; I've heard from good sources that there are various problems with create table as select in parallel as well.

    As a general rule, Oracle numbers that end in .0.0 shouldn't be put in production!
    (8i is currently 8.1.5.0.0)

  29. Re:mysql? by Anonymous Coward · · Score: 0

    AFAIK, mysql gets it's speed by allowing clients to lock entire tables at a time, one client at a time. Last time I checked, mysql did not support row level locking like postgres and oracle both do. This may be different now, however. I normally consider db's such as mysql good for applications where query speed, not volume is the primary design goal. Please correct me if I'm wrong. -Brett

  30. Re:mysql? by Anonymous Coward · · Score: 0

    AFAIK, mysql gets it's speed by allowing clients to lock entire tables at a time, one client at a time. Last time I checked, mysql did not support row level locking like postgres and oracle both do. This may be different now, however. I normally consider db's such as mysql good for applications where query speed, not volume is the primary design goal.

    Please correct me if I'm wrong.

    -Brett

  31. Not all features are created equal by Anonymous Coward · · Score: 0

    MySQL has a long list of features. But note that the MySQL features that others are missing are almost all in the "nice to have but not essential" or "we can work around that missing feature" category. The featuers that MySQL is missing are in the "showstopper" category. Subqueries, views, transactions. Even if you can come up with a workaround, it is bound to be pretty darn inefficient which negates the whole purpose of using MySQL: speed.

  32. Re:Why not Informix? by Anonymous Coward · · Score: 0

    I have contracted at a place that used Informix. The best thing I can say about it is that administration isn't too hard, and dbaccess (the query tool, like sqlplus in Oracle) is easily scriptable. As far as scalability goes, it's not up to the task mentioned here. We had much less data (20GB), and it performed poorly when the user level rose. We did a lot of optimization, query plan analysis, and got a lot of hints from the Informix folks. Still slow even on their expensive Solaris boxes. We also had some problem with bugs in 7.30, where stored procedures ignored query plan advice, and would pick a stupid join path despite our best efforts to the contrary. There were a couple more bugs also, though I don't recall them off the top of my head. They weren't show stoppers, but they were really annoying to work around. Does anyone know if those $99 Informix for Linux licenses from Intraware are for deployment or just development? They were running a special earlier this year at http://www.intraware.com/.

  33. Re:Choose Life, Choose Oracle? by Anonymous Coward · · Score: 0

    Oh, and also, speaking to PL/SQL, it's pretty lame, and very annoying to work in. If you can, USE JAVA. And also question very seriously why youre using it in the database at all!!! Don't underestimate the complexity (layers-wise, tools-wise, people communication accross dev teams-wise) you have when you have programming at the app level and the database level. PL/SQL is harder to manage (source code control is a bitch) and the compiler isn't half as good as JavaC, which enforces good programming practices (especially important in large groups.) My experience with PL_SQL is that even with packages, you end up with the code being very 'tightly coupled'.. that is, one big, freaking dependency. Oh, and avoid triggers if you can, as they add yet ANOTHER layer of complexity to your code, and frequently becode 'invalid', creating strange and annoying errors. As for the argument that SQL executed in the database gives huge performance benefits, well... if your app resides on the same box or over a fast network (such as an app server), the performance gains, I believe, arent usually worth the trouble. Just go with JDBC or DBI. An story: Some of you may remember when Oracle first was trying to get on the web.. and some of you may remember that their site was so bad performance wise, you couldn't even get into it. I spoke with support staff over there and mentioned this. This one guy I got on the phone felt like being a good guy and actually told me what their problem was: the entire site was: Hosted on ONE box, (a edict issued to them by LArry Ellison, no doubt coming off a bad cocaine binge, to have the entire site on one box to prove that big UNIX and Oracle hardware could do the job, AND, that all the HTML was parsed and delivered from the database using PL/SQL. He also mentioned that under heavy loads, PL/SQL had become __THE__ performance bottleneck in their system. So there. Don't know what architecture changes they've made since then, but their site is now running respectably.... Ok, so, in the final analysis, go with Oracle (which is a killer DBMS, nothing more), but use PL/SQL and in-server Java sparingly, if at all. Anyone differ with this? Why? Oh, and let's have more hardcore questions like this one re: Databases. Everyone learns.

  34. And how much did you pay for your OS.. by Anonymous Coward · · Score: 0

    Or your editor, or your windowing system,
    or your compiler or your debugger, or your
    web server.

    Me? Nothing. 0. Zip. Nada.

    "You get what you pay for" is disproven by
    counter-example.

    -- cary

  35. Re:Oracle, MySQL and PostgreSQL? by Anonymous Coward · · Score: 0

    I can get behind all of these assertions. You are a AC with real world experience. I was also wondering why use > 1 flavor of DB? IMO, OODB does not really scale. Give it up. I don't wanna dump on PostgreSQL or any of the other freebies. But scaling, backup and recover, etc, are much more trouble than you might appreciate. I'm also curious what project this is for. I've already written two SNMP managers w/database support. I'm always surprised this subject keeps coming up. I would think it's done to death.

  36. Re:Stonebraker by Anonymous Coward · · Score: 0

    Cohera? That's interesting..... I as far as I know, Stonebraker is the Chief Technology Officer at Informix. I just received an email a few days ago from him, stating as such. This is public information since he is an officer and can be found in filings with the SEC. Also, I believe Informix has now released their new Object-Relational engine Internet Foundation 2000 for Linux. Check out their web site www.informix.com.

  37. Re:Do it in ODBC/JDBC (with DB/2 or MySQL) by Anonymous Coward · · Score: 0

    You are a little off. 1. Java will not give you effective persistant transactions on top of MySQL 2. MySQL running on ext2fs will wait on the fsck during recovery (longer than you will want) 3. MySQL has poor features for handling backups 4. MySQL has no concept of true transactions or fast recovery features 5. MySQL has extremely poor mixed read/write performance 6. MySQL has no features for replication or scalability Dru Nelson

  38. Linux FS Limit != FreeBSD 3.x FS Limit by Anonymous Coward · · Score: 0

    I can't quote numbers reliably, but I remember that FreeBSD 3.x filesize limit was truly massive (way above the petabyte range). From my experience with it, FreeBSD *seems* faster at doing most everything.

    My System is possesed by Deamon!! And I like it! :)

  39. Linux FS Limit != FreeBSD 3.x FS Limit by Anonymous Coward · · Score: 0

    I can't quote numbers reliably, but I remember that FreeBSD 3.x filesize limit was truly massive (way above the petabyte range). From my experience with it, FreeBSD *seems* faster at doing most everything.

    My System is possesed by Daemon!! And I like it! :)

  40. Re:Why not Informix? by Anonymous Coward · · Score: 0

    Informix is up to the task. Get Online 7.31.UC3. There was a bug with Index pages being held in the buffer cache too long but this is fixed now. Bloomberg use it for 70,000 users on 300Gb of data across 24 CPUS and it scales fine. It is truely multithreaded unlikely Oracle.

  41. Yes, PostgreSQL will split the table... by Anonymous Coward · · Score: 0

    Over several files. I've never set up a database
    that big, so I'm not sure how easy it would be
    to split the files across several filesystems.

    Oracle has a more sophisticated system with
    tablespaces (is that right? I forget, I haven't
    used oracle for a couple of years) that you
    can put wherever you want.

    -- cary

    1. Re:Yes, PostgreSQL will split the table... by timcuth · · Score: 1

      With Oracle 8.0 and higher, plus the partitioning option, a single table can be stored in multiple tablespaces and/or datafiles. I am sure that this feature can be used to exceed the 2GB filesize limit.

      However, it introduces other management problems. Once you have created a table so large, you would no longer be able to create a single-file export of the table. You can, however, export each partition, separately, so I guess everything would be okay.

  42. Re:Do it in ODBC/JDBC (with DB/2 or MySQL) by Anonymous Coward · · Score: 0

    MySQL will not do sub-selects. This seems minor until you try to do DELETE() or UPDATE() based on joined tables. Think twice about MySQL. (Unless you enjoy writing hacks to get around these limitations)

  43. Use a Union buddy by Anonymous Coward · · Score: 0

    Use a Union, Makes a more complicated query but lets you partition tables as much as you want.

  44. Oracle Costs are not set in stone! by Anonymous Coward · · Score: 0

    One thing to keep in mind is that Oracle costs are not set in stone. If the price is too high then say that to your Oracle sales rep. Oracle will always bring down their price rather then loose a sale. Getting Oracle in the door is more important then making big bucks off the initial sale -- at least in their eyes. They'll make thier money many times over from the yearly support contracts, subsequent upgrades, etc.

  45. Try MS SQL Server 7.0 by Anonymous Coward · · Score: 0

    I highly recommend MS SQL Server 7.0 RDBMS. Yes, it's running only on NT, but scalability is improved over previous version 6.5, which does not scale past four processors-scalability limits in NT caused by excessive thread context switching. SQL Server 7.0 is replacing native schedule service with the SQL Server's own thread scheduler. Several tests have confirmed scalability up to eight processors. Terabyte size databases are not problem, although DB/2, Oracle or Informix are more proven. I will not definitely recommend RDBMS without transactional support. Transaction was part even of the first System R database. Speed? Who need the speed when you louse the data? Programming is great with Transact-SQL, superset of SQL-89/92 language, similar to Sybase. Included also are great OLAP engine and data loading/unloading tool (DTS). You can order very easy on MS Web site trial version (120 days) for the cost of shipping and handling. Database itself is starting around $1400 for the server software with 5 client licenses (up to four processors). For more processors you need to purchase enterprise version, in the order of several magnitudes cheaper then Oracle, terrible expensive and terrible complex database software. If you ever reach SQL Server 7 limit, convert to Teradata for NT or UNIX. I have Teradata for NT on my laptop, running only with 64 MB of RAM (heavy swapping). Sinisa Catic

  46. Re:You're out of date, charlie. by Anonymous Coward · · Score: 0

    "PostGreSQL's only problem nowdays is performance on huge databases." Hmm, PostgreSQL 6.5.2 still doesn't support foreign keys and referential integrity. Now I'm a novice at database stuff, but isn't referential integrity kind of important?

  47. Re:Why 'troll'? by Anonymous Coward · · Score: 0

    Totally agree. The problem with many linux users I have spoken to is that they are about as narrow-sighted as microsoft. Linux is *not* the best solution for everything. Neither is Windows, nor FreeBSD. There is no one perfect OS. For a database like this, I'd be leaning towards a Solaris box. For a "free" alternative, why not FreeBSD?

  48. Re:Oracle, MySQL and PostgreSQL? by Anonymous Coward · · Score: 0

    Why doesn't an OODBMS scale? Is there an upper limit to DBMS size? A limit as to distributing it across platforms?

  49. Re:Interystems' Cach� by Anonymous Coward · · Score: 0

    Beware. Most folks I know using Caché are doing everything in their power to get away from it. Intersystems has gobbled up the competition, become a monopoly, and hiked the prices accordingly.

  50. Large Linux DBs by Anonymous Coward · · Score: 0

    I'm still in the process of chosing a RDBMS for a small scale web app I'm putting together, but will most likely go with Postgresql called from Perl DBI called from Apache on Linux.

    However, in this day of persisting connections for large scale WWW DB apps (a la Apache::DBI), isn't the 32 simultaneous connection limit for Postgresql a very severe issue, especially when compared with the other transaction supporting RDBMS that can handle hundreds?

  51. Re:Linux Disk Management/Journalling by Anonymous Coward · · Score: 0

    While attending the Atlanta Linux Showcase, I spoke with SGI guys from Minn. and they clearly stated the the XFS system is very far along. A point in case is their stated claim that XFS can restore a worst case 9GB single scsi drive in 0.1 seconds. Thats acceptable in my book. We had to quit using linux in servers because efs2/fsck are just soooo pathetic. (ie. 6hrs+ when a 200GB scsi RAID box crashed) That happened twice, and until further notice linux was BANNED!!!

  52. Yes, Sybase 11.0.3.3 IS FREE by Anonymous Coward · · Score: 0

    As someone from Sybase pointed out on sybase.public.sqlserver.linux, the link to that license from the 11.0.3.3 download page was a mistake (and should eventually be corrected). You can use 11.0.3.3 for free for development or production.

  53. Re:DBD::Oracle Linux by Anonymous Coward · · Score: 0

    Why not try add the library to the environment instead ?? setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:${ORACLE_HOME}/lib and then run your program...

  54. Re:What about sybase RAW IO ? by Anonymous Coward · · Score: 1


    You don't need the patch for "raw devices" to use
    Sybase on Linux. You just use partitions as block devices. The mailing list archives have more on the issues involved in choosing whether to use a file system or "raw device" ( archive ), i.e. recoverability.



    We are using Sybase on Linux for several web based services. We have chosen it for speed, it's backup and recovery features, it's support for referential integrity, stored procedures, and all the other things one would expect from a SQL-89/92 compliant database engine. MySQL just doesn't offer those features, although there are ways to imitate them.



    At a fraction of the cost of Oracle ( and DB2 ), it's hard not to take a good look at Sybase. At least if money is an object. Oh and by the way....



    ASE 11.0.3 is FREE! I don't think there is another database engine with as many features, robustness, and speed that make that claim.



  55. Fire your architect by Anonymous Coward · · Score: 1

    Get your requirements together before you start looking at implementation issues. Then find somebody who can match your requirements to a reasonable design. I don't think an RDBMS will be part of your solution (it would not be part of my solution).

  56. Re:What about sybase RAW IO ? by Anonymous Coward · · Score: 1

    Um, while you can use partitions w/o filesystems, it's still not raw I/O.

    The benefit is really that, since you're not running a filesystem, you won't sit through endless fsck activity after a crash. Your database server will need to perform its own recovery, though.

    And the 2GB limit to a "database device" applies. Sybase's "database device" seems similar to DB2's "tablespace". Probably goes for Oracle, as well.

    Since a database (and even a table) may span multiple "database devices", this doesn't present a problem.

    I've run multiple 2.5 GB databases with heavy loading using Sybase ASE 11.0.3 on RH 6.0 with excellent performance and reliability.

    IMHO, Sybase delivers the power and reliability, without the extra complexity of Oracle and DB2.

    Just my $0.02, spend it wisely... :-)

    Gordon.

  57. Re:Choose Life, Choose Oracle? by Anonymous Coward · · Score: 1

    I hate to say this too, but in me and my friends' experience, you might just want to go with Solaris and Oracle. We've tried running some experimental software RAID packages on Linux and have burned Oracle databases, unable to recover (ahem, we had no backups, but still...). I guess if you went with a really coservative loadset on the Linux box, that would be ok too, and you'd save money on hardware. The sun hardware is the real expensive commitment, not the OS.. Also, I'd recommend Oracle, not just because of it's track record, but because of paid support options and backup and sysadmin tools. These features make running a HUGE database possbile, where my __guess__ is MySQL and Postgres are very weak in these areas. Oracle also lets you create very fault tolerant, redundant DB systems that are fast and pretty much bomb-proof. If the system MUST stay up, then Oracle DB's with failover stuff configured is great. Don't think you get this from the smaller free packages, god bless their hearts. Thanks and good luck

  58. Oracle, MySQL and PostgreSQL? by Anonymous Coward · · Score: 1
    Your idea of mixing Oracle and PostgreSQL seems counterintuitive at first - why use 2 DBs in one app?

    Due to Oracle's architecture, hardware resources had better be substantial in terms of CPU, RAM etc. However, it is hard to go wrong with either Oracle or DB2 or Informix or Sybase ASE. My suggestion is to either use Oracle 8i with data cartridges Or Informix Universal server or DB2 with Data Extenders to give you 1 DB with the features that you are looking for in PostgreSQL. Simplifies things and makes it more reliable.

    On the other hand, if politically things are viable, Versant and Objectivity (at a minimum) have Object DBMS on Linux already. Much better OO paradigm and faster and less resource intensive. This is especially suitable if you use C++ or Java. if you're a SQLhead, forget I mentioned ODBMS.

    MySQL does not have transactions so unless you can gurantee that your app doesn't want rollback, you shouldn't consider it for heavyweight stuff.

    Make no mistake, PostgreSQL is good but it obviously does not have the huge testing base that Linux has. The question is, does it have the testing base that Oracle, DB2, Informix or Sybase have? You decide.

  59. size > 2GB NOT important by Anonymous Coward · · Score: 1

    With Oracle, you can spread tables accross more than 1 file. The 2G limit does not really cause a problem. Don't use postgresql, it's klunky. MySQL lacks important data integrity features (rollbacks and foreign keys).

  60. Re:Database choices... by Anonymous Coward · · Score: 1

    You have some of your facts wrong. It was Ingres that was the research database project that was sold to CA, not Postgres. Montage was renamed because the name was already being used by another company. Stonebraker was never asked to "change the Berkeley rules". All the code produced by Stonebraker's research groups has always been available under the standard Berkeley software agreement. Anybody could have taken the code and used it for whatever they wanted. Andrew Yu was the chief programmer of the Postgres project for a while and Jolly Chen was a grad student of Stonebraker's. What's now PostgreSQL was an independent project that they worked on for fun. I actually ported an early version of this to Windows NT as a proof of concept. It had problems but it was able to run the Wisconsin Benchmark. Jon Forrest (a former minor member of the Postgres research group)

  61. Re: Linux Databases with Huge Tables? by Anonymous Coward · · Score: 2

    You have to ask yourself one question: How important is this service (to me; to my company;etc). If the answer is important you need to look at a database which is exhibits appropriate characteristics (robust, scalable, supported, flexible, ...ible).

    I have used Oracle 7 (not under Linux) in a banking environment for mission critial apps, with large amounts of data (100s Gb's). Performance is excellent (with a good DBA), scalability is excellent, robustness is excellent, architecture is showing its age.....

    I'm sure MySQL is excellent for what it was designed: free, well understood, reference SQL
    platform. However it's my contention that for the sorts of data volumes you are looking at Oracle/Informix/Sybase/DB-2 are most likely to be the way to go.

  62. Choose Life, Choose Oracle? by Anonymous Coward · · Score: 2
    As much as I hate to say this, MySQL & PGSQL are still far, far away from becoming as stable & feature-filled as Oracle, esp. Oracle 8i.

    This has nothing to do with Linux, as all of (fine) products are fully supported on Linux & other fine Unices , Although, personally, I'd choose Linux as my platform for the future, which ever Database product you'll eventually use...

    As for the 2GB file limit thingish, worry not, because Solaris had the same problem until a patch was released for 2.5.1, and I still recall running a 70GB Oracle 7.3 (!) DB on that configuration, you just use as many 2GB files as you wish.

    Oracle 8i has all the features you should require for what you seem to need:
    On the database side

    • data-structures inside columns
    • sets inside columns
    • Good (REAL) integrity of references
    • Java support inside the DB (JVM running inside The DB backend), Forget about PL/SQL, a real programming language in the Database!
    On the system-ish side:
    • data partitioning into files
    • parallel servers
    • HSM (Hiererchal Storage)
    • Point in time recovery
    • Smarter Indices
    • Archive-logs
    • Online (Hot) backup
    • SQL tracing
    • And the list goes on...

    Apart from features & performance there's one thing you want to keep in mind:
    When the Sh*t hits the fan, you'll wan't Oracle's date-rcovering features.

    Sorry if this looks like an Oracle Ad, Oracle does have it's ugly sides, I'll be the first to admit that, but there is no real grounds for comparing Oracle 8i with MySQL & PGSQL, or any of the other currenty GPL'ed or free DB's I've seen.

    1. Re:Choose Life, Choose Oracle? by hey! · · Score: 2

      What's wrong with PL/SQL? It has a simple, pascal-ish flavor, with a few niceties borrowed from ada. It is going to be instantly understandable by any programmer. It seems a very good choice for most kinds of scripting that needs to be done on the database server side.

      Not that Java doesn't have a place in the database engine, but I'd be uncomfortable using it for everything. Isn't the VM is going to be a problem for performance critical uses like triggers?

      --
      Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
    2. Re:Choose Life, Choose Oracle? by hey! · · Score: 2

      Well, I'd say what's right about PL/SQL is that it is simple, is readable by almost any competent programmer, and executes reasonably quickly in things like triggers and simple kinds of stored procedures. I don't find its syntax particularly cumbersome compared to most database scripting languages, although Python elegant it emphatically is not.

      What's wrong about PL/SQL is that it doesn't provide anything in the way of object abstraction or a whole buch of other things you want in a full blown application development context. But of course, that's a can-o-worms.

      It sounds like you're using PL/SQL outside its reasonable problem domain. Web integration? If God wanted us to use PL/SQL for this, he wouldn't have invented Larry Wall.

      I like Java too, but I don't want to have to use it everywhere. Do you really want to have every statement in all of your triggers run through the Java VM? What happens when you update a million rows in your database, and get several million trigger executions? Think of all the memory allocation and garbage collections.

      That said, having a language like Java when you do need it is great. Choice is good.

      --
      Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
    3. Re:Choose Life, Choose Oracle? by Eponymous,+Showered · · Score: 1

      I think the question should be "What's right with PL/SQL?"

      I do PL/SQL every day and am limited by lack of datatypes, crappy web integration, limited (32k) variable sizes, no OO, cumbersome syntax (the only thing that makes it "Pascal like" is the := operator, AFAICT) and lack of decent array handling facilities outside of tables themselves.

      Bring on the Java! I like Java. It's elegant and powerful. I can't wait until my employer finally moves forward to Oracle 8 (probably another year).

    4. Re:Choose Life, Choose Oracle? by aUser · · Score: 1

      Not to mention the fact that pl/sql is a proprietary hook tying you to the exorbitant Oracle per-concurrent-user licenses. No way you'll manage to get away without rewriting your entire environment. Never write in pl/sql or in transact-sql. Even totally overweight clients are better than that! Keep strictly to odbc, or jdbc, or dbi, and don't bother writing one single line in ADO, as long as ole db is not widely supported. I hope no one besides Micros~1 will support it.

    5. Re:Choose Life, Choose Oracle? by Recall · · Score: 1
      The sun hardware is the real expensive commitment, not the OS.

      Solaris on Intel isn't nearly as bad as it used to be. I've got a couple Intel Solaris boxes that beat the pants of sun hardware that costs twice as much.

      On the other hand, if you stay away from the Ultra5 and 10, you can be pretty certain that Sun hardware will work well. When buying Intel hardware, all standard disclaimers apply, but you can build a solid high performance system if you are careful. Anybody using Linux in a mission critical setting should already be familiar with all of this.

      Oh, and Solaris 7 on Intel supports that address extension thing so you can have more than 4GB of ram now.

  63. Noooo Oracle!!! by istokj · · Score: 1

    I have been a dba for 5 years and I would hate to exaggerate, but I'm not sure you could pay me to use Oracle(ok - I have been paid before but it was kicking and screaming :) For ease of administration, flexibility, raw power, functionality, (this list really goes on), I wouldn't use anything but Informix. I haven't played with Informix on Linux to that scale - but I can tell you that you will regret going on Oracle for any large scale stuff especially if you're not too familiar with it. I've used Informix on 3 tera on solaris and I don't think you can beat it anywhere. Plus 9.2 will be released on Linux and you'll have datablade technology too.

  64. Re:Some good information... by Kirth · · Score: 2

    Mysql handles some database with files near 2GB
    at my site. 10 million records. No Problem, speed
    is adequate (some 0.01 seconds for _complicated_
    queries. Also, since it's running on Alpha it
    won't have Problems with the 2Gig Limit.
    - Speed isn't a problem.
    - Large Files aren't a problem.

    But:
    - MySQL on Alpha isn't very stable. I've had some
    months problems until a version came out which
    would even compile properly. It compiles now out
    of the box, but still, isn't stable
    - LOCKING. Locking a table with 10M records sucks,
    because nobody will even be able to read it while
    it's locked.

    In the end: I wouldn't do it again this way.

    --
    "The more prohibitions there are, The poorer the people will be" -- Lao Tse
  65. Re: Linux Databases with Huge Tables? by Eric+Green · · Score: 2
    First: MySQL isn't even in the running here. It lacks transaction support, triggers, etc., by design. MySQL was designed for speed, not features.

    PostGreSQL has the features he needs, and the latest version is very stable (as stable as Oracle). Its big problem: PERFORMANCE. While its performance is acceptable on a 2gb database, you can forget it on a 40gb database.

    I would say that if he can afford Oracle, to go Oracle. Oracle is proven in large-database environments. Informix, SyBase, and Adabas-D would also be good choices. If he's interested in more obscure databases, Empress has a nice 4GL. SolidTech's stuff looks pretty good too.

    Doing a search for "linux database" is instructive....

    -E

    --
    Send mail here if you want to reach me.
  66. Why 'troll'? by Eric+Green · · Score: 2
    Hmm, why was the message marked a 'troll'?

    SGI Iris and Solaris are excellent choices for jobs that have outgrown Linux. That's the beauty of Linux -- once you have outgrown Linux, you can move up to bigger computers. And with 40gb per year of data, we're talking 400gb within a few years.

    I have personally fsck'ed a 40gb ext2 partition. It is *NOT* fun, it takes over 20 minutes even on the latest/fastest RAID subsystems! So it is indeed appropriate to question whether Linux is the proper solution for this problem. After all, this isn't Windows NT -- we don't have to say "Linux everywhere" to win (just "Windows NT nowhere", grin).

    -E

    --
    Send mail here if you want to reach me.
  67. You're out of date, charlie. by Eric+Green · · Score: 2
    The latest version of PostGreSQL has almost all of the stuff that you mention, except for online hot backups and HSM. It even has several languages other than Java that run inside the DB backend, such as TCL, Python, and Perl.

    PostGreSQL's only problem nowdays is performance on huge databases. Well, replication is still a problem too, as is hot backups, but performance is the killer here. Oracle is proven at running huge databases, PostGreSQL is not. If I were thinking 40gb of data, I'd look harder at Oracle than at PostGreSQL.

    -E

    --
    Send mail here if you want to reach me.
  68. Re:Oracle 8i on Linux by gavinhall · · Score: 3

    Posted by patg:

    OCI programming? Uhhhhgghhh. It's cool, but very different/difficult to the uninitiated. The Oracle distribution comes with some good OCI sample programs, but I wish the interface were like mysql's C interface.

    With OCI, you have to bind every one of your columns before doing an insert,update, or query.

    I suppose that one could write a C wrapper library around OCI which would do that binding for you.

    Essentially, DBD::Oracle is such a beast, except it being for the use of perl.

    I was at one company where the Oracle consultants suggested using a perl cartridge with Oracle's web server because DBD::Oracle wasn't "supported". I went on the explain that DBD::Oracle is really just an OCI application and that it indeed is something Oracle should look at as being "supported".

    By the way, DBD::Oracle (along with DBI) is an excellent piece of code that powers many a site.

  69. Re:SQL 7 by Naikrovek · · Score: 1

    That's exactly right. Lets not forget about the baseline quality and stability of other MS products. MSSQL7 does /not/ stand out as a superior product, in any way. Even Oracle is faster than MSSQL. And MySQL blows it away. Don't bother with the reverse psychology on SlashDot, AC, it doesn't work.

  70. Linux is not a primary dev platform for Oracle by jabbo · · Score: 2

    whereas it IS for Postgres. Postgres on Linux also can run on Alpha machines, blowing away many of the painful 32-bit architectural restrictions. 6.5.2 is much, much better than older versions, but I'm not 100% sure that I'd trust it for a mission-critical system yet.

    Oracle, on the other hand, is developed on Solaris and then ported to other platforms. So if you're going to run a huge Oracle installation, do it on Sun hardware. Use raw I/O (can't do this yet under Linux) and stick the (sievelike) Solaris boxes behind a firewall, for god's sake. But please don't be a martyr and run a big Oracle installation on Linux... yet.

    Don't forget that Oracle is and will remain a complex beast. My personal viewpoint is that Oracle is the C++ of databases -- highly flexible, can be outrageously fast, but you MUST know what you are doing. Hire a professional DBA if you don't... a big installation like you describe will rapidly spin out of control unless someone skilled is around to manage it.

    --
    Remember that what's inside of you doesn't matter because nobody can see it.
  71. Re:What about sybase RAW IO ? by cwinters · · Score: 1

    Actually, ASE 11.0.3 is (in the beer sense) entirely free -- no support from Sybase, but still free.

    ASE 11.9.2 is a fully supported product from Sybase and is priced along the same lines as ASE on NT.

    Just clearing the air...

    --

    Chris
    M-x auto-bs-mode

  72. Another option is Solid by rlk · · Score: 1

    I've used Solid, and it works pretty well. It's a lot simpler to administer than Oracle, and it fully supports transactions, with various concurrency options. Check out www.solidtech.com. I've used it personally, although it was really as an object repository rather than a large data store. It has a good ODBC driver, and it exists for many systems.

    1. Re:Another option is Solid by AMK · · Score: 2

      Solid's a fairly nice database, though it lacks advanced features such as triggers. (It does have transactions, though, which is all that I really need.) However, Solidtech reworked their business plan about 6-9 months ago, and they're now mostly interested in customers who want to embed their database in an application and redistribute it 10,000 times; they're not very interested in supporting individual users who just want a single installation for running a Web site. This change caused a lot of bitterness on the solid-list mailing list, and many users, who'd gone out on a limb by recommending Solid instead of Oracle, were very angry. Most of them seem to be drifting away to other databases now.

  73. Alas, large files don't work on 32-bit Linux by rlk · · Score: 2

    The problem is in the VM system, not the filesystem. The 64-bit read/write/open system calls are in glibc presumably because other operating systems support them; I don't know what happens with them on Linux (they probably degrade to the normal 32-bit ones, and the library simply translates the parameters).

    In any event, it doesn't matter with Oracle, or Solid, or any other halfway rationally designed database. They all allow multiple tablespaces, or datafiles, or whatever you want to call it. It isn't simply a matter of 32-bit compatibility either; there's the little matter of disk/filesystem limits and load sharing.

  74. Postgresql is good by Grim · · Score: 4
    We have been using postgresql for all our database needs for the last 18 months where I work. It has gone through a number of changes and is now a LOT more stable than it was only a year ago.

    If your data is critical and 100% availability is essential, I would probably still go for oracle, as it has a far more mature background and, whilst closed source, is very stable. Otherwise, postgresql is a very good choice. It handles tables of effectively infinite size (it just breaks them into 1GB chunks to avoid the filesystem limit). And it has the bonus that is is open source. If you need a feature, you can add it. I am sure that if you are part of an organisatin generating 40GB of data a year, you probably have a few programmers floating around to add enhancements.

    As an addition, I have found the postgres development team to be VERY responsive and helpful. Personally I would say they are probably the most active and helpful open source dev team I have seen.

    If you want something out of the box, with 100% availability, go oracle. If you want something that you can control, that will grow with you, and will eventually get 100% reliability (it is very close right now), go postgres.

  75. Fast disk access, other database engines by jd · · Score: 2
    I'd say look at IBM's DB/2, which seems to be more solid than Oracle 8i. Informix' Dynamic Server might be worth a closer look, too, but Informix tends to be less robust. You can get free demos of both these products.

    Disk journaling and a fast filing system are also very important. It looks like Reiserfs is going to beat SGI to having a journal, and it's already faster than e2fs. I'd say give that a look, too.

    --
    It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
    1. Re:Fast disk access, other database engines by [Zappo] · · Score: 1

      I am not extremely knowledgeable about these things. FWIW, it doesn't seem to me that Oracle and DB/2 are significantly different in terms of 'solidness'. Both are major players at the high end of database use (and IBM seems to claim that DB2 houses the world's largest data collections, at the moment).

      One thing I do recall reading (I think here on /.) is that some people have had very nice things to say about DB/2's documentation and usability. Many people talk about how Oracle is arcane and requires lots of specialized knowledge, and this is why you see Oracle admins getting paid the big bucks.

      If you don't already have an Oracle Database Administrator on staff, you have a lot more freedom than many places that are shopping for a database.

      In terms of filesystems, note that journaling is a reliability feature, not a performance feature (but no less important for being so).

  76. Re:Be aware of the complexity of Oracle by stephend · · Score: 1

    If you're putting 40Gb of data on a machine, you're going to have to do some tuning, or at least vaguely know what you're doing, no matter which database you use.

    I've played about with Postgres and, while it's more 'professional' than MySQL, I still wouldn't like to bet a large amount of data and my company on it. I'd be quite happy putting that volume of data on Oracle, though.

    Suspect an Intel-based PC, even a big server, is going to have a bit of difficulty with that much data. Have you thought of getting a Sun or an Alpha?

  77. Re:What about sybase RAW IO ? by stephend · · Score: 1

    ... unfortunately Linux doesn't support raw devices without a patch. Since it's not a standard part of Linux, Oracle -- which also supports raw partitions -- won't support it.

    It does work, though, although the performance gain isn't huge.

  78. Why not Informix? by tjansen · · Score: 1

    I wonder why Informix is not on the list. I dont have any experience with Informix, but had problems with older Oracles (version 7.x, not on linux). Informix looks interesiting and scalability seems to be more than sufficient for this job.

    1. Re:Why not Informix? by phred · · Score: 1

      Well, MySQL never said they would "never" do transaction support. In section F3 of the current documentation, "Some things we don't have any plans to do":

      Transactions with rollback (we mainly do SELECTs, and because we don't do transactions, we can be much quicker on everything else). We will support some kind of atomic operations on multiple tables, though. Currently atomic operations can be done with LOCK TABLES/UNLOCK TABLES but we will make this more automatic in the future.

      On the other hand, at the O'Reilly open source conference David Axmark told me they eventually plan to support the complete ANSI92 SQL spec. So in this case 'not now' may not be 'never.'

      Beating on a drum I have worn out doing so, not everyone understands what database people mean when we say transactions. PHBs and product specifiers, particularly. It really just means support of the SQL standard COMMIT/ROLLBACK feature. Operationally, this means a lot of heavy hauling for the database; just ask an Oracle engineer about how they do it (as I did once).

      If you don't really really need "transactions," it's worth thinking carefully about how to turn off the feature in the DBMS you're using (some do allow that), or use one like MySQL that doesn't.

      MySQL's developers made a deliberate choice here; it optimizes the program for some applications and rules it out for others.

      --------

      --
      Bill Gates Is My Evil Twin.
    2. Re:Why not Informix? by klas · · Score: 1

      Informix very well should be on the list of RDBMS' to look at. Informix SE will not solve your problem, as it uses flat files in the file system to hold table data. However, Informix Open Dynamic server uses Raw Disk Space, therefore getting rid of the whole 2GB file size limit.

      Informix treats BLOBs rather oddly and will take some getting used to. Servers in replication (HDR) also tend to exhibit problems that Informix has no clue how to correct (most client libraries don't operate properly, ALTER TABLE will cause assert failures I would certainly not consider it one of the FASTEST RDBMS'. Altho I have never crammed a PostgreSQL DB full of data, my inital impression would be that PostgreSQL is as fast if not faster than Informix.

      MySQL is probably not a good choice. Lack of transaction support (and MySQL's statement that transaction support will NEVER exist in MySQL) make it a poor choice for anything but quick and dirty (and small) web databases (at which it excells beyond comparison).

      Take a look at Informix. Especially if having corporate support options is something that is important to you.

  79. Re:DBD::Oracle Linux by Harik · · Score: 1
    The make test failed loading a shared library, so I wrote a simple c prog just to load all the .so libs in a dir and it would load every .so file in the disk except any one in the oracle tree (Forget the error now, sorry). The kluging that followed was horrific.

    PLEASE tell me you added the oracle library dir to /etc/ld.so.conf?? That's generally the problem. (Oracle installer dosn't do that for you)

    --Dan

  80. Re:Perl and Monitoring Oracle by Harik · · Score: 1
    Is there anything similar to DBI for C? Obviously you can't do it exactly the way DBI does, but you can present a uniform interface to all databases.

    With that in mind, all the C database code I write is through wrappers designed to work the same way as DBI, but I only write it for the databases I actually work with. (Postgresql and Oracle)

    Am I re-inventing the wheel or is this something that's only been done for perl?

    --Dan

  81. Route tracing? by K-Man · · Score: 1

    If you're trying to do graph traversals (tracing routes, or walking a tree) in SQL, get something that understands SQL3. Recursion was added specifically for problems like this. PL/SQL is an antiquated solution for this problem. Perhaps if you give more information about the queries you intend to run, we could come up with more useful suggestions.

    --
    ---- "If we have to go on with these damned quantum jumps, then I'm sorry that I ever got involved" - Erwin Schrodinger
  82. Re:mysql? by mhoskins · · Score: 1

    A better solution to simply piping to gz is to also pipe through "split" and split it up into chunks, then you can use "cat" and pipe it back to a psql monitor to reload the db... (Also works great for splitting up a dump to fit onto cdr's)

    ie: pg_dump |gz |split -b 670m


    ----------------------------------------------
    bash# lynx http://www.slashdot.org >>/dev/geek
    Matt on IRC, Nick: Tuttle

    --
    ----------------------------------------------
    bash# lynx http://www.slashdot.org >>/dev/geek
    Matt on IRC
  83. PostgreSQL and Large Tables on Linux by EricTheRed · · Score: 5

    PostgreSQL on Linux handles large tables pretty well. It breaks each table into 1Gb segments to get round the 2Gb file size limit. We chose 1Gb as it was a nice round figure, which makes it easy to organise the tables when they get really big.

    When they do become huge, you can then move the segments onto different filesystems to ease the load (although currently this is not a simple task). This also applies to indices.

    I know of a few large databases out there using PostgreSQL, but one I have access to is the TASS project (http://www.tass-survey.org), which currently has about 28Gb of astronomical data.

    The largest database I have is currently 5Gb, but that's primarily limited by disk space and not by postgresql.

    --
    Java gaming nut - http://www.retep.org/ or for the rail http://uktra.in/
    1. Re:PostgreSQL and Large Tables on Linux by Cramer · · Score: 1

      I'd have to ask about the query times for such large data sets. PostgreSQL is not known for blinding speed. (altho' it's gotten better over the years -- removing time travel was a huge improvement.)

      None of my databases have ever gotten that huge, but in the fullness of time, they I'm sure they will.

    2. Re:PostgreSQL and Large Tables on Linux by StupendousMan · · Score: 1

      > I know of a few large databases out there using
      > PostgreSQL, but one I have access to is
      > the TASS project (http://www.tass-survey.org),
      > which currently has about 28Gb of
      > astronomical data.

      Actually, it's only about 5 GB on our main site. But the next generation of our cameras will easily generate many GB per year, so we'll reach that 28 GB limit pretty soon ...

      --
      Michael Richmond "This is the heart that broke my finger."
      mwrsps@rit.edu http://stupendous.rit.edu
  84. Re:What about sybase period? by Siva · · Score: 1

    i was just noticing there has been little mention of Sybase (so far anyway). Are there compelling reasons why a Sybase product would NOT be a good solution for this problem?

    For that matter, are there compelling reasons not to use Sybase for anything? just curious...

    --Siva

    Keyboard not found.

    --

    Keyboard not found.
    Press F1 to continue.
  85. sybase free? by Siva · · Score: 1

    not so sure about that...when you go to download the linux version of either 11.0.3 or 11.9.2, you are directed to this licence which says you can only have a maximum of 3 users and cant redistribute it among other thigns.

    particularly relavant is this line:
    5. SUPPORT. These download Programs are provided on an "as is" basis and are unsupported. This Agreement does not entitle you to any maintenance or other services or any updates or new versions of the Programs.


    --Siva

    Keyboard not found.

    --

    Keyboard not found.
    Press F1 to continue.
  86. The specification was Linux by tilly · · Score: 1

    I guarantee that you are not running that system on a Linux box.. :-)

    But if you drop that specification, then I agree. Besides which I don't know if you would *want* to run a big database on Linux today.

    Cheers,
    Ben

    --
    My usual seat in the cluetrain is at A HREF="http://pub4.ezboard.com/biwethey.ht
  87. Are you considering an Alpha? by tilly · · Score: 2

    On 64-bit CPUs Linux loses the 2 GB limit on file-sizes. Plus 64-bit architectures are significantly better for dealing with large amounts of data. Not to mention the nice speed of an Alpha.

    Unfortunately none of the "big boys" have released for the Alpha so you would need to use something like PostGres.

    Another tip. If you use ext2, increase the block-size. By default it is 1k, if you bump that to 4k you may see a performance increase and you will see a big improvement in fsck. Of course long-term the right solution to fsck problems is to use a journalled file-system. And so at the moment you may not want to use Linux for this...

    (Give the penguin time. It is still growing up.)

    Cheers,
    Ben

    --
    My usual seat in the cluetrain is at A HREF="http://pub4.ezboard.com/biwethey.ht
    1. Re:Are you considering an Alpha? by njd · · Score: 1

      Big boys not on Alpha ... bah !

      We manage shitloads of data running on Oracle 7.3.3, 7.3.4 and 8i all on DEC/Compaq machines.

      Runs fine. I second the call to bring AdvFS and LSM to Linux. Vinum (for FreeBSD) is good, but AdvFS is the real deal.

  88. user defined functions by Signal+11 · · Score: 2

    Just wanted to give you the heads up on something - pgsql lets you pop in your own custom functions, so you can do things that are simply impossible to do client-side due to the limitations of SQL. The other thing is that you can define your own user data types. The last feature that I've been wondering about is the ability to retrieve data in n dimensional - I have no idea what they're talking about, but it sounds alittle like using vectors in C++ to return objects. Might be useful, might not be - I don't know.

    --

  89. two other alternatives by cabbey · · Score: 1

    that I don't see mentioned much yet... DB2 and Sybase. For a functional example of Sybase in serious action check out Distributed.net's Statistics.
    DISCLAIMER: I work for IBM, but not on Linux or DB2.

  90. PostgreSQL and Oracle -- the only two choices. by lowen · · Score: 1

    If you've already narrowed the field down to the two, nothing I say or do is going to turn your mind from that.

    So, let's look at the pros and cons:

    Oracle:
    Pro: This is THE RDBMS. Really. There is no RDBMS better than Oracle. Period.

    Con: Expensive, like any other commercial RDBMS.

    PostgreSQL:
    Pro: Open Source.
    Pro: Widely respected -- only Oracle has a more experienced codebase.
    Pro: Actively supported.
    Pro: Free.
    Pro: In the words of Philip Greenspun, who is an unabashed Oracle fan:
    "> The open source purist's only realistic choice for an RDBMS
    > is PostgreSQL (see Resources for the URL). In some ways,
    > PostgreSQL has more advanced features than any commercial
    > RDBMS. Most important, the loosely organized, unpaid
    > developers of PostgreSQL were able to convert to an
    > Oracle-style multiversion concurrency system (see below),
    > leaving all the rest of the commercial competition
    > deadlocked in the dust. If you've decided to accept John
    > Ousterhout as your personal savior, you'll be delighted
    > to learn that you can run Tcl procedures inside the
    > PostgreSQL database. And if your business can't live without
    > commercial support for your RDBMS, you can buy it
    > (see Resources for a link). (From a LinuxWorld article on AOLserver)

    Cons: not as SQL92 or SQL3 compliant as Oracle
    Con: no referential integrity (yet)
    Con: absence of outer joins (right now)
    Con: recovery is not quite there (look for the next version...)

    PostgreSQL whips up on MySQL once the word 'transaction' enters the picture.

    PostgreSQL's multiversion concurrency control (MVCC) insures that even in the presence of multiple concurrent transactions there can be no deadlock. With large databases, processing large inserts, the deadlock issue is a very real one.

    And, most importantly, unlike many other so-called RDBMS's, PostgreSQL passes the RDBMS ACID test.

    Lamar Owen


    1. Re:PostgreSQL and Oracle -- the only two choices. by fart_face · · Score: 1
      not as SQL92 or SQL3 compliant as Oracle

      Last time I worked with Oracle, which was about 2 days ago, it wasn't even close to being SQL92 compliant. Among the major commercial RDBMSs, I have always found Sybase Transact-SQL and even MSSQL's flavor of Transact-SQL to have implemented much more of SQL92 than PL/SQL. I hate PL/SQL, mostly because it's so damn weird. Is Oracle 8 really that much more advanced?

  91. MVCC by KyleCordes · · Score: 1

    >> its not something easy to sum up in a few words

    Sure it is :-)

    MVCC is a Very Good Thing, and is really the right way to implement multi-user concurrency; remarkably the high-dollar databases don't do it, probably because doing so would be a major change from their locking-based legacy designs. (Interbase does it, BTW)

    1. Re:MVCC by KyleCordes · · Score: 1

      How is it implemented in Oracle? Is it the default, always-on mode of operation?

      I read somewhere that Oracle can do a form of MVCC, but in that mode of operation it is drastically slower due to implementing MVCC in an inefficient mannet, something having to do with transaction logs.

      (Of course, I don't recall the source, and it could be completely inaccurate.)

      Can an Oracle guru comment on this?

    2. Re:MVCC by mayoff · · Score: 1

      Both Oracle and Solid have MVCC support (though they don't call it that). Do they count as high-dollar databases?

    3. Re:MVCC by Cramer · · Score: 1

      Can I get an "Amen, brother!"?

  92. Re:Let the database do it! by KyleCordes · · Score: 1

    What about if the RAID is something simple like RAID-1 (mirroring), with essentially no performance penalty on writing, and a speedup on reading?

    It seems to me that doing a big Oracle system (with the dozen or more recommended seperate drives) without RAID would be a bad idea from an uptime point of view. Even if you can recover to the last millisecond, the odds of one of those drives dying (and requiring that the system be down to rebuild) is just too high.

  93. Use Oracle. by krafter · · Score: 1

    I would not hesitate to use Oracle. We tried postgres before Oracle was available on Linux and after about 3 months the database just tanked, corrupting itself, for no aparent reason.

    When 8.05 came out for Linux I installed it on my Linux server and I am very pleased with the product. Addmiditly I am bias because I use Oracle on HP/UX and NT on a daily basis.

    I have used Oracle for about 6 years now and I am more and more impressed by this product every day.
    It is a complex system, but you can get classes from Oracle, and from many other companies. If you don't like the cost/time involved in classes then just go to any book store and you will find tons of books on all the various aspects of using and administrating Oracle.

    Oracle has many thing's going for it. Point in time recovery, massive scalability and it supports most hardware/software platforms.

    It has years of real life usage behind it. I think the product has been around for 10-20 years now.

    Another option is DB2. I don't have any personal experience with this product but I have heard nothing but good things about it.

    Chris Kraft (krafter@zilla.net)

    1. Re:Use Oracle. by scrappy · · Score: 1

      Just curious, but what version of PostgreSQL are you basing this on? When I st arted using PostgreSQL 4 years ago (approx), I wouldn't have used it for a missi on critical apps myself, but now...database corruption bug reports aren't someth ing I see...

  94. Oracle 8i on Linux by IainBowen · · Score: 3

    I've been using Oracle a long time now and Linux for a few less years and I've so far found Oracle 8i on Linux a stable and enjoyable experience - however, I have yet to base a real system on it. I've also toyed with Postgres under Linux, but after using Oracle for so long, it just doesn't compare.

    I would say that tablesize going over 2Gb is not a problem. the tables go in tablespaces which can be made up of many datafiles.

    PL/SQL is useful for fast and dirty code, but lacks a little to C for performance. If you can master it, Oracle Call Interface with 'C' will produce very fast code.

    I'd also recommend reading a good book both on Oracle Design and on Tuning. There's a couple by O'Reilly, oddly enough.

  95. Re:Some good information... by Jeff+Mahoney · · Score: 2

    MySQL may be great for larger databases - but what happens if they're getting written to contstantly?

    As far as I know, MySQL only supports table locking, rather than the row locking of Oracle (and perhaps Postgres). We started using database logging for our Apache server, so that we don't eat so much CPU running analog - Statistics would just be SQL calls to the database.

    We started with MySQL and ended up postponing the project - MySQL just couldn't keep up. The hardware isn't the problem, it was a DS/20 w/ 1 GB of RAM, running Digital UNIX 4.0F.

    Since we have other projects, this one has been on hold for some time, waiting for one of us to port our logger to Oracle.

    Just thought I'd mention my experiences with it. I'm not knocking MySQL - We still use it for a decent number of mostly reader databases, but for intensive writers - no way.

    Disclaimer: I'm not a DBA, nor do I pretend to be. If anyone has information that contradicts this, it's entirely possible that I'm wrong.

  96. Solid by flanker · · Score: 2
    Solid has a nice "black box" database that is pretty easy to run right out of the box. Their web-site is solidtech.com. They have had Linux support for a long time but have recently gotten a bit big for their britches and declared that they aren't really interested in selling small licensing packages any more. No one has really been able to figure out exactly what they are up to, and there have been some very heated threads on the Solid support list lately.

    My experience is that Solid is a very robust, easy to set up, low maintenance RDBMS (with referential integretity checks, transactions, etc.) that isn't open source. They seem to now be fancying themselves a new Oracle so I'd just stick with Oracle if you need something out of this class. Kinda sad because Solid has been around with Linux since the early days as a low cost solution, but new management has come in determined to tear the company down, it would appear...

    --
    Left shift 1 for e-mail...
  97. DB2 Pricing by matt[0] · · Score: 1

    DB2 Enterprise for Linux goes for ~15K/processor with unlimited connections. You can get your first version of DB2 for something like 100-300 bucks if you sign up for the ISV program. IBM is really looking to put the hurt to Oracle. We priced out Oracle for 279K....ouch.

    --
    --------- Matt
  98. Stonebraker by muchandr · · Score: 1

    is now with Cohera (www.cohera.com) They are doing "federated" databases

  99. Re:Linux Disk Management/Journalling by jetson123 · · Score: 2
    Journalling file systems usually don't do journalling for data, only for file system structure. Furthermore, the databases themselves do the journalling and recovery for the data.

    Using a journalling file system for the partition holding database tables will accomplish pretty much nothing (but it won't do much harm either).

    Journalling file system in general don't help much with data integrity either; the kinds of guarantees they usually make are minimal, they don't protect you against many sources of data loss, and they don't even guarantee fast system startups.

  100. Re:Some good information... by Dowser · · Score: 1

    Hmm...
    No, the page is still there (Just reloaded...)
    And No, it isn't the same page...
    With the http://www.mysql.com/crash-me-choose.htmy you can choose wich databases to compare, thus reducing the size of the result-page and enhancing the readibility drasticly.

  101. Some good information... by Dowser · · Score: 2

    is found here...
    http://www.mysql.com/crash-me-choose.htmy
    Take a close look at the MySQL solution's, they really rock at large databases!

    1. Re:Some good information... by scrappy · · Score: 1
      Actually, we don't do 'row level locks'...we went one further and went straight to using MVCC. See the following URL for an explanation, since its not something easy to sum up in a few words:

      http://www.postgresql.org/docs/postgres/mvcc.htm

    2. Re:Some good information... by PigleT · · Score: 2

      Yeah. The other thing that I remember from the page is thinking, the ODBC 3 results are somewhat screwy - it depends strongly on what ODBC driver you go through, as the rule with ODBC is not 'if the database supports it, we do' but rather 'if it's in the spec and we choose to implement that feature, you can then look up its availability and use it if you choose'.
      Rather long-winded but it sums up feature support in ODBC fairly well, I think.
      What I'm saying is, there are one or two bits & pieces in the results that I believe are artifacts of the chosen ODBC driver, not of postgesql (which is the one I was looking at most).

      And yes, I'd expect some confusion over ODBC's DATETIME type being merely 'DATE' to Oracle, especially if the script was written to look for only one-to-many relationships the wrong way round, as it were.

      --
      ~Tim
      --
      .|` Clouds cross the black moonlight,
      Rushing on down to the circle of the turn
    3. Re:Some good information... by Saraphale · · Score: 1

      Oops, my bad. It seemed unavailable when I first tried to visit there, so I searched for a similar page on their site.

      S.
    4. Re:Some good information... by Saraphale · · Score: 2

      is found here... http://www.mysql.com/crash-me-choose.htmy

      That page doesn't seem to be available. The closest match is:
      http://www.mysql.com/crash-me.html
      Which seems to contain the information.

      S.
    5. Re:Some good information... by TommyW · · Score: 2

      It works for me.
      But I find the results suspect (NB, I work with Oracle databases, so my opinions are biased too.)

      Consider the date & time datatypes:
      date, time, timestamp, timestamp with timezone, datetime, abstime, datetime (other version), abstime, reltime, smalldatetime, timespan, year.

      Alledgedly, MySQL supports 6 of these, Postgresql 8, and Oracle only 1.
      But that one, despite being called date, also includes the time down to the second. If you don't want the time ignore it.

      Similarly, the function list seems to have been needlessly extended to demonstrate the extra functions in some databases (not necessarilly MySQL, but that's where this example comes from):
      Oracle doesn't support FROM_DAYS or TO_DAYS (correct, it uses TO_CHAR or TO_DATE respectively, but they're not listed.) It doesn't include WEEKDAY either, but uses TO_CHAR again.

      There are more mistakes like this which are not in Oracle's favour. But I don't care enough to rant. :-)



      --
      Too stupid to live.

      --
      Too stupid to live.
      Too stubborn to die.
    6. Re:Some good information... by frossi · · Score: 3

      MySQL does not support transactions. The good point is that there is NO way to make a transactionnal database as efficient as a non transactionnal one. So MySQL is faster than PostgreSQL for many benches.

      The bad point is that when you need transaction, you have to perform table level locks which are far LESS efficient than row level locks of PostgreSQL.

  102. Let the database do it! by bunyip · · Score: 2

    Most commercial databases prefer raw partitions and you don't run a file system at all. The database manager has log files for commit/rollback processing. They run faster on raw partitions, since the OS is not "in the way". The DBMS actually doesn't need much of what a general purpose OS offers, that's why Oracle was promoting the "raw iron" concept (BTW what ever happened to that?).

    Last I heard, maybe it's changed, Linux didn't support raw partitions.

    1. Re:Let the database do it! by tomierna · · Score: 1

      What platform/filesystem are you using Oracle on?

      Oracle does not recommend cooked file db's AFAIK, unless used in conjunction with Veritas vxfs.

    2. Re:Let the database do it! by tomierna · · Score: 2

      Speed isn't the only reason to want to let the RDBMS use raw partitions (which essentially means it is managing its own filesystem...)

      Reliability in database consistency is crucial. Put a RDBMS on a cached filesystem and you are asking for trouble. When writing to a raw partition, a write is actually committed to disk. When writing to a cached partition that guarantee cannot be made. While there are some filesystems that support specific database engines (vxfs from Veritas comes to mind - it supports Oracle and Sybase), putting any RDBMS on any OS's native filesystem is a Bad Thing(TM).

    3. Re:Let the database do it! by scrappy · · Score: 1

      Actually, I must disagree here...at work, we use Oracle and *everything* is on the OSs file system, based on recommendations from Oracle themselves...this place doesn't do anything without checking wtih the vendor half a dozen times first, and even then they have to ask once more :(

    4. Re:Let the database do it! by mitchy · · Score: 1

      At Oracle's Performance Tuning classes you will be told that RAID is bad, raw is the only way to go. One of the main reasons for this is that when Oracle "writes" to disk, it really needs to write at that instant, not when the OS has the time.

      Power failure? With OS-based filesystems, you lose (literally). When going raw, you still have logfiles that are current and you can recover to the last millisecond - without it, you can kinda recover, that is, until the power failure, and then you get to guess which transactions were or were not committed to disk.

      Frightening, if your database means anything to you.

      - mitchy

      --
      "The mind is a terrible thing to, um, uh, oh bollocks." -- Me
  103. Re:Perhaps Terradata by Tadghe · · Score: 1

    Teradata's great (I work for the company with the world's
    largest Teradata database, and write code against it so I'm a little
    biased here), but, unless you are hitting 60 TB's or so, I'd look
    at a bit more userfriendly package (Oracle 8i, or DB/2 or
    Informix (Informix's drivers byte,however) I would reccomend taking
    a serious look at Oracle 8i (forget the stuff about it being
    so new that it's buggy, take a look at the highend customers
    using it, and the comments they have about it) Quite a bit of the
    Online stores and datawarehouses I've seen have used 8i
    with super success. Be aware however you may have to
    balance cost-vs-performance unless you happen to have an
    extra couple (or more) grand to toss around. I'd try a demo of all the products,
    and pick which one seems right to you. I would however stay away
    from MYSQL (you *MUST* have good transaction behavior and
    good cursor support for multi-gigabyte databases, unless you happen
    to enjoy downtime.....) I've never toyed with any of the other
    freebies, so I can't comment on them.....

    Tadghe

    --
    Bugs Bunny was right.
  104. Re:What about sybase RAW IO ? by ebradway · · Score: 1
    ASE 11.0.3 is FREE?



    Not quite. You can download RPMs of ASE 11.0.3 for development and evaluation purposes, but when you actually implement an application with it the license fees are the same as the NT version - about $300/user or a flat $50K if the app touches the web.


    Oracle's fees are similar but they are pushing use of "application service providers". Basically web hosts that have licensed Oracle instances that you can use for a premium.




    BTW, if you want a free set of Oracle Development CDs, check out:



    http://www.oracleanswers.com



    Register for the Atlanta or Birmingham seminars. They are fully booked - you won't get to go. But Oracle will send you a complimentary set of CDs for free



  105. Re:Oracle 8.0.5 not 8i! by ebradway · · Score: 1

    Gee, I wish you hadn't posted that as an AC...

    And I think I should try to find another forum for this discussion. This is a VERY important topic for many Oracle folks...

    My big question about this post is:

    When you say it blows chunks, are you saying SQL*Plus blows chunks or does the server instance actually give up the ghost? I'll have to try it here at work (I think we've got an 8i instance around here someplace...)

  106. Re:What about sybase RAW IO ? by ebradway · · Score: 2

    No RAW IO on Linux. I won't go into detail here, but Linux doesn't support Raw IO right now. Linus himself has stated that he has no interest in adding raw IO support because the performance increase from using buffered IO far exceeds the disaster recovery benefits of raw IO. But I hadn't though about FS limits. That's not so much a problem with Linux or Intel, but rather a problem with the POSIX file manipulation functions in GLIBC or LIBC. What does ftell return? A signed 32-bit long, which has a range of -2GB to +2GB. I think there are some kludges that use an unsigned long, but until 64 bit POSIX functions, we're stuck.

  107. Oracle 8.0.5 vs. 8i? by ebradway · · Score: 2

    I've used PostgreSQL and Sybase for a few Linux based projects. I use Oracle 7.3 on HPUX at work. I'm helping a friend spec out a new project. He's ultimately going to use Oracle on Solaris (its government work - so the DB & OS choice is decided through politics). But development will likely occur on Oracle on Linux.

    We have been going around in circles trying to decide whether to use Oracle 8.0.5 or 8i. As far as I can tell, the only significant advantage to 8i is the JVM. But 8i requires a minimum of 128MB RAM and 256MB swap without the JVM and 256MB RAM and 512MB with the JVM. Note: those are the minimums. And Oracle's not kidding. Don't even try firing up an 8i instance with less than 128MB RAM. Ultimately, the memory requirements will be satisfied. I suspect we'll spec out a Sun Ultra20 with about 1GB of RAM. But We don't want to invest a fortune in the development hardware.

    Oracle 8.0.5, on the other hand, runs nicely with at little as 32MB of RAM. Plus, the install is CUI - whereas 8i requires X (I usually don't install X on my servers).

    Besides a few cool new features in PL/SQL, does anyone else have any experience with 8.0.5 vs. 8i that would help sway me one way or the other? Right now we are pretty much set on using 8.0.5...

    1. Re:Oracle 8.0.5 vs. 8i? by mitchy · · Score: 1

      I've been using all of the Oracle stuff for linux (actually have been using Oracle since before linux) and have to say that 8i is the way to go. Why?

      1) Two words: Log Miner.
      2) PL/SQL runs faster, without modification
      3) Index-organized tables
      4) More database attributes are dynamic

      Basically, it is faster, can be modified on the fly, lets you cluster tables, and it's even MORE difficult to install. [slow grin]

      - mitchy

      --
      "The mind is a terrible thing to, um, uh, oh bollocks." -- Me
    2. Re:Oracle 8.0.5 vs. 8i? by peterarm · · Score: 1

      Unfortunately I don't have any advice, but I might soon be in a very similar position and I'd be very interested in hearing what you decided on.

  108. Large Databases on Linux! by Gogo · · Score: 1

    Hi,
    When it comes to very large databases, Oracle is a good choice. I have run databases of about 5GB on
    NT without much problem! And still larger ones on
    Solaris. It is very rugged. Only thing is that as
    the database grows you will have to do lot of tuning to keep the performance to acceptable limits.
    Postgresql should be a good choice for a secondary database. ie, if u can use it for day
    to day data collection and put the collected data
    periodically to Oracle. Postgres also supports integrity constraints and quick scripting now.

  109. Oracle v. PostgreSQL? No contest. by JBettis · · Score: 1

    Use Oracle. PostgreSQL is just not ready for prime time.

    I have used both, and I have lots of problems with PostgreSQL. Large Object handling in Psql is bad bad bad. I get errors from the database all the time. On the other hand I have dozens of Oracle 8 databases running night and day with no problem.

    Oracle has a huge learning curve though. Consider Interbase, it runs on Linux and is very good. It is so simple to admin, no real maintence required.

  110. Database choices... by tomierna · · Score: 3
    There are so many RDBMSs out there that it is very difficult to make a good choice quickly.

    Even if you limit your choices to one OS, you still have at least five or six great DB platforms.

    The choices you've given are by no means the only ones you have, and the reasons you give for narrowing down your choice of engines seem to be pretty sparse.

    For instance, Sybase ASE and ASA both support referential integrity constraints. ASE's are more limited than ASA's, but ASA has RI checks that are comparable to what I've seen of Oracle 8i.

    All of Sybase's products have T-SQL, which are their programmatic extensions to SQL92. Sybase has a robust C API for writing server extensions.

    Sybase ASE is fully capable of handling the data load you've specified. So is Oracle's enterprise-scale product. So are some of Informix's products.

    Interesting tidbit about Postgres - parts of it found its way directly into a product called Illustra. Illustra was bought lock, stock and barrel by Informix. Informix has in one of their products the object-polymorphism you crave.

    From the PostgreSQL Guide:

    The authors of PostgreSQL 1.01 were Andrew Yu and Jolly Chen. Many others have contributed to the porting, testing, debugging and enhancement of the code. The original Postgres code, from which PostgreSQL is derived, was the effort of many graduate students, undergraduate students, and staff programmers working under the direction of Professor Michael Stonebraker at the University of California, Berkeley.

    Michael Stonebraker spun Postgres into Illustra. For a while after the Informix buyout he was grafted into their executive management. I dunno if he's still there, since Informix doesn't seem to list that information on their site.

    Informix still has grants at UCB as well as many, many other colleges. (source) Some of those grants sound interesting...

    Now, I realize that the subject was "Linux Databases", but there are *solid* RDBMSs that run on the various BSD's and commercial *nix, too. I mention this because of the large scale data needs. I'm not sure that Linux running on any PC hardware is capable of supporting the amount of data you are looking at. (Notice I said PC hardware...)

    Because of that, I'd suggest a commercial *nix, running on a server-class machine and an engine from one of the "big three" database vendors: Sybase, Oracle or Informix.

    All of these vendors offer their enterprise-level engines at very reasonable prices, since you are buying at the per-seat level in most cases.

    Of course, I'll likely get flamed to hell and back for the above statements because I'm suggesting commercial, non-GPL engines running on commercial, non-GPL OS's on top of hardware that wasn't built by stuffing armfuls of parts from Fry's into a whitebox ATX case.

    If I were stuck cobbling things together because of budgetary constraints, I'd go with Linux running on a whitebox and use the free Sybase ASE installation that bundled with RedHat.

  111. Re:Linux Disk Management/Journalling by bos · · Score: 1
    Cormac's post is a little misleading. Oracle writes straight to disk, and doesn't go through the filesystem at all, so fsck never gets involved. However, fsck is necessary for PostgreSQL.

    Unfortunately, there are not any journalling options available for Linux yet. Stephen Tweedie's ext3fs stuff is only barely in alpha, and SGI hasn't gotten very far through porting XFS. For now, if you're using a database that goes through the filesystem, you're stuck with fsck.

  112. Re:mysql? - have had problems... by mkleehammer · · Score: 1

    I really like MySQL, but just yesterday I switched to DB/2. I have had corruption in 11M+ record tables three times now, each time completely unrecoverable. Reimporting that many records once was acceptable. Twice scared me. The third time, I switched.

    Temper this with the fact that we are using a JDBC driver to access it that might be doing "bad things", and the fact that it seemed to occur during development. I'm guessing it was related to killing processes (during debuggin) and it not cleaning up dropped connections properly. (Lack of transactions, maybe?)

    --
    - mkleehammer
  113. DB2 all the way by Shin+Dig · · Score: 1
    I am a big fan of DB2, especially its speed with really large tables. Version 6.1 for linux installs really slickly, and has finally gotten rid of the need for ksh, which the 5.2 version had.

    If you want to do big database stuff, you should look at who does it well. The US Patent Database runs on DB2, which is the largest public database in the world. (I think the numbers were 15 Terrabytes Compressed Data last time I read up on it)

    Free 60 trial downloads can be snatched up at this fun address.

    My main feeling that this is a better option over something like MySQL is that if you are really shuffling arround that much data, you need a really robust transaction control engine. Last time I worked with MySQL, it was fast, but lacked real transaction control. This may have been added by now, as I haven't checked of late.

    --
    There is no silver bullet. Plus, werewolves make better neighbors than zombies or vampires anyway.
  114. Re:Um, is Linux really appropriate for such a by PigleT · · Score: 1

    Perhaps we ought to bear in mind that it's only expected to be amassing 40Gb a year - it's hardly as though you're going to want to do a select * across one table that size and get all the events for a year.
    So yes, journalling would be a good idea, but it would also be very sensible to invest in a regular (weekly? daily?) archival solution.
    He could always use postgresql, pg_dump | gzip, and then drop the tables, every night :)

    --
    ~Tim
    --
    .|` Clouds cross the black moonlight,
    Rushing on down to the circle of the turn
  115. Re:mysql? by PigleT · · Score: 2

    That is an alternative, although if there's one thing you might want in a high-load network-management station, it's transaction support.

    From the PostgreSQL Admins' guide (http://www.postgresql.org/docs/admin/manage-ag175 6.htm#AEN1774), I note there's a section on large tables, including the line:
    "Since Postgres allows tables larger than the maximum file size on your system, it can be problematic to dump the table to
    a file, since the resulting file will likely be larger than the maximum size allowed by your system.
    As pg_dump writes to stdout, you can just use standard *nix tools to work around this possible problem:
    Use compressed dumps:
    % pg_dump dbname | gzip > filename.dump.gz"

    etc.

    As regards postgresql being "as grown up as Oracle", erm, it's certainly a viable solution, and doesn't suffer from the bloatware problems of Oracle.
    However, you might well find that it's not as fast as Oracle or MySQL for adding lots of events as rows in the table quickly - it all depends.
    Try it out and see :)

    --
    ~Tim
    --
    .|` Clouds cross the black moonlight,
    Rushing on down to the circle of the turn
  116. Re:Be aware of the complexity of Oracle by ZamZ · · Score: 1
    Sorry about this but I could only find it on Metalink which needs an account that is tied to your oracle license.


    If you have access the direct link is http://support.oracle.com/ml/plsql/knowledgebase.g etcr?textkey=239049


    If you can't get access to it I'll be summarising something similar on a new web page about Oracle that I'll be creating in the next couple of weeks. Send me your email and I'll drop you a note when its ready.


    Basically it discusses keeping objects grouped into tablespaces by size and enforcing keeping the same initial and next extent sizes for all objects in a tablespace. It also goes some way into pointing out why the 'my tables in 100 extents' panic is a myth.



    ZamZ@mzm1.demon.nl is my email BTW

  117. Be aware of the complexity of Oracle by ZamZ · · Score: 5
    If you are looking for a 'black box' to put your data in be aware that Oracle, even at only 40G of data, requires some heavy administration at times. You may want to consider MySql for an easier ride on admin.



    As well as knowledge on tuning the database you'll have very specific Oracle SQL tuning to do. I've been doing this type of thing for a number of years now and I still don't know 100% of what is needed - there are always knew features to get aquainted with.


    In saying that there are some good resources to get you started. Have a look at http://www.orapub.com for white papers and before you even start to lay the database down see the Oracle white paper (from their website) 'How to stop defragmenting and start living'.


    You should also take into account that 8i is a new-ish release of Oracle. Keep an eye on the bug lists for it and make sure you won't be impacted by anything.


    Apart from the warnings, I've run around 60G Oracle db's on Linux and as long as your server is beefy enough you should have no problem.


    One big thing, if you do go down the Oracle route have t-shirts made up with 'I've looked at v$system_event today, have you?' written on them and wear them. If there is one way of finding what your performance problems are/will be its this table. The only real exception is bad SQL which you need to trap in other ways

    1. Re:Be aware of the complexity of Oracle by hey! · · Score: 3

      I agree with this wholeheartedly, especially about the documentation. You need to be a good skimmer. If you can't sit down and skim through a couple thousand pages of documentation in a couple of days, then you could be in trouble. In the past, I've also got bitten by Oracle documentation that was just plain wrong.

      But... if you aren't constantly under the gun from cranky PHBs, tuning an Oracle database can be _fun_. If you've never worked on a system this configurable before, it's like you were a model rocketer and suddenly were given a chance to fire off a Saturn V. Chances are it will blow up on the launchpad, but if you get everythign right the results are spectacular.

      --
      Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
    2. Re:Be aware of the complexity of Oracle by Get+Behind+the+Mule · · Score: 3

      If you are looking for a 'black box' to put your data in be aware that Oracle, even at only 40G of data, requires some heavy administration at times. You may want to consider MySql for an easier ride on admin.

      That's quite true (but see my other post about shortcomings of MySQL). I can't help the original poster much because I have no experience with PostGres, but I can tell you that while Oracle is very powerful, the learning curve is very steep, and you'll need a skilled, experienced DBA to really get all of its benefits to work for you.

      I have found few things in the world more frustrating than Oracle documentation, not because there's not enough of it, but because you can't see the forest for the trees. The first time I had to administer an Oracle DB I was overwhelmed, and there was almost nothing there to help you learn how to just get started. By now, I feel relatively comfortable with Oracle, but it's taken months of practice.

  118. Some thoughts on databases > 20Gb by GraemeL · · Score: 1

    Some thoughts and tips.

    I've used Oracle for about 10 years. 8 of them as a DBA. Databases have ranged from a few hundred meg on a users desktop, to 150Gb warehouses to 70Gb OLTP supporting 300+ users on High end Unix.

    With anything bigger than about 20Gb think about the following :

    Design for performance. Think carefully about indexing and disk layout. Don't be afraid to denormalise to cut down on complex joins.

    If you are not going to have RAID 1+0, try to keep tables and indexes that will be used at the same time on different arrays.

    In the case of Oracle, if you have a lot of writes, keep your Online Redo Logs off of RAID 5. Put them on a disk by themselves and use Oracles built in multiplexing to mirror them.

    If you are going to have a batch run, it's much more efficient to code jobs to run in parallel rather than relying on the inbuilt parallel functions in the database. Remember that a few microseconds saved on processing each row makes a huge difference when running through a 200 million row table.

    If you are rolling data out of the database after a certain time, use table partitioning by date. This allows you to drop the oldest partition instead of having to delete 20 million rows from a table. Makes the DBA's job a lot easier and mininises downtime for reorganisation.

    Finally, vet all of the code designs and implementations. Most programmers are used to working on databases that are a lot smaller. I've seen a lot of systems struggling because of coders inexperience with large systems. It is much cheaper to design it right at first than to have to rearchitect later.

    Good luck.

  119. How about the complexity of Linux? by FutileRedemption · · Score: 1

    I had the same experience, but with Linux.

    But not months. To get comfortable with Linux took me four years.

    But thats true for any Unix, I suppose.
    And four years ago it was a little harder to install and use linux, anyway...

    Ok, somewhat offtopic, but not completely.

  120. True hacker by FutileRedemption · · Score: 1

    Yeah, and you will hack in ten years still.

    I suggest that a True Hacker is someone that looks at the tools available to him and gets something up and running as fast as possible.

  121. Interystems' Cach� by dublin · · Score: 4

    I'm not a database expert, but I've done a fair amount of healthcare IT consulting over the years, and run across Intersystems' Caché database quite often.

    Most people aren't familiar with Caché from Intersystems, but if you're building a very large, sparsely populated transactional database (like an electronic medical record, for instance), it's at least an order of magnitude faster than Oracle, even after doing unnatural things with Oracle. (This info came directly from a major EMR vendor that benchmarked Cache and all the usual suspects and found nothing in the same league for this application. According to his tests, Caché was 30x faster than Oracle out of the box - Intersystems' website claims Caché is 20x faster than RDBMS competitors.)

    The technology is quite interesting in some regards: It is very mature, having its roots in the old MUMPS or "M" system (about as old as Unix), but has been updated quite nicely as time has passed to provide quite capable SQL and object-oriented interfaces, even though it is neither an RDBMS or an OODB internally.

    Another nice feature is its distributed caching protocol, which allows you to build a logically huge database server piecewise from a number of distributed servers. A number of large installations, including Boston Children's Hospital (IIRC), use this for their EMR - last I heard, BCH's database was spread over 100-150 servers. This would seem to fit nicely with the Linux way of doing things.

    It runs on Linux and a free download is even available on thier website, or they'll send you a CD for the asking if you're one of the unfortunates stuck on the end of a phone line.

    You can find Intersystems at http://www.intersys.com, not exactly the world's most intuitive URL, so I thought I should include it here...

    --
    "The future's good and the present is nothing to sneeze at." - Roblimo's last ./ post
  122. files longer than 2GB by spiffyboy · · Score: 1

    The linux kernel should be capable of handling files longer than 2GB, if you meet the following requirements:


    - your software (oracle, in this case) uses the 64bit versions of read, write, stat etc.

    - your filesystem is capable of handling large files.


    Now if you are going to use oracle, you should use
    partitions directly. It just bypasses the filesystem and uses the disk in raw mode.

    1. Re:files longer than 2GB by spiffyboy · · Score: 1

      this link is an interview with one of Oracle's developers and what they think Linux is lacking.
      They still say that raw devices have advantages...

    2. Re:files longer than 2GB by theonetruekeebler · · Score: 1
      As has been pointed out elsewhere, file size doesn't limit Oracle table sizes. I have a 103GB Oracle database on HP-UX, which has the same 2GB limit as ext2fs; my largest single table is 12GB.

      Oracle organizes its data in tablespaces, which can contain multiple tables, indexes, clusters, rollback segments, and such. A single tablespace can be composed of a large number of data files. To create a ten GB tablespace, you can create a tablespace with 5 2GB files, which will give you 10GB storage (less about 8kb (or whatever your blocksize is) overhead per file). So long as you remember to allocate in multiple 2gb-8kb chunks instead of single >2gb chunks, you'll do fine.

      Perversely enough, I find the 2gb limitation to be something of an advantage, as it encourages me to spread my data across multiple spindles. I've never used raw partitions; in fact, I've used spindle-distribution and backup/recovery concerns as an excuse not to.

      --

      --
      This is not my sandwich.
    3. Re:files longer than 2GB by Get+Behind+the+Mule · · Score: 1

      Now if you are going to use oracle, you should use partitions directly. It just bypasses the filesystem and uses the disk in raw mode.

      I have my doubts whether there is any application that really needs to go to that much trouble. You can indeed speed up your I/O that way (and Oracle is I/O-bound), but you make administration significantly more difficult -- so much that it's hardly worth it. There are many ways to squeeze more performance out of your Oracle application that don't sacrifice the ease of administration over the filesystem. IIRC even Oracle has advised against using raw devices, for this very reason.

  123. Re:SQL 7 by haizi_23 · · Score: 1

    hmmm. perhaps that would be an option if microsoft were to provide a sqlserver 7 port to linux.

  124. Re:Linux Disk Management/Journalling by hey! · · Score: 2

    If you're using Oracle, better to let Oracle take care of these things.

    Oracle tablespaces are, in effect, a lightweight, special purpose file system. By keeping the transaction logs and data tables on different tablespaces on different devices, Oracle can gracefully handle the failure of any one hardware device, allowing you to move the affected data to a different device, with NO DOWN TIME. Oracle does a better job of handling this than any operating system filesystem, so running the tablespaces on top of a filesystem just adds bloat.

    Speaking of bloat, people who call Oracle "Bloated" need to examine their definition of "bloat". Let's say you have requirements x and y, and piece of software has features X and Y that satisfy them. It also has feature Z that satisfies some other requirement z that you don't have. That doesn't make it bloated, it only means that it might be overkill for your application.

    --
    Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
  125. Re:SQL 7 by hey! · · Score: 3

    Yeah, but the problem with SQLAnywhere is that Sybase doesn't want it to get TOO good. One of the things they've done is provide a wacky implementation of SQL, and taken out things like statistical aggregate functions like standard deviation. It's small, fast and useful, but in terms of compatibility its the pits.

    --
    Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
  126. Re:SQL 7 by ColtCougar · · Score: 1

    Try http://www.pick.com -dts

    --
    -There are only soldiers, and men who wish they were soldiers.
  127. mysql? by jeep · · Score: 0

    If you are looking for mission critical data, why not give a chance to MySQL?

    As I've seen it was designed to handle lots of data (see http://www.mysql.org/what-is-mysql.html)

    As my experience is concerned MySQL is very robust and fully threaded too...

    1. Re:mysql? by neepneep · · Score: 1

      I believe MySQL doesn't do any referential integrity and that is just what Eugene was looking for, I believe.

    2. Re:mysql? by Get+Behind+the+Mule · · Score: 2

      Uhm, we use MySQL at my company for some projects, but not for projects where we need a DB that is "mission critical" and "very robust". For those needs we go to Oracle.

      The lack of referential integrity has already been mentioned. Far more critical is the lack of transactions! The ability to roll back is indispensible in a sensitive application. Also, Oracle has vastly more sophisticated tools for backup than MySQL has -- another essential element of mission-critical DB administration.

      Oracle is bloated and expensive, and I certainly don't recommend it unless you really do need it. But sometimes you do.

  128. OODBMS? by laetus · · Score: 1

    Quick question, are there any OODBSs out for Linux? Also, anyone have any knowledge of how an OODBMS runs (speedwise) against all these RDMSs mentioned (Oracle, DB2, etc.).

    ------------------

    --

    "We're sorry, but the website you're trying to reach has been disconnected."
  129. X86 Database by bifrost · · Score: 1

    Choosing Oracle is probably the way to go, mostly because if its what you're already using, you should stick with it. However one problem you might run into if you run it with Linux is the memory data structure size, and the file size. FreeBSD 3.X allows you to have larger than 2GB files around, so does Solaris, and Net/OpenBSD.

    There has been some interesting success in putting Oracle 8i on FreeBSD, it seems to run faster and a bit smoother under FreeBSD, but I can't confirm that personally because I haven't put it into production. I am going to be installing it today on my workstation to give it a shot.

    To be honest though, you should really be rolling your databases over before they hit filesize limits if you're only growing by 40GB per year. I would invest well in FCAL raid and see what gives you the best performance. Linux's FCAL support sucks so far, we had to discontinue using it and switched for FreeBSD on one installation I was working on. Just a couple of bumps

  130. Re:DBD::Oracle Linux by endquote · · Score: 1

    Has anyone had any luck getting DBD::Oracle to make with an 8i Oracle client install?
    I was looking through the README.clients and very little had much relation to the Oracle8 install or was just plain wrong even for Oracle 7?

  131. Re: Linux Databases with Huge Tables? by msobkow · · Score: 2

    If this is a mission critical system, you should seriously consider putting together an RFQ that characterizes the application access, external interfaces, data volumes, and sketches out the major data domains. Choose the technology based on the application, rather than choosing the technology and trying to make the application fit.

    No matter what you choose you will be spending a good chunk of money on hardware and maintenance staff. An good DBA with experience tuning the database you choose will be critical to success, as will an architect that can address the design issues *before* you invest in the development.

    Most of my large RDBMS experience has been with Oracle 7.x and DB/2 on commercial SMP Unix systems. Both have performed quite well when properly tuned and administered. I've also used Sybase for mission-critical apps, but the ones I've worked on weren't that big (1-2GB total vs. a few hundred GB.) I do know of companies in the banking industry who use Sybase for large databases as, so I expect it could handle the load as well.

    As others have pointed out, your key issues are likely to be backup and restore capabilities rather than a simple "which is better" issue.

    I guess it's obvious that I wouldn't recommend an open source solution (Postgres, MySQL, et. al.) to a client in this situation. There are a lot of situations where I'd recommend open source, but large mission critical data repositories isn't one of them.

    To me it's not an issue of whether an open source solution *could* handle the job -- it's that I don't have experience with deployments that *prove* it will work. I wouldn't consider products like MS SQL Server for the same reason.

    --
    I do not fail; I succeed at finding out what does not work.
  132. If what they say is true... by fart_face · · Score: 1
    If it's true that Postgres and MySQL don't have fully cooked referential integrity, then you really don't have much of a choice other than going for a commercial DBMS. Referential integrity is key, as is solid design. If your wanting a lower-priced alternative to Oracle, which is a bit pricey, then possibly you may want to consider Informix. It's got all the stuff you need, but not as expensive. Since you're not (I'm guessing doesn't seem so from your question ) running in an OLTP-type environment, Oracle may be too much. Informix might be just right. I would avoid MySQL or Postgres simply because I would never want to work with a database that didn't have a totally grown-up referential integrity scheme. It is your friend. Without it, you will never know if what you're seeing is right, especially when it comes to making views or doing joins.

    Seems like what we need is a good OS RDBMS, then I'd go for that. From what I've seen so far, there's a long way to go.

  133. Re:Use a real operating system, use a real databas by theonetruekeebler · · Score: 1
    If this system should be up all the time, be sure that whatever you use can (like Oracle) take online backups, but get a fast taperobot. This could easily cost as much as the system.
    Or you could do what we did: buy a couple of extra drives and do hot backups to compressed files on said drives, and keep compressed log archives there too. Backups to disk go very very fast, and you can tar them off to tape at your leisure without significant disruption to the production DB. Plus, if a real data drive fails, recovering its datafiles is a lot faster: drive failure is by far the most common event requiring restore-from-backup; if you're doing it from another disk, your downtime is dramatically reduced. Tapes will help you recover from more extreme disasters like fires or total system failure.

    --
    --
    This is not my sandwich.
  134. Interbase by AppyPappy · · Score: 1

    Interbase runs on Linux as well. It's limited to 32 TB though.

    --

    If you aren't part of the solution, there is good money to be made prolonging the problem

  135. Linux Disk Management/Journalling by CormacJ · · Score: 2

    You need to look into the journalling and diskmanagement side of linux.

    On databases this large it might still take an age for linux to start up again if it has to fsck the disks.

    There are a few journalling options out there, and this is something that is *very* important to look at if you want to ensure data integrity.

    Currently other Unix systems do have support for very large disk arrays.

    The use of databases on Linux is pretty much a cert, but your big problem will be in disk handling.

  136. What about sybase RAW IO ? by mbyte · · Score: 1

    The Sybase ASE can use RAW devices as database storage, so I think the 2 GB limit does not apply here. The ASE has some drawbacks when it comes to large tables (esp. forward only cursor), but if you compare it to MySQL, which has no cursors at all ....
    (do a select * form bigtable limit 1000000,10 or so in mysql .. it takes AGES, a DB with good cursor support helps ALOT in this situations ...)

    I am just trying the ASA (adaptive Server Anywhere ) from sybase ...

  137. Re:SQL 7 by sarahw · · Score: 1

    I am not pretending in any way to be an expert here, but...

    the company I work for uses sql server (on an NT platform) as the back end of a time-based billing software for lawfirms. SQL Server seems to work o-k for the day to day transactions, but we have no end of problems in the conversion phase before a firm goes live. Raising indexes, simple queries, dbccs and the like - really simple stuff - can take hours (or days) on a 20-30 gig DB.

    Now, I don't know if the problems here lie in the way our install group sets up the software, or if this is just to be expected - but my experience is that sql server just doesn't handle medium to large datasets very well at all.

  138. Perl and Monitoring Oracle by moscow · · Score: 3
    Are you aware of the perl DBI? It works for both Oracle and PostgreSQL (and mysql and ...). This can be used to provide access to databases regardless of type in a uniform manner, and are truly useful for remote access and regular running jobs, particularly those which use standard SQL. In Oracle terms, you can run Perl DBI anywhere that you can run sqlplus.

    There are a number of GPL products which use Perl + DBI. One of the best, IMHO, is Orac which also uses Perl/Tk and so provides GUI access from multiple platforms (Solaris, Linux, NT). Orac offers loads of SQL scripts to help with tuning, or just seeing the layout, of databases. It also provides realtime database monitoring, which is the current thrust for improvement of the tool. You can find it on CPAN, e.g. here .

    Another monitoring tool, which is capable of emailing you when it's unhappy as well as putting up current status on a web page is Karma. This is still developing rapidly and is intended for Oracle on Linux. This can be found here on freshmeat.

    Hope this is of some use.

    --
    Who would believe in penguins,unless he had seen them? Conor O Brien - Across Three Oceans
  139. Faster than hot backups by rjstanford · · Score: 1

    You need a three-way mirror for this to work safely, but what we've done in some sites is set up just that (through RAID controllers). When you want a backup, just bring one of the mirrors offline, freeze the data, and archive it off using OS tools. Reconnect it and it syncs back up.

    --
    You're special forces then? That's great! I just love your olympics!
  140. A few words in favor or ORACLE by billr · · Score: 1

    We (the company) use Oracle. Our largest database is 300 gigs, and growing. It supports in excess of 4500 simultaneous connections during the day. Our system (we're a cog in the wheel mind you), executes in excess of 10,000 (moderately complicated) queries/hour. SQL/Net stinks, but its better than ODBC.

    Recently, we dumped Alphas 'cause they couldn't keep up. Currently it runs on (2?) HP V-class machines with EMC storage.

    Optimization is important. Go get the O'rielly books. Set yourself up for Cost based optimizer. I like Oracle, because you can really get into the guts of a query if you need to, and if you're adding as much data as you say you will be then you'll need to.

    Oracle also supports partitioned tables. So you can take your 2GB table and make it into a series of smaller tables (although it appears as one). If you can come up with a good scheme to partition on, then you could get great benefits from this.

    All of this is said w/o any real knowlege of postgres or mysql. So if they do those things to then kudos for them.

    Good luck.

    --
    I've finally found the off by one erro
  141. Do it in ODBC/JDBC (with DB/2 or MySQL) by RickyRay · · Score: 1

    Oracle is all PR and no extra substance. They are the Microsoft of enterprise databases (from what I can tell, their goal in life is to destroy M$ and replace them as the worthless evil empire). DB/2 is just as powerful and stable, easier to program for (e.g. tools that automatically generate code for it, good documentation, etc.), though still a bit pricy. If you write code handling transactions, MySQL is a huge cost saver, and very fast/stable. On a project we just put a proposal in for, they had suggested Oracle. Pretty quickly we discovered that the pure Oracle licenses were going to cost more than they had planned to spend on the entire project. BAD!!! The costs of Oracle are absolutely pathetic. If we get the project, we found it will be more practical to use something more affordable and spend a little bit more time on the transactions (writing a tad of Java code will give me my transactions on MySQL, and will probably get as fast or faster performance; I've done transactions in Java before, and it worked just fine).

    The best way is to write everything as ODBC/JDBC, not vendor-specific code, allowing portability. Working with legacy databases is a nightmare, especially when you have any reason to port them to another DB. Yes, specialized commands can buy you a bit, but in the long run are rarely worth it.

  142. Re:mysql? - have had problems... by 2sheds · · Score: 1

    Lack of transactions can be worked around, in many cases. As for the table corruption - which version of MySQL were you using? The latest branch with MyISAM, although offically 'alpha' is very cool.

    j.

    --

    Absit Invidia
  143. 2 GB File limit=2 GB Table size limit? by aUser · · Score: 1

    Do I understand it right that any single table has to fit into one, single file, or will PostgreSql transparantly split the table over several OS files?

  144. Re:DBD::Oracle Linux by otherwhere · · Score: 1

    All the king's horses and all the king's men (1 experienced Oracle DBA and 1 good Linux sysadmin) couldn't get DBD::Oracle working with 8i on Intel redHat 6.0 Linux at my shop.
    The make test failed loading a shared library, so I wrote a simple c prog just to load all the .so libs in a dir and it would load every .so file in the disk except any one in the oracle tree (Forget the error now, sorry). The kluging that followed was horrific.

  145. Oracle... by gdyoung · · Score: 1

    #ifdef SHAMELESSPLUG
    At the risk of sounding biased in favor of Oracle, I'd say it's an excellent database to run on Linux. If you want to run a large, professional database, Oracle has a lot of features to offer (and likely more in the works). The reason I say this is more of the line that if you're interested in spending money on an important database you get a lot out of Oracle. Though, Linux isn't a primary build for 8i, it does get a lot of eyeshare with the higher-ups. Our CEO's announcement of a Linux box distributor business should give you some proof of that statement.

    One person's comment that Linux 8i is akin to C++, highly mutable, & blisteringly fast, yet a bitch to use, might not be too far off, though, but you're already getting yourself into that boat if you're intending to run your database on Linux, eh? If you could do a Slackware installation 4-5 years ago, you demonstrate the intelligence needed to figure out Oracle, but it'll just take a bit of time. Grabbing a nice O'Reilly book might help, too. :) Some other databases (like Microsloth's) might give you more usability, but always at the cost of some of the impressive features of a more complex system. I suppose the situation is just a classic trade off. Research the prices and -features-, and look for some independent benchmarks, and you're set.
    #endif // SHAMELESSPLUG

    The words of an engineer, it would appear

    --
    #include"disclaimer.h"
  146. Perhaps Terradata by JinxMaster · · Score: 1

    What you're looking for is NCR's Terradata product. It's UNIX-based. I don't know if there is a Linux port.

    This dbase was designed from day one for very large datasets. It's been around long enough to trust it with your data. I would definitly check it out for a dbase with a growth rate that large.

    http://www3.ncr.com/product/teradata/

    --
    ****** WE hold these Truths to be self-evident, that all Men are created equal, that they are endowed by their Cre
  147. I agree. by Egorn · · Score: 1

    I run many high traffic sites including the one listed below and I have never had any problem with MySQL. I did start using PostgreSQL but I found MySQL to be more robust and just in general and much more plesant experience. I would have to recommend MySQL. Of course with very much larger projects I would recommend Oracle!.
    ---------------------------------------- ---

    --

    Movie News - "Entertainment news, bitch!"
  148. Some Oracle features to consider by cbradley · · Score: 1

    While you should have no problem with a large database on Linux w/Oracle, some of Oracle's features might give it the upper hand in your decision. Considering your database is mission critical and presumably 24/7, your backup strategy is crucial. Oracle will allow you to do online hot-backups while the database is running. A definite advantage over the cold-backups. Also if using large tables, consider Oracle's table partitioning option, which allows a table to be spread across several files (tablespaces) and be queried either individually (by partition) or as a whole. On very large databases this can help reduce query time significantly.

    C Bradley
    NCycles, LLC

    1. Re:Some Oracle features to consider by cbradley · · Score: 1

      Yes, cold backups are the best, however, in a true 24/7 database that would be impossible

  149. Um, is Linux really appropriate for such a by TummyX · · Score: 1

    huge database? I mean, perhaps OSs with proven larger disk support, memory, and jfs would end up being a better investment?

    I'm sure linux is perfectly fine on smaller databases, but if you're talking about tens of GB, umm....?

  150. There might be more .... by xinos · · Score: 1

    check out the enterprise solution at
    http://www.sleepycat.com

  151. History (was:Database choices...) by GoBears · · Score: 2
    • mike stonebraker is CTO of informix (see http://www.info rmix.com/informix/corporate/overview/execs/execs.h tm).
    • mike stonebraker is also CTO of cohera (see http://www.cohera.com/stone.html).
    • Relational Technology (RTI, subsequently Ingres Corp., then part of ASK, now part of CAI) was the commercialization of the University INGRES codebase. INGRES was a Berkeley project from the early 70s to the mid 80s.
    • Miro Systems (subsequently Montage Software, then Illustra Information Technologies, now part of Informix) was the commercialization of the University POSTGRES codebase. POSTGRES was a Berkeley project from the mid 80s to the mid 90s.
    • stonebraker has always released his research software in source form, even in the 70s and early 80s when this wasn't particularly fashionable (these days, the funding agencies make a much bigger deal of it). external contributors were always able to donate changes which were folded into the postgres source tree at berkeley. (most ports happened this way.) the only real change that happened with regards to licensing is that there used to be a UC-imposed license fee for commercial use. a couple of grad students talked stonebraker into talking UC into dropping this fee.
    • systems like illustra have come to be known as ORDBMS (object-relational), which is very different from OODBMS. don't expect OQL - expect SQL3/SQL-1999.
  152. Use a real operating system, use a real database by Laglorden · · Score: 0
    Although I probably will be flamed from some of the crowd here I can't resist saying this;

    If this is this data is in any way missioncritical or important for you, then you should use a Unix with proven track-record for this kind of stuff. AIX, Solaris, Tru64, HP-UX whatever. Likewise I would choose a "mature" database. Like Oracle, Informix, Ingres, Sybase, Mimer (here in Sweden).

    I think (this is an opinion) that a Linux/Opensource system still is a bit immature (like NT) for this kind of work. It's also a question about hardware reliabiblity on PCs. It takes a while to fsck a non-journaled filesystem.

    If your data is worth 100 times the combined hardware + software costs it doesn't feel too bad coughing up some money to get a "real" system.

    Remember backups If this system should be up all the time, be sure that whatever you use can (like Oracle) take online backups, but get a fast taperobot. This could easily cost as much as the system.

    Of course if you are trying to cut costs, don't use Oracle.

  153. get a real file system by bitjunkie · · Score: 1

    it blows my mind that as many people advocate linux as they do considering that the file system can't even handle files larger than 2GB! i mean, even NT can handle that. get a real OS if you're going to be doing something on the scale you are talking about. Linux is definitely not it.

  154. Re:Oracle 8.0.5 by horsie · · Score: 1

    that's http://jordan.fortwayne.com/oracle/

  155. Oracle 8.0.5 by horsie · · Score: 2
    We are currently running Oracle 8.0.5 Enterprise Edition on a RedHat 6.0 Box... and so far its been VERY stable...

    Good install procedure for 8.0.5 on RH6 is on http://joradn.fortwayne.com/oracle/

    We currently have around 3 tables with over 2 million rows on each table, and Oracle has so far performed to expectations...

  156. errata by mudnux · · Score: 1
    the first paragraph should read a decrease in data access time not an increase in time.

    my bad.

    --
    NT is based on the premise that anyone who can manipulate a mouse can administer a system. Huh?!?
  157. Raw Disks are for speed not size by mudnux · · Score: 2
    The raw disk access used by Oracle databases is for an increase in data access time not for increasing the database size.

    Cliff, one thing you did not mention was in which dimension is the table "big". If you are talking about many rows, Oracle has table partioning that allows for very fast indexed searches on "billions and billions" of rows. If you have rows that exceed 2GB Oracle can handle that quite nicely using standard file systems on any platform. If you have data points that exceed 2GB you may have a problem that only a change in platform can address. This would require a blob data type and has limitations (2 GB I believe) when stored in the database. There is a bfile subtype of the blob data type that allows for any size data point but is then limited to the file system's max file size because Oracle stores the data as an actual discrete file. Yes, Oracle is not the easiest product to administer. But, it is intended for use with difficult to manage databases so has all the bells and whistles necessary to manage large databases and unruley or unwieldy data. The first time you restore the database to 16:42:13 June 12, 1999 (or any other point in time) you will realize the power of the restore features too.

    PostgreSQL is a fine product but I do not believe that it is ready for this size of database in a mission critical application (I am assuming that you would not be considering collecting this amount of data for a non-mission critical app)

    I am really surprised we didn't get the "my database is bigger that yours" statements that is typical of the oracle news groups. But then again we at /. are a finer breed (cough).

    --
    NT is based on the premise that anyone who can manipulate a mouse can administer a system. Huh?!?
  158. Informix by Trousersnake · · Score: 1

    Don't over look Informix. I was with a company that had over a terra-byte of data, mostly large BLOB files. When we tried to convert, Oracle simply couldn't handle it.

    --
    Hello! I am Inigo Montoya, you killed my father, prepare to die
  159. Sounds kinda interesting... by /Aquila · · Score: 1

    Kinda funny.. I am working on a similar project. It's a distributed snmp-based networkmonitoring program, with the capability to log data to database. I've not yet considered how large the amount of data to be logged will be, but I've chosen MySql as my database, and up until now it have handled things pretty good. Do you have a website with info on your project? I am really curious to see exactly what it is. Maybe get some ideas for my own :-) btw. If you're interested to see what i've got going, pleasse mail at >>dm97a12@skivehs.dk Sincerely, Christian Theil Have

  160. 40 GB is a small-to-medium-sized db for Oracle by Ora*DBA · · Score: 1

    I haven't used Postgres for the last eighteen months, so I am not up on its latest bells and whistles (though I liked it fine then). I can say that Oracle, properly tuned, scales up pretty big and can stay fast; I have run half-terabyte data warehouses with happy users and hear tell (though they may be fish tales) of terabyte-sized db's on large SMP machines. What I am really waiting for is Oracle Parallel Server on Linux. Unfortunately Oracle wasn't comfortable with Beowulf's DLM, so it will probably be a year or so. However, given, say, a multi-processor Netfinity and a later kernel (for better SMP support), Linux should handle a 40GB Oracle database easily.

    Amounts of data aren't so much an issue these days as concurrent connections. The nature of your app does not suggest thousands of users needing lightning response; I suspect either product will be adequate. The issue is wich support model you are more comfortable with.

    hth -

  161. Whoops, sorry about 2x post. by woobie · · Score: 1

    Sorry about the double post there, I'm in Amsterdam right now and the connections pretty flaky. Didn't think the first one went through.

    -Brett

  162. Progress(tm) by Da+Wizz · · Score: 1

    One option you could try is the Progress(tm), which is an enterprise server, primarily for the big boys (HP/UX, AIX, etc). I have used the SCO version on Linux using iBCS, and it ran a 20Gig database with tables of bigger than 2Gig. Progress have, earlier this year, now released a native Linux port of their enterprise server and 4GL development (which is very powerful and quite easy to pick up) See www.progress.com.

    --
    -= Da Wizz -= mark@froop.net