Are There Large RDBMS Using Linux?
Jason Perlow of Linux Magazine writes:"
With all of the recent computer press coverage of Amazon and Intel converting their
web servers and other front end application servers to Linux, many of these stories
neglect to mention that the back end systems these companies use still rely on
commercial Unixes like Solaris, AIX and HPUX to host their RDBMSes (Oracle, DB2,
Sybase, Informix) for their mission critical transactional applications and data
mining.
Are there any companies out there actively using Linux to host a mission-critical
RDBMS? or looking to replace UNIX with Linux for this purpose?"
I don't think that any large companies can use them. The use of free (as in beer) apps looks bad to shareholders.
Plus, senior IT execs need reliable support and assurance that they got the best software on the market for the job, just in case things go wrong. It's a liability thing.
I administer 'Theoldnewsstand' (dot com), an archive of newspaper articles, some hundreds of years old, that genealogists can search for their family in those time periods. The system relies on MySQL and Linux, and we have more than 10,000 article entries now. I've found that I actually need this operating system to keep the great performance, and boy, does it work well.
Ok, maybe they are not huge, but Prada (Italian fashion designer and sponsor of "Luna Rossa" at the last America's Cup) uses Oracle running on Red Hat, with the data stored on a pair of EMC Clariions, for their data warehouse.
I don't know what size the database is, but the Clariions had 400GB each worth of disks.
--
The world is divided in two categories:
those with a loaded gun and those who dig. You dig.
As distributions like SuSE continue pushing ahead with high-end features (like logical volume managers, which SuSE already has), usage of these products on Linux will undoubtedly increase. Part of the situation here is cost. When Oracle Enterprise Edition costs $40,000 per CPU, plus another $8,000 or so per year for support, who cares about spending a little more for high end Sun or IBM systems?
Also, Oracle 8i, while supported on Linux, did not offer a couple of features found in Oracle 8i for other systems. In particular, full interMedia support for full-text searches of all sorts of documents (especially from software made in Redmond) was not available in the 8i Linux version. The new 9i does support this feature under Linux.
Are there people stabbing themselves in their ears?
I like Linux, but on the scalability front, it's still got a ways to go. Moreover, since most Linux used by corps (at least here) is Intel based, you've got to deal with less mature hardware (backplanes, redundancy, etc.). Plus the enterprise management tools required are only starting to appear for Linux.
*climbs into his asbestos underwear to wait for the inevitable jihad*
I like lots of people. That doesn't mean I go carting them around the galaxy with me. --Dr. Who
Some people seem to run Linux on S/390's. There's a bunch of case studies here on IBM's website.
This is not a sig
With the advent of NDAs being signed many times over by any professional at pretty much every company on the face of the planet... this may be a more difficult answer to find than you think. I can tell you that my company uses Linux for our servers, but we only have around 75 or 100GB of financial data in our databases.
testing -- ignore this. thank you
Three Step Plan:
1. Take over the world.
2. Get a lot of cookies.
3. Eat the cookies.
On a related note, what are the largest installations of free software databases... especially the most popular, PostgreSQL and MySQL?
Any war stories?
How about building Redundant Arrays of Inexpensive Database Hosts?
I think we're going to see things change gradually as acceptance grows. Don't rush things. People will move when they're ready, and trust is there. Redhat's worth watching. And it doesn't have to be big vendors, as so much less functionality is needed in the DBMS in these days of N-tier & appservers based infrastructures
And how about designing FOR failure and using commodity boxes (running a free OS?) at the same time? Check out Clustra for a RDBMS that runs on Linux & Solaris, runs over LOTS of small, cheap commodity boxes, and is as a result, very reliable (yes, I do use it). Ok, so it's not free in any sense, but it's good and solid, and used by some big players in the telecoms industry.
ooooooh! What does this button do? - DeeDee, Dexters Lab.
VA Linux provided a PDF last year while we were buying hardware showing the largest Oracle implementation in the world was on Linux.
I believe it was netapp
If I find it I shall post it.
Slashdot Beta should die a painful death.
DB2, Oracle and Sybase are all available on Linux. Sure, people run them, even in big companies, but how big those databases are is a different story.
If you are running a very large Unix box, such as an E10000, the operating system is optimized for the hardware, and the release of Oracle you're running is optimized for the OS. Even so, they still don't work that well--there are many unexplained bugs and glitches, even with the latest stable releases of Solaris and Oracle. No one would want to introduce further instability with a new OS.
Furthermore, there are no potential cost savings. Solaris essentially 'comes with' an E10000, and all your administrators are trained in Solaris.
When Oracle first started producing their appliance products, they were based on Sun's microkernel.
That has since changed. They are now using Suse Linux for all of their appliances. They work fairly well for what they are designed to do, which is to provide an administratively simple appliance... you don't deal with the OS, only the Oracle admin interfaces.
Looking at my client list, 4 out of 12 of them are running various Oracle instances in Production on Linux, both Suse (the only officially Oracle supported Linux distro, if I'm not mistaken) and Red Hat. 9 of those 12 run Linux in development environments.
While the Linux deployment has usually been in a development environment, I've seen the trend start to move into Production environments. I think this can be attributed to a number of factors; the maturity/stability of Linux, the cost (hardware and software), the feature set (journalling file systems without having to pay through the nose for Veritas), and the hardware availability.
That and the fact that Oracle offers support for Suse. That is HUGE.
While the bigger companies are still using Solaris and HP-UX for their Oracle needs due to the hardware involved (I have yet to see an E10K run Linux, never mind in production), most of the smaller companies I deal with are running Oracle on Linux in some part of their company.
Also, a number of Oracle's newer integrated development tools (JDeveloper, Enterprise Manager, etc.) are being ported to be 100% Java so that they will (and do) run on Linux.
$0.02 (CDN)
Aside from this, many of the main databases (including almost all the mission critical stuff) here are on HP systems. Despite HP's uncertain future (having ditched PA-RISC), I doubt they'll move from HP in the near future.
Now take this reluctance to move between mainstream Unix vendors and apply this to linux, the upstart on the block. Quite aside from the "free" nature of linux and perceived lack of accountability, there's a further issue. Even when sticking with mainstream vendors, there's a reluctance to mix vendors; i.e. there's a desire to use IBM software on an AIX box, simply to avoid the finger pointing that can ensue. IBM have even had ad campaigns based on this. There's a certain comfort factor in knowing that you can go to one vendor and say "fix this" which you don't get with linux on Intel. IBM, HP and Sun all make the hardware and OS; you don't get that with linux (with the potential exception of some IBM kit like the S/390).
To get over this, there need to be vendors willing to support the software and hardware side of a linux solution. Hopefully IBM will pave the way with things like S/390 and the zSeries server.
We hosted roughly 2 TB of mission critical db on 2 quad-processor Linux servers. They were running Oracle as their db. It worked great, and we had few problems.
We were also an AIX shop, but decided to go with Linux for this application because of the overall price of hardware and supporting applications.
IBM has started doing research to get DB2 running *equally* as well on Linux as it does on AIX. Of course, DB2 runs fine for most installs on Linux, but for a large RDBMS (of data warehouse size) the perception is that DB2 must run on AIX to perform well. IBM has been trying (and I'm sure it will take some time) to level the playing field. Perhaps IBM wants to phase out their own AIX and replace it with Linux?
I remember installing and configuring a 10 GB Oracle 8i database (not large by US standards!) on Redhat 6.2 for a health center, at a time when M$ was actively chasing license violators in India. It made perfect sense to go with Redhat, and the management was very pleased, in fact very happy to know that they wouldn't be paying anything for the OS. My last talk with them indicated they are still using it!
Our company, a custom e-solutions provider, uses Oracle 9i on Linux almost exclusively because of Oracle's reliability and the fact that we have the resources in-house to support it. There is a caveat to this, though.
At $5,250 for just a 2-year, single-processor Standard Edition license, 9i is not cheap, and most companies who already have an infrastructure built on it will not always realize a significant cost savings by moving to a Linux platform. 9i Enterprise Edition is a cool $45K per processor, so it is easy to see how the difference between $20K and $100K for an 8-way Intel versus an 8-way Sun machine may not always be the determining factor in a platform decision for a system with a 5+ year time horizon.
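To put rough numbers on that argument, here is a minimal sketch in Python using only the figures quoted above ($45K per processor for Enterprise Edition, $20K versus $100K for the 8-way hardware). It assumes one license per CPU on an 8-CPU box and ignores support contracts, staffing and everything else, so treat it as back-of-the-envelope arithmetic rather than a real TCO model.

```python
# Figures quoted in the comment above; the CPU count comes from the "8-way" machines.
LICENSE_PER_CPU = 45_000   # Oracle 9i Enterprise Edition, per processor
CPUS = 8
HARDWARE = {"8-way Intel": 20_000, "8-way Sun": 100_000}


def platform_cost(hardware_price, cpus=CPUS, license_per_cpu=LICENSE_PER_CPU):
    """Up-front cost: hardware plus per-processor database licenses."""
    return hardware_price + cpus * license_per_cpu


totals = {name: platform_cost(price) for name, price in HARDWARE.items()}
saving = totals["8-way Sun"] - totals["8-way Intel"]
saving_share = saving / totals["8-way Sun"]

for name, total in totals.items():
    print(f"{name}: ${total:,}")
print(f"Saved by choosing Intel hardware: ${saving:,} "
      f"({saving_share:.0%} of the Sun total)")
```

Under those assumptions the licenses dominate both totals, which is the point being made: the hardware difference is real but a modest slice of the bill.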
A while back I remember reading about one of the German banks (Kleinwort-Benson, I think) that had swapped its entire server infrastructure over to Linux. Other financial institutions are looking at it as an option.
The City of Bloomington, IN will be doing this. All of our servers are Linux, with the exception of one NT machine for a small Progress database, and several HP-UX machines for Oracle. We'll be migrating them to Linux in 2002.
irb(main):001:0>
I myself am in the data warehouse of a large international company; our DWH is run off IBM AS/400s with DB2 + Essbase/Hyperion.
there are several factors why there will be no change in this.
IBM offers complete integrated solutions (HW+SW) that you don't get with opensource solutions.
the opensource rdbms can't compete with the likes of DB2 and Oracle in terms of scalability and features.
3rd party integration. (Essbase/Hyperion) database cube solutions don't exist for linux/freebsd. (man 3d cube db's are funky)
stable cross platform ODBC drivers, (winnt drivers for ASP, JODBC java+websphere, AS400 + RS6000 drivers)
support. (who gives 24/7 support on postgres, and sends out tech support guys giving consultations who will come on site on a sunday at 4am?)
what OpenSource rdbms provides true multi-language support (we have records in cyrillic, japanese, american, german, etc)?
high availability (I don't know the current state of HA functionality in the linux kernel)
Linux on the AS400 is not seen as providing anywhere near the requirements at present, and its opensource database solutions are the same.
(and I don't even think there are any cube database products in the opensource area... ???)
no sig for you
In addition to the links above, most of the big database systems have active Linux ports. Any Oracle, Sybase, Informix or DB2, InterSystems, Poet, or Versant customer is a potential Linux customer.
It looks like the USGS has some use for linux as a backend - the NWISWeb is using RedHat and MySQL, according to their "About" page.
As to size of the database, the realtime sites are collecting measurements every 15-60 minutes, one or more parameters, 24/7. It all adds up after a while.
If you are running a VLDB on Oracle, you want a 64-bit system; otherwise the SGA is limited to 2GIG.
Oracle only supports Linux x86, with all of its 32-bit memory constraints. Does Linux implement memory windows like 32-bit HP-UX?
Also, at linux.sybase.com, you can download for free the Alpha-axp version of Sybase ASE 11.0.3.3 - this is probably the most available commercial 64-bit database for Linux.
Really, the Linux and WinNT versions of Oracle are at the low end of the food chain.
We have four linux machines using Oracle 9i RAC for our database. The boxes are penguin computing 200x Relions each with qlogic 2200 fibre channel cards and an Intel 10/100 dual nic card, which ties into our SAN'd up Clariion 4500 disk processor/array. The three nics (including the onboard) gives us a frontend/app network, backup network, and an oracle IPC interface.
We have had success using Redhat 7.1 (upgraded kernel to use LVM) and Suse 7.2 (comes w/LVM) for the linux distribution. Do not attempt RAC or OPS without an LVM of some sort. It can be done, but it shouldn't.
The biggest expense you will have is the disk array, and you should not skimp on this. Buy fast reliable maintained disk.
The Linux solution beats out Sun solutions in price hands down. You are talking $30,000 per box for the minimal Sun allowed hardware requirement for the Sun Cluster software with the Oracle Parallel DB runtime licenses (this has changed with v3 and so have the hw requirements). The Sun Cluster software requires an extensive review process by Sun which basically ensures your company has two extra of everything and can be onsite to help Sun with their software and hardware in 4 hours. If your company doesn't have its shit together, Sun and the few vendors that even know what Sun Cluster is aren't even going to bother talking to you about it.
This Linux solution beats out a Windows NT solution in reliability for the simple reason that NT's disk and volume management is clumsy. There is no easy way to create labeled raw devices on a Windows machine. The process as I remember it was creating unlabeled logical partitions for each disk space and then maintaining a file pointing to the value of the related registry key to map out the tablespaces. As soon as you added a partition, modified a partition, or even used another node to look at the partition table, you and the database were screwed (i.e. restore). This problem with managing shared disk may have been fixed in 2000.
The weakest point in the entire Oracle 9i RAC is the cluster software layer, whether you are using Sun's Cluster Software, the Oracle supplied cluster manager for Linux, or the hardware vendor supplied OSD layer for Windows. Be prepared to spend serious time in monitoring and getting it under control with appropriate patches.
Once you have fought your way through all of this you can reap the rewards that multiple nodes with shared data give you. The greatest benefit is the ability to partition your data and your application, which allows you more opportunities to scale. If your data does not partition by some logical means (date, timezone, city, planet, etc) forget about it. Just get a big honking database machine (especially you SAP/Peoplesoft poor SOBs).
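For anyone wondering what "partition by some logical means" looks like from the application side, here is a rough Python sketch. The node names, the sample rows and the `node_for` helper are all made up for illustration; this is not how RAC itself distributes work, just the general idea of steering each partition key (here, city) to one node so that nodes mostly touch their own slice of the data.

```python
import zlib
from collections import defaultdict

# Hypothetical cluster nodes, for illustration only.
NODES = ["node1", "node2", "node3", "node4"]


def node_for(key, nodes=NODES):
    """Map a partition key to one node, the same way on every run."""
    # crc32 is used only because it is deterministic across processes.
    return nodes[zlib.crc32(key.encode("utf-8")) % len(nodes)]


# Made-up (city, amount) rows standing in for application work.
orders = [("Boston", 120), ("Denver", 75), ("Boston", 30),
          ("Austin", 210), ("Denver", 15)]

work = defaultdict(list)
for city, amount in orders:
    work[node_for(city)].append((city, amount))

for node in sorted(work):
    print(node, work[node])
```

If no key like this exists in your data, every node ends up needing every block, which is exactly the case where the single big machine wins.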
No.
Energis Squared runs the technical side of Freeserve and other ISPs. Most of their core systems are Linux based, with some Solaris and *BSD boxes in there too.
I know for sure that a large telco in Germany (namely mobilcom) used to have 4+ million user accounts stored on a Linux box running MySQL.
Last I heard (about 2 years ago) they dropped MySQL in favor of Oracle but still had some Linux boxes running it...
Hmmm. True - but /. is hardly a Critical Mission.
My girlfriend works for Oracle. We have discussed Oracle and Linux, and yes, it does support HUGE RDBMS implementations, a couple of which she works with in her support role.
Power Corrupts,Absolute Power Corrupts Absolutely, leaving one person(group)in charge is absolutely corrupt.
If there is a candidate for Linux in mission critical roles, it would be SAP's linux version of R/3.
I remember Siemens being the initial rollout for R/3 on Linux.
I don't recall if the databases were on Linux.
Anybody out there familiar with this?
I'm sure there are plenty of large databases running on Linux and even MySQL. Solving the problem of large databases is relatively easy.
The much more difficult problems are availability (i.e. 7x24, runs for years with no interruption) and throughput.
When you combine these constraints to specify the problem of a large, highly available and highly active database that meets ACID test criteria, you have an enormously difficult problem. Until recently with the advent of Linux on mainframes Linux couldn't even dream of playing in this space simply because of the hardware it ran on. Sure, lots of people have Linux boxes that have uptimes for years, but some people have had to reboot because of a bad hard disk or other component. It doesn't happen very often, but it does happen. And the I/O bandwidth hasn't been there to support the kind of throughput needed at the high end.
Linux on mainframes doesn't really change this at all in the short term, even if you have a proven DBMS like Oracle (forget MySQL or Postgres), because the system as a whole hasn't proven itself. Question: How much money does an airline lose if its reservation system is down for a few hours, even if it happens once every several years? How much money does a financial institution lose by being unable to execute transactions for even an hour? Answer: enough to buy plenty of proprietary software. People who run these kinds of applications are willing to pay the price for systems with a track record of success in this demanding area. They are often willing to sacrifice certain kinds of sophistication to ensure the safety of their company's critical operations.
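The "a few hours once every several years" question can be turned into a quick expected-loss estimate. The Python sketch below is only that: the outage length, the interval and especially the loss per hour are hypothetical placeholders, not figures from any airline or bank, so plug in your own.

```python
HOURS_PER_YEAR = 365 * 24


def expected_annual_loss(outage_hours, years_between_outages, loss_per_hour):
    """Average yearly cost of an outage that recurs every few years."""
    return outage_hours * loss_per_hour / years_between_outages


def availability(outage_hours, years_between_outages):
    """Fraction of time the system is up, given one outage per interval."""
    return 1 - outage_hours / (years_between_outages * HOURS_PER_YEAR)


# Hypothetical inputs: a 3-hour outage once every 5 years,
# costing 1,000,000 (in any currency) per hour of downtime.
loss = expected_annual_loss(outage_hours=3, years_between_outages=5,
                            loss_per_hour=1_000_000)
uptime = availability(outage_hours=3, years_between_outages=5)

print(f"Expected loss per year: {loss:,.0f}")
print(f"Availability: {uptime:.4%}")
```

Even at an availability that looks excellent on paper, the yearly expected loss under these made-up inputs is in the same range as a stack of per-processor licenses, which is why these shops pay up.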
I think that once Linux is established on the kind of iron that is needed for these applications, it will take as much as a decade before people will trust it for these kinds of missions. Phrases like "mission critical" are bandied about so they have little meaning; Linux is ready to support many applications that are important to businesses today, but can't be entrusted with other ones yet.
Nobody with a working application of the type I describe here is going to migrate to Linux. Nobody starting such an application from scratch will give more than a moment's consideration to Linux. The most likely entree into this space will be evolution of an application from something that is reasonable to host on Linux on small to midrange computers. If the company doesn't have the resources or the time to migrate to something more reasonable, then Linux will begin to get its shot at proving itself.
Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
I keep a journal at www.livejournal.com. They distribute copies of their clients and their server under the terms of the GPL. They use MySQL in what I consider to be a very large environment. I don't have exact numbers, but it is a large (very large) site and keeping track of all of those journal entries is obviously very trying. I guess I should also share that they are having their fair share of problems keeping their hardware up to date to handle their load. Check them out!
replace something which obviously works well and is designed to do just this task with something that might work?
The operating system is free (beer) as it comes with the machine. The machine is designed for running large databases.
Of course there is a lot of overlap. Not-quite-so-large or not-quite-as-mission-critical systems can (and do) run Linux.
Why try to use the same tool for everything?
If you have to use Access, you can connect to it via PHP or Perl from Linux using ODBC Socket Server, located at http://odbc.sourceforge.net
ODBC Socket Server is an open source database access toolkit that exposes Windows ODBC data sources with an XML-based TCP/IP interface.
It has clients for PHP, Perl, C (in Windows, Mac, and Linux), and Java.
I don't know if it is large enough for you, but the WheresGeorge.com web site runs off of MySQL on Linux. It has a huge database of dollar bills.
:-)
If you don't think it's large enough, then open your wallet and add to it.
At http://www.wohl.com/middleware5-01.htm they mention a couple of real world examples (where the Wimbledon example might be considered as a high capacity showcase for IBM technology)
"At the Wimbledon Tennis Championships, Linux, dB2, and Netfinity servers make it possible to offer real-time information on scores to fans around the globe. Last year, over 914 million web hits occurred during the games, requesting scores and statistics."
"ERP Central is a portal for ERP consultants. They offer ERP news, job postings, and other information, but their big 'traffic builder' is a free time and expense tracking program which users can access to maintain their schedule information and submit it back to their offices from the site. Linux hosted and built on top of Websphere and dB2, the application can scale to handle over 100,000 users and organizations whose consultants use the software estimate that it saves them 75% in time savings, an average value of $500,000 per organization per year."
JK
It's only a matter of time before linux becomes the preferred server for RDBMS. After all, both of them are obsolete.
If you want to see the future of databasing, look at Cache on Win2k server, OpenVMS, or AIX. All but the most delusional kooks realize that hierarchical databases are superior to relational ones in speed, reliability, maintainability, scalability, etc.
We have a need for a new DB system
What systems are available?
Schedule meetings with the sales people from the various vendors, so that we can compare what's out there.
Boink! That's where Linux bounces up against the wall of established companies... except for a smattering of VARs, nobody is there to "attend the meeting" to tout Linux's praises to the big boss... except for the internal sysadmin and/or program managers, who then have to plug the stuff as a better alternative to the established vendors. So, IMHO, for corporate usage, it's not about what the OS can do, it's all in the selling of it.
Now if you'll pardon me, I have to go to a meeting where a big storage vendor will be showing us their wares. Really. ;)
I'd have a personalized plate on my car, but "toxic bachelor" won't fit into 7 letters.
One might consider a "large" database in terms of total disk used for the tables, indexes, and logs. Or it could be total concurrent users logged in to the database. Or it could be total simultaneous users - different than concurrent users since simultaneous users are those actually issuing a SQL statement.
A high number of simultaneous users will require more processor/CPU capacity. A high number of concurrent users (with a low number of simultaneous users) might not require much processor capacity but will likely require more memory capacity due to the number of concurrent connections (with each connection having some amount of its own memory).
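The concurrent-versus-simultaneous distinction can be sketched as a toy sizing model in Python. Everything numeric here (memory per connection, base memory, statements one CPU can handle at once) is an assumed placeholder rather than a measured figure for any real database; the point is only that memory tracks logged-in sessions while CPU tracks sessions actually running SQL.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    concurrent: int     # sessions logged in to the database
    simultaneous: int   # sessions executing a SQL statement at one instant


def memory_needed_mb(w, per_connection_mb=2.0, base_mb=512.0):
    """Memory grows with every open connection, idle or not (assumed sizes)."""
    return base_mb + w.concurrent * per_connection_mb


def cpus_needed(w, statements_per_cpu=25):
    """CPU grows only with statements in flight (assumed capacity per CPU)."""
    return max(1, -(-w.simultaneous // statements_per_cpu))  # ceiling division


many_idle = Workload(concurrent=5000, simultaneous=50)
few_busy = Workload(concurrent=200, simultaneous=200)

for name, w in [("many idle users", many_idle), ("few busy users", few_busy)]:
    print(f"{name}: {memory_needed_mb(w):,.0f} MB RAM, {cpus_needed(w)} CPUs")
```

With these placeholder numbers the mostly idle crowd needs far more memory but fewer CPUs than the small busy group, which is the tradeoff described above.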
>Hmmm. True - but /. is hardly a Critical Mission.
But it is continually /.'ed all of the time :).
-asb
you forgot your sarcasm tags
At the company I work for (which will remain unnamed because I am not in a position to speak on its behalf - but it is an old and large American company with a single character stock symbol) we use Oracle 8i on Compaq Proliants running Red Hat Linux - not only that but it's RH6.2 with all of the limitations of that line of kernels.
None of the databases are gigantic - 80GB is the largest, but we haven't had any problems at all. If anything, most of these databases used to be on Tru64 (Digital Unix before that) and we had a lot of problems (although they were probably hardware related). Also - users have reported that performance is better (not that it was a real issue before) but we've never bothered/attempted to document that.
I can't say that the main factor for the move was money (although it was a factor) - after all, if you can afford the Oracle licenses you probably should not be cheap with the hardware/OS but we've had a whole lot of RH Linux for other applications and it just made sense to consolidate.
People tend to make the distinction Unix/Linux.
You never see Solaris/Linux or AIX/Linux etc
even though the lines of original SysV or
BSD code inside them can probably be counted
on the fingers of one hand. So isn't it about
time that Linux was just accepted as another version
of Unix? Or is there some criteria it doesn't
yet meet?
Why are they using Sun's, HP's and IBM's own Unix rather than Linux for their mission critical apps? It's really simple: the software is designed around the hardware, and the hardware is designed around the software. I do a lot of work with Sun SPARC systems running Solaris, and I find that Solaris works really well with the SPARC architecture and vice versa. Linux, on the other hand, was designed for hardware that was itself designed to run DOS and Windows. Linux has done a good job of running on this platform, and does it better than Windows. But I still find that Solaris on SPARC/UltraSPARC systems runs very smoothly, and I have little to no trouble adding or upgrading hardware. I also find that Solaris is far more stable than Linux in special cases, such as its X server, which runs a lot more smoothly than XFree86 (I know XFree86 is not Linux). Every once in a while XFree86 will completely crash on me with no way of accessing Linux (including telnet); in a sense it locks up. I never had that problem with Solaris. The main reason is that there are thousands of different video drivers to use. But still, Solaris and other Unixes on their own platforms seem to take the brunt of the work very well. (Plus it helps that these systems generally use higher quality parts.)
If something is so important that you feel the need to post it on the internet... It probably isn't that important.
I am currently building a molecular biology database for sequence analysis. By biology standards it is quite small (dealing with about 100K entries for sequences and a handful of tables) and it isn't terribly complex.
However, I am adding results from similarity searches (it takes time to do these) and piling them into the database. I estimate approximately 3 million records within the next month.
So far it is up to about 400K records and seems to be behaving very nicely.
This is postgres 7.0.3 (stock version with SuSE 7.1).
I intend to migrate to 7.1.3 or later in due course because of the record size limitation in pre 7.1 versions. It remains to be seen how well it will cope with text searches of 3 million records (about 1.5secs with 250K entries in the table) or whether I have to do some keyword/prefix tree munging to speed things up.
If the whole thing falls over then I'll have to look at why and whether I need bigger hardware (currently on 2xPIII/833 1Gb RAM) or a bigger database management program but that would mean money...
At present I get my colleagues to prototype their databases with something like filemaker and then reimplement in postgresql.
..d
--- Four bases should be enough for any genetic code
Slashdot probably has millions of entries by now, if you think about it. Ten thousand entries is a pittance.
That isn't to say your db isn't 'mission critical' (since obviously you wouldn't be able to do much without it). But it certainly isn't big.
autopr0n is like, down and stuff.
Linux is being used solely for the weather.com site. It was down for a couple of days, but I don't know if that's because of my company's proxy (they use the other guys' software). It's running on a G5 S/390, I think. Here is the link
I will bend your mind with my spoon
AT&T perhaps?
"Are there any companies out there actively using Linux to host a mission-critical RDBMS? or looking to replace UNIX with Linux for this purpose?"
Where I work, we are running Linux as a firewall and, on a separate box, as our Domino mail/app server. But after that, we move to Unix for RDBMS, which we currently run as test platforms for our Documentum and iMan installs (also on Unix boxes). Now, we do have a business partnership with Sun, so we do get Unix licenses cheap (especially since we are developers), but besides that, we can get support from Sun on their Unix platforms. There is support for Linux platforms, but generally for more than we pay Sun.
Now if only we could get our managers to run Linux as our workstations instead of win2000, or at least dual boot >:)=
My 2 cents plus 2 more
A little while ago I heard that NCR had Teradata running on a 128 node test system running under Linux. They said that porting from MP-RAS to Linux was much easier than porting to NT (TNT) and it is much more reliable than NT. With NT they say that the systems have to be rebooted every month and they have had fundamental problems with the registry that MS won't allow them to release workarounds for. (They can't fix the registry because it is MS proprietary and they can't release workarounds because MS threatened to void support contracts if they did.) Plus with NT MS doesn't allow them to use hooks like they do under MP-RAS and can do with Linux (not sure if they do or not right now) so performance will ultimately suffer more under NT if it doesn't already.
At my company we are in the process of switching our Progress database, currently running on a SCO Unix box, to a new Compaq server that will be running Red Hat. This database is extremely mission critical to our company (i.e. it pays the bills). Progress is one of the best platforms I have ever used and is extremely stable on linux.
I'm *certain* there are companies out there using Linux to host "mission-critical" (whatever that means) RDBMSes. But this by itself would tell us nothing of Linux's suitability for this purpose. I happen to know lots of companies that use Linux for this purpose, but they also are companies that would not be able to afford the Sun boxes and Oracle licenses that they wish they could run. I also know several places running Linux for - what they would consider to be - "mission-critical" RDBMSes, but what they consider to be mission-critical is FAR different than what a big investment bank or hospital would consider to be mission critical.
Instead of just asking a question that is almost guaranteed to let us pat ourselves on the back, we need to also ask for descriptions of the conditions under which people are using Linux for RDBMSes. That is, before the answer to "are you using Linux" can be properly interpreted, we also need to know answers to questions like: How many connections? How many users? What size of a database? What kind of availability do you demand? What kind of information is being stored? How big is your staff? How big is your budget?
After all, knowing that a company uses Linux to host Postgres/MySQL tells us nothing if the company can't afford to buy a Sun box/Oracle license in the first place.
Why? Surely one of the point-and-click interfaces to PostgreSQL would make more sense?
Got time? Spend some of it coding or testing
We run Linux at one of our fabs here in Taiwan running a mission critical DB system called C-Tree. This is 24/7 stuff for those of you who don't know how Fabs work.
Objectstore. An object oriented database (see www.objectdesign.com ) that's known for its speed.
Who knows why we didn't say that.
We are a small money manager, but we are redeveloping our research system from a proprietary language on Solaris to Java and/or Python on Linux. We are using Informix 9.21 for our RDBMS (our other database stuff - client info, quarterly reports, etc - has already been ported). We run Red Hat 7.1 on IBM e-servers (Intel-based). Basically we are moving away from Sun's too expensive hardware and NT/Windows instability.
I think IBM is going to help make Linux a very viable platform.
Note that the UCITA and DMCA make it even more difficult - actually almost impossible - to sue your software vendor.
So WHY does everyone keep repeating this mantra that you can "at least sue your vendor" with proprietary software? YOU CAN'T. And how is a contract with a closed source vendor any more legitimate than a contract with an open source one?
The PICK database (aka D3) is a little known database that's been around for about 20 years, and was ported to Linux about 4 or 5 years ago. This DB is fully implemented on Linux, and I've talked w/ people that have 1000+ users running. The DB itself has several million user-licenses in the field, and a lot of them are running Linux. The Linux implementation supports multi gigabyte DBs and the user count is limited mostly by the power of the machine. I think this counts.
On their website... absolutely huge RDBMS
half my job is supporting our linux network (about 20 nodes, soon to be 30) running a Postgres server with "mission critical" data on it. Our Engineers and Ops guys use it to keep track of telemetry from our satellite. It's all part of an ancillary network that works better than the real support network of SGI boxes.
DB2 on linux has great performance.
Linux on rs/6000, as/400 (iSeries), and system/390 (zSeries) is awesome.
...
Now, buy that nice hardware (better than plain ole Intel boxes) and either run Linux on em with DB2, or AIX, os/400, or z/OS
A medium-size company (200-300 employees) in Italy (where I live) is migrating their data to a SuSE Linux + DB2 solution.
They will use a couple of IBM servers (dual or quad x86) in a failover configuration.
You also have to consider that in Italy we have a high number of medium-sized companies instead of a (relatively) low number of big companies;
so the size of the company in question is quite big by our standards.
In TUX we trust
We're moving our Oracle 8.1.7.2 instances running on Solaris 7 to Oracle 9i running on Linux. Our biggest problem so far is vendor related, as our ERP (PeopleSoft) climbed into bed with Microsoft some years ago and basically has just ignored the Linux market for an apps port :(
Anyway, we're shopping replacements for our 3500's and we've found that bang for the buck, Linux for Databases is the way to go. Most of these servers are one-task anyway, and Oracle runs like a champ so far. There are some issues with Glibc that require some manipulation of libraries to get around if you want to use any other dist. than SUSE tho, which sux. That said, we're testing with mandrake 8.1 and it runs fine (post patch).
Imagination is the silver lining of Intelligence.
I work as a sys admin and programmer for the IT department of a university. Basically we have everything running on linux here with very few problems. In fact we are bringing our once useless mac hardware back to life with "yellowdog linux". One linux box runs the main website. The other runs our database systems. Some of these systems are pretty large, although not HUGE scale. Our log files certainly get immense by the end of the month:)
A quick search on Monster.com for +mysql +linux in Info Tech reveals several companies. Whether or not any of these are exclusively linux shops is highly debatable. Browsing through the descriptions some seemed to be using a mixture of different things.
Silicomm Corporation Alabanza Corporation VIP e-Commerce.com, LLC IU Bitnet Bulkregister.com Bizland.com Homestore.com ING Bank, fsb Blackboard, Inc. Express Logistics bluebox communications Pyxis Corporation CareScience, Inc. Ticketmaster Express Technologies Inc.
I run avidgamers.com, a community hosting service currently hosting around 7000 communities. We have 1.2 million records getting an average of 20 queries per second, ranging from single-record results to large summarizing queries. (With a fairly large part leaning towards the latter, tallying the number of replies to each thread in message boards.)
Running MySQL 3.23.40 on a 1.4GHz Athlon with 1GB of RAM and an 18GB 15krpm SCSI drive, the system is doing ok, but it's starting to feel the load peaks. I'll be upgrading to RAID fairly soon, which should help things.
All in all, I'm very happy with MySQL, but I'm strongly considering a move to Postgres, because the lack of row-level locking is starting to become a problem. Stability has been no problem... no crashes, no data corruption, nothing.
I'm sure this is in no way one of the largest installations of free software databases, but I thought I'd post my experiences anyway.
-- If no truths are spoken then no lies can hide --
Would we have had this if the software package was from Sun? Well, Sun might have blamed IBM, IBM might have blamed Sun and we'd be left with something which doesn't work. We've been lucky in that IBM want this to work to secure future business, and that is the carrot you can use to 'bribe' vendors to fix bugs.
While open source allows you to track down the bugs and fix them yourself, it relies on you hiring programmers and/or smart admins. Many companies don't want to do that, particularly when you can get the people who wrote the code to fix it (whether you can get them to fix it or not is a different matter; managers' perception is that you can and that's what affects buying decisions).
As for suing, it depends on the terms of the contract. A large enough business should be able to negotiate special terms with vendors to secure business (don't play ball with us, you don't get our money). If a company wants to be bullish enough, it can negotiate terms that do allow it to sue the company, even with UCITA and DMCA. Unless I'm mistaken, those acts mean that vendors are allowed to put horrible restrictions on sale of software etc. They don't say that individual purchasers can't negotiate a better deal.
One final point. I'm not saying this to say "linux is doomed, it's never going to make it". I have great hopes for linux (in my last job, I made a lot of use of open source software to good effect), but there are still a few things to be ironed out before big companies are going to adopt it in a large scale. Half of what I'm doing here is playing devil's advocate because I like a good argument (NB: argument != flame-fest!).
Linux can run an RDBMS just fine, it's all the other stuff that is lagging. Manufacturers of fiber storage and other high end products tend to focus on Solaris more than Linux. A large RDBMS includes a lot of other important details that need constant management and attention. Building a PC box with redundant power supplies, fans, backup CPUs and motherboards gets you close to Solaris prices, so enterprise projects tend to choose Solaris or mainframes.
I work for a large manufacturing company. We run Oracle on OpenVMS. With all the nonsense going on with the Alpha since Compaq took over, we've started looking at Linux as an alternative. Hopefully we don't go with XP.
I am a large RDBMS and I use uxLni and DBS*. I do not run sniWdwo because it is too ievxensep.
about.com uses MySQL and postgres under Linux and FreeBSD for much of their mission-critical data. They also run Oracle on Sun, but may move that to OSS.
-Turkey
For what it's worth, I've done this sort of thing a few times, and I currently have two sort of cookie-cutter solutions in mind:
1. A good quality server-class PC with some hardware RAID storage running FreeBSD and PostgreSQL, usually along with Apache, mod_ssl and mod_php4 to build web-based applications. Often this will go along with a copy of Cyrus IMAPd and SquirrelMail for handling e-mail (one of these days I am going to write an IMAP<->SQL bridge in Perl as a replacement for Cyrus).
2. I have yet to personally work on something that exceeds the capabilities of this configuration, but if it happens, I would probably switch to using Sun server class machines running Oracle. Presumably at that time the application would have a big enough budget to spend that sort of money.
MySQL does not pass the ACID test. Lots of people don't think that's important, but if you have a 10 step transaction and step 9 fails, it's a lot easier to simply say "ROLLBACK WORK;" than it is to undo steps 1 through 8. Never mind the fact that without transactional atomicity you have a potential race condition.
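To make the "step 9 fails" point concrete, here is a minimal sketch of what a transactional engine buys you. It uses Python's built-in sqlite3 module purely as a stand-in (not MySQL or PostgreSQL); the `ledger` table and the `run_steps` and `demo` helpers are made up for illustration.

```python
import sqlite3


def run_steps(conn, steps):
    """Run all steps in one transaction; any failure undoes every step."""
    try:
        # Used as a context manager, the connection commits on success
        # and rolls back if an exception escapes the block.
        with conn:
            for sql, params in steps:
                conn.execute(sql, params)
    except sqlite3.Error:
        return False
    return True


def demo():
    conn = sqlite3.connect(":memory:")
    # Hypothetical table, for illustration only.
    conn.execute(
        "CREATE TABLE ledger (step INTEGER PRIMARY KEY, note TEXT NOT NULL)"
    )
    conn.commit()

    insert = "INSERT INTO ledger (step, note) VALUES (?, ?)"
    steps = [(insert, (n, f"step {n}")) for n in range(1, 9)]
    steps.append((insert, (9, None)))        # step 9 violates NOT NULL
    steps.append((insert, (10, "step 10")))  # never reached

    ok = run_steps(conn, steps)
    rows = conn.execute("SELECT COUNT(*) FROM ledger").fetchone()[0]
    conn.close()
    return ok, rows


if __name__ == "__main__":
    print(demo())
```

Steps 1 through 8 succeed individually, but because step 9 fails inside the same transaction, none of them is left behind. Without transactions, the application would have to delete those eight rows itself.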
We are using Oracle on Linux for a critical
2TB database. It works really well. Much less
expensive than Solaris at the same performance.
I want to give an anecdote of client leverage that sort of relates. This is a third hand story, but knowing the person who told it to me I suspect it's true.
A friend of mine was consulting many years ago with a large financial firm, helping them to maintain their Netware 3.x servers (as you can see, it was several years ago). They had a tape backup system in house from one of the really large vendors that was not working.
They went for like a month where they could not get good reliable tape backups on the servers, and playing phone tag with the vendor trying to figure out the problem. Just wasn't working.
Anyway towards the end of the month, my friend griped to the CIO about the problems they were having and his frustration with dealing with the vendor. The CIO brought up the issue at the board meeting and how it was a risk to the company.
At this point the VP of trading piped up... "You know, we own several million shares of that company in our portfolio... let me see what I can do"
VP of trading calls up the President of the vendor company, tells him that if they don't fix the problem with the tape backup software he's going to issue a warning about the company's product quality and dump every single share of their stock on the market.
The next morning a team of developers were flown in and working on the problems. They had to recompile several modules, but they had the issues resolved within two days.
I guess the point is, there are many ways you can leverage a vendor. It doesn't have to be a lawsuit.
As larien said, usually you just threaten to not pay the contract, or not renew. Or add stipulations as part of the negotiation. I've been involved in many an instance where that has played a huge part in getting better support.
Once I had some issues with a GIS package we had purchased. I tried to work with support, and they ignored me. So when the $5k yearly maintenance agreement came up, I told my boss not to pay it because it didn't gain us anything. I also posted a note to a usenet group explaining my problem.
Next day I got a phone call from the development manager.
Financial incentives are the strongest leverage you can have with a software vendor. Like it's been pointed out... that doesn't work with Open Source in quite the same way.
http://www.mysql.com/articles/us/avacom.html
A large database (in this context) is an enterprise-sized system: multiple platforms serving many millions of records in short periods of time.
I have customers fielding databases on multiple Enterprise 10000 servers...single tables of more than 35 million rows. This is actually a "medium" system in my mind.
I love Linux, I hacked around the pre 1.0 kernels many years ago. BUT, it does not scale up too well. Even the little things in Linux make it hard to do a good (maintainable) job: shifting device names (pull one of your HDs and see what happens), inability to modify hardware subsystems (storage in particular) while running live, etc. Even EMC, NetApp and XIOtech hardware can't fix these issues.
If the Linux crowd wants to be accepted by Big Business, they must learn the needs of Big Business.
Running a few 4-proc Intel servers with Oracle or Sybase does not put you in the same league. Nor does storing 10,000 articles in MySQL.
If you can imagine doing it yourself, if you can even imagine the amount of data to store, then you are almost surely below the thresholds I need to work in every day.
You might wanna look at www.livejournal.com They use mysql and linux as a backend and support a massive amount of clients (200k+ I think).
Of course there are also SourceForge, Freshmeat, etc. Lots of places use Linux as the backend.
Would we have had this if the software package was from Sun? Well, Sun might have blamed IBM, IBM might have blamed Sun and we'd be left with something which doesn't work. We've been lucky in that IBM want this to work to secure future business, and that is the carrot you can use to 'bribe' vendors to fix bugs.
Yeah, this is basically why Sun and Oracle have a
special support thing - basically Sun will support
both its stuff and Oracle's (and Veritas too if
you're using that) with just one number to call
for all of it. "One throat to choke" as Scott
McNealy calls it.
However, I guess DB2 on Sun hardware is too small
to do the same thing... (they'll push you to
migrate to Oracle instead I guess)
The not so recently open-sourced SAPDB seems to
be an RDBMS on the same level as Oracle and
DB2. It is the engine of the SAP ERP solutions;
those systems tend to be quite large and really
complex. So I would presume the RDBMS that runs
them is really robust.
Is there anybody who is building large solutions
based on SAPDB? (I've heard SAPDB is already
included in the SuSE distribution)
Actually, there's only 24 single letter stock tickers. I and M are both open.
Many a political game is played when someone attempts to bring in new systems. Many times when trying to convince an account to move to a different hardware platform I've heard 'if this fails, I'll lose face' or a similar comment. Many are scared to do something different because no one has gotten fired for running Windows.
As a rock-in-roll Physicist once said, No matter where you go, there you are.
Informix was, I think, the first major DBMS vendor to offer a Linux port, circa 1998 (with the possible exception of SolidTech). Some time later, however, they ceased to support Linux, in which time all the other major players got in the Linux market, then Informix decided to try to recover its market share, but by that point it was too late...
Bush Lies Watch
Linux 2.2 cannot drive the hardware that a critical DBMS would have. Linux 2.4 has not been stable. My company has many Oracle databases, only one of which runs on Linux. This is the smallest of the databases - only a couple hundred queries a day.
As 2.4 gets the bugs out, the possibility of change grows.
Why, oh why doesn't the moderator god give me moderator points when a really good post like this comes up?
Too bad the guy posted as an AC...
All about me
We ran extensive comparisons for a Data Warehousing project using Sun HW/Solaris/Oracle versus Penguin Computing/RedHat/Oracle and while the Sun solution was slightly faster in our tests, it was only marginally faster, yet cost significantly more. No way could we justify the additional expense based on our results. And we haven't looked back. Our Oracle servers haven't failed us in nearly two years, and they just keep getting better. And today's options for Linux hardware are much better than 2 years ago. We even discovered a problem with a particular Sun server during our testing that Sun asked us to keep quiet about. We took that to mean they'd sue us if we discussed it. Didn't take long to realize that this was not a company we wanted to do any business with. Sun sucks.
We are a retail software company. All our clients, their stores and offices use RedHat with Informix Online as their databases. We could not have asked for a better solution for stable and sturdy performance. And now with IBM buying Informix, there is nothing to worry about regarding the future of Informix the product either.
Linux is great on 32 bit hardware with 2 or less processors. Larger than that and the advantage goes to the commercial guys like AIX, Solaris, and HP-UX. These Operating systems were built to scale well, and have been around for a long long long time. Have you tried to run Linux on a 32-way with 256GB of ram? How about a 72 way with 500GB? Many people running a 64 bit Linux kernel?
Linux actually has NEGATIVE scaling on all benchmarks above 2-8 processors depending on the test. It actually runs faster on a 2-way than a 16-way. Linux has kernel locks the size of the Pacific Ocean. It is not preemptable. The scheduler has 1 run queue, which is a linked list. It is just not built to run on the high end hardware that you need for a large RDBMS.
Now ask this again in 2 years and you may get a different answer. I mean Linux does now have a good VMM (finally) and many good filesystems. It is slowly moving up into the enterprise market. Two years ago nobody would have run Oracle on it, now they will run Oracle on it on the low end. That's nothing to laugh at. I have seen development efforts addressing ALL of the issues that are keeping Linux out of the high end enterprise market. If half of these make it into the mainstream kernel Linux will move up to the mid-range.
Anyone who cannot cope with mathematics is not fully human.
Weather.com is using Linux quite successfully to host its Oracle backend. They have replaced 250K Sun machines with 50K Intel based systems doing the same work.
There are many companies out there that use Linux for large data warehousing. The company I currently work for has one of the largest multi-value ERP/MRP databases in the world, which is run on AIX, but I have dealt with many other companies that use the same multi-value system on a Linux platform and it is much more cost efficient. I think some Linux/UNIX users would find this interesting, as not much has been done in the multi-value database world and the concept behind multi-value databases, their performance and total cost of ownership is amazing. Check out http://www.rainingdata.com for more info.
Ummm... Perhaps not. $40000 per cpu is not a lot compared to the cost of each CPU. A high end Solaris box will start at around $450,000 for a couple of CPU's and run up into the millions. For that kind of money (plus the cost of Oracle), you can buy a cluster of VERY powerful x86 servers and run IBM's DB2 EEE for Linux (which has no such limitations as the Oracle/Linux port) in a clustered configuration and blow the Sun box out of the water in price/performance.
What legitimate member of the /. community uses a phrase like "mission critical transactional applications and data mining?" This posting reads like a really, really bad press release.
I work on Unix machines running Oracle and on OS/390 machines running DB2. Based on my work in that and all my tinkering on linux over the years I think Linux is now able to handle mission critical on the right hardware. All the tools for big bizz mission critical stuff became available in Linux recently.
But, and this is a big but, it has to be set up by the right person. I have seen Unix and MVS systems set up for mission critical situations and then hose up. We lost a lot of money while the systems were down. The higher ups would blame the people (as they should have) because the systems work in other situations just fine so it must be the people.
Based on perceptions, if it were Linux set up by the wrong guy and things went belly up they would blame Linux because it's untested. It would end up the scapegoat instead of the lazy implementation group. That's what Linux has to overcome.
I remember a quote I think was from the Red Baron, "It's not the crate, but the man in it that counts".
"And now with IBM buying Informix, there is nothing to worry about the future of Informix the product either. "
Unless IBM shoves DB2 down your throat next time you talk about licencing with them.
Well, at the place I work (a technical college), we use Linux exclusively for our backbone; we actually replaced our SCO UNIX with Red Hat.
Since the school only has an 8Mbit + 2Mbit connection to the outside world, we rely heavily on serving pages from the internet faster than our real line can handle, and we do that with 2 transparent proxy servers, also running Linux. It works great and our students don't feel the strain of our small real connection.
We run a large auditing system (OLAP-oriented rather than OLTP-oriented) on PostgreSQL (v7.1.3) on Linux (RH 7.1), using Tomcat (v4.0.1) as the front-end. We're running it on a Dell PowerEdge 2400 (2x PIII-866) with their Perc RAID controller with a Raid 1 and a raid 0+1 volume.
Our database is currently a bit over 8 GB, with many of the tables exceeding one million records. Queries typically join > 5 tables.
We moved from an MS Access/SQL Server environment and are much happier with the functionality, performance, and stability we now have.
Not to slam DB2, as I think it's a great product and have successfully used it for some really big projects, but for this application I found that PostgreSQL delivered ~4x the performance on many of our key queries. The lower cost and lower administrative overhead sealed the deal in favor of PostgreSQL.
As always, though, your mileage may vary.
Gordon.
He that breaks a thing to find out what it is has left the path of wisdom.
-- J.R.R. Tolkien
I work for a company where we tried running a large RDBMS (DB2) on linux. It failed HORRIBLY.
We're back to AIX now, and everything runs smoothly, and we get decent support from IBM.
"Nothing runs on Linux like DB2". Hah, so true...
Sure, there are some companies that run some databases in the category you describe.
But there are more companies running more databases that aren't.
Those are the prime candidates for Linux RDBMSs. They're often still large databases, and usually important. But Linux handles them just fine in my experience.
Are there things that Solaris or AIX does that Linux doesn't? Of course. Are some databases better implemented on AIX/DB2? Of course. But that doesn't mean that Linux shouldn't be used for hosting RDBMSs (even large ones), or that everything requires AIX/DB2.
A quad-processor Intel box is nothing to sneeze at anymore. Neither is Linux.
It is my experience that, while Linux is not ready for the very largest most mission-critical databases, it is ready for large, important databases. We use it here for 8+GB databases (PostgreSQL v7.1.3), and are very happy with the performance, reliability, and functionality we have.
The bottom line is that Linux and Linux-based RDBMSs are constantly improving, raising the top-end ever higher. Use your judgement (and test-test-test!) when making your decisions. But don't brush off Linux because we can't (yet) run the 100-TB super-database. Most people won't be doing that anyway.
Gordon
I know we're all 'rah, rah Linux!' around here, but the question being asked is pretty unbalanced. I don't know firsthand of any large RDBMS Linux implementations, but that's not saying much.
I do know there are a *lot* of large-scale BSD RDBMS systems out there.
It seems a little skewed to put Linux against 'commercial OSes' when BSD isn't a commercial OS, and is arguably better suited to the tasks at hand than Linux.
Use a hammer for a nail, and a screwdriver for screws.
Kevin Fox
On the contrary, I think the post should have been marked +1, Funny. The idea of someone using Access for a large, mission critical RDBMS is hilariously absurd. :)
Libertarianism is rich wolves and poor sheep playing gambler's ruin for dinner.
How the heck is this "Offtopic"?
In fact it is strange to see this question accepted. Many months ago IBM's DB2 advertisement
ran in many, many IT magazines, and sometimes
even Windows zines talk about Oracle and Linux...
I think that pointless questions and topics are
posted every day MORE here on /. ... and it
is quite sad :(
Did everyone forget that one? The NASA Acquisition Internet Service (NAIS) switched from Oracle to MySQL about 1 year ago. The MySQL announcement can be read here. MySQL's news page also has highlights of many companies who have made the switch.
"BSD is about people pissing each other.." (Moid Vallat)
Yahoo!'s databases are all Oracle running on FreeBSD. If that isn't a testament to which OS is superior, I don't know what is.
My company is a midsized (1000 employee) manufacturing facility and we are currently implementing Oracle 11i on Linux. We are currently live on financials only, but MRP is under development, with CRM added later.
So far Linux has proven to be quite adept at handling the load. We have scalability concerns as we approach the later phases (when we approach 250+ users), but we're hoping Linux will continue to grow as we continue to roll out the new features.
Admittedly using Linux has caused us to use a different approach from the traditional "one big box" unix approach. We purchase multiple servers for different tiers and so far we think we can scale easily to support our load. Of course only time will tell.
One thing is for certain, stability has not been an issue. We brought all seven machines online in early March to begin production, and all seven provided 24x7 operation without one minute of downtime during the entire development cycle through go live. We recently rebooted each system to apply security updates to the kernel, that's it.
My management has been very impressed, our consultants say they've never seen a system respond so quickly, and we saved a bundle over a Solaris solution (which was our next choice).
Later,
Tom
We're currently running Oracle 8i under Windows NT on a couple of DEC Alphaservers (4100's with quad processors).
With MS's abysmal support for NT on the Alpha these days, we've considered moving the Oracle database to another OS. I don't think we want to trash the DEC Alphaservers yet though - since they're still respectable machines. Linux for Alpha is definitely an interesting option for us - but I'm wondering if anyone has had experiences with Oracle for Linux on the DEC Alpha? How does it compare, performance-wise, to running Oracle on the Alpha version of NT?
Last time I checked, Oracle wasn't really giving a high level of support to Oracle for Linux unless you used it on Intel hardware?
What you say is mostly true, but mostly beside the point. First, leave Microsoft out of it. The most respected platform for midrange RDBMSs is Oracle on Sun. The question is, could Linux replace Sun in that setting? Sun does provide meaningful tech support, at least to large sites. Comparing consumer-level tech support ("reboot the computer") to enterprise level support is pointless. Yes, the web beats consumer level support.
The web will answer any common question about common software. But if you're doing something even a little off the beaten path, you can run into problems that haven't been solved yet. This can be very dismaying when you're under time pressure. Commercial Unix vendors will actually investigate the problem and issue a patch if necessary.
Basically it repeats the assertion that large corporations are not looking at Linux as an alternative in the enterprise.
I'd have a personalized plate on my car, but "toxic bachelor" won't fit into 7 letters.
Any war stories?
I have 5 million rows in my headline table at www.syndic8.com . Retrievals on the table
are very fast. This is on a 1.2 Ghz Athlon running MySQL.
The biggest issue is that I am perilously close to ext2's 2 GB limit on file sizes. I will fix this with some reorganization and some data compression.
although they aren't a big company by World standards, they are one of the bigger IT Consultancy firms in New Zealand.
Theta specialise in Oracle Systems development and integration and use Oracle 9i RDBMS with Oracle Portal running on SUSE Linux for their extranet. They have some clients running Oracle on Linux as well.
They also do cool stuff with Oracle 9i Lite running applications on iPaq's.
While VA worked with Oracle to get everything going and supported under RedHat, AND they migrated completely to Oracle on Linux, they are not a good example as they are currently migrating their entire operation to become a Windows only operation. While the remaining Linux folks are fighting hard, they are losing the battle. I would expect to see VA drop all references to Linux in its name in the near future as well. So sad...
I'm working with Penguin Computing on a quote to get our cellular data moved off of an IBM AS400 over to a Linux solution in first quarter of next year. I know that we have the budget money for it and my boss (The guy who actually approves such projects) is behind it 100%. Who says that Linux can't be useful to large corporations?
DB2 on Sun hardware
I think IBM supports that.
Multiple vendors is feasible if management is technologically cognizant. With a PHB or two easily intimidated by buzz-words, better to stick with one vendor.
We've just installed an IBM server and there's a problem with a couple of bits in it (I won't give details as I don't know if I'll get into trouble for it...).
I don't have security clearance, but I'm guessing these secret bits are 0 and 1.
While not specifically databases on Linux, another of his companies is "ThinkNic", manufacturer of $200 Linux-based internet terminal for home users.
We use SuSE to host our Oracle 8i databases. It works great. Much better than Solaris, in my opinion. We are also looking into running the 9i application server. Not many people know this, but Oracle 9iAS is actually just Orion that has been repackaged by Oracle.
Have you looked at also running a MySQL slave and running the large summarizing queries on that slave?
A heavy combination of fast writers and slow readers is not nice on MySQL.
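A rough sketch of that idea, assuming the application already has one connection to the master and one to a replicated slave: send plain SELECTs to the slave and everything else to the master. The server names and the `pick_server` helper are hypothetical, and a real setup would also have to cope with replication lag for reads that must see the latest writes.

```python
def pick_server(sql, primary="primary", replica="replica"):
    """Return the (hypothetical) server name a statement should go to."""
    words = sql.split()
    # Only plain SELECTs go to the read-only slave; writes and anything
    # unrecognized stay on the master.
    if words and words[0].upper() == "SELECT":
        return replica
    return primary


if __name__ == "__main__":
    print(pick_server("SELECT thread_id, COUNT(*) FROM replies GROUP BY thread_id"))
    print(pick_server("INSERT INTO replies (thread_id, body) VALUES (1, 'hi')"))
```

That keeps the slow summarizing reads from holding table locks that the fast writers on the master are waiting for.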
One of the PostgreSQL developers was telling me about a database he once designed. The details are a bit hazy and second-hand, but I believe it was originally using Ingres, which was what piqued his interest in PgSQL.
Anyways, the system basically handled a few gigs a day or so of data from GPS satellites and such. It basically crunched numbers and stored results in an effort to figure out how much the earth's tectonic plates were moving from day to day. I would imagine that this system handled many, many rows and transactions daily. I'm pretty sure they moved away from Ingres to PgSQL, which they're probably still using now.
It's not exactly a commercial application, but it is an RDBMS that handles a lot of data, and apparently worked quite well.
J
There are a couple of ways of extending the 2GB limit in MySQL.
RAID_TYPE=STRIPED where the data is stored in several files.
UNION of several identically structured tables. Good for reading from one long monster table while only writing to one of the underlying tables. UNION also works if some of the tables are compressed.
From the MySQL documentation:
The RAID_TYPE option will help you to break the 2G/4G limit for the MyISAM data file (not the index file) on operating systems that don't support big files. You can get also more speed from the I/O bottleneck by putting RAID directories on different physical disks. RAID_TYPE will work on any OS, as long as you have configured MySQL with --with-raid. For now the only allowed RAID_TYPE is STRIPED (1 and RAID0 are aliases for this).
If you specify RAID_TYPE=STRIPED for a MyISAM table, MyISAM will create RAID_CHUNKS subdirectories named 00, 01, 02 in the database directory. In each of these directories MyISAM will create a table_name.MYD. When writing data to the data file, the RAID handler will map the first RAID_CHUNKSIZE *1024 bytes to the first file, the next RAID_CHUNKSIZE *1024 bytes to the next file and so on.
UNION is used when you want to use a collection of identical tables as one. This only works with MERGE tables. See section 7.2 MERGE Tables.
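The striping described in the documentation excerpt can be sketched as plain arithmetic. This is only an illustration, not MySQL's actual code: the `locate` helper is made up, it assumes the chunks are filled round-robin (which "and so on" suggests for a striped layout), and it assumes the subdirectory names are two-digit decimal numbers as in the 00, 01, 02 example.

```python
def locate(offset, raid_chunks, raid_chunksize_kb):
    """Map a byte offset in the logical data file to (subdirectory, offset
    inside that subdirectory's table_name.MYD).

    Assumes round-robin striping over raid_chunks files, with stripes of
    raid_chunksize_kb * 1024 bytes each.
    """
    stripe = raid_chunksize_kb * 1024
    block = offset // stripe          # which stripe the byte falls in
    chunk = block % raid_chunks       # which file holds that stripe
    # Each file holds every raid_chunks-th stripe, packed back to back.
    local = (block // raid_chunks) * stripe + offset % stripe
    return f"{chunk:02d}", local


if __name__ == "__main__":
    # With 3 chunks of 4 KB stripes, byte 5000 lands in the second file.
    print(locate(5000, 3, 4))
```

Because each physical file only ever holds about 1/RAID_CHUNKS of the data, the logical table can grow past the per-file limit of the filesystem.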
When I was at school (British Columbia Institute of Technology), one of the profs set up a box for SAP/R3. He was using a unix of some sort... could have been linux... can't remember.
Quad PIII-Xeon, 1GB RAM, 100GB HDD space on SCSI.
That was to run the server side of it. For the classes we used the windows client.
-----
Side note: The city of Vancouver went to SAP/R3...not sure about their hardware though.
They are huge and use Linux. Google must have some TB? Slashdot is at least huge in number of hits/users
So what makes a Linux backend system any better than commercial Unixes like Solaris, AIX and HPUX to host their RDBMSes? What makes a Linux backend system any better than a BSD system? (After all, it is free too.)
Not a damn thing... other than that the company can say they are fully buzzword compliant. Linux users are usually confused: they adopt Linux because it's popular "and cool" and hide their ignorance behind completely subjective reasons like the license, or some technical merit that they heard about but don't really understand. Companies tend to care less about what's popular and cool and tend to stay with what works for them. That's all that matters.
I want to say this once more: a company running Linux isn't any better than a company running a commercial Unix. (I don't care if the guy delivering my pizza drives a BMW or a Pinto, I just want my pizza.) I know the common response of how linux saves money... how linux is this and that... btw, last time I checked, most free Unixes are hobbyists' labors of love and not really meant for mission-critical applications (god, just think what NASA is running on).
I know I will get modded down for this post because I don't salute the flag of Tux... that's fine, but for those who do actually read this (and use linux), think about why you're a linux user... first it might have been because you heard about this free UNIX-like OS, then it might have been that it's not Microsoft, then it might have been to one-up your friends.
(end of my off-the-wall rant)
Yes... and they have a Co-operative Technical Support Agreement too (Sun VIP). As they do with many hardware and software vendors.
Wanted: One witty yet thought provoking
I work for a largish hospital system. All of our payroll data, most of our patient census data is stored on Oracle DBs that are hosted from Linux boxen. It saves us tons of money not using commercial unixes which in turn allows us to provide better patient care.
I am the lead DBA for a company that processes 15-20 million US dollars worth of transactions per day. My backend database is Solaris/Oracle, it does 3000-4000 SQL statements per second, and my company would lose maybe $1000 in revenue for each minute it is down. The larger two tables in this database have in excess of 300 million rows, and are accessed by 100k customers per day. We have over 11 million customers.
It's running on an E4500, which is saving us a lot of money by *not* buying E10000s. I like to think it's tuned well, but a big part of the reason it works (fast) is also that it is on an EMC with over 90 disk drives in it. It's all about IO bandwidth and serviceability in my world, and on those points you are correct in saying Sun is a hands-down winner over Linux.
.
Now, I work with a sysadmin who is a whiz at making lots of linux boxes work reliably as a web frontend, and is also good at keeping our backend solaris-based database up 24/7. Neither of us is anxious to put the backend on linux, but we did put a significantly large, high performance, but *relatively* low availability database up on linux.
It's a 6x800MHz Intel box with 4GB RAM and 16 disks on Mylex caching RAID 5 controllers. RAID 5 sucks in general, but the point of this system was to get a lot of bang for the buck, so as a big league DBA, I took the challenge of making data loads fast in spite of RAID 5, in order to get a crack at de-installing Windows from this box. If I spent some bucks on more disks, we could get a much faster system, but then that was never the point of this system.
The system is about 200G worth of partitioned tables (copies of the same 300M row tables mentioned above) with partitioned rollup tables off the sides, for business analysis. The real trick is the partitioning. Because of the partitioning, this system is able to do many types of analyses that can't be done on our other analysis system, which happens to be Solaris with 60 disk drives.
The linux box was a leftover from a failed Windows project, so in some sense it was free, but I believe it woulda cost about $80k new. Gig ethernet and controller were about 10 or 15k of it.
It's working well for DSS, since the 2 times it's crashed in the last few months didn't really hurt anything.
I'm rambling on now, but I'll talk to the DBAs out there, who speak my language.
If you're gonna do Linux oracle:
- reiserfs sucked performance-wise on top of RAID 5. Don't know if I did something wrong, but I abandoned it in favor of ext2. I don't care if fsck takes a long time on this system, and ext2 creamed it for database IO perf on RAID 5. I also couldn't get perf out of reiser on simple stripes without the added hurt of RAID 5, so go figure. fsck times are irrelevant if you use raw partitions, so this is the way to go in most cases.
- Max out the memory (of course) on an Intel box. I think the most you can do is 4G on Intel platforms. This is sufficient for me, but I kept the SGA down to about 500m, so I could have 10 way parallel processes with 200-200M of sort area size.
- Watch out for linux caching. I've turned it off for my filesystems. It's easy to get into "writeback debt" by pushing a lot of dirty blocks out of the Oracle cache into the ext2fs cache. Add RAID 5 suckiness at random writeback, and you've got serious constipation problems on your hands.
- I've used some raw partitions for this system. They seem to be worth it to avoid ext2fs caching hassles, but I haven't migrated completely yet. The "raw" command must be used to "bind" a name to a disk partition before it can be used by Oracle as a raw partition, so it makes for a few extra hassles, but no big deal.
- I got a Mylex caching controller, which apparently has hot swapping capability in the hardware, mitigating the absence of Veritas volume manager and hot plug capabilities at the linux level. It also makes RAID 5 tolerable. Haven't proven hot swapping by testing yet tho.
- Ext2 fs has some RAID 5 aware stuff. This helped on the RAID 5 Mylex vols I have, based on cursory throughput tests, but I'm not sure I'm getting the block alignment proper at the Oracle level. (Don't know, after all the Oracle/ext2/controller layers, if Oracle's 16k blocks are aligned with the stripes on the Mylex. Sigh.)
FWIW, back in the dot-com heyday, I also had clients doing modest high availability (to them) databases on Oracle/Linux. Even then, on relatively small (in gigabytes) databases the biggest tuning hassle was writeback caching of linux getting in the way of Oracle, and the biggest hassle of scalability was managing many many disks. Raw partitions can get around the former; intelligent controllers (Mylex etc.) or intelligent disk arrays (Clariion, Sun T3 etc.) get around the latter.
We considered moving to Postgres for FastMail.FM as well because of the row locking issue. But instead we moved to MySQL with the InnoDB backend (which also drives Slashdot). We've found it works extremely well, and actually doing the upgrade was just a case of running 'ALTER TABLE TableName TYPE=InnoDB' for each table. InnoDB comes with the standard 4.0 binary now too, so you don't have to separately get the -max binary or compile it in yourself. And InnoDB supports multiple files over separate disks (including putting the log on a separate disk of course) so you don't have to worry about converting to RAID.
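The per-table conversion mentioned above can be scripted. Here is a minimal Python sketch that only builds the statements, using the `ALTER TABLE TableName TYPE=InnoDB` form quoted in the comment. The function name and the table names are hypothetical, and actually running the statements against a server is left out.

```python
def innodb_alter_statements(table_names):
    """Build one ALTER TABLE statement per table, in the TYPE=InnoDB form quoted above.

    Hypothetical helper: it only produces SQL text and does not connect to any server.
    """
    return [f"ALTER TABLE {name} TYPE=InnoDB;" for name in table_names]


if __name__ == "__main__":
    # Made-up table names, purely for illustration.
    for statement in innodb_alter_statements(["users", "messages", "folders"]):
        print(statement)
```

The output can be reviewed and then piped into the mysql client, which keeps the conversion a one-statement-per-table job as the poster describes.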
I can confirm that SAP DB is included in SuSE 7.3...
You can get more info at http://www.sapdb.org
Weather.com is almost entirely Linux based, and what little isn't either belongs to The Weather Channel (and is Solaris) or is going away....
If you're not living on the edge, you're just taking up space!
Also the world's largest web hosting company does not use linux for the bulk of their hosting servers (yet). But they do run linux on mission critical back office servers running Oracle 24x7x365.
They use Oracle for the performance and safety.
They use Linux because it can host Oracle.
They opt for Linux over other OSes that can host Oracle because of performance, cost, and manageability.
They migrated off Solaris because Linux was better in all three of those categories. That isn't to say that Linux is better in all situations, but in their situation, the choice was painfully obvious. Linux ran away with top honors in all three categories.
We use PostgreSQL on Linux here at TrustCommerce. "Mission critical" might be an overstatement (it's credit card processing, which is important but not exactly life-or-death).
Cyclopatra
"We can't all, and some of us don't." -- Eeyore
I had looked into InnoDB earlier, but the row size restrictions made it problematic. Your comment prompted me to check the documentation again, and what do you know: They fixed that limitation starting with 3.23.41. Thanks for the suggestion :)
I don't want to upgrade to 4.0 (which is still in alpha) just yet, but I believe I'll compile 3.23.44 with InnoDB support and give it a shot. Hopefully, the upgrade is as easy as you say. Any hints/tips/caveats or possible problems you've run into would be helpful.
-- If no truths are spoken then no lies can hide --
We run our Oracle databases on Sun hardware, from small E420s through 6800s with gigabytes of RAM for performance. There's nothing in the Intel world on that scale, not to mention fault tolerance, scalability, support and so on. No one is going to run Linux on one of these Sun boxes. You need Solaris in order to support the fault tolerance and scalability features of the hardware. Linux will have trouble in the big leagues because on that level the hardware and OS are made by the same company for the reasons above. Even NT DataCenter is hardware vendor specific in order to support the hardware's fault tolerance features.
is deploying a MySQL/Linux-x86 back-end package for its outlets to sell ~80 million tickets/yr. We are also working on switching from IIS/MSSQL to Apache/MySQL.
The alternative is to find an OSS developer willing to fix your system for $5K...
(Of course, if you let them play with your hardware/develop something else at the same time, they'll probably do it for cheaper.)
ShunScene
This is from kx.com using kdb, what I and others consider to be the most advanced database and development environment in existence today. It has amazing performance and unmatched efficiency. It absolutely crushes Oracle, SQL Server, Informix, and MySQL (shown by their tpc benchmarks). Here is a cluster they put together.
.1 second.
on thursday jan 4, 2001 steve miano, ed bierly, keith mason and i
loaded 2.5 billion trades and quotes on a 50cpu linux cluster.
simple table scans on one billion trades, e.g.
select distinct sym from trade
select max price from trade
take 1 second
multi-dimensional aggregations, e.g.
/ 100 top traded stocks
100 first desc select sum size*price by sym from trade
/ daily high and close
select high:max price, close:last price by sym, date from trade
take 10 to 20 seconds
translating the data from TAQ to kdb took about 5 hours.
(steve had loaded the 200 TAQ cd's onto several disk drives.)
distributing the 100gigabytes over the 100Mbit ethernet took 3 hours.
(this cluster should probably have Gbit ethernet)
loading the database (k db taq.m -P 2080), starting 50 slaves,
connecting, mapping shared indicative tables over nfs, building
parallel partitions, etc. took
You are just using the wrong database, then. There is a 50cpu linux cluster (not Beowulf, but clustering native to the database) that was loaded with 2.5 billion stock transactions. It performed very well using KDB (taken from kx.com):
.1 second.
on thursday jan 4, 2001 steve miano, ed bierly, keith mason and i
loaded 2.5 billion trades and quotes on a 50cpu linux cluster.
simple table scans on one billion trades, e.g.
select distinct sym from trade
select max price from trade
take 1 second
multi-dimensional aggregations, e.g.
/ 100 top traded stocks
100 first desc select sum size*price by sym from trade
/ daily high and close
select high:max price, close:last price by sym, date from trade
take 10 to 20 seconds
translating the data from TAQ to kdb took about 5 hours.
(steve had loaded the 200 TAQ cd's onto several disk drives.)
distributing the 100gigabytes over the 100Mbit ethernet took 3 hours.
(this cluster should probably have Gbit ethernet)
loading the database (k db taq.m -P 2080), starting 50 slaves,
connecting, mapping shared indicative tables over nfs, building
parallel partitions, etc. took
this data involved millions of rows and i had it all on a Pentium III 500 with 128mb RAM using Sybase 11.9.2 for Linux. Sybase on Linux is rock solid and i would have no hesitation in recommending it.
The performance was good because the database was designed well and the indexes were optimised for the most common queries. The only issue was/is disk space which was/is easily remedied.
I also did a proof of concept using PostgreSQL 7.1.3 on Linux and also on BeOS BONE by importing one of the bigger tables (3.2 million rows) into PostgreSQL with and without indexes set and querying with and without indexes. I'm pleased to say that both platforms performed well. I used the same data files that were used to build the Sybase tables.
You might be interested to know that it is very easy to put PostgreSQL databases on different devices if performance (i.e. device contention) is an issue.
ArsDigita uses Redhat+Oracle+AOLserver in its Arsdigita Community System (ACS).
Philip Greenspun (MIT) started the company. Now he sits on the board as a major shareholder; he mentioned in an email, "I didn't get along with all the businessmen and venture capitalists I hired."
As far as I know this is the biggest company that sets up big-business RDBMSes on Linux. So far Siemens and The World Bank are their biggest customers.
I'm just learning ACS now and it's quite interesting, however the fact that AOLserver uses TCL scares some away.
Oh, it works by the way! AOL serves over ten thousand hits per second with this architecture.
I tend to post in point form, perhaps I'm just lazy or busy or something. Hope this info is useful to someone, it's my 1st /. post...
My Karma ran over your Dogma....
What about Google?
Google has huge databases (caching the web). It is run on tons of linux boxes. Their entire business depends on speed and accurate information.
an article about Google
I think one reason that companies may be reluctant to switch critical systems to open source is the possibility of a group of crackers giving themselves a backdoor into their computers. Although I'm not familiar with the organization process of an open source project, I would think that an unscrupulous person could involve themselves in the construction of the OS and create backdoors to use at a later time, leaving the company with no one to blame.
I stole this Sig
I was surprised to see the sizes noted by people as large. It seems that the Linux (or at least the /.) community are not really big on RDBMS systems!
...
If they were they would know about the TPC benchmarks which frequently refer to terabyte sized databases.
Generally speaking, Linux systems work well, usually better than commercial systems, if the databases are very small. This again is generally due to the fact that they were optimized to work on small configurations, which are common but not the norm in the enterprise.
We use Sybase on an RS6000 (IBM S85) with an EMC^2 storage array delivering around 80GB. Our system is small compared to others I have been exposed to. As for 10,000 articles, we have over 2000 tables and many of them have more than 20 million records.
We had a project to test the feasibility of Sybase on Linux for our system (although it was 3 years ago), which failed miserably, and today we still use AIX, IBM RS6000s and Sybase.
I think Google is probably the only enterprise-size RDBMS on Linux in existence, and it is a success because it consists of 8000 or so small systems in the gigabyte range cleverly made to work together. It is also all proprietary, as well as a sponsored research project.
The system started off using HP and HP-UX for the servers, but after rising maintenance costs, they moved from HP-UX to Linux (Mandrake distribution, I believe) and switched to using standard PCs, thus saving a fortune in maintenance costs.
The client software uses Win95 (unfortunately) but the clients are only used to access the server. The apps there were written in Delphi. Use of Linux gave the project considerable savings over other solutions, though unfortunately Informix was not cheap.
The work of switching the server from HP-UX to Linux was trivial (it was also Informix there). I would have liked to get away from Informix too because of the license costs, but regrettably that would have been too much work at the time.
This may not sound like a killer app, but this is where the share ownership is recorded for a country of around 24 million people, so it is definitely an important system.
The application was developed in St. Petersburg, Russia by the St. Petersburg Currency Exchange and funded by a loan from the World Bank. My own company managed the project.
See my journal, I write things there
Mike bernstein? only guy I know gay enough to still like guns and roses.
Yes there are. Next question.
Would we have had this if the software package was from Sun? Well, Sun might have blamed IBM, IBM might have blamed Sun and we'd be left with something which doesn't work. We've been lucky in that IBM want this to work to secure future business, and that is the carrot you can use to 'bribe' vendors to fix bugs.
That's a nice little theory you got there, but in the case of IBM and other large corporations you face in-fighting between all the little business units. I work for a Fortune 100 company and like most we have an ELA (Enterprise Licensing Agreement) with IBM where for a set price, we get product licensing, product "discounts", enhanced support and whatnot.
My clients decided to take advantage of this "deal" and went with an all IBM solution, (hardware/system software). They got an RS/6000 H70 running AIX (obviously) with a 3995 C64 Optical Library running TSM/HSM. As it turns out, we had quite a bit of problems with the performance of the optical library with TSM/HSM. Following your logic, one call to IBM should have taken care of it. What ensued was finger pointing between RS6k Hardware Support, AIX Software Support, TSM/HSM Support, and the Optical Library Hardware Support. Each claimed it was the other group's problem, and that they didn't have experience with the other pieces. We even had our account rep sit down with us to sort this all out, but even he was unable to get anything done. To this day, I'm stuck with an Optical Library that is unreliable with substandard performance. IBM as a whole was not able to get its act together, and as a result my clients got the short end of the stick.
Never ever had any downtime, I don't believe.
If by "never" you mean "not in the last week", you're still wrong.
Yes, please do a write-up of this.... I'm the network admin for a city government in Texas (city's population ~100K, near the D/FW area, I can't say the city name here on /.) and we've got to divorce ourselves from MS, at least on the server side as soon as possible due to MS software not being feasible to maintain any longer because of unrealistic cost escalation and forced premature obsolescence as well as the usual well-known security and reliability problems.
We have an InnoDB table with 12 million rows. While it's somewhat slow at times, it kicks Oracle's @$$. As far as a tip goes, you better listen when they say don't make the memory pool larger than 80% of RAM. It will swap hard and kill performance, believe me. :)
Consider keeping everything you've done the same, except change Linux + ext2 to FreeBSD + UFS + softupdates. It will likely make a big improvement. Oracle runs just fine under FreeBSD's Linux emulator (I have heard that you must install Oracle on a Linux machine, then tar up the installation and move it to a FreeBSD machine).
In another year when FreeBSD 5 comes out we hope to give Solaris a run for its money on SMP hardware, believe it or not.
Your point. (-:
Got time? Spend some of it coding or testing