Free Software, Get What You Pay For?
An anonymous reader writes "The Xooglers blog is running an interesting article on how big businesses may start out running free software but there is always the continued question of 'Should we go with something "real"?' at some point in their evolution. How often are technologies like PHP, Perl, and MySQL being pushed out once startups get managers who know nothing about the technology and only worry about name brands?"
I think it's unfortunate, but in the IT world it is generally true that there is no relationship between the quality of software and its cost. Some of the best software I've ever used has been free, some of the worst software I've ever used has been expensive.
IT and technology and particularly software can be (is) difficult to understand on many levels: functionality; efficiency; ergonomics; stability; etc. In a book (and God, I wish I could remember the title of this book -- one of my faves) talking about manipulating perceptions one of the discussions centered on the fact that when all other criteria are indeterminate or unavailable, it is human nature to assign credibility and worth based on price or cost. This is rife in the world of software.
Unfortunately, I see this as something taken advantage of rather than properly addressed.... sigh.
I don't know about the rest of the people here but, when my company talks about "going with something real", they're asking whether we should stay with Fedora or go to Red Hat Enterprise Linux.
Sit, Ubuntu, sit. Good dog.
Since data management has a complete, underlying theory, as well as several decades of best-practices, it's pretty easy to assess quality. Just look at the list of what's possible, and see if product X or Y correctly implements more of the items. If both products are free/open and are equally reliable and secure, what's the point in using the product with the smaller number of features?
... I am .. a huge supporter of MySQL although I also have heard of instances where MySQL has had corruption problems. Hopefully the new version 5 has fixed that.
Some of the comments here are pretty scary. I'd expect to hear this from a summer hire working on a content management system, not somebody working at a big company on a system that involves MONEY:
If you don't have transactions you just roll your own. It's actually not hard at all.
Whoa.. he must be pretty fucking good. Let's suppose you have a DBMS with, say, two major apps using it. You need to adjust a value in 500,000 rows in a single transaction. If one doesn't get changed, and the others do, you are totally fucked. If one application ever sees a state where all 500,000 haven't been changed, you are totally fucked. Power failures, application death, DBMS death, all should maintain this invariant (of not being totally fucked). Your assignment: do this at the application level with a MyISAM table.
I can think of ways to do it, but they aren't transparent to the applications. They involve basically implementing a DBMS layer in between the app and MySQL and putting implementation-dependent columns in the tables (bad design: only business-related data should be explicitly placed in columns).
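For contrast, here is roughly what the all-or-nothing behaviour looks like when the engine does support transactions. This is only a minimal sketch: it uses Python's built-in sqlite3 module as a stand-in for a transactional DBMS, and the accounts table, the function names and the fail_at parameter are made up for illustration. It shows the rollback-on-failure half of the assignment; isolation between two separate applications is not demonstrated.

```python
import sqlite3


def build_accounts(n_rows=1000):
    """Create an in-memory table with n_rows rows, each with balance 100 (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO accounts (id, balance) VALUES (?, ?)",
        [(i, 100) for i in range(n_rows)],
    )
    conn.commit()
    return conn


def adjust_all(conn, delta, fail_at=None):
    """Add delta to every row in one transaction.

    fail_at is an illustrative hook that simulates the application dying
    when it reaches that row id. Returns True if the change was committed.
    """
    ids = [row[0] for row in conn.execute("SELECT id FROM accounts ORDER BY id")]
    try:
        with conn:  # commits if the block finishes, rolls back if it raises
            for row_id in ids:
                if row_id == fail_at:
                    raise RuntimeError("simulated application death")
                conn.execute(
                    "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                    (delta, row_id),
                )
        return True
    except RuntimeError:
        return False


def changed_rows(conn, original=100):
    """Count rows whose balance differs from the original value."""
    query = "SELECT COUNT(*) FROM accounts WHERE balance != ?"
    return conn.execute(query, (original,)).fetchone()[0]
```

Calling adjust_all(conn, 10, fail_at=500) updates the first 500 rows and then fails, after which changed_rows(conn) reports zero: the partial work is undone by the engine, with no extra columns or bookkeeping in the application.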
We have been using MySQL on our site (www.degreeme.com) since its inception and have never had a single problem.
Yeah, hopefully... WTF?
People who push the "no transactions" FUD also forget that transaction support often reduces the reliability of applications
Unbelievable... maybe he means "instead of silently accepting crap, the application returns an error when an error actually occurred".
The truth is that it's not the end of the world if you mess up a row or two in most databases
Whew, now I know what to tell my clients when they cut checks to their suppliers based on bad data printed on invoices.
Well, that depends on what you mean. In some sense we had no need for [transactions] at all, obviously, because we built the system without them.
Translation: we didn't have transactions, so we just left them out and crossed our fingers.
Question: if you aren't using transactions (or constraints, etc), how do you know if you had a problem? Do you hand-check the data? Do you have a script running every 10 seconds to check?? "Consistency guaranteed by wishful thinking".
Oh well. Glad to see MySQL backpedaled on all the BS they used to spout and actually implemented some basic SQL features in the latest version 5.0.
Time is money.
Free and/or open source software such as Linux, the GNU tools, Mozilla, Open Office, GNOME, KDE, MySQL, Apache, Postgres, and many other wildly successful tools have been worked on for countless hours by skilled programmers and designers. Whether out of the kindness of their own hearts, desire for recognition, or a business investment, people have spent millions of hours designing, developing, testing, and documenting Free Software. Consider for a moment how much it would have cost to pay each and every one of those people for their time. That's the amount of money that has been put into Free Software.
If someone gives you a mansion, you don't assume it's worthless because you didn't pay for it. The worth is still there; someone else already paid for it.
The nice example given in the article clearly shows clueless managers and developers who were not convincing enough.
In small startups you may pick it because it's free. In giants like Google you pretty much disregard the cost of buying software and just compare features. "Does it do all we need, well?" is the first and ultimately relevant question. All the others are secondary once the only competitors in the field have been established. In the case of databases there is no competition here, and all discussions should have ended at that first question. Does it do all we need, well? Yes, NOW it does, all we needed was added, it works fine. Does anything else do all we need, well? HELL NO! MySQL is an absolute master in the field of speed; when properly optimized it beats everything and everyone (at the cost of all the quirks we had to fight in the meantime). Everything else is much slower, and most choices will be simply way too slow for the expected workload.
Free (Gratis) or not, doesn't matter here at all. Open Source matters, if it doesn't do what we need, we can get it to do it, but that's not essential.
Managers who don't get it won't last long, simply because they will keep failing to deliver working projects on time.
Anagram("United States of America") == "Dine out, taste a Mac, fries"
How the fuck can you all discuss this sort of thing when an INNOCENT Stan "Tookie" Williams has just been executed.
You people make me sick. I am leaving Slashdot forever.
Propz to GNAA
MySQL isn't a real database. Certainly not before version 5.0, at least.
However, upgrading to a real database didn't have to involve an expensive commercial brand. They could have moved to another free database like PostgreSQL.
---------
There is inferior bacteria on the interior of your posterior.
are PHP, Perl and MySQL pushed out because they lack scalability and are hard to maintain?
I mean they are fine for small businesses, but when the business grows, so do its IT assets (databases, website, in-house software, etc). Large amounts of PHP & Perl code can be hard to maintain when compared to other languages like, say, Java. And as for MySQL... I'll let the other comments deal with that as I'm sure they will.
I should also be quick to point out that this doesn't mean OSS is bad, just that some of its more visible members are overhyped and rely on their brand-name to 'sell'. A good example of scalable OSS tech would be PostgreSQL as a DB, Plone/Python for the web and Java for large server/desktop apps.
And before someone mentions that Java isn't open source, I'll point out it is a language with an open(ish) specification and has an open source compiler/runtime which is good enough.
That is when they die. Seriously. Look at these guys, what was their name again, ah, yes, Google -- kinda silly but you know these computer types -- anyway, where would they be if they had dumped their Linux and stuff for Windows XP or even OS X (the Unix of the great GUI but crappy thread performance)? Not trading at about $400, that is for sure.
Don't worry. This is evolution in action: The clever ones, the more efficient ones survive. Those who pay $400 for Microsoft Office instead of using OpenOffice for free are not efficient. If Open Source can keep the legal playing field level, the rest will take care of itself.
perl /is/ a brand name, pretty much.
filter: +3. Hey, look! all the trolls went away!
Sure MySQL has supported transactions for years, but tell me what is the best way to load 20GB of data into innodb tables?
;).
I tried a fair number of recommended ways, but still got stuck with the disk writing at 2MB/sec even at the _start_ (which is not even the actual rate data is flowing into the DB). I hadn't even got to the "slow down due to big index" part. Yes I did set the sync thing to 0 (sync max of once a second, instead of on every commit).
I was seriously considering putting the innodb log files on a ramdisk
I didn't wrap everything in one huge transaction, because according to the docs if "stuff" happens it takes 30 times longer to roll back everything. So if you are 3 hours into inserting something, it'll take 90 hours to rollback... There are workarounds, but I've looked at the workaround, and "Thanks but no thanks".
In the end I loaded the data into MyISAM tables instead (disable keys then enable keys).
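For what it's worth, the usual middle ground between a commit per row and one giant transaction is to commit in fixed-size batches, so a failure only ever rolls back the batch in flight. A minimal sketch of the idea, again using Python's built-in sqlite3 as a stand-in (an assumption: this is not InnoDB, and nothing here says anything about how InnoDB would perform). The samples table, the function names and the batch size are all made up for illustration.

```python
import sqlite3
from itertools import islice


def make_samples_db():
    """Create an in-memory database with an empty, hypothetical samples table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE samples (id INTEGER PRIMARY KEY, payload TEXT NOT NULL)")
    return conn


def load_in_batches(conn, rows, batch_size=10_000):
    """Insert rows from any iterable, committing once per batch.

    Returns the number of commits performed. If a batch fails, only that
    batch is rolled back; earlier batches stay committed.
    """
    iterator = iter(rows)
    commits = 0
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            break
        with conn:  # one transaction per batch
            conn.executemany(
                "INSERT INTO samples (id, payload) VALUES (?, ?)", batch
            )
        commits += 1
    return commits
```

Feeding it a generator keeps memory use bounded by the batch size, which matters more than anything else when the input is tens of gigabytes. The trade-off is that a failed load leaves the earlier batches in place, so the loader has to be restartable.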
With MyISAM if you want to add an index, MySQL makes an entire copy of the table. The manual doesn't say that Innodb is different, but I haven't tested it.
Fortunately it's what I'd call a small site.
Imagine if you had terabytes of data and had to break the news to your boss, "yes sir, we need to double our storage capacity", "Yes sir, our data is only growing at 200GB a year, but we still need to add 2 more TBs _now_ in order to add an index".
By the way by default MyISAM tables are limited to 4GB... Talk about thinking ahead.
Whereas with innodb, say you have a huge ibdata1 file, how do you shrink it once you're done with some tables? While you can have tables in ibd files, it's still something new - the previous reported bugs sure don't give me a lot of confidence in it.
I use PostgreSQL at home. One day I should go load up 20GB of data into it and see how long it takes.
In a book (and God, I wish I could remember the title of this book -- one of my faves) talking about manipulating perceptions one of the discussions centered on the fact that when all other criteria are indeterminate or unavailable, it is human nature to assign credibility and worth based on price or cost.
Many California wineries discover that they can't sell their juice at its natural pricepoint of $19.95, but if they add a few choice words to the label, like "Private Reserve", or "Luxury Cuvée", and jack the price [or at least the MSRP] up to $99.95, then the juice flies off the shelves.
[BTW, this strategy doesn't work as well for European wineries, as their clientele is, by and large, a little more cynical when it comes to things vinous.]
that the question of when and why to migrate to and from free software to commercial alternatives is being asked, that it can be asked.
It means there are alternatives, that everyone can make a freer decision with more options, that free software provides a baseline commodity level that benefits everyone, and that commercial providers can compete on providing genuine value added on top of this baseline offering.
"Provided by the management for your protection."
I agree with your fundamental point, that the language chosen should depend upon the objective. However, I strongly disagree with some of your points.
C/C++ for really large programs - this gave us wonderful bugfree things like Windows, and Office. The power of the languages breaks down after a few hundred thousand lines of code as they turn into maintenance nightmares. Data dictionaries anyone?
Java hard to use? - This is the first time I've ever seen anybody say that. Java has its issues but, if you think Java is hard to use then C or C++ will be way beyond reach.
Let's really break it down. Loosely typed languages tend to be best for small quick and dirty projects. Strongly typed languages tend to be better for large projects. Speed should just be treated as a requirement. Some things are faster in Java, some are faster in C++. The old law works here: make it work, then make it fast.
Ease of code maintenance will always depend upon the discipline of the development team, regardless of language. However, strongly typed languages will tend to promote practices of more maintainable code. Loosely typed languages give quicker development in the short term.
When a strongly typed language is used the compiler becomes a powerful tool in the "prevention" of bugs. If you define a method in Java to take an integer, the compiler won't let you give it anything else. If you define a method in C to take a pointer, you can give it a pointer to anything. Perl is so loose about its types that it's painful. The larger a program the more this becomes an issue.
Your points about million line Java programs being slow, or about Java data structures not working in "big" "professional" systems just show you might have a bit of learning to do.
Some "big" "professionals" using Java in large applications:
RealNetworks - Tomcat/BEA
Citibank - Java Web Server (very old version)
Bank of America - IBM WebSphere and Sun One
United States Navy - IBM WebSphere
United States Air Force - Tomcat/IBM WebSphere
Nielsen Research - IBM WebSphere
Fidelity - IBM WebSphere
Blue Cross Blue Shield - IBM WebSphere
Now, these may not all be the best solution in each case, but these are some of the ones I have experience with.
----- If communism is a system where the government owns business, what do you call a system where business owns govern
Of course I'm not a SQL purist. I think that DBs exist as part of a larger system and play a specific role. I prefer my app-level logic and data where it belongs - in the application driving the db. Call me old school, but I can live in a world without views and stored procs just fine.
Where to implement your business logic (app vs. stored proc) is a preference question, granted. But having transactions and referential integrity constraints is not.
Transactions and constraints are your last line of defense against data inconsistency caused by application bugs. Think of them as a sort of assert-checking. Your application code has the responsibility to check that the customer placing this new order actually exists. But if you forgot to SELECT FOR UPDATE in only one place, you created a bug that will probably not be exposed by any software testing. It is waiting for a race condition in which the cleanup procedure that removes inactive accounts once a month collides with the customer returning after 2 years.
If you define the constraints that represent the relationships in your data, your database will ensure that the above bug does not create an inconsistent state. One of the two transactions will fail and it might cause the application to do some ugly, ungraceful stuff. But trust me, you really want it to do that in this case.
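To make that concrete, here is a minimal sketch of a foreign key catching the orphaned-order bug. It uses Python's built-in sqlite3 module as a stand-in database, and the customers/orders schema and the function names are invented for illustration. It runs the two operations one after the other on a single connection rather than staging a real race, but the point is the same: whichever of the two comes second is refused by the database itself.

```python
import sqlite3


def open_shop():
    """Create an in-memory database with one customer and a constrained orders table."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this is switched on per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE orders ("
        " id INTEGER PRIMARY KEY,"
        " customer_id INTEGER NOT NULL REFERENCES customers(id),"
        " item TEXT NOT NULL)"
    )
    conn.execute("INSERT INTO customers (id, name) VALUES (1, 'returning customer')")
    conn.commit()
    return conn


def place_order(conn, customer_id, item):
    """Insert an order; returns False if the constraint rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO orders (customer_id, item) VALUES (?, ?)",
                (customer_id, item),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def purge_customer(conn, customer_id):
    """The monthly cleanup: delete a customer; returns False if orders still reference it."""
    try:
        with conn:
            conn.execute("DELETE FROM customers WHERE id = ?", (customer_id,))
        return True
    except sqlite3.IntegrityError:
        return False
```

If the cleanup runs first, purge_customer succeeds and the later place_order returns False. If the order lands first, it is the purge that fails. Either way no order is left pointing at a customer that no longer exists, even though the application never checked.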
Call me old school, but for some data I prefer belt & suspenders.
Jan Wieck
It takes a real man to ride a scooter