Ruby on Rails 2.0 is Done
Jamie noted that Ruby on Rails 2.0 is done. In addition to upgrade and installation instructions, the article lists a number of the more interesting new features in the release, which appears to be quite extensive.
You are, or at least so far as hype == marketing.
"It is a good divine that follows his own instructions" - Portia, The Merchant of Venice
Back when I was a young whippersnapper, we called that thing a relative record number!
I am like the 10th comment and it is already slashdotted!
People have discussed this over and over and over again. I presume you're talking about support for composite primary keys. They aren't necessarily a good thing. Go read http://rapidapplicationdevelopment.blogspot.com/2007/08/in-case-youre-new-to-series-ive.html
I don't even consider normalization taken to the extreme to be a good thing. It's a trade-off, just like everything else: what you gain by normalization, you might lose in the form of added application complexity, or perhaps even something else. Just because normalization is "good" according to ivory-tower database theory doesn't mean that anything that isn't fully normalized is "bad" or "broken".
"Yes I know that a serial/autoincrementing key makes it easy for the app... it makes it a lot harder for the DBA in a lot of cases."
Can you explain what exactly it is that makes it a lot harder? (And isn't a DBA paid to do his job?)
You could make a web app without ActiveRecord. You could write your own ORM framework or whatever. The whole philosophy is to have easy to use, quick to program, etc frameworks by default. If you want to chart your own course beyond that, there is nothing holding you back. Sorta like any other framework.
That's great for you, but Ruby on Rails is not - and isn't intended to be - a framework for redundant distributed DB applications. Ruby on Rails is not trying to be the thing for everybody. And that's exactly what makes it so powerful and easy. Indeed - turning it into something for everybody actually makes it worse. I'd like to see some proof that composite primary keys are so important in web applications. So far I've seen no convincing evidence. Despite all the complaints about composite primary keys, new Rails websites are written every day, even by high-profile organizations like IBM, NASA, Oracle and Yahoo. And they seem to function just fine.
If you really, really, really, really need composite primary keys, you can still fall back to raw SQL queries in Rails.
While Rails might not be my first framework of choice to implement Digg in, I prefer to build sites which actually, you know, make money by solving problems for paying customers. When you do that, you don't really have to worry about scaling to infinity and beyond, but you do have to worry about expressiveness, maintainability, and time to market. (If you have too many customers relative to servers, heck, easy solution there -- the engineer in me says "just throw up more boxes", but the businessman in me says "pay somebody to worry about it so I can go back to counting my benjamins".)
I have a Rails site, my first (hopefully of many) for my small business, which plugs along at about 20 requests a second in tests. If I could saturate those 20 requests a second, I would quit my day job on the spot. Scaling? Eh, who cares.
(P.S. Day job is writing enterprise level crud apps for Japanese universities on the J2EE stack. They worry a bit about, e.g., getting hit with 8k users signing in simultaneously during class registration. You know what we do? Exactly what I'd do for a Rails app in the same situation ("don't do anything stupid like an n+1 queries loop, cache the important stuff, and buy enough hardware for the job"). Only difference in Rails is I have never wanted to poke my eye out with a spoon while writing it.)
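For anyone who hasn't met the "n+1 queries loop" mentioned above, here is a minimal sketch using Python's built-in sqlite3. The courses/students tables and the `run` helper are made up for illustration; nothing here comes from the poster's actual apps.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE courses (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT,
                           course_id INTEGER REFERENCES courses(id));
    INSERT INTO courses VALUES (1, 'Databases'), (2, 'Compilers');
    INSERT INTO students VALUES (1, 'Aiko', 1), (2, 'Ben', 1), (3, 'Chie', 2);
""")

stats = {"queries": 0}

def run(sql, params=()):
    """Hypothetical helper: execute a query and count it."""
    stats["queries"] += 1
    return conn.execute(sql, params).fetchall()

def titles_n_plus_one():
    # One query for the students, then one more per student: N + 1 in total.
    students = run("SELECT name, course_id FROM students ORDER BY id")
    return [(name, run("SELECT title FROM courses WHERE id = ?", (cid,))[0][0])
            for name, cid in students]

def titles_single_join():
    # The same rows in a single round trip.
    return run("""SELECT s.name, c.title FROM students s
                  JOIN courses c ON c.id = s.course_id ORDER BY s.id""")

stats["queries"] = 0
slow = titles_n_plus_one()
n_plus_one_queries = stats["queries"]

stats["queries"] = 0
fast = titles_single_join()
join_queries = stats["queries"]
```

Both functions return the same rows; with three students the loop issues four queries and the join issues one. The gap grows linearly with the number of rows, which is why it hurts at 8k simultaneous users regardless of framework.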
Help poke pirates in the eyepatch, arr.
Actually no I am not talking about compound keys (although yes that is very, very important). I was talking about:
CREATE TABLE foo (company_name text PRIMARY KEY)
Which as I understand it, Rails is too stupid to understand. Granted, the only ORM that I know of that isn't is the one being used by Catalyst but still. It is a major design mistake to assume that your data can be correctly represented in a normalized structure using (id serial PRIMARY KEY).
I am also not talking about extreme normalization. I am talking about basic normalization, e.g., up to, say, the Third Normal Form.
The job of the DBA is to enforce proper database semantics, including design, performance, and maintenance.
Proper design is impossible with Rails without reducing performance (because of the requirement to have a serial primary key and then a natural unique key just to satisfy proper data requirements). Because of the above, Rails also increases database maintenance, increases resource utilization (disk space and IO), reduces transactional velocity (having to update multiple indexes that shouldn't be required), etc...
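To make the two layouts being argued about concrete, here is a rough sketch in Python with the standard sqlite3 module. Only `foo` and `company_name` come from the comment above; the other table names and the `rejects_duplicate` helper are invented for illustration, and SQLite's index behaviour differs from PostgreSQL's, so this shows the constraints rather than the performance claim.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Natural key: the name itself is the primary key.
    CREATE TABLE foo_natural (company_name TEXT PRIMARY KEY);

    -- Surrogate key plus a separate UNIQUE constraint on the natural key.
    CREATE TABLE foo_surrogate (
        id INTEGER PRIMARY KEY,
        company_name TEXT NOT NULL UNIQUE
    );

    -- Surrogate key alone: nothing protects the natural key.
    CREATE TABLE foo_id_only (
        id INTEGER PRIMARY KEY,
        company_name TEXT NOT NULL
    );
""")

def rejects_duplicate(table):
    """Insert the same company twice and report whether the second insert fails."""
    conn.execute(f"INSERT INTO {table} (company_name) VALUES ('Acme')")
    try:
        conn.execute(f"INSERT INTO {table} (company_name) VALUES ('Acme')")
    except sqlite3.IntegrityError:
        return True
    return False

natural_ok = rejects_duplicate("foo_natural")
surrogate_ok = rejects_duplicate("foo_surrogate")
id_only_ok = rejects_duplicate("foo_id_only")
```

The first two tables reject the duplicate and the third accepts it. That is the complaint in a nutshell: with a mandatory serial id, the uniqueness you actually care about has to be declared a second time, and the database has to maintain it separately.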
I can go on and on.
However, I have talked myself to death with Rails and Java programmers before. The majority (not all) are stuck in their own little code-generated world and don't want to actually do things correctly.
Not to mention that Rails significantly reduces your ability to do really cool stuff such as stored procedures (yes you can break out and do it manually but then why are you using Rails?), Federated databases and on and on and on.
Get your PostgreSQL here: http://www.commandprompt.com/
Ever notice how those most "concerned" about scalability tend to have never profiled or benchmarked their own code? ... or understood why you want to scale horizontally, rather than vertically? Whenever I build services that can handle 120,000 requests/sec., they usually just end up being 99% underutilized. Everyone likes to think THEY will be the next MySpace, with no server budget apparently. I highly doubt that anyone who argues Rails can't scale has ever had to deal with real distributed clusters. The database cluster will have many more scalability issues than the web servers. This is such a non-issue, I cannot believe it. If you can scale JAVA!!!... You know what I mean.
Yeah, blame it on the programmers. Taking elitism (or perhaps "anti-programmer-ism" is a better word) to a whole new height.
Yes, my database theory knowledge is limited. I admit it. I passed the 'database systems' course, but I'm not a professor in database systems, so *of course* my database theory knowledge is limited. That's why DBAs exist and why they are paid to do their job in the first place.
But you know what, the database isn't the only thing that exists. There's also the actual application. If normalizing stuff to the extreme makes the application 6 times harder to write (= missed deadlines, potentially more bugs, harder to maintain, etc.), then I don't mind de-normalizing some stuff after carefully weighing the pros and cons.
As for "ivory tower database theory", consider the following scenario (this is based on a question from the "database systems" exam):
We have a 'teachers' table with columns (first_name, last_name, address, city, phone_numbers). Here, phone_numbers is a set. We are asked to normalize the table. One of the obvious things to do here would be to change the 'phone_numbers' column into a table, and add a one-to-many relation from the 'teachers' table to the 'phone_numbers' table.
The professor, however, took normalization to the extreme: he even introduced a 'numbers' table. So the 'phone_numbers' table has a one-to-many relationship with the 'numbers' table. (Or something like that; I can't remember the question completely.)
Great. Perfect normalization. But I would never write such a database schema. I don't even want to think about the huge, ugly, and hard to maintain SQL queries that result from this. And this is what I mean by ivory tower: great in theory, but makes things a total pain in practice for 99% of the applications.
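For reference, the "obvious" first step from the exam question (phone_numbers split out into its own table) might look roughly like this in Python's sqlite3. The `id` and `teacher_id` columns and the sample row are assumptions added for the sketch; the exam schema as described had no surrogate ids.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE teachers (
        id INTEGER PRIMARY KEY,          -- assumed surrogate key
        first_name TEXT, last_name TEXT, address TEXT, city TEXT
    );
    -- One row per phone number instead of a set-valued column.
    CREATE TABLE phone_numbers (
        id INTEGER PRIMARY KEY,
        teacher_id INTEGER NOT NULL REFERENCES teachers(id),
        number TEXT NOT NULL
    );
    INSERT INTO teachers VALUES (1, 'Ada', 'Lovelace', '12 Example St', 'London');
    INSERT INTO phone_numbers (teacher_id, number)
        VALUES (1, '555-0100'), (1, '555-0101');
""")

# One join gets every number for every teacher.
rows = conn.execute("""
    SELECT t.last_name, p.number
    FROM teachers t JOIN phone_numbers p ON p.teacher_id = t.id
    ORDER BY p.number
""").fetchall()
```

This level costs one join per lookup. Each further table the professor adds costs another join in every query that wants a phone number, which is the practical pain being described.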
Please just try Rails for a little while. While Rails has its flaws, overall it's a highly productive framework - and much of the credit for the terrific code clarity goes to Ruby, which is much more powerful and dynamic than almost any other mainstream language (other than maybe JavaScript).
Some things to read about and try within Rails:
* ActiveRecord's ability to introspect the DB schema at runtime. e.g. autocreating the method to allow: User.find_by_name('Joe')
* ActiveRecord's magic-fields, e.g. created_at/updated_at
* the ActiveRecord associations, and the easy DB queries that come with them, e.g. @user.posts.find(:all, :conditions => {:status => 'pending'})
* the scope_out plugin, which provides some nice additions to 'with_scope'. e.g. in the Post model you could do scope_out :pending, :conditions => {:status => :pending}, and then be able to change the previous example to: @user.posts.pending
* ActiveRecord callbacks and the controller before/after filters
* the RESTful routing and easy links that come with it, e.g. link_to(@user.name, @user) will create a hyperlink to the correct URL for that user record's 'show' page
* the form/field helpers which also integrate with the routing, so you can now do just form_for(@user) - it will create a proper form tag for hitting either the create or update method for that @user, depending on whether the record has already been saved to the database - the form_for/fields_for block syntax is also very powerful, especially when you add your own form helper methods
* all the convenience methods provided by active_support, like 5.minutes or 1.month.ago
* Ruby itself - Ruby is simply a joy to code in. Even if I were going to dump Rails, I would now strongly prefer to find a new Ruby framework (like Merb) than use another language.
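To give a feel for the first item in the list (finder methods derived from the live schema), here is a loose imitation of the idea in Python. This is not how ActiveRecord is implemented; the `Table` class and its `find_by_<column>` convention are a hypothetical stand-in built on sqlite3 and `__getattr__`.

```python
import sqlite3

class Table:
    """Hypothetical helper: offers find_by_<column> for whatever columns exist."""

    def __init__(self, conn, name):
        self.conn, self.name = conn, name
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        self.columns = [row[1] for row in conn.execute(f"PRAGMA table_info({name})")]

    def __getattr__(self, attr):
        # Only reached for attributes that are not defined normally.
        prefix = "find_by_"
        if attr.startswith(prefix) and attr[len(prefix):] in self.columns:
            column = attr[len(prefix):]

            def finder(value):
                sql = f"SELECT * FROM {self.name} WHERE {column} = ?"
                return self.conn.execute(sql, (value,)).fetchall()

            return finder
        raise AttributeError(attr)

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO users VALUES (1, 'Joe'), (2, 'Ann');
""")

users = Table(conn, "users")
joe = users.find_by_name("Joe")
```

Nothing named `find_by_name` is written anywhere in the class; it exists because the table has a `name` column. Add a column to the table and the matching finder appears without any code change, which is the convenience the bullet is pointing at.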
I'd strongly urge you to pick one or more of the PHP MVC frameworks to look at while you read about Rails. Most of them are copies of Rails or at least inspired by it to some degree, so they often use similar conventions. You'll see the difference between what's possible in PHP and Ruby - PHP doesn't come out looking too good at the end.
Exactly. The fact is, Ruby is slow. Rails is slower. This is a very accurate complaint about Rails as a web framework, although in the real world it's generally not much of a problem. Somehow though, people have confused speed with scalability, and start claiming that Rails isn't scalable. In fact Rails tends to encourage, or at least make fairly easy, a shared-nothing architecture which allows a trivial "throw more servers at it" solution to scaling.
That said, because of Rails' speed, you will wind up needing to scale it sooner and larger than you would a site written in Django, say. If people want to complain about the hardware costs of running a real-world Rails site, I encourage them to do so and put up real numbers regarding the money they spend on developer time and other company expenses vs. server costs, and how Rails being so CPU-hungry is killing their bottom line. So far I've seen none of that, just uninformed whining.
Or rather, since I haven't been keeping up with the development process, perhaps I should ask, is there a viable apache 2.x module for ruby that allows one to run RoR sites without relying on mongrel/other web servers?
Because, frankly, if it can't be run on apache 2.x, I (and the company I work for) won't touch it. We have already seen the scalability nightmare that RoR was, of course, so obviously we're a bit skeptical about performance optimizations. (:
Note: I have nothing against using new technology, even if it requires learning a new language, but when one has hundreds of sites that require web server A, and a framework requires the shoehorning of web server B, well, the aforementioned framework loses its appeal.
The wise follow a damned path, for to know is to be forsaken.
That stored procedure is awesome (well, it actually isn't very good sql, but it doesn't matter right now). As the developer, you just need to worry about passing the monster name and the database spits out everything you want.
If you do most of the logic in your stored procedures, it makes it easy to bolt on new features written in various languages. If you decide to have a perl script for a cron job, you just call the same stored procedures your ruby app is calling. If you want a windows front end for your admin staff, the windows app calls the same stored procedure too.
Once you bury the database logic in the application code, you have to rewrite it for every application. It is, in a way, a very evil form of copy & paste programming. Now every change in the database requires you to go into every single application and change something. Kinda like when you get slutty with your code and copy & paste it rather than abstract it out into a library.
And I'm aware stored procedures don't play nice if you are worried about cross-database issues because you sell the software. This only works when you get complete control over the application & database stack.
PS: MySQL stored procedures suck. Use a real database with a better stored procedure language.
Not everybody is working in small shops on non-mission-critical apps. When you have big stuff, or big clients, you need to be very flexible, because you can't control everything. You can get the database shoved down your throat, communication protocols, unneeded specs, everything. That is what flexibility is for. A consultant can come and say that you need a high availability distributed architecture, you might have to change stuff for security reasons, for performance. Flexible frameworks let you do that.
Of course, it's nice that there is stuff like ROR for those who don't need to deal with medium to big projects, but I don't think they are 99% of anything.
You're assuming that all composite primary keys use values that do change. That's highly unlikely, given the number of tables in the world filled with historical data. That said, I agree (for other reasons) that surrogate keys are much better.
Seriously.. why does Ruby on Rails get so many people so fired up? If you don't like it, don't use it. If you do like it, feel free. There's no one-size-fits-all solution, but for many people Rails comes pretty close. If you're not one of those people, there are plenty of other frameworks and languages.
Why do people in any kind of IT have such huge egos? It's counterproductive and at the end of the day, if you're making the client happy, and that makes your boss happy, you've done your job.
"Holy. Crap. You're not the guy that writes Active Record, are you?"
If you're referring to DHH, then no, I'm not him. My stance isn't as extreme as his ("database is just a big hash") but I do agree with some of his points. Transactions = good. Foreign key constraints = good. Stored procedures = only use when absolutely necessary. Normalization = weigh the pros and cons in application code complexity and data redundancy. Etc.
"The thing a lot of OSS developers seem to forget is that many applications are primarily for data processing with user interfaces thrown on top. I.e. Not every damn "web app" is a blog or wiki, where its primary purpose is to be a web app."
Not every, but *a lot* of them are. Very often they're systems for displaying, storing and retrieving small to moderate amounts of information (unless you're working on a really, really big multi-million system).
"Fact is that, if Rails wants greater acceptance (and, yes, I realize it is already widely accepted -- I'm talking about continued growth), then it's going to need to support things like composite keys. Why? Because people use them, and the application may have come years before the web interface."
I don't think so. I'm pretty sure that people complain about composite primary keys because it's so easy to complain about. Most of them probably wouldn't consider using Rails even if it fixes all its "flaws". *
There was a time when I took all Internet complaints very seriously. I worked very, very, very hard to meet people's demands, and I did it for free. In the end, it didn't help. Whenever I published a fix for one complaint, they complained about other things. It's an endless cycle. The complainers can never be satisfied no matter what I do.
For example, people complain about memory usage in Rails. I've developed a way to reduce the memory usage by 30%, and look - very few people are interested! The number of people interested in my work is wildly out of proportion to the number of people complaining.
* But by fixing Rails "flaws", you've just made it worse. The reason why Rails is so great in the first place is because it's a very specialized framework. It's not trying to be the thing for everything, like J2EE. If you make it the thing for everything it'll be a lot more painful to work with. It's like saying that your television can't wash your clothes. While it's possible to make a television that does that, it would be a royal pain to make and to work with.
Just curious: why isn't performance even mentioned in this thread? It should be a tradeoff between (application code complexity, normalization, and performance). Choose the most important one and design accordingly, sacrificing the others only when necessary.
Well.. maybe. Or Maybe not. But Definitely not sort of.
When you as a DBA use anything other than a surrogate primary key, you are making the exceptionally dangerous assumption that the client has the correct understanding of what their model entails, that there will be no exceptions to the rules of that model, and that the model they gave you will never, ever change.
Borrowing from your SSN example, let's say that your client tells you the main way they identify customers is through the SSN, and you go by that. Then there's a case of identity theft and the customer's SSN has to be changed. Now you've got potentially thousands of records with a bad primary key that you have to change (and constraint issues to mitigate as well). What if privacy issues require the company to drop SSNs as an identifier, and the company is now forbidden from asking customers for their SSNs? You'll no longer be able to generate primary keys compatible with the ones used before.
What truly separates a good DBA from a bad one is the good DBA's ability to anticipate change, design for change, insulate existing stuff from change, and basically save the client from any flaws in their own conceptual model (while making it look like they've followed the client's conceptual model to the letter). A bad DBA simply trusts whatever his client says and believes it to be correct and forever immutable.
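The identity-theft scenario above is easy to act out. Here is a small sketch in Python's sqlite3; all table and column names are invented, and the SSNs are deliberately invalid placeholders. It counts how many rows have to be rewritten when one customer's SSN changes under each design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Natural-key layout: the SSN is the key and is copied into every order.
    CREATE TABLE customers_nat (ssn TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE orders_nat (id INTEGER PRIMARY KEY, customer_ssn TEXT);

    -- Surrogate-key layout: orders point at an id that carries no meaning.
    CREATE TABLE customers_sur (id INTEGER PRIMARY KEY, ssn TEXT UNIQUE, name TEXT);
    CREATE TABLE orders_sur (id INTEGER PRIMARY KEY,
                             customer_id INTEGER REFERENCES customers_sur(id));

    INSERT INTO customers_nat VALUES ('000-00-0001', 'Pat');
    INSERT INTO orders_nat (customer_ssn)
        VALUES ('000-00-0001'), ('000-00-0001'), ('000-00-0001');
    INSERT INTO customers_sur VALUES (1, '000-00-0001', 'Pat');
    INSERT INTO orders_sur (customer_id) VALUES (1), (1), (1);
""")

def change_ssn_natural(old, new):
    # The key lives in two tables, so both must be rewritten.
    touched = conn.execute(
        "UPDATE customers_nat SET ssn = ? WHERE ssn = ?", (new, old)).rowcount
    touched += conn.execute(
        "UPDATE orders_nat SET customer_ssn = ? WHERE customer_ssn = ?",
        (new, old)).rowcount
    return touched

def change_ssn_surrogate(old, new):
    # The SSN is ordinary data here; only the customer row changes.
    return conn.execute(
        "UPDATE customers_sur SET ssn = ? WHERE ssn = ?", (new, old)).rowcount

rows_touched_natural = change_ssn_natural("000-00-0001", "000-00-0002")
rows_touched_surrogate = change_ssn_surrogate("000-00-0001", "000-00-0002")

orders_still_linked = conn.execute("""
    SELECT COUNT(*) FROM orders_sur o
    JOIN customers_sur c ON c.id = o.customer_id
    WHERE c.ssn = '000-00-0002'
""").fetchone()[0]
```

With three orders, the natural-key design rewrites four rows and the surrogate design rewrites one. A database with ON UPDATE CASCADE would do the natural-key rewrite for you, but it would still be touching every referencing row.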
Ergonomica Auctorita Illico!
If you do most of the logic in your stored procedures, it makes it easy to bolt on new features written in various languages. If you decide to have a perl script for a cron job, you just call the same stored procedures your ruby app is calling. [...] Once you bury the database logic in the application code, you have to rewrite it for every application. It is, in a way, a very evil form of copy & paste programming.
That's a good point, but I think you're learning the wrong lesson from it.
Yes, duplicating your database schemas across multiple code bases is bad, as it makes it approximately impossible to update things. But pushing all the brains into your database leaves you with four problems. One, you're generally locked into exactly one vendor. Two, the variety and quality of development tools is poorer. Three, if there's some other storage technology you want to use, you're generally screwed. And four, your various code bases are still tied to specific stored-procedure calls, which last I looked didn't have much in the way of protocol versioning.
I think the better option is to follow the OO approach, where you keep behavior and data together. You make yourself a tidy service layer in some convenient protocol like RESTful HTTP with XML or JSON. (Rails has nice support for that, BTW.) Give it a little versioning, so that you can transparently extend your APIs. That gives you the same kind of isolation your stored procedures do, but a much wider range of tools for both client- and server-side development.
In my view, that gives you the same clean boundary and exactly one code base mucking with raw data, but a lot more flexibility over the long haul.
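As a very rough illustration of the "tidy service layer with a little versioning" idea, here is a sketch in plain Python with no web server involved. The `/v1/monsters/<name>` paths, the `handle` function and the in-memory data are all hypothetical; the monster lookup just echoes the stored-procedure example from earlier in the thread.

```python
import json

# Hypothetical data store; only this module is allowed to touch it.
_MONSTERS = {"orc": {"hp": 30, "attack": 5}}

def _monster_v1(name):
    m = _MONSTERS[name]
    return {"name": name, "hp": m["hp"]}

def _monster_v2(name):
    # v2 adds a field; v1 clients keep getting the response they expect.
    return {**_monster_v1(name), "attack": _MONSTERS[name]["attack"]}

_ROUTES = {"v1": _monster_v1, "v2": _monster_v2}

def handle(path):
    """Dispatch a path such as '/v1/monsters/orc'; return (status, JSON body)."""
    parts = path.strip("/").split("/")
    if len(parts) != 3 or parts[1] != "monsters" or parts[0] not in _ROUTES:
        return 404, json.dumps({"error": "not found"})
    try:
        return 200, json.dumps(_ROUTES[parts[0]](parts[2]))
    except KeyError:
        return 404, json.dumps({"error": "unknown monster"})

status_v1, body_v1 = handle("/v1/monsters/orc")
status_v2, body_v2 = handle("/v2/monsters/orc")
```

The Perl cron job and the Windows admin front end would both talk to `handle` (over HTTP in a real setup) instead of calling stored procedures, and the version prefix lets the API grow without breaking either of them.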
It's great to see this sort of report with a well-known major site.
I've done work converting a fairly high-traffic site to rails, but the rails-ness was never advertised because it was irrelevant to the client, and the work was done under NDA so we couldn't write up an article touting the success. After all, they didn't want to tell their competitors how it was that they cut costs, whether that's a justified fear or not.
There are a lot of things I don't like about Rails, but scalability isn't one of them.
They have no freaking idea what they are talking about. I run sites for a network of radio stations that is all rails, and we have the fastest, most stable sites of any competing media in our markets at least. Looking around, I'm always comparing our sites to the big dogs in New York and LA, and we still destroy them on content and speed.
The thing is, it takes a lot of planning and a ton of benchmarking to make it work. Fragment caching, more efficient queries, and not picking up every associated item will get most people 50% there. Most Rails guys seem more interested in getting the flashy crap figured out and don't spend enough time on optimization. They give all Rails devs a bad name because their crap is unstable and slow.
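For readers who haven't used fragment caching, the idea is just "don't re-render a piece of a page that hasn't changed". Here is a toy version in Python; it is not the Rails cache API, and `render_story`, `cached_fragment` and the sample story are made-up names for the sketch.

```python
stats = {"renders": 0}
_fragment_cache = {}

def render_story(story):
    """Stand-in for an expensive template render (hypothetical)."""
    stats["renders"] += 1
    return f"<li>{story['title']}</li>"

def cached_fragment(story):
    # Keying on id plus updated_at means an edited story misses the cache
    # and gets rendered again, with no explicit invalidation step.
    key = (story["id"], story["updated_at"])
    if key not in _fragment_cache:
        _fragment_cache[key] = render_story(story)
    return _fragment_cache[key]

story = {"id": 1, "title": "Morning show", "updated_at": "2007-12-07T09:00"}
first = cached_fragment(story)
second = cached_fragment(story)      # served from the cache
edited = dict(story, title="Evening show", updated_at="2007-12-07T18:00")
third = cached_fragment(edited)      # new key, so it renders again
```

Three requests, two renders. On a busy front page where the same few stories are shown to everyone, nearly every request ends up being a cache hit.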
[logansbro@gmail.com]
You are completely incorrect.
If your domain model describes the way an actor finds an entry is by Order# and Line#, that should in no way, shape, or form decree what your technical artifacts look like.
The correct thing to do in that case is to have a unique, opaque, identity key (numeric or guid, just so long as it's unrelated to the record data, and has no additional meaning beyond the unique value of that record).
Then you can also add unique constraints or indexes to the composite key, and/or you can enforce that unique constraint in the application. Or both, for the smart ones.
But you need to have a unique way to identify the record THAT IS NOT SUBJECT TO CHANGE. In your example, you could re-order the lines, or one line could have been a mistake and you need to move it to a different order.
If you've used composite keys on order# and line#, then you've got a lot of cleanup work to do after your change.
If you've used proper opaque identity keys, then you just change the data, and there are no side effects.
Since in that case your joins are also done on the identity keys, your relationships are stable even when you change order# or line#.
The SSN one is even worse. I can guarantee you that if you do that, someone will have the wrong SSN, and it will need to be changed in your data.
If you've used SSN as the primary key, then it's a pain in the ass, and you have to do data integrity cleanup.
If you've used a proper opaque identity key, then you just change the SSN, and there are no side effects.
This is stuff you learn the first time you write an app as a junior developer without a mentor, and use SSN as a key. A year or two later you come to regret it, and the lesson is learned for a lifetime.
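The order#/line# case described above can be sketched like this in Python's sqlite3: an opaque id as the primary key, a unique constraint on the composite, and joins done on the id. The `order_lines` and `shipments` tables are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_lines (
        id INTEGER PRIMARY KEY,        -- opaque identity key
        order_no INTEGER NOT NULL,
        line_no INTEGER NOT NULL,
        UNIQUE (order_no, line_no)     -- the composite is still enforced
    );
    CREATE TABLE shipments (
        id INTEGER PRIMARY KEY,
        order_line_id INTEGER NOT NULL REFERENCES order_lines(id)
    );
    INSERT INTO order_lines VALUES (1, 100, 1), (2, 100, 2);
    INSERT INTO shipments VALUES (1, 2);
""")

# Line 2 was entered on the wrong order: move it to order 101, line 1.
conn.execute("UPDATE order_lines SET order_no = 101, line_no = 1 WHERE id = 2")

# The shipment joins on the identity key, so it follows the line automatically.
shipment_target = conn.execute("""
    SELECT l.order_no, l.line_no FROM shipments s
    JOIN order_lines l ON l.id = s.order_line_id WHERE s.id = 1
""").fetchone()

# The unique constraint still rejects a second (100, 1).
try:
    conn.execute("INSERT INTO order_lines (order_no, line_no) VALUES (100, 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

The move is a single-row update with no cleanup in `shipments`, and the database still refuses duplicate order/line pairs. Had `shipments` stored (order_no, line_no) instead, the same move would have needed a matching update there too.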
"If you have a scaling problem, you don't have a problem."
Which basically means if you've got 200,000 hits a day and your existing setup is folding under the load, you've found a glory hole on the webbernet. And you're likely to find the funding required to scale the site - even if that means moving to an entirely new technology altogether.
We suffer more in our imagination than in reality. - Seneca
With all indexes, sorted results, and the 20k documents *20 revisions each code set, the numbers are
subquery: 34s
materialized: 3.2s
distinct: 6.9s
is_latest: 300 ms
If anyone cares to try this on Oracle (I doubt it does distinct on(), that's nonstandard), I'd be interested to know if it can automatically convert the subquery into a table. If it does, the results for the "materialized" query and the "subquery" query should be nearly identical. Otherwise, the "subquery" query will take a lot longer as the database looks up the most recent revision for 20,000 documents, one document at a time. Generating the dataset was pretty easy (convert pseudocode to language of choice).
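The original queries and generator aren't shown here, so what follows is only a guess at their shape, written against SQLite through Python's sqlite3 rather than the database the timings came from. The `revisions` table, its columns, and the three query texts are assumptions; `distinct on()` is left out because SQLite doesn't support it, and the dataset is scaled down to 200 documents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE revisions (
        doc_id INTEGER, rev INTEGER,
        is_latest INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (doc_id, rev)
    )""")

DOCS, REVS = 200, 20   # the thread used 20,000 documents; raise DOCS to experiment
conn.executemany(
    "INSERT INTO revisions (doc_id, rev, is_latest) VALUES (?, ?, ?)",
    [(d, r, int(r == REVS)) for d in range(1, DOCS + 1) for r in range(1, REVS + 1)],
)

# "subquery": correlated lookup of the newest revision, one document at a time.
subquery = conn.execute("""
    SELECT doc_id, rev FROM revisions r
    WHERE rev = (SELECT MAX(rev) FROM revisions WHERE doc_id = r.doc_id)
    ORDER BY doc_id""").fetchall()

# "materialized": work out the latest revision per document once, then join.
materialized = conn.execute("""
    SELECT r.doc_id, r.rev FROM revisions r
    JOIN (SELECT doc_id, MAX(rev) AS rev FROM revisions GROUP BY doc_id) m
      ON m.doc_id = r.doc_id AND m.rev = r.rev
    ORDER BY r.doc_id""").fetchall()

# "is_latest": a flag maintained at write time turns the read into a plain filter.
is_latest = conn.execute(
    "SELECT doc_id, rev FROM revisions WHERE is_latest = 1 ORDER BY doc_id"
).fetchall()
```

All three return the same rows, one per document. Timings will depend heavily on the engine, its planner and the indexes, so wrap each query in `time.perf_counter()` calls on your own database before drawing conclusions from the numbers above.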