Book Review: R Graphs Cookbook
RickJWagner writes "Once upon a time, I thought communication was one of my strong suits. Alas, a few years into my programming career I realized I'm more of the head-down codeslinging type, not one of the schmoozing managerial types. So when I have a point to make, I really like to have my data ready to do the talking for me. In that capacity, this book is a very good weapon to have in my arsenal." Read on for the rest of Rick's review.
R Graphs Cookbook
author: Hrishi Mittal
pages: 272
publisher: Packt Publishing
rating: 8/10
reviewer: RickJWagner
ISBN: 1849513066
summary: An invaluable reference book for expert R users
Right away, you should realize this is not a book that teaches R. R (an excellent open source statistical language) is a great tool for any technician. I've used it to analyze logs, find performance bottlenecks, and make sense of mountains of nearly unrecognizable data. But this book doesn't teach R; it teaches R graphing.
It turns out R has excellent graphing capabilities. You can draw scatter plots, line plots, pie graphs, bar charts, histograms, box and whisker plots, heat maps, contour maps and 'regular' maps. These are all good for demonstrating data in different ways, and the book lightly explains which graph will help you illustrate which point.
If you're getting a little interested, you'll also want to know that all this graphing can be scripted and scheduled. You can get data-driven reports on a schedule, which is easy once you know how to write the graphing scripts (which are then scheduled using cron or a similar facility). One small caveat: To prepare your data for presentation, I think it's usually necessary to partner R with another language that's better for text extraction and manipulation. I prefer Python for this task; you might like another language.
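As a rough illustration of that data-preparation step, here is a minimal Python sketch that pulls a few fields out of log lines and writes them as CSV for R to pick up. The log format, the field names, and the `log_to_csv` helper are all invented for this example; nothing here comes from the book.

```python
import csv
import io
import re

# Hypothetical log format, invented for illustration:
# "2011-03-01 12:00:05 GET /index.html 200 123ms"
LOG_LINE = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<method>[A-Z]+) (?P<path>\S+) (?P<status>\d{3}) (?P<ms>\d+)ms"
)


def log_to_csv(lines, out):
    """Write date, path, status and response time as CSV rows.

    Lines that do not match the assumed format are skipped.
    Returns the number of data rows written.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["date", "path", "status", "response_ms"])
    count = 0
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue
        writer.writerow(
            [match["date"], match["path"], match["status"], match["ms"]]
        )
        count += 1
    return count


sample = [
    "2011-03-01 12:00:05 GET /index.html 200 123ms",
    "garbage line that does not match",
    "2011-03-01 12:00:09 POST /login 302 48ms",
]
buffer = io.StringIO()
rows_written = log_to_csv(sample, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

In practice you would write to a file instead of an in-memory buffer, and an R script run from cron could then load that file with `read.csv` before plotting.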
The book is exceptionally easy to read and work with. This doesn't mean it's simplistic, though. Anyone who's tangled with R's graphing without a good example will testify that figuring out the various functions and arguments necessary to wrangle a descriptive graph can be really difficult. This book gives you the kind of graphs you need, with the bells and whistles you're going to want, in a series of snippets you can run immediately.
The book is written in Packt's "Recipe" format. In a nutshell, this means that it's a series of how-to sections worded in a templated form. There are headings for sections that inform you what you're going to accomplish, how it's done, and why it worked. You quickly realize it's a repetitive format, but it serves to make the book an excellent resource for quick reference.
Another really nice feature of the book is the downloadable source code and matching data. Knowing the data is half the battle, really. The specific formulas given are certainly useful, but without knowing how the underlying data is formatted you wouldn't get nearly as much practical value. For that reason, I urge anyone using this book to be sure they examine the underlying data for at least the first few formulas. After that, it'll be automatic: you'll know you want to look at that data when you're trying to master some graph type. Then when you go to make your own data ready for graphing, you reach for that secondary language like Python, extract the fields you want in a way similar to your example data set, and presto-- you've got the graph you want.
The book starts out with a first chapter that introduces the kinds of graphs you'll be able to produce and situations where each type is most useful. The next chapters, up until the final one, are in-depth sections on each of the graph types. Maps are treated to a different chapter than pie graphs, for instance. The final chapter covers putting final touches on your graphs, including saving them in different formats (PDF, PNG, JPEG, etc.) and niceties like adding scientific notations, mathematical symbols, etc.
The book states that the target audience is experienced R programmers. I really don't think that's necessary, though. There is an obligatory R installation section, and I think that a reasonably competent programmer with Google at his disposal could get off the ground (for graphing purposes) with this book and a little bumbling. If you already know R, then you needn't worry at all; there is nothing here that will look foreign to you.
If I could change one thing about the book, I'd want a comprehensive index of all the functions and arguments that augment the basic core functions that produce the example graphs. These functions and arguments tweak the basic functions in ways that make the graphs much more appealing than what the basic function alone can provide. But the book isn't able to show each and every combination with each graphing function, so it's up to the reader to figure out how to pick some of the options from one recipe and apply them to another. It's not difficult to do, but having an index to help you find the options you want would make this process easier.
You can purchase R Graphs Cookbook from amazon.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.
How much are they paying you guys to keep putting these Packt reviews up?
...is brought to you once again by the letter Packt and the number RickJWagner.
Breakfast served all day!
I'm more of the head-down codeslinging type, not one of the schmoozing managerial types
Oh really, I wouldn't have known it from your binary description of the world. Because an IT person being cordial with people is "schmoozing."
Keep fighting those geek stereotypes, guys.
This book is listed as available in epub and pdf format at www.packetpub.com.
Or any other spreadsheet program.
Now of course I admit that Excel is probably not as flexible as R. However, unless your job is to produce stunning, tailor-made graphs, a spreadsheet application will deliver results a lot faster.
Nobox: Only simple products.
Recursive Slashdot spamming... nice one!
...Don't forget the Amazon-affiliated link for the book. So /. gets a cut of the book sales as well.
R makes great graphs functionally speaking, but without mucking about with the options and some post-processing they are not the most attractive. Open up your favorite financial/data intensive news source and look at the visuals and you'll find that generating that style with just code is fairly difficult.
Until about Office 2007, the defaults in Excel charts were also atrocious. Openoffice.org is still pretty bad, and Matlab is not much better than R. The good news is that you can generate PDFs from each of these and easily open them in Inkscape/Illustrator, where making vector-based edits is easy.
Anyone who regularly visualizes data needs to pick up resources on how to clearly organize and display that data, like "The Visual Display of Quantitative Information" by Edward Tufte (though some of his examples are a little dated). Books like that are full of examples that would be very tricky to replicate without any post processing, because they usually involve eliminating excessive lines and cluttering detail.
"The universe seems neither benign nor hostile, merely indifferent." --Carl Sagan
Just from examining the few preview pages on amazon.com, this book appears to be far too basic for anyone who has actually done any serious work with R. I personally would forgo this entire book, and spend the time wandering through the R Graph Gallery which has far more examples with source code and underlying data. It's also rather odd that this book doesn't cover ggplot, grid graphics, lattice, or any of the more commonly used tools in advanced R graphics.
Perhaps this book could be useful as your first foray into graphing with R... but I'm unconvinced it even covers that well.
http://www.donarmstrong.com
Awesome. Thanks.
Quote "Anyone who's tangled with R's graphing without a good example will testify that figuring out the various functions and arguments necessary to wrangle a descriptive graph can be really difficult. "
he really is not management material; the 1st thing a manager would ask is, since I'm paying for your time, is there a software tool that is easy to use....
when you spend a lot of time mastering a difficult language or tool, it doesn't mean you are smart and should impart your knowledge to others: it means you are dumb and should have looked for a simpler tool
"God, you're worst type of smug: Smug while blatantly wrong."
Anonymous Coward a Republican?
nuff said! - try it!
Let's just be careful we are not overly reliant on pure data in the first place. Or you become susceptible to these (http://pastebin.com/p2HfGx1L) techniques. P.S. Sorry for the pastebin link, but it looks like Venkat took down his online email archives...
I've even downloaded it from gigapedia already!
A++ Would pirate again!
Graphs in R are fiddlier and uglier than they need to be. ggplot2 makes it a lot easier (both to create graphs and to manipulate the data behind them). It's based on Tufte's ideas, and lets you put a huge amount of data in plots, cleanly.
Yorick is seriously better than anything else, and no need to "... reach for that secondary language like Python, extract the fields you want in a way similar to your example data set..."
Yorick is a c-syntax interpreted language that can manage huge datasets, parse and convert arbitrary text or binary formats, and produce beautiful graphs. And it's *fast*!!
free and open source.
yorick.sourceforge.net/
Hadley Wickham, author of ggplot2, is a prolific contributor of R modules. His documentation is fairly good, yet of the somewhat harried variety. You can get yourself quite lost by the amount of argument inheritance, which in R is entirely unlike tea. The book needs about 50% more material added, by someone who understands generic programming, stating precisely what operators are required for each argument passed into the ggplot hairball.
Hadley also indulged in some proscriptive urges. One was not to provide any form of pie chart, not even in 3D. This one I heartily endorse. The other was to make it difficult to put distinct vertical scales on both sides of the graph. He believes this can be used to create misleading graphics.
Unfortunately, he hasn't done many engineering graphics where you might want to put mA on one side and mW on the other (assuming a fixed supply voltage). There are many, many cases in engineering where you would like your graph to sport two distinct (yet fundamentally equivalent) sets of vertical scales. The root of all evil is premature proscriptivity. The one true simplicity in information technology is compositionality, and Hadley himself is one of the foremost practitioners in the way he architected the ggplot layering system.
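To make the mA/mW case concrete: with a fixed supply voltage, the two scales are related by P = V × I, so every tick on the current axis has exactly one matching tick on the power axis. The sketch below computes that pairing in Python; the 5 V supply, the tick values, and the function name are assumptions picked for illustration, not anything taken from ggplot2.

```python
SUPPLY_VOLTS = 5.0  # assumed fixed supply voltage, chosen for illustration


def milliamps_to_milliwatts(milliamps, volts=SUPPLY_VOLTS):
    """P = V * I; with I in mA and V in volts, P comes out in mW."""
    return volts * milliamps


# Hypothetical tick positions for the left-hand (current) axis.
current_ticks_ma = [0, 50, 100, 150, 200]

# Matching labels for the right-hand (power) axis.
power_ticks_mw = [milliamps_to_milliwatts(ma) for ma in current_ticks_ma]

for ma, mw in zip(current_ticks_ma, power_ticks_mw):
    print(f"{ma:>4} mA  <->  {mw:>6.0f} mW")
```

Because the mapping is a fixed linear rescaling, the second axis relabels the same curve and adds no independent data series.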
I love ggplot, but I had to scribble madly in the margins of the first half of the book before I leveraged the power and ... uh ... convenience. Convenience paid for in blood, but well worth the price.
I also have the Lattice book, but have done less with it. It's a far more traditional approach. ggplot also does lattice graphics, in its own peculiar way.
With ggplot, when you get into faceting, you'll find yourself reading section 9.2 "Converting data from wide to long". This is nearly as fundamental to the ggplot architecture as the equivalence between pointers and arrays to the C language, yet it's buried in chapter nine a la harried documentation.
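For readers who have not met the wide-to-long idea, the following Python sketch shows only the shape of the transformation: one row per observation with several measurement columns becomes one row per (observation, variable) pair. The sample data, column names, and the `wide_to_long` helper are hypothetical; in R you would do this inside R itself, not in Python.

```python
def wide_to_long(rows, id_column, variable_name="variable", value_name="value"):
    """Turn each non-id column of each row into its own (id, variable, value) row."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue
            long_rows.append(
                {
                    id_column: row[id_column],
                    variable_name: column,
                    value_name: value,
                }
            )
    return long_rows


# Wide form: one row per month, one column per measured quantity (made-up data).
wide = [
    {"month": "Jan", "cpu": 40, "disk": 70},
    {"month": "Feb", "cpu": 55, "disk": 72},
]

# Long form: one row per (month, variable) pair.
long_rows = wide_to_long(wide, "month")
for row in long_rows:
    print(row)
```

In the long form, the `variable` column is what a plotting layer can map to color or to a facet, which is why the reshaping step matters so much for faceted plots.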
If you're not going to learn the equivalence in C between arrays and pointers, why bother? I'd say the same is true with ggplot. You had better get a grip on splorping your data frames with plyr, or why bother.
If you love data, the great thing about R (compared to Excel) is that you get to play in Myhrvold's kitchen without having to invest $10m. Even with the heavy machinery at hand, eventually in R, simple things are simple again, if you stick with it.
I should also add that I sometimes exploit the new-fangled ability to write inline C++ code in R. Where I used to do stand-alone applications in C++, these days I almost always use R as my graphing front end.
Another thing: I found on Hadley's website an Amazon wish list which included "The Flavour Bible". I bought this book after finding it there, and I love it. I then recommended it to some other foodies, and some of them report back that it has become their most used cookbook. Other people complain that it's just a list of lists, but not the kind of people in my circles.
Incidentally, ggplot supports both spellings "color" and "colour" for all colour arguments. I have to give Hadley a pass in the greater scheme of things for thwarting my mA/mW dual axis ambitions. But I sure hope he doesn't do it again.