R 3.0.0 Released
DaBombDotCom writes "R, a popular software environment for statistical computing and graphics, version 3.0.0 codename "Masked Marvel" was released. From the announcement: 'Major R releases have not previously marked great landslides in terms of new features. Rather, they represent that the codebase has developed to a new level of maturity. This is not going to be an exception to the rule. Version 3.0.0, as of this writing, contains only [one] really major new feature: The inclusion of long vectors (containing more than 2^31-1 elements!). More changes are likely to make it into the final release, but the main reason for having it as a new major release is that R over the last 8.5 years has reached a new level: we now have 64 bit support on all platforms, support for parallel processing, the Matrix package, and much more.'"
For someone who can't afford the license fees for SAS or Matlab, this is the best alternative out there. And in some cases it's the better alternative.
It's not well known, but R's accessibility support is far better. Here is an example from a paper accepted in The R Journal:
Statistical Software from a Blind Person's Perspective
A. Jonathan R. Godfrey
http://journal.r-project.org/accepted/2012-14/Godfrey.pdf
pie(c(85,15),init.angle=25,col=c("yellow",1),labels=c("pacman","not pacman"))
Used R in my thesis research a few years ago ... it was such a blessing to have a statistical package I could have on *my* computer! Thanks fellas!
yes, you need to rebuild all of them: update.packages(checkBuilt=TRUE,ask=FALSE)
64-bit support! Trivial inefficient parallelization! Matrices!
Welcome to 2001.
Buck lease && nun b sun
I was recently asked to make an interactive tutorial in R, which sounded like a fun project, except I have no clue about R. Are there any good starting points for learning R?
As in: when they release it, you can trust it to work.
Hence they didn't mess around with major reconstruction of R's guts until they could release something that's finished (and well-tested!), and they bumped the version number to 3.0.0 when they did, in order to properly differentiate it from previous versions.
This is one of the differences between amateur OSS offerings (like for example KDE with its myriad half-baked Kxxx packages, sundry horrible OSS games, etc.) and genuine production-quality OSS (like R, Lapack, Octave, LibreOffice, PostgreSQL, MySQL, GRASS GIS, QGIS, MariaDB, GNU CC, the Linux kernel etc.)
This is very gratifying as R happens to see widespread use in academia, government and business when it comes to data analysis and statistics.
If R has a weakness, it is that it uses an in-memory approach to data processing, unlike e.g. SPSS, which keeps almost nothing in memory and simply makes passes through data files whenever it needs something. R is also a bit memory-hungry, so the need for genuine 64-bit implementations should be clear.
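To make the contrast concrete, here is a rough sketch of the "make passes through the file" style done by hand in R, next to the usual load-everything approach. The helper name chunked_mean and the one-number-per-line file layout are made up for illustration; this is not how SPSS does it internally, just the general idea.

```r
# Hypothetical helper: mean of a text file with one number per line,
# read in chunks so only `chunk_size` values are in memory at a time.
chunked_mean <- function(path, chunk_size = 1000L) {
  con <- file(path, open = "r")
  on.exit(close(con))
  total <- 0
  n <- 0
  repeat {
    lines <- readLines(con, n = chunk_size)  # continues where the last read stopped
    if (length(lines) == 0L) break           # end of file
    values <- as.numeric(lines)
    total <- total + sum(values)
    n <- n + length(values)
  }
  total / n
}

# Demo on a small temporary file
tmp <- tempfile(fileext = ".txt")
writeLines(as.character(1:10000), tmp)

streamed  <- chunked_mean(tmp, chunk_size = 500L)   # streaming pass
in_memory <- mean(as.numeric(readLines(tmp)))       # the usual in-memory way

unlink(tmp)
```

Both give the same answer; the difference is that the streaming version never holds more than one chunk at a time, at the price of having to write the bookkeeping yourself.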
Apart from sporting about 4000 useful and ready-to-run statistical applications packages, R has convenient and efficient integration with C code and has what's probably a contender for the best support for data-graphics anywhere.
For those who didn't know, even packages like SPSS and SAS have incorporated R interfaces to tap into the wealth of application packages that R offers. Can't think of a more significant compliment right now.
Even I have used R in the past for my thesis. My statistician was using S-Plus to do magical things that the hospital's SPSS definitely could not do.
However, S-Plus was not available to us non-statisticians.
As a complete non-programmer and mediocre statistician, I was able to reproduce and build upon his examples in R.
But what I truly missed was a usable GUI. There were some, and I tried them all at the time, but none were able to do more than the basics. For someone using R daily, a GUI will be more trouble than it's worth, and limiting. But for someone like me, a well-developed GUI like the one S-Plus had at the time would have been more than welcome.
Seeing the headline R 3.0.0, the first thing I looked for: did they include a GUI by default???
Why are other people's sigs always more witty ???
"Covers R version 3.0"
If you just use R to run data through a package (which in my opinion is the quickest way to get a lot of value out of R) then the learning curve is tolerable. Less steep than for SAS (I think), but steeper than for SPSS.
On the other hand: R in and by itself is mostly a tool for statisticians and data analysts (or anyone else who doesn't flinch at having to write scripts, who's acquainted with the phenomenon of the 'manual', and who's used to spending a few hours or so reading before they try to do anything). That in itself represents a barrier.
I've found the on-line R documentation mostly unhelpful for beginners (thorough but pedantic, often implicit, and tending to use jargon). The offline 'Introduction to R' is a lot better though, and there are some good user-contributed texts that can be freely downloaded. I agree that it's useless to buy a book on the actual language (be that S or R) because as a beginner you will only use R's ready-made functionality and script that. If you find yourself delving into the language you're probably doing something wrong (for a beginner). Your best bet is to buy one of the 'cookbooks' for R.
I tried to use it for an undergraduate statistics course in conjunction with Excel, using the RExcel package and Rcmdr.
The RExcel package establishes a COM link between MS Excel and R and comes with an Excel plugin that creates Rcmdr menus in Excel. The net result is that people can load, view, and edit their data in MS Excel, open the menu, send the data to R, do menu-driven analyses in Rcmdr, and bring results back into Excel if required.
It was less than a success. Students stumbled over having to realise that you have to send the data to R before the menu options take effect, had difficulty keeping track of where their 'live' data actually was (Excel or R), and on top of that had difficulty remembering where to look for the menu options.
Yes, I know. Well ... they were business school students but still, eh?
I believe that R Commander can work for an introductory course, provided you match the content of the course exactly to the R Commander menu or vice versa. Your students will be a bit hemmed in afterwards: they'll be able to replicate the stuff you prepared for them, but as soon as they try anything else they will have to sit down, think, and spend time figuring out how to use the software.
A couple of years ago I ran into SAS at a trade show. It really surprised me that they were still around; I'd previously seen their products on mainframes back in the late 70s, with punch cards. (I forget by now whether I'd used SAS or SPSS, which were the two competing commercial stats packages in that environment.)
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
Once in a while I read comments like
"I recently switched my scientific programming from R to Python with NumPy and Matplotlib, as I couldn't bear programming in such a misdesigned and underdocumented language any more. R is fine as a statistical analysis system"
"Python + Numpy/Scipy is such a better alternative now it's not even funny. It's actually a real language, and has loads of packages."
Interestingly, these comments come from practitioners, not language theorists, who tend to have far more appreciation for language trade-offs (see, e.g., http://r.cs.purdue.edu/pub/ecoop12.pdf). I believe this is usually due to poor knowledge of R, and sometimes poor understanding of language design in general. I would like to clarify a few points here:
1. R is not a DSL or a "statistical language". It's a general-purpose, Turing-complete language, with which you can write pretty much anything. R's byte-compiler is written in R. The ability of a language to be written to a large degree in itself is usually a sign that the language is not so flawed.
2. R has deep roots in functional programming and in Scheme. It was not written by statisticians who did not know language design (another widely held misconception). Luke Tierney wrote an entire Lisp for statistics before becoming a core contributor to R. That was 25 years ago.
3. R is not a *perfect* language, but neither are languages like Python or . Python for example has a ton of syntactic sugar (good for me, but bad for Alan Perlis), a bolted-on object model (OOP purists like to diss it), reluctant functional programming (which is really neither meat nor fish), and concurrency that is as wonky as it is in R. But that is really beside the point. Perfect languages are extremely useful to drive the design of future imperfect ones. To a large degree, language doesn't matter. It's what you can do with it that matters.
4. R is an *amazing* language for data analysis. I have a fairly good knowledge of other substitute languages (including Matlab, Python and Fortran), and nothing comes close to it. If you don't think so, it's fine. Just don't denigrate something you probably don't know enough to appreciate.
5. The statement, popularized by John Cook I believe, that "I’d rather do math in a general language than do general programming in a mathematical language" underlies much of the objection to R usage. Let's unpack this statement. It really says that the cost of doing general programming in a technically oriented language like R is higher than the cost of doing mathematical programming in a language like Python. This is of course an empirical statement. It doesn't make sense to write a web server in native R, but neither does it make any sense to write an operating system or a DBMS in Python. But a lot of the general programming tasks faced by data analysts are relatively simple: read/write files or a DB; serve web pages on a low-traffic server; be called by a larger program. For all these things R just works (TM). Conversely, there are hundreds of very specialized, very well thought-out packages in R, written by really, really good people, that simply have no counterpart anywhere. The cost of replacing them is very steep: either do some dumbed-down analysis or roll your own. On the other hand, Python or Clojure or Ruby, even Java!, have sufficient capabilities to run a linear regression and produce a plot, so no need to complicate things.
My larger point is, again, that languages don't matter as much as people like to believe (the same applies to editors, btw). Principles matter. Availability of what you need matters. *Choose the language of least resistance*: for data analysis of your bank account on a weekend, you can use Excel. To run a highly customized simulation on Blue Gene/L, you *must* use Fortran. In this light, be happy for what R can do for you, and rejoice in this major new release. If you are using moderately sized data sets (2^31 elements is actually a good reference point) and need quick access to state-of-the-art statistical routines and visualization, maybe R is just all that you need.
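For anyone who wants to check points 1 and 2 at the prompt, here is a minimal sketch of the Scheme heritage: closures and functions as ordinary values, using only base R. The names make_counter and compose are invented for this example.

```r
# A closure: make_counter() returns a function that keeps private state.
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1   # updates `count` in the enclosing environment
    count
  }
}

counter <- make_counter()
counter()
counter()

# Functions are values: build a new function out of two others.
compose <- function(f, g) function(x) g(f(x))
add_one_then_square <- compose(function(x) x + 1, function(x) x^2)
squares <- vapply(1:3, add_one_then_square, numeric(1))

# Higher-order helpers that ship with base R
evens <- Filter(function(x) x %% 2 == 0, 1:10)
total <- Reduce(`+`, 1:10)
```

Nothing here is statistics-specific, which is rather the point.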
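To put a number on that 2^31 reference point, here is a back-of-the-envelope sketch. It assumes 8 bytes per double and 4 bytes per integer and ignores R's small per-object header; the variable names are mine.

```r
# The pre-3.0.0 ceiling on vector length is the largest R integer.
old_limit <- .Machine$integer.max
stopifnot(old_limit == 2^31 - 1)

gib <- 2^30
double_gib  <- (2^31 * 8) / gib   # memory for 2^31 doubles, in GiB
integer_gib <- (2^31 * 4) / gib   # memory for 2^31 integers, in GiB

# Sanity check on a vector that actually fits: about 8 bytes per element.
bytes_1e6 <- as.numeric(object.size(numeric(1e6)))
```

So a single numeric vector at that size is about 16 GiB before you have made any copies of it, which is why long vectors only make sense together with 64-bit builds.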
-gappy