Statistical Programming With R
An anonymous reader writes "This series introduces you to R, a rich statistical environment, released as free software. It includes a programming language, an interactive shell, and extensive graphing capability. What's more, R comes with a spectacular collection of functions for mathematical and statistical manipulations -- with still more capabilities available in optional packages."
I've heard good things about R, but have never really got to grips with it (although I know it has been around for a while), so any kind of primer is more than welcome as far as I'm concerned.
From the "What is R?" page:
R is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language.
So, R came from S; that must mean that R++ is coming up next! :-)
We have a few people using R around here, mainly in the backend of CGIs to produce graphs of various things. The main problem? If you want to output to a JPG or PNG (like, to display the result in a webpage), R has to create a window in X, draw onto the window, and then take a snapshot of the window. What does this mean on a headless Sun machine? We get to run a virtual X server solely for our R CGIs. Bloody hell, it's a stupid implementation of a crappy language.
</cranky old man>
SAS, Minitab, hell, anything is better than SPSS. Now to include R.
It's time I weaned myself off Excel, particularly since the other day I couldn't even create a histogram because my dataset has more than ~64,000 data points, which is apparently Excel's limit. Does anyone in the community know of a good replacement for Excel that scales well to many data points but also has some sort of user interface, so that I can do some visual manipulations if I want to? I understand that most of these packages come with their own interactive shell and languages, but I would like to have both the command-line interface and a visual interface (like that of Excel), while still being able to scale to many data points. Any suggestions?
There are Python bindings. They are here. Enjoy!
but not at all easy to learn. It's not that the programming is hard (although it is; it is a functional language, which takes a while to get your head round), but the documentation is aimed at fairly high-level stats boffins.
But... ANYTHING is better than SPSS.
dave
What's more, R comes with a spectacular collection of functions for mathematical and statistical manipulations...
I can see that this package will be quite popular with political campaign managers.
Don't blame me, I didn't vote for either of them!
You mean I missed out on all of the languages from "D" through "Q"?
Better hit the O'Reilly books. I have a lot of catching up to do!
Does anyone know where R really fits into the grand scheme of things? The only other language mentioned in the R FAQ was S... there were no comparisons with Octave or any commercial products like Matlab or IDL. So, what does R really do for you besides being another analysis and visualization project?
For people who have never taken real stat classes in college (or never learned it on their own) R will seem like a useless language. Most other languages can handle basic statistics computations.
Statistics is a whole lot more than means and averages. When I took my first real stat class, everything I knew about statistics was literally covered on the first half of the first page. I was totally blown away by what you could do with statistics.
R is for hardcore stat folk who know a bit about programming, not programmers who need to do a little basic computation.
It's difficult to evaluate all the various statistical packages. I would love to read some sort of comparison among the various packages, both commercial and free. For instance, which packages have some sort of GUI? What types of programming languages does each one use? How does each one scale? Is there a particular feature that separates a particular package? Anyone?
Try "help('postscript')", "help('png')", and "help('jpeg')".
Output to different graphics devices has been in S, Splus, and R for as long as I can remember (and that's a long time). Maybe you should try having a look at the copious documentation for R; the documentation, like the system itself, is free.
Does anyone have any insight on how this differs from octave?
This is the first I've heard of R, but I've tried using Octave a few times. It seems to be a sort of enhanced gnuplot. I was thinking about using it for a project I'm working on, though I may just stick with good ol' C for performance.
Do any of these projects work well with sparse matrices? I'm interested in using them to run a PageRank-like computation, but not if they use n^2 memory.
-jim
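On the sparse-matrix question, here is a minimal sketch in plain Python (not R or Octave) of a PageRank-style power iteration that stores only the links that actually exist, so memory grows with the number of edges rather than n^2. The function name `pagerank`, the damping factor of 0.85, the iteration count, and the four-page graph are all illustrative assumptions, not anything from the post above.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power iteration over an adjacency list (dict: node -> list of targets)."""
    # Collect every node, including ones that only appear as link targets.
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is spread evenly.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        # Only existing edges are visited; no n-by-n matrix is ever built.
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page graph, made up for this example.
example_links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(example_links)
```

The same idea carries over to any environment with a sparse or adjacency-list representation: each iteration touches every edge once, and the only dense storage is one rank value per node.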
...just a few days after Talk Like a Pirate Day. R!
R is really a beautiful language, for its purpose. It has a very nice correspondence with math and code, and for most parts of "hard" science, that's really important.
Compared to MATLAB, you can easily write R code 5 times as compact as MATLAB code, and still get more understandable code.
Employee of Inrupt, Project Release Manager and Community Manager for Solid
I use JMP-IN; it's a GUI stats package which does all sorts of stuff. It's made by the folks over at SAS, who are well known for their industrial-strength stats packages. JMP covers everything from Student's t-tests, ANOVA, and Kaplan-Meier survival curves through non-parametric analysis and more. Very easy to use, with a lot of functionality, an easy-to-use GUI, and nice-looking graphical output. Accepts Excel spreadsheets and more. I got it with an academic discount for about $70 US. I use it for medical statistics for publication.
..........FULL STOP.
There used to be a language called Minitab that I used in my second statistics course years yonder. I wonder what happened to it. It sorely needed a GUI for table browsing.
Table-ized A.I.
Other tools which I have come across, but haven't really worked with: Axiom (symbolic computations, CAS); Scigraphica (graphing); opendx (data explorer + visualization).
I've actually never really used R (by the time I came across it, I was done with my physics labs), so I can't really compare any of the others to it. But it definitely looks like one of the tools that I should add to my suite.
I use both Lisp and Maxima, and I fail to see the strangeness in Maxima's syntax. In any case, the exposed syntax isn't anything Lisp-like.
Try Corewar @ www.koth.org - rec.games.corewar
"2) Data languages (SAS, SPSS): The basic object here is a dataset with variables. Inverting a data matrix here is essentially a meaningless concept, and would be extremely difficult to do"
Not sure if this meets your definition, but I've been using SAS for boocoo years and can tell you that it has a "TRANSPOSE" facility explicitly for making columns into rows & vice versa.
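For what it's worth, transposing and inverting are different operations, and a tiny sketch makes that concrete. This is plain Python rather than SAS or R; the helper names `transpose` and `invert_2x2` and the sample matrix are made up for illustration.

```python
from fractions import Fraction


def transpose(matrix):
    """Swap rows and columns."""
    return [list(row) for row in zip(*matrix)]


def invert_2x2(matrix):
    """Inverse of a 2x2 matrix via the determinant formula."""
    (a, b), (c, d) = matrix
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular, no inverse exists")
    det = Fraction(det)  # exact arithmetic, avoids floating-point rounding
    return [[d / det, -b / det], [-c / det, a / det]]


m = [[1, 2], [3, 4]]
flipped = transpose(m)   # rows become columns
inverse = invert_2x2(m)  # multiplying m by this gives the identity matrix
```

Transposing only rearranges the values that are already there, which makes sense for any dataset; inverting computes new values and is only defined for square, non-singular matrices.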