Slashdot Mirror


User: JohnDaschbach

JohnDaschbach's activity in the archive.

Stories
0
Comments
1

Comments · 1

  1. R for current productivity on Ask Slashdot: Switching From SAS To Python Or R For Data Analysis and Modeling? · Score: 1

    I use R every day, but I have used NumPy, SciPy and related tools in the past and still do on occasion. The package and documentation system in R is excellent. Good packages come with a vignette with examples that lets you quickly get up to speed with a new package, and all packages must be documented to be accepted at CRAN. A quite impressive variety of statistical and modeling packages is available. There are multiple graphics packages (although I find the standard one the most useful). I find the R library system excellent, especially when incorporating Fortran or C code you have written or obtained. For an old Lisp programmer, R makes perfect sense. The basic, most flexible data structure in R is the list, with the useful feature that you can access elements by index (list) or key (hash) [a common idiom in Lisp].

    On the downside, standard R is all in memory, so big data is not built in. A commercial company, Revolution Analytics I think is the name, has a big data version of R, and there are big data packages for some domains. Python support for big data is more extensive. The database interfaces are workable, and the netcdf4 interface is quite usable, but if you need a lot of fast, flexible, external data access I'm certain Python is a much better choice. The object models, S3 and S4, are different from Python's and are more familiar to users of CLOS, but if you're going to develop a large in-house package, Python is a far better language.

    As always, the answer depends on your needs and expertise. Personally I find that R, Fortran, C, C++ and Perl make an excellent environment for data analysis and modeling that lets me explore statistical ideas very efficiently, but that is very dependent on my background and needs. If I knew that the domain I was working in was reasonably narrow, and I wanted to develop larger-scale packages instead of doing varied data analysis, I would choose Python. As an example, RStan and PyStan seem to stay pretty synchronized, but Python has Theano.

    If your interests are more oriented to statistical learning algorithms, Python is the better choice, but if you want to easily use a wider array of Bayesian statistical analyses on data, R may be the better choice.
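
    For anyone coming from the Python side, here is a rough sketch of the index-or-key list access described in the comment. It is only an approximation, not how R implements lists: the `NamedList` class and the `fit` example values are made up for illustration, and positions here are 0-based while R counts from 1.

```python
class NamedList:
    """Hypothetical helper: ordered items reachable by position or by name."""

    def __init__(self, **items):
        # Plain dicts keep insertion order (Python 3.7+), so positions are stable.
        self._items = dict(items)

    def __getitem__(self, key):
        if isinstance(key, int):
            # Positional access, 0-based (R would use 1-based indices).
            return list(self._items.values())[key]
        # Name-based access, like looking up a key in a hash.
        return self._items[key]

    def names(self):
        # Element names in insertion order.
        return list(self._items)


# Illustrative values only, loosely shaped like a fitted-model result.
fit = NamedList(coef=[1.5, -0.3], sigma=0.8, n=42)

print(fit[0])        # first element, by position
print(fit["sigma"])  # same container, by name
print(fit.names())
```

    The same object answers both kinds of lookup, which is the convenience the comment attributes to R lists.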