Slashdot Mirror


Ask Slashdot: Statistical Analysis Packages For Libraries?

HolyLime writes "I'm a librarian in a small academic library. Increasingly the administration is asking our department to collect data on various aspects of our activities: classes taught, students helped, circulation, collection development, and so on. This is generating a large stream of data that is making it difficult, and time consuming, to qualitatively analyze. For anything complicated, I currently use Excel, or an analogous spreadsheet program. I am aware of statistical analysis programs, like SPSS or SAS. Can anyone give me recommendations for statistical analysis programs? I also place emphasis on anything that is open source and easy to implement since it will allow me to bypass the convoluted purchase approval process."

3 of 146 comments

  1. R or WEKA ... Wait, What Exactly Are You Doing? by eldavojohn · · Score: 5, Informative

    R is my personal favorite, but you're going to have to get down and dirty with some high-level programming (scripting). Check out the data import package (you would probably export your spreadsheets to flat txt files and import them, although the functionality is ever increasing). There's no user interface in this suggestion ... what there is, however, is a massive collection of packages for statistical analysis. Very well maintained, constantly updated and ever expanding.

    The other suggestion has a better GUI but is really heavyweight. WEKA has helped me time and time again perform advanced statistical calculations on data sets, and it's written in Java so it runs on just about anything. Their interface occasionally improves too: they now have an explorer that I use to prep data and remove outliers/null data (don't worry, this isn't climate data). It's well documented.

    These (probably) require an intermediate data transformation step but are open source and extensively supported. Any examples of what you wanted to do? Simple stuff like standard deviation or complex stuff like principal component analysis (PCA)? I guess if it was just simple stuff, that'd be built into Excel, right? Maybe your problems are simple enough to just need a good macro writer to tackle? Whatever happens, good luck!
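
To make the "export to a flat file, then compute simple stuff like standard deviation" workflow concrete, here is a minimal sketch in Python using only the standard library. It is not R or WEKA, and the column names and numbers are invented for illustration; a real export would be read from a file rather than an inline string.

```python
import csv
import io
import statistics

# Hypothetical sample standing in for a spreadsheet exported as CSV.
# The column names and numbers are invented for illustration only.
SAMPLE_CSV = """month,classes_taught,students_helped
Jan,4,120
Feb,6,150
Mar,5,135
Apr,7,165
"""


def summarize(csv_text, column):
    """Return (mean, sample standard deviation) for one numeric column."""
    rows = csv.DictReader(io.StringIO(csv_text))
    # Skip blank cells so an empty spreadsheet cell does not break float().
    values = [float(row[column]) for row in rows if row[column].strip()]
    return statistics.mean(values), statistics.stdev(values)


if __name__ == "__main__":
    for name in ("classes_taught", "students_helped"):
        mean, sd = summarize(SAMPLE_CSV, name)
        print(f"{name}: mean={mean:.2f}, sd={sd:.2f}")
```

For a real export, replacing `io.StringIO(csv_text)` with an opened file handle should be the only change needed, assuming the file has a header row.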

    --
    My work here is dung.
  2. PSPP by Geste · · Score: 5, Informative

    Look at the free SPSS work-alike PSPP. http://www.gnu.org/software/pspp/ Sounds like R might be a bit much for your needs.

  3. Do NOT stick with Excel by Anonymous Coward · · Score: 5, Informative

    Excel and other spreadsheets suck at stats:

    * Burns, P. (2005). Spreadsheet Addiction.
    * Cryer, J. (2001). Problems with using Microsoft Excel for Statistics (PDF).
    * Pottel, H. (n.d.). Statistical flaws in Excel (PDF).
    * Practical Stats (n.d.). Is Microsoft Excel an Adequate Statistics Package?
    * Heiser, D. (2008). Errors, faults and fixes for Excel statistical functions and routines.

    For a more comprehensive and technical discussion, see the papers by Yu (2008); Yalta (2008); and McCullough & Heiser in Computational Statistics and Data Analysis 52(10).
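
As a general illustration of how a statistics routine can go wrong numerically, here is a small Python sketch comparing the textbook one-pass variance formula with a two-pass version. This is an assumption-laden example of floating-point cancellation in general, not a claim about how any particular Excel version computes its functions; the data values are made up.

```python
import statistics


def naive_variance(xs):
    """One-pass 'calculator' formula: (sum(x^2) - (sum x)^2 / n) / (n - 1).

    Subtracts two huge, nearly equal numbers, so it can lose
    essentially all significant digits.
    """
    n = len(xs)
    total = sum(xs)
    total_sq = sum(x * x for x in xs)
    return (total_sq - total * total / n) / (n - 1)


def two_pass_variance(xs):
    """Compute the mean first, then sum squared deviations from it."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


# Made-up data: small spread riding on a very large offset.
# The exact sample variance is 30.
data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]

if __name__ == "__main__":
    print("naive one-pass:", naive_variance(data))
    print("two-pass:      ", two_pass_variance(data))
    print("statistics:    ", statistics.variance(data))
```

On this data the two-pass version and `statistics.variance` return 30.0, while the one-pass formula does not: the squared values are around 1e18, where double-precision numbers are spaced more than 100 apart, so the final subtraction is dominated by rounding error.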