Slashdot Mirror


PDL 2.4.0: Scientific Computing for the Masses

Dr. Zowie writes "Perl Data Language 2.4.0 was just released; get it here. This release includes even more powerful array slicing, a complete GIS cartography package, API access to the GNU Scientific Library, and a host of other goodies. Between PDL and its less-mature siblings Numeric Python and Octave, the established commercial languages' days appear numbered."

17 of 40 comments

  1. Maple? by infernalC · · Score: 3, Informative

    Maple doesn't get an established commercial languages link?

    BTW, Maxima, Macsyma, etc, is free and has been around for years.

    1. Re:Maple? by pr0ntab · · Score: 2, Interesting

      Well, the author of the article is an idiot anyway, as PDL is basically a Perl interface to LAPACK.
      PDL is not a CAS, DE-solver, or anything like that.

      So Maple's days are not numbered. Go figure! :-)

      --
      Fuck Beta. Fuck Dice
  2. Where is the GIS? by smishra · · Score: 2, Interesting

    I looked through the website but could not find any GIS-related modules. I also did a Google search (gis site:pdl.perl.org), but that too came up blank.

    1. Re:Where is the GIS? by ChrisDolan · · Score: 4, Informative

      Download it. In the tarball is
      PDL-2.4.0/Demos/Cartography_demo.pm

    2. Re:Where is the GIS? by Dr.+Zowie · · Score: 3, Informative

      The GIS stuff is a collection of vector and pixel coordinate-transform methods that allow you to plot, overlay, and compare map data from all the conventional formats. There are some sample global-scale maps of Earth included in the distribution, but they're not especially high resolution since they're mainly intended to demonstrate the capability.

      We designed the GIS code to be useful for planetary, astronomical, and solar work -- it's "just" a set of tools for dealing with sampled and vector data in a spherical system.
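
      To make "coordinate transform in a spherical system" concrete, here is a minimal sketch in Python of one classic map projection (the sinusoidal projection) applied to a list of vector points. This is not PDL's API; the function names and the sample points are made up for illustration.

```python
import math


def sinusoidal(lon_deg, lat_deg, central_lon_deg=0.0):
    """Project (longitude, latitude) in degrees onto the sinusoidal map plane.

    Returns (x, y) in radians on a unit sphere. Illustrative only; this is
    not how PDL names or implements its transforms.
    """
    lat = math.radians(lat_deg)
    # Wrap the longitude offset into [-180, 180) so the map is centred
    # on the chosen central meridian.
    dlon = (lon_deg - central_lon_deg + 180.0) % 360.0 - 180.0
    x = math.radians(dlon) * math.cos(lat)  # meridians squeeze toward the poles
    y = lat                                 # parallels stay equally spaced
    return x, y


def project_path(points, central_lon_deg=0.0):
    """Apply the projection to a sequence of (lon, lat) vector points."""
    return [sinusoidal(lon, lat, central_lon_deg) for lon, lat in points]


# Hypothetical vector data: three (lon, lat) vertices.
path = [(0.0, 0.0), (90.0, 0.0), (90.0, 60.0)]
projected = project_path(path)
```

      The same function works for planetary or solar coordinates, since nothing in it is specific to Earth.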

  3. Re:I'm seeing red by kworthington · · Score: 2, Interesting

    This might help the Editors: I've noticed that these glitches happen when Michael is the editor who posts. I have screenshots if needed. (not trolling, it's an observation. I have nothing against Michael)

  4. R: Open-source statistical language by StatFiend · · Score: 4, Informative

    Another open-source statistical language is R. Its commercial cousin is S-Plus.

  5. Octave v. Matlab by djdead · · Score: 4, Informative

    I use Octave at home to test anything I'm doing for the "Matlab" sections of my homework. And while I think it's a great program and works well, for large computations Matlab is much, much faster. There is one routine in particular that takes about 4 hours to run at home and only 15 minutes to run at school. And no, this isn't because my home machine is a P-MMX 100 and school has 3GHz P-4's. The machines are pretty closely matched.

    --
    -1: flamebait should really be -1: inciteful
    1. Re:Octave v. Matlab by t · · Score: 3, Informative
      Ah you did post the code, I thought you were pulling a SCO for a minute there.

      Looks to me like the problem is the line right above the part you posted:
      nextRedds = zeros(1, x+2*runs);
      The reason this is bad is because you reallocate it slightly bigger at the beginning of every run. This wastes a lot of time in malloc. You should allocate it before the loop at the biggest size necessary, and then initialize it as you already are.

      This part
      redds = [redds; nextRedds(1, a:b)];
      is especially bad because you are forcing Octave to grow your matrix on every loop. You should also preallocate this before the loop. In general, allowing Matlab/Octave to automatically enlarge arrays as needed results in disastrous performance. It is also bad style in general: if one day you decide to code this up in C or something for speed, you would have to rewrite all your allocation code.
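
      The same pattern can be sketched outside Octave. Below is a small Python analogue, not the original poster's routine: the function names and the row contents are invented, and only the two allocation strategies matter. The first function rebuilds the container on every pass, the way `[redds; nextRedds(1, a:b)]` does; the second allocates once and fills by index.

```python
def grow_by_concatenation(num_runs, width):
    """Analogue of `redds = [redds; row]`: builds a new, larger list each pass."""
    rows = []
    for run in range(num_runs):
        new_row = [float(run)] * width
        # `rows + [...]` copies every existing entry into a fresh list,
        # so the total copying work grows quadratically with num_runs.
        rows = rows + [new_row]
    return rows


def fill_preallocated(num_runs, width):
    """Allocate the outer container once, then assign by index."""
    rows = [None] * num_runs  # single allocation at the final size
    for run in range(num_runs):
        rows[run] = [float(run)] * width
    return rows


# Both strategies produce the same data; only the allocation pattern differs.
same = grow_by_concatenation(5, 3) == fill_preallocated(5, 3)
```

      The preallocated version also maps directly onto a fixed-size buffer if the code is ever ported to C.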

    2. Re:Octave v. Matlab by t · · Score: 2, Interesting
      Well your loop seems to have some flaws. You allocate
      nextRedds = zeros(1, x+2*runs);
      But you only use a range of [1:x+1+runs]. Is the rest meant to be zero?

      Regardless, copyRedds for example will at its largest be 2*numIters + x+2*sum([1:numIters]). redds will add to its initial size x*sum([1:numIters]) elements. And nextRedds will at most be x+2*numIters. You should allocate all of these at the max sizes required, and use the necessary indices within the loop instead. e.g.,
      redds = [redds; nextRedds(1, a:b)];
      would be
      redds(length(redds) + [1:x]) = nextRedds(1, runs+[1:x]);

      I have no idea where you got your 1.49GB figure from, as `x' is unknown to me. sum([1:N]) = (N+1)*(N/2), which is about N^2/2, so the 2*sum([1:numIters]) term in copyRedds is about N^2 elements. For N=10k and 8-byte doubles, that's about 3/4 GB of RAM. So I guess your number is plausible. If you don't have that much RAM, then now you know why it runs so slowly: you can't escape the fact that that is indeed how much RAM it will be trying to use at the end of the loop. I'm not sure about Windows, but I think it will continually allocate swap on the local hard drives, which would allow you to run this monstrosity without running out of memory. I'd be willing to bet that the computer at the school has several GBs of RAM; that would explain the enormous speed difference.
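
      The arithmetic behind that estimate can be checked with a short Python sketch. It assumes 8-byte double-precision elements (Octave's default numeric type), and it leaves `x` as a parameter because its real value is unknown here. Only the triangular-number term is evaluated for N = 10k.

```python
BYTES_PER_DOUBLE = 8  # assumption: default double-precision storage


def triangular(n):
    """sum([1:n]) in closed form: n*(n+1)/2."""
    return n * (n + 1) // 2


def copy_redds_elements(num_iters, x):
    """Largest size quoted above for copyRedds: 2*numIters + x + 2*sum([1:numIters])."""
    return 2 * num_iters + x + 2 * triangular(num_iters)


def to_gib(num_elements):
    """Convert an element count to GiB under the 8-byte assumption."""
    return num_elements * BYTES_PER_DOUBLE / 2**30


n = 10_000
dominant_term = 2 * triangular(n)  # roughly n**2 elements
print(f"{dominant_term} elements, about {to_gib(dominant_term):.2f} GiB")
```

      The `x`-dependent terms come on top of this, so the real footprint can only be larger.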

  6. Re:I'm seeing red by kworthington · · Score: 2

    Excuse this reply to my own post, but there's now a post that is also red, and not submitted by Michael.

  7. Since all scientists use Perl by Anonymous Coward · · Score: 3, Funny

    Good thing Perl is a required course in most degree programs for science, otherwise it might not have much of an impact.

  8. scipy by d-Orb · · Score: 4, Insightful

    Well, I don't know about how mature/not mature Scientific Python or Octave are with respect to PDL, but I like Python better and I was used to Matlab in the past anyway.

    At present, I am using Scipy, a nice, more complete version of Numerical Python. Together with IPython, I get a very nice numerical environment. Unfortunately, while Scipy is very nice, it is still a bit of a bleeding-edge product. But it is **very** fast for large array computations. I also like the fact that you can link Fortran routines easily (yes, people still use Fortran, it's useful and easy).

    I also use Octave because I miss the ease of generating plots in Matlab (yes, I could do this with scipy, but somehow, I resort to using Octave). It is a very complete program, with many toolboxes. Given that some of the Matlab toolboxes can also be incorporated, there is a vast array of functions for you to play around with.

    On the other hand, I think that none of the "established languages" are a good comparison. IDL is extremely powerful for Remote Sensing/Image Processing tasks (my area of research). It is simple to use, and a bit of a standard in the field. From the PDL changelog, the cartographic features in PDL amount to no more than transformations... Mathematica is extremely powerful in symbolic Maths, which as far as I can tell, is not what PDL is about. And Matlab is turning into the VB of scientists (at least, it is multiplatform :D)

    Oh well, I'll have to give it a go :-D

    1. Re:scipy by Dr.+Zowie · · Score: 2, Interesting

      There are several device-independent graphics packages for PDL. The main one is PGPLOT, a venerable but powerful package written in FORTRAN in days of yore. PGPLOT has output modules for everything from a PASCAL turtle to (yes) eps. There are interactive devices (X windows and such) and hardcopy devices (PostScript, EPS, gif, jpeg, png, and such).

      What impressed me most about PGPLOT when I started using it is the strong device-independence. For example, it's difficult to say "Give me a 600x400 pixel X window" since pixels aren't device independent. It's much easier to say "Give me a 6-inch by 4-inch X window". Takes some getting used to, but then when you go to stick your output into a publication you can generate the same plot and send it to (say) the eps device instead of the X-windows device.

      The PDL front-end has a record-and-play feature, too, so you can define a PGPLOT X window, noodle around with it, and then say "Replay all that into this hard-copy device" and get exactly the same plot, rendered on the other device.
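
      For anyone who hasn't met the idea, here is a toy Python sketch of device-independent sizes plus record-and-replay. It is not the PGPLOT or PDL interface: the `Device` and `Recorder` classes, their methods, and the dpi values are all hypothetical, chosen only to show why sizes in inches replay cleanly on a different device.

```python
class Device:
    """Hypothetical output device with its own resolution in dots per inch."""

    def __init__(self, name, dpi):
        self.name = name
        self.dpi = dpi
        self.log = []  # what the device was actually asked to draw, in dots

    def to_dots(self, inches):
        return round(inches * self.dpi)

    def open_window(self, width_in, height_in):
        self.log.append(("window", self.to_dots(width_in), self.to_dots(height_in)))

    def line(self, x0_in, y0_in, x1_in, y1_in):
        coords = tuple(self.to_dots(v) for v in (x0_in, y0_in, x1_in, y1_in))
        self.log.append(("line",) + coords)


class Recorder:
    """Forwards calls to one device while remembering them for later replay."""

    def __init__(self, device):
        self.device = device
        self.commands = []

    def call(self, method, *args):
        self.commands.append((method, args))  # stored in inches, not dots
        getattr(self.device, method)(*args)

    def replay(self, other_device):
        for method, args in self.commands:
            getattr(other_device, method)(*args)


screen = Device("screen", dpi=100)    # assumed resolution
printer = Device("printer", dpi=600)  # assumed resolution
rec = Recorder(screen)
rec.call("open_window", 6, 4)         # "a 6-inch by 4-inch window"
rec.call("line", 1, 1, 5, 3)
rec.replay(printer)                   # same plot, rendered at printer resolution
```

      Because the recorded commands are kept in physical units, the replayed plot has the same geometry on the second device without any per-device tweaking.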

  9. Comparisons... by Dr.+Zowie · · Score: 3, Insightful

    Yep, you're right that Mathematica is not a good comparison -- I stuck that in mainly as a reference to the numerical part of Mathematica, but the symbolic stuff is pretty much unmatched (though Maple fans might disagree).

    Much of PDL's development has been motivated by a need for something "like IDL, but more powerful", and I think that's really where PDL shines best: in remote sensing and image processing tasks. It helps a lot that all of CPAN is already present, and that the file I/O and indexing have many fewer "gotchas" than those of IDL. The PGPLOT back-end is great, too, for actual device-independent plotting: how many hours have you spent tweaking your IDL plots to actually print right on the PostScript device?

    It's (IMHO) a Good Thing that we have all three of numpy/scipy, Octave, and PDL: each has a different set of strengths. Ultimately, each group really should use the tool that suits them best (and it shouldn't cost more than the workstation it runs on...). The reason I've more-or-less committed to Perl development rather than Python or Octave is that it has a nice "natural language", expressive feel to it: it's easy to build pipeline-style, imperative-style, or evaluated-style constructs, whichever is most convenient for the current application.

    Of course, the open-source languages have the added benefit that results derived using them are actually reproducible, whereas closed-source languages might conceal irreproducible bugs (in the language or the reduction code) that other groups can't identify.

  10. This is quite true. by pr0ntab · · Score: 3, Informative

    More importantly, recent versions of MATLAB JIT-compile all the functions you run into a VM-bytecode-like thing, whereas Octave is a straight interpreter (AFAIK), so if you use a lot of recursive function calls or iterations and stuff.... heheh, you'll notice the difference right there.

    --
    Fuck Beta. Fuck Dice
  11. A lack of Parallelism by ChaoticCoyote · · Score: 4, Informative

    All of these tools address different aspects of numerical computing. A mixture of languages and tools will generally produce the best results.

    I've been experimenting with a number of scientific programming packages, ranging from traditional languages like Fortran 95 to new developments like SciPy. Of the "new" approaches, I like SciPy the best, given its support for MPI and ease of linking to traditional languages.

    Support for NUMA and SMP architectures is severely lacking in most "free" packages. This may, in some respects, be due to the lack of parallel support on gcc (although there is an effort underway (gomp) to add OpenMP support to gcc).

    Parallelism is important to any large-scale numerical application -- and PDL, as yet, does not appear to support SMP, NUMA, or cluster architectures. I know there are attempts at adding parallel support to Perl, but haven't seen much activity with them.

    GSL does not implement any parallel algorithms; according to this post by Brian Gough, GSL is not designed to support parallelism.