Slashdot Mirror


No, It's Not Always Quicker To Do Things In Memory

itwbennett writes: It's a commonly held belief among software developers that avoiding disk access in favor of doing as much work as possible in-memory will result in shorter runtimes. To test this assumption, researchers from the University of Calgary and the University of British Columbia compared the efficiency of alternative ways to create a 1MB string and write it to disk. The results consistently found that doing most of the work in-memory to minimize disk access was significantly slower than just writing out to disk repeatedly (PDF).

5 of 486 comments

  1. Re:Check their work or check the summary? by LordLimecat · · Score: 4, Interesting

    TL;DR:

    They used Python and Java. It's sort of hard to develop a meaningful thesis on general programming when you're that far up the abstraction stack. Who knows, maybe Python and Java suck at memory management (GASP).

  2. Re:Check their work or check the summary? by Frnknstn · · Score: 5, Interesting

    It's not even the choice of tools; they seem to willfully misuse the languages to get poor results.

    --
    If it's in your sig, it's in your post.
  3. Re:Check their work or check the summary? by danlip · · Score: 5, Interesting

    The language is not the problem; the code is terrible. They did String concatenation in the most expensive way possible. I'm pretty sure that if you used a pre-sized StringBuilder, it would be faster in memory.

    They also make some very novice benchmarking mistakes.

    This is actually a pretty good interview problem. Anyone who writes code like that should not be hired, even for a junior position.
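
A minimal sketch of the contrast described in this comment, not the researchers' actual code: it builds the same string once with repeated `+=` concatenation and once with a pre-sized `StringBuilder`, then prints how long each took. The class name `ConcatSketch`, the method names, and the 100,000-character length are assumptions chosen for illustration, and a single timed run like this is not a rigorous benchmark.

```java
class ConcatSketch {
    // Builds the string with repeated concatenation. Each += creates a new
    // String and copies everything built so far.
    static String concatNaive(int length) {
        String s = "";
        for (int i = 0; i < length; i++) {
            s += (char) ('a' + i % 26);
        }
        return s;
    }

    // Builds the same string with a StringBuilder whose capacity is set up
    // front, so the internal buffer never has to grow.
    static String concatPresized(int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) ('a' + i % 26));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int length = 100_000; // assumed size, kept small so the naive loop finishes quickly

        long t0 = System.nanoTime();
        String naive = concatNaive(length);
        long t1 = System.nanoTime();
        String presized = concatPresized(length);
        long t2 = System.nanoTime();

        System.out.printf("naive +=      : %.4f s%n", (t1 - t0) / 1e9);
        System.out.printf("StringBuilder : %.4f s%n", (t2 - t1) / 1e9);
        System.out.println("same result   : " + naive.equals(presized));
    }
}
```

Because every `+=` copies the string built so far, the total copying grows roughly with the square of the length, while the pre-sized builder appends into one buffer.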

  4. Re:Check their work or check the summary? by Anonymous Coward · · Score: 2, Interesting

    Fixed their code by using a StringBuilder and moving the flush call inside the loop, so it actually writes it to disk.
    The result:

    In-memory mean: string time 0.008900000000000002
    In-memory mean: file time 0.0034000000000000002
    Disk-only mean: file time 1.1747

    Yes, it's still quicker to do things in memory, you just have to do it right.

    PS: with just one flush:
    In-memory mean: string time 0.0091
    In-memory mean: file time 0.0038000000000000004
    Disk-only mean: file time 0.026599999999999995

    Still faster in memory.
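
The fix described in this comment might look roughly like the sketch below. This is an illustration under stated assumptions, not the poster's or the researchers' code: the class name `FlushSketch`, the method names, the 10-character chunk, and the 100,000-chunk count (about 1MB in total) are all hypothetical. Note that `flush()` hands buffered data to the operating system; it does not by itself guarantee the bytes have reached the physical disk.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class FlushSketch {
    // In-memory variant: build the whole string in a pre-sized StringBuilder,
    // then write it once. Closing the writer flushes it.
    static long writeInMemory(Path file, int chunks, String chunk) throws IOException {
        StringBuilder sb = new StringBuilder(chunks * chunk.length());
        for (int i = 0; i < chunks; i++) {
            sb.append(chunk);
        }
        try (BufferedWriter w = Files.newBufferedWriter(file, StandardCharsets.US_ASCII)) {
            w.write(sb.toString());
        }
        return Files.size(file);
    }

    // Disk-only variant: write each chunk as it is produced. With
    // flushEachWrite set, the buffer is pushed to the OS on every iteration;
    // otherwise it is flushed only when full and once more on close.
    static long writeChunked(Path file, int chunks, String chunk, boolean flushEachWrite)
            throws IOException {
        try (BufferedWriter w = Files.newBufferedWriter(file, StandardCharsets.US_ASCII)) {
            for (int i = 0; i < chunks; i++) {
                w.write(chunk);
                if (flushEachWrite) {
                    w.flush();
                }
            }
        }
        return Files.size(file);
    }

    public static void main(String[] args) throws IOException {
        int chunks = 100_000;        // assumed chunk count
        String chunk = "0123456789"; // assumed chunk; 100,000 x 10 bytes is about 1MB
        Path file = Files.createTempFile("flush-sketch", ".txt");
        try {
            long t0 = System.nanoTime();
            writeInMemory(file, chunks, chunk);
            long t1 = System.nanoTime();
            writeChunked(file, chunks, chunk, true);
            long t2 = System.nanoTime();
            writeChunked(file, chunks, chunk, false);
            long t3 = System.nanoTime();

            System.out.printf("in-memory, one write   : %.4f s%n", (t1 - t0) / 1e9);
            System.out.printf("chunked, flush per loop: %.4f s%n", (t2 - t1) / 1e9);
            System.out.printf("chunked, flush on close: %.4f s%n", (t3 - t2) / 1e9);
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

All three variants produce the same file; only the number of times data is handed to the operating system differs, which is the cost the two sets of figures above are comparing.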

  5. Re:It depends by Penguinisto · · Score: 4, Interesting

    That's the very first thing I thought of... what if the code were written in a lower-level language (and not in fucking Python or Java!), then made to do this task on Windows $latest, OSX $latest, Linux $latest, maybe a resurrected DOS $latest for reference, etc... I mean, it can't be that hard to write this thing in C and port it as needed.

    Doesn't seem very scientific at all otherwise. I mean, are they testing memory versus disk, are they testing memory vs. disk performance in a given specific language, or what? Maybe they just needed to flesh out their abstract a bit more to reflect this?

    --
    Quo usque tandem abutere, Nimbus, patientia nostra?