No, It's Not Always Quicker To Do Things In Memory

Posted by Soulskill on Wednesday March 25, 2015 @03:45AM from the performance-that-fails-to-perform dept.

itwbennett writes: It's a commonly held belief among software developers that avoiding disk access in favor of doing as much work as possible in memory will result in shorter runtimes. To test this assumption, researchers from the University of Calgary and the University of British Columbia compared the efficiency of alternative ways to create a 1MB string and write it to disk. They consistently found that doing most of the work in memory to minimize disk access was significantly slower than just writing out to disk repeatedly (PDF).

Re:Check their work or check the summary? by LordLimecat · 2015-03-25 03:54 · Score: 4, Interesting

TL;DR:
They used Python and Java. Sort of hard to develop a meaningful thesis on general programming when you're that far up the abstraction stack. Who knows, maybe Python and Java suck at memory management (GASP).
Re:Check their work or check the summary? by Frnknstn · 2015-03-25 04:06 · Score: 5, Interesting

It's not even the choice of tools, they seem to willfully misuse the languages to get poor results.

--
If it's in your sig, it's in your post.
Re:Check their work or check the summary? by danlip · 2015-03-25 04:30 · Score: 5, Interesting

The language is not the problem, the code is terrible. They did String concatenation in the most expensive way possible. I'm pretty sure if you used a pre-sized StringBuilder it would be faster in memory.
They also make some very novice benchmarking mistakes.
This is actually a pretty good interview problem. Anyone who writes code like that should not be hired, even for a junior position.
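For readers who want to see the difference being described, here is a minimal Java sketch. It is not the researchers' code, which is not reproduced in this thread; the class name `ConcatSketch` and both method names are hypothetical. It contrasts repeated `String` concatenation, which copies the whole string on every pass, with a pre-sized `StringBuilder`.

```java
public class ConcatSketch {

    // Repeated += on a String: each pass allocates a new String and copies
    // everything built so far, so total work grows quadratically with n.
    static String buildByConcat(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s += "1";
        }
        return s;
    }

    // Pre-sized StringBuilder: the backing array is allocated once with
    // capacity n, and each append writes in place.
    static String buildWithBuilder(int n) {
        StringBuilder sb = new StringBuilder(n);
        for (int i = 0; i < n; i++) {
            sb.append('1');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumption: n is kept well below 1MB so the slow variant finishes quickly.
        int n = 20_000;

        long t0 = System.nanoTime();
        String a = buildByConcat(n);
        long t1 = System.nanoTime();
        String b = buildWithBuilder(n);
        long t2 = System.nanoTime();

        System.out.printf("concat:  %d ns%n", t1 - t0);
        System.out.printf("builder: %d ns%n", t2 - t1);
        System.out.println("same result: " + a.equals(b));
    }
}
```

A single timed run like this is only indicative; a fair comparison would need JIT warm-up and repeated measurements, which is presumably the kind of benchmarking mistake meant above.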
Re:Check their work or check the summary? by Anonymous Coward · 2015-03-25 04:48 · Score: 2, Interesting

Fixed their code by using a StringBuilder and moving the flush call inside the loop, so it actually writes it to disk.
The result:
In-memory mean: string time 0.008900000000000002
In-memory mean: file time 0.0034000000000000002
Disk-only mean: file time 1.1747
Yes, it's still quicker to do things in memory, you just have to do it right.
PS: with just one flush:
In-memory mean: string time 0.0091
In-memory mean: file time 0.0038000000000000004
Disk-only mean: file time 0.026599999999999995
Still faster in memory.
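The two flush placements compared above can be sketched roughly as follows. This is an illustration under stated assumptions, not the poster's actual fix: the class `FlushSketch`, the method `writeChunks`, and the chunk count are all hypothetical. Note also that `flush()` on a `BufferedWriter` hands the data to the operating system; it does not by itself guarantee the bytes have reached the physical disk.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class FlushSketch {

    // Writes n one-character chunks to the file and returns elapsed nanoseconds.
    // With flushEach = true the buffer is pushed to the OS after every write;
    // otherwise the only flush is the one performed when the writer is closed.
    static long writeChunks(Path file, int n, boolean flushEach) throws IOException {
        long start = System.nanoTime();
        try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.US_ASCII)) {
            for (int i = 0; i < n; i++) {
                out.write('1');
                if (flushEach) {
                    out.flush();
                }
            }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: 100,000 chunks, chosen only to keep the run short.
        int n = 100_000;
        Path file = Files.createTempFile("flush-sketch", ".txt");
        try {
            long perWrite = writeChunks(file, n, true);
            long single = writeChunks(file, n, false);
            System.out.printf("flush per write: %d ns%n", perWrite);
            System.out.printf("single flush:    %d ns%n", single);
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Both variants produce the same file; only the number of trips to the OS differs, which is why the flush-per-write case is the one to compare against when asking whether the in-memory approach is really slower.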
Re:It depends by Penguinisto · 2015-03-25 05:38 · Score: 4, Interesting

That's the very first thing I thought of... what if the code were written in a lower-level language (and not in fucking Python or Java!), then made to do this task on Windows $latest, OSX $latest, Linux $latest, maybe a resurrected DOS $latest for reference, etc... I mean, it can't be that hard to write this thing in C and port it as needed.
Doesn't seem very scientific at all otherwise. I mean, are they testing memory versus disk, are they testing memory vs. disk performance in a given specific language, or what? Maybe they just needed to flesh out their abstract a bit more to reflect this?

--
Quo usque tandem abutere, Nimbus, patientia nostra?