No, It's Not Always Quicker To Do Things In Memory
itwbennett writes: It's a commonly held belief among software developers that avoiding disk access in favor of doing as much work as possible in-memory will results in shorter runtimes. To test this assumption, researchers from the University of Calgary and the University of British Columbia compared the efficiency of alternative ways to create a 1MB string and write it to disk. The results consistently found that doing most of the work in-memory to minimize disk access was significantly slower than just writing out to disk repeatedly (PDF).
on the speed of your memory, and the speed of your disk, SSD's are getting more common.
Sorry but you'll need to do it without using any memory. We need to make it fast.
'll have to dig through their testing and methods, but this seems pretty fishy given the summary.
Seek/Read/Write time of a disk is always slower than memory. No exceptions to the rule exist given current commodity hardware. Bus length to a disk is also much longer than to memory. Again, there are no exceptions given commodity hardware.
Won't be the first time someone reported that the laws of physics don't exist for something, and I'm sure it won't be the last. Maybe someone with free mornings in the US can break it down better than the summary.
-The wise argue that there are few absolutes, the fool argues that there are no probabilities.
This is the dumbest research I've seen in 2015. There was actually no computation involved -- they just wanted to write a long string to disk. They concluded that adding the superfluous step of concatenating strings in memory, then writing to disk, was slower. Well duh! That's not what memory is for!
They have nothing better to do with their time than benchmarking bogus string operations?
Probably because they're using regular Java strings, which are as slow as fsck. Something that's probably described in an intermediate Java course.
I see no mention of C in the article, which has the best means of concatenating 1000000 1-byte strings.
So have they proved that writing to disk is faster than writing to memory and then writing to disk?
I fear, this article will be referred to for years to come as "evidence", that in-memory work is slower, while the truth is, Java and Python programs are slower, than the properly-compiled (to machine code) programs. TFA says so too:
but few people will read that far down...
It is just "too easy" to write code, that will cause the useless object-creation and destruction in these "higher level" languages — and a human mind can not distinguish between a microsecond and a millisecond, so it all seems to work fine — until you need to do it a million times...
In Soviet Washington the swamp drains you.
The price of ECC ram doesn't drop for years and years.
Did they forget to flush it?
It's slower in languages with automatic memory management, or with a VM, which is no surprise.
It would be much faster than disk if you wrote the time critical parts in a language designed for, you know, speed...
I apologize for the lack of a signature.
Not a good paper. Quadratic string appends are a problem, yes, but a solved one. It's why StringBuilder and [].join exist.
The authors wrote two programs that concatenated differing amounts of bytes to strings. There greatest "speed-up" was for a case where they concatenated 1 million times a single byte to a string or wrote a single byte to disk.
As they implemented this in Java & Python, this actually meant that they did one million copy operations for the in-memory version, while the operating system batched the writes to disk together very efficiently. This is not news for anyone that knows their programming language.
Lousy languages produce lousy programs. Get me some C baby!
I haven't tested the correct way myself to compare performance, but the test code used is a perfect example of how not to code Java.
They've used immusable strings for a purpose which they aren't designed. If your entire operation is to concatante values together, then it should be using a StringBuilder which is actually designed for doing this.
This just shows that bad algorithms in memory will be slower than reasonable algorithms on the file system.
In reality, both of those implementations were primarily memory backed. The implementations didn't flush the disk between each write... they only flushed the disk at the end. That means that all of the intermediate writes were actually going to memory.
So maybe the conclusion should have been "memory is faster than memory!"
1. arXiv paper - not peer reviewed ... oh I wonder where their 1MB of data is going to
2. authors never mention caching, buffers, any kind of actual technical details
3. for the Java code they use 'BufferedWriter'
4. plots done in MS Office
=> the paper is complete and utter crap and would not pass muster with any reviewer on any C-rated conference or workshop
Generally if you're looking to speed things up in RAM its not because youre concatenating a group of strings over and over, its because your overall read time improves dramatically as well. The study also doesnt take into account IO controller overhead...for example the overhead to write to RAM is generally mitigated in intel chips as the northbridge is merged into the processor and takes advantage of cool things like predictive instructions by the ALU. PERC raid controllers and HBA's are typically limited by the bandwidth of the bus and the clock speed of the controller on the other hand, as well as any pending rebuilds or cached data theyre committing or storing at any given time. JBOD configurations in some RAID cards also requre you to build an individual RAID for each disk, meaning the controller could have countless configurations it has to track.
an excellent example of where you want RAM to handle reads and writes is in email antispam. amavis queues get expensvie fast, despite optimized perl threading, but cutting this back to spamass-milter and keeping spamassassin in a ramdisk with its compiled ruleset there too means you can handle nearly the entire evaluation of the message without even touching the incoming queue on disk. issuing rejections at the handshake then greatly improves efficiency over having to issue bounces, which can touch up to 4 queues on disk in some cases.
Good people go to bed earlier.
This is a REALLY mind boggling stupid test (or at least headline). Of course it is faster to immediately write stuff to disk as it becomes available, than to build the string in memory and then flush it to disk. Keep the IO bus full while the next write is prepared.
That doesn't change the fact that you should avoid touching the disk as much as possible, it just illustrates that if you must touch the disk, you should try to do it while the processor is busy doing other things (if possible).
I'm guessing that the underlying library caches things and is implemented more efficient than the string concatenation.
What they're saying is if you write bad code, it performs like shit. Did someone get a PhD from this?
I completely fail to see any merit in this paper at all. This whole article should be removed.
This:
for i in range(0, numIter): ...from the in-memory part of the experiment. Any Python programmer knows you don't do it this way. Strings in Python are immutable, so this re-allocates concatString every time round the loop, most likely causing multiple garbage collection cycles. It's obvious that this is not written by a Pythonista, as it's not "Pythonic" code. No wonder it's slow.
concatString = addString + concatString # modified: concatString = concatString + addString
Better (maybe not the best, tho. Call me lame... )
concatList = []
for i in range( numiter ):
concatList.insert( addString )
concatString = "".join( concatList )
News at 11...
one should proof-read enough to avoid grammatical typos in the introduction:
1. Introduction
[...]
Disks, whether mechanical or SSD, have orders ***or*** magnitude higher latency and transfer times
for "idiotic premise"
The paper describes using string concatenation in java to prepare the string in memory. In essence, it's comparing an O(1) operation to an O(n) operation and complaining that the latter is slower for large values of n.
Unless I have misread the paper, it seems that these folks have just found experimental proof that disk writes are buffered.
"In Java and Python strings are immutable, and any assignment will result in the creation of a new object and possibly copy operations, hence the overhead of the string operations. The disk-only code, although apparently writing to the disk excessively, is only triggering an actual write when operating system buffers are full. In other words, the operating system already lessons disk access times.
I'm guessing that this investigation started with someone making a bet while their thought processes were slightly impaired.
But I read it on Slashdot!
so, really, it's not faster to worry about using the motherboard's memory, when you could be ALSO using the disk controller's memory as well.
see, that would make sense. Jumping thru hoops to avoid disk writes, which end up just being memory writes to the controller's memory, might take more time than just sending the data to the controller. THAT is totally plausible.
But, if we assume at some point the disk controller is busy. or its cache is full, because the server is doing USEFUL WORK besides this test. Then, I'm not sure I believe it.
Did the article say what kind of things they tried to avoid writing to disk? compression? I guess I'll have to actually read it, even if that's what they'll be expecting me to do...
Someone looked at the Java code used ? (PDF p8 "Appendix 1. Java code")
for (int i = 0; i numAdd; i++) {
addString += "1";
}
Why not StringBuilder ? No one want to concatenate (lots of) strings with "+=" in Java because it is not efficient.
Maybe someone should run the test with proper Java string concatenation code and see the results, then you could tell.
Glad I wasn't the only one thinking this. I read the article and thought that I had missed something since their "task" and "conclusions" we're so trite.
is an expensive in-memory operation, as it is in many other high-level languages. Unsurprising that writing a 1MB string to disk is faster.
Sounds confused, firstly you're not writing it out to disk, they're writing it out to the disk cache, i.e. a memory copy. They even use BufferedWriter.
Secondly they're using Strings in java, not StringBuilder, Strings do not implement concatenation quickly, StringBuilder is the way to do that.
It's all shows a very poor understanding of code. I really recommend anyone serious in becoming a programmer to LEARN MACHINE CODE or at least C so you have a decent understanding of whats going on at a lower level.
What the authors did was do lots of concatenations in languages that have terrible performance for string concatenation. As it turns out, when you do something inefficient, it's slow.
I learned this after about 30 days of playing with visual basic, back in the day. It's horrible at concatenating strings. Sounds like the Java + Python implementations that they tested, are similarly bad.
So, it does prove that doing things in memory is not always faster than writing to disk. But the lesson really should be - if you find such a case, it proves the functions you are using, are horrible and should not be used when performance matters.
One doesn't simply 'write to hard disk'. You ask the operating system to do that. The OS buffers writes and sends them to the disk in what it thinks is the most effecient blocking. And when you 'write to disk' you really 'ask disk device to write to disk (eventually)' A hard disk any more is a little computer dedicated to storing data on disk and retrieving it as quickly as possible.
So, they took a high level language (java,python, etc), repeatedly concatenated to a string, presumably without allocating memory in advance if thats even an option on these languages, and then wrote the result to disk. VS just appending stuff to the file on disk, and found that it's faster....... oh rly? Where do you get a job to write studies like this?
No, It's Not Always Quicker To Do Things In Memory
The title ("No, It's Not Always Quicker To Do Things In Memory") should be modded Flamebait, Troll or similar. If it'd be possible.
Slashdot, fix the reply notifications... You won't get away with it...
With Java and Python? Yikes - what a poorly done study. Do it in FORTAN or C and get the same results, and then I will believe you.
It's the kind of research I expect from high school kids learning to program for the first time...
Actually, the authors' comparison is unfair. When they do "in memory" (in the slowest setup as they recognize that appending is fastest), they add the new piece of string at the beginning of the string (which requires a copy of the old string). When they write to disk they append the new piece at the end of the file.... BIIIIP, not the same algorithm. If they had mimicked the same process on disk, the on-disk timings would just be awful.
Even with SSD, memory timings are clear, L1 cache is faster than L2, which is faster than memory, which is faster than "disk". A properly written algorithm (with the proper programming language -- indeed java or python are not the best languages for such a comparison --), will always be faster in-memory than on-disk at least as long as the data structure can stay in memory....
I bet this paper will never get published...
They're only examining the performance of concatenating immutable strings, versus the performance of writing to a (buffered) stream.
This is a problem that's been known about for donkey's ages. It's just that computers are so stupidly powerful it's no longer an issue that many programmers ever have to confront.
In VB6 you had to jump through hoops to do it properly, but it's such a common case in Java that the compiler will optimize repeated concatenations in a loop into using a StringBuilder instead. I presume Python has similar optimizations.
I used to just stick all the strings in an array, allocate a new string of the appropriate length, and copy them into it.
News at 11 : many less experienced programmers are ignorant of the internal workings of their chosen frameworks, because they never had to write their own implementation at a lower level.
There are SO MANY layers of caching between the application layer and the physical disk it is not possible for most applications to know that they are actually writing to disk. This is simply one example of that. Additionally, crappy application layer code runs slowly. Yes, it makes a HUGE difference how you write your code even today in how quickly it executes. High level languages simply make it easier to write code which executes slowly for no apparent reason.
Pretty much my thoughts. Writing to disk is slow, but it's also semi-async operation (in that much of the time, the job is offloaded to the I/O subsystem before the write is complete), which generally means the sooner you start writing your results the sooner you'll finish, and if you start early you can do computational work while the I/O is happening rather than spinning wheels while trying to write the whole thing in one go. All they seem to have done is add a pile of latency and may even have introduced other impacts such as garbage collection or VM swap.
Log in or piss off.
Basically the article doesn't give enough detail. It doesn't say whether the strings were created using the base string objects in Java/Python or using the much more efficient stringbuilder objects. The former would be horrendously slow. Also what was the base setup of the machines being tested on how much memory did they have? did their disk controllers have built in cache? What kind of disks were used.
Build a Man a Fire, and He'll Be Warm for a Day. Set a Man on Fire, and He'll Be Warm for the Rest of His Life.
Disk cache management is more efficient than Java and Python string management when you do a dumb thing one million times just to consume CPU cycles.
Add byte, move pointer versus Alocate new, larger string, copy contents of former strings, return old strings to pool, gargabe collect. Repeat it one million times.
This study is stupid.
In either case, they're still doing things in memory. The OS will set up its own buffered write to disk. That's one of many things OSs do which high-level people aren't always aware of.
By using a bigger user buffer and performing one write - memory copies go UP, cache thrashing goes UP and total memory usage goes UP, so OF COURSE IT'S SLOWER.
The disk accesses are optimized and buffered into fewer accesses behind the scenes.
This whole thing could have been skipped simply by doing a bit of research into how I/O systems and file systems work.
"In both cases we flush the disk file before closing it to make sure correct disk access measurements are done"
flush doesn't guarantee the OS write cache is flushed to disk, you need an fsync after that.
The study's test involves concatenating some 1MB string in chunks of size n, for n in [1, 10, 1000, 1000000], then writing the string to disk. Note that for size 1000000, the entire string is made at once, and both the in-memory and to-disk methods are doing the same thing. They found that queuing many disk writes to the OS is faster than reallocating a string the same number of times in memory, then writing once. The results converge towards n=1000000, as expected. Essentially they've compared reallocations to queueing.
"[T]he concatenation code in Appendix 2, concatString = addString + concatString, places the addString contents at the beginning of the target string (vs. the end), so moving concatString is required, and a stack allocation does not help with performance. To verify this statement, we re-ordered the concatenation statement to concatString = concatString + addString, and ran the code on both Windows and Linux again. Windows’ slow concatenation results did not change, while for Linux the modified inmemory version was faster than a disk-only version, hinting at a stack allocation scheme."
So essentially, always writing to disk will be faster if you have no idea what you're doing. Obviously there are situations such as logging where repeated concatenation to disk is sensible, but this really doesn't seem to be breaking any new ground. I expect/hope that we're not about to see a huge revolution in forgoing memory in favour of disk access.
What they actually compared wasn't the speed of the disks, but the speed of the language runtime and OS file IO buffering routines!
It wasn't really that surprising that concatenating java or phyton objects can be slower than letting the low-level runtime do the same task.
If they had wanted to test the disk IO speed then they would have had to add at least some fflush() calls.
It is trivial, in any language, to make your code faster than the actual disk transfer speed, but a lot harder to make it faster than a set of small block moves within (cached) RAM.
Terje
"almost all programming can be viewed as an exercise in caching"
One of the first optimizations you learn when writing Java in a moderate load environment is to use StringBuffer or StringBuilder when concatenating Strings. There is probably a similar construct in Python. The test was not written from a place of experience or was purposely constructed this way to prove their pre assumed point.
The research tells us that repeatedly concatenating strings together is a bad thing... WE ALREADY KNOW THIS!!! good grief, who taught these guys to code? The title of their paper "When In-Memory Computing is Slower than Heavy Disk Usage" implies heavy disk access where none exists. They actually go on to point out that it's the OS doing magic things that helps out. i.e. it's the OS using RAM to buffer the disk that keeps your app speedy. So erm... memory being used instead of disk then... the exact opposite of their claims
Why not have a temp ram disk card that takes old ram say ddr1 / ddr2 / etc (runs at the slowest speed of the stick install does not need to be matched) does not need a battery and has say pci-e or sata link?
I'd argue it's always faster to do things in memory. In the case presented here they were *not*. In both cases being compared they were writing to disk. All they did was determine the better way (for their case) to write to disk.
What about writing it to memory repeatedly?
This just in! Doing different things in different ways on different media will produce different results! (These guys should get a Nobel price)
It's even more simple than that. Their "writes to disk" are just being stored in disk cache hence the "faster" speed. On the other hand, they do basically the most inefficient in-memory operations possible.
It's faster to use the buffering in libc or the OS than do your own shitty implementation of buffering.
I mean, I can see that inexperienced developers might make this mistake. But this isn't surprising.
In other news, Pope found to actually be Catholic. The Pontiff was quoted as saying. "I always knew I was Catholic from when I was a little boy."
This just in. Massive government study shows bears do defecate in the woods. Head of the $65M (£43.6M) government funded study, Dr. Hans Schmidt, describes the study. "Ve always knew ze bears did zeir business somevere but ve were never sure vere zey did it. But now it is confirmed. Zey do zeir business in ze woods."
My mantra of "If it can be done, it can be done badly in any language or any system." is proven once again.
Here's the relevant bit:
long startTime = System.currentTimeMillis();
for (int i=0; i
So if numIter is one million, they're generating and throwing away a million temporary objects, some of them quite large. No competent Java programmer would write a tight loop this way.
Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
What is more interesting to me is what lead up to publishing of this masterpiece... Is it a cry for help from someone regretting their study choice? Smartasses skilled in the finer arts of trolling? Or just a magical combination of laziness, hubris and ignorance?
.
It's a flawed experiment.
It's almost as if submitter "itwbennett" got this shoved thru the queue to get yet another one of his ITWorld.com posts frontpaged, despite the content itself being worthless.
They have rediscovered the memory allocator copy problem.
Basically in python lets say you do this
x = x + "newstr"
So depending on how big x is will depend on how fast this operation is. You have to alloc a new buffer to hold len(x) + len("newstr"). Then memcopy it over. Then free up the old string.
That operation can be slow. If you are just appending over and over and reallocing you probably are better off with some sort of long term storage intermediate or a 'buffer' added to the end so you do not need to realloc.
So yeah depending on the string length yes a seek to the end of a file could be faster. THAT is however highly dependent on how your filesystem allocates new sectors for a file and if you are on a sector boundary.
A good way to mitigate the issue is to have some extra buffer on the end of your memory so you do not need to do the string copy. Then at that point you are doing mostly seeks with the occasional copy. This only works to a point depending on how big x ends up. So your mileage may vary.
There are some comments regarding Java and Python being bad languages to test this premise. That isn't a problem because Java and Python are both production languages. It doesn't matter if Python and Java are slow for these experiments simply because companies are using Python and Java for production code.
My problem is with their "benchmarks". In both Java and Python, strings are immutable. So, in order to concatenate one string to another you have to create and entirely new string by doing a deep copy. Thus, the cost of a string concatenation is the addition of the sizes of both strings, not just the second. A basic algorithms course would inform us that this is an O(n^2) algorithm. However, writing to disk sequentially does not have this cost. The file can be appended to and so the previous data does not have to be rewritten. Thus, writing directly to disk is an O(n) algorithm. With this knowledge, for any speed memory and disk, I can always find a size file that will show under these conditions that writing to a disk is faster. A better experimental setup would be to use a StringBuilder, which essentially is a mutable string. This would make the memory operation benchmark an O(n) algorithm as well and provide a much closer comparison.
However, I will add the caveat that the research is useful to some programmers. Specifically those that do not program with the underlying OS and architecture in mind. But there are many other problems than memory vs. disk writing that will show themselves if you program ignorantly.
I hope thjis is not just the kernel VFS and cache doing a better job with their memory management than the researcher's code.
Yep, many commenters here got it right: the "study's" authors are doing teh-stupid operations "in memory". This one is so egregious, especially that the ITworld article author fell for it, that I felt it warranted its own dissection post: http://blog.duh.org/2015/03/wh...
How could such naive stuff get slashdotted? I always thought slashdot's supposed to be for nerds.
know your problem...
And realized why they got it wrong.
I always wondered why university papers are held in such high regard. Clueless students(yes even grad students!) writing papers on subject matter that they don't understand is of questionable value to me.
I understand the value with expert supervision, when the professor is running the research, and I understand the value in peer review. But, then this paper gets cited and I start scratching my head; 'why is a clueless student's paper of any meaning to me?'
Their python results are bogus. After I fix the n^2 string aglorithm I get these results:
in-memory: String took 0.0522561073303, file took 0.000552892684937
disk-only: file took 0.184334993362
Before I get these results:
in-memory: String took 93.5676689148, file took 0.000598907470703
disk-only: file took 0.180480003357
Don't do stupid, incompetent, bone-headed things and your program won't be slow.
I changed:
for i in range(0, numIter):
concatString = addString + concatString
To:
concatString = ''.join( addString for i in xrange( 0, numIter ) )
Of course you can do better:
concatString = addString * numIter
Produces:
in-memory: String took 0.000266075134277, file took 0.000687122344971
disk-only: file took 0.194098949432
I just scanned the paper, because their claim seem to be idiotic. It looks like they are appending a single byte on the end of a string in memory and on disk. For the memory operation, this will result in a string copy since strings are immutable, vs. doing a one byte file append onto the disk. The former is increasingly expensive and the latter is a fixed cost, so after infinite operations, the disk cost becomes far less than the memory operation. If this is indeed their claim, and I am not missing something, then they should be collectively slapped for wasting our time by writing this paper. If this is really your use case, write some proper data structures to manage your data in a sane fashion.
So yes, if you do stupid things, you can make bad engineering decisions look like good ones.
HA! I just wasted some of your bandwidth with a frivolous sig!
None of the authors are computing scientists. Look at the paper they published. One is studying biology, one is studying either electrical or computer engineering and the third is studying either chemistry or biology.
In SQL server, large memory table objects can be slower to join with or access than temp tables due to lack of indexing. I am sure you could find other cases where memory might perform worse than file access.
love is just extroverted narcissism
What morons. Sorry, but they are. They are writing to a file through the operating system which means that it is being spooled out to disk asynchronously, so obviously piecemeal writes are going to be faster because they will run concurrently with the string generation algorithm. Plus their 'writes' are probably being buffered in ram anyway.
Writes to files generally do not stall programs. These people are morons.
-Matt
Another shining example, of why anything that has to do with performance, should never, ever, be coded in Java. This would be akin to writing a numerical algorithm using floats instead of doubles or even single precision, or coding an operating system in java, it would take weeks just to boot!
Now if they really wanted speed and performance, they should have run the code on a java interpreter, running on a java interpreter written in java, running on a java interpreter ... ad infinium, because recursion makes things faster, and java is faster than C, everyone knows what!
...you would see that the authors are aware of every criticism that has been stated here thus far.
They tested using strings in python and java, both of whose string libraries are very much overweight. And they tested by concatinating strings in a way that requires constant reallocations and memory copies versus pushing data to fixed size disk buffers in the OS cache.
So... surprise! When writing data sequentially the C implementation of disk buffers is faster than the java and python implementations of strings.
Moderating "-1, Disagree" is simple censorship. Have the guts to post your opinion.
Can we reject things like this please?
Or possibly have a "-1 written by an idiot" mod option, and enough of them removes it from the front page?
Now that having a vagina is a legal NECESSITY for any coder receiving funding from a US state facility (or from a major politically correct IT company), you can expect an awful lot more GARBAGE flooding the annals of Computer Science.
I am reminded of the sickening NASA press conference, that announced a totally new form of life on Earth. A BRAIN-DEAD female, employed by NASA purely because she was female, had been given a team of mostly male science slaves, to pursue her NONSENSE theory that because she remembered the stupid simplification of the Periodic Table during her High School years (when teachers over-state the similarity of elements in the same 'group') DNA variants must exist in nature where one element of usual DNA would be replaced with ANOTHER element from the same group. This is a kin to saying that because a water-wheel is technically related to a gear, you might expect to find water-wheels within a mechanical watch. The 'science' she had used to 'prove' her hypothesis was the most humiliatingly awful pile of garbage, but her male colleagues had been forced by the upper echelons of NASA to claim her a 'genius'.
When ability is replaced by political correctness, and an anti-nerd program is implemented in most US schools and universities, standards don't just fall through the floor, but one enters an 'Emperor's New Clothes' phase in the USA, where anyone who wishes for any form of success MUST lavish praise on the ravings of cretins that the politically correct establishment is currently presenting as figure-heads.
Now this isn't an anti-female point. To the contrary, because of the bias towards males in engineering, the VAST majority of incompetents I have encountered have been males- some with significant power and/or influence. However, this is common in any system (scum rises) - and is different from a system FORCED to bias itself to selling one type of Human over another.
Bill Gates and Rupert Murdoch's 'Common Core' (don't forget, these two monsters- supposedly from either end of the political spectrum- are actually CLOSE friends- and work together on the 'new' education projects in the USA, like the now at the NSA 'inBloom' obscenity) initiative is already dumbing-down education in the USA, and placing maximum hurt on the male Beta experience in technical subjects- especially maths. Put simply, Common Core is designed to ELIMINATE the influence of natural Alphas in the classroom, and place 100% of the intellectual authority in the hands of teachers - adults that frequently have such poor skills in their 'subject', they actually fall BELOW the Beta designation. The underlying idea is that non-alpha kids will have too little innate confidence in their own skills at that age to challenge the system, and that the alphas will 'withdraw' and simply succeed with ex-classroom activity.
PS as for the hopeless article promoted by Dice (no accident there), the 1MB data set hits the common L2 cache limit by no co-incidence. The concept of 'memory' on a modern computer is a complex chain of 'L' caches, where the RAM of the system is actually L4 cache, the SSD drive should be used as L5 (but rarely is), which would make the hard-drive L6. Normally, under a properly designed memory-management system, one wouldn't be explicitly writing anything to the hard-drive UNLESS it was to create or alter a permanent HARD-file.
The speed of a modern CPU cluster (4+ cores in a modern work-horse PC) is so much faster than the data rate to a HD physical platter (as opposed to the RAM cache of the HD), it isn't funny. But usually when doing explicit HD writes, on accepts on that thread that the thread is going to be limited to the speed of communication to the HD. That thread, of course, will not even dominate the run-time of its own core since each IO stall will allow another thread to take its place until the HD has finished its current DMA activity.
The computer is NOT a black box, and coders who treat it as such, because their knowledge comes from politically correct teaching, havin
NEW SCIENTIFIC DISCOVERY!
For n equal to one million, an O(n^2) algorithm is slower than an O(n) algorithm. Even when the O(n^2) algorithm is run in RAM, and the O(n) algorithm is disk writes being buffered and optimized by the operating system.
I'll take my Nobel Prize now, thank you.
-
- - You can't take something off the Internet! That's like trying to take pee out of a swimming pool.
So in a nutshell, they compared the cost of the most inefficient method of in-memory string concatenation versus an efficient method of in-memory string concatenation: the memory in the latter case being a buffered file writer. Lololol!!!
And one of the first things you learn if you bother to learn about how your compiler works is that the java "+" operator on strings is replaced with stringbuilder under the hood in any remotely modern version of the language.
It's dumber than that. They didn't even do it right in Java. There is a note near the end of the paper that says "However, using a mutable data type such as StringBuilder or StringBuffer dramatically improved the results". They didn't present the numbers, but what they really meant was "The performance problems we saw were entirely due to our not using StringBuilder or StringBuffer, this paper shows no meaningful difference in performance between memory-then-disk and disk-only access once the algorithm is fixed."
Ok, I read all the other "This is stupid" comments and my jaw kept dropping. I actually felt this was an April fools thing or something similar and that we were all missing something somewhere (and please let me know if I am... I REALLY need to know). I HAD to read the article and underlying paper, cause I just couldn't believe the absolute asinine stupidity of the test, let alone that it was being presented as research, or that the test itself was so flawed! So after all that, had to post. Summary for others, adding my voice to the crowd.
----------------
Assumption: Software Developers avoid disk access cause they believe doing it in memory is faster. This is put in context of BI and bigdata.
Testing: Create a program representing a common task that can be tested where one uses memory and the other uses diskspace.
Memory Test:
1) Create a string in memory.
2) Add it multiple times into another string
3) Write second string onto Disk
4) Flush writes
Disk Test:
1) Create a string in memory
2) Write it multiple times to Disk
3) Flush writes
Create code in Python and Java.
Conclusion: Memory Test is so much slower than Disk Test! Additionally, the languages used have certain quirks to make it worse. Optimization helped a little but only on Linux. Therefore, programmers should reassess and understand their OS and programming languages before assuming this belief which is not true.
---------------
Assumption & Testing idea... very good. I would have loved to know the unknown scenarios where this assumption should be questioned. Especially in the world of click&drag programming for workflows, ETLs, and report writing.
But from there... its all BS and stupidity. Basically the test tests if replicating the hard drive driver in memory and then using the driver to write to disk is faster than just using the driver to write to disk. Are you bloody serious?!?! That's like testing if 2+2 is greater than 2+0. And that is before we start looking at using Java and Python which do a ton of work in terms of memory management and build all types of stuff around data types. Before the fact that they wrote the Python code WRONG (that's the slow way of doing string or listing concat). So they picked languages that write in memory O(n) extra times for the same data.
This test would have come to the same conclusions in C, C++, or Assembly! But the folks wouldn't have been able to write code to see the micro second time differences.
So lets set the record straight. NO developer out there goes out of their way to just write to a memory file if its simply going to flush to disk. Its not worth the extra lines of code, nor the lost CPU cycles in reading them. Especially since most operating systems do this already at multiple points along the data chain at the very low hardware & driver levels! If we have developers like this, we have a ton of bigger problems in software development than this little thing that will be solved by money.
To test this belief properly, give me a scenario where you reuse the written to disk/memory stuff, transform it, and then write to disk. See which one is slower. If its written properly, you will see that the underlying hardware systems will actually store stuff in cache or memory for you to help you speed it up! If you find proper scenarios where the memory part is slower, please let us know cause that is actually adding to the IT body of knowledge.
God, as this was BigData related, I was hoping at least something along the lines of "In DB data processing and extract vs extract and client side processing". Give me the points along a curve where one is better/worse than the other. THAT would have been interesting.
"...alternative ways to create a 1MB string and write it to disk"
This is not surprising if the goal is to write something to the disk.
But what if you were to write something to the screen instead?
Would it be any faster to create a 1MB string on the disk and then display it on the screen? Probably not.
Not to mention that writing to the screen really is writing to memory; as opposed to a disk which is a slower, physical medium.
So really, like anything it depends on what you are trying to do.
It may not always be quicker to do things in memory, but it usually is.
Political correctness is really just herd psychology pushed by insecure people who desperately seek social conformity.
Can't be.
I just ran a dBASE III test with my first harddisk I ever got, a full height 20MB Seagate and memory always won.
>This is the dumbest research I've seen in 2015. There was actually no computation involved -- they just wanted to write a long string to disk. They concluded that adding the superfluous step of concatenating strings in memory, then writing to disk, was slower. Well duh! That's not what memory is for!
Agreed with you on the uselessness of their research, but that is most definitely one important and common use of memory: buffer caches used by the operating system.
Effectively, they unintentionally tested the speed of the OS to concatenate strings vs Java or Python. The researchers are wrong right out of the gate: they say "Heavy Disk Usage" in their research headline, but at no point did they actually test disk performance, everything they did is being handled by the OS buffer cache.
All the researchers have shown is that string concatenation operations in Java and Python are atrociously slow. The java example used the naive form a=a+b; to concatenate strings, which is one of the slowest ways to do it in Java if you are doing repeated concatenations to a string.
If, in their tests, they had also done a string concatenation in C by allocating a buffer and appending to it using a pointer (not strcat) the speed difference doing that vs. 1 million write calls would have been negligible.
Also, if they sync'd after each of a million 1-byte writes to test how slow "Heavy Disk Usage" is compared to a single write of a million bytes, they wouldn't have bothered finishing this paper at all because it's so damn obvious that memory is faster.
Maybe we should store our files in memory and load them into the harddrive to do calculations.
What do you expect... There is only one person that MIGHT have a computer background on the paper... This is pure academic fluff, compare an in-memory database with a spinning rust based database and see how many operations you can get out of each one performing the same operation.
I have mod points and I am not afraid to use them
Unless you really mess up your in-memory variant, there is no way in this universe that disk access can be faster. Here is a simple thought-experiment that shows this: Just use an in-memory array as storage vs. the disk. The array must be faster as RAM has orders of magnitude better bandwidth and latency as even the fastest SSDs, and it has far, far smaller block-sizes in addition.
Of course, if your in-memory data storage is so badly organized that the OS in-memory (!) buffer-cache does a better job, you may think that you observe the disk being faster than your RAM.
Looking at the paper header, the authors are all from a biology-department, so the suspicion that they are clueless of how to write things efficiently is really not far-fetched. I will not read the paper, it is likely a waste of time.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
I guess that explains why they're treating computing as observational science which it is not. This still doesn't explain why anyone approved this study to begin with when they could just ask someone who is knowledgeable with computer OS internals... or, god forbid, do some preliminary research on the topic.
Try 1 million writes with 5000 files. Their example doesn't apply when scaling.
Writing lots of small strings to disk one by one, is faster than building an in memory string up from those small strings one by one - reallocating a larger chunk of memory and copying the entire current string on each little step.
It's almost like memory allocation and copying data over and over might slow things down a bit over simply not doing that at all.
Honestly the paper is completely retarded. Only a moron would not expect that result.
The results were poisoned by the presence of various caches affecting disk I/O and for that matter memory I/O. On some modern systems, either the disk lies to the computer or the OS lies to the application and the application thinks the data is actually stored on the bare metal before it is really stored (the data may or may not be stored in a "safe" place like a non-volatile cache - the point is that a small write operation returns "success" very quickly, much faster than if it had to wait for the bits to be written to the platter).
The only thing they can really say is "on this hardware, using this operating system, under this workload, these are the results of our experiments."
I'm not saying their results aren't useful - they are. Instead of presenting this as "memory writes are faster than disk writes" they should say "in some or many modern systems, under some circumstances, it may be more efficient for programs or operating systems to write to external storage devices in small bits rather than going to extra work to minimize the number of writes to such devices. Don't assume that what was true about the performance of an application calling an operating system to perform a disk-write operation or of an operating system asking a hard drive to perform a disk-write operation is the same now as it was a decade or two ago."
Just don't call them "disk writes." Call them what they are - "requests by the application or the OS to the OS or hardware to perform a disk write."
Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
Here
public static void main(String[] args) {
final long time = new Date().getTime();
StringBuilder sb=new StringBuilder();
for(int i=0;i<10000000;i++)
sb.append(" ");
System.out.println(new Date().getTime()-time);
}
The result is 101ms on my JRebel instrumentated JVM
I hope he fail, he is a disgrace on Canada
It's nice that competent people are shedding light on things like string copying, real disk I/O versus OS-mediated buffered one.
This is great even if the paper turns out to be flawed.
That notwithstanding, let me tell about a personal experience, take it FWIW: I made a course where I opted for results of command output to screen; my less computer-savvy friends left the default of data output to disk.
To my surprise, output to screen never happened (other things were being displayed very well, thank you); just by using the command for disk file output, results were immediate. Being in a course where following the subject was mandatory, I left the question aside for further reflection.
In real-life (and that's what the paper is about), things happen and it's great to better understand how 2015 hardware could be surpassed by a 15-year old one because better software choices were made.
(I'm looking at you, registry.)
I ran Windows in RAM it was much faster.
Non peer-reviewed research should be published in the onion, at least this way we all know it's a joke, and news sites would stop posting this garbage as real research. The summary of this paper is you can write terrible code that makes in-memory look worse than disk because you don't really write to disk, the os does, and you don't know how to actually measure in-memory vs disk access.
Did you actually read the paper? It makes a lot of sense, it just isn't that surprising if one understands the reasons.
The in-memory-operation (string concat) causes a lot of memory-copying. The disk-writing doesn't, because the system isn't really writing out stuff for each such request, but is buffering the data in a way that is more efficient than the copying of immutable, fixed length strings.
Their message isn't "wow, disks are faster than memory after all!". It's "don't teach developers to blindly favor in-memory over disk in all cases, teach them how stuff works on all levels".
sounds like a bit torrent client with a small disk cache. I once ran Utorrent with a 1 MB cache or perhaps no cache at all. The hard drive light blinked continuously and the motor sound didn't stop. After I enabled the cache, the hard drive did not make much noise. Computer did seem to run slower when the cache was small or disabled too. Or maybe I was imagining things.
Correction: Dumb concatenation is for situations where clarity and convenience are more important than performance. That is to say, most of your code. If you're using a buffered string building system to perform a simple concatenation of file name components before opening the file to begin the real work, then you're creating maintenance headaches for an irrelevant performance boost. In an inner loop dedicated to string manipulation though...
One of the most important aspects of programming is knowing how to choose the right tool for the job - typically 80-90% of your codebase is executed so infrequently that even abysmal performance won't be noticeable to the user, and optimizing that code will actually be counter-productive due to the time and maintenance overhead it imposes. Moreover, most programmers (and especially the inexperienced) are quite poor at recognizing which parts of their code are actually performance critical. Recognizing that, it may often make sense to default to doing things the simple, slow, way and only worry about optimization after performance analysis has highlighted the bottlenecks. Personally I'll often write the expected hotspots with some basic "low-cost" optimizations (like StringBuilder) and an eye towards making future optimization simple (expected optimizations should never require an architecture overhaul, and maybe it's worth passing in certain data that will be necessary for future optimization of a heavily used function, just to avoid having to re-write a million function calls if/when it comes time to optimize it), but primarily focus on correctness. Then, after everything is working correctly, and performance analysis has shown me which bits of code are actually causing issues (including the bits I never would have suspected), then I go back and break out the heavy guns. I've gotten burned too many times over-optimizing the wrong things and generating headaches down the road to make a habit out of it anymore. Unless it's something fun to implement of course...
--- Most topics have many sides worth arguing, allow me to take one opposite you.
... Move along.
The author of the paper knows nothing about how strings work in Java or Python. Every time the string is concatenated in a loop, the interpreter creates a brand new string. The "in-memory" code isn't doing the same thing as the "on-disk" code.
Holding the string in memory serves no purpose at all if you are just appending to it. Frankly, this += of strings issue is the most common "Smart but Green Self-Taught" versus "Computer Sci grad" problem you will see with new hires. Appending strings can be O(n^2) when the strings are immutable, and it applies to most high level environments. Even Metasploit had this issue at one point, and it was written by some very smart people. So everybody learns to keep appending to a list, and then flatten it to a string at the end. But this tie in with disk just makes the paper totally dumb. If you won't be reading the queue of string chunks, just flush them to disk immediately so that the code runs in constant space - relieving the memory allocator.
Concatenating strings one character at a time in Java has QUADRATIC performance (i.e. O(n^2)). If they used the StringBuilder class instead I bet most of their bottlenecks would disappeared. With that class it should be amortized O(n).
Using a system with no congestion to write a small amount of data (that fits entirely in said cache) to disk cache before it is flushed in one flush is fast, and constructing code to perform obscene numbers of ops on modern operating systems using highly oo languages perform slower than not doing so. Gun -> Head = blown.
String concatString = "";
for (int i=0; i numIter; i++) {
concatString += addString;
}
That's going to create 1,000,000 StringBuilder objects, use them to append a single String each, and allocate 1,000,000 new String objects as well
StringBuilder builder = new StringBuilder(
for (int i = 0; i numIter; i++) {
builder.append(addString);
}
String concatString = builder.toString();
I bet $1,000,0000 that code is faster.
tl;dr; Researchers who don't know who Java works suck at writing Java benchmarks.
String a = b + c;
gets translated by the compiler to something like:
String a = new StringBuilder(a).append(b).toString();
It's creating a new StringBuilder object, its member variables including a char array, it copies the String passed in to the constructor. Append is probably also expanding the array, which means creating a new array and copying the old one to the new one, then copying the data from b to the end of the new array.
toString then creates a new String object, copying the data again.
If you write shit code, you get shit performance.
Breaking News: Badly written code is worse than well written code!
Of course, there are exceptions. But many PhD's I've known make lousy programmers, in terms of producing good software.
I've come to think that the skills needed to be a good post-graduate student are different from the skills needed to be a good professional developer.
Professional developers know (or should know) how to optimize code, when necessary. All else being equal, optimized code will ALWAYS be faster in memory than on disk. The two examples in this research are NOT equal. A more equal test would be to output to a memory stream, vs. a file stream. I'll bet the results would be quite different.
Drool ops resulting from code written by windowlickers is slower than os/system library for disk access written by developers; especially when the data fits completely in os and disk hardware cache.
If you do it correctly, and use the right tools for the right job, on-board memory will ALWAYS be faster than physical disk (and even SSD's)
How long to create a string of 9's?
Secession is the right of all sentient beings.
In C you always know where the pitfall is. Or at least you have chances to really know.
Sent as ripples into the electromagnetic field. No single photon has been harmed in the process.
Could someone please send this paper to Linus Torvalds? I'd like to hear his opinion on this paper. =)
as stupid does
See subject.
According TFA, they actually do an explicit sync to disk at the end of the writes. So it's not purely writing into cache.
Log in or piss off.
I was about to upgrade my hardware, but instead I just pulled all my DIMMs and I'm only using virtual memory now. My computer is like a million times faster, and I think it even got rid of some viruses that were hiding in memory.
Now if I could just figure out why that goddamned System Idle Process is using so much CPU time!!!!!
https://www.eff.org/https-everywhere
is really slow, but if you use a file buffer like so:
it's much faster. Wow. No kidding. Also, note, they don't flush until the end. This is laughable. No wonder CS programs are under attack if this is the kind of thing that people think they can publish.
Many people are suggesting using string builder, as a easy fix...If you think about this problem, that doesn't solve it as you approach infinite operations, it just pushes the cost crossover point way out (possibly beyond the limits of existing hardware, so it might be practically moot). Since they are doing silly comparisons like this, I would suggest just writing a linked list to store each byte as a counter example that will provide more of an apples to apples comparison. Adding an element to an linked list will have a fixed cost, just like appending a byte to disk will, so after infinite operations, you could demonstrate that memory operations are always going to be faster performing similar tasks when the IO time of memory is faster than disk IO.
HA! I just wasted some of your bandwidth with a frivolous sig!
That's what buffered files are supposed to do: make slow disk writes appear as if they are as performant as memory writes.
Many problems here:
FIRST:
This is not even remotely comparing apples to apples. In the first case they continually concatenate a string. i.e.
strCat = ""
for loop for some time:
strCat += "1";
In the second case they continually write "1" out to a file.
One problem is that they are claiming to test whether doing something in memory is faster/slower than _doing the same thing_ while reading/writing to disk. But they haven't accomplished this. In the first case, they are ending up with an object of type string that resides in memory, in the second case they are ending up with a file on the disk, but do not have the string anywhere in memory. So to really compare, they either need to write strCat out to disk, which is _obviously_ slower than the second method (would anyone have claimed otherwise), or they need to read back in the file at the end which may or may not have changed the outcome.
SECOND
Any Java programmer knows that the first option is slow and is doing a lot of unnecessary object creation. When programmers say its better to do stuff in memory than writing it out to disk they don't mean, use whatever algorithm you want and no matter what it will be faster if you do it in memory than any algorithm you might choose for writing to disk. Would have been slightly more impressed had they used a StringBuffer, but I doubt that will get the same results, and that is Java 101, so what the heck do they think they have discovered?
This is the kind of stupidity you can expect from a Java-based education.
At least one basic assembly class should be mandatory to call yourself a computer scientist.
This kind of useless paper is exactly why idiots should not be allowed in computer science. They even give the explanation in the paper and still draw to bad conclusion. To me, it should be renamed "Bad programming habit performs worse than very bad performing habit in the absence of knowledge about the tool used".
Video of some good progressive thrash music
Agreed. This has got to be some sort of April Fools joke. There's no way this is a serious piece of research, much less actually pass a peer review process. Even a junior level programmer could tell you how stupid this paper is.
Some tips for the authors of this travesty:
1. Learn how computers work.
2. Learn how operating systems work.
3. Learn how programming languages work, especially ones that are interpreted or VM/CLR based.
4. Learn2code.
This "research", if it actually is research, should be nominated for an Ignoble Award. This would deserve an F even in an intro to programming course.
I think I'll go write a paper on how having lots of polygons in a 3D model will slow rendering down. I should get two Ph.Ds for that work.
~X~
From the title, I thought they were going to be using CPU registers or something.... then I read the summary and was like WHAT? Got to be a mistake there. Then I read the comments and it all made sense. Not going to bother reading the article.
I like the way they compare making your own "buffer" in the least efficient way, then writing it to disk with the runtime's buffered I/O, vs just using the runtime's buffered I/O. So they didn't even test the disk's performance. With Python it could be a simple newbie mistake, but the Java code has explicitly added buffering.
This article's either an outright lie, or a result of gross incompetence. Probably both.
So they basically selected a bad datatype and wrote a very inefficient program to handle manipulation of data and they use that as the basis to say that memory was the issue. The issue was programming without thought to what the computer was actually doing. Is this what these two Universities are teaching their students? Were they being purposely bad programmers to prove a point?
God help the world if these people ever have to program efficiently....
Slashdot has fallen far in credibility if it promotes sloppy research like the referenced article.
All about me
University of Calgary and the University of British Columbia must not be very good Computer Science schools.
>According TFA, they actually do an explicit sync to disk at the end of the writes. So it's not purely writing into cache.
The code in the paper says they flush before closing the file. This is not the same as a sync. They don't even flush (or sync) after each write.
doesn't the compiler see that the old string memory location is no longer being used and can thus be reused for the next string ? The last decade it seems to me that compilers have gotten worse and worse. worst was android studio that takes minutes to build a simple project and it doesn't even need to do any compiling.
There are huge differences in memory chips. Some brands have enormous error rates, which considerably slow down the system they are installed in. I don't see any reference to what kind of memory they used. I do notice that the words ECC and Registered do not appear anywhere in the reports. Anyone doing serious computing should invest in Registered ECC Memory and a motherboard that properly supports it. The time you save may be your own. Didn't Einstein prove that Money (M) equals Time (T) plus Effort (E) squared? m=te Or was is Pythagoras?
While it is nice to point out that the code sucks, they do already know that. They mention StringBuilder on page 6, for example. No problem there. But of course, in the end, the paper has nothing to do with disk vs. memory. It's about comparing one O(n^2) algorithm to an O(n) algorithm and determine that one of them is quicker. The problem is, that it has nothing to do with where you save your data. The basic point, that bad code makes stuff slow is true, but, if you ask me, told quite confusingly.
Jesus H. CHrist, this is a blatantly sexist and epic troll.
But I can't disagree with the argument. 100% correct on all points.
Go back and read their conclusions and then think about how much of you flaming is actually the point they are making.
I've never bothered posting here before, but this is pathetic.
You're all so busy trying to look smart (about something blatantly obvious) that you clearly haven't taken the time to even read the conclusions.
Make's you look pretty silly to me!
It does NOT do heavy disk access; both version buffer in RAM before writing to disk a single time.
It just shows that the OS buffering is more efficient, for that use case, than the language string manipulation.
yea
on multiple 1000x1000 COMPLEX*16 arrays, and watching the machine page-fault itself nearly into oblivion, I call bollocks.
I sincerely hope this is a prank, even if it's apparently from a biology department.
SLOGEN [ http://ungdomshus.nu : Sebastian cover music]
I opened the link and found that I share my first name with the first author. How am I supposed to live with this now?
Even worse, they were not just concatenating the strings in memory. They were making a new string each time and copying the old one first, then concatenating. Their choices of computer languages and their lack of understanding of those choices makes this a problem.
"Almost every wise saying has an opposite one, no less wise, to balance it." - George Santayana
Wow. Has anyone heard about buffered writes ? And does kernel-level page cache ring a bell ? No fsync was ever used in the benchmarks, therefore, it is never actually hitting the disk. The only good thing about this paper is that the Java and Python listings are available at the end, for everyone to identify the basic flaws in this research.
So yeah, it's faster to write directly to MEMORY than to do a copy before writing to MEMORY.
Well, that's pretty lame then. They did say "sync" in the paper, but I didn't get to the actual code since, quite frankly, I was being blinded by the daylight already coming through the holes in the rest of the paper.
Log in or piss off.
Please, please, for the love of $DEITY, nobody tell them about mmap'ing a file! That will make their brains go supernova!
Use a StringBuilder. Go back to school newbs.
Really? They published this with their names on it?
The only thing they've proven is that a bad algorithm can be slow.
We've known that for quite a while now.
No, java uses StringBuilder in java 1.7 & 1.8
class a {
public static void main(String [] args)
{final String b=args[0],c=args[1];final String a=b+c;}
}
then compile and disassemble bytecode
javac a.java
javap -c -private a
slashdot prevents printing the {} characters
as you see from the bytecode
it it basically
a=new StringBuilder().append(b).append(c).toString()
java -version
java version "1.7.0_75"
OpenJDK Runtime Environment (fedora-2.5.4.2.fc17-x86_64 u75-b13)
OpenJDK 64-Bit Server VM (build 24.75-b04, mixed mode)
Anyone can upload an article to arXiv.
In the case of this particular piece of "work," it is obvious that no revision took place.
This can be hardly considered research and it should not appear in slashdot or in any other media... unless for entertainment reasons.
Why would you run this test in Java and Python? I love python, but I wouldn't use it for a memory performance test. I would use a language where I know, and control, what is happening with memory management. IMHO, that makes the entire experiment suspect and, most likely, useless.
I only skimmed through it, but I did see things like :
string = string + "1"
The performance of that depends entirely on how the python interpreter is implementing this operation. It could be doing something sub optimal... you don't know. Figuring out what is happening is a research project in and of itself. If the experiment were performed in C, we'd have real data.
As a side note, the article did say something about memory being faster, which it is, but the overall time was slower, from start to finish. I do wonder how the JIT compiler in Java and the python interpreter are effecting this. That issue is not limited to parsing either. Again, another reason this should have not been done in those languages.
As someone who has written code to do performance testing on both RAM and FLASH before, this article flawed. They might have some point. But. the article doesn't make it.
http://pastebin.com/wJuWeAiN
.1 seconds. .4 seconds.
In-Memory takes
Writing to Disk takes
Just look at their chart, they are comparing apples to oranges. Their "memory test" is shown as string concactination time plus write to disk time [they break it into two columns]. Disk test is write to disk time.
This is like saying it takes longer to walk to the store if you make a pair of shoes first than if you just walk to the store.
From the python wiki - https://wiki.python.org/moin/PythonSpeed/PerformanceTips:
Avoid this:
s = ""
for substring in list:
s += substring
Use
s = "".join(list)
instead. The former is a very common and catastrophic mistake when building large strings. Similarly, if you are generating bits of a string sequentially instead of:
s = ""
for x in list:
s += some_function(x)
use
slist = [some_function(elt) for elt in somelist]
s = "".join(slist)
This "researcher" is an idiot. The java code given at the bottom of the "research paper" uses + operator to concatenate strings. This is O(N) in Oracle java. Total algorithm becomes O(N*N) in memory, and O(N) on disk.
Obviously N*N takes longer than N after a certain N even when N*N is running on faster memory.
Bingo Dictionary - Pragmatist, n. A myopic idealist.
They published the code Java code was using a bufferedwriter - which will reduce the amount of disk IO so effectively they are testing it in memory - then testing it in memory?
I'm wondering if we need to find a means of enforcing at least some level of fact checking and commentary from the mods.
Simply re-posting this submission as is has turned into a giant flame fest because the research was crap. (As is a frighteningly large proportion of comp-sci & comp-eng research these days).
If the mods are decent then they should be able to take a moment to look through this, understand how much of a train-wreck it is and provide a bit of commentary to prevent the flame-fest or outright reject it. (I'm already on the site so a crap article isn't going to bring any additional add revenue. It will simply drive traffic elsewhere after the first few people identify it as crap.)
Yes, yes. I know. That isn't what our mods do...
Maybe it's time they started...
for (int i = 0; i numAdd; i++) {
addString += "1";
}
for (int i=0; i numIter; i++) {
concatString += addString;
}
Seriously ?
It should use a StringBuilder buffer, this is highly inefficient in Java.
Year ago in a some of Delphi Object Pascal projects (both personal & professional ones) I changed classic oldschool Turbo Pascal File I/O methods like:
AssignFile
Reset
Append
Rewrite
ReadLn
WriteLn
Flush
CloseFile
For File I/O to their "fully qualified" System Unit API based analogs of:
System.AssignFile
System.Reset
System.Append
System.Rewrite
System.ReadLn
System.WriteLn
System.Flush
System.Close
They're "straight-to-API" abstracted/based methods (vs. std. 'oldschool/classic' Turbo Pascal methods).
I did so, & on the very note you mention (looking for speed & efficiency gains in file routines)!
I didn't see a change in .exe size OR execution speeds (I profile using oldschool methods of registering a hi-res multimedia timer with the OS to test these @ the START & END of each routine involved to test (yea, primitive but it works)).
That essentially told me that Borland/Embarcadero's compiler designers did the RIGHT THING for their compiler/linker really - not doing extra work for no good reason + using the MOST EFFICIENT METHODS abstracting even the oldschool syntax to the most efficient API based types.
* However, I did see a BIT quicker compilation time (fully qualified naming probably helped there), & I ended up sticking by the API based ones from the System Unit since then actually (just to be 'safe') & use the API itself (most of its built-in to Delphi as it is in C++ dialects usually, unlike languages like VB was where you had to use "declare" to use the API).
APK
P.S.=> I didn't like reading here & hearing other languages do NOT abstract out to the API methods (which is all they're going to do from the language methods anyhow in the end really - iirc, folks here "zeroed-in" largely on Python, Java maybe some others mentioned by others here - since I've used them over time academically or professionally myself in development projects on contract I was semi-concerned even though they worked & it was LONG ago) - Again - That's taking extra steps passing to the OS that way afaik & inefficient... apk
Year ago in a some of Delphi Object Pascal projects (both personal & professional ones) I changed classic oldschool Turbo Pascal File I/O methods like:
AssignFile
Reset
Append
Rewrite
ReadLn
WriteLn
Flush
CloseFile
For File I/O to their "fully qualified" System Unit API based analogs of:
System.AssignFile
System.Reset
System.Append
System.Rewrite
System.ReadLn
System.WriteLn
System.Flush
System.Close
They're "straight-to-API" abstracted/based methods (vs. std. 'oldschool/classic' Turbo Pascal methods).
I did so, & on the very note you mention (looking for speed & efficiency gains in file routines)!
I didn't see a change in .exe size OR execution speeds (I profile using oldschool methods of registering a hi-res multimedia timer with the OS to test these @ the START & END of each routine involved to test (yea, primitive but it works)).
That essentially told me that Borland/Embarcadero's compiler designers did the RIGHT THING for their compiler/linker really - not doing extra work for no good reason + using the MOST EFFICIENT METHODS abstracting even the oldschool syntax to the most efficient API based types.
* However, I did see a BIT quicker compilation time (fully qualified naming probably helped there), & I ended up sticking by the API based ones from the System Unit since then actually (just to be 'safe') & use the API itself (most of its built-in to Delphi as it is in C++ dialects usually, unlike languages like VB was where you had to use "declare" to use the API).
APK
P.S.=> I didn't like reading here & hearing other languages do NOT abstract out to the API methods (which is all they're going to do from the language methods anyhow in the end really - iirc, folks here "zeroed-in" largely on Python, Java maybe some others mentioned by others here - since I've used them over time academically or professionally myself in development projects on contract I was semi-concerned even though they worked & it was LONG ago) - Again: That's taking extra steps passing to the OS that way afaik & inefficient... apk
Year ago in a some of Delphi Object Pascal projects (both personal & professional ones) I changed classic oldschool Turbo Pascal File I/O methods like:
AssignFile
Reset
Append
Rewrite
ReadLn
WriteLn
Flush
CloseFile
For File I/O to their "fully qualified" System Unit API based analogs of:
System.AssignFile
System.Reset
System.Append
System.Rewrite
System.ReadLn
System.WriteLn
System.Flush
System.Close
They're "straight-to-API" abstracted/based methods (vs. std. 'oldschool/classic' Turbo Pascal methods).
I did so, & on the very note you mention (looking for speed & efficiency gains in file routines)!
I didn't see a change in .exe size OR execution speeds (I profile using oldschool methods of registering a hi-res multimedia timer with the OS to test these @ the START & END of each routine involved to test (yea, primitive but it works)).
That essentially told me that Borland/Embarcadero's compiler designers did the RIGHT THING for their compiler/linker really - not doing extra work for no good reason + using the MOST EFFICIENT METHODS abstracting even the oldschool syntax to the most efficient API based types.
* However, I did see a BIT quicker compilation time (fully qualified naming probably helped there), & I ended up sticking by the API based ones from the System Unit since then actually (just to be 'safe') & use the API itself (most of its built-in to Delphi as it is in C++ dialects usually, unlike languages like VB was where you had to use "declare" to use the API).
APK
P.S.=> I didn't like reading here & hearing other languages do NOT abstract out to the API methods (which is all they're going to do from the language methods anyhow in the end really - iirc, folks here "zeroed-in" largely on Python, Java maybe some others mentioned by others here - since I've used them over time academically or professionally myself in development projects on contract I was semi-concerned even though they worked & it was LONG ago) - Again - That's taking extra steps (From "form" or classic methods then to API & OS) passing to the OS that way afaik & inefficient... apk
Delphi's Object Pascal's not like that @ least. I found out Years ago in a some of Delphi Object Pascal projects (both personal & professional ones) I changed classic oldschool Turbo Pascal File I/O methods like:
AssignFile
Reset
Append
Rewrite
ReadLn
WriteLn
Flush
CloseFile
For File I/O to their "fully qualified" System Unit API based analogs of:
System.AssignFile
System.Reset
System.Append
System.Rewrite
System.ReadLn
System.WriteLn
System.Flush
System.Close
They're "straight-to-API" abstracted/based methods (vs. std. 'oldschool/classic' Turbo Pascal methods).
I did so, & on the very note you mention (looking for speed & efficiency gains in file routines)!
I didn't see a change in .exe size OR execution speeds (I profile using oldschool methods of registering a hi-res multimedia timer with the OS to test these @ the START & END of each routine involved to test (yea, primitive but it works)).
That essentially told me that Borland/Embarcadero's compiler designers did the RIGHT THING for their compiler/linker really - not doing extra work for no good reason + using the MOST EFFICIENT METHODS abstracting even the oldschool syntax to the most efficient API based types.
* However, I did see a BIT quicker compilation time (fully qualified naming probably helped there), & I ended up sticking by the API based ones from the System Unit since then actually (just to be 'safe') & use the API itself (most of its built-in to Delphi as it is in C++ dialects usually, unlike languages like VB was where you had to use "declare" to use the API).
APK
P.S.=> I didn't like reading here & hearing other languages do NOT abstract out to the API methods (which is all they're going to do from the language methods anyhow in the end really - iirc, folks here "zeroed-in" largely on Python, Java maybe some others mentioned by others here - since I've used them over time academically or professionally myself in development projects on contract I was semi-concerned even though they worked & it was LONG ago) - Again - That's taking extra steps (From "form" or classic methods then to API & OS) passing to the OS that way afaik & inefficient... apk
They should have viewed this presentation about increasing a python data crunching application 114,000 times faster before they set off on their research project.
To summarize - there are a multitude of ways to optimize your application including using the chip's onboard cache to avoid the overhead/delay of accessing memory on the motherboard across the bus
Yes - as we try to eek out more performance from our applications - we'll need to consider the relationship between our applications and the underlying implementation and capabilities of the hardware it lives on. Further - I would say we also should be considering how to make our tools do this sort of thing for us. Given the complexities we are seeing in the development arena today, including virtualization, the need to do more with less both on the back end, as well as on small hand held devices, and the need to build more faster while increasing security of what we build, I consider it imperative.
Lodragan Draoidh
The more you explain it, the more I don't understand it. - Mark Twain
Based on the comments here, I doubt if many people actually read the paper with any amount of attention before joining the mob with pitchforks.
The authors clearly know Java Strings are not the best thing for concatenation. They even mention StringBuilder as a way to fix the performance problem. This is an example of how things can go wrong if not careful. It is great that so many slashdotters already know their Java Strings so well, but does everybody know their Python, Haskell, etc. too.? The paper is making a general point using a specific example.
A lot of posters mention the problem being comparing an O(N^2) algorithm with an O(N) algorithm. If you read the paper you'll notice that the string concatenation loop they use looks like this:
for (int i=0; i numIter; i++) {
concatString += addString;
}
The running times don't change in a quadratic way, because in their code numIter * length(addString) is always 1,000,000. As the outer loop gets bigger, the concatenated string gets smaller.
The difference in running times comes from the number of string concatenations (the outer loop), and it is clear that as the number of concatenations drops, performance gets better. This indicates that the problem is caused by the immutable String objects, which need to be reallocated and re-initialized. The more this needs to be done, the poorer the performance, as pointed out in the paper.
Speaking of complexity, does anyone know what is the complexity of their disk-only code below?
for (int i=0; i numIter; i++) {
writer.write(addString);
}
The for loop goes over N items, while write() must loop over the length of addString == O(N^2)? In any case, here again numIter * length(addString) is always 1,000,000.
I believe the paper is simulating creating a file in memory and then writing it to disk as opposed to writing the strings to disk as soon an the strings are generated. The alternative/clever ways of generating a string of 1,000,000 "1"s like concatString = addString * numIter are useless in this case, because in reality addString may contain unique data bits and pieces.
There are some unexpected results with the paper's Python tests. The considerable performance differences under Windows and Linux with the same code, or when rearranging a concatenation order are interesting. This paper is actually worth a careful read, even if most of us would never write code that heavily uses Java Strings concatenation!
Nice buffer overflow bug you've got there.
People just say things on here and it's taken as fact. How is Java's String implementation overweight?
Java, like C# does use 16 bit char widths, but that actually makes it faster than variable width characters. That's why these languages do so.
So what about Java strings are 'overweight'?
FFS. I knew the researchers coding fuckup before I even read the code
BWAHHHHHHHHH
Expect the disk to be faster. For the privileged few who bothered to RTFA, you'll understand. The high-level languages used tend to degrade on the specific instructions (string concat) as the number of ops increases; I also wager that this problem is more prevalent on Windows where the memory manager is about as good useful as a clown who thinks he has an eye for fashion. So, distilled, this article should read:
"Researchers find an obtuse way to defy a well-established rule-of-thumb".
Bravo. Or not.
the tests they show only serve to demonstrate how slow python string concatenation is.
they then go on to say that disk writes are bufferered more sensibly, and this is why they are faster.
the key thing to notice is that disk buffering happens in **memory**.
memory is still, and will remain, much faster.
you just have to use it properly.
Don't hard drives have more than 1 Mb of DIMM cache? Or has the onboard RAM disabled for the test?
Put this paper in the dictionary under junk science. This is stupid.
Send these researchers to Mars; They're very much needed there;
Casteism
I would realloc the buffer doubling the size each time it overflowed. This allocation strategy is simple, is bounded to 50% worst case overhead, and requires only log N reallocations for a maximum buffer size of N.
It also happens to be the policy used by Java's ArrayList and presumably by its StringBuilder.
Reminds me of one really strange course on college. The lecturer calculated factorial, up to ten - using DOS 6.22 .bat file. Actually, he provided three "solutions".
/b fact.txt" and probably an echo with "look at the size, this is the result".
:)
One solution was to write ten "if" cases, and just echo the corresponding number, hardcoded.
I can't remember the second solution.
The third was the real "beast". It was based on recursion. "factor.bat" called itself. The batch created one byte file in the beginning. And this file was joined n-times within each iteration. All this to facilitate multiplication, which was not directly achievable in a batch file. In the end, there was "dir
I kid you not. It was something like: for $1==1, fact.bat created a file "fact.txt", with one byte (using echo x > fact.txt) then the file was joined n-times - with type fact.txt >> xfact.txt after fact.txt was added n times to xfact, xfact would be renamed to fact.txt
Of course, in this case, disk access was really slower than pascal version, that run in memory...