Sort Linked Lists 10X Faster Than MergeSort

Posted by kdawson on Sunday February 25, 2007 @07:43AM from the out-of-sorts dept.

virusfree tells us about a new algorithm that its author claims can sort a linked list up to 10 times faster than MergeSort. "BitFast," a member of the Hash algorithms family, is available in C and C++ under the GPL.

Re:It's radix sort. by abradsn · 2007-02-25 08:00 · Score: 4, Insightful

His algorithm is "in place"

radix is not

I agree though that the concepts for both algorithms are very similar.
Re:wtf? seriously. by slamb · 2007-02-25 08:07 · Score: 4, Insightful

Fifth, most people don't sort linked lists.
Sixth, the headline "10X faster" is incorrect, as they differ asymptotically, not by a constant factor. (Run different data sets...vary by size of a single element, and by number of elements in the list. The ratio will change.)
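To make the "ratio will change" point concrete, here is a minimal sketch under an idealized cost model: merge sort is charged n * log2(n) steps and a radix-style sort is charged one step per element per pass. The function names and the 16-bit digit width are assumptions for illustration, and real constant factors (cache behavior, bucket table setup) are ignored.

```python
import math

def merge_sort_steps(n):
    # Idealized comparison count for merge sort: n * log2(n).
    return n * math.log2(n)

def radix_steps(n, key_bits, digit_bits=16):
    # One linear pass over the list per digit of the key.
    passes = math.ceil(key_bits / digit_bits)
    return n * passes

def speedup(n, key_bits):
    # Ratio of the two idealized costs; it depends on both n and key size.
    return merge_sort_steps(n) / radix_steps(n, key_bits)

for n, bits in [(1024, 32), (2**20, 32), (2**20, 64)]:
    print(n, bits, speedup(n, bits))
```

Under this model the ratio moves from 5 to 10 as the list grows from 1,024 to 1,048,576 elements with 32-bit keys, and drops back to 5 when the keys widen to 64 bits, so no single "N times faster" figure describes the comparison.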
Re:wtf? seriously. by phazer · 2007-02-25 08:10 · Score: 5, Insightful

Fifth, the author tries to "GPL" the algorithm, which is utter nonsense. GPL deals with copyright, so the most he can do is GPL his implementation of his bucket-/radix sort. Anyone is free to re-implement the algorithm, GPL or not.

It would require a software patent to restrict the use of the algorithm to GPL programs.

(And sixth, a quick look in a text book would have clued the author in)
solution in search of a problem by belmolis · 2007-02-25 08:44 · Score: 4, Insightful

This is a variant of RadixSort, which is well known to be faster than any comparison sort such as MergeSort. The problem with non-comparison sorts is that as such they are restricted to sorting items representable as unstructured bit fields, which means, essentially, integers. A large part of the time, the real problem in sorting is (a) extracting the fields that you want to use as keys (since it is not generally the case that you want to sort on the entire record) and (b) arranging for each pair of records to compare as you need them to, which involves both recognizing the internal structure of keys (consider the case of dates) and imposing suitable weights for the individual components. In other words, in many situations the bulk of the code and time are devoted to parsing and transformation of records. So long as you are not using a really bad algorithm, the time devoted to the sort itself is likely to be a small percentage of the total time.

For example, I have written a sort utility that differs from most others in its ability to parse the input into records and records into fields and in the transformations it can apply so as to use unusual sort orders and sort on things like dates, month names, and numbers in non-Western number systems. It was originally written for alphabetizing dictionaries of "exotic" languages. It is frequently the case that the time devoted to the actual sort is less than 1% of the run time.

In sum, non-comparison sorts have a niche but are of limited utility because they get their speed from making use of additional information that is only available for a limited set of datatypes. For the great majority of applications, only comparison sorts are flexible enough to be of use.
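As an illustration of the key-extraction work described above, here is a minimal sketch that parses a date field out of a record and packs it into a single integer. The record layout ("name,DD Mon YYYY"), the sample data, and the function name are all hypothetical; they are not taken from the sort utility mentioned in the comment.

```python
MONTHS = {name: number for number, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}

def date_key(record):
    """Extract the date from 'name,DD Mon YYYY' and pack it as YYYYMMDD."""
    _name, date_text = record.split(",")
    day, month, year = date_text.split()
    # Weight the components so that year dominates month, and month dominates day.
    return int(year) * 10000 + MONTHS[month] * 100 + int(day)

records = ["carol,03 Feb 2007", "alice,25 Dec 2006", "bob,25 Feb 2007"]
ordered = sorted(records, key=date_key)
print(ordered)
```

Once every record has been reduced to an integer like this, either a comparison sort or a radix sort can order the keys; the parsing and weighting step is needed in both cases, which is the part the comment argues dominates the run time.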
Re:FROTHY PISS by VENONA · 2007-02-25 09:27 · Score: 5, Insightful

Explanation: pretty much bogus.

As for the quality of your post: let me guess--you also complain about slow, bloated software, right? The old, "Intel giveth, Microsoft taketh away," adage? Users have several orders of magnitude more compute and storage power than 'back in whatever day' yet personal computers seem little more responsive, etc.

Don't feel lonely. There's a large population of lugnuts like you, who, if they think of CS at all, largely carp about how some CS departments haven't become current technology du jour trade schools. Some, unfortunately, have, but that's a whole different discussion, which has been seen on Slashdot time and again.

Algorithm research is important, as is having at least something of a grasp of algorithms. In *your* next programming exercise, since you seem to regard sort efficiency as 'esoteric', feel free to reinvent the bubble sort. Also, tout it as the Next Great Thing, and submit patches against all your favorite apps. That will get you your twenty seconds of fame, I assure you.

Sometimes I love Slashdot--but then I read a post from some random AC idiot like you: the proverbial lowest common denominator. Maybe you should restrain your efforts toward what you seem to regard as cool snarky posts, watch a thread (about which you plainly know nothing) develop, and maybe freaking *learn* something.

OTOH, maybe I'm just bitter right now, because I've just been doing a search through Google news on climate change, and I'm pretty much convinced that the last thing the human race needs right now is people like you.

--
What you do with a computer does not constitute the whole of computing.
Re:It's radix sort. by Tim+Browse · 2007-02-25 09:41 · Score: 5, Insightful

Who do you think's going to get the credit when they finally invent fission? The person who thought it up or those that actually made it?
Well, I'm not sure, but I'm pretty sure it's not going to be the guy who made it again 30 years after everyone was already using it.
Re:radix sort vs. comparison sort by Tim+Browse · 2007-02-25 10:30 · Score: 5, Insightful

I don't know what possessed me to look at his code
Damn you! You made me look at his code! The goggles, they do nothing.
His cleverness gets the better of him when it comes to his modified radix sort. For example, he creates two arrays (on the stack) of 65535 elements; apparently unaware that this creates an array with indexes 0..65534.
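The off-by-one is easy to check with a small sketch. The names below are made up for illustration and this is not the BitFast source; it only shows that a 16-bit digit can take the value 65535, which a 65535-entry table cannot hold.

```python
BUCKETS_TOO_SMALL = 65535   # valid indexes are 0..65534
BUCKETS_NEEDED = 1 << 16    # 65536 entries, valid indexes 0..65535

def high_digit(value):
    # Top 16 bits of a 32-bit unsigned value.
    return (value >> 16) & 0xFFFF

def fits(table_size, digit):
    # True when the digit is a valid index into a table of that size.
    return 0 <= digit < table_size

digit = high_digit(0xFFFF0000)
print(digit, fits(BUCKETS_TOO_SMALL, digit), fits(BUCKETS_NEEDED, digit))
```

In Python an out-of-range index raises IndexError; in C the same access is undefined behavior and would typically write past the end of the stack array without any warning.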
I can't help feeling he should have declared Ptrs and Tails as Node* arrays, and bypassed all that random casting to longs. Not sure what's going on there. But then anyone who thinks the roundabout way he used of accessing the top 16 bits of a 32-bit memory value is 'cool' is definitely on my list of people most likely to re-invent the radix sort badly.
At first, I thought it wasn't a stable sort, but looking further into it, that's because he mixed up the 'head' and 'tail' descriptions in the explanation (or possibly in the code).
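For readers wondering how the head and tail tables relate to stability, here is a minimal sketch of a linked-list radix sort that keeps one head and one tail reference per bucket. It is written from the general textbook technique, not from the BitFast code; the class and function names are hypothetical, and keys are assumed to be non-negative integers that fit in key_bits.

```python
class Node:
    def __init__(self, key, tag=None):
        self.key = key
        self.tag = tag      # payload, used here to observe stability
        self.next = None

def bucket_pass(head, shift, digit_bits=16):
    """Redistribute nodes by one digit, preserving their relative order."""
    size = 1 << digit_bits
    mask = size - 1
    heads = [None] * size   # first node of each bucket
    tails = [None] * size   # last node of each bucket
    node = head
    while node is not None:
        following = node.next
        node.next = None
        d = (node.key >> shift) & mask
        if heads[d] is None:
            heads[d] = node
        else:
            tails[d].next = node   # append at the tail: this keeps the pass stable
        tails[d] = node
        node = following
    new_head, last = None, None
    for d in range(size):          # stitch the non-empty buckets together in order
        if heads[d] is None:
            continue
        if new_head is None:
            new_head = heads[d]
        else:
            last.next = heads[d]
        last = tails[d]
    return new_head

def radix_sort_list(head, key_bits=32, digit_bits=16):
    # Least significant digit first; each pass relinks the existing nodes.
    for shift in range(0, key_bits, digit_bits):
        head = bucket_pass(head, shift, digit_bits)
    return head

def from_pairs(pairs):
    head = None
    for key, tag in reversed(pairs):
        node = Node(key, tag)
        node.next = head
        head = node
    return head

def to_pairs(head):
    out = []
    while head is not None:
        out.append((head.key, head.tag))
        head = head.next
    return out
```

Appending each node at its bucket's tail is what makes the pass stable; pushing nodes onto the head instead would reverse equal keys on every pass. The sketch relinks the existing nodes and allocates only the two bucket tables, which is the sense in which a linked-list radix sort needs no second copy of the data.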
It is amusing that some /. posters think that doing this in-place is somehow an amazing leap of insight. Sometimes /. is like reading thedailywtf.com - you see something dumb as the main story, and then find half the people commenting on it have even less of a clue.
I must be new here.