Sort Linked Lists 10X Faster Than MergeSort
virusfree tells us about a new algorithm that has been developed that the author claims can sort a linked list up to 10 times faster than MergeSort. "BitFast," a member of the Hash algorithms family, is available in C and C++ under the GPL.
http://en.wikipedia.org/wiki/Radix_sort
It's still in the N log(N) time class. Mergesort has minimal time overhead to prepare and only N memory overhead. Do we have that for this algorithm?
How is this different from Radix sort?
This isn't sorting linked lists, it's sorting linked list of NUMBERS (or, more specifically, N-bit strings.) Doing that fast is hardly rocket science, and this bitfast is nothing new... That's basically radixsort with radix = 2^16.
Nice, but you could just copy the list to array and do counting/radix sort. Does nothing for strings, floating point values, structures...
It's a bit of an apples vs. oranges comparison to put this up against mergesort - mergesort is a comparison-based sort, while Papadopoulos' bitfast is a radix sort and thus O(N*W) where N is the number of elements and W is the width of each element in bits. (hint - try sorting 1000-byte strings with bitfast, and see which is fastest) And no, it doesn't have anything to do with hashing.
However, it's definitely a clever way of implementing radix sort with linked lists, which may make it useful in some cases (e.g. OS internals) where you don't want to allocate space for a big directly-addressable array.
First, it's just a radix sort [well that's my take]. Second there are too many google ads on that page. Third, merge sort is O(n log n) time, hard to beat for random data. Fourth, most people use variations of the QUICK SORT not merge sort.
Tom
Someday, I'll have a real sig.
It's useless to talk about the speed of a sort algorithm without discussing its complexity.
Linear improvements are completely uninteresting. Take algorithm X in Perl, then a version in optimized assembly, and there you go, a 10X improvement without having done anything groundbreaking.
A 10X faster bubble sort is just as useless for large enough values of n, and a 10x slower merge sort would still have a very good performance in comparison.
It was used by 80-col card sorters, since circa 1925.
See Knuth, Volume 3
Since the article has nothing, I wanted to ask: has anyone implemented merge sort where you merge from both the front and back using two simultaneous threads ? I haven't seen this anywhere, but it's an obvious improvement for multicore machines. Does anyone know how to further parallelize merging (or for that matter, heaps) ?
In Soviet America the banks rob you!
Let me know when there's a video on Youtube of the sort doing it's thing with colored bits and animation.
Reviewing just the first hour of video games.
No.
You are an "HTML programmer".
An oxymoron and a moron, simultaneously.
Defenestrate yourself.
"Those who cannot remember the past are doomed to repeat it." - George Santayana
This was written by VirusFree! Who wouldn't want to use something by a guy named VirusFree? It's a million times better than something by FreeVirus.
AFAIK, quicksort (as well as mergesort) is optimal algorithm for comparison based sort, which means you can not find anything asymptotically better than that.
And how about submitting it to academic journals if you come up with something better?
Against my better judgment, I skimmed the article, code.
Two observations:
Why no analysis of impact on memory usage? 10x speed-up, but at what cost? Omitting this detail makes it difficult to weigh the utility of the algorithm.
Not an Apples to Apples comparison. In the source, the "BitFast" sort converts the entire link-list into an array before performing the sort. Without even considering the merits of the actual algorithms, the Mergesort has no chance of being faster given the more complicated data structure used.
As others pointed out, this sounds like the Radix sort, but I was too disgusted to step through the source in detail and even verify this. Maybe someone else will have a stronger constitution.
The poster has just implemented a radix sort. One can do them quite faster, but of course, they are basically restricted to sorting numbers, or at least bit strings with a reasonable diversity in the distribution of prefixes.
/* Copyright (c) 2007 David Kastrup dak@gnu.org */
For a pretty efficient mergesort implementation in C/C++, try the following (has at one time been offered to libg++, but not taken up). It sorts, as an example, stdin line by line to stdout. The whole recursion stuff has been flattened into loops and binary arithmetic.
<blockquote>
#include <cstddef>
#include <iostream>
#include <string>
using namespace std;
template <class T,class V> class listsort : V {
public:
listsort(const V &arg) : V(arg) { }
T * &sort(T* &headin, size_t n) const
{
size_t underpow,n1,n2,bitrev;
int lev, kmax;
T **head = &headin;
T **headstack[CHAR_BIT*sizeof(size_t)];
T **tail;
T **tail1;
T *p1;
T *p2;
int sp = 0;
switch (n)
{
case 0:
return *head;
case 1:
return getlink(*head);
}
for (underpow = 1, kmax=0; underpow <= (n-1)/4;) {
underpow *= 2;
++kmax;
}
for (lev = kmax, bitrev=underpow, n -= 2*underpow;;) {
p1 = *head;
tail = &getlink(p1);
p2 = *tail;
if (listleq(p1,p2)) {
tail = &getlink(p2);
} else {
*tail = getlink(p2);
getlink(p2) = p1;
*head = p1 = p2;
}
switch ((bitrev+n)>>lev) {
case 2:
p2 = *tail;
if (listleq(p1,p2)) {
tail1 = &getlink(p1);
p1 = *tail1;
if (listleq(p1,p2)) {
tail = &getlink(p2);
break;
}
*tail1 = p2;
} else
*head = p2;
*tail = getlink(p2);
getlink(p2) = p1;
break;
case 3:
headstack[sp++] = head;
head = tail;
++lev;
continue;
}
for (;;) {
if (lev == 0)
return *tail;
if (!((bitrev>>--lev)&1))
:-)
hello world
It may disappoint you, but you lost to a post titled "Frothy piss". If getting a "first post" is really that important to you, I suggest you take this as a lesson and re-evaluate your priorities in life. Being nearly as good as "Frothy piss" is a sad, sad life achievement.
You're making the baby Knuth cry. Please read "The Art of Computer Programming" before proceeding any further.
I've done a bit of scientific programming, but I'm not a computer-science-oriented person by any stretch. Thus, I'm a little confused by this. Under what license are things like Quicksort and Stupidsort (for another, very bad example) "released"? Why would a person feel it necessary to release an algorithm under any kind of license at all, in fact? No textbook would want to include GPL-ed code--wouldn't that make the entire book GPL by nature of the license?
Please correct me if I'm mistaken, but this seems like a fairly bad precedent to set.
Ok, I'll start from his site:
- Programming Skills : Visual Basic ***Excellent***
Yes, that certainly is... excellent.
- Message : "Don't ever let school, stop you from learning!"
School could, learn you some grammar.
The algorithm is being released under the GPL ( General Public License ). The algorithm belongs to PhoenixBit and VirusFree but you may use/modify it freely.
*** DO NOT COPY ANYTHING FROM THIS PAGE TO ANY OTHER PAGE. IF YOU WANT SOMEONE TO READ THIS THEN LINK TO THIS PAGE ***
In addition to trying to apply copyright to an algorithm, doesn't a restriction on copying defeat the purpose of releasing something under the GPL? Or does text in all caps trum previous text not in all caps?
Feel free to add to this. If there are some clips of this guy with lightsabers, pleas post them.
The algorithm belongs to PhoenixBit and VirusFree but you may use/modify it freely.
Huh? You can copyright the implementation you have (what allows you to put it under the GPL). That does nothing to control the algorithm... you would have to attempt to get a patent to do that.
This is a variant of RadixSort, which is well known to be faster than any comparison sort such as MergeSort. The problem with non-comparison sorts is that as such they are restricted to sorting items representable a unstructured bit fields, which means, essentially, integers. A large part of the time, the real problem in sorting is (a) extracting the fields that you want to use as keys (since it is not generally the case that you want to sort on the entire record) and (b) arranging for each pair of records to compare as you need them to, which involves both recognizing the internal structure of keys (consider the case of dates) and imposing suitable weights for the individual components. In other words, in many situations the bulk of the code and time are devoted to parsing and transformation of records. So long as you are not using a really bad algorithm, the time devoted to the sort itself is likely to be a small percentage of the total time.
For example, I have written a sort utility that differs from most others in its ability to parse the input into records and records into fields and in the transformations it can apply so as to use unusual sort orders and sort on things like dates, month names, and numbers in non-Western number systems. It was originally written for alphabetizing dictionaries of "exotic" languages. It is frequently the case that the time devoted to the actual sort is less than 1% of the run time.
In sum, non-comparison sorts have a niche but are of limited utility because they get their speed from making use of additional information that is only available for a limited set of datatypes. For the great majority of applications, only comparison sorts are flexible enough to be of use.
Ideally, I'd like a tag for the clump of "Computer science by people who haven't done the background reading" posts we get. I haven't seen "my algorithm compresses everything, even compressed files!" or "I've proved that P = NP!" posts in a while. But we should be prepared.
How about "badcomputerscience"?
I am sure Microsoft has patented it by now.
in the source examples, why is main of type long?
Just killed himself I guess. Not to mention that this news is total trash, sadly. Being faster than O(n log n) with random data, is pretty hard
That's why he put it up. This is just some huckster trying to (and succeeding in) making money off of idiot Slashdot moderators and viewers.
--an unbreakable toy is useful for breaking other toys--
Kindly, polite encouragement of younger computer programmers is preferred to flaming them.
Retard who thinks you can GPL an algorithm manages to reinvent an algorithm first implemented as a computer program in the 1950s, and used way before then outside of digital computers.
Also manages to not know about memset and write C code that assumes sizeof(long)==4 and puts non-inline code in header files. I don't dare look at the C++ code since there's a large probability it will cause permanent brain damage.
a new algorithm 10x faster...from two guys calling themselves VirusFree and PhoenixBit?
sigh
Bit level comparison of floating point numbers won't work in general, and more specifically, it doesn't work in the code you posted.
Also, after translating your time functions for unix, your algorithm does not terminate unless compiler optimizations are turned on (gcc -o moo main.c -O3).
Barring the fact that floating point is broken, comparing your radix sort to another array-based sorting algorithm might be more productive. O(N log N) is the lower bound on non-parallel, and much of the battle is coming up with lower multiplicative constants, as you are trying to do. Again, your comparison
You state that you need a constant amount of memory, in your case this is sizeof(long) * 2 * 2^16 = 2^19 bytes, which is half a meg.
Testing the long version doesn't seem to get equivalently sorted lists either. For one thing you seem to be sorting in the opposite direction as mergesort.
Ignoring correctness, you only have a speedup of ~1.3 over qsort.
Perhaps coming from a math background, my priorities differ from those with a CS background, but...
Where's the proof?
Is it peer reviewed? Has it been published? Does it exist?
Although the lack of a proof does not make your algorithm is incorrect, I'd be suspicious. In fact, I would be unwilling to use any algorithm in my code without being sure that a good proof of its correctness exists somewhere.
Proofs are more than formalities, they're essential: without a good proof, you're just a crank selling snake oil.
Nothing sucks like a Vax, nothing blows like a PowerMac G4
Unless I'm very much mistaken, there's a blatant buffer overflow in this code. In bitfast.h the "Tail" and "Head" arrays are defined as having 65535 (0xffff) entries. That's from 0-65534 inclusive.
The code in "BitFast" then uses the array as if it can access Head[0xffff] and Tail[0xffff] - which is one past the end of the arrays. I'd guess it works by coincidence on Win32.
Additionally, although there might be some reason I can't follow at the moment, in the float version only the top 15 bits of the values are seemingly used. This would seem to ignore the sign bit of the FP representation. Maybe there's another subtlety I'm missing, but I can't see it at the moment.
Given these basic misunderstandings of programming concepts I'm not surprised the authors hadn't known of the radix sort which many people here have linked as being very similar indeed.
GodboltBlog
1. Loop ......................AddHobbie ("Computers"); ................Do Until 1=2
Since he says he's "**Excellent**" in Visual BASIC, one could ponder how he MIXED UP THE Do AND Loop STATEMENTS.
2. Don't ever let school, stop you from learning!
He apparently never had...
3. 10 times faster than MergeSort? That's asymptotically, right? O(10nlogn). Call the press. (Doesn't matter that Radix Sort actually takes O(n), as long as it's 10X!)
4. BitFast start by creating two arrays of size 65535 ( pow(2,16) )
Anyone got a calculator? Somehow his numbers don't add up. Is it actually possible for a positive power of two to be odd?
And...Since when do you use pow to calculate powers of two?
5. They are not the most error free,optimized,commented,beautifull code to have ever been written
Gee...really?
Explanation: pretty much bogus.
As for the quality of your post: let me guess--you also complain about slow, bloated software, right? The old, "Intel giveth, Microsoft taketh away," adage? Users have several orders of magnitude more compute and storage power than 'back in whatever day' yet personal computers seem little more responsive, etc.
Don't feel lonely. There's a large population of lugnuts like you, who, if they think of CS at all, largely carp about how some CS departments haven't become current technology de jour tradeschools. Some, unfortunately, have, but that's a whole different discussion, which has been seen on Slashdot time and again.
Algorithm research is important, as is having at least something of a grasp of algorithms. In *your* next programming exercise, since you seem to regard sort efficiency as 'esoteric', feel free to reinvent the bubble sort. Also, tout it as the Next Great Thing, and submit patches against all your favorite apps. That will get you your twenty seconds of fame, I assure you.
Sometimes I love Slashdot--but then I read a post from some random AC idiot like you: the proverbial lowest common denominator. Maybe you should restrain your efforts toward what you seem to regard as cool snarky posts, watch a thread (about which you plainly know nothing) develop, and maybe freaking *learn* something.
OTOH, maybe I'm just bitter right now, because I've just been doing a search through Google news on climate change, and I'm pretty much convinced that the last thing the human race needs right now is people like you.
What you do with a computer does not constitute the whole of computing.
Ah, the book of gods and the cure for insomnia. I have read that book 3, 4 times. The best I can master is 4 pages before falling asleep! It was a of great help when I was building an index method for Self Balancing Multiway Tree, with key compression and redundant data removal for the IBM S/34 and S/36.
Who knew Webster's dad was such a great hacker?
Modern copyright is theft of culture from everyone and it retards the progress of the useful arts and sciences.
I am not saying he is Ramanujan or anything like that. But still every amateur deserves to be forgiven once.
sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
Of course one would hope that the patent examiners at least have heared about radix sort, and therefore would reject that patent due to prior art ... but then, maybe he can avoid that by adding "with a computer" :-)
The Tao of math: The numbers you can count are not the real numbers.
Inside bitfast.h:
// with /* */, malloc and new[]
o rt
long *Tail = new long[65535];
long *Head = new long[65535];
for(n = 0; n < 65535; n++){ Head[n] = 0; Tail[n] = 0; }
Memory leak of 128KB for each time the function "Node* BitFast(Node* HeadNode,long run)" is called.
MISSING: delete[] Tail; delete[] Head;
Furthermore the following is faster or use memset:
long Tail[65535] = {0};
long Head[65535] = {0};
Unless you are running this in DOS16, DOS32 with Pharlap or smaller 8051, 68000 processor, you do not need to use new[] or malloc() for this.
In InsertToList, Node *tmpP = (Node*)malloc(sizeof(Node));
free(tmpP) missing inside FreeList()... no such function
In main.c:
Node *L1 = malloc(sizeof(Node));
Node *L2 = malloc(sizeof(Node));
MISSING: free(L1); free(L2);
In main.cpp:
Node *L1 = new Node;
Node *L2 = new Node;
Instead use (no need to use heap, use stack memory):
Node L1 = {0}, L2 = {0};
Mixing the usage of std::cout and printf() is not recommanded.
Use all of the former or the later, else you will have weird display on some run, unless you flush the output buffer. I suggest you use printf() all the way.
cout << "Ok!!!!" << "\n\n" << "Merge Sort : " << tim1 << " ms\n" << "BitFast : " << tim2 << " ms\n\n";
becomes:
printf("Ok!!!!\n\nMerge Sort : %ld ms\nBitFast : %ld ms\n\n", tim1, tim2);
Furthermore, your implementation of link list, and calling Length(Node*) for every equal is highly inefficient, requires O(n). Store the link list length somewhere when inserting. EqualSplit takes O(1.5n) instead of O(0.5n) because of that.
Something like:
typdef struct LinkList {
Node head;
int length;
} LinkList;
As a result, MergeSort recursively call itself, which calls EqualSplit every turn.
You should make your header file C friendly, and avoid mixing
No clue why this header is needed:
#include <iso646.h> instead of not/and use use ! and &&
before: if(not p) return NULL;
becomes: if(!p) return NULL;
Finally, BitFast is not recursive; therefore, you should try to find an implementation of merge sort which is not recursive also, so you save on function calls, if possible.
"However, iterative, non-recursive, implementations of merge sort, avoiding method call overhead, are not difficult to code." -- Wikipedia
http://en.wikipedia.org/wiki/Merge_s
Your function should be in a file.c or be inline.
Merge sort is optimal. A good implementation will be as fast as any other O(n log n) algorithm, maybe
withing a factor of two, as long as general purpose CPUs are used.
It seems this is an O(n) algorithm, of which there are two: Bucket-sort and Radix-sort. No other ones are possible. These are better in theory and practice, but have limited applicatbility. i.e. do not work for all input data. Merge-sort is an universal sorting algorithm.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
You can copyright a specific implementation, or your own specific description of the algorithm, but you cannot copyright the algorithm itself.
For example, the specific instructions (the list of ingredients, and the process of mixing them together) of how to bake your dear Aunt's double layer chocolate cake cannot be copyrighted. However if you want you can write a little booklet describing how your Aunt would make this cake for your birthday and how much it meant to you and incorporate the ingredients and the process as part of the story. Or you could collect all of your families recipes together and make a copyrighted cookbook of the collection.
Here is the backing comment from the US Copyright office:
http://www.copyright.gov/circs/circ61.htmlIf you want to protect the idea/algorithm itself, file a patent
Besides .. it's just a radix sort..
It's "I ran his sort on this post", but I like "I ran his sort on shit pots" better.
While the combination of radix sort and linked list is interesting, the implementation is quite poor: to contain all the bit in a 16bit word you need an array of 65536 bits not 65535!!
Minor points:
-It's using an ugly hack to split the 32bit values into the low or high 16b, why not just duplicate the function?
-Ptrs and Tail are initialised manually instead of using memset which can be more efficient.
-It would perhaps be worth using one combined array for Ptrs and Tail to improve cache locality (probably a very minor gain).
The Art of Computer Programming, vol 3, Sorting and Searching, 1973, by Donald Knuth Page 173, Algorithm R - radix list sort. No credit is given. The algorithm has been well known and obvious since the first link list was invented. Hey! A good choice for a patent!
Don't use this code !
:
:-)
First, there are multiple bugs in the code itself :
1) everywhere, it is assumed that 2^16 = 65535 and not 65536. Arrays are allocated for this size too, but are used with 65536 entries. (un)fortunately for him, the space is probably allocated anyway so the program does not crash on this.
2) a similar bug in the random number generator carefully avoids dangerous values, because he generates numbers between min and max by doing a modulo on max-min , which prevents max from being produced. This is not a
big deal anyway, just that it hides the bug.
3) he assumes everywhere that "long" is unsigned and "short int" too. This makes the code SegFault on Linux with negative array indexes. The fix is obvious, though.
And most of all : times are measured in optimal conditions. Indeed, this algorithm is
just a radix sort with a large radix (2^16). A radix algorithm is efficient when the number
of elements is high compared to the radix, because at every level, all radix entries have to
be scanned for existing values. In his example, he sorts 4 million elements. His code is faster
than the merge sort on 4 million entries, just as would be any radix sort. But on smaller sets,
it still takes the same time per level, because it has to scan all 65536 entries even if there
are only 100 numbers. The fact is that for less than 10000 values, the mergesort becomes faster in his own code.
When measuring algorithms complexity, we use the O(x) notation. The time per iteration is never
considered. When the time matters (as in his example), we should consider the time per iteration (T), and the total time (t) will follow
t = T . complexity
In complexity, his algorithm is O(N) on 16 bits, but the time per iteration is very high. Other sorting algorithms are O(N.log(N)). So the total time is :
t1 = T1 . O(N) for his algorithm, and :
t2 = T2 . O(N.log(N)) for the mergesort.
But when T1 > T2.log(N), he looses, which is almost always because log(N) is small for 16 bits, and T2 too because operations are very simple. In his case, T1 is very high because it represents the time required to scan 65536 values.
It's amazing that he fell in such a beginner's trap. I suppose that he's still a student, which would explain the total lack of portability and newbie's bugs in the code.
Well tried anyway
Willy
I nominate this slashdot post to the Worst Slashdot Post for 2007. It's going to be hard to topple.
Tried to post some lovely code here, but was refused by the "Lameness filter", which said I have too many "junk characters". Well, of course I do - it's C++ code! Lots of ampersands and such, which apparently it won't accept even embedded in ecode or blockquote tags.
Anyone have any advice?
Pick the Knuth: both have a round face, thin hair and no teeth!
image 1
image 2
I invented Radix sort once, too. Who hasn't? Let him have his 15 minutes of fame.
if this new sorting algorithm worked with arbitrary data, it could be applied by the patent office to find (10x faster then merge sort) that the algorithm is not new.
Maybe this guy has re-invented radix sort? I can't really tell! I wish somebody would post and tell me!
I think every thing's been said about the radix search on a SORTED linked list. I think his benchmark excludes his sort making it faster. Plus if I wanted a sorted list I'd use an data structure that inserts the value, reducing the sort overhead, I know the trade off and it's debatable.
They should rename it to lamers script kiddie intro.
For his buffer over flow, he uses a classic sled technique without even knowing what it is. He calls it a buffer. You don't even need to do that just get the address, offset and insert.
also fails to mention that if it's on the heap most o/s's won't execute from there.
Sorry if there are any typos it's my Birthday and I'm drunk!
Plus I'm annoyed I was excited to check out a new algorithm and what a disappointment.
And my validation word is warrants that's scary! I'm strictly white hat I swear!!!!
1. Although it's technically O(n) compared to QuickSort's O(n log n), consider the fact that
a. the number of passes required by radix sort is proportional to the key size, and that
b. if the keys are primarily unique, then the key length will be O(log n)
2. One cute feature of radix sort for very large data sets is that you only need linear access to the data, so it can be done with non-random-access external storage. The one practical implementation I've ever heard of used four 9-track tape drives.
However, considering use in main memory, both radix sort and quicksort make efficient use of a D-cache that is 2-way set-associative or better, and both hit the whole data set once per pass -> if it swaps, it'll swap a lot
And here's a sort that can be up to infinity times faster than mergesort:
void (linked_list * list)
{
if (!list->sorted)
{
qsort(...);
list->sorted = TRUE;
}
}
This algorithm has a best case of O(1), which is better than mergesort (best case is log(n)). Somehow the 'up to' part in 'up to 10x faster' doesn't instill much confidence.
I did appreciate the number of google ads (plus the inconspicuous donation button) along with the "License" that said 'don't copy this text, rather link to the page'. The sort isn't earning him the money...showing up on /. is.
First of all, he deserves to be made fun of. If you think you have discovered something new and amazing, how about taking some time out to find out if you actually did or not before making nonsense claims? And nobody is saying he's a bad person for this, just that its stupid and shouldn't have been posted. It reflects more on the /. editor who was moronic enough to front page this than it does on the guy who wrote the (absolutely terrible) code. (kdawson, this is why clueless fucktards hide out in the games section).
And finally, trying to GPL it (as stupid as that is) is very much a bad thing. Its equally bad as trying to cash in on it. Either way you are restricting the use to a select group of people, wether those are people willing to pay, or people who are GNU/communists doesn't make any difference. If you want to make something free, then make it free, for everyone, for anything, period. Otherwise don't try to pretend you are doing something altruistic, when you are really doing something selfish.
RAIDICKS!
For once, goatse is on topic!!!
Mere Aggregation is, in my opinion, the most overlooked aspect of the GPL. It gives developers the freedom to work with GPL code, without requiring that everything they touch then be GPL. Thus allowing GPL code to work its way into a closed system, one component at a time, and may the best code win. In this manner GPL projects can then attract contributers and maintainers from a much larger developer pool, thereby increasing the overall robustness of the community.
This is the "real" viral nature of the GPL, not that FUDish crud about 'if anything is GPL then everything is required to be GPL'.
(Please note, I am not talking source code level components, but rather independent pieces of software working in concert.)
Building a better backup.
Zettabyte Storage
The previous few posts caused me to also open up the code. Wow! What a ride! I decided that I wanted to add a few things I found as well.
I loved that he referred to his use of the short pointer as a "cool pointer hack to avoid shifts and ifs!!!" To begin with, if we have to call that a "cool hack" then the requirements for coolness have definitely dropped a bit. I am also confused as to how this helps avoid shifts and ifs. As far as I can tell, it avoids one shift. I'd love to see the code without the "cool hack" because I'm intrigued as to how he uses an if statement to remove the higher bits.
As stated before, I am also fond of his use of an array of size 65535 rather than 65536. I surprised he didn't run into any segmentation faults. I know that I can give him a test case that will seg fault pretty easily.
I also question whether he understands the purpose of a header file, as the entire source for his BitSort is contained within a header file. I guess it would sort of make sense if the function was declared inline, but it isn't.
Another fun element is how non-modular his code is. Everything is hard-coded, even though it would have been easy to declare a few constants or even make the function a template.
Lastly though, why the crap did this article get through the editors. It is, as has been stated over and over, a radix sort and nothing different. I know that he claims it is different because it is in-place, but that is such an obvious simple change that it does not warrant a whole new algorithm. Regardless of the content, the article is so poorly written with some of the worst grammar I have read in quite some time. Slashdot really needs to improve its article standards.
You can do that with merge sort as well, right? First you write n/2 unsorted records to tape A and the other n/2 unsorted records to tape B. Now you say each of those tapes has n/2 sorted lists of length 1. Then, you read both A and B at once, and you merge the first two sorted lists on A and B and write them to tape C. You take the second two sorted lists on A and B, merge them, and write them to tape D. Continue in this manner, alternating your writes between C and D, until you run out of data on tapes A and B.
Now, tapes C and D each contain a sequence of sorted lists, but each one is twice as long as the lists on tapes A and B. So you repeat the step and read from C and D and write to A and B. Every time you do this, the sorted lists get twice as long and there are half as many of them, until eventually you have one list with all the sorted elements.
For extra added efficiency, take the original input and sort it using whatever sort you choose (including mergesort) in RAM before you do the initial write to tapes A and B. This reduces the number of passes to log2(data size / ram size) instead of log2(data size).
The best best case sort you can get is n. An optimal algorithm would always hit that.
What it has is the optimal worst case, which is not the same thing at all.
When picking a sorting algorithm, you have to consider the best, worse, and average cases, and specifically how it deals with the kind of dataset you have.
Mergesort's best, worst, and average cases are all the same - nLog(n)
It's sort of like a compromise. It's not really fast at any particular case, but, then again, it's never bad.
If your data is perfectly random, or you know nothing about it, then merge sort is your algorithm.
However, on semi-sorted data, quicksort performs better or worse than merge sort. Better if it's almost sorted in the correct order, (O(n) if it is sorted in the right order), or worse if it's sorted mostly in reverse order O(n^2) if entirely in reverse order.
Most data that is created by humans is partially sorted (i.e. in human data entry cases)...so there are many places that quicksort is usually a better choice. However, given any unknown set sample, there is a much higher likelihood that the set will be mostly random - meaning on average when considering all possible sets, mergesort performs better.
These are the two extremes in the sorting world - quicksort has best best case, but the worst worst case, while merge sort has the best worst case, and the worst best case.
All other sorting algorithms that are considered efficient fall between these two extremes and take advantage of some piece of knowledge about the data (keep in mind that sorting algorithms in general *don't* distinguish between in-place and not in-place. This is almost universally a trivial detail that doesn't effect the order of the algorithm) that lets you do better than O(nLog(n)) on that specific kind of data.
It always seems strange to me that people learn what these two algorithms are without learning how they define the boundaries of the solution space. If you don't know what they're for then what's the point of learning more than one at all then?
Mod me down and I will become more powerful than you can possibly imagine!
A$$hol is a beggar. Has removed the link and inserted Make Donations.
cartoon about Knuth
Buffer Overflow in Action Tutorial - Part 1
Ever since I got out of school, I almost never had to worry about sorting algorithms. I let the database do the sorting almost all the time (whether desktop DB's like FoxPro and MS-Access, or Oracle). Perhaps it is important in embedded software or systems software, but for business applications it seems most let the database do it, and we forget all those Comp-Sci sorting algorithms.
Table-ized A.I.
+1 Venomous
Ad b: This is not true. If you assume that the keys grow polynomially with n, let's say n^3, then just let the number of buckets in the radix-sort grow linearly (i.e., use radix n instead of some fixed radix). Then the number of passes will be equal to the degree of the polynomial, so in the cubic case three passes will be sufficient, leading to a linear-time algorithm again.
Radix-sort stops being usable if the integers grow superpolynomially, but then you can use a bunch of other techniques (see papers on sorting integers by Mikkel Thorup), leading to complexities on the order of n*log log n or even better.
That's the kind of thinking which leads to the current kind of bloated and slow programs. Of course a better algorithm is always the first step, but programmers should always try to optimize their bottleneck algorithms. Would you like it if your software was 10 times slower? What about 100 times slower?
In my opinion, it is ridiculous to say that linear improvements are irrelevant. If we believe the current predictions about CPUs (number of cores increasing as opposed to GHz), this is even more true, since the old excuse of "the hardware will get better in 6 months" will stop being true.
The AACS key is NOT 0xF606EEFD628B1CA427BEA93A9CA9773F
Slashdot editors should have rejected this story.
Now, now, VENONA. You're simply experiencing the #1 tradeoff in having above-average intelligence: More of your fellow human beings are simply going appear stupid to you.
And thank you for having expressed my point of view entirely. The first thing the snap-judgment part of my mind presented me when reading the AC post was, "Probably drives a Hummer, proudly."
"Press to test."
(click)
"Release to detonate."
So mergesort's O(n log n) is replaced by O(n/10 log n). That's impressive!
The guy has replaced the page with a link to Radix sort on wikipedia.
Somebody said that this isn't a gym class and this guy was treated a bit harsh. Apparently People replying to this have been seen the dark side of the gym class, and here is their chance of giving their revenge on the newbie :P
Nice move from the poster to put this guy infront of the firing squad. Now he has been called retard, his CV and everything he tried to do was shot down. Welcome, newbie, to the world of mature professionals :)
Critisism is in place, but calling him a retard is a bit too much. Newbie perhaps, but retard? Ignorant or stupid? Well, hopefully this has made the world a better place. Mr. Virusfree will atleast think twice before publishing anything in fear of getting into slashdot, where newbies get what they deserve.
I don't think this guy deserved all this. Is the function of /. to kill newbies? Hows about starting a week of newbie hunting, and searching the web for sites made by kids doing silly things with computers and introducing them to the professional firingsquad and hopefully make these 'retards' to stop touching computers ever, since they apparently are too stupid to be allowed to use them. See how many websites will Slashdot be able to take down by public ridicule campaigns? Revenge the treatment you got in gym classes, theres always somebody weaker than you to smite!
In the winter it is much colder than in the night.
As long as a turing machine uses finite memory of length N, there exists a DFA of N^2 length that accomplishes the same in O(1). Essentially, your stick of memory is a finite number (X). Your algorithm, whatever algorithm it is, deterministically translates X into a new finite number (Y). I can avoid running the algorithm by having a static array that maps X to Y. To turn a 16-bit computer into an O(1) DFA, I need 2^32 bits of memory (a 65536-element array where each value is a 65536-bit string).
Anytime O(...) notation is used, it assumes arbitrary length input (a turing machine is never assumed to work on a finite tape, because it would so obviously reduces to a DFA otherwise).
Furthermore, optimal sorting is rarely expressed as O(N lg N), it is O(N lg k) where k is the # of unique elements. Because most textbooks want to use single-variable Big-O notation, they stick with O(N lg N), but more serious analysis uses multi-variable complexity. k is only pseudo-dependent on N, in that it has to obey the relationship k = N. In the worst case, k = N, and hence sorting is N lg N when expressed in terms of a single-variable. If we're serious about sorting it is O(N lg k). When k is finite, it reduces to merely O(N).
Seems like someone wanted to make some clickthru money from /. ?
-- A change is as good as a reboot.
I've written both an implementation of Radix sort and Merge sort in C using linked lists. With integer keys Radix sort is indeed faster by about a factor of 2 on large lists (10^7 keys). The stack management for the non-recursive Merge sort take more time than the very simple operations in the Radix sort.
The Radix sort is fairly simple:
typedef uint32_t KEY;
enum {
KEYBITS = 32,
STEPBITS = 11,
};
enum {
NUMBUCKETS = 1 << STEPBITS,
};
#define MASK ((KEY)(NUMBUCKETS-1))
typedef struct node NODE;
struct node
{
KEY key;
NODE *next;
};
NODE * sort(NODE *a)
{
NODE *bucket[NUMBUCKETS+1];
NODE **tail[NUMBUCKETS];
unsigned shift, i;
bucket[NUMBUCKETS] = NULL;
for(shift = 0; shift < KEYBITS; shift += STEPBITS){
for(i = 0; i < NUMBUCKETS; i++)
tail[i] = &bucket[i];
while(a != NULL){
unsigned index = (unsigned)((a->key >> shift) & MASK);
*(tail[index]) = a;
tail[index] = &a->next;
a = a->next;
}
for(i = NUMBUCKETS; i-- > 0;)
*(tail[i]) = bucket[i+1];
a = bucket[0];
}
return a;
}
Quicksort has a typical runtime of O(nLog(n)). It has a worst-case time of O(n^2). For this reason I have a preference for heapsort which has a worst case of O(nLog(n)) even though it is typically slower than quicksort by a constant factor depending on the implementation. I consider the constant a minor point, others may consider "typical" versus "worst case" a minor point. When talking O() one should use worst case.
How can he put an algorithm under the GPL? How is that any better than someone patenting an algorithm? Is it OK because it's the GPL and this is Slashdot? I know that a license != a patent, but if one supports the idea that algorithms should not be protected IP then shouldn't we all be crying foul over that?
Not to mention that this doesn't appear to be any kind of new algorithm, just a 'rebranded' one.