Researching Searching Algorithms?

← Back to Stories (view on slashdot.org)

Researching Searching Algorithms?

Posted by Cliff on Tuesday October 22, 2002 @12:46AM from the has-this-been-done-before dept.

NiN3x asks: "I have recently written a sorting algorithm that can be close to four times as fast as quicksort and never much slower. I was wondering if there are lots of sorting algorithms like this out there. I don't think I could be the only one that has thought of something like this. I am only in my second year of computer science, so I don't know a lot about these things. I have tried searching the net and can't find anything. My algorithm is NOT an inplace sorting algorithm. Can anyone point me to some sources for this type of thing?"

5 of 62 comments (clear)

Min score:

Reason:

Sort:

Copy and paste it here by inerte · 2002-10-22 00:56 · Score: 2, Insightful

Or provide a link to a file so we can properly answer your question. How can we know that your algorithm works or that haven't been implemented before with such a vague description?

--
Buy a Nintendo DS Lite
Wrong title? by Anonymous Coward · 2002-10-22 01:06 · Score: 2, Insightful

Shouldn't be: Researching Sorting Algorithms?
Proper Analysis is Key by photon317 · 2002-10-22 01:31 · Score: 5, Insightful

There are lots of algorithms out there that seem to be multiples faster than quicksort in daily use. The problem is when you formally analyze them, you find out their not gauranteed to be faster, and could in fact be horribly slower. If you run it enough times against various wierd data, you'll probably hit one of these slow runs yourself. I can only guess because you haven't mentioned anything about the algorithm or shown any sample code. Read up on how to analyze your code for O() notation and figure out what the real worst case is mathematically.

--
11*43+456^2
Re:Funny that this topic came up... by kscguru · 2002-10-22 16:52 · Score: 2, Insightful

And at this point, I'd even suggest digging up the Standard C library implementation of quicksort - I remember coming across something somewhere that mentions that the standard implementation is a hybrid quicksort that is actually more optimal than the theoretical / canonical quicksort. I'm thinking it was either a hashed-quicksort or a multi-quicksort, but I could just be making those up - and if so I'm sure someone will correct me. As joto mentions, any sorting is still at best O(n log n), just a lower constant term, and a greater likelyhood of better-case over worst-case.
The original poster mentions that his algorithm isn't a simple in-place sort - which, based on my best guess, means that he's probably sorting pointers quickly then re-arranging the elements in a single pass. Maybe, maybe not - but still no faster than O(n log n).
But at this point, I'd start worrying more about scalability, memory usage, etc. - one of the nice things about quicksort is that it is so general, it can quickly be adapted to low-memory or multi-processor implementations. Sure the algorithm is faster - but faster does not necessarily mean better. If my implementation guess above is correct, the algorithm would use significant additional memory, making it unsuitable for sorting simple numbers (where the pointers would dwarf the elements themselves) or for use in low-memory (i.e. embedded) applications.
The true test of whether the algorithm is innovative is: 1) have a professor look over it, and then 2) get a (respectable) journal to publish it. If it passes the peer review required for publication, then it's an innovative algorithm. If not, it's one of thousands of alternative implementations of the same idea. Clever, maybe even useful, but not spectacular.

--
A witty [sig] proves nothing. --Voltaire
Fast vs. Optimal by cmpalmer · 2002-10-23 03:53 · Score: 2, Insightful

Where the art and the science of algorithms diverge is in their implementation. Does a bubble sort in hand optimized assembly language on 3GHz machine beat a quicksort written in GW Basic running on an antique 386? What differences are there between data already stored in memory vs. sorting data that is on a remote server? What if two algorithms are both O(n), but one's "operations" involves multiple calls to slow libraries, the other one involves just two integer compares.

I know the question raised was on the correct analysis of the algorithm in question, but several of these postings have diverged into languages and types of data. This is good and it is the whole point of a well grounded CS education -- knowing all of the strengths and weaknesses of available algorithms so you can make the correct choices based on the size and type of the data and the language (and libraries of choice).

--
-- stream of did I lock the front door consciousness