Deep Algorithms?
Stridar writes "A paper presented in a recent article quotes Donald Knuth as saying the computer science has 500 deep algorithms. He mentions that Euclid's algorithm is one of the most important, and he seems to agree with the idea that CS will be mature when it has 1000 deep algorithms. What I would like to ask Slashdot is the following. What are the most important algorithms in CS? What is your favorite algorithm? And finally, what are the outstanding problems for which algorithms would be immediately placed in the "Top 1000" category." We had an older story where two scientists picked their top ten algorithms.
I dont think I'd call it my favorite algorithm, but the Boyer-Moore string searching algorithm is pretty cool.
COordinate Rotation DIgital Computer
This algorithm is implemented by most FPUs and even some PGAs to calculate trigonometric and hyperbolic functions. It replaces the evaluation of those power series you've already forgotten about from school with a clever combinations of bit-shifts and additions. Back in the days when multiplications were much more expensive than additions, this is how it was done.
Dictionary.com defines an algorithm as:
A step-by-step problem-solving procedure, especially an established, recursive computational procedure for solving a problem in a finite number of steps.
Another way to think about an algorithm is this, you start out with input data in a given format, and then run some set steps on that input data until eventually it gives you output data. The nice thing about algorithms is that when they are correctly formulated, they can work without human intervention or without thinking/reasoning (just following the steps on the data). That is why they are particularly useful for computers. But they don't have to be limited to computers. Most recipes for food could be considered algorithms, that is, a set of procedures that bring you from input to output.
A good example of a computer algorithm is one of the many sorting programs. Quicksort, bubblesort, mergesort, heapsort...these are just different algorithms for taking a list of unorganized integers and by following their steps, you get a list of integers in numerical order.
When it comes to beauty in algorithms, people are generally referring to simplicity and efficiency in algorithms. Doing things in a way that most people wouldn't normally think to do them, yet doing them in terse and efficient ways (elegance).
I'm not exactly sure what is meant by a 'deep' algorithm, but I would think it would reference just how complex the task that the algorithm solves is.
My vote for best algorithms are: Sieve of Eratosthenes (an ancient greek method for finding a list of all prime numbers), and the Fast Fourier Transform, an algorithm that has revolutionized several industries.
I drink to prepare for a fight; tonight I'm very prepared. -Soda Popinksi
An algorithm is simply a series of steps one can take that, once you have finished them, will have solved your problem. A deep algorithm is one that is especially useful, applicable in many circumstances, and has some inate cleverness that makes it non-trivial to come up with.
So, for example, an algorithm for searching could be:
1. For i first item to last item2. If item i is what you are looking for, return it
3. otherwise, go onto the next i.
This isn't a very fancy algorithm, but it works, and it is useful in many many circumstances. Of course, it is also trivial to come up with (look at every item one at a time untill you find your goal), and therefore isn't deep.
It is interesting to compare an algorithm to a heuristic. Heuristics would make great algorithms, if not for the bugs. That is, a heuristic is a set of steps you can follow that are likely to solve your problem, but aren't guarunteed. In that sense, they're just buggy algorithms.
There are also approximation algorithms. Suppose your problem is to find the shortest route that will visit a set of cities, and return you to your starting point. This, btw, is a classic problem called the Traveling Salesman Problem, and is provably rather nasty to solve (it belongs to a class of problems called "NP-Complete"). That is, if you want the SHORTEST route, the best know method is to try all possible routes (and for even a relatively few cities, that's a lot). However, there are algorithms, that if you follow them, are guarunteed to give you an answer no worse than twice as long as the best possible. That is, we can approximate the answer, within some provable bound of optimal, with a set deterministic steps. (For the nitpickers, the approximation only works with Euclidian TSP, not general TSP, and .: doesn't give a solution to the Hamiltonian Cycle problem).
Any other questions?
Gotta agree with you, but not on its own.
I can't narrow it down to about 50, personally. Here're the broad-brush "highlights":
a) All of quicksort, mergesort, heapsort and radixsort.
b) FFT, DFT, their relatives, whilst I'm divide and conquering. Convolutions and shite too.
c) Graph algorithms including Kruskal's, Dijkstras. Coloring algorithms (useful for compilers).
d) Parsing algoriths, while I've got compilers in mind
e) String matching algoritms ditto
f) Compression algorithms - Huffman, Arithmetic, LZ*, BWT.
g) Cryptographic algorithms - Hashes, Private Key Fiestel Networks, Public Key 'bignum' techniques. I'll throw in CRCs here too as they're close to hashes.
h) Bignum algorithms - Karatsuba, Barrett, Montgomery, Oooh, I've had FFTs already, can I have them again?
i) Pure Maths - Euclid, XGCD. Addition Chains (e.g. Pippinger). Eratosthenes, Bernstien-Atkin likewise.
j) Trial division, Fermat's Method, Brent/Pollard Rho, Pollard/Williams P+/-1, Lenstra's ECM, Quadratic Sieve, (S/G)NFS.
k) Applied Maths - Newton-Raphson, Runge-Kutta, Tchebyshev interpolation.
Too many to count...
THL
Keeping
An algorithm is a process for doing something. Perhaps the best example for non-programmers is long division, the way you learned to divide multi-digit numbers with pencil and paper. The basic mathematical definition of division is that it's the inverse of multiplication, which is a fine definition of an operation but doesn't tell you how to divide two numbers. Long division is a procedure which gives you the answer. Other procedures are possible.
This may sound like I'm describing a program, or a piece of code. A piece of code can implement an algorithm, but many different pieces of code can implement the same algorithm. An algorithm has a specific mathematical context in which it works -- e.g. the dividend is not zero. The piece of code has specific to a computational context -- written in C, divisor is in variable x, dividend is in variable y, quotient ends up in z, all of which are single precision floating point.
What's a deep algorithm? That's subjective, but I'd say it has to either be non-obvious or become obvious only after you learn a nontrivial piece of theory. There's probably an aesthetic component as well.
Er... no. You're wrong.
Here I'm citing from my "Introduction to the Theory of Computation" book from Michael Sipser from MIT (ISBN: 0-534-94728-X):
"Informally speaking, an algorithm is a collection of simple instructions for carrying out some task."
There probably is a real definition somewhere but I think it sufficies to say that algorithms are things done by Turing Machines (that's why Turing invented them in the first place), and since the a=a+10 stuff from the parent can be done by a TM, it's an algorithm.
The man of knowledge must be able not only to love his enemies but also to hate his friends.
To be finite, the process must end after a predictable number of steps. Each step must be unambiguous. The process may require input (parameters) to solve the problem but when complete it must return a result. The process must demonstrate effectiveness by solving the problem in a "sufficiently basic" manner.
"Quick Sort is recursive"
Recursion adds _bugger all_ overhead.
What kind of machine are you using - is it based on punched cards? Recursion is no more expensive than a loop overhead. How many loops does bubble-sort have to set up?
"and then do bubble sort on those. "
I hope not. I mean, you've got selection, insertion, in-place merge, and hard coded minimal exchange sorts to chose from; there's no need to bubble. Ever!
"
There are also algorithms that do matrix multiplication in O(n^2.blah) time. The naive method is cubic. How come no one uses these fancy methods? Because they are a pain to implement and require n to be so large before they actually start performing better than the naive implementation.
"
Yeah but the 2.blah you're thinking about is 3*ln(8)/ln(9) = 2.84. Which is damn close to 3 compared with how vastly different O(n^(1+o(1))) is from O(n^2) in the sorting examples.
THL.
Keeping
is at The Computer Science Hall of Fame
To address some of your points:
Hell yeah, recursion adds overhead. If it cannot be readily optimized into iteration (such as tail recursion), you need to maintain stack frames, allocate new local variables, etc. For tree recursions, such as quick sort (pick a pivot and resursively screw with each side), it is very hard to make the algorithm iterative without implicitly mimicing recursion or a stack.
Yes, I was not thinking perfectly when I claimed that a good quick sort implementation would use bubble sort when the chunks got small enough. They may use some other quadratic sort. But I am still willing to bet, that when tuned properly, using the bubble sort then is more efficient than using quicksort all the way down. I don't want to make any claims about performance on various architectures because it all boils down to freakly things such as cache hits. When I try to explain things for the layman, sometimes some details slip by, just waiting for the tight-assed dork to jump on.
By 2.blah, I was actually referring to the better results, rather than the boring divide-and-conquer method you cited. I recall results as good as n^2.23 or so, and I believe it is still unproven if matrix multiplication can be quadratic.
Indeed - the last time this came up I did some benchmarks. The 'old fashioned way' using temp variables when optimised runs about twice as fast as the xor version.
In not-quite-assembly it looks like:
foo(cx,dx)
mov ax,cx
mov bx,dx
mov ax,ax xor bx
mov bx,bx xor ax
mov cx,ax
mov dx,bx
vs.
foo(cx,dx)
mov ax,cx
mov bx,dx
mov cx,bx
mov dx,ax
one problem with skip lists is that each node must be dynamically resizable because the number of forward pointers is changing and not known at compile-time
For a really long skip list you should consider using a skip list to store the forward pointers.
"I have opinions of my own, strong opinions, but I don't always agree with them." -- George H. W. Bush
The only problem with A/B (and I love it and have implemented it a number of times in games) is that it only works for abstract strategy games with a zero-sum score. (meaning, one player is winning by the exact same amount the other player is losing...thus, zero sum) So chess, backgammon, checkers, hex, and any other game with a zero-sum board evalution it'll do fine. Otherwise it won't work very well.
Plus, an A/B search is only as good as the method you are using to evaluate the game. I'll use minmax and a good heuristic rather than A/B with a bad one. Just some food for thought.
sig--we don't need no goddamn sig
The most important algorithm I have used in the last 10 years.
Threaded AVL trees.
It is amazing how much code I have "fixed" or
improved by dumping a:
List with a sort function
binary trees
simple AVL trees
loading data and then sorting it.
This is the most underused yet best bang for the
buck algorithm/data structure out there.
I have used it for directed , undirected graphs,
user interfaces, event handling, cacheing algorithms, data flow algorithms, searchs user information etc. The overhead is minimal and
a known runtime is so important. The automatic
reblancing has known points to make it thread
safe and the ability to linear process through
the tree makes it a replacement for any list were
someone is searching for data.
Leslie Donaldson
Berlinski, author of "A Tour of the Calculus" (a damn good book), later wrote "The Advent of the Algorithm". It's on my to-read stack. I would think many slashdotters would be interested in reading it, and most slashdotters would be interested in reading the last comment on Amazon's review section. :-)
It's a variant on a vorpal fork bunny. Try typing it into a bash shell on a machine you won't mind hard booting
"We returned the General to El Salvador, or maybe Guatemala, it's difficult to tell from 10,000 feet"
- A* algorithm (generic AI problem solving algorithm.)
- Dijstra's minimum path (and other graph algothms.)
- Risch algorithm (Maple uses this to integrate arbitrary expressions.)
- Toom-Cook/FFT, Montgomery's algorithm (big number multiplies.)
- Introspective sort. (Hybrid of Quick sort and Heap sort which gives great average and worst time complexity.)
- Various pseudo-prime/factoring tests. There are a number of "random trial" algorithms that work unusually/unexpectedly well.
- Karmarkar's algorithm (fast average and worst time complexity Linear program solver.)
- Arithmetic encoding, PPMD, BWT (state of the art in compression algorithms which are pretty damn good.)
- The DCT/iDCT algorithm (video/image compression.)
- Peterson's Critical Section algorithm (great for multitasking.)
- "Dynamic Programming".
- Good Hashing algorithms.