"Evolved" Caches Could Speed the Net

Caching by Zorilla · 2004-07-06 02:49 · Score: 5, Funny

According to New Scientist, evolutionary algorithms could make many network caches twice as efficient.

That's easy, just cache 4 boobs at a time instead of two

--

It would be cool if it didn't suck.

Re:Caching by Anonymous Coward · 2004-07-06 02:58 · Score: 2, Funny

Wouldn't it more efficient to cache 1 boob and display it x times? Sort of a boob pooling.

Algorithms by th1ckasabr1ck · 2004-07-06 02:51 · Score: 4, Interesting

Pablo Funes of US company Icosystem and Jürgen Branke and Frederik Theil of the University of Karlsruhe in Germany used "genetic algorithms", which mimic Darwinian evolution, to develop strategies for internet servers to use when caching data.

It would be interesting to see exactly which algorithms they are talking about here. I wouldn't be surprised if they drew some ideas from garbage collection algorithms also.

Re:Algorithms by bunyip · 2004-07-06 02:56 · Score: 5, Funny

I wonder if they evolved logic to counter the slashdot effect. 1. Scan slashdot.org for new stories every five minute. 2. Scan new story for links. 3. Cache those pages.

That would be the "suicide algorithm". As the server goes up in flames from the Slashdot-effect it brought upon itself, it would become the first cyber recipient of a Darwin Award.

Alan.
Re:Algorithms by spektr · 2004-07-06 03:27 · Score: 4, Funny

I wouldn't be surprised if they drew some ideas from garbage collection algorithms also.

I suppose you're the clever garbage man from the Dilbert cartoons. Because as an engineer I don't really get the connection between garbage collection and caching algorithms. Now that we have covered the first two frames of the cartoon - what's in the third frame where the garbage man makes me feel ashamed by explaining some complicated concept of engineering?
Re:Algorithms by ahem · 2004-07-06 03:44 · Score: 4, Insightful

...as an engineer I don't really get the connection between garbage collection and caching algorithms. ...
It seems to make sense to me, if only because the two are complimentary problems. Caching is figuring out what stuff is valuable so you keep it around. GC is figuring out what stuff is valuable so you can throw away the rest. Kind of like in probability where it's easier to figure out the likelihood of something not happening and then subtracting from 1 to figure out how likely it is to happen.

--
Not A Sig
Re:Algorithms by ezzzD55J · 2004-07-06 04:15 · Score: 5, Insightful

Humbug, the big difference is: in GC there's perfect knowledge about what you want to throw away (unreferenced or only circularly referenced objects), and the problem is how to find out efficiently.. In caching the whole problem is to figure out which objects to keep, and beyond that, efficiency is no problem.
Re:Algorithms by Surt · 2004-07-06 04:29 · Score: 2

Sometimes increasing the cache size without adjusting the algorithm reduces performance.

--
"Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
Re:Algorithms by laudney · 2004-07-06 04:58 · Score: 4, Insightful

Garbage collection problems are far easier than caching problems in networks. There are many reasons. The most important one is that in garbage collection, 'references' are basically 'pointers'. An object that's not pointed to by any other objects is considered garbage. And don't forget, in garbage collection, you can even 'stop the world'!

By contrast, in caching, 'references' are 'client requests', which are unpredictable and highly variable. Furthermore, on busy networks, there is seldom a moment for you to 'stop the world' and 'evolve' for a while!

In the story, people use network simulators and randomly generated requests. Since network simulators are notoriously simplistic and inaccurate, and caching is heavily influenced by real-life workloads, I'm interested to know whether their algorithm is applicable in reality.
Re:Algorithms by aiyo · 2004-07-06 10:05 · Score: 2, Informative

GC and caching are done very differently. In java memory has a counter associated with it. When you have two pointers to that memory then it has the value 2 stored somewhere with the memory. When you change the reference the counter is decremented. Every once in a while the GC will come along and free memory with a counter of 0. If you dereference that memory so that you have absolutely no pointers anywhere using the memory then you will have lost that memory although it is not freed until later.

What makes a good cache? by YankeeInExile · 2004-07-06 02:52 · Score: 5, Insightful

Well, I have found one flaw in the methodology:

The starting population of algorithms was tested on the simulator using randomly generated requests.

I would think this would breed out of the caching system any affinity to locality-of-reference.

One of the things I did each morning when I was running a cybercafé was "prime" the Squid cache by running a little script that did a wget -p on all of the URLs in the portal page, and a few sites that were not. And it definitely did improve performance for most users.

One of my unrealized dreams would be some sort of speculative-fetch algorithm for Squid that would basically do a breadth-first fetch on a page while the user was busy reading it.

Of course in my not-so-humble opinion, the biggest problem with any caching system is the population of websites that, through either malice or incompetence develop content that is cache-hostile, and call it "experience enhanced".

--
How does the Slashdot Effect happen given that no slashdotters ever RTFA?

Re:What makes a good cache? by Short+Circuit · 2004-07-06 02:55 · Score: 4, Insightful

What you get out of the breeder shouldn't be what you put into production.

Assuming you can still read the code, you're still going to want to put in common-sense improvements that the GA's didn't discover.

--
tasks(723) drafts(105) languages(484) examples(29106)
Re:What makes a good cache? by YankeeInExile · 2004-07-06 03:00 · Score: 2, Interesting

So, do you posit that the purpose of the GA is just to give human programmers insight into avenues they might not have otherwise considered?

--
How does the Slashdot Effect happen given that no slashdotters ever RTFA?
Re:What makes a good cache? by AKAImBatman · 2004-07-06 03:16 · Score: 3, Interesting

I'm actually a bit surprised that neural nets haven't yet entered common usage. In theory, such a net would be far more "intelligent" (sort of) in making decisions.

e.g. This page looks a lot like a truly dynamic page, so let's not cache it. Over here, this is dynamic, but only slightly so. Let's cache it and index the most likely path the user will take.

Have CS researchers given up on the neural net approach, or are the nets still far too unstable for real world use?

--
Javascript + Nintendo DSi = DSiCade
Re:What makes a good cache? by kerrbear · 2004-07-06 03:44 · Score: 2, Insightful

What you get out of the breeder shouldn't be what you put into production. Assuming you can still read the code, you're still going to want to put in common-sense improvements that the GA's didn't discover.

Could this be the long awaited answer to harmonizing Creation and Evolution :-)
Re:What makes a good cache? by Short+Circuit · 2004-07-06 03:49 · Score: 3, Interesting

I actually wrote a paper on those in middle school. The problem with them is that the larger ones are very difficult to tune.

It turns out that if you have more than three layers (in other words, if you have any layers between the input and putput), you run into proplems in training when the network doesn't work as well as you want it to perform, but furthur cycles of your training algorithm don't seem to make it any better.

The comp.ai.neural-nets FAQ was my primary source for that paper. Read that if you're interested. (I felt like my brain went numb after a while back when writing the paper.)

--
tasks(723) drafts(105) languages(484) examples(29106)
Re:What makes a good cache? by a_ghostwheel · 2004-07-06 05:06 · Score: 3, Informative

There are other types of neural networks than just standard back-propagating ones. Linear vector quantization is much better at solving classification problems (ART-based algorithm should be good too). Problem is that in many "on-the-surface" problems have better specialized solution than just a generic NN or GA brute-force approach. This situation led to drastic decrease in funding for NN research.
Re:What makes a good cache? by autophile · 2004-07-06 05:20 · Score: 2, Informative

Have CS researchers given up on the neural net approach, or are the nets still far too unstable for real world use?
Well, no, it's just that neural nets have to use a carefully calculated and parameterized error-propagating algorithm, as well as an appropriate architecture and size of network, and if you choose the wrong one, you're basically hosed. Genetic algorithms (and genetic programming) have learning algorithms that are much simpler to understand and implement -- even though there's a lot of argument in the field as to exactly why they work. Even the classic "One-Armed Bandit" explanation is under attack.
There has been some work done where a genetic algorithm/program evolves a neural network, and there has been nice success with that, as long as the problem is conducive to solution by a network. But in general, GA/GP can use any kind of solution architecture, and you choose one that makes the most sense for your problem. Google for +genetic +evolve +"neural network" for more info.
--Rob

--
Towards the Singularity.
Re:What makes a good cache? by Dachannien · 2004-07-06 09:08 · Score: 2, Interesting

Not a flaw. When dealing w/ GAs, one needs to create the initial generation somehow, and using 'randomly generated' GAs provides a good start that is statistically unlikely to be biased or stuck in a local maxima/minima.

Except that wasn't what the poster was getting at - he was noting that the trials provided to the GA were essentially random, and didn't necessarily reflect what the cache is likely to see IRL.

I suspect, however, that this is an error in the NS article, and that the data sets *were* tuned to particular situations.

On a side note, some of the research our group is doing suggests that random initial populations in GAs can hinder the search, and that the first portion of the search is relatively indistinguishable from random search, because the fitness in most problems is flat over large portions of the search space. If you do know something general about the solution to the problem, it might not be a bad idea to seed the initial population with that knowledge.

Algorithms by bchernicoff · 2004-07-06 02:52 · Score: 3, Funny

I wonder if they evolved logic to counter the slashdot effect. 1. Scan slashdot.org for new stories every five minute. 2. Scan new story for links. 3. Cash those pages.

what about worm traffic? by garcia · 2004-07-06 02:53 · Score: 3, Interesting

From what I can tell a good majority of the traffic that my machine receives is worm traffic. Would these genetic routines be setup to disregard those as cache data? If that's the case would they be setup to just block that data?

That alone would save me quite a bit of traffic as people on my subnet hit me constantly with their infected machines.

66.41.161.120 hit my machine 57 different times (that isn't individual requests, that's total times).

LRU Rules by Mirk · 2004-07-06 02:53 · Score: 5, Informative

There's a good reason why LRU caching (least recently used) is so widespread, and that is that it's very very hard to come up with a sophisticated algorithm that outperforms this very naive one.

For the uninitiated, elements are added to an LRU cache until it fills up; thereafter, whenever a new element is added, space is made for it by throwing away the least-recently used one. Note, least recently used, not the least recently added, i.e. the oldest, since an element that was cached long ago may be used all the time, and so be well worth its place in the cache. For example, consider the company-logo image that your browser caches when you visit a new site and that is embedded in every page on that site. However old it gets, it's likely to continue to be used while you're on the site. As soon as you move to another site, it gradually shuffles its way down the deck until it falls off the bottom - which is precisely what you want.

--

--
What short sigs we have -
One hundred and twenty chars!
Too short for haiku.

Re:LRU Rules by rossjudson · 2004-07-06 03:12 · Score: 3, Informative

ARC caches fairly handily outperform LRU. When LRU is appropriate they adapt towards it. When frequency is important they adapt towards that.
Re:LRU Rules by arth1 · 2004-07-06 04:12 · Score: 2, Interesting

There's a good reason why LRU caching (least recently used) is so widespread, and that is that it's very very hard to come up with a sophisticated algorithm that outperforms this very naive one.

For small cache sizes with one or a few fairly similar users, this is true. For larger cache sizes with a large number of users, it's almost always better to expire based on both LRU (Least Recently Used) and LFR (Least Frequently Requested). You may want to cache all those sucky Christmas links until next year even though no-one is going to access them from mid-January until October. The number of requests for those objects will, with a combined LRU/LRF scheme ensure that they have a lesser chance to be harvested than an item that was accessed once, two months ago.
It also means that a site that has been slashdotted once has a less chance of being slashdotted again.

Regards,
--
*Art
Re:LRU Rules by Fzz · 2004-07-06 04:13 · Score: 4, Informative

ARC caches fairly handily outperform LRU.
Thanks for the pointer. Here's a link to some background on ARC, and a paper describing the algorithm. Looks like an interesting algorithm.

Not DNS by artlu · 2004-07-06 02:54 · Score: 3, Interesting

It already takes forever for DNS changes to propagate through every network, which can be extremely frustrating when you have a high bandwidth domain. There definitely needs some optimization on the DNS front.

GroupShares Inc.

--
-------
artlu.net

Re:Not DNS by Short+Circuit · 2004-07-06 03:01 · Score: 3, Informative

If I'm not mistaken, the DNS RFC takes into account the fact that domain records will be cached. That's why the records have expiration and refresh times on them.

--
tasks(723) drafts(105) languages(484) examples(29106)

bittorrent tie in? by Chuck+Bucket · 2004-07-06 02:56 · Score: 4, Interesting

I"ve always wondered if something along the lines of cache complimented with a Bittorrent type of scheme couldn't help speed up the internet. that way bits would be mirrored all over, and a server could pull them in faster since more servers did less work each.

just something I'm thinking about today, well, that and the Kerry/Edwards pairing.

PCBRwer342$#

--
free ipod and free gmail!

Next Gen Networking? by arieswind · 2004-07-06 02:56 · Score: 5, Interesting

From the article:
He[Pablo Funes] suggests networks might in the future be designed to work out who deserves the most help for themselves. "Sophisticated network behaviours might implement rules for reciprocity and trust," he says. "And conversely, for not cooperating with other others who try to abuse our resources."

The future of network security? Imagine the next computer virus outbreak: Every network in the world could recognize the virus type activity and allocate them lesser or zero resources, maybe sending them a "Virus detected, please run antivirus software or contact your IT Department" notice, and detecting outside attacks from viruses and automatically flagging them as unsafe, and not give much(or any) attention to traffic from or to that site

the details of their method by Dezer · 2004-07-06 03:02 · Score: 5, Interesting

Actually, I do genetic algorithm / genetic programming research at the university of michigan. It's unlikely that these guys are using genetic algorithms to develop a new algorithm, but are rather using an existing algorithm and *tuning* the associated parameters using a GA. Given a list of parameters, GA's work by finding the best combination of parameters. As a result, the settings could be constantly tweaked (say on a daily/hourly basis) and different servers could still have different regional settings. My only problem with the concept is that it still depends on the tuning of pre-existing algorithms... but still - the results they share (2x improvement) is encouraging.

Al Gore is a great contributor by hugesmile · 2004-07-06 03:05 · Score: 4, Funny

evolutionary algorithms could make many network caches twice as efficient

Once again, Al Gore has his hand in the shaping of the internet.

I'm sure everyone in the Slashdot community will miss him - even if you didn't enjoy his work, there's no denying his contributions to popular culture. Truly an American icon.

A note on hill-climbing by Animats · 2004-07-06 03:08 · Score: 5, Informative

Genetic algorithms are methods for optimization in bumpy spaces. The basic goal is "find x such that f(x) is maximized". As optimization algorithms, they should always be tested against the two simple optimization algorithms - basic hill climbing, and random search.

If the search space is not dominated by local maxima, basic hill climbing (go for the best neighboring value) will work. And it will be fast. If the function is differentiable, it can be orders of magnitude faster than other methods, because you can use a variant of Newton's Method.

If the search space is small, random search (just guessing) will work by exhaustively searching the space. This is obvious, but tends to be ignored in academic papers all too often.

This discussion also applies to neural nets and simulated annealing.

Now this article at least describes a problem for which a GA might actually be useful. Many such articles don't. But they haven't demonstrated that you need a bumpy hill-climbing algorithm.

This is why, despite all the hype, GAs, neural nets, and such aren't used all that much. The search space has to have the right properties. Not too small, not too big, bumpy, but not too bumpy.

Re:A note on hill-climbing by 55555 · 2004-07-06 04:12 · Score: 2, Informative

Misses the point. GAs do not look at search spaces and ask 'is this function too bumpy'. GAs (as well as many other evolutionary algorithms, see GECCO) are more similar to stochastic greedy strategies. I'd suggest looking at Walsh coefficients instead of derivatives as a rule of thumb.

A large part of the reason GAs aren't catching on more is that there aren't enough really good text books on them. All of the really useful information is still in the form of research papers, these things take time. Also, we're only starting (last 5 years) to know enough to be able to routinely apply them. Without that, they're too hard and too expensive to use.

As for random search, good papers look at the scaling factor for their algorithms. The goal is to build GAs that are quadratic. Any paper which suggests an exponential time GA (like random search) is wasting your time. The idea is to have a range of algorithms, greedy search for logarithmic time (or gradient descent), GAs for quadratic, and enumerative search for NP-Hard problems.

The parent is either ignorant, overly jaded, or being a jerk.

Legos and Tron mentioned in his thesis by the+frizz · 2004-07-06 03:16 · Score: 3, Informative

The article link is light on details of the evolution algorithm, but Pablo Funes's home page has the text of his thesis on Evolution of Complexity in Real-World Domains. It talks about his use of evolving algorithms on topics like designing the strongest lego brick structure and playing Tron. Very cool, but not its application to caching.

The is even a link to an online Tron game where us humans can play versus his evolving algorithms. The win/loss stats for his algorithm is approaching even. Given that humans can also evolve in the Tron game play, I imagine that the algorithm will have a head start over the new influx of slashdot visitors and start to win more often than not over the next week. I never got to play though. The SQL db to mange the stats was already down then I tried.

This can't be legal by TimCrider · 2004-07-06 03:20 · Score: 2, Funny

What happens if someone caches Metallica, or the new super hot J-Lo movie? Then how is the guy who brings the director a warm towel ever going to make his money?

You guys should really think before you go out and start making technology that blatently abuses copyrights

</sarcasm>

Re:*sigh* by NialScorva · 2004-07-06 03:22 · Score: 4, Informative

Evolution is the change in allele frequencies over time. Pure and simple, that's all it is. Check any evolutionary biology book. The notion of giraffes sprouting gills is absurd and not even remotely what evolution is except in the creationist strawmen. What prevents a whole lot of small changes from adding up? No one has discovered a barrier or any reason that this wouldn't occur. It's like saying that water can move grains of sand, but there's no proof that a lot of water will eventually erode a beach.

The evolutionary algorithm will have a range of all possible algorithms that can be developed, so in a sense it is limited to "test various algorithms", though it would be testing all possible algorithms. Similarly biological evolution is limitted to testing various imperfect self-replicators, meaning all possible imperfect self-replicators. It is further constrained by the current state, but then that's the problem with non-biological GAs as well, the King of the Hill effect.

This seems a little...selfish by PhysicsGenius · 2004-07-06 03:22 · Score: 2, Insightful

"Priming the cache" and "doing a breadth-first fetch on a page" are both things that create *more* network traffic on the off-chance that it might save some number of microseconds for the user. It's basically a Tragedy of the Commons situation. Everyone would be better off if no one pre-fetched links, but any given person is better off if they don't cooperate with that global model. So everyone just grabs what they can get and everyone is worse off.

Re:This seems a little...selfish by droleary · 2004-07-06 04:08 · Score: 4, Interesting

"Priming the cache" and "doing a breadth-first fetch on a page" are both things that create *more* network traffic on the off-chance that it might save some number of microseconds for the user.

More traffic isn't necessarily bad. While what you say is true, you fail to note that user-initiated traffic is done in bursts. Just like your CPU is idle >95% of the time, so is your network connection. So all users benefit, both in real and perceived performance, when there is a steady 100% utilization.

So everyone just grabs what they can get and everyone is worse off.

Again true, but naive. What would be better is if there were a mechanism to prioritize the pre-fetch cache. Every page has one link that is pretty much the most likely next page. Then a secondary link, and so on. Ideally, a site owner should be able to put that priority list somewhere in the page such that a user agent can start getting it after the current page has loaded and is being read. Otherwise, maybe the user agent can favor new content (i.e., compare this load of Slashdot with the one done five minutes ago and grab the links in the diff). That's a far cry from a mad grab, and would probably benefit all parties involved.
Re:This seems a little...selfish by droleary · 2004-07-06 08:46 · Score: 2, Interesting

There's the link tag for html headers, which specifies the next page, previous page, and things like that.

Ah, I knew it was too good an idea to be all my own! For those not aware of how to use the link element for that, I found this page as well as this one. I don't know how widely it is used by site designers, and I can't say I know of any web browsers that use it, but I would say it is definitely something a proxy server should be taking advantage of. It's use is even stated directly in the second document:

User agents may choose to preload the "next" document, to reduce the perceived load time.

Re:*sigh* by Conspiracy_Of_Doves · 2004-07-06 03:26 · Score: 3, Informative

ANY environmental adaptation of genetic code over multiple generations is evolution. It does not have to be a major change. BTW, creationists are the only ones that make a distinction between macro-evolution and micro-evolution. If fact, creationists are the only ones that I have ever heard use the terms.

--

Technoli

Re:*sigh* by GeckoUK · 2004-07-06 03:34 · Score: 2, Insightful

Score:3 Insightful? when did creationist trolls become insightful?

Nature makes no distinction between your so called macro and micro-evolution. Rather there are to separate processes at work that combine to produce evolution.

1. Survival of the fittest (what you call environmental adaptation) starts with a population with a mixture of genes and states that those best able to reproduce in the current environment will probably make up a larger percentage of the next generation.

2. Genetic crossover during reproduction and random mutation, happens to keep the gene pool mixed and introduces new adaptations. This provides the "raw material" for survival of the fittest to work on.

Large changes are unlikely in nature because 99.999999% of major random changes to an organism will result in something that is less fit than its siblings (a giraffe with gills for instance will either suffocate in the savannah, or be unable to eat if by some miracle it also finds itself in the ocean), small changes however have better chance of being an incremental improvement that will become dominant in a population through natural selection. These small changes WILL compound over time to produce major changes in physiology (for example the way whales have adapted back to living in the oceans (actually for a really good example go read Richard Dawkins description of how eyes have evolved in many different creatures as it strikes down very effectively the old creationist chestnut of "the eye is far to complex to have ever evolved in stages")).

ARRGGHHHHH why did I bother writing all this, I know you are a troll.

Re:*sigh* by The+Cydonian · 2004-07-06 03:39 · Score: 3, Insightful

I'm presuming you come from a biological background, but allow me to say this:- evolutionary programming is more about using evolution as a metaphor rather than modelling evolution per se. This I gathered after endless conversations with my (gastroenterologist) dad and (plant pathologist) mom on my research work in my final year of college.

Most computer scientists loosely use the term "evolutionary programming" to talk about algos that have inherently unpredictable ("emergent") results rather than modelling actual evolutionary processes observed in nature, although that's also a fair part of it all. (Incidentally, I also believe the "Science" topic assigned to this story is wrong for this reason.)

The meta-algo is evolutionary programming in that the algo fianlly "developed" by the meta-algo is apparently result that isn't immediately apparent, indeed, one that perhaps unpredictable by humans.

--
More than mere navel gazing.

Re:*sigh* by StrawberryFrog · 2004-07-06 03:52 · Score: 2, Insightful

Evolution implies ... e.g. An ameba suddenly mutates into a multi-celled configuration. Or a giraffe suddenly develops gills (Emphasis added).

Rubbish. There's nothing sudden implied.

--

My Karma: ran over your Dogma
StrawberryFrog

Re:*sigh* by Laxitive · 2004-07-06 03:58 · Score: 3, Interesting

If you had studies GAs, you would know that GAs do "create" new information that doesn't exist in the original solution set (i.e. any given generation of potential solutions).

GA systems both propogate "good" information, as well as generate and test new information incrementally across generations. Of course, this is done within whatever limitations you choose to impose on the solution set (i.e. if you're using a fixed N-bit string to represent entities in the solution space, then you'll search across all 2^N possible solutions in the N-bit string space).

Fundamentally, GAs are just a way of searching a solution space for good solutions. They are slow, but can tackle a more diverse set of problems easier than the traditional backpropogation-based neural network. I would consider neural networks to be more in line with your description of an 'adapting' learning algorithm.

The following analogy can only be taken so far, but you can compare backpropogation-based neural networks to be akin to hill-climbing algorithms - you start from one solution, move some limited distance towards what you think is a better solution. For a neural network with N vertices, you can think of neural network learning as hill-climbing in an N-dimensional space over whatever field used to specify edge weights.

GAs, on the other hand, do a much more disparate search over any solution space. And it is surprising how well it finds near-optimal solutions even for problems where the relation between solutions is complex and non-intiuitive (e.g. minimums for seemingly chaotic functions).

-Laxitive

A new sorting algorithm also ... by ThomasCR · 2004-07-06 04:06 · Score: 2, Informative

... has been evolved
with Critticall tool.
Original site.

not even close by Anonymous Coward · 2004-07-06 04:25 · Score: 5, Insightful

Garbage collection is required to be correct, but is allowed to keep extra stuff around as memory permits. Everything that must be kept can be calculated deterministically without future knowledge, and the same holds true for everything that can be discarded. Approximation merely allows the what can be discarded calculation to progress faster. In garbage collection it is not correct to throw out something that will not be used in the future unless it is also unreachable.

For cacheing there is no requirement of correctness. Performance is improved when things that will be used in the future are kept. There is no correctness requirement for keeping things around, and indeed best performance often requires discarding things that will be used in the future. In addition, there is no way of determining what will be used in the future without knowledge of the future. The correctness requirements of a cache are related to security (don't cache sensitive information) and staleness (don't cache stuff that is too old), but even those may be relaxed rather than strict requirements.

Fundamentally the two problems are very different, and algorithms that deal are successful in the two different cases will likely be very different.

50% of all bits by TMacPhail · 2004-07-06 04:28 · Score: 3, Funny

50% of all bits sent over the internet are 0s. Just cache that and we have a 50% cache hit rate. :)

Re:*sigh* by Conspiracy_Of_Doves · 2004-07-06 04:33 · Score: 2, Informative

Natural selection is one of the methods (the other being sexual selection) by which all evolution (macro and micro) proceeds.

--

Technoli

Skips too far forward by Lulu+of+the+Lotus-Ea · 2004-07-06 04:34 · Score: 2, Informative

Evolution is *not* the "change in allele frequencies over time." Or at least that's a secondary issue, historically, conceptually, and causitively. Well, on the last, allele frequency is pretty important (but still not exclusive). But evolution as a concept is firstly a question of PHENOTYPIC change, even before it's about GENOTYPIC change.

Evolution was a well known phenomenon (e.g. from the geological record) long before the work of Mendel was known. Specifically, the rather well known Darwin had no concept of an allele when he came up with the Theory of Evolution by Natural Selection. In fact, Darwin's approach was most certainly not the only explanation of evolution floated in 19th century biology. For example, Lamarckian mechanisms had a plausibility at the time... in fact, Lamarckian mechanisms still represent an important underexamined area of evolutionary mechanisms, for historical reasons (Lysenko etc.)--see, for example, the work of Ruth Hubbard and the role of viral insertions into the genome. So the whole allele thing is only part of one (important) mechanism, only understood relatively late in the scientific study of evolution.

Moreover, even in modern terms, mutation still exists; admittedly, the high-school text book focus on point mutations of genes is vastly overplayed. But just looking at allele frequencies misses also both the role of regulatory genes, and also ignores structural changes in chromosomes. Genes and smaller sequences migrate within and between chromosomes--this has nothing to do with alleles. Stephen Gould's work on phentotypic growth regulation in _Ontogeny and Phylogeny_ is good here, as is the tome _Structure of Evolutionary Theory_.

--
Buy Text Processing in Python

so far so good but.. by brunokummel · 2004-07-06 04:57 · Score: 3, Insightful

Caching information can also be used as a backup of data in case the server crashes or has its data compromised by a virus , hacker, whatever you feel like ..
so what happens if you, for instance, have a security hole in one of your "smart" servers?? or even a breach in the protocol structure (DoS)?

you could get the server to "breed" algorithms that would stop all servers by either corrupting the data in all servers that are providing the service or just DoS-ing them.

I guess i'm not yet convinced that this solution is of any good for real world network which the whole structure is based on insecure protocols.
maybe after IPV6, IPV9, who knows?

--
What is best in life? To crush your enemies, to see them driven before you and to hear the lamentations of their women.

Re:*sigh* by NialScorva · 2004-07-06 06:19 · Score: 2, Informative

No, it's not that simple. Evolution has been defined as changes adding up enough so that a single celled organism becomes a multi-celled organism, which develops RNA and DNA, which then grows in complexity to develop various "macro" level systems such as a cardio-pulminary system, a digestive tract, a nervous system, and a central control "brain" system. Each of these changes requires a macro change of which no predecessor exists in the current chain of organics.

That is one of the facts that it explains, not the definition. You're doing the equivalent of saying that a ball falling is the definition of gravity. Evolution is the collection of mechanisms by which these occur, which under the modern synthesis is by various alleles changing over time in a population.

Actually, my first example was the very real issue of a single celled ameba becoming multi-celled.
And oddly, stages between single cell and multicell still exist. The Volvox are single cell creatures that only function in colonies. A colony is a sphere that can grow a daughter colony inside it, then give birth to a new colony. Yet all of the cells involved are generic, not specialized into tissues. Sponge and jellyfish are on the other end of the gap. They both show tissue level specialization without organs.

A giraffe sprouting gills would be an evolutionary change, but is highly unlikely.

A giraffe sprouting gills would be a challenge to modern evolutionary theory, not an example. Modern evolution relies on the existing material being gradually changed or coopted.

A more realistic example is a fish developing lungs. According to the current theories, most life originated in the ocean. I don't remember if the current thought is that lungs and gills developed simultaneously or if they developed in unison. Either way, the effect is much the same. A low order life form developed features not currently in its genetic makeup.

Such as a mud skippers and lung fish? They both occupy niches and use mechanism similar to the first air-breathers using modified gills and air bladders, respectively. Not only are the transition species possible, but they're descendants still exist.

Which is exactly my point. No new algorithm will be generated through the "genetic breeding". Only existing algorithms will rise to the top.

That's interesting. I can write a program that contains the Axioms and rules from ZF Set theory, then randomly applies them to create new theorems. But that would be pointless since only theorems that existed in the orginal data and algorithms "will rise to the top". The whole of mathematics is pointless since all theorems are tautologies from the axioms and rules of manipulation.

Freenet by zoeblade · 2004-07-06 07:14 · Score: 2, Interesting

Doesn't Freenet already do this within its own network?

Re:Sigh... by NoOneInParticular · 2004-07-06 08:13 · Score: 2, Informative

Actually, I have the real article here (not the New Scientist rubbish), the one they published at a conference, and what they do is evolve a function that takes the input data and computes a priority for that page. Pages with low priority are flushed, pages with high priority stay cached. One of the interesting expressions they found was:

priority = lastTimeAccessed * (distance - accessCount)

\ This was on data where the distance the document had travelled was taken into account. So, given some available input data they evolved a priority function (using GP) that servers as the center for a caching strategy. This with apparent great success.

Re:*sigh* by AKAImBatman · 2004-07-06 17:19 · Score: 2, Interesting

I don't believe you understand the word "evolution". Evolution is not a huge jump that rapidly redefines something, it's an almost unnoticable change over time.

No, the current theories state that it is a unnoticable change over a long period of time. On a higher level, evolution simply states that major changes occurred over a long period of time. It also states that those changes occurred in an "organic vacuum". e.g. Lungs developed where no lungs had ever existed before.

Evolution can be understood as a search through a problem space. The caching algorithm sets up a problem space of "all possible caching algorithms". A P2P server would be completely outside of the problem space and thus irrelavent.

If that was really the case, then higher organisms should never have developed. The problem space for the "simple" organism was to survive in its wet environment. I believe you named colonies of amebas as an example of "vestigal organisms" at one point. Without arguing that point, what forced these successful organisms to then develop different cell types? How did different cell types improve their success rate? From there, how did developing complex DNA improve their success rate? From there, how did specialized macro-organs improve their success rate?

In evolutionary theory, the "simple" life form was striving to become a better adapted life form. It obviously exceeded its problem space by a wide margin, or you and I would not be having this discussion today. Thus if GAs are really indicitive of "real" evolution, then why can't they eventually evolve into "higher algorithms"? A P2P network could very well improve the GA's success rate for its problem domain. It can start by accidentally reusing its network connections, then accidentally making contact with other cache hosts with the same mutation, then it can evolve a protocol by accidentally learning each communication packet through a torrent of random spam packets, etc.

While that sounds somewhat silly, that's not my point. My point is that evolution states exactly that sort of behavior over a long enough period of time. If a GA is a good emulation of micro-evolution, and macro-evolution is based on micro-evolution, then shouldn't we be able to compute a set of mutations that take the program out of its pre-programmed algorithms? Shouldn't it be able to leverage parameters that we didn't program it to leverage?

--
Javascript + Nintendo DSi = DSiCade

Re:*sigh* by AKAImBatman · 2004-07-07 03:07 · Score: 2, Interesting

Will you make up your mind, first you agree with the grandparent poster that one particular example doesn't prove the general case, and then you make the claim that the SCA example does suggest the general case is zero sum.

I have made up my mind. The parent of my post made a statement that was factually correct: SCA does not "prove" that positive mutations can't happen. I then go on to show that everything from SCA to genetic manipulation has so far shown us that genetics is a zero sum game. It's perfectly possible that it's not, but we haven't yet proved it either way. All we have is a mountain of data that suggests genetics is a zero-sum game.

Come on man, make up your mind! which side of the fence are you on?

How is that scientific? If you always approach a problem from the angle that "X must be right", then you'll never make any new discoveries. If something can be observed then proven to be true, then we must adjust our world-view and move on.

It is worth noting that there is a natural bias towards observing negative mutations, not only are they probably far more common but also they stand out against the background, i.e. when people start dieing, or are born with gills, or no legs then people are going to notice pretty damn quick and start saying "ohhhh, mutation==bad!"

This is quite true.

however a beneficial mutation (for example a resistance to the common cold) is much more likely to go unnoticed at first,

The difficulty here is that until we can isolate and identify "positive" mutations, they are nothing but speculation. It's possible that it's just a matter of time before such mutations are discovered, but we can't say they exist until we can demonstrate them. It's a bit like Tachyons in physics. Theory said they should exist, therefore they existed, right? (If it's on Star Trek, it must be true!) Turns out that Tachyons are now considered to have been a "glitch" in string theory that has now been worked out. No time-traveling phantom particle, sorry.

as it is a lot harder to notice or prove that something hasn't happened during a persons lifetime.

It's very difficult to prove a lot of things in science. That's what we have laboratories for. By direct manipulation of creatures with short life spans, scientists can observe how genetics work and test theories on how a positive mutation might be induced. With more and more computing power at their disposal, it should only be a short time before we're able to begin very fast computer modeling of entire genomes.

The first surprise has already come in the form of shorter gene sequences than predicted. It wouldn't surprise me if biologists find themselves completely rewriting the books on genetics in the near future. If there's anything that physics has taught me, it's that the Universe is not one to easily give up its secrets.

I apologise if you took offence at being called a troll earlier, I meant it in the original sense of someone who deliberately posts inflammatory content in the hopes of sparking a debate. (anti-evolution posts certainly qualify on Slashdot)

Apology accepted, but with one caveat. I never claimed that evolution was false. It's not very nice to stomp all over the theories of others just because you don't like it. My specific complaint was the outright confusion caused by the "micro-evolution" renaming of environmental adaption. I don't like it, because no link has been demonstrated. This gives people the mistaken impression that they can write software using genetic adaption type algorithms, and actually produce software that "evolves" in the macro-evolutionary sense. The truth is that the software doesn't go anywhere. It merely "adapts" to the situation at hand until the most powerful solution can be found.

As for calling you a creationist.... well are you are aren't you, you claim to not believe in evolution, so how do you think we got here?

Ac

--
Javascript + Nintendo DSi = DSiCade

Slashdot Mirror

"Evolved" Caches Could Speed the Net

55 of 195 comments (clear)