Collaborative Filtering and the Rise of Ensembles

Posted by timothy on Tuesday September 1, 2009 @06:49AM from the soon-there-will-be-symphonies dept.

igrigorik writes "First the Netflix challenge was won with the help of ensemble techniques, and now the GitHub challenge is over, and more than half of the top entries are also based on ensembles. Good knowledge of statistics, psychology and algorithms is still crucial, but the ensemble technique alone has the potential to make the collaborative filtering space a lot more, well, collaborative! Here's a look at the basic theory behind ensembles, how they shaped the results of the GitHub challenge, and how this pattern can be used in the future."

58 comments

Open source governance by Anonymous Coward · 2009-09-01 06:55 · Score: 2, Interesting

So can ensembles be used to create more sophisticated forms of direct democracy? That is, where everyone has input into decision-making, but where that input is vastly more complex than simple majority rule by the mob?
FWIW, this is called open source governance.
1. Re:Open source governance by Anonymous Coward · 2009-09-01 07:11 · Score: 1, Insightful
  
  Thing about gov is everyone insists on having equal voting power... and the whole reason ensembles work is due to unequal weights---maybe those who vote "better" than others are given more voting weight?
2. Re:Open source governance by Anonymous Coward · 2009-09-01 07:16 · Score: 0
  
  At least one system proposes just that: http://metagovernment.org/wiki/Scoring_system#User_score
3. Re:Open source governance by sakdoctor · 2009-09-01 07:32 · Score: 2, Insightful
  
  "better" would require a fitness function; and everyone thinks that they vote "best".
  If it were possible to define such a perfect function, we wouldn't really need voting anyway. We could just get a computer to crunch all the parameters and spit out a utopia.
4. Re:Open source governance by idontgno · 2009-09-01 08:14 · Score: 1
  
  Donald Fagen, is that you?
  
  A just machine to make big decisions
  Programmed by fellows with compassion and vision
  We'll be clean when their work is done
  We'll be eternally free, yes, and eternally young
  -- I.G.Y.
  
  --
  Welcome to the Panopticon. Used to be a prison, now it's your home.
5. Re:Open source governance by kthejoker · 2009-09-01 08:16 · Score: 3, Informative
  
  "Fitness" doesn't always have to be related to the output; it can be related to the quality of a guessed input.
  Consider the corollary of a poll test: a model in which "trusted" voters receive extra votes while everyone else still gets one vote. You can determine "trustworthiness" (or "karma", if you will) the same way Slashdot does, through moderation and meta-moderation, or you can use a more objective "de minimis research" criterion (like a poll test but without the punishment for failure).
  So someone voting on a school board bond election who can correctly answer questions about the stated usage of that bond, or the school district's financial bond rating, or who attends a school board meeting discussing the bond, could get 2 votes for the price of one.
  This would a) allow "passionate" (albeit informed) voters to have more of a say than someone who is indifferent, and b) encourage people to do research and get involved in politics.
  In a way, it's anti-democratic, but if you are going to insert any sort of elitism into the system, it might as well be a meritocracy.
6. Re:Open source governance by Korin43 · 2009-09-01 08:59 · Score: 2, Interesting
  
  Instead of trying to come up with a way to decide who gets extra votes, why not just go with transferable votes, where you can either vote on issues or transfer your vote to another person (reversible, of course)? That way, if I know that someone else is more informed than me, I can transfer my vote to them and not bother with the details of politics. Each representative then has their own vote, plus all of their constituents' votes.
  
  It deals with the problems of underrepresentation (right now if 51% of people voted for one person, and the other 49% voted for another, the 49 group is completely unrepresented), lack of knowledge among voters (most people know nothing about politics, but at least know someone who knows more than they do), and a lot of government incompetence problems (if someone in "your party" does something you don't like, it's not "the Republican" vs "the Democrat", you can just transfer your vote to another person in your party.. although hopefully this would just destroy political parties instead, since they're not helpful).
7. Re:Open source governance by DragonWriter · 2009-09-01 09:04 · Score: 2, Insightful
  
  So someone voting on a school board bond election who can correctly answer questions about the stated usage of that bond, or the school district's financial bond rating, or who attends a school board meeting discussing the bond, could get 2 votes for the price of one.
  This would a) allow "passionate" (albeit informed) voters to have more of a say than someone who is indifferent, and b) encourage people to do research and get involved in politics.
  Giving everyone one vote already does that, without the ability to game-in discriminatory effects through poll tests (the ability to game-in discriminatory effects comes when you give people the right to write the questions and determine what the "correct" answers are, which is essential to having such quizzes); the more passionate are more likely to participate in politics (whether by voting or otherwise), and the more informed are more likely to realize their goals through whatever action they take, so both passion and information are already rewarded.
  
  In a way, it's anti-democratic, but if you are going to insert any sort of elitism into the system, it might as well be a meritocracy.
  "Merit" is subjective; all elitisms are viewed as justified (esp. by the chosen "elite") as some kind of meritocracy.
8. Re:Open source governance by RiotingPacifist · 2009-09-01 11:41 · Score: 1
  
  Instead of trying to come up with a way to decide who gets extra votes, why not just go with transferable votes, where you can either vote on issues or transfer your vote to another person (reversible of course).
  Because a lot of people are lazy and/or greedy and would simply sell all their votes to the highest bidder.
  
  It deals with the problems of underrepresentation (right now if 51% of people voted for one person, and the other 49% voted for another, the 49 group is completely unrepresented
  Not really. A president normally has little power; most power is held in houses where everything is voted on, so the 49% get their representation. Unfortunately, at some point people decided that strong governance is worth more than representation of individuals, so most countries have badly designed voting systems that take away quite a bit of this representation. Currently most countries have fairly stable political climates, so strong governments (and, as a result, strong parties) aren't really required, and most countries would be better served by some form of proportional representation and, as a result, a divided government that would compromise (on the few issues they disagree on). In the UK, the Lib Dems would hold some actual power (they get a substantial vote, but it's spread evenly so they get fewer seats); in the US, the Libertarians would at least get the 1% that vote for them (if not more, as people would actually vote for them).
  
  --
  IranAir Flight 655 never forget!
9. Re:Open source governance by Anonymous Coward · 2009-09-01 11:51 · Score: 0
  
  You mean like this?
  http://zelea.com/project/votorola/home.xht
  Found it on the op's link:
  http://metagovernment.org/wiki/Votorola
10. Re:Open source governance by ObsessiveMathsFreak · 2009-09-01 12:29 · Score: 1
  
  Not a chance. "The IQ of a mob is equal to that of its lowest member....divided by the size of the mob."
  
  --
  May the Maths Be with you!
11. Re:Open source governance by Hognoxious · 2009-09-01 19:16 · Score: 1
  
  That way if I know that someone else is more informed than me, I can transfer my vote to them and not bother with the details of politics. Each representative then has their vote, plus all of their constituents' votes.
  It deals with the problems of underrepresentation (right now if 51% of people voted for one person, and the other 49% voted for another, the 49 group is completely unrepresented)
  How does it deal with that? The 49% aren't represented, whether they vote directly for the losing candidate or whether their proxy does the same.
  
  --
  Confucius say, "Find worm in apple - bad. Find half a worm - worse."
Group Labor by r6_jason · 2009-09-01 07:04 · Score: 2, Informative

Of course having a group of people working together is a strength. If you are having a bad day or just feel like slacking off, someone else is there to pick up the slack and keep the project moving. See also Division of labour http://en.wikipedia.org/wiki/Division_of_labour
1. Re:Group Labor by Trepidity · 2009-09-01 07:19 · Score: 5, Interesting
  
  Although that's true with humans, it's a bit curious why it'd be true with algorithms. After all, the aggregation of 3 algorithms is still just an algorithm. It's not even totally clear which algorithms are ensembles and which aren't--- some non-ensemble methods could be re-analyzed using ensemble terminology, and some ensemble methods could be rewritten as unified iterative loops that don't look very ensemble-y. The jury's still out on the whole subject, as far as I can tell (I'm not an ML person, but I'm an AI person whose research bleeds into ML).
  An exception is when you're aggregating information from truly different statistical problems, in which case you inherently have an ensemble problem, until someone comes up with the theory (plus tractable implementation) to view the problem as one unified statistical problem. I think collaborative filtering is currently in that stage--- there's no canonical way to pose the problem in the terminology of statistical regression/etc. that captures all aspects of it.
  
  --
  10 PRINT CHR$(205.5+RND(1)); : GOTO 10
2. Re:Group Labor by CorporateSuit · 2009-09-01 07:47 · Score: 1
  
  I think you're looking too deep into the pond on this one. The load bearing can be taken up with a few "if/elseif/else" calculations.
  
  --
  I am the richest astronaut ever to win the superbowl.
3. Re:Group Labor by FlyingBishop · 2009-09-01 08:40 · Score: 1
  
  Well, it's firstly true with humans because you have always been able to do parallel processing with multiple humans.
  But it's perfectly applicable with algorithms, especially now that we have better parallel machines.
  But even so, you take a task where you have several heuristics, something like the ubiquitous quicksort but a little less reliable. If all of them have a 50/50 chance of working, you set all of them at the problem at once, and even on a single core, you end up with a fairly robust and accurate heuristic.
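A back-of-the-envelope sketch of the claim above. It assumes two things the comment does not state: that the heuristics succeed independently of each other, and that a correct answer can be recognized once any one of them finds it. The function name is made up for illustration.

```python
def chance_any_succeeds(p_single: float, n_heuristics: int) -> float:
    """Probability that at least one of n independent heuristics succeeds."""
    # All n fail with probability (1 - p)^n; the complement is "at least one works".
    return 1.0 - (1.0 - p_single) ** n_heuristics


for n in (1, 2, 5, 10):
    print(n, chance_any_succeeds(0.5, n))
```

If the heuristics tend to fail on the same inputs, the independence assumption breaks and the gain is smaller than this suggests.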
4. Re:Group Labor by amplt1337 · 2009-09-01 08:52 · Score: 1
  
  So in other words, it's Yet Another Buzzword?
  
  --
  Freedom isn't free; its price is the well-being of others.
5. Re:Group Labor by xdotx · 2009-09-01 11:48 · Score: 1
  
  Untrue, a group of people is NOT necessarily a strength. Groups encourage slacking off and discourage taking responsibility. Or in short, http://en.wikipedia.org/wiki/Groupthink
  
  --
  Our wealth breeds emptiness
6. Re:Group Labor by Anonymous Coward · 2009-09-01 17:12 · Score: 1, Insightful
  
  No, I think his pond is just too deep for you.
7. Re:Group Labor by Hognoxious · 2009-09-01 19:28 · Score: 1
  
  Ensemble is the new mashup.
  
  --
  Confucius say, "Find worm in apple - bad. Find half a worm - worse."
8. Re:Group Labor by CorporateSuit · 2009-09-02 13:02 · Score: 1
  
  Your inability to grasp concepts is apparently making mountains out of molehills. It's people like you who write unreadable wikipedia articles, written in pure jargon because they don't understand the topic well enough to form it in their own words -- so they have to borrow the words that were given to them to define it. In this case, here's the definition of what you think is the "deep pond":
  
  "For A's optimal conditions, use A"
  "For B's optimal conditions, use B"
  "For any other conditions, use C"
  
  I'm sorry, how is that NOT an if/elseif/else statement?
  
  --
  I am the richest astronaut ever to win the superbowl.
a few different things going on by Trepidity · 2009-09-01 07:07 · Score: 4, Interesting

There's a lot of argument over why ensemble techniques work well in general, when using them on well-posed statistical problems. But in the collaborative filtering case, they work well at least in part because there's not a canonical way of posing the problem statistically that's also tractable--- there are instead multiple ways to view the problem, which expose different information. Aggregating those views is a pretty straightforward way of getting more information.
For example, you can see the Netflix prize as a few different standard statistical problems. As a per-movie regression, predicting what Person A will rate Movie B, given ratings vector of Person A and the ratings vectors of everyone who's already rated Movie B [the per-person ratings vectors excluding B are the X's, and the ratings on B are the Y's]. Or you slice the movie-ratings matrix the other way, with per-movie ratings vectors as the X's. Add in some other views (those are the two most straightforward), aggregate all the info you get from them, and you do better than any one approach alone.
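A toy sketch of the two "views" described above. The names and ratings are invented, and the two predictors are plain averages, far simpler than the regressions the comment has in mind; they only show what slicing the ratings matrix by person versus by movie looks like, and how the two predictions can be blended.

```python
# Toy data: ratings[user][movie] = stars. Everything here is made up.
ratings = {
    "ann": {"alien": 5, "brazil": 4, "casablanca": 1},
    "bob": {"alien": 4, "brazil": 5},
    "cat": {"alien": 1, "brazil": 2, "casablanca": 5},
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def user_view(user, movie):
    """Slice by person: predict the user's average rating on other movies."""
    return mean(r for m, r in ratings[user].items() if m != movie)


def movie_view(user, movie):
    """Slice by movie: predict the movie's average rating from other people."""
    return mean(r[movie] for u, r in ratings.items() if u != user and movie in r)


def blended(user, movie, w_user=0.5):
    """A two-member ensemble: weighted average of the two views."""
    return w_user * user_view(user, movie) + (1 - w_user) * movie_view(user, movie)


print(user_view("bob", "casablanca"), movie_view("bob", "casablanca"),
      blended("bob", "casablanca"))
```

The blend weight `w_user` is a free parameter here; in practice it would be fitted on held-out ratings rather than fixed at 0.5.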

--
10 PRINT CHR$(205.5+RND(1)); : GOTO 10
1. Re:a few different things going on by mbkennel · 2009-09-01 08:17 · Score: 4, Interesting
  
  "There's a lot of argument over why ensemble techniques work well in general, when using them on well-posed statistical problems."
  My opinion:
  1) It's a form of regularization & noise averaging. Different classes of estimators have different systematic errors and proper averaging almost always performs better than any single estimator. In more limited contexts, e.g. parametric estimators with variable numbers of free parameters (small compared to # of data points) this is well established in Bayesian contexts. It's like regularization in the sense that the averaging will exclude the howlers---occasional idiosyncratic screw-ups from any single estimator, phenomena that tend to happen with under-regularized estimators.
  2) "But in the collaborative filtering case, they work well at least in part because there's not a canonical way of posing the problem statistically that's also tractable--- there are instead multiple ways to view the problem, which expose different information. Aggregating those views is a pretty straightforward way of getting more information."
  If the thing to be predicted also includes the unknown parameter of "exactly how are the judges going to define the performance metric", then similarly averaging over different possibilities is a good risk-minimization strategy.
  In practice (knowing how people watch movies and what Netflix cares about) the statistical setting of Netflix seems to be this:
  There is an unknown distribution of draw tuples of (movies, people, ratings). (In practice, "date-of-rating" and "date-of-movie" turn out to be additional useful data.)
  You have observed a large number of these tuples already.
  Then, given a number of draws for (movies, person, ratings) where person is fixed, predict the rating given (new_movie, same_person).
  The asymmetry is natural because movies and people are not interchangeable: movies do not have opinions about people.
  I don't consider Netflix to be very ill-posed statistically.
2. Re:a few different things going on by NoOneInParticular · 2009-09-01 10:43 · Score: 4, Interesting
  
  There's a lot of argument over why ensemble techniques work well in general, when using them on well-posed statistical problems.
  
  Not sure if there's a lot of argument, as the bias-variance decomposition does seem to give some critical insight into why ensemble techniques work. Couldn't find a good link, but loosely, every statistical prediction method will make errors due to bias (being systematically wrong, always in the same direction) and errors due to variance (being highly dependent on the particular data it was fitted to, i.e. overfitting). Error measures such as sum-of-squares can be readily decomposed into a term for (squared) bias and a term for variance.
  Some methods will have errors mainly due to bias (linear methods). Others have error mainly due to variance (neural nets, for instance). Whichever method has the lowest sum of the errors - bias + variance - wins. This holds for any prediction method, so also for an ensemble method. If you take, for instance, the ensemble method of bagging (Bootstrap AGGregatING), you use the same method on differently resampled data. The effect is that you average out much of the error due to variance and are left with the error due to bias. If you happen to use a low-bias method (such as a neural network or a decision tree), this works fine.
  For the recommender challenges, something even niftier is happening. Here many different methods are aggregated, each with different biases and different variances. So, in theory, when doing this, one should be able to average out both bias and variance error, and converge on the (Bayesian) optimal predictor. Note however that averaging is in itself a prediction method, so ensemble methods have their own bias and variance tradeoff. Fortunately, computing the mean is an unbiased and low variance method in its own right.
  What I always found very interesting about ensemble methods is that they effectively contradict Occam's razor, in that the end result of an ensemble is not a single theory that predicts the data well, but a whole set of mutually contradicting theories that each hold some of the truth. The ensemble result might actually be huge, even when the system is simple.
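A small simulation of the bias-variance argument above. It is a sketch under loud assumptions: the "model" is just the true value plus a fixed bias plus Gaussian scatter, and averaging 25 independent copies stands in for bagging (real bootstrap replicates are correlated, so real bagging removes less variance than this). `TRUTH`, `noisy_predictor` and `decompose` are hypothetical names.

```python
import random
import statistics

random.seed(0)
TRUTH = 3.0  # the quantity being predicted (made up)


def noisy_predictor(bias, spread):
    """One prediction from a hypothetical model with fixed bias and random scatter."""
    return TRUTH + bias + random.gauss(0.0, spread)


def decompose(predictions):
    """Return (mse, bias_squared, variance) of predictions against TRUTH."""
    mse = statistics.fmean((p - TRUTH) ** 2 for p in predictions)
    bias_sq = (statistics.fmean(predictions) - TRUTH) ** 2
    variance = statistics.pvariance(predictions)
    return mse, bias_sq, variance


# One low-bias, high-variance model, "retrained" 5000 times.
single = [noisy_predictor(bias=0.1, spread=1.0) for _ in range(5000)]

# The average of 25 such models per prediction (idealised bagging).
averaged = [
    statistics.fmean(noisy_predictor(bias=0.1, spread=1.0) for _ in range(25))
    for _ in range(5000)
]

for name, preds in (("single", single), ("averaged", averaged)):
    mse, bias_sq, variance = decompose(preds)
    print(f"{name:9s} mse={mse:.3f} bias^2={bias_sq:.3f} variance={variance:.3f}")
```

In both rows the mean squared error equals squared bias plus variance; averaging shrinks the variance term and leaves the bias term roughly where it was, which is why bagging suits low-bias base methods.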
3. Re:a few different things going on by Anonymous Coward · 2009-09-01 11:21 · Score: 0
  
  Well, descriptively speaking, there's a lot of argument, as papers participating in the argument continue to be published. =] The hottest topic at the moment, afaict, is a dispute between proponents of a statistical view of boosting and opponents of that view. There's a good back-and-forth on that in "Evidence contrary to the statistical view of boosting" (JMLR, 2008) and the 6 replies (and rejoinder to the replies) that were published alongside it.
4. Re:a few different things going on by martin-boundary · 2009-09-01 12:02 · Score: 1
  
  This idea that there are multiple ways of partially viewing the full problem is only a symptom that we don't yet know the *right* way to view the full problem.
  There's nothing special about ensemble methods that makes them better than a unified model. On the contrary, they are linear combinations of predictors, and the lack of unification makes this a rather inefficient idea.
  Compare with a series representation (eg Fourier, wavelets, etc): if each of the component predictors of your ensemble method is decomposed into a common set of basis functions, then in the best case there will be little overlap, but more likely there will be much overlap. So the ensemble is re-estimating the coefficients of the major components again and again. Not very efficient.
  Of course, the right functional basis to use is unknown, and quite likely highly problem dependent, so the idea of using ensembles is not too bad practically. But theoretically it's only a stopgap at best.
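To make the "common set of basis functions" point concrete, here is a minimal sketch. It assumes every component predictor is linear in the same basis (here 1, x, x², picked purely for illustration) and that the ensemble is a fixed weighted sum; the coefficients and weights are made up. Under those assumptions the whole ensemble collapses into a single coefficient vector.

```python
# Shared basis functions; each hypothetical model is a coefficient tuple over them.
BASIS = (lambda x: 1.0, lambda x: x, lambda x: x * x)


def evaluate(coeffs, x):
    """Evaluate one model: sum of coefficient * basis function."""
    return sum(c * f(x) for c, f in zip(coeffs, BASIS))


models = [(1.0, 2.0, 0.0), (0.5, 1.5, 0.25), (0.0, 2.5, -0.25)]
weights = [0.5, 0.25, 0.25]


def ensemble(x):
    """Weighted linear combination of the three predictors."""
    return sum(w * evaluate(m, x) for w, m in zip(weights, models))


# The same predictor, written as one model over the same basis.
unified = tuple(
    sum(w * m[i] for w, m in zip(weights, models)) for i in range(len(BASIS))
)

print(unified)
print(ensemble(1.5), evaluate(unified, 1.5))
```

The three models each carry their own estimate of the `x` coefficient, which is the redundancy the comment calls inefficient. When the members are not linear in a shared, known basis, no such simple collapse is available.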
5. Re:a few different things going on by martin-boundary · 2009-09-01 12:21 · Score: 1
  
  That's not what ill-posed usually means.
  In fact, the actual problem definition was very well thought out, and the performance metric is not at all dependent on the judges, but is precisely the least squares criterion.
  What makes Netflix ill-posed is that the solution is highly sensitive to small changes in the supplied data. This is empirically clear from the fact that people tried many, many different solution techniques (which use the data in different ways), and each of those doesn't get very close to the target performance, and the target itself is not very close to zero anyhow.
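For reference, the least squares criterion mentioned above is usually reported as a root mean squared error (RMSE) over held-out ratings. A minimal sketch, with a hypothetical function name and made-up numbers:

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating sequences."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    squared = ((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(sum(squared) / len(actual))


print(rmse([3.5, 4.0, 2.0], [4, 4, 1]))
```

Because the errors are squared before averaging, a few badly wrong predictions cost more than many slightly wrong ones.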
6. Re:a few different things going on by Alpha830RulZ · 2009-09-01 14:50 · Score: 1
  
  What I always found very interesting about ensemble methods is that they effectively contradict Occam's razor, in that the end result of an ensemble is not a single theory that predicts the data well, but a whole set of mutually contradicting theories that each hold some of the truth. The ensemble result might actually be huge, even when the system is simple.
  Well put. We have some software that does predictive modeling/data mining using ensemble techniques. The ensemble models dependably work better, with less data preparation effort, than the logistic regression models that the financial industry uses on the same sets of data. We have a hard time selling into the lending industry, because the complexity of the resulting ensemble models does not lend itself well to being explained to regulators, even though the models themselves perform very well on holdout sets.
  
  --
  I was taught to respect my elders. The trouble is, it's getting harder and harder to find some.
Weighting of ensembles by reginaldo · 2009-09-01 07:19 · Score: 2, Funny

One of the difficulties of ensemble development is weighting the logic that is being developed. For instance, one of the problems we deal with at my job is matching incoming text to its cleaned value. We have a list of approved words ['happy', 'sad', 'angry', 'sleepy'], and a text input of 'hap'. We need to determine which valid word 'hap' should match. Some rules I can think of for properly matching are:

1.)Length of input compared to cleaned word.
2.)Number of nonpositional letter matches.
3.)Number of positional letter matches.

How the rules are weighted determines what the answer will be (either sad or happy). I know at my job this weighting process requires very careful politicking. :D
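The three rules can be sketched as scoring functions combined by a weight vector. The scoring formulas below are one guess at what each rule might mean, not the poster's actual implementation, and all function names are hypothetical.

```python
from collections import Counter

APPROVED = ["happy", "sad", "angry", "sleepy"]


def length_score(text, word):
    """Rule 1: 1.0 when the lengths match, falling toward 0 as they diverge."""
    return 1.0 - abs(len(word) - len(text)) / max(len(word), len(text))


def nonpositional_score(text, word):
    """Rule 2: fraction of input letters found anywhere in the word."""
    shared = Counter(text) & Counter(word)  # multiset intersection
    return sum(shared.values()) / len(text)


def positional_score(text, word):
    """Rule 3: fraction of input letters matching at the same index."""
    return sum(a == b for a, b in zip(text, word)) / len(text)


RULES = (length_score, nonpositional_score, positional_score)


def best_match(text, weights):
    """Return the approved word with the highest weighted rule score."""
    def total(word):
        return sum(w * rule(text, word) for w, rule in zip(weights, RULES))
    return max(APPROVED, key=total)


print(best_match("hap", weights=(1, 1, 1)))  # equal weights
print(best_match("hap", weights=(5, 1, 1)))  # length rule dominates
```

With these particular formulas, equal weights pick "happy" (every letter matches in position), while weighting length five times as heavily picks "sad" (same length as the input), which is the tension described above.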
1. Re:Weighting of ensembles by kthejoker · 2009-09-01 08:19 · Score: 1
  
  Serious question: do you guys have a software routine you use to account for fat-fingered typos? So do you weight the positional changes by how close a letter is to another on the keyboard?
  So for example,
  Leyboard
  would be a closer match to "keyboard" than
  Reyboard
  because L is geospatially closer to K on the keyboard than R is?
  That would be an interesting program to see, anyway.
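A rough sketch of what such a routine could look like, assuming a US QWERTY layout, approximate row stagger, and substitution-only typos on equal-length words. The offsets and function names are illustrative assumptions; other layouts (Dvorak, AZERTY) would need their own tables.

```python
import math

ROWS = ("qwertyuiop", "asdfghjkl", "zxcvbnm")
OFFSETS = (0.0, 0.25, 0.75)  # rough horizontal stagger per row, in key widths

# Map each letter to an (x, y) position on the keyboard.
KEY_POS = {
    ch: (col + OFFSETS[row], float(row))
    for row, keys in enumerate(ROWS)
    for col, ch in enumerate(keys)
}


def key_distance(a, b):
    """Euclidean distance between two letter keys, in key widths."""
    (x1, y1), (x2, y2) = KEY_POS[a.lower()], KEY_POS[b.lower()]
    return math.hypot(x1 - x2, y1 - y2)


def typo_cost(typed, target):
    """Total key distance between two equal-length words (substitutions only)."""
    if len(typed) != len(target):
        raise ValueError("this sketch only handles equal-length words")
    return sum(key_distance(t, c) for t, c in zip(typed, target))


print(typo_cost("Leyboard", "keyboard"), typo_cost("Reyboard", "keyboard"))
```

Here "Leyboard" scores as the cheaper typo because L sits next to K while R is several keys away on another row. A fuller version would fold this cost into an edit distance so that insertions and deletions are handled too.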
2. Re:Weighting of ensembles by Anonymous Coward · 2009-09-01 13:14 · Score: 0
  
  Also remember to account for Dvorak and different native language biased keyboards (AZERTY, etc.).
Management Summary ? by Anonymous Coward · 2009-09-01 07:26 · Score: 0

Ok, I must admit that the contents of this article were way over my simplistic head. Can someone give me the management summary in layman's terms ?
1. Re:Management Summary ? by turbidostato · 2009-09-01 07:37 · Score: 1
  
  "Can someone give me the management summary in laymen terms ?"
  Do-Your-Maths (full stop).
2. Re:Management Summary ? by Trepidity · 2009-09-01 08:04 · Score: 2, Informative
  
  Here's a stab at 3 sentences:
  Making a prediction by running multiple statistical prediction algorithms and combining their results often seems to work well. This is called an "ensemble method". Ensemble methods seem to work particularly well on collaborative filtering problems.
  
  --
  10 PRINT CHR$(205.5+RND(1)); : GOTO 10
3. Re:Management Summary ? by Anonymous Coward · 2009-09-01 08:04 · Score: 0
  
  Yeah, that's the problem right there. As soon as 'math' is mentioned, my brain functions full stop.
4. Re:Management Summary ? by Anonymous Coward · 2009-09-01 08:17 · Score: 0
  
  Hrm. Using *multiple* statistical prediction algorithms and combining them to produce a result instead of just using a *single* statistical prediction algorithm to produce a result ? As in, using multiple sources instead of a single source to get better results ? Sounds pretty much like normal common sense to me... But then again, I don't understand the math :)
5. Re:Management Summary ? by mbkennel · 2009-09-01 08:23 · Score: 2, Insightful
  
  Algorithms produce better results working in committees, unlike you.
jq by gmermnstinsmermwords · 2009-09-01 07:30 · Score: 0

I want that in my jquery to be bundled with windows 7
I'm sorry by nickdwaters · 2009-09-01 07:35 · Score: 1

I have been a developer since the 80's. There are so many trendy cute buzzwords flying around (cloud computing...what the ****?), "mashups", "ensemble", etc. Can't we just call it what it is instead of this marketing crap? So tired of it.
1. Re:I'm sorry by badboy_tw2002 · 2009-09-01 07:48 · Score: 1
  
  No problem! Please inform us what year we should have stopped naming stuff and just stuck with the tried and true.
  Sorry, progress doesn't stop because you want it to or are getting a little long in the tooth to learn new things.
2. Re:I'm sorry by Anonymous Coward · 2009-09-01 07:49 · Score: 0
  
  Would you prefer "Combination of Experts?"
  "Ensemble" is actually a machine learning term that's been used for at least a decade now.
3. Re:I'm sorry by kaputtfurleben · 2009-09-01 07:57 · Score: 1
  
  As much as I share your loathing for marketing, I don't believe this is a case of that. There is no company pushing this term, it's a community (or research?) -driven word. I'm not sure there IS a word for it yet except for ensemble. If you know otherwise, please let us know.
  I'll get off your lawn now.
4. Re:I'm sorry by gmermnstinsmermwords · 2009-09-01 07:59 · Score: 0
  
  Expedia airline tickets are an online mashup, a cloud is machines programmed to be less like individual machines, and ensemble? Well, who knows what the fuck that means?
5. Re:I'm sorry by nickdwaters · 2009-09-01 08:13 · Score: 1
  
  Point taken. Utilizing the results of multiple models via bootstrap is nothing new. The term "ensemble" as applied just .... makes me have to deal with a new way which really annoys me. An ensemble as typically defined is a group of musicians working together in harmony. As applied in statistical modeling it is a process which selects the model which fits the data the best. Thus it makes me irate as my understanding is in opposition with its classical use.
6. Re:I'm sorry by Trepidity · 2009-09-01 08:21 · Score: 4, Informative
  
  Yeah, the term dates back at least to the 1990s. The classic survey paper (over 1000 citations!) on the subject is "Ensemble Methods in Machine Learning" [pdf] by Tom Dietterich (2000), for those who want to glance through a survey. Though be warned that some of its specific conclusions are now dated--- e.g. there's been a *lot* written in both statistics and machine learning since then on what boosting "really" is and why it works.
  Dietterich presents the more machine-learning view of it, focused on algorithms, combination of predictions, iterative refinement, etc. The best survey from a statistical approach is probably Ch. 16 of this book by three Stanford profs, which you can probably read some of on Google Books.
  
  --
  10 PRINT CHR$(205.5+RND(1)); : GOTO 10
7. Re:I'm sorry by amplt1337 · 2009-09-01 08:56 · Score: 1
  
  The "progress" in terminology you're talking about is mostly the result of you being too young to have bothered to learn the old things.
  Which is fine. Way too much stuff has happened in the past for anybody to know it all; it's just unremarkable that everything old should be new again, under a new name, as ideas get born and reborn.
  
  --
  Freedom isn't free; its price is the well-being of others.
8. Re:I'm sorry by Anonymous Coward · 2009-09-01 09:04 · Score: 1
  
  A field is typically defined as a patch of grass with some kind of enclosure around it, often made of wood, stone or natural bushes.
  Please while you are on your campaign, have math, science and computing rename the concepts they refer to as fields.
9. Re:I'm sorry by JumpDrive · 2009-09-01 11:28 · Score: 1
  
  I understand what the parent post is getting at. There are a lot of marketing crap names out there. I've just gotten used to telling them "please explain what you are talking about", and very often I find that is where the vendor, peer employee, or employee gets stuck. You find out later it's just a dynamic application using a database and webserver.
  But in this case "Ensemble" is not a new term, even by my old fart standards; it's been around since 1995 and describes the use of multiple modeling methods to achieve an improved model. It can also be the same method combined several times to create a different model. It's not a term that people would generally know, and I think the original poster of the article should have defined it, because it definitely sounds like one of those marketing terms.
  
  The name and the actual usage isn't like somebody figured out how to split the atom. If you do data mining and data analysis, you are probably going to invent this technique on your own.
  "How did you come up with such a trustworthy predictive model?"
  " Well I used a neural network, a tree model, and a linear regression model and combined them together, an ensemble you might say of modeling techniques".
10. Re:I'm sorry by JumpDrive · 2009-09-01 11:39 · Score: 1
  
  I read one of the follow-up posts and they had a link to The Elements of Statistical Learning: Data Mining, Inference, and Prediction by Trevor Hastie, Robert Tibshirani, and Jerome Friedman.
  
  First line of Chapter 16 Ensemble Learning: "The idea of ensemble learning is to build a prediction model by combining the strengths of a collection of simpler base models".
  
  Yeah, that's what I meant to say :-)
11. Re:I'm sorry by JumpDrive · 2009-09-01 11:45 · Score: 1
  
  Current favorite tech book: Racing the Beam [bit.ly] , on the Atari VCS
  
  So you're the one that bought that book.
12. Re:I'm sorry by Trepidity · 2009-09-01 11:51 · Score: 1
  
  What's that cryptic comment mean? I did indeed buy it, though, yes. =]
  
  --
  10 PRINT CHR$(205.5+RND(1)); : GOTO 10
13. Re:I'm sorry by mbkennel · 2009-09-01 11:53 · Score: 1
  
  As applied in statistical modeling it is a process which selects the model which fits the data the best. Thus it makes me irate as my understanding is in opposition with its classical use.
  In fact it isn't. The classical practice is selecting the model which fits the data the best, known as "model selection".
  Ensemble methods do little "yes/no" model selection and instead make predictions from weighted combinations of many base predictors. To make a prediction, you need to score many or all of the base models and combine their outputs.
  The use isn't incommensurate with musical connotation.
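The contrast between model selection and an ensemble can be sketched as follows. The base models, their validation errors, and the inverse-error weighting scheme are all invented for illustration; real ensembles usually fit the combination weights on held-out data.

```python
# Hypothetical base models and validation errors (lower is better).
base_models = {
    "linear": lambda x: 2.0 * x,
    "offset": lambda x: 2.0 * x + 1.0,
    "curved": lambda x: 0.1 * x * x + 2.0 * x - 0.5,
}
validation_error = {"linear": 0.40, "offset": 0.55, "curved": 0.35}


def select_one(x):
    """Classical model selection: score only the single best-fitting model."""
    best = min(validation_error, key=validation_error.get)
    return base_models[best](x)


def weighted_ensemble(x):
    """Ensemble: score every base model, combine with normalised weights."""
    raw = {name: 1.0 / err for name, err in validation_error.items()}
    total = sum(raw.values())
    return sum(raw[name] / total * model(x) for name, model in base_models.items())


print(select_one(2.0), weighted_ensemble(2.0))
```

`select_one` throws away two of the three models at prediction time; `weighted_ensemble` has to run all three on every input, which is the extra scoring cost mentioned above.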
14. Re:I'm sorry by JumpDrive · 2009-09-01 13:12 · Score: 1
  
  I followed the link and thought that it was a very obscure topic.
  Just struck me as something where I really couldn't imagine a whole lot of people would read something like this.
15. Re:I'm sorry by Trepidity · 2009-09-01 14:12 · Score: 1
  
  Hmm, I could see that, though the Atari VCS (aka Atari 2600) was a pretty popular platform. There's fan forums and such devoted to it. It seemed cool to me that someone finally wrote a good book on it, since it was pretty influential.
  
  --
  10 PRINT CHR$(205.5+RND(1)); : GOTO 10
encoding data by gmermnstinsmermwords · 2009-09-01 07:37 · Score: 0

Isn't this thing about encoding data in different ways and then using a combined result?
Sounds just like MCTS by darrencook · 2009-09-01 18:05 · Score: 1

Machine learning ensembles sound just like Monte Carlo tree search (MCTS) techniques (UCT is the best-known variant), which are used in computer Go (and more and more other AI problems) with great success.
The idea is that instead of trying to analyze a board position (which can be really, really difficult) using clever algorithms, you ask a random/simplistic algorithm to play out the rest of the game thousands upon thousands of times and see how many of those games it wins. The more it wins, the better the position.
Sounds crazy, but it actually works better than anything else.
(MCTS is usually thought of as using just one playout algorithm, with many random parameters; but that is still the same basic idea as ensembles using a bunch of different algorithms/models.)
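A minimal sketch of the playout idea, using a tiny stone-taking game instead of Go. Note the assumptions: the game (take 1 to 3 stones, whoever takes the last stone wins) is chosen only because it fits in a few lines, and this is flat Monte Carlo evaluation of each candidate move, not full MCTS/UCT, which also grows a search tree and biases playouts toward promising branches.

```python
import random


def random_playout(stones, rng):
    """Play uniformly random moves to the end.

    Returns True if the player to move at the start takes the last stone.
    """
    mover_is_us = True
    while stones > 0:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            return mover_is_us
        mover_is_us = not mover_is_us
    return False  # nothing left to take: the player to move has already lost


def win_rate(stones, rng, playouts=2000):
    """Fraction of random playouts won by the player to move."""
    return sum(random_playout(stones, rng) for _ in range(playouts)) / playouts


def best_move(stones, rng, playouts=2000):
    """Pick the move that leaves the opponent the lowest playout win rate."""
    moves = range(1, min(3, stones) + 1)
    return min(moves, key=lambda take: win_rate(stones - take, rng, playouts))


rng = random.Random(0)
print(best_move(5, rng))
```

From 5 stones the playouts favor taking 1, leaving 4, which also happens to be the correct move under perfect play for this game. No position-evaluation logic was written at all; the estimate comes entirely from counting wins in random games.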