Can Machine Learning Replace Focus Groups?


Posted by samzenpus on Thursday May 31, 2012 @10:40AM from the what-does-the-machine-think? dept.

itwbennett writes "In a blog post, Steve Hanov explains how 20 lines of code can outperform A/B testing. Using an example from one of his own sites, Hanov reports a green button outperformed orange and white buttons. Why don't people use this method? Because most don't understand or trust machine learning algorithms, mainstream tools don't support it, and maybe because bad design will sometimes win."

4 of 93 comments

OK, so... by war4peace · 2012-05-31 10:44 · Score: 5, Insightful

I have read the synopsis 4 (four) times and I didn't get shit.
Of course, TFA sheds some light on the whole thing, but really... work on your short version, guys, because what's in here makes no sense.

--
...gis sdrawkcab (usually not responding to ACs; don't bother posting as AC)
1. Re:OK, so... by Tarsir · 2012-05-31 14:42 · Score: 5, Insightful
  
  You know, I read the summary without understanding it, and just clicked through to read the article, but only after reading your comment did I realize just how little sense the summary really made.
  
  "In a blog post, Steve Hanov explains how 20 lines of code can outperform A/B testing."
  It starts off talking about a nobody who did something that is apparently so trivial that it can be outdone by 20 lines of code. You might think that the following sentence will answer at least one of the questions raised by this sentence: Who is Steve Hanov? What is A/B testing? What do Steve's 20 lines of code do? But you'd be wrong.
  
  "Using an example from one of his own sites, Hanov reports a green button outperformed orange and white buttons."
  Because the next sentence jumps to a topic whose banality and seeming irrelevance to the matter at hand defy belief. Three coloured buttons, one of which 'outperformed' the others, with nary a hint as to what these buttons do, or how one can outperform the others.
  
  "Why don't people use this method?"
  The third sentence appears to pick up where the first left off. Why don't people use the A/B testing method? Or are we talking about the three coloured buttons method?
  
  "Because most don't understand or trust machine learning algorithms, mainstream tools don't support it, and maybe because bad design will sometimes win."
  The final sentence is a tour-de-force of disjointed confusion. It skips from machine learning algorithms that haven't been discussed, to tools with unknown purpose, to the design of something which was never specified.
  It's like the summary is some kind of abstract art installation whose purpose is to be as uninformative as possible. It is literally the opposite of informative: Not only does it provide no information, it raises questions which you can't even be sure relate to the purported topic at hand, because you don't know what the topic at hand is.
  It is either a bizarrely confused summary or one of the most artful trolls ever to grace Slashdot's front page.
This is not exclusively machine learning by Anonymous Coward · 2012-05-31 11:04 · Score: 5, Insightful

This is not "machine learning" substituting for human A/B testing. It's just changing the ratio of the number of visitors exposed to the "new" feature to be tested from 50% to 10%, while keeping the rest (90%) of the visitors using the "best so far" feature. There's also a bit of randomness thrown in when choosing which new feature the 10% of visitors get to test.
In this scheme, the human visitors are still doing the A/B testing, it's just that determination of which human is testing which feature dynamically adapts over time.
Now, if this guy had substituted human A/B testing completely with a machine learning technology that could somehow determine which feature is better without any input from humans, then I'd be impressed. That's kind of what the summary and article imply. But that's not what he's done. He's just being a bit more sophisticated regarding which humans get to test which feature.
He's also made a big fat claim regarding the effectiveness of his method with zero evidence to back it up. Theoretical results regarding multi-armed bandit problems are quite a far cry from real-world results regarding website feature selection. I'm looking forward to seeing some results of the proposed method on the latter.
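For anyone who wants to see the scheme from the comment above spelled out, here is a rough sketch of it (usually called epsilon-greedy): 10% of visitors get a randomly chosen option, the other 90% get whichever option has the best observed click rate so far. This is not Hanov's actual code. The class name, the `simulate` helper, and the click rates passed to it are all made up for illustration, and treating a never-shown option as having a 0.0 click rate is an assumption of this sketch.

```python
import random


class EpsilonGreedy:
    """Hypothetical sketch of the 10% explore / 90% exploit scheme."""

    def __init__(self, options, epsilon=0.1, rng=None):
        self.options = list(options)
        self.epsilon = epsilon  # fraction of visitors who get a random option
        self.rng = rng if rng is not None else random.Random()
        self.shows = {o: 0 for o in self.options}   # times each option was shown
        self.clicks = {o: 0 for o in self.options}  # times each option was clicked

    def rate(self, option):
        # Observed click rate; an option never shown counts as 0.0 (assumption).
        shows = self.shows[option]
        return self.clicks[option] / shows if shows else 0.0

    def choose(self):
        if self.rng.random() < self.epsilon:
            # Explore: pick any option uniformly at random.
            option = self.rng.choice(self.options)
        else:
            # Exploit: pick the best option so far (ties go to the first listed).
            option = max(self.options, key=self.rate)
        self.shows[option] += 1
        return option

    def reward(self, option):
        # Call this when the visitor actually clicks the option they were shown.
        self.clicks[option] += 1


def simulate(true_rates, visitors, seed=None):
    """Feed simulated visitors with made-up click rates through the bandit."""
    rng = random.Random(seed)
    bandit = EpsilonGreedy(true_rates.keys(), epsilon=0.1, rng=rng)
    for _ in range(visitors):
        option = bandit.choose()
        if rng.random() < true_rates[option]:
            bandit.reward(option)
    return bandit
```

Something like `simulate({"orange": 0.02, "green": 0.05, "white": 0.03}, visitors=10000)` lets you inspect `shows` and `clicks` afterwards. Note that the humans are still supplying every data point here, which is the commenter's point: the code only decides who sees what.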
The article's premise is entirely wrong by RandCraw · 2012-05-31 11:15 · Score: 5, Insightful

A/B focus testing is about observing how customers or users choose between two alternatives based on their qualitative sense of aesthetics. ML is about classifying data based on quantifying the data into defined classes or toward optimal values.
Predicting the outcome of a focus group is a completely different problem from multi-armed slot machines. In focus groups there is no objective metric, so focus group problems are not amenable to machine learning unless your machine can define, measure, and perhaps predict aesthetic criteria.
Now THAT I'd like to see.