Gamers Outdo Computers At DNA Sequence Alignments

Posted by Soulskill on Monday March 12, 2012 @09:35AM from the must-not-be-using-watson dept.

ananyo writes "In another victory for crowdsourcing, gamers playing Phylo have beaten a state-of-the-art program at aligning regions of 521 disease-associated genes from different species. The 'multiple sequence alignment problem' refers to the difficulty of aligning roughly similar sequences of DNA in genes common to many species. DNA sequences that are conserved across species may play an important role in the ultimate function of that particular gene. But with thousands of genomes likely to be sequenced in the next few years, sequence alignment will only become more difficult in future. Researchers now report that players of Phylo have produced roughly 350,000 solutions to various multiple sequence alignment problems, beating the accuracy of alignments from a program in roughly 70% of the sequences they manipulated."

14 of 61 comments

would be interesting to mine their data by Trepidity · 2012-03-12 09:41 · Score: 5, Interesting

I'm highly skeptical that these gamers are really using some un-automatable human-only deep skills, especially since they aren't exactly extensively trained in this game, not to the level of, say, good Go players. So the interesting question to me is not that they beat current algorithms, but whether data mining these hundreds of thousands of alignments can tell us something about how they're doing it. My guess is that there are some heuristics that can be mined from this data that would massively speed up search.
That's a more general point about how these stories are always pushed, though, sometimes by media, sometimes by the researchers themselves. Imo the most exciting thing about successful uses of "human computation" isn't that we can harness people to do things, but that we can gain some large data sets that will make it so we don't have to get people to do them anymore. Or at least, that should be the baseline, imo: that humans can beat some hand-crafted algorithm is one thing, but can they beat machine-learned algorithms trained on those humans' own gameplay logs?

--
10 PRINT CHR$(205.5+RND(1)); : GOTO 10
1. Re:would be interesting to mine their data by K. S. Kyosuke · 2012-03-12 09:50 · Score: 5, Funny
  
  Perhaps they could make it into yet another captcha: "If you want to download your porn movie, please align the following two DNA fragments." :) If people can be made to do OCR for others, why not DNA alignment?
  
  --
  Ezekiel 23:20
2. Re:would be interesting to mine their data by rish87 · 2012-03-12 09:51 · Score: 2
  
  I agree 100% with the sentiment of figuring out how the players make their decisions and using that as new heuristics. The MSA problem isn't that computers cannot get the optimal solution; the problem is doing it quickly. Given enough time, a computer will always outdo or match a human. What needs to be done is to improve the existing computational algorithms with heuristics learned from these players. Then we would have much better results at a much faster rate.
3. Re:would be interesting to mine their data by tibit · 2012-03-12 10:01 · Score: 3, Interesting
  
  This is not as silly as you might think. If it weren't for generally fucked up academic politics, this would work wonders. Get a bunch of popular porn sites to accept Phylo points as payment. My bet is that there'd be plenty of teenagers and basement dwellers who can trade plenty of time for the money they don't have to pay for porn :)
  
  --
  A successful API design takes a mixture of software design and pedagogy.
4. Re:would be interesting to mine their data by Trepidity · 2012-03-12 10:09 · Score: 3, Insightful
  
  That's true; a legitimate hypothesis is that this task involves very difficult skills that humans are naturally adept at, as object recognition in images does. My guess is that aligning DNA sequences is not as strong an example of one of those kinds of problems as object recognition, in particular because it doesn't involve the large amount of general knowledge about the world that we bring to bear when interpreting scenes; aligning sequences is more of a "formal" problem than recognizing what constitutes a "chair". But I'll admit I could be wrong. One way to find out would be to try to see how much can be mined from the data. ;-)
  
  --
  10 PRINT CHR$(205.5+RND(1)); : GOTO 10
5. Re:would be interesting to mine their data by ottawanker · 2012-03-12 10:56 · Score: 5, Funny
  
  Worst case scenario is that the crackers write a really good DNA sequencing program to beat the captchas.
6. Re:would be interesting to mine their data by mug funky · 2012-03-12 12:57 · Score: 3, Interesting
  
  wouldn't the problem at hand be NP-hard? maybe that's why gamers are beating the algos?
  could this be a new way to "monetize" the internet? outsourcing hard problems for cash. with a cloud paradigm, it doesn't matter whether it's a cluster of computers or a crowd of aspies when the end result is the same.
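A note on the "optimal solution" and NP-hard points raised in this sub-thread: for just two sequences, an exact optimal alignment is cheap to compute with dynamic programming (Needleman-Wunsch). It is the exact multiple-sequence case, under the usual sum-of-pairs scoring, that is known to be NP-hard, which is why practical aligners lean on heuristics. The sketch below is only an illustration of the two-sequence case. The function name and the scoring values (match, mismatch, gap) are assumptions chosen for the example, not anything taken from Phylo or from the program it was compared against.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score of strings a and b (Needleman-Wunsch).

    Scoring values are illustrative assumptions, with a simple linear gap
    penalty. Runs in O(len(a) * len(b)) time and O(len(b)) memory.
    """
    # Row 0: aligning the empty prefix of a against each prefix of b
    # costs one gap per character of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against the empty prefix of b.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            # Either pair a[i-1] with b[j-1], or put a gap in one sequence.
            pair = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            gap_in_b = prev[j] + gap      # a[i-1] aligned to a gap
            gap_in_a = curr[j - 1] + gap  # b[j-1] aligned to a gap
            curr.append(max(pair, gap_in_b, gap_in_a))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "AGT"))
```

The table-filling idea extends to k sequences, but the table then has one dimension per sequence, so the cost grows exponentially in k. That blow-up, rather than any inability to find the optimum in principle, is the practical obstacle.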
Re:Brilliant! by GrumblyStuff · 2012-03-12 09:48 · Score: 2

Makes the original premise of The Matrix that much better than the "lol we're batteries!"
Imagine... by KillAllNazis · 2012-03-12 09:55 · Score: 2

... a beowulf cluster of these!
1. Re:Imagine... by lister king of smeg · 2012-03-12 11:25 · Score: 2
  
  Call me when you can make a Beowulf cluster of human brains. I bet porting C will be a real bitch though.
  
  --
  ---Saying gnome 3 is better than windows 8 not so much a compliment as it is damning with light praise.
Achievement/Trophy Unlocked! by Eponymous Hero · 2012-03-12 10:36 · Score: 2

Cured Lupus! 150G / Platinum Trophy

--
insensitive clod overlords obligatory xkcd car analogy russian reversals whoosh pedant fanbois ftfy in 3...2...1..PROFIT
Read the fine print... by whydavid · 2012-03-12 10:43 · Score: 5, Informative

This is an interesting finding, but let's not get too carried away. If you read the article, you'll see that:
a) The Phylo-based alignments are partial solutions. They are simplified for the human user by leaving many orthologous sequences out of the alignment. This means there is another algorithm that finishes these partial solutions before they can be compared to solutions produced solely by algorithms.
b) Only 36% of the _best_ Phylo-based solutions, once completed, were better than the algorithms' solutions.
This is still an improvement, but it DOES NOT suggest that humans are better than computers at multiple sequence alignment. If you were to ever try to solve a real MSA problem by hand, you would quickly understand how completely hopeless it is. In fact, even aligning 2 sequences of any appreciable length by hand is a chore.
The problem here is the misguided title, "Gamers outdo computers at matching up disease genes", which should read: "Gamers + computer outdo computers only at matching up very small fragments of disease genes, some of the time"
1. Re:Read the fine print... by whydavid · 2012-03-12 18:10 · Score: 2
  
  BAliBASE is a great reference, but all of the sequence alignments in the database were refined from algorithmically-derived alignments (implemented on computers) in the first place. I think it furthers my assertion that computers + humans > either alone when it comes to MSA. Certainly, the sheer scale of the data would prevent any sort of economic use of manual global alignment, even if the local alignments were best carried out by biologists. Again, my issue here is that the article gives the impression that gamers have "outdone" computers at matching up disease genes, when in reality the gamers have been presented with a very small slice of the problem (as I'm sure you recognize better than I) and only outperformed the computer alone in certain scenarios, certainly not the blanket 70% quoted in the news piece.
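To make "better than the algorithms' solutions" concrete: comparing two candidate alignments needs a score. One common textbook choice is the sum-of-pairs score, which adds up a pairwise score for every pair of rows in every column. The sketch below is a minimal illustration under assumed scoring values. It is not the scoring used by Phylo or in the article, and the function name and parameters are hypothetical.

```python
from itertools import combinations


def sum_of_pairs(rows, match=1, mismatch=-1, gap=-2):
    """Sum-of-pairs score of a multiple alignment.

    rows is a list of equal-length strings with '-' marking gaps.
    Scoring values are illustrative assumptions. A gap paired with
    another gap contributes nothing.
    """
    if len({len(r) for r in rows}) > 1:
        raise ValueError("all aligned rows must have the same length")
    total = 0
    # zip(*rows) walks the alignment column by column.
    for column in zip(*rows):
        for x, y in combinations(column, 2):
            if x == "-" and y == "-":
                continue
            if x == "-" or y == "-":
                total += gap
            elif x == y:
                total += match
            else:
                total += mismatch
    return total


if __name__ == "__main__":
    candidate_a = ["ACGT", "A-GT", "ACGA"]
    candidate_b = ["ACGT", "AG-T", "ACGA"]
    # The higher-scoring candidate counts as the better alignment
    # under this particular scoring scheme.
    print(sum_of_pairs(candidate_a), sum_of_pairs(candidate_b))
```

Which alignment "wins" depends on the scoring scheme, which is one more reason to read headline percentages with care.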
Re:Time limit by KhabaLox · 2012-03-12 11:59 · Score: 3, Interesting

I haven't played the "game", but I suspect that there are a lot of things, like time limits, that can serve as motivating factors and actually increase user output in the aggregate. Having a time limit can give you a sense of urgency that will force you to work faster. The error rate may increase, but overall productivity could still be higher given the higher number of "answers" given per unit time.
Imagine two Magic The Gathering players. One assembles decks painstakingly, spending hours crafting card ratios just right, and researching combos to get the perfect balance of # cards to power of combo. Then he play tests it, goes back and makes adjustments, etc. The other throws decks together quickly and play tests them very quickly. He adjusts the deck without as much deliberate thought, but rather more quickly (perhaps intuitively). He is able to iterate much faster, and it's easy to imagine that if each player were given 1 month to pursue these strategies, the latter could easily come out with more decks that met some minimum standard of success (that was suitably high).
(Obviously, it's easy to see how inane, useless rewards can spur gamers to expend more time and "contribute" more to the game... just look at badges, trophies, etc. But I think it's just as possible that "negative" reinforcement ideas, such as a time limit, can have the same effect.)

--
Ceci n'est pas un sig.