Test Shows Big Data Text Analysis Inconsistent, Inaccurate

Posted by samzenpus on Sunday February 1, 2015 @05:40AM from the you'll-love-these-links dept.

DillyTonto writes: The "state of the art" in big-data text analysis turns out to rely on a method of categorizing words and documents that, when tested, gave different results for the same data 20% of the time and was flat wrong another 10% of the time, according to researchers at Northwestern. The researchers offered a more accurate method, but only as an example of how community detection algorithms can improve on the leading method, latent Dirichlet allocation (LDA). Meanwhile, a certain percentage of answers from all those big-data installations will continue to be flat wrong until they're re-run, which will make them wrong in a different way.
