Slashdot Mirror


Machine Learning Expert Michael Jordan On the Delusions of Big Data

First-time accepted submitter agent elevator writes: In a wide-ranging interview at IEEE Spectrum, Michael I. Jordan skewers a bunch of sacred cows, basically saying that the overeager adoption of big data is likely to result in catastrophes of analysis comparable to a national epidemic of collapsing bridges; that hardware designers creating chips based on the human brain are engaged in a faith-based undertaking likely to prove a fool's errand; and that, despite recent claims to the contrary, we are no further along with computer vision than we were with physics when Isaac Newton sat under his apple tree.

19 of 145 comments

  1. Michael Jordan by Anonymous Coward · · Score: 5, Funny

    A man of many talents.

  2. Computer vision... by Savage-Rabbit · · Score: 5, Interesting

    ... and despite recent claims to the contrary, we are no further along with computer vision than we were with physics when Isaac Newton sat under his apple tree.

    That's true, I looked into object recognition for image classification by content. Face recognition is proceeding fairly nicely but doing stuff like just programmatically classifying/tagging images by whether they contain a car, airplane, house, tree, dog, mountain .... without even trying to do things like identifying the type of airplane/dog/car is pretty much undoable in any reasonable amount of time with human level accuracy needed on garden variety PCs and tablets which is the application I'd be interested in. The fastest and most accurate image classifier/tagger is still a human. Am still looking forward to the day that changes but I'm not sure that will be within my lifetime.

    --
    Only to idiots, are orders laws.
    -- Henning von Tresckow
    1. Re:Computer vision... by Lennie · · Score: 5, Interesting

      Self-driving cars don't rely on still images only. They have LIDAR which helps identify where objects are and what the size could be. Also they have very detailed maps of the roads, these are all taken into account when identifying objects.

      Have a good look at the limitations section on Wikipedia:
      "...that the lidar technology cannot spot potholes or humans, such as a police officer, signaling the car to stop."

      "The vehicles are unable to recognize temporary traffic signals. ... They are also unable to navigate through parking lots. Vehicles are unable to differentiate between pedestrian and policeman or between crumpled up paper and a rock."

      https://en.wikipedia.org/wiki/...

      Does that seem like a system that has solved computer vision?

      --
      New things are always on the horizon
    2. Re:Computer vision... by bouldin · · Score: 3, Informative

      The Google car doesn't possess the kind of general visual intelligence he was describing. It solves very specific problems (follow the road; if something is in the way, then stop; match speed with the vehicle ahead).

    3. Re:Computer vision... by Ol Olsoc · · Score: 5, Funny

      Vehicles are unable to differentiate between pedestrian and policeman or between crumpled up paper and a rock.

      Stupid damn things are always choosing scissors.

      --
      The shepherds did so well protecting the flock that the sheep no longer believed that wolves existed.
  3. Re:zomg singularity! by CRCulver · · Score: 5, Interesting

    The interview is slightly more nuanced than that. Prof. Jordan says that he can take off his academic hat and read musings on a common singularity with ordinary human awe and wonder. It is only in his work as an academic that he doesn't feel Kurzweil's ideas are relevant.

    I remain sceptical of the singularity idea myself, though for different reasons. When I read Kurzweil's The Singularity is Near, I was disappointed at how in claiming a never-ending increase in the pace of technological advancement, Kurzweil never dealt with the regulatory and consumer factors, and the whole notion of how humans perceive time in general. The wheels of government can only move so fast, and so mankind's access to radical new technology outside the lab (e.g. self-driving cars, new medical tech) must slow down to match the speed of regulatory agencies. Also, consumers can be convinced to buy new shiny things, but there is still a desire to get one's money's worth out of one's purchases, and lots of people still feel their computer or smartphone from three or four years ago is still good enough. Would the market go for replacing one's tech in the shorter and shorter spans that Kurzweil envisions?

    So when I read a computer scientist like Jordan admit that he sees no cause for singularity optimism within his work, I can only feel that Kurzweil's dream is a balloon being stuck with a thousand pins. Still, I continue to enjoy thinking about the subject.

  4. Re:zomg singularity! by Mostly a lurker · · Score: 3, Interesting

    I was disappointed at how in claiming a never-ending increase in the pace of technological advancement, Kurzweil never dealt with the regulatory and consumer factors, and the whole notion of how humans perceive time in general. The wheels of government can only move so fast, and so mankind's access to radical new technology outside the lab (e.g. self-driving cars, new medical tech) must slow down to match the speed of regulatory agencies.

    You make some good points. However, I believe the march towards the singularity will continue inexorably for one (highly undesirable) reason: the insatiable appetite of the leaders of nations for power. The populations of those countries will not even be allowed to know much of what is being developed with hundreds of billions of their tax dollars, but technologies that leaders perceive could enhance their ability to dominate the world will be financed. There will be no regulation. If you want to know the state of the art in visual recognition, you should look at military applications: robot soldiers and autonomous drones. For applications of big data (especially its usefulness in widespread blackmailing activities) then, in spite of some initial missteps, look at the pervasive collection of data by the world's "intelligence agencies".

  5. Re:Cloud by JaredOfEuropa · · Score: 3, Insightful

    There's plenty of reasons I can think of why I'd prefer image recognition on my phone rather than the cloud. Privacy, for one. If you let FB tag your photos with the names of the people in it (after teaching it those names), what do you think happens to that data? You might not even want to share the photo or video stream with anyone... Another reason is that we still do not live in a world with ubiquitous and cheap mobile data. Travel abroad, and you'll find out quickly why cloud-based services like Waze aren't always a viable option.

    --
    If construction was anything like programming, an incorrectly fitted lock would bring down the entire building...
  6. I disagree. by serviscope_minor · · Score: 5, Informative

    As it happens, I am a computer vision expert.

    I do wonder how much useful stuff was done with the results from physics back then as opposed to empirical hand-hacking of everything. I suspect not much.

    Computer vision has a long way to go. On the other hand, there are plenty of things which it does do, some of which are more or less impossible otherwise.

    OCR is very useful. It runs the mail system of many countries and has plenty of use when it comes to digitising old documents. This would be possible, but deeply tedious by hand.

    Structure from motion is used heavily in the film industry to work out 3D structure and motion for placing virtual objects. Almost impossible to do well without computer vision.

    Photo stitching for automatic panoramas. Classic CV system, and my phone comes with it built in.

    Number plate recognition. Apart from the rather unpleasant big brother potential, London's congestion charging system runs off this and it does very good things for London.

    Those cameras/phones with face detection built in. Not sure how useful it is but it works.

    Lego Fusion is a recently released game which appears to rely on computer vision.

    Oh those phone based barcode and QR scanners. Very useful.

    The pick and place machines which use vision for accurate placement.

    This machine which is really awesome: https://www.youtube.com/watch?...

    Lots of other industrial things are controlled by CV.

    Certain types of super resolution microscopy are based on computer vision.

    And that's just a few off the top of my head.

    So yeah computer vision has a long way to go. On the other hand, it's out there doing real things right now. It might not be very advanced CV (the industrial stuff often is not because it needs to be reliable), but it's still CV and it's still being used.

    --
    SJW n. One who posts facts.
    1. Re:I disagree. by ledow · · Score: 5, Interesting

      The problem with computer vision is not that it's not useful, but that it's sold as a complete solution comparable to a human.

      In reality, it's only used where it doesn't really matter.

      OCR - mistakes are corrected by spellcheckers or humans afterwards.

      Mail systems - sure, there are postcode errors, but they result in a slight delay, not a catastrophe of the system.

      Structure from motion - fair enough, but it's not "accurate" and most of that kind of work isn't to do with CV as much as actual laser measurements etc.

      Photo stitching - I'd be hard pushed to see this as more of a toy. It's like a photoshop filter. Sure, it's useful, but we could live without it or do it manually. Probably biggest use in mapping, where it's a time-saver and not much else. It doesn't work miracles.

      Number plate recognition - well-defined formats on tuned cameras aimed at the right point, and I guarantee there are still errors. The systems I've been sold in the past claim 95% accuracy at best. Like OCR, if the number plate is read slightly wrongly, there are fallbacks before you issue a fine to someone based on the image.

      Face detection is a joke in terms of accuracy. If we're talking about biometric logon, it's still a joke. If we're talking about working out if there's a face in-shot, still a joke. And, again, not put to serious use.

      QR scanners - that I'll give you. But it's more to do with old barcode technology that we had 20 years ago, and a very well defined (and very error-correcting) format.

      Pick-and-place rarely relies on vision only. There's much better ways of making sure something is aligned that don't come down to CV (and, again, usually involve actually measuring rather than just looking).

      I'll give you medical imaging - things like MRI and microscopy are greatly enhanced with CV, and the only industry I know where a friend with a CV doctorate has been hired. Counting luminescent genes / cells is a task easily done by CV. Because, again, accuracy is not key. I can also refer you to my girlfriend who works in this field (not CV) and will show you how many times the most expensive CV-using machine in the hospital can get it catastrophically wrong and hence there's a human to double-check.

      CV is, hence, a tool. Used properly, you can save a human time. That's the extent of it. Used improperly, or relied upon to do the work all by itself, it's actually not so good.

      I'm sorry to attack your field of study, it's a difficult and complex area as I know myself being a mathematician that adores coding theory (i.e. I can tell you how/why a QR code works even if large portions of the image are broken, or how Voyager is able to keep communicating, despite interference on an unbelievable magnitude).

      The problem is that, like AI, practical applications run into tool-time (saving a human having to do a laborious repetitive task, helping that task along, but not able to replace the human in the long run or operate entirely unsupervised). Meanwhile, the headlines are telling us that we've invented "yet-another-human-brain", which is so vastly untrue as to be truly laughable.

      What you have is an expertise in image manipulation. That's all CV is. You can manipulate the image to be more easily read by a computer which can extract some of the information it's after. How the machine deals with that, or how your manipulations cope with different scenarios, requires either a constrained environment (QR codes, number plates), or constant human manipulation to deal with.

      Yet it's sold as something that "thinks" or "sees" (and thus interprets the image) like we do. It's not.

      The CV expert I know has code in an ATM-like machine in one of the southern American counties. It recognises dollar bills, and things like that. Useful? Yes. Perfect? No. Intelligent? Far from it. From what I can tell, most of the system is things like edge detection (i.e. image manipulation via a matrix, not unlike every Photoshop-compatible filter going back 20 years), derived heuristics and error-margins.

      Hence, "computer vision" is really a misnomer, where "Photoshopping an image to make it easier to read" is probably closer.
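
      As a rough sketch of the "image manipulation via a matrix" mentioned above: edge detection can be done by sliding a small kernel over the pixel grid. The Sobel kernel, the helper name convolve3x3 and the toy image below are assumptions chosen for illustration; the comment does not say which filter the ATM-like machine uses.

```python
# Illustrative 3x3 Sobel kernel that responds to left-to-right intensity changes.
# The kernel choice is an assumption; the thread names no specific filter.
SOBEL_X = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]


def convolve3x3(image, kernel):
    """Slide a 3x3 kernel over a 2D list of grey levels (no padding).

    Strictly this is cross-correlation (the kernel is not flipped),
    which is how image filters are usually applied in practice.
    """
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(1, rows - 1):
        out_row = []
        for c in range(1, cols - 1):
            acc = 0
            for dr in range(3):
                for dc in range(3):
                    acc += kernel[dr][dc] * image[r + dr - 1][c + dc - 1]
            out_row.append(acc)
        out.append(out_row)
    return out


# Toy image: dark on the left (0), bright on the right (9).
image = [
    [0, 0, 9, 9, 9],
    [0, 0, 9, 9, 9],
    [0, 0, 9, 9, 9],
    [0, 0, 9, 9, 9],
]
edges = convolve3x3(image, SOBEL_X)
print(edges)  # large values where the window straddles the dark/bright boundary
```

      The response is large only where the window covers the boundary and zero in the flat bright region, which is all an "edge" means at this level.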

    2. Re:I disagree. by serviscope_minor · · Score: 5, Insightful

      In reality, it's only used where it doesn't really matter.

      That's patently false. It's used for industrial process control and things like that too. See for example the video I posted. To the manufacturers who use such a machine, it matters an awful lot.

      OCR - mistakes are corrected by spellcheckers or humans afterwards.

      I don't know how much you count this as "mattering". The IEEE has scanned and OCR'd their back catalogue of papers. I don't think they've been human checked due to the sheer volume. It's very useful to be able to get these online now.

      Mail systems - sure, there are postcode errors, but they result in a slight delay, not a catastrophe of the system.

      Well, it's not like humans are error free either. This is something people often forget. A national postal system is a very important thing, and CV is used to massively reduce the costs of being able to ship vast quantities of mail. Sure it makes mistakes, so do hand sorters. By an astonishing coincidence, I actually got a letter through my letterbox for my neighbour only yesterday.

      Structure from motion - fair enough, but it's not "accurate" and most of that kind of work isn't to do with CV as much as actual laser measurements etc.

      I'm not sure what you mean by "not accurate". It has a scale ambiguity, for sure, and drifts, but so does any relative measurement system including lasers.

      Photo stitching - I'd be hard pushed to see this as more of a toy. It's like a photoshop filter. Sure, it's useful, but we could live without it or do it manually.

      Well, of course we could live without it. Turns out that humans can survive with nothing more than a pointed stick and a bit of animal fur. This means we could survive without almost everything around us.

      Anyhow, I doubt you'd get remotely comparable results by hand. You have things like vignetting, exposure changes, radial distortion etc to contend with. It's very, very hard to get a seam-free stitch.

      Number plate recognition - well-defined formats on tuned cameras aimed at the right point, and I guarantee there are still errors. The systems I've been sold in the past claim 95% accuracy at best. Like OCR, if the number plate is read slightly wrongly, there are fallbacks before you issue a fine to someone based on the image.

      But all systems have errors. Humans are quite error prone, especially in really boring repetitive tasks. One thing I've noticed is that where humans are really really good, they're held up as a gold standard; where they're not, perfection is held up as a gold standard.

      Face detection is a joke in terms of accuracy. If we're talking about biometric logon, it's still a joke. If we're talking about working out if there's a face in-shot, still a joke. And, again, not put to serious use.

      Face detection (not face recognition) works "pretty well", I reckon. You can download an old, non-state of the art algorithm like Viola-Jones in OpenCV. It's pretty good on the whole. And anyway: define "serious". But yeah, biometrics is a joke. I never would claim otherwise.
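
      For readers unfamiliar with Viola-Jones: much of its speed comes from the integral image (summed-area table), which lets any rectangular Haar-like feature be evaluated with four lookups. The sketch below shows only that building block in plain Python, not OpenCV's detector; the function names and the toy image are hypothetical.

```python
def integral_image(image):
    """Build a summed-area table with an extra zero row and column."""
    rows, cols = len(image), len(image[0])
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += image[r][c]
            table[r + 1][c + 1] = table[r][c + 1] + row_sum
    return table


def rect_sum(table, top, left, height, width):
    """Sum of the pixels in a rectangle using four table lookups."""
    bottom, right = top + height, left + width
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left])


def two_rect_feature(table, top, left, height, width):
    """Hypothetical two-rectangle Haar-like feature: left half minus right half."""
    half = width // 2
    return (rect_sum(table, top, left, height, half)
            - rect_sum(table, top, left + half, height, half))


# Toy patch: bright on the left, dark on the right.
image = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
table = integral_image(image)
print(two_rect_feature(table, 0, 0, 2, 4))  # strongly positive for this patch
```

      A real detector evaluates many such features at many positions and scales and feeds them to a cascade of classifiers; none of that is shown here.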

      QR scanners - that I'll give you. But it's more to do with old barcode technology that we had 20 years ago, and a very well defined (and very error-correcting) format.

      No, the old tech was laser- or LED-based scanning. The current ones use computer vision to avoid those complex mechanical systems and still do a pretty good job with ubiquitous off-the-shelf sensors. Also, a generic vision-based one can read pretty much all formats from a single place.

      Pick-and-place rarely relies on vision only. There's much better ways of making sure something is aligned that don't come down to CV (and, again, usually involve actually measuring rather than just looking).

      Sure they use servos and stuff for positioning, but those little crosshair marks over the board are what they use to get the high accuracy. The problem with the cheap-ass Chinese machines for a few gr

      --
      SJW n. One who posts facts.
    3. Re:I disagree. by Sqreater · · Score: 3, Informative

      I work in the USPS as an Electronics Technician (with an engineering degree) and I'd like to point out that our OCR system is accurate, fast, and robust. Our read rate is up to 98-99% and most of our human REC centers (humans read the addresses the OCR system cannot and send the result back to the machine in real time) are now shut down. Our scanners read and our image computers interpret typed and handwritten addresses, bar codes, id tags, and indicia at up to 30,000 letters per hour per machine. And they do it while having dust and glue and ink accumulating on the quartz windows of the cameras. They do this in an electrically noisy environment and with continuous heavy vibration. Yes, they run "unsupervised" and they have replaced hundreds of thousands of USPS employees. Any problem with CV at a higher level is a back end theory and programming problem and that will just take time and effort.

      --
      E Proelio Veritas.
  7. Re:Cloud by Anonymous Coward · · Score: 5, Funny

    funnier written as:

    .

    .

    .

    .

    .

    Latency.

  8. This is *not* what Michael Jordan actually believes by Anonymous Coward · · Score: 5, Informative

    I am doing a postdoc in applied statistics/machine learning and I was very surprised by this interview since it contradicts what Michael Jordan has himself expressed as an invited speaker at conferences as well as what his most recent research projects are focused on. It appears that, according to Michael Jordan himself as expressed on his webpage, the article is a hack-job where the journalist is completely misrepresenting his view on big data. To quote:


    I’ve found myself engaged with the Media recently (...) for an interview that has been published in the IEEE Spectrum.

    That latter process was disillusioning. Well, perhaps a better way to say it is that I didn’t harbor that many illusions about science and technology journalism going in, and the process left me with even fewer.

    The interview is here: http://spectrum.ieee.org/robotics/artificial-intelligence/machinelearning-maestro-michael-jordan-on-the-delusions-of-big-data-and-other-huge-engineering-efforts

    Read the title and the first paragraph and attempt to infer what’s in the body of the interview. Now go read the interview and see what you think about the choice of title.

    The title contains the phrase “The Delusions of Big Data and Other Huge Engineering Efforts”. It took me a moment to realize that this was the title that had been placed (without my knowledge) on the interview I did a couple of weeks ago. Anyone who knows me, or who’s attended any of my recent talks knows that I don’t feel that Big Data is a delusion at all; rather, it’s a transformative topic, one that is changing academia (e.g., for the first time in my 25-year career, a topic has emerged that almost everyone in academia feels is on the critical path for their sub-discipline), and is changing society (most notably, the micro-economies made possible by learning about individual preferences and then connecting suppliers and consumers directly are transformative). But most of all, from my point of view, it’s a *major engineering and mathematical challenge*, one that will not be solved by just gluing together a few existing ideas from statistics, optimization, databases and computer systems.

    Source: https://amplab.cs.berkeley.edu/2014/10/22/big-data-hype-the-media-and-other-provocative-words-to-put-in-a-title/

  9. Re:zomg singularity! by SuricouRaven · · Score: 5, Insightful

    I think he underestimated the power of stupidity.

    You can grant every reasonably well-off person in a country a device that gives them access to all scientific and engineering knowledge and a vast communications network - and half of them will use it to publish rambling arguments that the moon landing was fake, fossils are a hoax scientists made up to disprove the bible, autism is caused by vaccines and Obama is secretly a Kenyan Muslim Communist Atheist Black-Supremacist who hates America.

  10. Read the interview by Anonymous Coward · · Score: 5, Informative

    No, seriously. Here are some choice quotes:

    "I read all the time about engineers describing their new chip designs in what seems to me to be an incredible abuse of language. They talk about the “neurons” or the “synapses” on their chips. But that can’t possibly be the case; a neuron is a living, breathing cell of unbelievable complexity."

    "It’s always been my impression that when people in computer science describe how the brain works, they are making horribly reductionist statements that you would never hear from neuroscientists."

    "Lately there seems to be an epidemic of stories about how computers have tackled the vision problem, and that computers have become just as good as people at vision."

    "Even in facial recognition, my impression is that it still only works if you’ve got pretty clean images to begin with."

    "I have a hobby of searching for information about silly Kickstarter projects, mostly to see how preposterous they are, and I end up getting served ads from the same companies for many months."

    Here's the catch: all of these quotes are from the interviewer. Jordan has a lot of really nuanced claims here, but it's clear that the interviewer has an agenda of his own.

    1. Re:Read the interview by Zalbik · · Score: 3, Interesting

      Here's the catch: all of these quotes are from the interviewer. Jordan has a lot of really nuanced claims here, but it's clear that the interviewer has an agenda of his own.

      Yes, this is one of the more shameful examples of the reporter attempting to shove words down the interviewee's mouth, and completely misrepresenting the results.

      Take a look at the first sentence:
      "The overeager adoption of big data is likely to result in catastrophes of analysis comparable to a national epidemic of collapsing bridges"

      Then read the interview. At no point does Jordan indicate that the misanalysis of big data will cause a catastrophe comparable to the epidemic of collapsing bridges. Never. What he does (and apparently the reporter is either too stupid or too dishonest to represent), is provide an analogy between building a bridge without scientific principles and not performing proper statistical analysis on big data.

      He never makes a comparison between the outcomes of these two events. He basically says: if you build a bridge without scientific principles, it will fall down. If you are not careful in your analysis of big data, your results will be wrong.

      The whole article goes on in a very similar manner. Science reporters used to have something called "journalistic integrity". Here we get a click-bait article where a "reporter" has predetermined a topic that will gain lots of hits and is desperately trying to fit the interviewee's words into his agenda.

      Shameful.

  11. Re:GREAT Interview (article really) by Beezlebub33 · · Score: 3, Interesting

    He is well known in the machine learning community. He was the editor of a popular book (now somewhat dated, 1998) called "Learning in Graphical Models". You can think of graphical models as large scale Bayesian networks, among others. The hard parts are figuring out what the network is and how to train them. Lots of scary math in there. So the guy is very smart, and has been involved deeply in the field for over 20 years.

    As someone who was involved in the previous neural network hype cycle (late 80s, early 90s), I'd have to agree with him that we go through these cycles, where a particular approach gains ascendancy, then is shown to not work as well as the hype, and then gets rejected. On the inside, however, lots of good work continues to be done. The press (and then popular opinion) keeps saying 'this is it, we're really close to AI' or something similar, and then when it doesn't pan out, then it is considered a bust. But, we are making progress, we know more than we did last year, and a lot more than 10 years ago. It is just that the problem is hard, and we're still trying to figure out some basic principles, so don't expect us to be there yet.

    --
    The more people I meet, the better I like my dog.
  12. Re:zomg singularity! by gweihir · · Score: 3

    The whole idea of "the Singularity" is nonsense. It is basically people seeking a surrogate "God" in technology, and the singularity is needed to create the "all knowing" aspect. There is however zero reason to believe it is even a remote possibility. All practical ways of connecting more hardware have had a scaling efficiency below 1 (i.e. use 2x the hardware, get less than 2x the computing power), often significantly and fundamentally so.

    The singularity is the production of a child-like fantasy that ignores any and all facts that are known. Just like the idea of a religious "God" it does touch something in many people that makes them want to believe against better judgment.
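
    One common way to put numbers on the "2x the hardware, less than 2x the computing power" point is Amdahl's law. The comment does not name this model, so treat it as an assumption brought in for illustration; the 90% parallel fraction is likewise a made-up figure.

```python
def amdahl_speedup(parallel_fraction, n_units):
    """Amdahl's law: overall speed-up when only part of the work parallelises."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_units)


def scaling_efficiency(parallel_fraction, n_units):
    """Speed-up per unit of hardware; 1.0 would mean perfect scaling."""
    return amdahl_speedup(parallel_fraction, n_units) / n_units


# Hypothetical workload in which 90% of the work can run in parallel.
for units in (2, 16, 1024):
    print(units,
          round(amdahl_speedup(0.9, units), 2),
          round(scaling_efficiency(0.9, units), 3))
```

    Under this model the speed-up can never exceed 1 / (1 - parallel_fraction), however much hardware is added, so efficiency keeps falling as the unit count grows.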
     

    --
    Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.