Slashdot Mirror


Google's Search Copying Accusation Called 'Silly'

itwbennett writes "Google's Bing sting, reported in Slashdot just days ago and subsequently denied by Microsoft, is now being called 'silly' and 'petty' by search industry analysts and execs. The reason: it would be impossible for Microsoft to use the copied results to reverse engineer Google's search algorithms. And in fact it is more likely that Microsoft was conducting competitive research. Charlene Li, founder of technology research and advisory firm Altimeter Group, saw Google's actions as a misguided response to a real threat from a competitor in its core search business. 'Google isn't used to having competition. You look at this incident and you wonder why they are doing this. It feels amateurish in a way, a kind of 'they're not playing fair' attitude,' she said."

27 of 380 comments

  1. Seriously? by Anonymous Coward · · Score: 5, Insightful

    They don't have to copy an algorithm if they are just copying search results. This response is amateur.

    1. Re:Seriously? by ILuvRamen · · Score: 5, Insightful

      I agree totally. What "research" includes looking for an already searched term on Google and then looking at what results come up...then slapping them into your own live result list for the general public? Bing's cheap algorithm is some search and crawling technology from like 2007 mixed with marketing, marketing, MARKETING! Oh, and flashy features that don't really work. So it's not that shocking that they're ripping off other people's results because their product is pretty hollow to begin with.

      --
      Google's Super Secret Search Algorithm: SELECT @search_results FROM internet WHERE @search_results = 'good'
    2. Re:Seriously? by Korin43 · · Score: 5, Insightful

      Hey, looking at the test next to mine isn't cheating. It's not like I could reverse-engineer the other student's algorithm by looking at his test!

    3. Re:Seriously? by NeutronCowboy · · Score: 4, Insightful

      No kidding. I'm used to nonsense from "industry analysts", but this takes the cake. It's a complete non-sequitur. This never was a question of reverse-engineering. It's a question of straight-up ripping off results.

      On a related note, what's with all the Google-bashing recently? First the idea (which has now turned into a meme) that Google's search results are not the gold standard for search anymore, and now the idea (probably soon to be turned into a meme) that Google can't handle competition and is resorting to FUD?

      Yes, Google is no saint, it's not perfect. No shit, Sherlock. But if all I did was read "industry analysts" and various websites, I'd think that Google was about to fall apart, what with search sucking and all other products completely falling flat on their face. There's either a general search for the same story going on (Look Ma! I broke the news of Google sucking first!), or some grade A bullshitting is taking place.

      --
      Those who can, do. Those who can't, sue.
    4. Re:Seriously? by sortius_nod · · Score: 5, Informative

      What's worse is that Microsoft is a client of Altimeter Group:

      http://www.altimetergroup.com/disclosure

      Sorry Slashdot, but maybe before pushing a story to the front page you should do a bit of research. The story was submitted by IDG (itwbennett), one of the biggest Microsoft shills on the net. This is all getting out of hand: Microsoft is in damage control and is just pushing this FUD about to ensure that faithful Bingsheep keep thinking it's "the best search provider".

    5. Re:Seriously? by Maestro4k · · Score: 3, Insightful

      They don't have to copy an algorithm if they are just copying search results. This response is amateur.

      You can certainly make the case that Google setting up the "sting" operation was "silly", or "petty", but Microsoft's response to the whole thing has been quite enlightening. I think it's Microsoft that's got issues with having a real competitor, and it shows. Google's kinda just rubbing salt into the wounds, which isn't very professional, but MS needs to respond better. Trying to deny it, and at the same time accuse Google of committing "click fraud" to set up the sting (something which has a very specific meaning that's mostly criminal and has not a damn thing to do with Google's "sting" operation) comes across as... desperate at best.

      Personally I think the whole thing is silly on both sides, but MS's response has done a lot to wipe out the little bit of trust they'd gained in past years for behaving somewhat better. MS's response, and not the whole "sting", is making me even less likely to use Bing in the future as well. Both of these are outcomes I suspect MS didn't want to cause with their reaction. In a nutshell, Google won this little fight when MS started responding with denials and attempts to make Google look like they'd done criminal stuff.

    6. Re:Seriously? by sortius_nod · · Score: 5, Informative

      Sorry to reply to myself, but I just checked out Charlene's Twitter feed.

      http://twitter.com/#!/charleneli

      Can we say Microsoft shill?

    7. Re:Seriously? by ThePromenader · · Score: 4, Insightful

      I also agree - Bing is cheating. Never mind Google, they're second-sourcing ~everyone's~ results without giving them credit.

      Every search engine has its own search methods and data-parsing algorithms (down to the lowest in-site-search php code), and it is these algorithms that provide the 'top results' that bing toolbar (and/or IE) users are clicking on. Never mind the Bing toolbar user; what if the owner/creator of a search engine doesn't want any data generated by it to be sent to Bing - where does ~he~ opt out of MS' data-sculling program?

      Bing's tactics are distasteful for many reasons, but mainly a) because they exploit (toolbar) users to scull data from competitors and b) because Bing uses this data to provide 'top results' that it obviously values above those provided by its own algorithm. This is borderline - if not outright - industrial espionage.

      --

      No, no sig. Really.

      ThePromenader
    8. Re:Seriously? by NeutronCowboy · · Score: 3, Informative

      Methinks you are the clueless one here. The important part is indeed that Bing is essentially using Google results to boost its own accuracy. It doesn't matter that it comes through a user clicking on the first result of a Google search and opting to send that action to Microsoft. It wouldn't matter if MS had a bot directly scraping results from Google or had gremlins pick through the algorithm to send results via ESP. Microsoft deliberately and knowingly incorporated Google results into its own results, but without acknowledging this fact anywhere. That is the definition of plagiarism, and ultimately, cheating.

      If that's not the ultimate admission of "We don't know what the fuck we're doing, and have resorted to copying other people's results", I don't know what is.

      --
      Those who can, do. Those who can't, sue.
    9. Re:Seriously? by rtfa-troll · · Score: 4, Interesting

      A pure marketing-led response is 100% right. The funniest thing was the attempt to claim click fraud. If we remember, click fraud is where a site owner tries to get advertising revenue by making fraudulent clicks. I don't see how Google manages to get advertising revenue from Bing. This just seems to be a case of: when you get caught, start slinging as much mud around randomly as you can and hope people don't notice.

      In case people haven't noticed: what Google has discovered means that if you have private information leaked somewhere (e.g. a password in an SQL query), Bing is now pushing that straight from your browser (where it should normally be safe) onto the web. I'm surprised nobody has managed to find a bunch of interesting secret information in Bing based on this. There must be some way to get it out. A good chance would be looking for unique keys in URLs or web pages and then feeding them into Bing.

      This just looks so obviously terribly wrong that you can see that Microsoft really doesn't have a clue about search. No wonder they have to copy.

      --
      =~ s,(.*),<sarcasm>$1</sarcasm>,g if any_point_you_wish();
    10. Re:Seriously? by rtfa-troll · · Score: 4, Interesting

      On a related note, what's with all the Google-bashing recently?

      I've seen some of it followed up on Groklaw. As usual, it seems to trace back to Microsoft astro-turfers and lobby groups of various kinds. Microsoft seems to be pushing for some anti-Google anti-trust lawsuits, probably as a pre-emptive move to make any Google anti-trust moves more difficult during the various anti-patent lawsuits.

      --
      =~ s,(.*),<sarcasm>$1</sarcasm>,g if any_point_you_wish();
    11. Re:Seriously? by uglyduckling · · Score: 3, Insightful

      No, you totally missed the point.

      User A types some words into a text box, then clicks on a link that takes it to a different domain. Toolbar B records those words, together with the destination URL of that link that was clicked, and uses those words to slightly bias the results of search engine C.

      Toolbar B does this for every website that user A visits with the express permission of user A.

      The owners of search engine D get annoyed, because when they deliberately insert completely unique [text string -> URL] mappings to their search engine, and have their engineers click on those links, it shows up in search engine C.

      Note that the only reason search engine D were aware of this in the first place is because bizarre mis-spellings of words in their search engine later turned up in search engine C.

      So, there is a logical way to connect page 3 with content X - someone at some point entered X into a text box, then clicked on a link that led to page 3. In this case it was some Google engineers. The issue here is that Google (along with a lot of Slashdot posters) are thinking of the web in a static sense: 'how could X possibly link to page 3??' - Bing stole that data. Microsoft are dynamically looking at what users do, and the text string -> URL -> click interaction is seen as a relationship between a phrase and a page that they want to take into account with their search results.
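
The mechanism described above can be sketched roughly as follows. This is only an illustration of the idea, not Microsoft's actual implementation: the class name `ClickSignalStore`, its methods, and the example URLs are all hypothetical.

```python
from collections import Counter, defaultdict


class ClickSignalStore:
    """Hypothetical store of (typed text -> clicked URL) observations."""

    def __init__(self):
        # Maps normalized text to a count of the URLs clicked after typing it.
        self._clicks = defaultdict(Counter)

    def record(self, typed_text, clicked_url):
        """Called by the (hypothetical) toolbar after a user types text and clicks a link."""
        self._clicks[typed_text.strip().lower()][clicked_url] += 1

    def candidates(self, query):
        """Return URLs seen for this text, most frequently clicked first."""
        counts = self._clicks.get(query.strip().lower(), Counter())
        return [url for url, _ in counts.most_common()]


store = ClickSignalStore()
store.record("Content X", "http://page3.example/")
store.record("content x", "http://page3.example/")
store.record("content x", "http://other.example/")
print(store.candidates("content x"))
```

Nothing in the sketch cares which site the text box was on, which is the point being made: the store only sees a phrase and the page that was clicked afterwards.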

    12. Re:Seriously? by malkavian · · Score: 3, Insightful

      The problem, of course, being copyright, and claiming work as their own.
      Google create a false entry, accessible only through their own site. This is a work that is intended only to determine whether someone is actually stealing their results (i.e. taking those results, and passing them off as MS's own).
      By all means, index non-search sites. That's what search engines are for, but you can't possibly convince me that Microsoft didn't know they were looking at Google's search results.
      That really is akin to writing a dictionary by seeing what people read, then saying "Well, lots of people read this other dictionary, so I'll just lift entries verbatim from it, and claim they are my own".
      Yes, search engine tweaking is a very fine art. It's easy to pick up the wrong signal by mistake. If MS had confessed, and said "Ooops, programming/design error in our browser, this is how it happened, and we're now going to remove all search engine sites from our allowed input", weight of opinion may have been behind them more, rather than blithely saying "It's all Google's fault we're ripping them off".
      The root of this is that they're building a dictionary by directly reading a competing dictionary. This isn't creating a diverse, resilient ecosystem. It's parasitism.
      Everyone screws up, and things always go wrong. That's a fact of life. What isn't a fact is that strange need to point fingers and say "It's everyone else's fault but mine". Especially when it blatantly is your fault.

    13. Re:Seriously? by horza · · Score: 4, Insightful

      With tweets like "At Microsoft Productivity Council mtg on future of Office" and worse "Ribbon Hero which teaches how to use MSFT Office better. Making work (gasp!) fun", Charlene Li is obviously blatantly dishonest in her representation of her position.

      "Charlene Li, founder of technology research and advisory firm Altimeter Group" - and as sortius_nod says, now paid shill.

      Phillip.

    14. Re:Seriously? by nstlgc · · Score: 4, Insightful

      First of all, I don't think Google has a copyright on any of the content they index, do they?

      Second, as I understand it, clickstream data points are only a small part of the equation. Notice how Google could only reproduce this by using totally bogus keywords, ensuring that the data they fed to Bing through the toolbar were the only data points being considered for those keywords?

      Bing tracks when users search for something, and what sites they visit as a result. I'd almost be offended intellectually if this was not part of their game to provide me with better search results.

      Disclaimer: I use Google almost exclusively. Bing can suck it, but this debate is ridiculously biased.

      --
      I'm Rocco. I'm the +5 Funny man.
    15. Re:Seriously? by Killall+-9+Bash · · Score: 3, Informative

      The keyword didn't appear in the page at all - they manually associated it to the page through the Google database, so there was no way for Bing to know the keyword unless it was spying^W recording the user searches on Google through the Bing Toolbar.

      Like it says they do in their EULA? Like the same thing the Google toolbar does?

      The original Google press release tries to spin this as if MS is stealing info from Google. The reality is all they are doing with the Bing bar is monitoring search clickthrough. Google is evil, has been since shortly after IPO, and one day the fanbois will notice, and will jump ship to whatever the next new thing is.

      --
      "Prediction: within 10 years, Windows will be a Linux distribution." Me, 7-6-2016
    16. Re:Seriously? by benjymouse · · Score: 3, Informative

      The problem, of course, being copyright, and claiming work as their own.
      Google create a false entry, accessible only through their own site.

      Bzzzzzt. Wrong. They created a public "honeypot" page available to everyone. Then they created a bogus search term and manipulated their own system to list the honeypot page for that search term. *Then* they volunteered into Bing toolbar and Suggested Sites, searched for the term and clicked the link.

      Bing toolbar - doing what "toolbars" do - reported back the clickstream. The search term appears readily available in the URL of the first page, and the user quickly clicks on a link on that page. Bing's feedback analyzer creates a (very weak) relation between search terms from the URL of page 1 and page 2. What Google did was game this system so that there were no other signals. Consequently it received relatively more weight. But it is not like Bing crawled Google or anything like that, which Google would like everyone to believe.
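
As a rough sketch of that step, assuming the search term sits in a query-string parameter of the first page's URL: the function name `relation_from_clickstream`, the parameter name `q`, and the example URLs and term are assumptions for illustration, not anything taken from Bing's code.

```python
from urllib.parse import parse_qs, urlparse


def relation_from_clickstream(page1_url, page2_url, query_param="q"):
    """Pair the search term found in page 1's URL with the page clicked next.

    Returns (term, page2_url), or None if page 1's URL carries no such parameter.
    """
    params = parse_qs(urlparse(page1_url).query)
    terms = params.get(query_param)
    if not terms:
        return None
    return (terms[0], page2_url)


# Hypothetical clickstream: a results page followed by a click on one result.
pair = relation_from_clickstream(
    "http://search.example/search?q=made+up+term",
    "http://honeypot.example/page",
)
print(pair)
```

A page whose URL has no search parameter yields `None`, so ordinary navigation produces no term-to-page relation in this sketch.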

      --
      Reading slashdot one-liner: (irm http://rss.slashdot.org/Slashdot/slashdot).rdf.item | fl title,desc*
  2. "Competitive Research" by Anonymous Coward · · Score: 5, Insightful

    I don't see this phrase going down well in any other industry. If you copy a map or a book or the design for a car from a different company in the same field, you wouldn't get out of it by calling it "competitive research". Microsoft doesn't need to reverse engineer google's algorithm if they can just steal their results directly; in fact, it's simpler this way because it cuts out the middle part where they even bother to figure out how it works.

    1. Re:"Competitive Research" by grantek · · Score: 4, Insightful

      It's not "research" if the leeched data appears on your production site automatically and without review...

  3. It worked, though. by Animats · · Score: 5, Interesting

    It worked, though. It diverted attention from Microsoft's accusation that Google profits from search spam.

  4. allow me to cherry pick quotes. by Nyall · · Score: 3, Informative

    I read the article and it just seemed like a bunch of collated sound bites with all the intelligence of a 14 year old who thinks she wins arguments by being the first to call the other a hater.

    --
    http://en.wikipedia.org/wiki/Jury_nullification
  5. Microsoft is responding with misdirection by 93+Escort+Wagon · · Score: 4, Interesting

    They seem to be dancing around the core charge of copying what were nonsensical search results that, if not copied from Google, should not have returned any results. They also seem to be attempting to misdirect in talking about "copying Google's algorithm", when I believe the charge is specifically about copying search results.

    I did note that the "Altimeter Group" has only been around a couple years - and has a website that is full of vague social media-related buzzwords without indicating what, exactly, is their actual skillset (if anything).

    --
    #DeleteChrome
    1. Re:Microsoft is responding with misdirection by martin-boundary · · Score: 3, Insightful

      They seem to be dancing around the core charge of copying what were nonsensical search results that, if not copied from Google, should not have returned any results.

      Uhm, how would that work, exactly?

      Let's say you have a search engine toolbar that looks over a user's shoulder to see what webpages they go to. Presumably, the links that leave those web pages carry information on said user's interests (eg if the user reads slashdot, then the links point to things like other people's comments, and also the site which carries TFA, etc). So the text of that page and the links would be automatically connected by the search engine.

      Now if a user goes on a webpage that happens to be a google results webpage, then the links on that webpage will be search results. If one user types in a weird query, then the toolbar will think that user likes those kinds of weird queries, and maybe that other people would like those, too.

      So when another user now types exactly the same query to prove the "sting", then the search engine will think it has found another user who likes weird queries, no? So it should show the connections it has learned from the previous webpage.

  6. Clearly an unbiased voice in this discussion by Cyberllama · · Score: 5, Informative

    Hey it's not like Microsoft is a client of the "Altimeter Group" and Google is not.

    http://www.altimetergroup.com/disclosure

    Oh? It's exactly like that?

    Look. Nobody thinks that Microsoft is "trying to reverse engineer their algorithm" from search results, but what they are apparently doing is harvesting user data from clicks. It appears that when a user searches for something, and clicks a link as a result of that search, the search term and site that the user found relevant is collected and used in their own search algorithm -- so they are, to some degree, piggybacking on Google here.

    On the one hand, it's good to know what link your user found relevant -- that's important data for your own search engine to have. On the other hand, that's really the sort of thing you should be gathering from your own damn search engine. I'm sure that by now, enough people are using Bing that they can get this data on their own. The only thing getting it through the browser instead of through Bing allows them to do is gather it from Google users as well, which is essentially allowing them to tune their own algorithm on the back of Google's.

    It's shady to say the least. Perhaps it was created with good intent -- as a discovery tool for when users are on websites with internal search engines, but it's obviously pulling in a lot more than that. If Microsoft continues to abuse that, they deserve any bad publicity they get as a result.

  7. Not a case of Pot Calling the Kettle Black by reiisi · · Score: 3, Interesting

    I think you're confused on the point of "attack".

    For example, I can post a link to this page. Google can now see the page. Of course, it could get to that page from within shopper.cnet.com, anyway, but the robots.txt file or NOINDEX/NOFOLLOW tags may be warning it off. (So Google has to walk the URL back up to http://shopper.cnet.com/robots.txt, to make sure, and it may not see http://www.shopper.com/robots.txt, by the way.)

    More to the point, I can post a link to this page of a search result on shopper.com. Then Google can see that search. And, in an hour or two, it might show up in a google search of "wall wart servers", which would be useless, but anyway.

    I can post a link to this query, however, and, not only might Google's spider collect it (from here), but it might not even have to get it from here. I'm probably not the first person to search shopper.com for "Small office home office server".

    I can't see there being an ethical issue here, because those links feed people to shopper.com. In fact, cnet likely has some agreements with Google on that. And many such search sites (well, smaller ones) deliberately use Google's search engines to save themselves a bit of infrastructure cost.

    Google, on the other hand, may prefer not to put some of those small search sites results on their general search pages, but that's a side issue.

    Now, how do you suppose that bing picks up a query like, "m4-7734-6al 63363r"? Unless someone posts that (like I just did), how does bing get that query just from my using it in a Google search a few minutes ago?

    To say this is a case of the pot calling the kettle black, you'd be accusing google of planting code in Chrome that watches for bing search results and feeds them back to google's search engine optimizer on the sly. (A new way for a browser to call home!) And/or of making deals with the Mozilla team. But the evidence you mention doesn't really support that, as someone else points out.

    --
    Computer memory is just fancy paper, CPUs just fancy pens with fancy erasers; the 'net is just a fancy backyard fence.
  8. I think this article says everything... by Missing.Matter · · Score: 5, Informative

    I think this article says everything that needs to be said on the issue:

    http://searchengineland.com/bing-why-googles-wrong-in-its-accusations-63279

    Essentially Bing's defense (as outlined in the article) goes like this:

    • Bing is monitoring users who opted in to send Bing data. They are watching their activity on any site, and not specifically Google.
    • The search signal generated by users does not dominate. When it's the only signal (as Google tried to ensure it would be), it will have more weight, but not absolute weight. Even Google's test showed this to be true, as only a fraction of their honeypot terms made it to the other side.
    • For less frequent search terms (the example given is pontneddfechan), Bing's results are relevant, unique, and ordered differently from Google's. Google's tests reveal the very special case where 0 signal comes from other sources.
    • What's the BFD in the end? Google alleges Bing is stealing results, but only shows one concrete example of this (tarsorrhaphy), which can be easily accounted for by crawling Wikipedia, which seems much more likely.
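
The weighting argument in the list above can be illustrated with a toy ranking. This is a minimal sketch with made-up numbers: the signal names, the values in `WEIGHTS`, and the example URLs are all assumptions, and real ranking functions are far more involved.

```python
# Hypothetical weights: the click signal is deliberately weak next to crawl-based relevance.
WEIGHTS = {"crawl": 1.0, "click": 0.1}


def score(signals):
    """Weighted sum of whatever signals exist for one URL."""
    return sum(WEIGHTS[name] * value for name, value in signals.items())


def rank(candidates):
    """Order candidate URLs by score, best first."""
    return sorted(candidates, key=lambda url: score(candidates[url]), reverse=True)


# Ordinary query: crawl-based signals exist, so a click signal does not decide the order.
common_query = {
    "http://a.example/": {"crawl": 0.9},
    "http://b.example/": {"crawl": 0.5, "click": 1.0},
}

# Honeypot query: the click signal is the only signal there is.
honeypot_query = {
    "http://honeypot.example/": {"click": 1.0},
}

print(rank(common_query))
print(rank(honeypot_query))
```

With these numbers the clicked page still loses the ordinary query, yet it is the only candidate for the honeypot query, which matches the "very special case" the defense describes.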
    1. Re:I think this article says everything... by hellop2 · · Score: 3, Insightful

      Are you serious? Clickstream, Surfstream, Searchstream. Fancy words for a keylogger. From your link:

      “We’re not copying but watching users,” Shum said.
      Weitz added, “The word ‘copy’ has a very specific connotation, and it’s wrong. We get the clickstream. We’re going to see it. We may choose to show it or not.”

      It doesn't matter if you call it watching instead of copying. It's still copying. Bing shouldn't be "watching" google's results, or "copying" the user's click behavior. That's like google's trade secrets. An analogy would be an online newspaper who copies articles verbatim from a competitor, and then justifying it by saying, "We didn't copy the article, we just monitored the user's eyestream and discovered this article. But it's ok because we copy everybody's articles."

      From your link: "Bing can also examine how people click on its own results that it lists in response to that search." No shit? It's like it's listed as an afterthought. Of course Bing should be paying attention to their own clicks... and not scraping their competitor's data. But instead, they're trying to justify it using PR words, and creating a convoluted argument that they are merely, "showing the surfstream" rather than "creating a reproduction of an original work", i.e. copying.

      It would be like a dating site copying a fake profile from a competitor. "We didn't copy that profile, we're just showing the datestream."

      --
      How many more years will slashdot have an off-by-one error on your Score in your profile?