Slashdot Mirror


Google Patents Search Algorithm

blastedtokyo writes "Google gets the first web search patent. According to this News.com.com article, Google was able to patent how they crawl and rank web pages. They claim "an improved search engine that refines a document's relevance score based on interconnectivity of the document within a set of relevant documents.""

39 of 362 comments

  1. OMG MORE PATENTS!!! by govtcheez · · Score: 4, Insightful

    Let's start screaming about how evil patents are and... oh wait, it's Google (and /. loves Google), so we'll get "Thank God they're this innovative and patented it before someone else stole it."

    1. Re:OMG MORE PATENTS!!! by Anonymous Coward · · Score: 4, Interesting

      Does this mean that with their algorithm now publicly available, we're going to find more "googlebuster" sites finding ways to improve their rankings?

    2. Re:OMG MORE PATENTS!!! by Bonker · · Score: 5, Insightful

      While I personally think that patents are repugnant, Google has fallen down on the 'just' side of using the patent laws the way they were intended to be used. They're not trying to bilk people out of vast sums of money à la British Telecom's hyperlink patent or Amazon's 1-click buy patent. They have a unique process that they've carefully guarded and have built a business around.

      Now that they've been awarded a patent for page-rank, it's required for them to make it public so that people can license it. You can't patent a trade secret and still have it be secret. People now have the opportunity to build new methods and innovate with Pagerank as a basis for that innovation. (Real innovation, not MS innovation.)

      Again, I think that patents are a misstep. I think they allow too many Amazon and BT events to happen. Despite the fact that the patent system is horribly broken, Google is using patent laws responsibly here. Wait until they announce a patent on 'all search technology that lists search results on a web page' or something like that. *Then* you can start complaining about how broken the patent system is.

      --
      The next Slashdot story will be ready soon, but subscribers can beat the rush and slashdot the links early!
    3. Re:OMG MORE PATENTS!!! by cascadefx · · Score: 4, Insightful
      Software patents are a bad idea.

      [Rant On]
      Really.

      What G**gle is doing is basically quantifying word of mouth. Everyone knows the best restaurant in town is the one that everyone talks about. We "link" to the restaurant of our choice. Someone new to the office, and town, is looking to go to the best restaurant there is. They ask around. 5 people say it is the Puerto Vallarta Mexican restaurant, 2 say it is White River Landing, and 8 others say it is Vince's. Vince's it is.

      Simple word-of-mouth ranking right there. If the new person throws in variables like, "I don't eat Mexican." That skews the results. Nothing fancy about this.

      In fact, I bet a few hours of research into Sociology, Psychology, and Linguistics papers will turn up generic proofs and observations of the very same things that page rank takes care of in a different context. A context shift shouldn't be patentable. Much software (but not all) involves making these logical leaps. Many times they are leaps from pure science that is copyrighted (on the one hand) but (increasingly less so) open on the other. This is human knowledge we are dealing with. The Scientific Method... all that crap. It doesn't work unless everyone shares their toys. Start locking them up and you stifle innovation (at the least) or become dictatorial master of (increasingly more of) everyone's lives.
      [/Rant Off]
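
      The restaurant poll in the rant above can be written down in a few lines. This is only a sketch of plain vote counting using the numbers from the comment; the function name `rank_by_votes` and the `excluded` parameter (standing in for "I don't eat Mexican") are made up for illustration and have nothing to do with how Google actually ranks pages.

```python
from collections import Counter

# Votes taken from the comment: each coworker "links" to one restaurant.
votes = (
    ["Puerto Vallarta"] * 5
    + ["White River Landing"] * 2
    + ["Vince's"] * 8
)

def rank_by_votes(votes, excluded=()):
    """Rank restaurants by raw vote count, skipping anything in `excluded`."""
    tally = Counter(v for v in votes if v not in excluded)
    return [name for name, _ in tally.most_common()]

ranking = rank_by_votes(votes)
# The newcomer who does not eat Mexican filters the candidates first.
no_mexican = rank_by_votes(votes, excluded={"Puerto Vallarta"})
```

      Every vote counts the same here, which is exactly the simplification the replies further down pick apart.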

    4. Re:OMG MORE PATENTS!!! by ergo98 · · Score: 5, Insightful

      Now that they've been awarded a patent for page-rank, it's required for them to make it public so that people can license it

      I had made this mistake before, confusing trade groups with patents. AFAIK patents do not force you to license it whatsoever. Instead they can be used to hunt down anyone who intrudes into your patent and sue them out of existence.

      In any case this isn't about PageRank, but is about a revised search technique: In a nutshell it is PageRank by resultset -> i.e. Say you searched for "Scooby Doo" : It gets the result set of Scooby Doo hits, and THEN it derives a pagerank within that set of Scooby Doo hits (versus the basic PageRank which derives the ranking for the whole net). It's funny because I had actually investigated the initial steps of a patent several years ago for something which I called a "combined corpus" (which in a similar light groups items by topic of discussion-i.e. a page on Crickets would get a good score for cricket searches by being referenced by lots of Cricket pages : It wouldn't benefit them to put a nude picture of Britney Spears to get a lot of links and boost their generic pagerank) because of the general ridiculousness of something like the basic PageRank, but I knew that against a giant machine like Google I wouldn't stand a chance so I just forgot about it (which is the problem with patents: How many people think of a great idea but then let it rot because of the almost certain patent overlaps). I've had that same thought process with a lot of, in my opinion, great ideas.
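
      A rough sketch of the "rank within the result set" idea described above, assuming the link graph is a dict of page -> list of linked pages. It uses a plain in-set inlink count rather than a full PageRank pass, and all page names are hypothetical; it is not the method claimed in the patent, just an illustration of ignoring links that come from outside the hits.

```python
def rerank_within_results(links, hits):
    """Order `hits` by how many *other hits* link to each one."""
    hits = set(hits)
    inlinks = {page: 0 for page in hits}
    for source, targets in links.items():
        if source not in hits:
            continue  # links from pages outside the result set are ignored
        for target in set(targets):
            if target in hits and target != source:
                inlinks[target] += 1
    # Most in-set inlinks first; ties broken alphabetically for stable output.
    return sorted(hits, key=lambda p: (-inlinks[p], p))

links = {
    "scooby-fans": ["scooby-wiki", "scooby-episodes"],
    "scooby-episodes": ["scooby-wiki"],
    "scooby-wiki": [],
    "celebrity-gossip": ["scooby-fans"],  # popular but off-topic
    "link-farm": ["scooby-fans"],         # off-topic as well
}
hits = ["scooby-fans", "scooby-wiki", "scooby-episodes"]
order = rerank_within_results(links, hits)
```

      In this toy graph "scooby-fans" has the most inlinks overall, but none of them come from other Scooby Doo pages, so it drops to the bottom of the reranked list.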

      People now have the opportunity to build new methods and innovate with Pagerank as a basis for that innovation. (Real innovation, not MS innovation.)

      Ooooh, nice use of the obligatory MS slam for mod points (ignore the fact that MS has been a fantastic patent citizen and has never, to my knowledge, enforced dubious patents). In any case how is it "innovation" for others to now use something existing (if Google allowed it)? Sounds like counter-innovation because everyone who might possibly overlap with this patent will now just dump the project lest they cross paths with Google.

    5. Re:OMG MORE PATENTS!!! by Com2Kid · · Score: 5, Insightful
      • What G**gle is doing is basically quantifying word of mouth.


      Which is a pretty impressive process. Making a set of mathematical formulas out of an otherwise very much fluid and ethereal concept. Not half bad.

      • Everyone knows the best restaurant in town is the one that everyone talks about.


      Oh? I think it is the one that everybody with a good sense of taste talks about? What is a good sense of taste? Welllll, now we are getting down to the nitty gritty. What defines a "trustable" website?

      • We "link" to the restaurant of our choice. Someone new to the office, and town, is looking to go to the best restaurant there is. They ask around. 5 people say it is the Puerto Vallarta Mexican restaurant 2 say it is White River Landing [muncienews.com], and 8 others say it is Vince's [muncienews.com]. Vince's it is.


      You must define the weight: if persons A and B say Vince's, but they both say that person C has "better taste" in Mexican food, and person C says Puerto Vallarta, and enough of that goes on, then the decision based upon the results can change significantly.

      • Simple word-of-mouth ranking right there.


      Yes, sounds like a good alpha-level project. :)

      • If the new person throws in variables like, "I don't eat Mexican." That skews the results. Nothing fancy about this.


      True, but what if restaurant X has a style of Mexican that is mixed with, say, Soul Food, and the person REALLY loves Soul Food. Then what? Life gets confusing. :)


      • In fact, I bet a few hours of research into Sociology, Psychology, and Linuquistics papers will turn up generic proofs and observations of the very same things that page rank takes care of in a different context. A context shift shouldn't be patentable.


      Oh? If it is so obvious, why did search engines for so long, well, heh, suck. I remember using insanely complicated regexps with those "other" search engines to do what are now trivial searches on Google.
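
      The weighting point made earlier in this comment (A and B pick Vince's but both defer to C's taste) can be sketched as a one-level weighted vote. The scheme here, one point per voter plus one extra point for every person who endorses them, is an assumption chosen purely for illustration; real link analysis propagates weight through many levels.

```python
from collections import defaultdict

# Who recommends what, and whose taste each person says they trust.
choice = {"a": "Vince's", "b": "Vince's", "c": "Puerto Vallarta"}
endorses = {"a": ["c"], "b": ["c"], "c": []}

def weighted_ranking(choice, endorses):
    """Score restaurants by voter weight instead of a raw head count."""
    weight = {person: 1.0 for person in choice}
    for person, trusted in endorses.items():
        for other in trusted:
            weight[other] += 1.0  # each endorsement adds one point
    score = defaultdict(float)
    for person, restaurant in choice.items():
        score[restaurant] += weight[person]
    return sorted(score.items(), key=lambda kv: kv[1], reverse=True)

result = weighted_ranking(choice, endorses)
```

      Two of three people chose Vince's, yet Puerto Vallarta comes out ahead because the one person recommending it carries the weight of both endorsements.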

    6. Re:OMG MORE PATENTS!!! by Thavius · · Score: 5, Insightful

      There's two ways patents can be used: as a sword, and as a shield.

      IBM holds many interesting patents. One that caused a former employer of mine to take notice is one that covered anything that used templates to generate HTML files. This patent basically covers almost all WYSIWYG HTML creation tools (we were in the middle of creating one when it was issued). I haven't seen any breaking stories on how IBM is beating down small companies with it, and our company didn't get served a C&D order because of it.

      It appears that IBM is using the patent as a shield, to protect themselves against another company saying, "I invented that, give me money." It will protect them from being the target of an infringement suit.

      Other companies, such as BT, and Amazon, and others, are using their patents as a sword to extort money out of companies. This is what I disagree with, because most often they target small companies first. They never seem to go after companies with resources, because they know their sword is not as sharp or strong as it could be.

      I'm not against patents as an idea, but patents on some tech innovations have been abused. Take the side-swinging patent: that guy will never try to enforce his patent, because it was for fun. But just like anything else, patents can be abused to the detriment of everyone.

      Google's patent can be used in two ways. Let's see how they use it.

    7. Re:OMG MORE PATENTS!!! by iocat · · Score: 4, Informative
      Now that they've been awarded a patent for page-rank, it's required for them to make it public so that people can license it. You can't patent a trade secret and still have it be secret. People now have the opportunity to build new methods and innovate with Pagerank as a basis for that innovation. (Real innovation, not MS innovation.)
      Actually, they are required to disclose it, but not to license it. The patent gives them a 17 year legal monopoly to do whatever they want with it (use it, license it, bury it, etc.). As an example, Capri Sun never licensed their patented "juice bag" technology, forcing others to use inferior "drink boxes" to deliver product. Now that the patent is expired, other "drink bags" are on the market.

      More worrying is that software patents are sometimes granted using such general language that the entity getting the patent *doesn't* really have to disclose anything, enabling them to get protection while keeping their invention secret, which is exactly the opposite effect of what patents were intended for -- to get duplicable knowledge into the public domain after a period of protection for the original inventor.

      --

      Dude, I think I can see my house from here.

  2. Mis-title by Amsterdam Vallon · · Score: 4, Informative

    It's not really their Search algorithm, it's their method of comprehensive PageRanking.

    They basically measure Web pages as either 1) portals, or 2) authorities.

    Sites like Kuro5hin and *nix have a lot of "Google juice" (i.e. weight in their ranking system) because they have so many links to other sites, while also garnering a slew of links to their main page.

    --

    Reply or e-mail; don't vaguely moderate. Ex-O'Reilly/MIT employee, now a full-time Google employee.
    1. Re:Mis-title by MilTan · · Score: 5, Informative

      PageRank doesn't actually distinguish between "portals" and "authorities." It "only" does a link-analysis of the web by essentially multiplying some ranking vector by a matrix representing the links in the web, with a random jump to another location taking place with a certain probability to create a new ranking vector. Once this converges, you have the new "PageRank."
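
      The iteration described above (push the ranking vector through the link matrix, mix in a random jump, repeat until it settles) looks roughly like this. It is a minimal sketch, not Google's implementation: the damping value of 0.85, the dict-of-lists graph format, and spreading the rank of pages with no outlinks evenly are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Power iteration over a dict of page -> list of pages it links to."""
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(max_iter):
        # Rank held by pages without outlinks is spread over every page.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        # Random-jump share that every page receives regardless of links.
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {p: base for p in pages}
        for source in pages:
            targets = links.get(source, [])
            if targets:
                share = damping * rank[source] / len(targets)
                for t in targets:
                    new_rank[t] += share
        change = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if change < tol:  # the vector has converged
            break
    return rank

example = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(example)
```

      Note that the query never appears anywhere in the function: the scores depend only on the link structure.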

      PageRank scores are calculated completely independently of the search query. You are probably thinking of Kleinberg's HITS (or Hubs and Authorities) algorithm which uses an initial search query to prune the search space, and then identifies hubs and authorities in the web. In contrast to PageRank, which only uses forward links to calculate its rankings, HITS uses both forward and "backward" links to figure out its ratings. Furthermore, unlike PageRank, HITS produces different scores for different queries.
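
      For comparison, a bare-bones sketch of the hubs-and-authorities update. It assumes `links` is already the query-pruned subgraph mentioned above (the pruning step itself is not shown), and the fixed iteration count and page names are arbitrary choices for the example.

```python
import math

def hits(links, iterations=50):
    """Return (hub, authority) scores for a dict of page -> linked pages."""
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority: sum of hub scores of the pages linking in ("backward" links).
        auth = {p: 0.0 for p in pages}
        for source, targets in links.items():
            for t in targets:
                auth[t] += hub[source]
        # Hub: sum of authority scores of the pages linked to (forward links).
        hub = {p: sum(auth[t] for t in links.get(p, [])) for p in pages}
        # Normalise both vectors so the scores stay bounded.
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for p in scores:
                scores[p] /= norm
    return hub, auth

subgraph = {"portal": ["site1", "site2", "site3"], "blog": ["site1"], "site1": []}
hub, auth = hits(subgraph)
best_hub = max(hub, key=hub.get)
best_authority = max(auth, key=auth.get)
```

      Because the input graph changes with every query, so do the scores, which is the contrast with PageRank drawn above.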

      The above tells us the following: That Kuro5hin and Slashdot have high pageranks not because of their excessive numbers of outlinks, but because many people point to their frontpages. Similarly, these high PageRanks mean that people that Slashdot or Kuro5hin point to get higher scores as well.

    2. Re:Mis-title by johnnyb · · Score: 5, Interesting

      What I found particularly cool about their algorithm was that they can return results for pages that google has not read. If there is a link to a page google has not read on a page that google is reading, it can still return results to the unread page based on the context of the link, and the popularity of the link on other pages. Really nifty stuff.
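
      One way to picture this: index the words of each link's text under the URL the link points to, so a page can match a query before it has ever been fetched. The sketch below is an assumption about the general idea only; the data layout, function names, and example.org URLs are all hypothetical.

```python
from collections import defaultdict

def build_anchor_index(crawled_pages):
    """Map each anchor-text word to the target URLs it was used for."""
    index = defaultdict(lambda: defaultdict(int))
    for page in crawled_pages:
        for anchor_text, target_url in page["links"]:
            for word in anchor_text.lower().split():
                index[word][target_url] += 1
    return index

def search(index, word):
    """Return target URLs for `word`, most frequently linked first."""
    targets = index.get(word.lower(), {})
    return sorted(targets, key=lambda url: (-targets[url], url))

# Two crawled pages; http://example.org/unread itself was never fetched.
crawled = [
    {"url": "http://example.org/a",
     "links": [("Scooby Doo episode guide", "http://example.org/unread")]},
    {"url": "http://example.org/b",
     "links": [("episode guide", "http://example.org/unread"),
               ("cricket scores", "http://example.org/cricket")]},
]
index = build_anchor_index(crawled)
guide_hits = search(index, "guide")
```

      The unread page turns up for "guide" purely on the strength of what other pages say about it.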

  3. Good for them... by DCowern · · Score: 5, Interesting

    They thought of a way to improve upon an existing invention. They were the first to do it. They want to make money from their idea. It's only logical for them to seek a patent. I guess congratulations are in order!

    1. Re:Good for them... by Ed Avis · · Score: 5, Insightful

      It's not a question of whether Google is 'good' or 'evil'. It's a question of whether the patent office was right to grant this patent, and whether a patent system that includes software is of greater economic benefit to society than one that does not.

      You can ask: if patents on computer programs were not available, would Google have developed their idea anyway?

      --
      -- Ed Avis ed@membled.com
    2. Re:Good for them... by DCowern · · Score: 4, Insightful

      I'm sorry I came off trollish but I just don't see why every patent is seen as evil on Slashdot. I agree wholeheartedly that the patent system has gotten out of control. I just don't agree that every patent is evil. In a lot of cases, businesses need patents to exist. For example, what would happen if Microsoft figured out how to implement Google's page rank system and implemented it on MSN? Google would have no recourse and Microsoft has approximately 80 bajillion times the resources of Google and could easily out market them.

      And by the way... the difference between patents and RFCs is that with RFCs, there's no expectations of profit. They're made in cases where, as a previous poster pointed out, the greater societal benefit outweighs potential profits. Many RFCs and IEEE standards are based on corporate IP anyway, especially ones dealing with network protocols. Token Ring, FDDI, and Ethernet were all proprietary standards back in the day...

    3. Re:Good for them... by PygmyTrojan · · Score: 4, Interesting
      Mod it down or reply, that is the question...

      Hopefully you've read what you wrote by now and realized how stupid of a comment that was. Gravity was a discovery because it already existed. Algorithms are invented. I don't know about you, but I haven't seen any algorithms on the side of the road when I drive to work.

      Besides, most inventions were found by accident, that doesn't make them discoveries.

      --

      Trying is the first step towards failure.

    4. Re:Good for them... by Lumpy · · Score: 4, Insightful

      ...because they're Google. But if it were Microsoft patenting "an improved method for giving help to users", say maybe the help files vs. man pages, people would flame about prior art, talk endlessly out of their anuses about how Bill Gates is trying to wrest control of the tinfoil hat co-op from Mac users, and generally be nuisances.


      so you got any prior art on google? They patented something that IS ACTUALLY INNOVATIVE and took time to patent.. other than patenting an icon that says "start" or patents on clicking on an object to cause an action.

      most patents are completely absurd and need to have an angry mob impale the patent-er and the clerk at the patent office that approved it on a large spike somewhere very public as a lesson to others.

      google's patent is actually something that is worthy of a patent (like the Wankel engine, or using an exploding device to inflate a teflon/cloth bag to lessen the impact on the people in a vehicle during a crash)... you know, something innovative. It's hard to recognize today in the sea of non-innovatives.

      --
      Do not look at laser with remaining good eye.
    5. Re:Good for them... by Qzukk · · Score: 5, Interesting

      The problem with the patent system today (in my view) isn't whether or not software should be patented, it is the complete crap that comes out of the PTO. They have no incentive to do things right, no disincentive against doing things wrong, so they just grant patents on crap.

      What needs to be done is first: fix the backlog so that it no longer takes years to grant a patent. Open new patent courts, hire more people, charge more for the patents, whatever it takes.

      Second: The USPTO needs to be responsible for its output. If a patent is overturned as invalid, court fees and the original cost of filing the patent should come out of its pocket.

      Third: Mandatory implementation. For a patent to be valid it must be implemented by someone, somewhere. If the holder of the patent is unable to implement the patent, it must accept at least one request to license the patent for implementation under generally accepted RAND practices. This prevents people from patenting something with no intention of ever producing that invention, for the purpose of preventing that invention from being produced.

      Fourth: establish a class of patents specifically for software. Things not in this class of patents cannot be used to claim software infringes. Turnaround must be quick, and part of the patent process should be "does someone else sell this product already" which should be relatively easy, compared to looking through millions of patents.

      Software patents will expire after two years, renewable once (at the price of the original patent) for another year. This better matches the reality of software development. It provides people a head start without granting them an essentially perpetual monopoly.

      --
      If I have been able to see further than others, it is because I bought a pair of binoculars.
  4. Patents creating artificial monopolies by taumeson · · Score: 5, Insightful

    Patents are a tool for creating temporary, artificial monopolies.

    With that said, aren't you glad Google might be able to stay on top and profitable, instead of having to resort to banner ad revenue, etc?

  5. Oh Please - Eugene Garfield did this in 1961 by tiltowait · · Score: 5, Informative

    Google didn't invent the concept behind PageRank, just its name. See my E2 writeup on citation analysis for more.

    1. Re:Oh Please - Eugene Garfield did this in 1961 by zmahk31 · · Score: 5, Informative

      In fact, the algorithm as a computational method goes back to Jacobi (1804-1851), and is essentially an iterative solver for large systems of linear equations.

      Of course, it's still a significant contribution to see the application of the Jacobi method to ranking web pages, and I assume that they have done some clever and many more dirty tricks to get more realistic results, weed out duplicate pages, etc., which may or may not be part of the patent.

      In any case, the basic page rank algorithm is quite intuitive to anyone who has worked with iterative numerical methods, and in fact a very nice illustration of the power of such methods.
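
      For anyone who has not met it, here is the Jacobi iteration on a tiny linear system. It is a generic textbook sketch in plain Python with a made-up 2x2 example, and it assumes the matrix is diagonally dominant so the iteration converges; it is not a claim about what the patent covers.

```python
def jacobi(A, b, iterations=100):
    """Approximately solve A x = b by repeated Jacobi sweeps."""
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        # Every component is updated from the previous sweep's values only.
        x = [
            (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)
        ]
    return x

# 4x + y = 9 and 2x + 5y = 12, whose exact solution is x = 11/6, y = 5/3.
A = [[4.0, 1.0], [2.0, 5.0]]
b = [9.0, 12.0]
x = jacobi(A, b)
```

      The link to ranking: if the system is written as (I - d*M) r = c, where M is the link matrix with no self-links and d is the damping factor, the diagonal is all ones and each Jacobi sweep reduces to r = c + d*M*r, the same shape as the basic page rank update.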

  6. hmmm.... by Kevin Stevens · · Score: 4, Insightful

    I am not quite sure of the purpose of this article since most patent articles are intended to point out the ridiculousness of the patent system, but this seems like a pretty legit patent to me. They developed a technology that is superior to their peers, that they developed completely in house w/out ripping anyone off. This passes my shadiness test. If anything, we should all be happy now that Google will be publishing some of the details for their system.

  7. Not that outrageous by Nikk Name · · Score: 5, Insightful
    Compared to other "patent the fork" icon'ed items, this one is not that outrageous.

    Google's way of doing things was certainly not the first way to search, it is not the most obvious way to search, it is not the only way to search, and it might not be the best way to search (something better likely will come along). In other words, I don't think this patent will harass many others at all.

    This is nothing near as bad as Amazon patenting message boards attached to sale items, or "one-click shopping" being patented.

    1. Re:Not that outrageous by tbmaddux · · Score: 4, Funny
      Compared to other "patent the fork" icon'ed items...
      Holy --! I have been reading Slashdot for, I dunno, a couple years now and I never noticed the meaning of that icon before.

      Seriously.

      I must be almost as dumb as the USPTO.

      --
      Can't you see that everyone is buying station wagons?
  8. Hmm.. by Astin · · Score: 5, Insightful

    Wow.. an internet patent that might actually make sense. It's not "A method to search through an index of web pages for relevant links to a user request for specific information." But the improvement on it. And it's generally accepted that Google DID improve web searching tremendously and have a unique method of doing it. Of course, this means it will be struck down immediately by some small company that gets a broader patent (see above) and sues them.

    --
    - In hell, treason is the work of angels.
  9. Slashdot hypocrites... by jwriney · · Score: 4, Insightful

    Aaaagh! Patents are bad! Patents are bad!

    (Psst - hey, Google's getting one.)

    Uh, well, (grumble) I guess that's okay then, er...

    Bring on the wave of apologists.

    --riney, Karmakaze

  10. Not entirely unexpected, but... by Tet · · Score: 4, Insightful

    I'm in two minds about this. Should Google get a patent for this? Google have innovated here, and thus the patent is a valid way to reward the effort they put in to designing the system, in exchange for the idea entering the public domain after the patent expires. While the duration of patents in IT related areas needs to be drastically shortened if they're to serve their original purpose, I'm not inherently opposed to patents like this. The question then becomes, is it sufficiently obvious to anyone in the field that it shouldn't be patentable? Well, it's a tough call. The fact is that no one had done anything like that before Google. If it was so obvious, why not? My personal view is that it's obvious enough that if Google hadn't done it, someone else would have done within a couple of years. So while I don't think the patent should have been granted, I don't think it's as cut and dry a matter as it may at first appear...

    --
    "The invisible and the non-existent look very much alike." -- Delos B. McKown
  11. Re:watch out by DCowern · · Score: 5, Insightful

    What's wrong with what Google is doing? They're simply trying to keep an "edge" on the market. The reason why they're the best search engine out there is because they figured out how to make a better way to rank pages. They deserve to reap the benefits of that invention without anyone else cutting in on their business.

    As for the "googling" incident, I just think they're attempting to defend their trademark. If you don't do that kind of stuff, you lose your trademark. Kinda like how Kleenex and Xerox lost theirs (everyone says "may I have a kleenex?" or "could you xerox this?" and so it became colloquial and no longer a trademark).

    All Google is trying to do is cover their ass. If they decide one day to try to patent the search engine, then there'll be reason to get up in arms.

  12. Software patents by killmenow · · Score: 4, Informative

    I find it interesting that because it's google, some /.-ers are saying essentially "good for them!" But at the heart of it, it makes no difference who it is or what their intention is.

    Kids, software patents are bad, mm-kay...

  13. Good for them... by theGreater · · Score: 5, Insightful

    ...because they're Google. But if it were Microsoft patenting "an improved method for giving help to users", say maybe the help files vs. man pages, people would flame about prior art, talk endlessly out of their anuses about how Bill Gates is trying to wrest control of the tinfoil hat co-op from Mac users, and generally be nuisances.

    I love /.ing while in class, but honestly, people. Google gives a C&D letter, we all golf clap and say "way to defend your IP!" Someone else does it, and we all run to chillingeffects to boycott / whine / gripe / whatever.

    Here's a thought... get off your hobbyhorse, and start evaluating things based on FACTS, not the general feeling of techno-elitism you get from pretending you're cool because you get jokes written in PERL.

    And mod me -5 Troll, if you want. But it's the damned truth, and you know it.

    -theGreater.

  14. Algorithm now public? by terrencefw · · Score: 4, Interesting
    I thought Google's searching/ranking technology was a closely-guarded trade secret, to make sure that people weren't able to engineer their rankings successfully.

    Now that they've patented their technology, surely that means that it's open to public scrutiny and therefore abuse as people exploit its shortcomings.

    --
    Like tinyurl, but one letter less! http://qurl.co.uk/
  15. Re:Response Interesting by Anonymous Coward · · Score: 4, Interesting

    Do you realise that the Google search you link to shows your comment as the top result? It's a Google loop!

  16. Re:The nature of the beast by tetro · · Score: 4, Insightful

    Patents aren't technically evil. It's just the way they're used.

    --
    .smell my feet.
  17. Not necessarily... by TopShelf · · Score: 4, Informative
    Patents are also widely used as a means of rewarding an inventor by giving them an avenue to license their technology to one or many users who can then implement it into commercial products. In that way you don't get a monopoly, nor does the inventor have to provide the capital required to bring something to market. You only get a monopoly if the patent holder refuses to sell licenses, or sells it to a single user.

    Think fuel injectors, for example, which are made by several suppliers, but have a patent holder who gets license revenue.

    --
    Stop by my site where I write about ERP systems & more
  18. Re:Sad day for computer scientists by s20451 · · Score: 4, Insightful

    With this privatization, they closed all their notebooks and journals and stopped teaching others how to implement a great webcrawler and search ranking system.

    Troll.

    The upside to patenting (at least in theory) is that Google no longer has to keep its IP secret, in fear that someone else will copy them. If you're so curious, why don't you request a copy of their patent yourself, and review it.

    --
    Toronto-area transit rider? Rate your ride.
  19. Prevent the spread of a deficient technology. by expro · · Score: 5, Interesting

    So, the bright side of this patent is that perhaps it will keep others from focusing on Google's obsession -- the reference popularity contest. But like any patent, it is subject to abuse, not that we know at all how Google intends to enforce it.


    I have requested improvements to Google's algorithms for years to make it more possible to search for a specific thing, rather than just a popular thing, but they don't have engineers, apparently, who understand these basic needs.


    AltaVista lets you wildcard, search for one word NEAR another word, use common words as part of a phrase, and construct a variety of very useful filters that are impossible with Google's popularity engine.


    AltaVista used to be the best out there, but compromised their own usefulness. If AV indexed more pages and had not dropped their usenet coverage, it would still be the most useful engine by far to an advanced searcher -- one looking for very specific things. I still go there often. Just because the masses use Google does not make it quality or best for advanced users. They have stagnated for years now. The masses use a lot of things produced by monopolists who are no longer required to innovate or even improve to the level of the competition.


  20. This patent is for NewRank/LocalRank, not PageRank by registro · · Score: 5, Insightful

    That is not the patent for PageRank.
    PageRank had already been patented by Stanford University, just before Google was created, when it was a community effort.
    This new patent is a patent over an improvement of PageRank, what they now call "LocalRank" and "NewRank". It is designed to stop competitors from developing PageRank-like technologies. Armed with that kind of patent, they can stop open-source Aspseek, Teoma and others from developing similar technologies.
    What they are trying to do is extend patents over citation ranking and peer-review, something that has been around since the creation of the first libraries. This is NOT good.

    Basically, this means no more money from the suits to any citation-ranking related effort in any start-up, fearing litigation. It could also mean no more installations of open-source Aspseek (Google Appliance's competitor) in corporate environments, because of fear of litigation.

    This is sad.

  21. Patent # 6,526,440 by esme · · Score: 4, Informative
  22. The purpose of the patent system by Quetza · · Score: 4, Interesting

    There seems to be a lack of understanding about the original purpose of the patent system. In the distant past, knowledge was transferred from artisan to apprentice and through guilds. Back then, as now, people were very protective of their intellectual property, as it was their livelihood, so it would not be stored anywhere. If the person were to die without passing on the knowledge, it would be lost forever (like Damascus steel).
    To try and stop knowledge from being lost, governments introduced a patent system (first patent recorded in 1449) so that the creator of the knowledge would still get a fair financial reward for the item.

    IMHO there are 2 problems with the existing patent system implementations.
    1) As the technology becomes more complicated, those who verify patents are not skilled enough to accurately judge their validity.
    2) The time limit of patents is too inflexible. Many technology patents should have valid lengths of 5-10 years.