Slashdot Mirror


The First Open Ranking of the World Wide Web Is Available

First-time accepted submitter vigna writes "The Laboratory for Web Algorithmics of the Università degli studi di Milano together with the Data and Web Science Group of the University of Mannheim have put together the first entirely open ranking of more than 100 million sites of the Web. The ranking is based on classic and easily explainable centrality measures applied to a host graph, and it is entirely open: all data and all software used are publicly available. Just in case you wonder, the number one site is YouTube, the second Wikipedia, and the third Twitter." They are using the Common Crawl data (first released in November 2011). Sites are ranked using harmonic centrality, with raw indegree centrality, Katz's index, and PageRank provided for comparison. More information about the web graph is available in a pre-print paper that will be presented at the World Wide Web Conference in April.
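
For readers unfamiliar with the measures named above, here is a minimal Python sketch of two of them, harmonic centrality and raw indegree, on a toy host graph. It is only an illustration under stated assumptions, not the software the ranking was built with: the hostnames, the `hosts` dictionary, and both function names are made up for this example. Harmonic centrality of a host x is taken here as the sum of 1/d(y, x) over every other host y that can reach x by following links, so hosts that cannot reach x contribute nothing.

```python
from collections import deque


def harmonic_centrality(graph):
    """Map each host x to the sum of 1/d(y, x) over hosts y that can reach x.

    `graph` maps a host to the list of hosts it links to (hypothetical data).
    """
    # Collect every host, including ones that only appear as link targets.
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)

    # Reverse the links so a BFS from x walks backwards along incoming links.
    reverse = {n: [] for n in nodes}
    for src, targets in graph.items():
        for dst in set(targets):
            if dst != src:  # ignore self-links
                reverse[dst].append(src)

    scores = {}
    for x in nodes:
        dist = {x: 0}
        queue = deque([x])
        total = 0.0
        while queue:
            u = queue.popleft()
            for v in reverse[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    total += 1.0 / dist[v]  # closer hosts count for more
                    queue.append(v)
        scores[x] = total
    return scores


def indegree(graph):
    """Map each host to the number of distinct other hosts linking to it."""
    counts = {n: 0 for n in graph}
    for src, targets in graph.items():
        for dst in set(targets):
            if dst != src:
                counts[dst] = counts.get(dst, 0) + 1
    return counts


# Toy host graph with invented hostnames: host -> hosts it links to.
hosts = {
    "video.example": ["wiki.example"],
    "wiki.example": ["video.example"],
    "blog.example": ["video.example", "wiki.example"],
    "shop.example": ["video.example"],
    "forum.example": ["blog.example"],
}

harmonic = harmonic_centrality(hosts)
in_links = indegree(hosts)

# Print hosts from highest to lowest harmonic centrality.
for host in sorted(harmonic, key=harmonic.get, reverse=True):
    print(host, harmonic[host], in_links[host])
```

In this toy graph the two measures agree on the order, but they need not: indegree only counts direct links, while harmonic centrality also credits hosts that are a few hops away, discounted by distance.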

5 of 53 comments

  1. Re:We're #164! We're #164! by Anonymous Coward · Score: 2, Interesting

    164. slashdot.org

    4072812. beta.slashdot.org

    Profit!!!

  2. I'd say the results are pretty obvious... by ausekilis · Score: 2

    Up top you have those websites that have their fingers in damned near everything, because they are looking at "centralization" of the website. More and more websites are using videos, and who better than YouTube to host? Need to provide a way to search your website? Google has already done it for you. Need to update your 3 billion fans on what you're having for lunch? Facebook and Twitter have you covered. I can't see the list from work, but I'd wager that Facebook is up there too, with their ever-present "like" buttons. What's surprising is Wikipedia: you'll only sometimes see a link to Wikipedia, even in discussions on Slashdot, and they don't go out there and wave their hands saying "everybody link to me" like other sites do.

    What about other aspects that would make a website "good"? Such as ease of navigation (find what you want in 5 clicks or less)? Size/amount of useful content? Number of external sites that link to their content?
    If we included that sort of data, YouTube could potentially be far up there with Wikipedia. I would think Google and Bing would be ruled out entirely since by their very design they don't hold real data.

  3. Re:Usefulness of results? by goombah99 · Score: 3, Funny

    Using old data, they rank digg above reddit.

    Yeah Digg is too crowded, no one goes there anymore.

    If you have not done so please donate to Wikipedia.

    --
    Some drink at the fountain of knowledge. Others just gargle.

  4. Linux is dying by Trepidity · Score: 2

    It is now official. The Università degli studi di Milano has confirmed: Linux is dying.

    One more crippling bombshell hit the already beleaguered Linux community when UNIMI confirmed that Linux's flagship domain, kernel.org, fell to a shocking #1797 in the Common Crawl rankings. You don't need to be the Amazing Kreskin to predict Linux's future. Its domain now ranks just behind Excite.com, the now-irrelevant search engine from the 1990s, which edges it out at #1796.

    The glaring gap between Linux's ranking and the rankings of those in the vibrant, enterprise-ready world is in itself embarrassing enough: Apple #8, Microsoft #17, even Oracle #248. But what seals the coffin is that Linux has fallen behind even the notoriously moribund FreeBSD operating system in these industry-leading metrics, trailing it by nearly one thousand, five hundred positions.

  5. Re:CC? by Garble Snarky · Score: 2

    Maybe because the rank is controlled by links, and many pages link to CC even though people seldom follow those links?