Slashdot Mirror


Web Page Entanglement

jason writes "tangle is a system for what we call "web page entanglement". tangle creates links between pages automatically based on how users move from one page to another. tangle proxies connect together in a peer-to-peer network for scalability: as users surf the entangled web, they are passed from proxy to proxy. Each proxy serves as an expert for a particular subset of web pages. For example, you can take a look at the entangled version of the GNU homepage as seen through a tangle proxy. tangle alpha2, the first public version, has just been released. See http://tangle.sourceforge.net for more information, or read on..."

jason continues:

"By viewing the web through a tangle proxy, you can see the connections and associations left by those who surfed the web before you. By surfing the web using tangle, you also leave behind connections and associations for others who will surf in the future.

When you exit one page and enter another (by clicking a link or performing a search), a two-way link is created between the pages. As users surf through a particular page over time, tangle keeps track of popular ways to get to the page and popular places to go next. These entry and exit links are displayed at the top of each page, sorted by popularity.

Clicking on one of these entry/exit links tells tangle that you think the link is relevant and useful (like a vote for the link) and increases the link's popularity. In other words, if a user thinks of something relevant while reading a page and performs a search for it from that page, tangle gauges how others react to that association over time.

tangle is similar in some ways to the closed-loop hypertext system Everything2, though tangle works for the web at large.

We have several tangle proxies up and running. The tangle proxy software is also available for download.

A note for the paranoid:
Though tangle keeps track of web usage patterns, the focus is not on tracking the habits of individual users, but on tracking the trends of an entire community of users. tangle is GPL'd open source [source here], so you can see for yourself: clicking a link through a tangle proxy simply bumps up the link's popularity; user IP addresses are completely ignored."
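
The entry/exit bookkeeping described in the post can be sketched roughly as follows. This is a minimal illustration, not tangle's actual code: the class and method names are hypothetical, and it assumes popularity is a plain count of moves between two pages.

```python
from collections import defaultdict


class LinkTracker:
    """Toy model of tangle-style entry/exit links (all names are hypothetical)."""

    def __init__(self):
        # exits[a][b] = number of times someone moved from page a to page b
        self.exits = defaultdict(lambda: defaultdict(int))
        # entries[b][a] mirrors exits, so every link is visible from both ends
        self.entries = defaultdict(lambda: defaultdict(int))

    def record_move(self, from_url, to_url):
        """Record one move (or one click on a displayed link) as a vote.

        No user identifier is taken or stored, only the pair of pages.
        """
        self.exits[from_url][to_url] += 1
        self.entries[to_url][from_url] += 1

    @staticmethod
    def _ranked(counts, n):
        # Most popular first; ties broken alphabetically for stable output
        return sorted(counts, key=lambda url: (-counts[url], url))[:n]

    def top_exits(self, url, n=5):
        return self._ranked(self.exits.get(url, {}), n)

    def top_entries(self, url, n=5):
        return self._ranked(self.entries.get(url, {}), n)
```

Calling `record_move("a", "b")` twice and `record_move("a", "c")` once would make `top_exits("a")` list "b" ahead of "c", while `top_entries("b")` lists "a".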

34 of 176 comments

  1. For Christsake don't run this on Slashdot! by Anonymous Coward · · Score: 4, Funny

    You'd get a goatse.cx link on top of every page.

  2. I made an exit link by Phosphor3k · · Score: 5, Funny

    Through goatse.cx, and if we all play our part, we can get gnu.org associated with goatse.cx!

  3. Hmm by serps · · Score: 5, Funny

    Does this mean that once quantum computers arrive, we will experience quantum entanglement?

    Thank you, I'll be here all week :P

    --
    "Einstein argued that [...] God is not capricious or arbitrary. No such faith comforts the software engineer." ~ Brooks
  4. isn't this done already? by Anonymous Coward · · Score: 5, Interesting

    Microsoft does something similar with their Smart Tags. That is, they modify your page without you realizing it. Only with entanglement, it's done on the server, rather than on the browser.

    Is there a way to block entanglement?

    1. Re:isn't this done already? by Mnemia · · Score: 4, Insightful

      You don't, and you don't really have a right to. People can view your content through whatever proxy or filter they want if you put it online at a publicly accessible URL. You as a content producer don't get to specify exact presentation.

    2. Re:isn't this done already? by snillfisk · · Score: 4, Informative

      Ok, for the end-user it looks modified, but please remember that the end-user him/herself has chosen to read pages through entangle .. hopefully they'll be aware of their own actions and realize that they're reading pages through entangle.

      I believe we'll probably see quite a few entangle communities on the net, where you probably just start your own entangle community with your friends or your co-workers.

      .. and it's not the browser that modifies the content, it's the proxy .. I'm not sure if the proxy uses any special headers, but if it does, you may block your site for non-modified entangles .. But then again, why would you do that? It would only limit the audience and the usability of your own page.

      --
      mats
      One man's ceiling is another man's floor.
    3. Re:isn't this done already? by Mnemia · · Score: 4, Interesting

      I'm not saying you shouldn't be able to use mod_rewrite to alter all your URLs so people can't access things in ways you didn't allow. There's nothing legally stopping you from doing that; after all, you own the server. But I do think this is unethical behavior if it is done for some reason other than security. It undermines the reason the Web is a powerful medium and not just clickable television or an electronic magazine. Linking and relinking is at the heart of a peer publishing world where anyone can put their work out there on an equal plane with the professionals and where anyone can comment, criticize, or critique the contents of other people's information.

      My view is that when you make a public website you are contributing your views and information to the massive global community of links and related information. This ecosystem feeds off of openness and places the quality of the content above marketing and branding. I think that you should be willing to accept that when you make a public website, unless you are worried you can't compete on merit.

      Basically, you're free to make whatever you want available, but you can't control what OTHER people do with that content once it leaves your site (within the bounds of copyright law, which has no bearing IMHO on the copy in the browser cache). That's the price you pay for using the Web to publish: you have to let everyone else have the same rights as you, and that includes the right to link. That's why you shouldn't use mod_rewrite to prevent deep linking, etc, though that's certainly preferable to sending out legal threats. You can do this if you want, but you're not being a responsible member of the Internet community.

  5. Wow... by dubious9 · · Score: 5, Interesting

    Brilliant, I can't believe someone hasn't come up with this before. It reminds me of the traveling salesman implementation that models the way ants work. Most ants go the way most ants go; every once in a while some ants stray to find a better path.

    If this isn't abused by users, I see the net becoming much more efficient for searching for information. You won't have to wait for the search engines to catch up while looking for the most popular page on a topic, because the best (or should I say most popular) pages on a topic will automatically link to each other based on user flow.

    Am I missing something here, or am I right in thinking this will revolutionize the way we surf (that is, if enough sites do it)?

    --
    Why, o why must the sky fall when I've learned to fly?
    1. Re:Wow... by wurp · · Score: 3, Interesting

      That's what I thought about crit.org in all of its incarnations. crit.org is a decorating proxy, like the entangler. But instead of tracking linking, it let you mark up web pages to make corrections, suggest links, request clarifications, etc. I used it for a while, then I used the ThirdVoice toolbar which did the same thing but was proprietary. AFAIK, virtually no one else used it. Even on the sites associated with the creators, it was rare to find anyone posting or get a response to your issues.

      Until there's a plugin you can put in your browser so that every page you visit is automatically viewed through these decorating proxies, they won't revolutionize anything. : (

    2. Re:Wow... by dubious9 · · Score: 4, Interesting

      -1 Ignorant. I happen to have a degree in CS.

      I may not have gotten the exact idea down, but yes a very good approximative traveling salesman algorithm is based on ant behavior.

      Do some research: here are some undergrads who used the idea, which they learned from here (pdf).

      (Which are links I got from a two-minute perusal on Google for "traveling salesman ants")

      Please have an idea what you are talking about next time.

      Here's the abstract from the latter source.

      We describe an artificial ant colony capable of solving the traveling salesman problem (TSP). Ants of the artificial colony are able to generate successively shorter feasible tours by using information accumulated in the form of a pheromone trail deposited on the edges of the TSP graph. Computer simulations demonstrate that the artificial ant colony is capable of generating good solutions to both symmetric and asymmetric instances of the TSP. The method is an example, like simulated annealing, neural networks, and evolutionary computation, of the successful use of a natural metaphor to design an optimization algorithm.

      --
      Why, o why must the sky fall when I've learned to fly?
    3. Re:Wow... by kmellis · · Score: 3, Insightful
      "Ironically, you could have typed four words into Google and understood what he was referring to, rather than typing in several dozen insulting him unfairly."
      Yes, but then he would have denied us the opportunity to learn something important about him. This could even be a win-win situation, if he learns something about himself, too. You got to look on the bright side of things, am I right?
    4. Re:Wow... by dubious9 · · Score: 3, Insightful

      Ok, maybe I didn't explain my thoughts well enough, which I know is prone to happen.

      The analogy is as follows: Nodes in the traveling salesman algorithm are akin to a ring of popular related websites. Traveling Salesman wants to find a way to minimize the distance required to travel to each node. Web Page Entanglement (WPE) wants to find a way to minimize the number of direct links (paths) between somehow related popular nodes.

      "Ants" work by testing each link, mostly following the shortest known path, but sometimes branching out to see if there is a shorter unknown path.

      WPE is similar because if users from a go to b, and users from b go to c, then naturally there will be some that go directly from a to c, which will rise to be a popular link, and thus a's links are more "optimized" to link to other popular somehow related websites.

      I find the similarities quite apparent. Perhaps you should open your mind and realize that they are quite possibly not ~100% unrelated. Besides none of the other replies to this thread have sided with you.

      I would like to hear from anybody that does side with mosch, because I may be wrong and I think it is a virtue to assume that one is not correct. A virtue more people should adhere to.

      --
      Why, o why must the sky fall when I've learned to fly?
  6. Slippery Slope? by moronga · · Score: 5, Interesting

    If the more popular links are shown first, doesn't it just reinforce their popularity? Once a link becomes popular, is there any way to vote it down?

    1. Re:Slippery Slope? by Iguanaphobic · · Score: 4, Insightful

      As with competition in business, you can vote it down by simply going somewhere else.

      --
      Fascism should more properly be called corporatism, since it is the merger of state and corporate power.
    2. Re:Slippery Slope? by trentfoley · · Score: 4, Insightful
      In order to know if the page is worthwhile, you must look at it. And, then you can choose to go somewhere else. But, by looking at the worthless page, you have voted for it. There needs to be a way to indicate dissatisfaction with the choice. Perhaps the proxies could detect the user hitting the back button and use this for negative feedback. However, I think that might lead to too many false negatives. It's never easy, is it?

      If I'm way off, that's because I'm too damned lazy to read the article.
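
The back-button idea in the comment above could look something like the sketch below. Nothing here comes from tangle itself: the `LinkScore` class, the 10-second threshold, and the penalty weight are all assumptions, chosen only to show how a dwell-time cutoff might limit the false negatives the commenter worries about.

```python
from dataclasses import dataclass


@dataclass
class LinkScore:
    """Hypothetical per-link tally: votes plus 'bounced straight back' signals."""

    clicks: int = 0
    quick_returns: int = 0

    def record_visit(self, seconds_before_back=None, quick_threshold=10.0):
        """Count a click; treat a fast 'back' as a sign of dissatisfaction.

        seconds_before_back is None when the user never hit back.
        Only returns faster than quick_threshold count against the link,
        so someone who read the page and then went back is not penalized.
        """
        self.clicks += 1
        if seconds_before_back is not None and seconds_before_back < quick_threshold:
            self.quick_returns += 1

    def score(self, penalty=1.0):
        # With penalty=1.0 a quick return simply cancels the vote it came with
        return self.clicks - penalty * self.quick_returns
```

With the default penalty, three visits of which one bounced back after 3 seconds would score 2.0; raising the penalty makes bounces cost more than a vote is worth.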

  7. Gak! advertiser links and spam by maximillionus · · Score: 3, Insightful

    How long before this goes the way of the search engines with people abusing this to promote their own links?

  8. Trusting what you read. by clunis · · Score: 5, Interesting

    Excluding mutually authenticated ssl sessions, how can I trust that the document I'm reading is the document I tried to download? The tangle service is already modifying the page to add its navigation links, so why not change the content too ( e.g. remove content that users might find offensive, replace ads on popular pages with ads that you've sold, change links to documents you host, etc. )? The same really goes for any proxy or cache service, and I'm not accusing these good people of doing this, but how do we protect ourselves from services that would as more of them appear?

    1. Re:Trusting what you read. by Jester99 · · Score: 5, Insightful

      Excluding mutually authenticated ssl sessions, when have you ever trusted anything online?

      There's 15 routers between you and any web page you're visiting. That page is transmitted in plaintext the whole way. A man-in-the-middle attack could easily filter/scrub/change/subvert any page you're viewing.

      I know paranoia's popular on slashdot about how "The Man" is going to censor your viewing habits, but if you think that this is some sort of new problem created by proxies... just look at how TCP/IP operates. And smack yourself for not thinking that it already could happen. This is not a new concept or a new danger.

      Take-away message: if you need to ensure your data's passing along the net securely... use a secure transport mechanism.

  9. Too bad by MxTxL · · Score: 5, Insightful

    It sounds cool, but might prove to be useless... the phenomenon will happen that popular sites will be the ones getting the most hits and just perpetuating that way just because they are popular. More useful but less popular sites will be overlooked because they haven't been looked at much.

  10. New information by Catskul · · Score: 4, Interesting


    If this caught on, I can imagine that people would tend to depend on it. It seems that information would become stagnant and new information would be ignored, since no one would have exited to it initially. Then again, maybe not. Just a thought.

    --

    Im not here now... Im out KILLING pepperoni
  11. Net use tracking. by bytesmythe · · Score: 3, Funny

    Well, we know exactly what the first entry link at NineNine's and autopr0n's sites will be.

    --
    bytesmythe
    Hypocrisy is the resin that holds the plywood of society together.
    -- Scott Meyer
  12. A shame that it's so slow ... by Hektor_Troy · · Score: 4, Informative

    Response times are close to a minute right now on the linked proxy. How would it stack up if you ran a local entanglement proxy? Would response times still be high, due to negotiations with other nodes?

    --
    We do not live in the 21st century. We live in the 20 second century.
  13. Link bad! by The+Pi-Guy · · Score: 3, Informative

    You should be using this (http://zip.cse.ucsc.edu:8080/request?inform_about_proxy=&link_from_page_title=&link_from_page_url=http://slashdot.org/&link_to_page_url=http://www.gnu.org/ for those who don't trust me) link instead, so the referrer will correctly be Slashdot.

    --j

  14. here's an idea by tq_at_sju · · Score: 3, Insightful

    put links on your web pages based on what the web page is about

    --
    http://www.vanillaafro.com - take me seriously and I will shoot you
  15. Ahhh, more pr0n ads! by stevens · · Score: 4, Funny

    This appears to use the same idea as referer-links on weblogs. Here's the progression from idea to uselessness:

    1. Obtain data from visitors as they browse.
    2. Post data obtained from visitors on the same site.
    3. Watch as three new internet startups market a tool to spam pr0n links on all the pages that use (1) and (2), above.
    Only let your users post shit on your site if you want it to all be pr0n spam or goatse links.
  16. microsoft stuck in the middle by jdkane · · Score: 5, Funny

    Hey, I just checked the entangled version of the Microsoft.com site, and all the entry and exit links seem to go to Slashdot, Free Software Foundation, or other places that Microsoft stands against. Looks like Slashdot has done its job. Pretty funny.

  17. Undesired Anomalies? by whereiswaldo · · Score: 4, Funny

    This appears as an exit link:

    "anarax.net - easier to use than a virgin on prom night"

    Not very tasteful for a professional site.

  18. I like it the way it is by mao+che+minh · · Score: 4, Interesting
    Call me old-fashioned, but I really like the way that it works now. I like browsing the web, page by page, without having my surfing and the surfing of others being influenced by the content's popularity. I enjoy having many different outlets for searching that retrieve information and "rank" it in a variety of ways (with many search engines using different means to "rank" it).

    Don't get me wrong though, this is a very creative and useful thing. For example, this would be extremely useful for searching through technical support knowledge bases or for a large company's document archive system. I would just rather they leave my web surfing alone. ;)

    1. Re:I like it the way it is by Dynedain · · Score: 3, Insightful

      From my experience, this would be a horrible thing for tech support databases.

      As it is, most major tech support sites already rank and display information based on how many people have already accessed it, informed them of usefulness, etc.

      Invariably, when I visit vendor tech support pages looking for information, I am looking for some of the most obscure problems. And I have a hell of a time finding the information that I need, because I'm not looking for the 'popular' stuff. And if I ever do find what I need, I better bookmark it or print it, because if I come back later, there's no way I'm ever going to find it again.

      I'd rather have a plain, simple, boolean word search engine over an 'intelligent' support database any day.

      --
      I'm out of my mind right now, but feel free to leave a message.....
  19. Re:Tangleless P2P Web by foniksonik · · Score: 5, Interesting

    Is anyone working on a personal P2P portal? Seems like an extension of what you're talking about. What I see is software which works like a webserver but is local and accessed P2P. Instead of DNS you use the P2P model to direct traffic and search for content, whether it is files or html/web media. All you'd need is a renderer (think gecko) hooked in to parse html, etc. to the peer who is browsing your site. This of course could also serve up blogs or calendars or whatever other types of web services you wanted to offer to your peers.

    --
    A fool throws a stone into a well and a thousand sages can not remove it.
  20. Yes, everything2 is right (not redundant) by Pyrosophy · · Score: 3, Informative

    Mods, this isn't redundant, it's true... and old news since Everything2 is already around.

    Of course the problem they've experienced on Everything2 is that some cool or sexy sounding link is irresistible to click on, causing these links to rise to the top regardless of their relevance. Thus, it decreases the usefulness of the "entanglement".

    Sex memes really are the most pernicious out there... can you honestly tell me you could resist clicking on "The Screensavers - Nude Episode"? The cost (clicking) to possible benefit (grrrrrrrrrrrrrrr) ratio is just too small not to expend the click.

    Pop-up hell might increase cost, thereby disciplining hormonal clickers, but even then. The Onion used to have an ad called "Naked Scottish Weathergirls" -- one of the most clicked on on the web. It led to a messageboard eventually where people posted digitized women in Scotland -- so many people must have arrived there and posted messages asking about the naked women it was unreal.

  21. interesting .. but is it effective? by Dr.+Awktagon · · Score: 5, Interesting

    I'm reminded of the idea of leaving your campus grounds unpaved, and then waiting for the "natural" grooves to appear in the ground where people walk, and then paving over those to make the sidewalks. You've probably seen an example of where there's a sidewalk connecting two points but then there's a worn-out groove nearby that's better, or connects from a more popular location.

    Some people think it's rude or immature for people to create these grooves by not walking on the sidewalk, but I see it as an example of an arrogant designer who thinks he knows the best way simply by studying a piece of paper. It's amazing sometimes, the groove just appears almost magically in an optimal place, given the layout of buildings and traffic patterns.

    This applies to web pages too. But, unlike sidewalks and buildings, you can't see your other destinations when you're sitting on a web page, so how do you know where to go next? This seems like it will just constantly reinforce the previous set of links, whatever they are.

    I didn't fully read the documents (/. strikes again) but what I saw says you move from page to page either by 1) following an existing link or 2) using a search function. #1 is not going to create fresh paths.

    It seems to me, a better idea would be to present a user with all possible links, or a subset of possible links, the first few times they visit. Then as they click through the site, add their arcs to the database.

    After the first few visits, you can stop showing all links, and show them the "most popular" links. If you just show the popular links up front, new paths may not be discovered.

    So perhaps this technique could be seen as a way to remove unpopular links, to trim the fat from a page. Then again, it might not be good to change a page after a person has gotten used to it.

    It's very interesting though. As the web matures, you'll see more of this sort of analysis to move beyond static web pages.
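
One way to picture the suggestion above (show popular links, but keep surfacing others so new paths can still be found) is to reserve a display slot or two for links that have not yet become popular. The sketch below is an illustration only: `links_to_show`, its parameters, and the slot-reservation rule are assumptions, not anything tangle is known to do.

```python
import random


def links_to_show(counts, k=5, explore_slots=1, rng=random):
    """Pick k links: the most popular ones, plus a few random others.

    counts maps a link URL to its click count (hypothetical data).
    explore_slots of the k positions go to links outside the popular set,
    chosen at random, so unpopular or new links still get seen sometimes.
    """
    # Most popular first; ties broken alphabetically for stable ordering
    ranked = sorted(counts, key=lambda url: (-counts[url], url))
    n_popular = max(k - explore_slots, 0)
    popular = ranked[:n_popular]
    rest = ranked[n_popular:]
    # Never ask for more random links than actually remain
    explore = rng.sample(rest, min(explore_slots, len(rest)))
    return popular + explore
```

Setting `explore_slots=0` reduces this to a plain popularity ranking, which is the self-reinforcing behavior several commenters are worried about.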

  22. Concerns by mattr · · Score: 3, Insightful
    Some points to consider (based on the handout):

    1. Server load.

    2. Limited feedback. Would be much more interesting as a tool for discovery if users could grade their findings. Presumably annotation would allow memos to be posted.

    3a. Privacy concerns, i.e. this would seem to provide more transparency to crowds. And Slashdotters might become more predictable. (Nah!)

    3b. Privacy concerns II. By announcing statistics of aggregate use it might be possible for a repressive regime (China, Scientology) to gain ammunition against individual websites by being able to prove how many visitors they had and (by purchasing an advertisement on an associated server like yahoo) what their IP addresses and demographic profile are (as implied by 3a above). ActiveX or Javascript exploits may also target heavy traffic streams with relatively little effort.

    4. Confusing intent. Adding visible backlinks seems quite valuable. However the client still cannot look more than one ply above its current location in what is still an undirected tangle. Is the tangle team (nice name by the way) aware of the large body of work already accomplished in annotation, syntactic web, Xanadu, etc.? What pressures exist to get people to take the less-travelled routes, or is the purpose to increase the traffic of popular sites? In that case are annotations superfluous? More docs please.

    5. (?) a bug in slash they note.

  23. I'm super paranoid man! by Call+Me+Black+Cloud · · Score: 5, Funny

    A note for the paranoid:
    Though tangle keeps track of web usage patterns, the focus is not on tracking the habits of individual users, but on tracking the trends of an entire community of users. tangle is GPL'd open source [source here], so you can see for yourself...


    Yes, but since this runs on the server, how do I know you're really running the source that's available?

    Or maybe I'm worrying too much, and the check really is in the mail, my information really won't be sold to 3rd parties, that really does happen to all guys at one time or another, and it's not me, it's you.