Bow Tie Theory: Researchers Map The Web
Paula Wirth, Web Tinker writes "Scientists from IBM Research, Altavista and Compaq collaborated to conduct the most intensive research study of the Web. The result is the development of the "Bow Tie" Theory. One of the initial discoveries of this ongoing study shatters the number one myth about the Web ... in truth, the Web is less connected than previously thought.
You can read more about it."
OK, first off, I'm really getting sick of this. Nerds don't have to be interested in Linux or Open Source to be a Nerd. That's how you define yourself and therefore you think that all nerds should be like you, right? Just because this doesn't interest you doesn't mean that you have to troll the article or that it's not important to other people that consider themselves nerds...
I have been using the Internet almost since it started, and can even remember the pre-web technologies like "gopher", "wais" and "veronica". If I need to find a web page, I can always use one of the major search engines like googal and altavista. It doesn't matter to me if Joe Average's page is not linked, since it is probably something he hacked together one evening and put up at geocities, and has not updated for over a year.
Wow, how often do you use "googal"? I mean, if you can't even spell google right, why should we believe that you have any knowledge of the internet? And the typical "Joe Average" doesn't have that page up there for you, it's for family and friends. Very few personal pages have a target audience larger than people they know or people that want information on them.
I find all these "personal" pages on the web are a major irritant, as they seldom contain useful information, and they clog up the search engines with non-relevant crap, polluting the search space.
If I want to know what Joe Sixpack in Assmunch Arizona called his dog, or to see pictures of his pickup truck, I would ask him. But I don't.
Have you heard of logical searches? If you know how to search the web properly, you should be able to find just about anything that you are wanting within the first 5 hits. Know what search engines to use for what you want and how to use the logical operators to filter that "non-relevant" crap.
It is about time that us "geeks" re-claimed our Internet from the dumbed down masses. We should return to the days of ARPA, when only people with a legitimate requirement could get net access. The "democratization" (i.e. moronification) of the web has gone too far and is responsible for the majority of problems us "original internet users" are seeing. The flood of newbies must not only be stopped, it needs to be REVERSED. These non-tech-savvy people need cable TV, and not something as sophisticated and potentially dangerous as the Internet.
Perhaps a new more exclusive "elite" (in the good sense of the word) Internet should be set up, running only IPv6. Then we could capture some of the community spirit of the pre-AOL "good old days". And maybe these spammers, skript kiddies and trolls would back off.
Ooh, just what the web needs, more "elite" people like you. Dammit, the web is about information, equality and business. It's not just for you "31337 H@X0RZ" anymore. Grow up! Most of the technology that you're using today was developed because of the popularity of technology. You try to "reclaim" technology for your little group and you'll shrink the market for it so much that companies won't bother with it.
kwsNI
The good ones are expecting their consumers to pay them back, the bad ones are trying to IPO.
What does this mean? Do the Ciscos of the world expect to stay in business by having end consumers repay their VC debt a penny at a time? I wouldn't think so. And only bad companies IPO? That's a rather shallow view, isn't it?
More race stuff in one place,
than any one place on the net.
There's actually a very good reason for this: include a link on your page, and there's a non-zero chance that the viewer will follow it. If the web page in question is essentially an ad (which many pages are, these days), having someone follow a link off-site is like watching them change the channel when your commercial comes on the TV. Why provide them with the out?
Given the criteria they picked, there have to be four groups. The binary-valued criteria are "has links to it" and "has links from it". There are then four possible combinations. All four exist in practice, which is to be expected. Big deal.
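For what it's worth, the "two binary criteria, four combinations" reading above is easy to sketch. This is only an illustration of that comment's argument, not the paper's method: the paper excerpt quoted elsewhere in this thread defines its pieces by reachability along directed links, not just by whether a page has any links at all. The function name and the toy link list are made up for the example.

```python
from collections import defaultdict

def classify_by_link_presence(edges, pages):
    """Group pages by the pair (has links to it, has links from it).

    edges: iterable of (source_page, target_page) hyperlinks.
    pages: iterable of every page to classify.
    Returns a dict keyed by (has_in, has_out) booleans.
    """
    has_in, has_out = set(), set()
    for src, dst in edges:
        has_out.add(src)   # src links out to something
        has_in.add(dst)    # dst is linked to by something
    groups = defaultdict(set)
    for page in pages:
        groups[(page in has_in, page in has_out)].add(page)
    return dict(groups)

# Hypothetical toy web: a -> b <-> c, b -> e, and d is unlinked.
edges = [("a", "b"), ("b", "c"), ("c", "b"), ("b", "e")]
pages = ["a", "b", "c", "d", "e"]
groups = classify_by_link_presence(edges, pages)
for key in sorted(groups):
    print(key, sorted(groups[key]))
```

With only two yes/no properties there can never be more than four groups, which is the comment's point; whether that captures the paper's reachability-based pieces is a separate question.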
- p0rn
It has to be on its own: slashdot won't link to it, and we do generally care to count it. Someone had to say it.
Devil Ducky
MY peers would get out of jury duty.
Yes, I agree entirely with you and have been considering the issues you raise for some time. The most urgent needs facing the internet today are a) to get rid of current users who aren't capable of using it, and b) preventing further users from accessing it in the future. I have some ideas on how to proceed with these two points.
IMHO, the solution is to stop letting everyone access the web. There are two ways this should be implemented. Firstly, anyone under the age of 18 (or 16 or 21) should not be allowed on the web at all. Until they are adults, they cannot be trusted to handle the large amounts of dangerous information which the web can provide to them, and during this vulnerable stage in their life they can be swayed by rhetoric and promises. Doing this will immediately stop the market for censorware and filtering, since if only responsible adults can use the net, then they can handle what is seen there, and get rid of pirates, script-kiddies and trolls who are almost exclusively under 16.
Secondly, access to the web should be dependent on some kind of examination process, whereby people who want to use the web have to take a test to determine their suitability. In this way we can weed out the undesirables from the net and make sure that the content on it is of uniformly high quality. Rather than having sites dedicated to racist hate, terrorist manifestos and anti-Christian diatribes we can have decent sites which educate and enlighten readers, like we had before open access.
Now, I know these comments will offend some /.ers, but try to look beyond your liberal hand-waving for a minute and think about these proposals. The net is becoming a cesspool, and this is the only way to clean it up.
Of COURSE there are a bunch of 'dead end' and non-connected sites. There are a thousand web rings for Leonardo DiCraprio just languishing, having been abandoned for whoever is hot now... argh...
I love Google, but lately when I search I get more results consisting of dead links and posts to message boards than any useful info. I've been on the mailing list for the Search Engine Watch newsletter for a couple years now, and while there's a lot being done to weed through all the fluff, IMHO the fluff is growing at too high a rate for the technology to keep up with presently.
Anybody currently active in the industry got an insight into how search engines are combatting all this expired flotsam?
The Divine Creatrix in a Mortal Shell that stays Crunchy in Milk
The House Between - Original Sci-Fi Series
They haven't realized that people want information from more than one source. They also haven't realized that providing links to those alternate sources will improve their credibility.
pooptruck
All opinions are my own - until criticized
... until criticized.
Just so you know, I'm not here to criticize your opinions. I'm here to criticize your sig. The first, most obvious problem, is that you are missing a period at the end of your sentence. Please fix this. Secondly, you should not have hyphenated that sentence. That's just wrong in so many ways. In mid-sentence -- like this -- you use a dash. In plain ASCII text, a dash is two -- count them -- two hyphens. There are other characters available, but those fall outside the 7-bit range and therefore, they cannot be trusted. Not that any of this matters because you should have used parentheses (the little round things on either side of this little comment here) or an ellipsis...
Here are some samples of what your sig should look like:
All opinions are my own (until criticized).
or
All opinions are my own
You must please understand sigs are very important. Unlike comments, you can change your sig and fix it and make it look pretty. Anybody that criticizes spelling in a post is an elitist and a hypocrite, but sigs can be changed. You can make a difference!
That said, it's time you got yourself a new sig. Thank you.
Sorry, but "World Wide Bow Tie" just doesn't do it for me. Plus, we'd have to rename W3C to W2BTC, and that would just screw things up...
kwsNI
How did they do this? They used Altavista.
So their entire theory of "bow-tie connectedness" conveniently forgets that Altavista exists. Fortunately for us web users, Altavista (insert your favorite search engine) does exist, and its existence seems to invalidate their hypothesis.
So it's an interesting idea, but if it ignores the existence of search engines it doesn't really hold much meaning.
--
It's a
-- Danny Vermin
This was shortly followed by announcements from the W3C of the 'Angel-Hair', 'Fusilli', and 'Linguini' web theories.
-josh
I find all these "personal" pages on the web are a major irritant, as they seldom contain useful information, and they clog up the search engines with non-relevant crap, polluting the search space.
Really I think you should be blaming the search engines for that, not the web itself. It's the search engines who index it, after all.
The most convenient way to fix this problem would simply be for all your favourite sites to use meta information properly. This is exactly what it was designed for. Unfortunately there are too many lazy designers around that don't bother to implement it properly, so it's no wonder that search engines have trouble indexing and promoting it appropriately. Most geocities users who don't update their homepage for a year won't know or care about how to use meta info effectively, and it would quite easily demote their pages by default.
I don't want to sound too boring but one of the best things about the net in its (mostly) unregulated state today is its openness and how it lets information be distributed so easily. Sometimes this information is unreliable but the same mechanism can't prevent open debate about the same information, either.
Personal homepages are simply documents that somebody has placed on a server indirectly attached via a network to your own. If you don't like them, disconnect your computer from that network. If you want a censored system, then by all means design it, patent it, and only sell the rights to the people you want to use it.
only if you want super gee wizz stuff
.oO0Oo.
the bleeding edge will always be messy as new technologies race to be ahead and others fall down
it's ALWAYS been like that in virtually every field of study in computing and no doubt humanity
ride the wave or swim back to shore, your choice
There are places where the networks are not touching, and there are places where they are -- Boeing's Lori Gunter
Look at the money going into streaming media. A large segment of the business world still sees the internet as just another medium for TV or radio broadcasting. By its very nature broadcasting is not interconnected, it's passive and linear.
Tim Berners-Lee wrote in his book, Weaving the Web, that the main obstacle to the web being a true information web of shared knowledge is that content is controlled by too few. He was upset that browsers were developed which could not edit web pages like his original browser/editor.
The silver lining to this, IMHO, is the "weblog" phenomenon, including sites like Slashdot, where ordinary users can contribute their ideas, especially in html format so that they can contribute links. I really believe that some day soon the conventional media sites will be forced to give this kind of capability to their readers, or else risk losing all those eyeballs to Slash-like sites.
"What I cannot create, I do not understand."
Netscape and Microsoft have market shares enough that their "features" are used, but none are big enough to set a de facto standard.
Wouldn't it be nice if *one* browser had a flawless implementation of the W3C standard?
All opinions are my own - until criticized
All this tells me is that developers are selective in what they link to. Some tend to get together and link to each other. Some tend to link only to themselves. Some want to be noticed so they provide lots of links, but aren't truly interesting, so nobody links to them.
This makes complete sense. If every page had links to every other page, you would never be able to find anything. Each page would have too many links. The way the web is developing, you start looking for info within the IN group (usually a search engine or someone's index page). This leads to the SCC, which eventually points you to a leaf node in the OUT group which has the truly interesting information.
I find this structure to be efficient and elegant.
Aah, change is good. -- Rafiki
Yeah, but it ain't easy. -- Simba
Remember the days when you would go to a web page and every sentence had at least one link? Even corporate sites weren't shy about doing off-site linking.
Of course the web was atrocious, but if you found a dumb page (take my old one for example) there was always something linked to that WAS moderately interesting.
Wiki pages are awfully reminiscent of the "old web". (Of course that one is centered around eXtreme Programming and kinda boring, IMO, but it's the principle!)
Oh well. The corporatization of the web has brought lots of cool things, too; they're just harder to find now.
"I'm not arguing for linking to random information just because you can, but informative linking is why hypertext has the hyper."
And I'm not arguing the opposite. We shouldn't just link every single word to Everything2.com just because we can, and, God Forbid, our site would not be linked enough to the core if we did not. The content has to be weighed. What frustrates me even more than a page with absolutely no applicable links (when it would be useful), is a site which has big blue glaring links all over the place and I can't find /anything/ relevant. An example would be Microsoft's knowledge base/help (sic) site. Try to find something there.
It's 10 PM. Do you know if you're un-American?
Endless? I doubt it. Start a business. Chances are it will go belly up in the first year or so. Take a job. You will be working for someone else and giving them the fruits of your labor while they hand out a small pittance to you (not bad sometimes but at least realize it). Vote? Doesn't really count unless you organize a large group of people to vote like you do (special interest groups). How about something simple like build a shed on property you own with your tools, made with wood you bought (or made if you can do that)? Sure you can do that. Assuming you have the permission of the locality you live in. Or you could just build it and have the locality order you to tear it down and get permission first.
Let's be honest now and drop the sarcasm. In America we are free... to a point. We live under a mass of laws that have been enacted over time to appease one group or another. Some of this is good... some of it is bad or just downright unenforceable, in and of themselves. As for standing a better chance here than anywhere else, I think you would have a pretty good chance in Canada, Great Britain, or several other countries. America does not have a lock on success in this world. We happen to just be the most arrogant about it (unfortunately).
The study does not mention the impact of secure sites. For that matter, any site that the search engine couldn't crawl. For instance, what about PHP, ASP or other script-generated sites? Do these show up as "outs" or strongly connected or dead links? I have not seen anything in the paper addressing this. I have to imagine there are a fairly large number of scripted pages and secure websites. What about ad-click links? Do they make the web appear more connected just because ads appear on an otherwise dead-end page? There may be reasons to question the validity of any such research done using web-bots since the nature of web content has rapidly changed over the last few years.
OK, so you're trolling, but I almost agree with you anyway. Back in the late '80s, I was wishing more people were connected to the net. It was a great place to be. Now they're all here, I occasionally find myself wishing they weren't. The problem is that there's no quality control. If only people with half a brain were allowed Internet access, we wouldn't have the AOL syndrome. But real life isn't like that. For better or worse (overall, I think it's for better, despite the problems it causes), the unwashed masses do contribute to the essence of the net. For every 1000 AOL lusers, the general population gives us a Rob Malda or an Iliad. Not an ideal ratio, but better than nothing.
"The invisible and the non-existent look very much alike." -- Delos B. McKown
The web is broke. We're not using it properly
Is there a proper way of using the web? I don't think so. The web is many things to many people. That's what makes it so alive and so interesting.
there are too many poorly done corporate sites
Sure, so what? See Sturgeon's Law (90% of everything...). A badly designed corporate site tends to be its own punishment.
We need more of these research projects to help us figure out what needs to be changed.
Seems like you want to impose good taste and proper programming practices on the web. Thank you very much, I'll pass. I don't want the web to be Martha-Stewardized.
Kaa
Kaa's Law: In any sufficiently large group of people most are idiots.
___
I can't stand developing for 3 different browsers on 4 different platforms, 12 screen resolutions, 3 color depths...
Then don't! That's one thing many "web authors" still don't get... The WWW is a text-oriented medium. It's a page of text that has links to other pages of text. Everything else is just cruft.
HTML doesn't define how a web site should look to the pixel, and this is one of its strong points. It's up to the user to decide how to view a site. If the user doesn't want images, your site should look just fine without them.
The minute you start checking to make sure your site looks the same on all browsers, you should re-think your entire site. Why do you want it to look the same on all browsers (it won't by the way...)? This usually indicates that you are focusing too much on presentation and not enough on content.
The web is broke. We're not using it properly
I agree with your second statement. The web isn't broke... people just aren't using it properly. There are so many corporate sites that look like brochures. It's sickening. My previous job was to set up a web page for a small business, and all they wanted me to do was scan each page of their brochure into GIF's, put them up on the web, and put "forward" and "backward" buttons on the bottom to navigate between pages. I said, WTF!?!? The concept of actually including text information and links to other resources was totally absurd to my boss.
These kinds of people think of the web only as a marketing tool, and thus can't take advantage of the power it has to offer.
In general, the AltaVista crawl is based on a large set of starting points accumulated over time from various sources, including voluntary submissions. The crawl proceeds in roughly a BFS manner, but is subject to various rules designed to avoid overloading web servers, avoid robot traps (artificial infinite paths), avoid and/or detect spam (page flooding), deal with connection time outs, etc. Each build of the AltaVista index is based on the crawl data after further filtering and processing designed to remove duplicates and near duplicates, eliminate spam pages, etc. Then the index evolves continuously as various processes delete dead links, add new pages, update pages, etc. The secondary filtering and the later deletions and additions are not reflected in the connectivity server. But overall, CS2's database can be viewed as a superset of all pages stored in the index at one point in time. Note that due to the multiple starting points, it is possible for the resulting graph to have many connected components.
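As a rough illustration of the "roughly BFS from many starting points" idea in the excerpt above, here is a minimal sketch. It is not AltaVista's crawler: it walks a hypothetical in-memory link table (`TOY_WEB`) instead of fetching pages, and it leaves out the politeness rules, robot-trap detection, spam filtering and timeout handling the excerpt mentions. All names are invented for the example.

```python
from collections import deque

def bfs_crawl(seeds, get_links, max_pages=1000):
    """Visit pages breadth-first from several starting points.

    get_links(page) returns the pages that `page` links to.
    Returns pages in the order they were visited.
    """
    queue = deque(dict.fromkeys(seeds))  # drop duplicate seeds, keep order
    seen = set(queue)
    order = []
    while queue and len(order) < max_pages:
        page = queue.popleft()
        order.append(page)
        for target in get_links(page):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order

def count_weak_components(pages, get_links):
    """Count connected components, treating links as undirected edges."""
    neighbors = {p: set() for p in pages}
    for p in pages:
        for q in get_links(p):
            if q in neighbors:
                neighbors[p].add(q)
                neighbors[q].add(p)
    unvisited = set(pages)
    count = 0
    while unvisited:
        count += 1
        stack = [unvisited.pop()]
        while stack:
            node = stack.pop()
            for n in neighbors[node]:
                if n in unvisited:
                    unvisited.remove(n)
                    stack.append(n)
    return count

# Hypothetical toy web: two seed pages whose neighborhoods never link up.
TOY_WEB = {"s1": ["a", "b"], "a": ["c"], "b": ["c"], "c": [],
           "s2": ["x"], "x": ["s2"]}
links = lambda page: TOY_WEB.get(page, [])

crawled = bfs_crawl(["s1", "s2"], links)
print(crawled)
print(count_weak_components(crawled, links))
```

Because the two seeds sit in regions that never link to each other, the crawled graph ends up with two components, which matches the excerpt's remark that multiple starting points can yield many connected components.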
-----
I hope people don't use this paper to promote arbitrary linkage to other sites. I mean /why/ do things have to be more connected? When I'm on my web page I don't want or need one click access to every other part of the web. That's why there are portals and search engines. Islands I understand. But we wouldn't necessarily /want/ those two sections of 24%, origin and termination, to be arbitrarily linked more to the core. We'd just end up with the whole web being a humongous hairball of a core in which each page linked to many other pages in the core. What a mess. People put indices in one place, at the BACK of a book, for a reason.
It's 10 PM. Do you know if you're un-American?
Why not do the obvious thing? Make each page one giant GIF or JPEG image. You can use an imagemap to let people navigate.
Why don't you do THIS for your customers? It gives them exactly what they want, they can get pixel-level control of how their site looks. They can even digitize paper brochures and do it that way. And by ignoring all of the cruft with crappy HTML 4 different browsers on 5 different platforms, you can probably do the site cheaper and easier this way too.
HTML and pixel-accurate renderings are MUTUALLY EXCLUSIVE. HTML isn't, wasn't and should NOT be designed for that. If you want something better, either design it yourself, or piggyback it on something which works and can be done today. JPEG or GIF or PNG.
If your customers want to look like idiots on the web, then I'm sure they'll like this. Not only this, you should ADVERTISE this advantage. ``The only web design company whose sites look EXACTLY how you want, on every browser on every OS.''
I visit pages that have content I want. I bookmark and revisit pages that lead me to content I want. If Palm's website doesn't link to palmgear, or their development section doesn't link to the GCC cross compiler tools, I won't visit them.
....
I still remember the HP48 websites circa 4 years ago.. 95% of them were crap, just links to another site that was full of links to other crap sites.. Had ONE of those sites had a categorized set of links, I would have bookmarked it in an instant. ``links to development software sites'' ``links to fan sites'' ``links to shareware sites'' ``links to math sites''
It is about time that us "geeks" re-claimed our Internet from the dumbed down masses. We should return to the days of ARPA, when only people with a legitimate requirement could get net access. The "democratization" (i.e. moronification) of the web has gone too far and is responsible for the majority of problems us "original internet users" are seeing. The flood of newbies must not only be stopped, it needs to be REVERSED. These non-tech-savvy people need cable TV, and not something as sophisticated and potentially dangerous as the Internet.
While my wife and I often joke about the sentiment of this statement (at least once a day, one of us will point at a website or an email and say, "Yet Another person who really shouldn't be on the Internet"), we also know that actually believing it is horribly shortsighted thinking. Yes, there's a lot of no-content fluff out there on the web. But people have to start somewhere. I wouldn't expect a person's first web page to be more meaningful than "here's my house, my family, and my pets" any more than I'd expect a 6-year-old's first two-wheeler to be a Harley.
Granted, some folks never get past the "training wheels" stage. (Okay, make that "a lot of folks" these days.) But the Internet has long passed the days when it was a tool for a select group of people. If the S/N ratio is dropping precipitously, well, then, improve your noise filters. Make it a habit to include things like "-url:aol.com" in every search if you need to. You're one of the "tech-savvy" crowd (directed at the original AC who posted); use your tech knowledge! If all you can do is bitch about the fluff on the web without using readily-available tools to cut through it, maybe you're not as tech-savvy as you'd like us to believe.
"I shouldn't have to" isn't a valid response, either. In any information search, irrelevant data will turn up, and you're going to have to sort through it anyhow.
Aero
We can believe in you for 3 minutes, but beyond that, even the King of All Cosmos can't be expected to wait.
The minute you start checking to make sure your site looks the same on all browsers, you should re-think your entire site. Why do you want it to look the same on all browsers (it won't by the way...)? This usually indicates that you are focusing too much on presentation and not enough on content.
Clearly, you're not a developer. For those of us who do this for a living, it's about presentation and content. And we're not necessarily designing our *own* websites, we're designing for clueless clients who refuse to be convinced of certain practices/standards no matter HOW MUCH we pound them into their puny skulls.
Web Developers/Designers have the most *clueless* clientele of most any industry, and we have to develop for them, not us. Believe it or not, graphics occasionally look good on a website, and people WANT THEM. And considering how IE and NS handle tables and alignment so very differently, we DO have difficulties making it look the same in both browsers.
For those of us who design, we know that this is a perpetual, neverending headache.
Quidquid latine dictum sit, altum videtur.
I only post comments when someone on the internet is wrong.
But getting the analogy wrong just reflects the simplicity of what they've done. Their categorizations based on number of links in and number of links out could have been made a priori. They did measure the size of each group which was somewhat interesting.
A much more interesting study would be of the paths that people actually follow. Who cares what links the author put up if nobody clicks on them? But the paths that people take would tell us a lot. Do people start at their bookmarks? Do they start at portals? Can they be categorized? And the real question: what paths do the people who buy stuff take?
This is currently being discussed at Kuro5hin (pronounced "corrosion").
Will I retire or break 10K?
"In fact, I've run into web developers who have never HEARD of the w3c."
<<**SHUDDER**>>
It's 10 PM. Do you know if you're un-American?
If you consider Ramsey theory then you'll know that any two-coloring of the edges of a large enough complete graph will give a group of vertices that are strongly interconnected (a clique) and/or a group in which none of the vertices is connected to any other (an anti-clique).
For example, two-coloring the edges of a complete 6-vertex graph will always give either a clique or an anti-clique of three vertices. In a social context, this means that in any group of 6 people there will be at least 3 people who either all know one another or are all strangers to one another. A theorem by Erdos tells us that a random graph probably does not have a clique or anti-clique of size 2 log n or more (here log = log base 2) where n is the number of web sites. Another result says that there is guaranteed to be a clique or anti-clique that is at least as large as log base 4 of n (half of log base 2 of n) where n is the number of web pages.
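The 6-vertex case is small enough to check by brute force. The sketch below tries every two-coloring of the edges of a complete graph and looks for a one-colored triangle; color 1 can be read as "these two know each other" and color 0 as "strangers". The function names are made up for this example, and it only covers the triangle case, not the logarithmic bounds mentioned above.

```python
from itertools import combinations, product

def has_mono_triangle(n, coloring):
    """True if some three vertices have all three connecting edges the same color."""
    for a, b, c in combinations(range(n), 3):
        if coloring[(a, b)] == coloring[(a, c)] == coloring[(b, c)]:
            return True
    return False

def every_coloring_has_mono_triangle(n):
    """Check all 2-colorings of the edges of the complete graph on n vertices."""
    edges = list(combinations(range(n), 2))  # each edge as a sorted pair (i, j)
    for colors in product((0, 1), repeat=len(edges)):
        coloring = dict(zip(edges, colors))
        if not has_mono_triangle(n, coloring):
            return False  # found a coloring with no clique or anti-clique of 3
    return True

print(every_coloring_has_mono_triangle(5))
print(every_coloring_has_mono_triangle(6))
```

For 6 vertices that is 2^15 = 32768 colorings, so the exhaustive check finishes quickly. Five vertices are not enough: coloring the outer pentagon one color and the inner star the other avoids any one-colored triangle.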
"When you sit with a nice girl for two hours, it seems like two minutes. When you sit on a hot stove for two minutes, it
Perhaps they should put a list of these unconnected websites up on the web, then they could quickly and easily disprove their own research.
Study or no, isn't it rather obvious that an awful lot of the web isn't accessible (linked) unless you know exactly where the resource you're looking for is? Ask anyone who does web searches routinely - you'll get vastly different results looking with different engines, and you can rest assured you'll miss lots of information that you might find useful.
Case in point - recently I needed to find some information about a specific company. Now this company's name is virtually the same as a popular Unix variant (no, not Linux). Searching for the company name, once all the Unix links were weeded out (this was a chemical company) led to some federal documents about the company, but not much other information at all.
As it turns out, the company in question HAS a web site (and has had one for some time) - it just wasn't linked from anywhere I could access on the common search engines.
Still, it's nice to have some data on this...
-- Rick
- Slashdot
- The rest
'The rest' can be further subdivided into 3 parts:
- News sites Slashdot links to
- Non-news sites which get slashdotted
- News sites talking about Slashdot
- The other category would be "None of the above" but in that case we don't really care to count, do we.
And of course I have to mention that this study of mine is highly unbiased, openminded, and generally guaranteed to be 100% completely accurate.
__________________________________________
God did not appoint us to suffer wrath but to receive salvation through our Lord Jesus Christ --1Thes5:9
Our analysis reveals an interesting picture (Figure 9) of the web's macroscopic structure. Most (over 90%) of the approximately 203 million nodes in our crawl form a single connected component if hyperlinks are treated as undirected edges. This connected web breaks naturally into four pieces. The first piece is a central core, all of whose pages can reach one another along directed hyperlinks -- this "giant strongly connected component" (SCC) is at the heart of the web.
In graph theory, a strongly connected component is an equivalence class of vertices under mutual reachability - i.e. a group in which every vertex is reachable from every other vertex along directed edges.
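To make that definition concrete, here is a minimal sketch that finds strongly connected components with Kosaraju's two-pass approach. This is a standard textbook method, not necessarily what the researchers used; the function name and the toy page names are made up for illustration.

```python
def strongly_connected_components(graph):
    """graph: dict mapping each page to the list of pages it links to."""
    nodes = set(graph)
    reverse = {}
    for src, targets in graph.items():
        for dst in targets:
            nodes.add(dst)
            reverse.setdefault(dst, []).append(src)

    # Pass 1: depth-first search on the original graph, recording finish order.
    visited, order = set(), []
    for start in nodes:
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(graph.get(start, ())))]
        while stack:
            node, links = stack[-1]
            for nxt in links:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(graph.get(nxt, ()))))
                    break
            else:
                order.append(node)  # all outgoing links explored
                stack.pop()

    # Pass 2: search the reversed graph in reverse finish order.
    assigned, components = set(), []
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            component.add(node)
            for prev in reverse.get(node, ()):
                if prev not in assigned:
                    assigned.add(prev)
                    stack.append(prev)
        components.append(component)
    return components

if __name__ == "__main__":
    toy_web = {"a": ["b"], "b": ["c"], "c": ["a", "d"], "e": ["a"]}
    print(strongly_connected_components(toy_web))
```

In the toy graph, pages a, b and c can all reach one another and so form one component, while d (linked to but linking nowhere) and e (linking in but never linked to) each sit alone.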
What's interesting is that the four groups mentioned in this article are all approximately the same size, with the SCC group being only slightly larger than the others, which are:
So what they're saying is that really only about a quarter of the internet is the core that is strongly connected to the rest of it. Which is interesting in itself, because I'd have thought it was a lot higher.
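For anyone who wants to see how such a split falls out of reachability alone, here is a rough sketch. It is not the study's actual method: the function names, the toy site names, and the catch-all "OTHER" label are assumptions for illustration. Given one page known to be in the core, pages that can both reach it and be reached from it form the core, pages that can only reach it link in, pages only reachable from it are linked out to, and everything else is lumped together.

```python
from collections import deque

def reachable(start, adjacency):
    """Breadth-first search: every node reachable from start, including start."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def bow_tie(graph, core_node):
    """Partition pages relative to the component containing core_node."""
    nodes = set(graph)
    reverse = {}
    for src, targets in graph.items():
        for dst in targets:
            nodes.add(dst)
            reverse.setdefault(dst, []).append(src)
    forward = reachable(core_node, graph)     # pages the core can reach
    backward = reachable(core_node, reverse)  # pages that can reach the core
    core = forward & backward
    return {
        "SCC": core,
        "IN": backward - core,
        "OUT": forward - core,
        "OTHER": nodes - forward - backward,  # hypothetical catch-all label
    }

if __name__ == "__main__":
    toy_web = {
        "personal": ["portal"],
        "portal": ["news"],
        "news": ["portal", "corporate"],
        "corporate": [],
        "island": ["island2"],
    }
    for part, pages in bow_tie(toy_web, "portal").items():
        print(part, sorted(pages))
```

Dividing the size of each part by the total number of pages gives the kind of proportions being discussed here.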
> I find all these "personal" pages on the web are a major irritant, as they seldom contain useful information, and they clog up the search engines with non-relavent crap, by polluting the search space.
Has it ever occurred to you that slashdot was once Rob's personal page?
Marriage is considered capital punishment for the theft of a goat in some third world countries...
Actually, there is one. It's called Amaya and it's been developed by the W3C for use as both a browser and editor. It's not the prettiest of things, but it's not designed to be. It's designed to be a fully standards-compliant web browser. Head to the W3C's page and take a look at it. It's available for most platforms and really quite useful.
-Jason
If I could only live my life with my threshold at 4...
The WWW is a text-oriented medium. It's a page of text that has links to other pages of text.
What you've just described is gopher with links.
I've said this on slashdot before, and I'll say it again: The web is NOT Gopher. The web is a multi-media platform, including graphics, animation, video, sound, and any other funky stuff people want to throw up on it. The whole "The web should be text. Graphic elements are clutter." mentality makes me sick. I agree 100% that a site should NOT be DEPENDENT on graphics or other 'specialty media' to get content across. That's what good consideration for the text-based users and ALT tags are for. But a web without graphics is merely gopher tunneled over http.
Why do you want it to look the same on all browsers (it won't by the way...)?
It's pretty simple: clients don't understand the web. They want all that pretty crap. They REQUIRE it to look the same wherever they see it. They expect things as low level as kerning and leading to be the same, universally.
Like I said in my first post, we (as in everyone) need to recognize that the web is a new medium. Traditional media conventions don't apply.
I have to disagree with you there. Undoubtedly, the web started out as and was designed for a text-oriented medium of information propagation. However, it is also true that it has outgrown its original design. How else do you explain "IMG" tags? Why would they be required in a text-only medium?
Yes, there are limitations originating from its design goal that generate a sense of awkwardness when implementing graphics-oriented pages. However, there are principles of web page design which can be followed to minimize the awkwardness. Graphics are now very much on the web: deal with it the best you can. Closing your eyes and hoping it will go away is not a good solution.
I have no solution for the original problem posed regarding programming for multiple browsers. This is indeed a bitch. But the one about multiple resolutions is much more easily fixed: program your webpages to a fixed resolution. I contract at IBM, and IBM's standard is that the webpage must be displayable on a 640x480 resolution without having to scroll. There are exceptions to this rule of course, but these sites need to get approval for exceptions from higher up.
There is no such thing as luck. Luck is nothing but an absence of bad luck.
It is not everyone's internet. The internet was funded by business for business and is supported and enhanced by business and for business. You are an invited guest here, mind your manners.
The dumbing down is done by the masses, but it is neither wanted nor promoted. The internet gets its legs from the billions in capital that businesses (mostly US) provide for their benefit, not yours. Pr0n, Joe Sixpack's dog pics and AOL crap are just unwanted byproducts.
More race stuff in one place,
than any one place on the net.
How do people find your pages? How do indexes and search engines work? It's all based on the textual content. Google doesn't do OCR on your GIFs of scanned brochures, or voice recognition on your MP3s of your radio spots. Even if images or sounds are the focus of your site, you'd best have plenty of text that indexes and describes that content.
What loads fastest, giving surfers the information they're looking for in the least time? Text.
What can be displayed in the user's choice of colors and fonts, so that it's legible in any situation? Text.
What can be rendered on a PDA, or read by a text-to-speech converter for the blind? Text.
What should web designers do when clients don't understand these issues? Apply the clue stick. Gently and with respect, but firmly, make it clear that you know more about the internet and the WWW than they do, that's why they are paying you, and if they want a rhinestone-encrusted, illegible and unusable site that takes three days to load over a 28.8k PPP link, then they can hire a 12-year-old who's just finished reading HTML for Dummies instead of a professional - and then spend the next few months wondering why they bother having a web site, since it's done fsck all for their business.
Tom Swiss | the infamous tms | my blog
You cannot wash away blood with blood
Hmmm, isn't it possible that the number of disconnected sites is much greater, due to the fact that you can't find all the non-linked sites... since there is nowhere to find them from? Anybody have any info from the study?
2) Advertisers and news sites link into corporate pages
3) Personal home pages are highly likely to link into popular sites, but not be linked-into themselves
Applying these ideas, and others like them, leads to the "bowtie".
> The web is broke. We're not using it properly
I agree with your second statement. The web isn't broke... people just aren't using it properly. There are so many corporate sites that look like brochures. It's sickening. My previous job was to set up a web page for a small business, and all they wanted me to do was scan each page of their brochure into GIFs, put them up on the web, and put "forward" and "backward" buttons on the bottom to navigate between pages. I said, WTF!?!? The concept of actually including text information and links to other resources was totally absurd to my boss.
These kinds of people think of the web only as a marketing tool, and thus can't take advantage of the power it has to offer.
Look at news sites. How many times do you come across articles that are taken word-for-word from the printed page? (Almost to the point that it says, "continued on page 3C".)
The worst part is the page-turning. You know, the "next page" links at the bottom of articles. That right there is a sign that your site is broken. You're using a static and linear approach in a dynamic and nonlinear medium.
Break the story up. Link, God damn it! If a company gets mentioned, link to it, not to one of those pathetic stock quote drivels that news sites make. If some person made a speech, don't just quote one or two sentences, link to the speech.
I'm convinced that the web is going to suck until our children ascend to power. Look at television. In the early days of the late 40s and 50s everything was very rigid. You basically had radio programs being done in front of a camera. Only after a generation was raised on television did you actually get programs that started to take advantage of the medium. Compare how news was done in 1950 to how it's done today. Look at educational television. Before, you had the monotone droning voice of an old man, and now you have Sesame Street. The same thing is going to happen to the web.
- Are nerds only interested in linux and open source? NO
- It is not our/your internet! It is everyone's Internet! If the internet has "dumbed down" then it is just appealing to the masses.
- Buy a domain like elitegeek.net and create your own net if you want with search engines that only have the data you want in them. The very freedoms that allowed the explosive growth of the net allows you to carve off a little section if you want.
- By what arrogance do you believe you have a right to join your IPv6 net where others do not? Who is the test body?
- It scares me to death to read a post like this when I have to wonder if you could actually mean it!
Your post smacks of the argument for removing the right to vote because everyone keeps voting in the same dumb bastards.
Never underestimate the dark side of the Source
I'm a web developer. I've always loved the potential of the web until recently. Now I don't like working with it. I can't stand developing for 3 different browsers on 4 different platforms, 12 screen resolutions, 3 color depths, and design templates that came from a print artist who thinks that the web is one big brochure.
The web is broke. We're not using it properly, there are too many poorly done corporate sites, contributing to insecurity, poor usability and incompatibility.
Many clients we work with are dead set against sending anyone away from their site. I don't think they realize that links are what the web is made of. This contributes to the unreachable part of the bowtie. These corporate folk are afraid that by linking away from the site, they will lose a viewer, and that user won't find their way back. They don't realize that the web is a pull technology, and that if the user was looking for certain information, the user will come back if it is the best source of such info. The back button is one of the browser's most used features.
We need more of these research projects to help us figure out what needs to be changed. The W3C is a start, but it's expensive to join and it's rare that you find a website that conforms to the standards. In fact, I've run into web developers who have never HEARD of the w3c.
The web is a new, completely different medium. It's not a CDROM, it's not a brochure, it's not TV. We can't keep treating it like these other media.
Well, according to the abstract,
So it has immediate practical use if you're writing spiders, and so on. I'm not sure whether "insight into... the sociological phenomena which characterize [the Web's] evolution" counts as something which does you any good, but you never know where the resulting studies might lead. (Anyhow, who says research has to do anyone any good?)
GROGGS: alive and well and living in