Slashdot Mirror


A Statistical Review of 1 Billion Web Pages

chrisd writes "As part of a recent examination of the most popular HTML authoring techniques, my colleague Ian Hickson parsed through a billion web pages from the Google repository to find out what the most popular class names, elements, attributes, and related metadata are. We decided that publishing this would be of significant utility to developers. It's also a fascinating look into how people create web pages. For instance, one thing that surprised me was that the <title> is more popular than <br>. The graphs in the report require a browser with SVG and CSS support (like Firefox 1.5!). Enjoy!"

294 comments

  1. I clicked I'm Feeling Lucky on this article by dada21 · · Score: 1, Funny

    and all I got was Britney Spears.

    Sheesh.

    1. Re:I clicked I'm Feeling Lucky on this article by whitehatlurker · · Score: 1

      I get a different top return for that search.

      --
      .. paranoid crackpot leftover from the days of Amiga.
  2. We've come a long way by suso · · Score: 3, Funny

    if the tag isn't on the top elements list.

    1. Re:We've come a long way by dgatwood · · Score: 1
      I'm guessing that was , but that's just a hunch.

      --

      Check out my sci-fi/humor trilogy at PatriotsBooks.

    2. Re:We've come a long way by dgatwood · · Score: 1
      Sigh. A few seconds too late. Oh, well.

      I'm guessing one reason for its decreased use is that a lot of browsers refuse to honor that tag.... On the other hand, most browsers still honor the property in CSS. :-D

      --

      Check out my sci-fi/humor trilogy at PatriotsBooks.

  3. Blink by suso · · Score: 4, Funny

    the tag.

    1. Re:Blink by mysqlrocks · · Score: 3, Funny

      the <blink> tag.

      I must have blinked, I didn't see it the first time.

    2. Re:Blink by sbenj · · Score: 1

      I still wish you could surround an entire page with a tag. Just imagine.

    3. Re:Blink by ReverendLoki · · Score: 4, Funny
      Still, the only good use I ever saw for that tag was the line:

      Schrodinger's cat is <blink>not</blink> dead.

      Every other usage just caused me to browse elsewhere.

      --
      09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
    4. Re:Blink by Billy+the+Mountain · · Score: 1

      You can, it's called the tag

      --
      That was the turning point of my life--I went from negative zero to positive zero.
    5. Re:Blink by sbenj · · Score: 1

      It was a misprint. What I thought I wrote was "putting a tag around a whole page". The one time I don't preview....

    6. Re:Blink by Repton · · Score: 2, Funny

      All you need to do is blink at the right frequency and you'll never see it at all!

      --
      Repton.
      They say that only an experienced wizard can do the tengu shuffle.
    7. Re:Blink by hixie · · Score: 1

      was 97th. Used more often than , , , , , , , and ...

  4. <title> is more popular than <br> by InsideTheAsylum · · Score: 5, Funny

    well when people talk like this and dont bother using punctuation spacekeys or any of the skills that they have been taught in school its no wonder why webpages turn out like this not to mention those long runon sentences and also all that broken code that are the fist attempt at a webpage by a twelve year old kid who tried to steal someone elses layout and replaced the word with his own then you start to look at all of those dynamically generated webpages and the layouts and the style sheets and its no wonder why the good old br tag never get a work out.

    1. Re:<title> is more popular than <br> by Fr05t · · Score: 1, Funny

      "...out."

      Hooray! I've never been so happy to see a period!

    2. Re:<title> is more popular than <br> by aussie_a · · Score: 5, Funny

      Never been scared your girlfriend was pregnant? Oh wait, this is slashdot. Nevermind.

    3. Re:<title> is more popular than <br> by FinestLittleSpace · · Score: 1

      I don't mind them. It's just the decorating afterwards that I'm not a fan of.

    4. Re:<title> is more popular than <br> by Anonymous Coward · · Score: 0

      My girlfriend is pregnant, you insensitive clod!

    5. Re:<title> is more popular than <br> by Anonymous Coward · · Score: 5, Funny

      Women and Compilers... miss a period and they go wild.

    6. Re:<title> is more popular than <br> by Anonymous Coward · · Score: 1, Funny

      Congratulations.
      Is it yours?

    7. Re:<title> is more popular than <br> by Anonymous Coward · · Score: 0

      I'm pretty sure it's mine (and not, say, her husband's). We're expecting our son in March.

  5. Finally... by RandoX · · Score: 5, Funny

    An un-slashdottable server.

    1. Re:Finally... by p0 · · Score: 1

      now that "googling" is a real verb, if a site gets "googled" (as in slashdotted) wtf will u call that?

      --
      This is my sig. There are thousands more, but this one is mine.
    2. Re:Finally... by menkhaura · · Score: 1

      "Googled" as in "slashdotted"? Clusterfucked!
      "Googled" as "searched for"? Googled.

      --
      Stupidity is an equal opportunity striker.
      Fellow slashdotter Bill Dog
    3. Re:Finally... by Firehed · · Score: 1

      Serious downtime, I'd imagine. Let's face it... as amazing as the power of a slashdotting is, the google crawler could probably have more of an effect if someone in GoogleLand gets a bit bored.

      --
      How are sites slashdotted when nobody reads TFAs?
  6. BR tag? by p0 · · Score: 5, Insightful

    With css power you really do not need to use br, maybe that is the reason for the small stats for the tag's use?

    --
    This is my sig. There are thousands more, but this one is mine.
    1. Re:BR tag? by masklinn · · Score: 3, Interesting

      Small stat? are you joking?

      This is about the number of sites that use the tag, not the number of tags out in the wild, and <br> is used on more pages than <table>. There are as many pages with at least one <br> as pages with at least one <img> tag.

      That's freaking huge, for a tag that should almost never be used.
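
The distinction drawn here, pages that use an element versus total occurrences of it, can be sketched in a few lines. This is only an illustration built on Python's standard `html.parser`; the `ElementCollector` class, the `count_elements` helper, and the three sample pages are made up for the example and are not the method used in the report.

```python
from collections import Counter
from html.parser import HTMLParser


class ElementCollector(HTMLParser):
    """Records the name of every start tag seen in one page."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this hook.
        self.tags.append(tag)


def count_elements(pages):
    """Return (pages_using, occurrences) Counters for a list of HTML strings."""
    pages_using = Counter()   # number of pages with at least one such element
    occurrences = Counter()   # total number of such elements across all pages
    for html in pages:
        collector = ElementCollector()
        collector.feed(html)
        collector.close()
        occurrences.update(collector.tags)
        pages_using.update(set(collector.tags))  # each page counts once per element
    return pages_using, occurrences


# Hypothetical sample pages, invented for this sketch.
sample_pages = [
    "<title>a</title><p>one<br>two<br>three</p>",
    "<title>b</title><table><tr><td>x</td></tr></table>",
    "<title>c</title><p>y<br>z</p>",
]
pages_using, occurrences = count_elements(sample_pages)
```

In this toy sample `<br>` appears three times but on only two pages, while `<title>` appears on all three, which is the kind of per-page figure the comment is describing.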

      --
      "The way we can tell it's C# instead of Haskell is because it's nine lines instead of two." -- wadler
    2. Re:BR tag? by Eightyford · · Score: 1

      With css power you really do not need to use br, maybe that is the reason for the small stats for the tag's use?

      But don't we all use br's when we quote other people on slashdot?

    3. Re:BR tag? by crumley · · Score: 2, Funny
      But don't we all use br's when we quote other people on slashdot?
      No.
      --
      Preventive War is like committing suicide for fear of death. - Otto Von Bismarck
    4. Re:BR tag? by torunforever · · Score: 1
      I was curious about the following line from their "text elements" analysis.
      The \ "attribute" is almost certainly the result of people writing markup like <br\> when intending to do <br>. Of course, neither is particularly useful to browsers when the page is sent as text/html (as all these pages were).
      Anyone know what they mean by that? My guess is it's supposed to say "when intending to do <br />", and since that is XHTML, it's pointless to use that syntax when the page is being sent as HTML.
    5. Re:BR tag? by CRCulver · · Score: 1

      But don't we all use br's when we quote other people on slashdot?

      No, for good semantically-sensible comments, you should be using the blockquote tag for the remarks of those to whom you are replying, as I have done here.

    6. Re:BR tag? by crabpeople · · Score: 1

      you must have used a br because otherwise your text would be right above ["reply to this"] (or your sig) as this will probably be. br and p are the ownage.. i dont see why we need more than html 3. its good enough for everyone.

      --
      I'll just use my special getting high powers one more time...
    7. Re:BR tag? by poot_rootbeer · · Score: 1

      a tag that should almost never be used.

      I don't understand what you're basing that opinion on, as the BR tag is not deprecated in either the HTML 4.x or the XHTML 1.x standards. Demanding a line break at a particular location is perfectly cromulent syntactic markup. (Actually, it's more of a suggestion than a demand; non-page-based devices will quietly ignore the tag, should anyone ever develop a practical non-page-based device for the web.)

      What SHOULD never happen, I think, is for BR to be treated as a substitute for proper block-level delineation. If you're ending a paragraph and starting a new one, you should have two open P tags and two close P tags. Sticking two BR tags in a row in there instead isn't semantically correct, even if it is practically force of habit to those that grew up using typewriters.

    8. Re:BR tag? by Metasquares · · Score: 1

      I disagree. <br> is extremely useful when you want to allow your users to enter in certain HTML tags without allowing them to launch XSS attacks.

      For that matter, <br> is useful when users enter in a combination of text and HTML. Putting a BR where the newline was preserves the formatting of the text as the user entered it (for example, see the HTML of this Slashdot post. I'm entering it as plain old text and I placed no BR tags in it). A tag like <pre> may be better for that, though.

    9. Re:BR tag? by MyHair · · Score: 1

      That's probably what they mean. I started using /> for a while before I realized it was only valid for xhtml and not valid for any definition of html. I've seen other people do this, too. I also did the same thing for <img src="boobies.jpg" /> and other no-content tags; I don't have the plugins to see the graph so I can't see if this "element" shows up in img, meta or the others.

    10. Re:BR tag? by Bogtha · · Score: 2, Insightful

      The <br> element type is kept around for a few minority uses. Things like poetry, code listings, etc, where dividing something up into lines is necessary. These things are rare, which is why masklinn said "should almost never be used" and not "should never be used".

      What SHOULD never happen, I think, is for BR to be treated as a substitute for proper block-level delineation.

      Yes, and if you take into account the idea that most pages that use the <br> element type do so in precisely this way, you'll end up agreeing with masklinn and myself.

      --
      Bogtha Bogtha Bogtha
    11. Re:BR tag? by Luyseyal · · Score: 1
      Of course with <pre> you get those asshole side-scroller trolls (scroll-trolls?).

      -l

      --
      Help cure AIDS, cancer, and more. Donate your unused computer time to worldcommunitygrid.org. Join Team Slashdot!
    12. Re:BR tag? by ScottyH · · Score: 1

      I'm confused, cromulent?

    13. Re:BR tag? by TubeSteak · · Score: 1
      I got really fucking tired of putting 'br' and 'p' throughout my posts.

      Went and changed my default to "Plain Old Text" and haven't looked back since.
      • Plain Old Text: Same as "HTML Formatted", except that [BR] is automatically inserted for newlines, and other whitespace is converted to non-breaking spaces in a more-or-less intelligent way.


      Lets me make spelling mistakes faster than evar.
      --
      [Fuck Beta]
      o0t!
    14. Re:BR tag? by masklinn · · Score: 1

      Demanding a line break at a particular location is perfectly cromulent syntactic markup.

      There are only two cases I know of where using a <br> tag is more logical than wrapping the text in <p>: poetry and the <address> tag.

      I'm pretty sure that fits the "should almost never be used" thing, as most people don't write pages full of poetry and addresses.

      I never said that <br> must never be used, I said that it should almost never be used, i.e. it should (very) rarely be used. There is a subtle yet important nuance here.

      --
      "The way we can tell it's C# instead of Haskell is because it's nine lines instead of two." -- wadler
    15. Re:BR tag? by masklinn · · Score: 1

      I disagree. <br> is extremely useful when you want to allow your users to enter in certain HTML tags without allowing them to launch XSS attacks.

      For that matter, <br> is useful when users enter in a combination of text and HTML. Putting a BR where the newline was preserves the formatting of the text as the user entered it (for example, see the HTML of this Slashdot post. I'm entering it as plain old text and I placed no BR tags in it). A tag like <pre> may be better for that, though.

      The paragraph element p exists for a reason.

      Do you realize how trivial it is to just add a <p> tag at the start of each line a user entered? (and it's as trivial to add the </p> tag, even though it's optional in HTML)
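
A minimal sketch of that idea, assuming the user's text is treated as plain text and fully escaped (which is stricter than the "certain HTML tags allowed" case discussed above). The `lines_to_paragraphs` name is hypothetical; `html.escape` is from the Python standard library.

```python
from html import escape


def lines_to_paragraphs(text):
    """Wrap every non-empty line of user text in its own <p> element."""
    paragraphs = []
    for line in text.splitlines():
        line = line.strip()
        if line:  # skip blank lines rather than emitting empty paragraphs
            # escape() neutralises <, > and & so the text cannot inject markup
            paragraphs.append("<p>" + escape(line) + "</p>")
    return "\n".join(paragraphs)


example = lines_to_paragraphs("Hello\n\nA <b> here")
```

Here the closing `</p>` is written out even though HTML makes it optional, so the same output would also be acceptable as XHTML.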

      --
      "The way we can tell it's C# instead of Haskell is because it's nine lines instead of two." -- wadler
    16. Re:BR tag? by masklinn · · Score: 1

      you must have used a br because otherwise your text would be right above ["reply to this"] (or your sig) as this will probably be.

      Duh no, paragraphs on slashdot have a bottom margin.

      Example is my post, composed only of paragraphs and a blockquote at the top, and it has quite a bunch of space before the sig.

      --
      "The way we can tell it's C# instead of Haskell is because it's nine lines instead of two." -- wadler
    17. Re:BR tag? by Just+Some+Guy · · Score: 1
      But don't we all use br's when we quote other people on slashdot?

      No. I wrapped your quote in a <em>, and started this paragraph with a <p>.

      I do use it a lot for portraying conversations:

      Eightyford: Don't we use br's?
      Me: Sometimes. I just used one.
      Eightyford: w00t!

      I don't think I ever use them for anything else.

      --
      Dewey, what part of this looks like authorities should be involved?
    18. Re:BR tag? by Gonoff · · Score: 1
      --
      I'll see your Constitution and raise you a Queen.
    19. Re:BR tag? by Iron+E · · Score: 1

      The <br> element type is kept around for a few minority uses. Things like poetry, code listings,

      You can use the <pre> </pre> tag for this.

    20. Re:BR tag? by Bogtha · · Score: 1

      Why <em>? It's just as much of a mistake to use <em> when you want italics as it is to use <i> when you want emphasis. The best code to use is <p> elements within <blockquote> elements, and if you want to set the quote apart from your comment further, then use an <i> element within the <p> elements. But you aren't emphasising what he is saying, you are quoting it, so you shouldn't be using <em>.

      --
      Bogtha Bogtha Bogtha
    21. Re:BR tag? by Blakey+Rat · · Score: 1
      Stupid question: I have a website I'm building, and I thought it would look nice to have the user's names appear vertically on the left-hand side of their avatar image. (Hopefully this diagram avoids the lame filter:)
      B -------
      l |.....|
      a |.AVA.|
      k |.TAR.|
      e |.....|
      y -------
      How do I accomplish that without using a br tag after each letter? The last time my HTML knowledge was current was 1998, to give you an idea of where I stand here. (I'm sorry; I don't get paid to do web development, and keeping up with the huge amount of HTML, XHTML, CSS, JS, etc is just too hard.)
    22. Re:BR tag? by Blakey+Rat · · Score: 1

      No, for good semantically-sensible comments, you should be using the blockquote tag for the remarks of those to whom you are replying, as I have done here.

      I quote with italics because it's a ton quicker to type.

    23. Re:BR tag? by kchrist · · Score: 1

      I'm pretty sure that fits the "should almost never be used" thing, as most people don't write pages full of poetry

      You haven't looked at Geocities lately, I presume?

    24. Re:BR tag? by Kelson · · Score: 1

      So how does one properly mark up a haiku without using <br>?

      I'm curious.

    25. Re:BR tag? by Red+Alastor · · Score: 0, Offtopic

      I'd guess surrounding the name in a div and then with CSS making that div width equal to 1em and positioning it on the left of your avatar picture.

      --
      Slashdot anagrams to "Sad Sloth"
    26. Re:BR tag? by mrchaotica · · Score: 1

      Awww, ain't that cute... but it's wrong!

      Italics are for <em>emphasis</em>, not quoting. Please, won't you be semantically correct? Think of the children^W bots!

      --

      "[Regarding the 'cloud,'] ownership was what made America different than Russia." -- Woz

    27. Re:BR tag? by Metasquares · · Score: 2, Insightful

      Because I don't know if the user wants to enter a paragraph. What the user entered is a line break (that's what hitting return does), thus br is the tag to use. If the user wants to enter in a paragraph, he can enter his own p tag or skip a line (which is the default p tag behavior anyway) and the p tag will be used.

      My site is XHTML, so the closing tag is required (not that that's stopping me).
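
The convention described here (a single return becomes a line break, a skipped line starts a new paragraph) could look roughly like the following. It is a sketch under assumptions: the `plain_text_to_html` name is invented, all input is escaped rather than allowing any user-entered tags, and the XHTML-style `<br />` is chosen only because the site in question is said to be XHTML.

```python
import re
from html import escape


def plain_text_to_html(text):
    """Blank lines separate paragraphs; single newlines become <br />."""
    text = text.strip()
    if not text:
        return ""
    # One or more blank (or whitespace-only) lines end a paragraph.
    blocks = re.split(r"\n\s*\n", text)
    out = []
    for block in blocks:
        lines = [escape(line) for line in block.splitlines()]
        out.append("<p>" + "<br />\n".join(lines) + "</p>")
    return "\n".join(out)


converted = plain_text_to_html("a\nb\n\nc")
```

With this input the first two lines share one paragraph joined by a line break, and the text after the blank line gets a paragraph of its own.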

    28. Re:BR tag? by hixie · · Score: 1

      Yeah, that was a typo, we meant
      .

    29. Re:BR tag? by Anonymous Coward · · Score: 0

      May I play devil's advocate for a second? I use br in all its wrongfulness all the time on pages I know will not be around for the next 8 years because sometimes it's faster to type it. Is this wrong? Do I lose some geek points? I know XHTML but have never used it at work because if it takes me longer to write something because it doesn't validate my boss doesn't care.

    30. Re:BR tag? by CRCulver · · Score: 1

      you must have used a br

      No, I used the p tag and closed it at the appropriate time. That is why it is formatted the way it is.

      i dont see why we need more than html 3. its good enough for everyone.

      Except for the blind that need to browse the web with screenreaders. HTML 3 doesn't have the semantic tags that later versions of HTML brought.

    31. Re:BR tag? by Anonymous Coward · · Score: 0

      Uhhh... this coming from someone with a gazillion <br>'s on his home page.

    32. Re:BR tag? by Just+Some+Guy · · Score: 1
      Why <em>? etc.

      Sheer inertia. Because this is Slashdot and it's not semantic anyway. Because that's what everyone else does.

      You're correct; I won't argue that. In the real world I use tags much as you describe, but this isn't that world.

      --
      Dewey, what part of this looks like authorities should be involved?
    33. Re:BR tag? by DocOmega · · Score: 0

      The pre tag will work
      Read the pre-vious posts now
      Moonlight wanes pre-dawn.

      --
      Meh
    34. Re:BR tag? by masklinn · · Score: 1

      You will use the <br> tag, poetry is one of the few legit uses of this element (notice how I said that it should almost never be used, not absolutely never).

      Marking up address fields is also considered a legit use of the <br> tag btw

      --
      "The way we can tell it's C# instead of Haskell is because it's nine lines instead of two." -- wadler
    35. Re:BR tag? by mdecarle · · Score: 1

      Would you not, for some or most fonts, be able to fit 'il' into 1em?

      g
      il
      b
      e
      r
      t

      If so, that would not be a good enough solution.

    36. Re:BR tag? by Domo-Sun · · Score: 2, Interesting

      How do I accomplish that without using a br tag after each letter?

      I'd guess surrounding the name in a div and then with CSS making that div width equal to 1em and positioning it on the left of your avatar picture.


      Oh, and good luck getting it to work in all browsers. Gee whiz! What is the logic behind this? You have to wrap everything in DIVs and spans, then write a bunch of ridiculous code, and for what reason? So we can hold up to an irrationally strict, unintuitive standard.

      This is the opposite of accessibility. It's simply a waste of time for the author...

      Though, now that I think of it, this is not the best example of BR use, since screen readers would spell everyone's name out. Oh, so I'd have to say go with the 1em CSS box, maybe try a monospace font.

      And please, please try to use an existing box and try to avoid using DIV and SPAN, if you can.

      Oh, wouldn't the text just flow out, or under the box? I think you can't do this?

      I simply think some of the extreme concepts about how we should deprecate everything are failing to view the logic behind future potential uses, and forget how long it takes to actually get any tags to work universally, so let's just keep the tags we have, thank you. The same thing happened with italic, they said it should be deprecated and never used, then they came up with a few examples where it's needed.

      If you people want to crusade against something, maybe go after the people who use DIV class=heading. That annoys me when I try to make my user stylesheets. Oh, and since we're on the topic, slashdot is using .block for the slashboxes, and it's on that list of most used classes, so please put a unique ID in the body element, like #slashdot-front #slashdot-games. Something I could use to put it into context: #slashdot .block{display:none;}. If I block .block, then my userCSS messes up the rest of the web. I worked around this by blocking each box individually.

    37. Re:BR tag? by Red+Alastor · · Score: 1

      1 em, non-proportional font then.

      --
      Slashdot anagrams to "Sad Sloth"
  7. No GOTOs? by slashbob22 · · Score: 1, Redundant

    I was expecting a few GOTO commands.

    For Example:
    IF browser="IE" GOTO Spyware

    --
    Proof by very large bribes. QED.
    1. Re:No GOTOs? by the+computer+guy+nex · · Score: 4, Funny

      How about:

      IF(Post=Old_And_Tired) GOTO Mod_Down

    2. Re:No GOTOs? by Bloke+down+the+pub · · Score: 0

      you.fail(it);

      --
      It's true I tell you, feller at work's next door neighbour read it in the paper.
  8. Not complete by Anonymous Coward · · Score: 5, Funny

    It didn't have everything of course. Some elements were censored on behalf of the Chinese government.

    1. Re:Not complete by onedotzero · · Score: 1

      ...which reminded me of this image, just posted over at B3ta...

      --
      onedotzero
      thedigitalfeed.co.uk

  9. blink is more popular than http by Anonymous Coward · · Score: 0

    I question his results.

  10. what's the point of a 1 billion page sample? by ecklesweb · · Score: 3, Interesting

    I have to ask, what's the purpose of a 1-BILLION page sample? That's the beautiful thing about statistics. If you can say something about the distribution of characteristics within a population, you don't have to survey the entire population to get meaningful results. Are the study authors proposing that no standard distribution can be applied to the entire universe of web pages? If that's the case, then do the statistics they apply to their sample of one billion really say anything predictive about the entire population?

    Aside from the cool factor of saying they sampled a billion pages, I don't see what extra benefits are gained from that extra effort.

    1. Re:what's the point of a 1 billion page sample? by chrisd · · Score: 1
      Well, I'm guessing that 1000 was too small of a sample :-)

      --
      Co-Editor, Open Sources
      Open Source Program Manager, Google, Inc.
    2. Re:what's the point of a 1 billion page sample? by Anonymous Coward · · Score: 5, Informative

      You get a decrease of the variance of the mean.

    3. Re:what's the point of a 1 billion page sample? by Durinthal · · Score: 5, Insightful

      If you can have a larger sample, why not use it? It's more accurate that way.

    4. Re:what's the point of a 1 billion page sample? by Anonymous Coward · · Score: 0

      Yes but only proportional to the SQUARE ROOT of N!

    5. Re:what's the point of a 1 billion page sample? by metlin · · Score: 1

      I doubt Google was doing that just for the purposes of data gathering, though.

      Imagine - they were able to scale the system to process 1 BILLION webpages. That is a significant achievement, which means that somewhere in Google, they have the ability to not only gather and sort/search a lot of data, but also derive meaning from it (statistical or otherwise).

      That is a significant achievement.

      Data by itself becomes fairly pointless after a while, however finding relations and meaning within that data is what makes it hard. And doing so for large amounts of data is even better.

    6. Re:what's the point of a 1 billion page sample? by leuk_he · · Score: 1

      A billion is way beyond cool. Do you even understand how much a billion is? For a billion dollars you could buy your own small country; a billion bricks would build a tower that is unbelievably big. And so on.

      But that billion is the thing that is most interesting. The other part is just statistics that are fun, nothing more.

    7. Re:what's the point of a 1 billion page sample? by ChrisGilliard · · Score: 1

      I agree with the people who said that basically a billion sounds cool. I suppose you could use a million, but that would not be as cool as a billion. A billion is the new million. A company that is as media savvy as Google is understands this.

      --
      No Sigs!
    8. Re:what's the point of a 1 billion page sample? by haluness · · Score: 1

      Obviously I don't know whether Google performed more (or sophisticated) analysis on the billion pages. But if it is simply calculating sums and means, it's more a matter of time than sexy algorithms.

      I mean, just distribute the counting over processors - this problem seems trivially parallel

      But of course, I don't work for Google, so who knows what those wizards are doing with the stats!
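
To make the "trivially parallel" point concrete: per-element page counts from separate chunks of pages can be computed independently and then simply added. The sketch below is hypothetical throughout; `count_shard`, `count_all`, and the toy shards are invented, and a thread pool stands in for whatever processes or machines a real job would use.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_shard(pages):
    """Count, for one shard, how many pages use each element.

    Each page is given as a list of element names already extracted from it.
    """
    counts = Counter()
    for elements in pages:
        counts.update(set(elements))  # a page counts once per element
    return counts


def count_all(shards, workers=4):
    """Count every shard independently, then sum the partial Counters."""
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(count_shard, shards):
            total += partial  # merging is plain addition, in any order
    return total


# Toy data: two shards holding three pages in total.
shards = [
    [["title", "br", "br"], ["title", "table"]],
    [["title", "p", "br"]],
]
totals = count_all(shards)
```

Because the merge step is only addition, the shards never need to communicate while counting, which is what makes this kind of job easy to spread out.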

    9. Re:what's the point of a 1 billion page sample? by Musteval · · Score: 1

      According to Google, sqrt(1 000 000 000) = 31 622.7766

      Which is slightly larger than 1.

      --
      Note to mods: I'm probably being sarcastic.
    10. Re:what's the point of a 1 billion page sample? by metlin · · Score: 1


      I would think that tokenising and statistically analyzing such data would not be a trivial task for that large a sample.

      Then again, maybe someone from Google could tell us? (Chris?)

    11. Re:what's the point of a 1 billion page sample? by chrisd · · Score: 1
      So I'd like to preface my response to your question by saying that I don't want to sound like we are showing off here, but Google has invested a lot of time and resources into making this kind of thing somewhat trivial to do. From a computer science and cpu-time perspective, not so trivial, but we do have the available spare facilities to do this kind of thing as we like within a reasonable amount of time.

      Chris

      --
      Co-Editor, Open Sources
      Open Source Program Manager, Google, Inc.
    12. Re:what's the point of a 1 billion page sample? by Sebastopol · · Score: 1

      I thought google had like 10 billion pages archived?

      That would be 10%, which is still pretty large, I guess.

      IANAStatistician, and I never understood how a confidence interval isn't tied to the population size...

      Too weird.
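
One way to see why population size barely matters is the standard finite population correction for a simple random sample. The sketch below is illustrative only: `standard_error` is a made-up helper, the proportion of 0.5 is arbitrary, and the 10 billion population is just the guess from the comment above, not a known figure.

```python
import math


def standard_error(p, n, population=None):
    """Standard error of a sample proportion p from a simple random sample of n.

    If a population size is given, apply the finite population correction.
    """
    se = math.sqrt(p * (1 - p) / n)
    if population is not None:
        se *= math.sqrt((population - n) / (population - 1))
    return se


# A sample of 1,000 out of 10 billion: the correction is negligible.
small_ratio = standard_error(0.5, 1000, 10**10) / standard_error(0.5, 1000)

# A sample of 1 billion out of 10 billion (10%): the correction is sqrt(0.9).
large_ratio = standard_error(0.5, 10**9, 10**10) / standard_error(0.5, 10**9)
```

The correction only becomes noticeable once the sample is a sizeable fraction of the population, so for most surveys the interval depends almost entirely on the sample size.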

      --
      https://www.accountkiller.com/removal-requested
    13. Re:what's the point of a 1 billion page sample? by Bloke+down+the+pub · · Score: 0
      A billion is the new million.
      Like it. Continuing with the numerical inflation, what's the new googol?
      --
      It's true I tell you, feller at work's next door neighbour read it in the paper.
    14. Re:what's the point of a 1 billion page sample? by shoolz · · Score: 2, Informative

      Because with statistics, increasing the sample size does not result in a uniform increase in accuracy.

      If you start with a sample size of 1000 and add an additional 10000, the accuracy will increase dramatically. But if you start with 1,000,000,000, and increase it by another 1,000,000,000, the accuracy won't go up even by as much as 0.0001%

      Yes, I'm pulling the numbers out of the air, but the point is that there exists a sweet spot where the additional effort does not pay off.
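
The diminishing returns can be put in rough numbers with the usual normal-approximation margin of error for a proportion. This is a sketch under assumptions (a simple random sample, a worst-case proportion of 0.5, a 95% level with z = 1.96); `margin_of_error` is a hypothetical helper, not anything from the report.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from n samples."""
    return z * math.sqrt(p * (1 - p) / n)


# Margins for a few sample sizes, as fractions (0.031 means about 3.1 points).
margins = {n: margin_of_error(n) for n in (1_000, 11_000, 10**9, 2 * 10**9)}

# Going from 1,000 to 11,000 samples shrinks the margin by about 0.022;
# going from 1 billion to 2 billion shrinks it by less than 0.00001.
gain_small = margins[1_000] - margins[11_000]
gain_large = margins[10**9] - margins[2 * 10**9]
```

Since the margin scales with one over the square root of n, halving it always costs four times the sample, whatever the starting size.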

    15. Re:what's the point of a 1 billion page sample? by Anonymous Coward · · Score: 0

      You can extrapolate from a small sample to a population, provided you can draw randomly from that population, according to the probability distribution you care about, and provided the variance is relatively small; the larger the variance, the larger the sample you need. Given that the distribution of tags on, say, CNN pages (lots of images, video, etc) is hugely different from the distribution on Google pages (few images, much text), is hugely different from the distribution on ten year old personal homepages hanging around on a free hosting service somewhere, it's annoying to come up with a "pick randomly" method, and argue its correctness.

      Additionally, why bother? Statistics tells us that the sample is very likely to be very close to the full set. If you've got a computer doing the analysis, then the difference in effort between picking a sample and doing the whole thing is just the decision to wait ten minutes for a computer to churn versus waiting a week. If you're going to wait a couple months for the publication process, that week is irrelevant, and you can save yourself the effort of arguing why your sample is a good one.

      (For those who don't understand why you'd have to argue your sample is a good one: try estimating the average height of people at a university by randomly sampling ten people from the basketball team... then by randomly sampling ten people from the first class in the morning... then by randomly sampling the first ten people who arrive in the cafeteria. The first sample is very obviously biased. The second may have a subtle bias (if it's a women's studies class, it may be predominantly women, who tend to be shorter than men, for example). The third might have a bias, and the fact that I can't think of one offhand doesn't prove that it's an unbiased sample.)

    16. Re:what's the point of a 1 billion page sample? by poot_rootbeer · · Score: 1

      If you can have a larger sample, why not use it? It's more accurate that way.

      Because there's a point of diminishing returns.

      If a 1-million-page sample gives you 85% accuracy, and a 2-million-page sample gives you 95% accuracy, it may be worth the extra time and effort to process the 2-million-page sample. But if reaching 96% accuracy requires you to process 1 BILLION pages, it's probably not worth the time or the effort.

    17. Re:what's the point of a 1 billion page sample? by finelinebob · · Score: 2, Interesting

      A couple of people have pointed out that the larger the sample size, the less chance there is to attribute a meaningful difference to a situation that is actually a random fluctuation. That may be true, but I believe the point the parent is trying to make is that one of the key advantages of statistical modeling is that you can accurately model very large groups by studying very small samples of that group. If there was actually a need for this large a sample, then fine. Otherwise, the sample size is more sensational than informational.

      For example, many medical studies rely on samples of a couple thousand people. If that number is supposed to represent US citizens, then that sample size is roughly 0.001% of the population.

      To answer whether 1 billion cases is overkill or not, it would be helpful to know the size of their entire database -- how many individual web pages have they catalogued? How big was the sample size relative to the population? Another issue that might have influenced choosing such a large sample is the number of pages generated dynamically, using standardized templates. If Google has catalogued a corporate website that has several thousand pages all following the same template, do those pages act as unique, individual entries that should be given the same weight page-by-page as a site that has only 10 pages? How might the entire depository of, say, eBay.com or even Slashdot.org skew results? The large sample size may have been required to render such "cell" sizes irrelevant.

      Of course, seeing some numbers from their study would have been nice. If they reported p values of 0.00000001 then it would have been easy to say this was a case of overkill.

    18. Re:what's the point of a 1 billion page sample? by ColdDimSum · · Score: 1
    19. Re:what's the point of a 1 billion page sample? by 99BottlesOfBeerInMyF · · Score: 1

      But if reaching 96% accuracy requires you to process 1 BILLION pages, it's probably not worth the time or the effort.

      You're assuming it took significantly more work. They just pulled all of these from Google's cache, so the extra work may have been letting their script run overnight instead of for an hour in the morning. More pages will make it more accurate, and I'm sure they are better qualified to judge the work/reward trade-off than anyone not doing the project.

    20. Re:what's the point of a 1 billion page sample? by ACorrosionOfDeviants · · Score: 1

      DISCLAIMER: I'm an academic researcher in the social sciences. I often use statistics in my work.

      Other than the computational and data acquisition challenges, a larger sample is always preferred over a smaller one. In particular, confidence intervals are smaller -- in other words, estimates of population parameters are more precise; we know more with large samples, and what we know, we know with greater confidence.

      Although there are diminishing returns on increasing sample size (as other posters have noted), the only downsides of a larger sample are the costs of acquiring more data and the costs of analysis. Evidently, Google was willing to incur those costs.

      There are few absolutes in research, but this is one of them: there are no statistical advantages of small samples. Ceteris paribus, you should trust the results of a large-sample study more than those of a small-sample study -- at least, to the extent that you can "trust" any statistics.
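
      The shrinking-interval point above can be sketched numerically. This is only an illustration, assuming a simple random sample and the normal-approximation (Wald) interval for a proportion; the function name and the 75% figure are placeholders chosen for the example, not numbers taken from the report.

```python
import math

def ci_half_width(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of the normal-approximation 95% interval for a proportion.

    p is the observed proportion, n the sample size, z the normal quantile.
    """
    return z * math.sqrt(p * (1.0 - p) / n)

# p = 0.75 is an illustrative proportion, not a value from the study.
for n in (1_000, 1_000_000, 1_000_000_000):
    half_width = 100 * ci_half_width(0.75, n)
    print(f"n = {n:>13,}: +/- {half_width:.4f} percentage points")
```

      The half-width falls with the square root of n, so every 100-fold increase in sample size buys one extra decimal digit of precision. That is the diminishing-returns curve other posters describe, and it never turns upward.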

    21. Re:what's the point of a 1 billion page sample? by Hognoxious · · Score: 1
      what's the new googol?
      The googolplex? Unless that's the old new googol.
      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    22. Re:what's the point of a 1 billion page sample? by Anonymous Coward · · Score: 0

      I have to ask, what's the purpose of a 1-BILLION page sample?

      The purpose is to show off: my sample is bigger than yours and I work for Google:)

    23. Re:what's the point of a 1 billion page sample? by masklinn · · Score: 1

      it's probably not worth the time or the effort.

      Time? Effort? Dude, it's Google we're talking about: once the script to collect the data is done, the amount of data processed is irrelevant. The guys probably dumped a pair of scripts on the fucking Google server, ran some queries and got the data back. There's no effort involved. It would be cool to get some stats on the work itself, but I doubt they needed more than a day to pull it out.

      Creating the SVG graphics is probably what actually took the most effort...

      --
      "The way we can tell it's C# instead of Haskell is because it's nine lines instead of two." -- wadler
    24. Re:what's the point of a 1 billion page sample? by masklinn · · Score: 1

      The googlebots parse (tokenize and analyze) HTML already; that's why they're able to understand that what's in a page title is important, and what's in an should be given more weight than what's in an .

      And then they feed the HTML-less text to the full text engine, which is clearly much more complex than an SGML parser.

      They've been tokenizing and storing that kind of data forever; it's just sitting in their datacenters waiting for someone to pull it out to create that kind of stats.

      --
      "The way we can tell it's C# instead of Haskell is because it's nine lines instead of two." -- wadler
    25. Re:what's the point of a 1 billion page sample? by Anonymous Coward · · Score: 0

      At my school, the first people to the cafeteria are short fat girls, ergo bias.

    26. Re:what's the point of a 1 billion page sample? by Anonymous Coward · · Score: 0

      That's the square root of 1 billion, not the variance from the mean.

    27. Re:what's the point of a 1 billion page sample? by jerkmonster · · Score: 1

      confidence intervals ARE linked to population size. the variance of a consistent estimator goes down to zero, meaning that the confidence interval asymptotically reduces to a spike around the true parameter in question. i don't know whether or not a billion is enough to approximate asymptotically, but my guess is that it is.

    28. Re:what's the point of a 1 billion page sample? by kfg · · Score: 0, Offtopic

      If you need to cut a foot long piece of thread off the spool and you have a micrometer, why not use it?

      KFG

    29. Re:what's the point of a 1 billion page sample? by Phleg · · Score: 1

      A couple of people have pointed out that the larger the sample size, the less chance there is to attribute a meaningful difference to a situation that is actually a random fluctuation. That may be true, but I believe the point the parent is trying to make is that one of the key advantages of statistical modeling is that you can accurately model very large groups by studying very small samples of that group.
      Right, but you're confusing the means and the end. The reason we use smaller samples and apply statistics to them is because collecting data about massive numbers of samples is difficult. If Google can collect data for one billion webpages with little more effort than ten thousand, then they should. Statistics is only a means towards the end of collecting data for large groups, by extrapolating the values of large groups from a relatively small sample size. If, however, we have access to the entire dataset, then there's no point in testing a subset of it.
      --
      No comment.
    30. Re:what's the point of a 1 billion page sample? by garberian · · Score: 1
      Aside from the cool factor of saying they sampled a billion pages??

      Are you kidding me?? Statistics aside, I think it's friggen awesome that some guy actually found the time to sample one billion web pages. (Or even had a script do it for him...he still had to assimilate the data.) Think of the tons of time it must have taken, as compared to what a normal sample (95% confidence, p equal to/less than .05, etc.) would have.

      But seriously, if the summary wouldn't have said the number billion, I probably wouldn't have cared about it. ;)

    31. Re:what's the point of a 1 billion page sample? by nusuth · · Score: 1

      According to my chemistry book's statistics, if you have a priori knowledge of the distribution the sweet spot is twenty-something. Even 30 is almost synonymous with infinity.

      --

      Gentlemen, you can't fight in here, this is the War Room!

    32. Re:what's the point of a 1 billion page sample? by tgv · · Score: 1

      The point is: large numbers of rare events. In statistical linguistics (which is where I do/did some of my work), 1000 words can give you some clue, but anything less than a million is simply not sufficient, and studies show that every sample still misses out on rare events, since the total number of rare events is simply huge.

      Another point is: representation. If you pick a thousand web sites, how can you be sure that they are representative of the entire population? With a billion, you can get accurate distributions over collections of 1000 pages instead of just a single number.

      Suppose you wanted to see which character is most frequently used per top level domain. If you took 1000 pages, chances are that your sample would contain only a few or even no pages for certain domains. That would make your statistics highly unreliable.

      So, yes, 1.000.000.000 is a good place to start. And that's why it's cool...
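
      The rare-domain argument above can be put into numbers. A rough sketch, assuming pages are drawn independently at random and that some hypothetical top level domain accounts for one page in 10,000; that share and both function names are made up for illustration.

```python
import math

def prob_no_pages(share: float, sample_size: int) -> float:
    """Probability that a random sample contains no page from a category
    that makes up `share` of the population, i.e. (1 - share) ** sample_size.

    Computed via log1p/exp so that very large samples stay numerically stable.
    """
    return math.exp(sample_size * math.log1p(-share))

def expected_pages(share: float, sample_size: int) -> float:
    """Expected number of sampled pages that fall in the category."""
    return share * sample_size

RARE_SHARE = 1e-4  # hypothetical: one page in 10,000 belongs to the rare domain

for n in (1_000, 1_000_000, 1_000_000_000):
    print(f"n = {n:>13,}: expect {expected_pages(RARE_SHARE, n):>9.1f} pages, "
          f"P(none at all) = {prob_no_pages(RARE_SHARE, n):.4f}")
```

      Under these assumptions a 1000-page sample misses the rare domain entirely about nine times out of ten, while a billion-page sample yields on the order of a hundred thousand pages for it, enough to compute per-domain statistics.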

    33. Re:what's the point of a 1 billion page sample? by DocOmega · · Score: 0
      The googolplex? [sic]

      No, the Googleplex is in Mountain View. :P

      --
      Meh
    34. Re:what's the point of a 1 billion page sample? by anpe · · Score: 1

      I presume that some background on how this is technically possible _and_ simple enough to implement is available here ?

    35. Re:what's the point of a 1 billion page sample? by Moderatbastard · · Score: 0

      [sic] is used to indicate a spelling mistake in a quoted passage. There is no spelling mistake in the post you replied to, only a rather bad pun.

      --
      1/3 of jokes get modded OT. If you get the joke, mod 1 in 3 insightful/interesting/underrated to restore karma balance.
    36. Re:what's the point of a 1 billion page sample? by in10se · · Score: 1

      If you consider that google has crawled over 10 billion web pages, we are already down to a 10% sample. However if you consider the billions and billions of additional uncrawled pages from intranet sites to small sites with no incoming links and sites with their robots.txt or meta tags set to nocrawl or nofollow, 1 billion actually is a very small sample of the web.

      --
      Popisms.com - Connecting pop culture
  11. dude by dotpavan · · Score: 1

    I am still at the 22nd page, lot more to go (1 billion? OMG!).. see you all there

  12. Cool statistics by mendaliv · · Score: 1

    Their study on the <img> element is quite interesting.

    3/4 of the parsed pages use alt text with their <img> tags, and about 10% use image maps... which I find a little scary. I haven't seen an image map in years.

    1. Re:Cool statistics by XFilesFMDS1013 · · Score: 1

      I took a webpage class not too long ago, within the last 12 months at least, and one of the things we had to do was to make an image map. It was worthless, well, not so much the idea, a clickable image is a nice thing to look at and all, but besides writing below it that you're supposed to click on a certain part, how the hell are people supposed to know what to do? And if you're going to write on the page anyway, just make some links, they're a lot nicer, and you don't have to load an image, not fun for those people on dial up (yes, they still exist, and some don't even have 56K). Overall, that class was worthless.

    2. Re:Cool statistics by Anonymous+Brave+Guy · · Score: 1

      I can't quite agree with that. Like frames and Flash, image maps have their place, and can be a useful tool in the right circumstances. It's just that those circumstances happen rather more rarely than image maps are used in practice, which gets them a rather unfair bad reputation. I've seen a couple of nicely presented graphical site maps that were much easier to understand than a big nested list, and relied on image maps to link through to the content pages, for example.

      --
      If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
    3. Re:Cool statistics by Anonymous Coward · · Score: 0

      How about a map where you can click on different counties to select one? Or a pie chart - and you can click on a segment to get more details. Etc.

    4. Re:Cool statistics by Anonymous Coward · · Score: 1, Informative

      Image maps are often used on banner ads. I would guess that this is the main reason why they are so popular in this analysis.

  13. well this is new by Abstract_Me · · Score: 1, Funny

    we haven't slashdotted the google server... but it would appear that the firefox download site for extensions is.

  14. \. shows up in the Web Authoring Statistics by digitaldc · · Score: 4, Funny

    The 'br' element

    The br element is a simple one, yet used on so many pages that it is the 8th most-used element. It is used more than the p element.

    clear, style, class, soft, id, and \.


    Wow! I never knew you guys were that popular.

    --
    He who knows best knows how little he knows. - Thomas Jefferson
    1. Re:\. shows up in the Web Authoring Statistics by shrikel · · Score: 5, Funny
      You're confused. Backslashdot is across the street.

      (sheesh)

      --
      Any sufficiently simple magic can be passed off as mere advanced technology.
    2. Re:\. shows up in the Web Authoring Statistics by hagrin · · Score: 1

      Just an honest question - why is it forward slashdot? Backslashdot just not as cool? Programming reason why?

    3. Re:\. shows up in the Web Authoring Statistics by dragonman97 · · Score: 1

      It's because of URLs and *nix paths. i.e. H-T-T-P-Colon-SLASH-SLASH-www-DOT-slashdot-DOT-org . It's kind of intended to mess with people's heads, and in general, CmdrTaco thought it was cool. :)

      Also, this site used to have far more of a Linux bias, and while one typically references './' instead of '/.' (which is pretty much meaningless), it's kind of a mix of familiar characters. *sigh* I miss those days around here...

  15. Google is good today. by hey · · Score: 1

    Not just non-evil. This is useful and interesting stuff.

  16. Best bash I've seen in a long time: by Benanov · · Score: 4, Funny
    From TFA, the classes page:

    The rest of the top 20 classes are either presentational or otherwise meaningless (msonormal, for example, which is one of the classes that Microsoft Office uses in its "HTML" output).
    1. Re:Best bash I've seen in a long time: by HeroreV · · Score: 1

      Amazing what a pair of quotation marks can do.

  17. With apologies to Warren Zevon by Tackhead · · Score: 1
    > As part of a recent examination of the most popular html authoring techniques, my colleague Ian Hickson parsed through a billion web pages from the Google repository to find out what are the most popular class names, elements, attributes, and related metadata.

    "Unfortunately, it was also of significant interest to the DOJ, who wanted to know how many times the word 'boobs' appeared in the first 50 characters after the string "IMG SRC". Because we didn't actually look for this data, and because the DOJ folks didn't believe us when we told them so, we're now enjoying a taxpayer-funded vacation in sunny Cuba."

    > We decided that to publish this would be of significant utility to developers

    whom we would encourage to send lawyers, guns and money; the blink tag now encloses the rotating ad banner.

  18. Some of these results... by Dracos · · Score: 3, Insightful

    Prove that most people (and WYSIWYGs) don't know how to produce valid and accessible markup. The img alt attribute (an accessibility requirement) was found significantly less than width, height, and border.

    I'm working on a site now where the project owner is continually reducing usability and accessibility of the entire site (Never mind that he secretly had a third party come up with an ugly design and ambushed the dev team with it).

    I keep telling everyone to deconstruct the adage "form follows function". It means function comes first. He doesn't care what anything *is* or how it *works*, only what it looks like. And, of course, that it's ugly.

    1. Re:Some of these results... by DeafByBeheading · · Score: 1

      He doesn't care what anything *is* or how it *works*, only what it looks like. And, of course, that it's ugly.

      Not only that, but HTML is not meant to let you determine how things look. You know that, of course, but he doesn't seem to. HTML is all about what things are, and not how they look. How they look is the browser's job, and not only is there plenty of room for differing interpretations in the web standards, but there are things such as end-user font size changes that just can't be predicted. No design should break because a user wants to read the font at two points larger (but many do, most hideously); that's the point of HTML.
      --
      Telltale Games: Bone, Sam and Max
    2. Re:Some of these results... by pimpimpim · · Score: 1
      Actually I looked at quite a few pages with lynx and w3m, and I really don't think the img alt attribute is a helpful thing per se. Especially when you end up with a page saying "image" "image" "image" "image" "click here".

      With image names that are at least descriptive I can come a lot further.

      "menu.jpg" "links.jpg" etc...

      --
      molmod.com - computing tips from a molecular modeling
  19. SVG, uh. by Janek+Kozicki · · Score: 1

    so I'm using debian sarge, and oh well - flame about dozens of other distros, but currently I'm too lazy[1] to update to etch, or anything else. And in sarge there is firefox 1.0.4 without SVG. Does anyone know of any backported debs for sarge that will provide SVG support?

    [1] everything is about priorities, I spend some time reading /. but in fact I have some work to do, and this work is not switching linux distros around.

    --
    #
    #\ @ ? Colonize Mars
    #
    1. Re:SVG, uh. by Janek+Kozicki · · Score: 1
      --
      #
      #\ @ ? Colonize Mars
      #
  20. Dumb by Anonymous Coward · · Score: 0


    It's really dumb to present pictures with Flash.

    1. Re:Dumb by Spad · · Score: 4, Insightful

      It's even dumber to state that someone is presenting pictures with Flash when they're actually using SVG.

  21. Ad for anti-IE by jamienk · · Score: 4, Insightful

    It looks like a subtle push against IE: many mentions of the HTML 5 spec (which is being written by the WHATWG, a workgroup that includes many browser companies but not MS); use of SVG; written by a major FF developer.

    Way to go Google! Pour on the pressure!

    1. Re:Ad for anti-IE by Bogtha · · Score: 3, Informative

      written by a major FF developer

      I don't believe Ian Hickson has been involved with Firefox; if I remember correctly, he used to hack on Mozilla, but then started work at Opera before Firefox took off.

      I don't think it's a jab at Internet Explorer, it's just that he knows that the target audience is likely to have a decent browser, so he's used the features likely to be available.

      --
      Bogtha Bogtha Bogtha
  22. Benford's Law by SIGFPE · · Score: 1

    I'm curious to see how closely Benford's Law is followed by these pages. It should be easy for Google to run the stats.

    --
    -- SIGFPE
    1. Re:Benford's Law by EvanED · · Score: 3, Interesting

      I had an interesting run-in with Benford's law a bit ago. I had this typed up already, so here goes (description of the law omitted; read the Wikipedia link in the parent -- it's really cool):

      You see, my hard drive crashed about two weeks ago. It had three partitions on it, and two of them are still perfectly readable. The third is pretty well shot. (Fortunately, it was the most useless partition; its main content was Windows itself. This does mean ANOTHER Windows installation -- after having to do one a few weeks before -- but really that's no biggie compared with my actual data. And while I'm on that subject, I had two hard drives; when I got the newer one, I put all my work stuff on it as well as a new Linux installation specifically because it was less likely to fail, and I look back at that decision now with great happiness, because it is that foresight that has made this no big deal at all.)

      I've been trying to recover data off of the third partition, and it seems that if you do a full scan of the partition it appears as if the data was just deleted. Most of the time it's able to recover information, but not always: folder names are often lost. They show up in the recovery programs I tried as just Folder2393 for example. (Numbers ranged from 2 to 5 digits.)

      The folder numbers approximately follow Benford's law.

      Here is the approximate distribution:
      M.S. digit   % of folders   Ideal Benford %
      1            32             30.1
      2            15             17.6
      3            12             12.5
      4            12              9.7
      5            19              7.9
      6             3              6.7
      7             3              5.8
      8             2              5.1
      9             2              4.6
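
      For anyone wanting to check the "ideal" column: Benford's law gives the expected share of leading digit d as log10(1 + 1/d). The sketch below recomputes that column and prints it next to the observed folder percentages quoted above; the function name is just a label for this example.

```python
import math

def benford_percent(digit: int) -> float:
    """Expected percentage of values whose most significant digit is `digit`."""
    return 100.0 * math.log10(1.0 + 1.0 / digit)

# Observed folder percentages, copied from the table in the post.
observed = {1: 32, 2: 15, 3: 12, 4: 12, 5: 19, 6: 3, 7: 3, 8: 2, 9: 2}

for digit, seen in observed.items():
    ideal = benford_percent(digit)
    print(f"{digit}: observed {seen:>2}%  ideal {ideal:4.1f}%  diff {seen - ideal:+5.1f}")
```

      The nine ideal percentages sum to 100, since the product of (1 + 1/d) for d = 1..9 is exactly 10.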

  23. Good God in Heaven by Run4yourlives · · Score: 1

    Some choice tidbits FTA:

    For example, looking at what HTML ids and classes are most common, and at how many sites validate (and yes, we know that we're not leading the way in terms of validation).

    There are more elements (from Microsoft Office) on the Web than there are elements.

    If someone can explain why so many pages would use a tag and then not put any cells in it, please let us know.

    Web "professionals" (and I am one of that group) have got a long, long, long way to go before we're actually taken seriously, it seems, as coders.

    1. Re:Good God in Heaven by aitan · · Score: 1
      There are more <o:p> elements (from Microsoft Office) on the Web than there are <h6> elements.
      And that's without taking into account the st1: tags (Smart Tags from MS Word), which they claim not to know the origin of: http://code.google.com/webstats/2005-12/editors.html
    2. Re:Good God in Heaven by Anonymous Coward · · Score: 0

      Web "professionals" (and I am one of that group) have got a long, long, long way to go before we're actually taken seriously, it seems, as coders.

      I hate to be the one to break this to you, but HTML is a markup language, used specifically for formatting, there's no logic involved at all, so that may be why it is relatively unimpressive. Asking a group of HTML writers to all make a page that looks a certain way will produce large groups of pages that all have the same tags all arranged the same way, in fact there might only be one way to create the page at all. But that isn't the case if you ask a group of C coders to all write a program that performs the same task. In that case, you will get a lot of creativity, and that's sort of what HTML is lacking, and the reason why you can use a program to write HTML for you (not that I do, because I prefer the HTML I produce to be compliant).

  24. Bah... by Run4yourlives · · Score: 2, Interesting

    Again, properly formatted this time:

    For example, looking at what HTML ids and classes are most common, and at how many sites validate (and yes, we know that we're not leading the way in terms of validation).

    There are more <o:p> elements (from Microsoft Office) on the Web than there are <h6> elements.

    If someone can explain why so many pages would use a <table> tag and then not put any cells in it, please let us know.

    Web "professionals" (and I am one of that group) have got a long, long, long way to go before we're actually taken seriously, it seems, as coders.

  25. Not so fast - I'm pulling up mostly blank pages... by xxxJonBoyxxx · · Score: 1

    Not so fast - I'm pulling up mostly blank pages...

    Classes

    How many different class names do pages use? Well, most pages apparently don't use the class attribute at all, and it's downhill from there:

    (nothing for about 15 lines)

    Which class names are used on the most pages? Here are the top 20:

    (nothing for about 15 lines)

    This actually maps very well to the elements that are being proposed in HTML5:

    etc...

  26. Re:Strangely... by Anonymous Coward · · Score: 0

    Working fine here (Linux version).

  27. Why couldn't the Justice Department do this by phpsocialclub · · Score: 1

    With all of this talk of the justice department requesting records from Google.

    Why could they not just use this method to get their data?

  28. Re:Strangely... by onedotzero · · Score: 1

    They showed up fine for me. I had to upgrade (installed version was 1.07) but they certainly loaded.

  29. One thing that screws up web page studies by MonkeyBoyo · · Score: 1

    One thing that screws up web page studies is that some sites duplicate pages hundreds or thousands of times.

    Oliver Steele did a cute study on how to spell aargh.

    Unfortunately, much of his data is screwed up because he counted pages for each spelling, not unique pages.

    For this study, I don't see this problem occurring.
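
    To make the "pages versus unique pages" distinction concrete, here is a minimal sketch that collapses exact duplicates by hashing page content. It is only an assumption about how one might do it: it catches byte-identical copies, not near-duplicates, and it is not a description of what Google or Oliver Steele actually did. The function name and sample strings are hypothetical.

```python
import hashlib

def count_unique_pages(pages: list[str]) -> int:
    """Count pages with distinct content, treating byte-identical copies as one."""
    seen = set()
    for body in pages:
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
        seen.add(digest)
    return len(seen)

# Hypothetical crawl: one page mirrored three times plus one original page.
crawl = ["<p>aargh</p>", "<p>aargh</p>", "<p>aargh</p>", "<p>aaargh</p>"]
print(len(crawl), "pages fetched,", count_unique_pages(crawl), "unique")
```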

    1. Re:One thing that screws up web page studies by Anonymous Coward · · Score: 1, Funny
      aaaarrrrggghhhhh!

      I just changed the results... now he has to redo all those pretty colors...

    2. Re:One thing that screws up web page studies by Anonymous Coward · · Score: 0

      That's actually the interesting thing about having Google do this study: they very likely have some pretty sophisticated techniques for determining page similarity, so they could reduce their sample sizes significantly. I'm sure they ran this over their indexed data, which I take to mean that the amount of real data Google has is a lot less than the 8 billion. Sure, it's 8 billion pages, but I'd bet the number of unique pages is closer to the 1 billion used in this study...

  30. Now that's what I call... by SIGFPE · · Score: 1

    ...irony!

    --
    -- SIGFPE
  31. You're missing the most obvious statistic by Baldrson · · Score: 1, Interesting
    A lot of work has been done on the power laws of (possibly misnamed) "scale free" networks. The simplest is the law that says the frequency of a symbol is inversely proportional to its frequency rank. In other words, the most frequently occurring symbol appears twice as often as the second and three times as often as the third most frequently referenced symbol.
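
    The inverse-rank law described above can be written out in a few lines. This is a sketch of the idealized law only (exponent exactly 1, finite number of symbols); real data would have to be fitted, and the function name is made up for the example.

```python
def zipf_shares(num_symbols: int) -> list[float]:
    """Idealized share of all occurrences for each rank, with frequency ~ 1/rank."""
    weights = [1.0 / rank for rank in range(1, num_symbols + 1)]
    total = sum(weights)
    return [weight / total for weight in weights]

shares = zipf_shares(5)
for rank, share in enumerate(shares, start=1):
    print(f"rank {rank}: {100 * share:.1f}% of occurrences")
```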

    Most of the work on this, in the case of the WWW, concerns the frequency with which pages are hyperlinked. A lot of work has been done on hyperlinking without access to the exhaustive database used by Google. I know that Google's business model started with rank ordering pages in their results by how often they were href'ed elsewhere, so the data is obviously there, and it wouldn't be a serious imposition on their proprietary information to publish an analysis of the href power law.

  32. <title> is NOT more popular than <br> by Anonymous Coward · · Score: 1, Insightful

    Whilst <title> may appear on more distinct pages,
    surely <br> is used more frequently in the aggregate; that is, the multiplicity of occurrences of <br>
    on many pages far exceeds the single(?) occurrence of <title> on most pages.

    1. Re:<title> is NOT more popular than <br> by grimJester · · Score: 1

      Ah, that makes sense. I was wondering why the average table had one row and one cell.

  33. Opera also supports SVG by TheJavaGuy · · Score: 4, Informative

    FYI, Opera also supports SVG. I'm surprised that Ian Hickson didn't have Opera also mentioned on that Google page, after all he worked at Opera until a few months ago.

    --
    Opera Watch - An Opera browser blog.
  34. Re:Strangely... by Maskull · · Score: 1

    Same here. A few show up, but most are blank. Suggestions, anyone?

  35. Please read your own article... by Anonymous Coward · · Score: 0

    For instance one thing that surprised me was that the "title" is more popular than "br".

    Err...this isn't a count of the number of times an attribute is used in a page. It's a count of the number of pages that make use of an attribute.

    A page using "title" once and "br" 10 times will show once in each column.

    More pages have titles than contain at least 1 br tag. Given that a nonzero number of pages are ads, images, or otherwise have no text, why would this surprise you?
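
    The two ways of counting can be shown side by side. A small sketch with made-up pages, each represented as a list of the element names it contains; the function name and the sample data are illustrative only, not the report's method.

```python
from collections import Counter

def count_elements(pages: list[list[str]]) -> tuple[Counter, Counter]:
    """Return (pages_using, total_uses) for a list of pages.

    pages_using[tag] counts pages that contain the tag at least once;
    total_uses[tag] counts every occurrence across all pages.
    """
    pages_using: Counter = Counter()
    total_uses: Counter = Counter()
    for tags in pages:
        total_uses.update(tags)        # every occurrence counts
        pages_using.update(set(tags))  # each page counts at most once per tag
    return pages_using, total_uses

# Three hypothetical pages.
pages = [
    ["title", "br", "br", "br"],
    ["title", "p"],
    ["title", "br", "br"],
]
pages_using, total_uses = count_elements(pages)
print("pages using:", pages_using["title"], "title vs", pages_using["br"], "br")
print("total uses: ", total_uses["title"], "title vs", total_uses["br"], "br")
```

    In this toy data title wins on the per-page count (3 pages to 2) while br wins on total occurrences (5 to 3), which is exactly the distinction being made.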

    BTW, slashdot appears neither to respect unrecognized tags nor usual escape sequences like &lt sometext &gt...

  36. TITLE vs. BR by HTH+NE1 · · Score: 1

    For instance one thing that surprised me was that the <title> is more popular than <br>

    I'm not surprised. The TITLE container is required for every HTML page to be considered valid across all versions and is the most important text on the page, used by search engines to link to the page. Though browsers will accept pages without it, you'd be a damn fool not to use it.

    BR is optional and generally unnecessary when P handles your general hard line breaking needs. Even with TITLE appearing once, only once, and no less than once per page while there can be several BR tags on a page, BR is generally omissible. I'd expect overuse of BR to be more common on blogs that don't bother to detect paragraphs.

    Now if it were TITLE vs. TR there'd be no contest.

    --
    Oh, say does that Star-Spangled Banner entwine / The myrtle of Venus with Bacchus's vine?
    1. Re:TITLE vs. BR by ergowa · · Score: 1

      Yeah, why is this surprising? Title is also used if I bookmark a page. When would I *not* want to use it?

  37. Heh by Z0mb1eman · · Score: 3, Interesting
    This reminds me of the old joke that there only ever was one 'make' script, and everyone else modified it.

    I wonder how much of what they found is influenced by how people learned to write HTML - which in all likelihood was to copy code from existing pages... might explain parts of what they found, such as:

    Most people (roughly 98%) include head, html, title and body elements. This is somewhat ironic, since three of those four elements are optional in HTML
    --
    ClutterMe.com - easiest site creation on the Net. Just click and type.
    1. Re:Heh by Blink+Tag · · Score: 2, Informative
      Most people (roughly 98%) include head, html, title and body elements. This is somewhat ironic, since three of those four elements are optional in HTML

      Somewhat true. The HEAD tag is technically optional (per spec), but TITLE is required, and must be in the HEAD. Thus HEAD is required in practice.

      From the HTML 4.01 spec:

      Every HTML document must have a TITLE element in the HEAD section.

      Though marked as "start tag optional"/"end tag optional", the BODY and HTML tags do provide useful semantic relevance.

    2. Re:Heh by Bogtha · · Score: 1

      Most people (roughly 98%) include head, html, title and body elements. This is somewhat ironic, since three of those four elements are optional in HTML

      This isn't true. Everybody includes head, html, title and body elements. And all four elements are required, not optional. However, the opening and closing tags for the head, html and body elements are optional.

      Just because you can't see any tags, it doesn't mean the elements aren't there. Take a look in a DOM inspector, or style them with CSS, you'll soon realise they are there. And it isn't your browser correcting for bad markup, that's exactly how an HTML user-agent should parse documents without opening or closing head, html and body tags.

      --
      Bogtha Bogtha Bogtha
    3. Re:Heh by Kunta+Kinte · · Score: 1
      Somewhat true. The HEAD tag is technically optional (per spec), but TITLE is required, and must be in the HEAD. Thus HEAD is required in practice.

      Maybe the spec is saying that if you have the optional head you must have a title tag in it?

      --
      Based on upvotes, Ageism is the only "-ism" Slashdotters care about and think isn't SJW
    4. Re:Heh by Anonymous Coward · · Score: 0

      No. is implied until you get to - see http://annevankesteren.nl for a valid HTML page with a <title> and no <head>.

    5. Re:Heh by hixie · · Score: 1

      Yeah, you're right. I should have been more careful in my wording here. Still, you got the point I was trying to make! :-)

    6. Re:Heh by icepick72 · · Score: 1

      I got the point that the tags are optional for the programmer to input. One point: I can see HEAD standing above the others (no pun intended) as far as being close to a requirement as any for web page designers when using DHTML because client-side scripting is processed inside the HEAD first before the BODY scripting is executed. This, a programmer can be sure of. One of the few things you can be sure of when JavaScripting is concerned. Now somebody prove me wrong ....

  38. Font still popular by superflippy · · Score: 2, Interesting

    In their list of the 19 most popular elements, the font tag was #16. This element was deprecated when, back in 2000 or so?

    Of course, there may have been a lot of old pages in the sample, or pages built with older versions of HTML. But I've seen first-hand people using font tags to make an error message red, for example, even in a page that's using XHTML 1.0. I try to explain to the developers I work with why they shouldn't use them. I remove the font tags when those same developers add them to pages I've laid out for them. Zombie-like, they refuse to die.

    --
    Your fantasies contain the seeds of important concepts.
    1. Re:Font still popular by poot_rootbeer · · Score: 1

      I've seen first-hand people using font tags to make an error message red

      You know, there's something to be said for the straightforwardness of the "Font. Color. Red. Do it." approach.

      With CSS, the developer has to decide whether to set the color as an inline style, as a page-defined style, or as part of an external stylesheet. Whether to apply that style to an existing element containing the error message, or to wrap the error text in a new SPAN element. Whether the CSS style should be applied based on tag name, class, id -- maybe a convoluted combination of all three.

      Font tags still Just Work.

    2. Re:Font still popular by Bogtha · · Score: 1

      In their list of the 19 most popular elements, the font tag was #16. This element was deprecated when, back in 2000 or so?

      The <font> element type was deprecated in 1997.

      That's actually the year before Google was incorporated. Just think, an entire multi-billion-dollar international corporation has been created and grown to dominate the search engine market in less time than it takes for one annoying element type to die.

      I bet if Microsoft announced that the next version of Internet Explorer was going to disregard the <font> element type, it would be almost dead within six months. And it would be less code to maintain for Microsoft as well.

      --
      Bogtha Bogtha Bogtha
    3. Re:Font still popular by Blakey+Rat · · Score: 1

      Font tags are easy to include, even an HTML idiot like me can do it, and they work in all browsers. Why do you hate them so much?

    4. Re:Font still popular by WWWWolf · · Score: 2, Insightful
      You know, there's something to be said for the straightforwardness of the "Font. Color. Red. Do it." approach.

      I don't know. I rather prefer the straightforwardness of "This is a title. You know how to format it." approach.

      With FONT tags, you need to specify the font and color on a single passage of text. Then on another. And then another. And then another. And for the good measure, just another. And by the way, one more. And that one too. And that one there, even when you just described that other one back there to have the exact same font and color. Oh, and that one too. And almost forgot that one there.

      After Netscape & IE 4 died, CSS just works.

  39. table with no by saigon_from_europe · · Score: 4, Informative
    From the article:
    If someone can explain why so many pages would use a <table> tag and then not put any cells in it, please let us know.
    I don't know if they counted dynamic pages, but I guess they did. In dynamic pages, an empty table is quite normal.

    Your code usually goes like this:
    <table>
    <% for each element in collection %>
    <tr><td> something </td></tr>
    <% end for %>
    </table>

    So it is quite easy to get the empty table if the collection is empty.
    --
    No sig today.
    1. Re:table with no by mblase · · Score: 1

      I don't know if they counted dynamic pages, but I guess they did. In dynamic pages, an empty table is quite normal.

      I doubt it. This is from Google, which only searches the server's output, not the uncompiled code.

    2. Re:table with no by Bogtha · · Score: 1

      I've only ever seen that type of code in exceptionally crappy web applications, because if you have nothing to put in the table, you usually want to put a message like "Sorry, no products matched your query" or similar into the page. If you are already putting the conditional in place depending on whether there is data or not, then all you have to do is put the opening and closing table tags inside the conditional instead of outside it.

      --
      Bogtha Bogtha Bogtha
    3. Re:table with no by Kelson · · Score: 1

      Um... if you've decided to use only table headers (<th>)? I don't know.

    4. Re:table with no by kchrist · · Score: 1
      Of course, but if the collection in the grandparent post is empty, you will end up with this in your HTML output:
      <table>
      </table>
    5. Re:table with no by evand · · Score: 1
      Or, you know, the developer could always do something like
      <% if collection.size > 0 %>
      <table>
      <% for each element in collection %>
      <tr><td> something </td></tr>
      <% end for %>
      </table>
      <% end if %>
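
      The two template variants discussed in this thread can be sketched in plain Python. This is only an illustration: the function names are made up for the example, and the fallback text is the "no products matched" message suggested elsewhere in the thread.

```python
from html import escape


def render_table(rows):
    """Naive template: always emits the table wrapper, even with no rows."""
    body = "".join(f"<tr><td>{escape(str(row))}</td></tr>" for row in rows)
    return f"<table>{body}</table>"


def render_table_guarded(rows, empty_message="Sorry, no products matched your query"):
    """Guarded template: emits a message instead of an empty table."""
    if not rows:
        return f"<p>{escape(empty_message)}</p>"
    return render_table(rows)
```

      With an empty collection the naive version still produces a bare table element, which is presumably what a crawler would then see in the server's output.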
    6. Re:table with no by Anonymous Coward · · Score: 0

      Smarty (a PHP templating system) has built-in features specifically for these circumstances.

      {foreach from=$someList item=thisItem}
      {$thisItem}
      {foreachelse}
      Alas! Thar was nothun'
      {/foreach}

    7. Re:table with no by mattwarden · · Score: 1

      I'll refrain from making a comment about how you happened to choose ASP-style delimiters for your example of poor programming practice.

    8. Re:table with no by saigon_from_europe · · Score: 1
      I'll refrain from making a comment about how you happened to choose ASP-style delimiters for your example of poor programming practice.
      But I would not refrain from saying that I use them for my bad coding practice in JSP :)

      Wait a minute, you know that these are from ASP, and you don't know that they are used in JSP... What tool might you be using, then?
      --
      No sig today.
    9. Re:table with no by mattwarden · · Score: 1

      They are also used in PHP. That doesn't change the fact that they are called 'ASP-style' delimiters. But, since you asked, I've used ASP quite a while ago. Then I used Java/JSP for around 3 years. And I have been using PHP since.

  40. Re:Strangely... by jimwelch · · Score: 1

    Working fine here (WinDoze version).

    --
    Never trust a man wearing a coat and tie!
  41. The reason not to do this by winkydink · · Score: 4, Informative

    Capitalization makes all the difference in the sentence:

    i helped my uncle jack off a horse

    --

    "I'd rather be a lightning rod than a seismometer." -Ken Kesey

    1. Re:The reason not to do this by I7D · · Score: 1

      Parent was modded informative? What, are you people taking notes?

      --
      Neil is that you? Yeah yeah, it's me... Neil...
    2. Re:The reason not to do this by Anonymous Coward · · Score: 0

      You must be new here.

    3. Re:The reason not to do this by HeroreV · · Score: 1

      Where else would I get my jokes?

    4. Re:The reason not to do this by Anonymous Coward · · Score: 0

      It could still be ambiguous when read aloud. Try "my uncle jerk" instead.

  42. Re:Not so fast - I'm pulling up mostly blank pages by stunt_penguin · · Score: 2, Informative

    Try using an SVG-compatible browser. SVG graphics *tend to* work better that way.

    --
    When the posters fear their moderators, there is tyranny; when the moderators fears the posters, there is liberty.
  43. Button class by Sky+Cry · · Score: 1
    The button class baffles us. We can't really tell what it is used for. Similarly, the link class, which is apparently very popular, seems strange. Why would authors label something with that class?

    The button class is usually used when people want some links (<a href>) to look like buttons. (Light top and left borders, dark bottom and right borders, different background, inverted on hover, etc.)
    1. Re:Button class by tedpearson · · Score: 1

      That's exactly what the button class is used for. In the company that I worked for, we wanted to have absolute control over how our buttons (submit, clear, etc) would look in a browser. So we used styled links, because there is no guarantee what a form button will look like in a given browser (Safari is a good example of buttons that you can't control at all).

    2. Re:Button class by Kelson · · Score: 1

      Then there's the 88x31 or 80x15-pixel banners that are commonly used for things like "Valid HTML," "Get Firefox," "Mac User," etc. in footers or sidebars. They're often referred to as buttons.

      I can't speak for other developers, but I've got several designs where I use a "button" class on the image, the link, or the paragraph containing the set of banners to style them differently than the rest of the sidebar. Centering them, for instance, or ensuring that the linked images don't use a border, etc.

    3. Re:Button class by HeroreV · · Score: 1

      But you of course included form buttons in the markup and used JavaScript to replace them with the links, right? No? Oh well, I'm sure the visitors with JavaScript disabled weren't really that important anyway. Imagine the looks on their faces when they hit submit and nothing happened! LOL! onclick="nothing"! LMAO!

    4. Re:Button class by tedpearson · · Score: 1

      I was working on a complex web application for which we specifically supported only IE >= 5.5 and FF >= 1.0. It was a you-need-an-account type of thing, so we were less worried about the random lynx user or people with JS disabled. You are correct in the case of a widely available site used by more than a few specific organizations.

  44. Re:Strangely... by menkhaura · · Score: 1

    Go grab Seamonkey.

    --
    Stupidity is an equal opportunity striker.
    Fellow slashdotter Bill Dog
  45. Re:Strangely... by Anonymous Coward · · Score: 0

    What browser version are you using?

    What operating system?

    You need a standards-compliant browser, i.e. one that implements SVG, such as Firefox 1.5 or Opera 9.

    I am using FF 1.5 on XP and it's all good. These are excellently presented graphical data, the way the web is supposed to work.

    IE won't work because it doesn't support the standard.

    W98 users note: You need [Opera 9] or [FF 1.5 beta 1 (not a later FF release) and GDI+ installed]

  46. Re:Firefox 1.5 by bigbadbuccidaddy · · Score: 1

    And IE6 + ASV6 (http://www.adobe.com/svg/viewer/install/beta.html ) doesn't work either. All the graphs are blank, and if I go directly to svg by url, I get a big black rectangle.

    I vote this as the worst use of svg on the internet.

  47. GoLive by gmerideth · · Score: 1
    GoLive's footprints are all over the Web. A scary number of pages use , not to mention the multitude of , , and elements.


    Didn't need a billion page analysis to point out that horrible fact.
    --
    Why do overlook and oversee mean opposite things?
  48. What about plugins? by AndrewStephens · · Score: 2, Insightful

    I would be interested in seeing how many web pages use Java applets, Flash, Shockwave, Quicktime, ActiveX controls etc, etc. Sadly the authors did not include this information.

    --
    sheep.horse - does not contain information on sheep or horses.
  49. Re:Opera also supports SVG by MagicM · · Score: 1

    It does, but the latest version (8.51) doesn't appear to deal with the graphs very well. It just shows black blocks.

  50. Re:Strangely... by Billosaur · · Score: 1

    Cool browser! Unfortunately, it didn't help... I suspect the content's being blocked locally somehow.

    --
    GetOuttaMySpace - The Anti-Social Network
  51. Re:BR tag? CSS, duh! by conJunk · · Score: 1
    It's weird; a lot of this study seems to ignore CSS where it's fairly obvious that's what's going on.

    You're right about BR. It's just about useless these days.

    Look at this sentence from the 'HTTP Headers' section:

    There are pages that use the Window-Target header, and even some that use the Link header (though we haven't yet checked what for!). There are even some pages that include the Content-Style-Type header.

    Excuse me? The link header is for including stylesheets (among other uses). The fact that they've got such! emphatic! punctuation! here makes me wonder just how seriously they took this study, and what kind of employees they made responsible for it.

  52. Re:Strangely... by Billosaur · · Score: 1
    (Score:2, Troll)

    Talk about knee-jerk moderation...

    --
    GetOuttaMySpace - The Anti-Social Network
  53. Re:Firefox 1.5 by LWATCDR · · Score: 1

    Works fine for me using Firefox 1.5 under Suse 9.3.

    --
    See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
  54. Re:BR tag? CSS, duh! by gstein · · Score: 1

    Euh... who is being thick now? That page is about HTTP Headers, not elements you find in the <head> element of an HTML page. Specifically, the Link HTTP header mentioned refers to Section 19.6.2.4 of RFC 2068.

    And yeah... it ignored CSS. It's looking at page elements in order to help out the WHAT folks.

  55. Pretty crappy page authoring... by xxxJonBoyxxx · · Score: 1

    Pretty crappy page authoring...not to tell a poor end user that he/she was missing a required viewer (w/ Mozilla 1.7.6). My old Firefox 1.0 showed a "click here to download plug-in", but never came back with a plug-in. (OK, so then I tried Firefox 1.5 and it worked.)

    1. Re:Pretty crappy page authoring... by Bogtha · · Score: 2, Insightful

      Pretty crappy page authoring...not to tell a poor end user that he/she was missing a required viewer

      It's explicitly mentioned on the very first page ("Note: You will need a browser with SVG and CSS support to view the result graphs correctly. We recommend Firefox 1.5.").

      --
      Bogtha Bogtha Bogtha
    2. Re:Pretty crappy page authoring... by xxxJonBoyxxx · · Score: 1

      ...but remember that the Web supports "direct" links. In other words, if someone gets a link to just this report's "elements" page, there's no hint. Thus, it's crappy page authoring, because it will look like a broken web page to the average user.

    3. Re:Pretty crappy page authoring... by Ilgaz · · Score: 1

      It _is_ broken: Safari with the latest SVG plugin from Adobe does not show it. Neither does the latest Opera on OS X.

      I am _not_ installing Firefox to view their graphs. I always supported Camino for OS X and still do but this kind of "gecko fascism" does not help anything.

    4. Re:Pretty crappy page authoring... by masklinn · · Score: 2, Insightful

      Gecko fascism indeed, I mean what a bunch of bastards, using completely valid SVG files, oooh the nerve of them blokes...

      --
      "The way we can tell it's C# instead of Haskell is because it's nine lines instead of two." -- wadler
    5. Re:Pretty crappy page authoring... by Ilgaz · · Score: 1

      Nobody will install Firefox just because their functioning SVG Plugin did not display them for some reason.

      Note that the non-displayed SVG cost me:
      1) A needless re-download of software I already have (the SVG plugin)
      2) A very needless check of permissions (Adobe and Macromedia have their own permission universe)
      3) A needless relaunch of my browsers (for the already existing plugin to init)

      So excuse me when I name it Gecko fascism.

      BTW, SVG goes nowhere as long as it is a toy of purist geeks. As an ordinary user who uses the browser that comes with the OS (until OmniWeb updates), I will file a bug report with Adobe about this. This is how to help SVG.

    6. Re:Pretty crappy page authoring... by masklinn · · Score: 1

      You'll probably be delighted to learn that the graphics work perfectly in Opera 9 TP1 though, which probably means that the Google guys are Technology Preview fascists as well as Gecko fascists.

      Now instead of pushing for the update of that sucky Adobe plugin, how about helping with the native SVG implementation of Safari that's being worked on atm?

      --
      "The way we can tell it's C# instead of Haskell is because it's nine lines instead of two." -- wadler
    7. Re:Pretty crappy page authoring... by Ilgaz · · Score: 1

      I don't see why the Adobe one must be sucky. As far as I know, it is a more complete implementation than the others.

      Also, as far as I can see, only Adobe tries to support SVG at a commercial level.

      If it sucks because "it is Adobe" and/or "not open source", I am not in that game.

      If Apple does an SVG plugin the way they did the PDF plugin (10.4.x), I will do just the same: install a better, professionally coded plugin over it. Just like I use Luratech Jpeg 2000, the Adobe PDF plugin, etc.

    8. Re:Pretty crappy page authoring... by masklinn · · Score: 1

      I don't see why Adobe one must be sucky. As far as I know, it is a more complete implementation than others.

      Looks like it is indeed eh. BTW, I think that the nightly safari builds support SVG as well, just so you know.

      --
      "The way we can tell it's C# instead of Haskell is because it's nine lines instead of two." -- wadler
    9. Re:Pretty crappy page authoring... by poopdeville · · Score: 1

      Are they on the Apple dev site, or just accessible internally?

      --
      After all, I am strangely colored.
    10. Re:Pretty crappy page authoring... by adpowers · · Score: 1

      There is a website full of Safari nightlies for the download. It is sort of a pain to get to, so I recommend bookmarking it or something.

      Unfortunately, I tried Safari first and it didn't work for some reason. I guess they aren't far enough along yet.

  56. Re:Firefox 1.5 by bigbadbuccidaddy · · Score: 1

    The latest Opera shows the graphs as black rectangles as well.

    As does the Batik squiggle project.

    The only way I've successfully seen a graph is to view the source in IE, manually build the link to the svg, and go directly to the svg in the Firefox browser.

  57. Script attributes by Stan+Vassilev · · Score: 1

    Among the top 15 attributes used in the [script] tag are the following:

    "langauge"
    "langugage"
    "languaje"

    Link to that page in the stats:
    http://code.google.com/webstats/2005-12/scripting.html

    I just have no comment to this.
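
    For anyone curious how such a tally could be produced, here is a minimal sketch using Python's standard html.parser. It is not the method used in the study (that is not described here); the class and function names are invented for the example, and the sample pages are made up.

```python
from collections import Counter
from html.parser import HTMLParser


class ScriptAttrCounter(HTMLParser):
    """Counts attribute names seen on <script> start tags."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "script":
            self.counts.update(name for name, _value in attrs)


def count_script_attrs(pages):
    """Returns a Counter of <script> attribute names across all pages."""
    total = Counter()
    for page in pages:
        parser = ScriptAttrCounter()
        parser.feed(page)
        parser.close()
        total += parser.counts
    return total


# Made-up sample pages, including two of the misspellings listed above.
sample_pages = [
    '<script langauge="JavaScript">x()</script>',
    '<script language="JavaScript" type="text/javascript"></script>',
    "<SCRIPT LANGUAJE=js></SCRIPT>",
]
print(count_script_attrs(sample_pages).most_common())
```

    Because the parser does not validate attribute names, misspellings simply show up as their own entries in the counter.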

    1. Re:Script attributes by HeroreV · · Score: 1
      The language attribute is deprecated anyway. Betcha didn't know that, did you? Instead, you should use the "tiep", "taip", or "tipe" attribute.
      <script tipe="leturs/js">
        alert('OMG wElComE 2 MAi PaJe!!!!11one');
      </script>
  58. Stop spreading FUD by jonasj · · Score: 1

    Firefox 1.5 and Seamonkey 1.0 are both based on Mozilla 1.8.0. When they use the exact same rendering engine, how would switching from one of them to the other give you any difference in how a site works? It won't, and I suspect that you knew that and just used this opportunity to advertise your favorite browser.

    Now, if you had just replied to the article and pointed out that Seamonkey is another browser that also supports SVG, that would be totally fair; but instead you chose to reply to someone asking for help because their Firefox wouldn't show it (which is strange; mine does), telling them to grab Seamonkey, even though you knew it wouldn't make a difference as they use the same rendering engine, just to spread FUD about Firefox because you want people to use Seamonkey instead. And that is NOT fair.

    --
    You know, Microsoft's street address also says a lot about their mentality.
  59. Re:Firefox 1.5 by Anonymous Coward · · Score: 0

    The graphs showed up once I installed the Adobe SVG Viewer: http://www.adobe.com/svg/viewer/install/main.html

  60. Re:Firefox 1.5 by bigbadbuccidaddy · · Score: 2, Interesting

    The black box is caused by them not using type="text/css" on the ?xml-stylesheet declaration. type is a required attribute. If I add that it renders properly on all the svg viewers I tried.
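
    A quick way to check a file for this is sketched below, using Python's xml.dom.minidom. The function name and the "chart.css" file name are hypothetical; the regular expression only handles simple quoted pseudo-attributes.

```python
import re
from xml.dom import minidom

# Matches name="value" or name='value' inside the processing instruction data.
PSEUDO_ATTR = re.compile(r"""([\w:-]+)\s*=\s*(["'])(.*?)\2""")


def stylesheet_pis_missing_type(svg_source):
    """Returns the href of every xml-stylesheet PI that lacks a type."""
    document = minidom.parseString(svg_source)
    missing = []
    for node in document.childNodes:
        if (
            node.nodeType == node.PROCESSING_INSTRUCTION_NODE
            and node.target == "xml-stylesheet"
        ):
            attrs = {m.group(1): m.group(3) for m in PSEUDO_ATTR.finditer(node.data)}
            if "type" not in attrs:
                missing.append(attrs.get("href"))
    return missing


# Hypothetical example documents: one without and one with the type.
bad = '<?xml-stylesheet href="chart.css"?><svg xmlns="http://www.w3.org/2000/svg"/>'
good = (
    '<?xml-stylesheet href="chart.css" type="text/css"?>'
    '<svg xmlns="http://www.w3.org/2000/svg"/>'
)
print(stylesheet_pis_missing_type(bad))
print(stylesheet_pis_missing_type(good))
```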

  61. Poor style by Google by Jugalator · · Score: 1, Redundant

    Web developers shouldn't aim for writing for one browser, but as many as possible.

    They're doing the exact opposite of what they should be doing.

    They're doing what led us into this shitty IE situation in the first place: targeting specific browsers instead of the public.

    Can anyone tell me what's here that can't be visualized with GIF's?

    Even if it'd mean fewer features for the user, they should at least gracefully fall back to a more basic technology than SVG.

    How do these pages look on IE, Opera, Safari, or Konqueror under default configurations?

    If this is what Google sometimes wishes to do, designing pages to push a specific browser, they're no better than Microsoft.

    --
    Beware: In C++, your friends can see your privates!
    1. Re:Poor style by Google by Kelson · · Score: 1

      IE 7 beta 1 - nothing where the graphs would be expected
      Opera 8.5 - black boxes in place of graphs
      Opera 9 preview 1 - graphs are visible but about 1/2 the size they are in Firefox
      Konqueror 3.5 - text from graph appears inline
      Safari - don't have a Mac handy at work, but SVG only recently made it into development versions of WebKit.

      Something key to recognize, though, is that all the major browsers except IE either have partial SVG support already or are working on it. It's kind of frustrating that Firefox 1.5 and Opera 8.5 support different subsets, but they're both working toward a standard so we can expect those subsets to converge over the next few releases.

    2. Re:Poor style by Google by Toothpick · · Score: 1

      Safari - don't have a Mac handy at work, but SVG only recently made it into development versions of WebKit.

      Safari 2.0.3 does not render the charts. But then, my copy of FF Deer Park optimized for G4 (but still ridiculously slow!) does not, either -- it gives the "click to install plugin" bar.

      Camino 1.0+ renders them perfectly. Camino would rawk if it could support FF extensions.

    3. Re:Poor style by Google by pbhj · · Score: 2, Insightful

      >>> "Can anyone tell me what's here that can't be visualized with GIF's?"

      I don't think that's the point ... it's about the creation of the images, not their visualisation. These images can be created on the fly from varying data with only textual manipulation of the code - the processing will be extremely light as will the data load on the servers. Presumably the xml-to-image parsing in the browser incurs a processing penalty though.

      If you view the code of one of the graphs, http://code.google.com/webstats/2005-12/charts/unique-classes-per-page.svg, you'll see that it is less than 10k. It also has theoretically infinite resolution, which might be useful if the graphs are to be used for a presentation (like printing them on the moon using lasers!!?).

      Use of FF isn't too surprising, as the code.google.com section is for promotion of OSS.

      It looks to be an internal project that we just happen to have been given access to ... assuming the officers of Google that need access have FF1.5, then the web devs have probably met their brief?!
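
      To illustrate the "textual manipulation" point, here is a rough Python sketch that builds a tiny SVG bar chart from a list of numbers using nothing but string formatting. It is not how the report's charts were generated; the function name, parameters and layout are all assumptions made for the example, and values are assumed to be non-negative.

```python
def bar_chart_svg(values, bar_width=20, height=100):
    """Builds a minimal SVG bar chart as a string, one <rect> per value."""
    peak = max(values) if values else 0
    bars = []
    for index, value in enumerate(values):
        # Scale each bar so the largest value fills the full height.
        bar_height = height * value / peak if peak else 0
        bars.append(
            f'<rect x="{index * bar_width}" y="{height - bar_height:g}" '
            f'width="{bar_width - 2}" height="{bar_height:g}"/>'
        )
    width = bar_width * len(values)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 {width} {height}">'
        + "".join(bars)
        + "</svg>"
    )


print(bar_chart_svg([1, 2, 4]))
```

      The output is a few hundred bytes of text, and the viewBox lets the viewer scale it to any size.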

    4. Re:Poor style by Google by bigbadbuccidaddy · · Score: 1

      10k is pretty big. If the goal was small size they would use svgz.

    5. Re:Poor style by Google by HeroreV · · Score: 1

      Can anyone tell me what's here that can't be visualized with GIF's?

      PNGs are better than GIFs in every way except animation. (And MNGs are better at animation than GIFs.)

  62. Re:BR tag? CSS, duh! by Bogtha · · Score: 1

    the link header is for including stylesheets (among other uses).

    This kind of misunderstanding is why people should learn the proper names for things. The study is referring to the Link HTTP header. You are referring to the <link> HTML element type. Headers are not element types, even if most people call both of them "tags".

    Using the Link HTTP header for stylesheets is not practical because most browsers don't support it and those that do only added support recently.
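
    To make the distinction concrete: the Link HTTP header travels in the response headers, outside the document, and its value looks like <style.css>; rel="stylesheet". Below is a simplified Python sketch for pulling such a value apart. The function name is made up, and the regular expressions cover only the common form, not the full header grammar.

```python
import re

# One link-value: a <URI> followed by zero or more ;name=value parameters.
LINK_VALUE = re.compile(
    r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^;,\s]+))*)'
)
PARAM = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]+))')


def parse_link_header(value):
    """Returns a list of (uri, params) tuples from a Link header value."""
    links = []
    for match in LINK_VALUE.finditer(value):
        params = {
            name.lower(): quoted or bare
            for name, quoted, bare in PARAM.findall(match.group(2))
        }
        links.append((match.group(1), params))
    return links


print(parse_link_header('<style.css>; rel="stylesheet"; type="text/css"'))
```

    The <link> element, by contrast, is markup inside the document itself and would be found by an HTML parser, not by reading the response headers.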

    --
    Bogtha Bogtha Bogtha
  63. Re:Strangely... by Anonymous Coward · · Score: 0

    Talk about a flaming faggot.

    Or are you not the egotistical type?

  64. Depth by xant · · Score: 1

    So far everyone who has replied to you has ignored one thing. A thousand may be fine for seeing a simple "A or B" statistical difference at significant levels; with ANOVA you can even track a few different significant traits.

    The number of traits they were trying to discover was unknown at the start; furthermore, they expected it to be very high. Lots of different HTML tags in the standards, but even more nonstandard tags, nonstandard attributes; they even found information about how different attributes are misspelled. Example: nobody can spell "language" on the <script> tag, and they can tell you exactly how many spell it "langauge". They found lots of data points that wouldn't have existed in a sample of a thousand. (My guess is they almost all would have existed in a sample of a million, but in numbers too small for statistical significance.)

    Most people would have settled for a million, I think, but if you have the resources to get a billion, there actually is useful information in there for you to use.
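
    The sample-size argument can be put into rough numbers. The sketch below assumes a hypothetical trait that appears on one page in 100,000 (a figure picked purely for illustration, not taken from the report) and treats pages as independent draws.

```python
def expected_hits(sample_size, frequency):
    """Expected number of sampled pages that show the trait."""
    return sample_size * frequency


def chance_of_seeing_at_least_once(sample_size, frequency):
    """Probability that the trait shows up on at least one sampled page."""
    return 1.0 - (1.0 - frequency) ** sample_size


# Hypothetical frequency: one page in 100,000 has the trait.
RARE = 1e-5
for size in (1_000, 1_000_000, 1_000_000_000):
    print(size, expected_hits(size, RARE), chance_of_seeing_at_least_once(size, RARE))
```

    Under those assumptions a thousand-page sample would most likely miss the trait entirely, a million pages would yield around ten hits, and a billion around ten thousand, which is enough to compare one rare misspelling against another.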

    --
    It's rare that you're presented with a knob whose only two positions are Make History and Flee Your Glorious Destiny.
  65. The author's never been around proxy servers, eh? by xxxJonBoyxxx · · Score: 1
    The author's never worked with proxy servers, has he?

    I laughed when I read this... "The \ "attribute" is almost certainly the result of people writing markup like (br\) when intending to do (br/). Of course, neither is particularly useful to browsers when the page is sent as text/html (as all these pages were)."

    (OK, for those who don't get it, one reason that so much content is sent with an "incorrect" text/html header is that many proxy servers will dump content on the floor unless it has a text/html header.)

  66. Questionable value by ergowa · · Score: 1

    Between the questionable conclusions and the sometimes poor quality of the writing (not to mention, are there graphs and charts? I didn't see any.), I wonder about the usefulness of such an analysis.

    Take, for example, the commentary on the element. Abuse is in the eye of the beholder. A number of pages don't follow standards or use deprecated elements. In some cases, that's not entirely the fault of the authors. If I'm developing a corporate site that demands backwards compatibility to Netscape 4.x or an ancient version of IE, I'm certainly not going to jump through all sorts of hoops with layered CSS hacks when I can just use a deprecated element.

    And which specifications are we talking about? If I include those elements and validate my document, which elements will fail? At present six by my count (and not five) of those attributes in are deprecated by the W3C for HTML 4.0.

    Regarding the use of classes, I wonder how much HTML coding the authors do. I have had countless opportunities to style an element using a "copyright" class (rather than something like "small"). In some ways, it's a better practice since it describes the element rather than the style being applied to that element. It's still not ideal, but in the real world, I can remember that this element, like footer, appears in a certain place on the page and style it accordingly. Using a element is not a substitute; it's not meta-data, it's a display element the user sees.

    Similarly, "The button class baffles us. We can't really tell what what it is used for. Similarly, the link class, which is apparently very popular, seems strange. Why would authors label something with that class?" How about I have a submit button and a link side-by-side and I want them to look the same (that is, both appear as buttons)? If it makes sense from a user experience standpoint, then I'll use it. I can certainly see using a link class to style certain links on a page (say in a left navigation or the body) different from others. It's sloppy, but it gets the job done and, even though I'd avoid it whenever possible, I'm not going to slam someone else for doing so.

    And on it goes, "onmouseover on a elements is a little worrying; presumably those are mostly cases of the status bar being overridden". How about image rollovers for navigation? Empirically, I've seen fifty sites with image rollovers for every site that changes the status line. The authors then state (in the next section) that the relative few uses on the element is the assumption that few people are using rollovers. Since they typically are applied to the anchor, this is an erroneous assumption. Of course, why bother with scripting events when you can use CSS (apart from pesky backwards compatibility)?

    In general, the tone of the article seems to be that many people should not be allowed on the web because they can't follow standards (and are illiterate, in many cases). Nothing is said about browsers being inconsistent in following standards, nor about how many of those pages are legacy pages from who knows when. The general attitude seems to be that HTML is as rigorous as a programming language. If that last were the case, browsers would only display pages that conformed to the 4.01 Strict standard or maybe the XHTML 1.0 Strict DTD. I mean, if you really want to slam users for not caring what the standards say, see how many of those documents are properly formed according to the XHTML standards. I don't even have to do an "analysis" to know the number would be very much on the low side.

    1. Re:Questionable value by Anonymous Coward · · Score: 0

      are there graphs and charts? I didn't see any.

      From TFA:

      Note: You will need a browser with SVG and CSS support to view the result graphs correctly. We recommend Firefox 1.5.

      The connection between these two remarks is left as an exercise for the reader.

    2. Re:Questionable value by hixie · · Score: 1

      "<!DOCTYPE html>" is the HTML5 DOCTYPE, and HTML5 is why I did the study, so yeah, I used the HTML5 DOCTYPE. I don't see what's wrong with that. :-)

      I didn't specify a character encoding because I used US-ASCII, which is the default for text/html, and which is also a subset of UTF-8, which is the default for image/svg+xml and text/css. Thus there was no need to set it. Nothing wrong there.

      As for the entity for ">", there is no reason to use it. It takes longer to type, and is harder to maintain. Why would it be stupid?

      And finally, TFA actually explicitly mentions the fact that Google's pages as a whole don't validate, in the very first paragraph. We know.

      You need to chill, dude. Go play some games or something. :-)

  67. Just to clarify... by jonasj · · Score: 0, Troll

    What I mean by spreading FUD is that your comment implies that Firefox's support for SVG is not as good as Seamonkey's, when it is in fact exactly the same. Also, I'm not trying to start a Seamonkey vs. Firefox flamewar here.

    --
    You know, Microsoft's street address also says a lot about their mentality.
  68. Zipf's law by Anonymous Coward · · Score: 0

    So, do the results follow Zipf's law?

  69. Re:Opera also supports SVG by Kelson · · Score: 1

    Opera 9 can handle the graphs (8.5 doesn't), but it's still in beta. Interestingly, on my Linux box, the Opera 9 preview renders the pages faster and scrolls more smoothly than Firefox 1.5 (scrolling the first page with three graphs is really slow in FF), but the scale is much smaller. To actually read the graphs on Opera I have to zoom in about 200%.

  70. Window-Target by d-e-w · · Score: 1
    There are pages that use the Window-Target header, and even some that use the Link header (though we haven't yet checked what for!). There are even some pages that include the Content-Style-Type header.

    Wasn't creating a Window-Target HTTP header a trick for always breaking out of other people's frames (if someone linked to your site and framed your content within their own)? I thought it was more reliable (back in 1999/2000) than the various JS tricks for breaking out of frames.

  71. For folks does not (want) to run Firefox by Ilgaz · · Score: 3, Informative

    http://www.adobe.com/svg/viewer/install/main.html got suitable plugins for browsers/OS of choice.

    Notice that I got SVG plugin installed for ages, Safari didn't display the graphs. Is it because I am not using "a browser with CSS"? Well, nevermind really...

    This is why I and others have negative views of Firefox, SVG and even .ogg. Rootless promotion of this kind...

    1. Re:For folks does not (want) to run Firefox by k3v1n · · Score: 1

      Calm down! SVG is a standard that will probably appear in the next major version of Safari. It's already in Webkit (available in the latest nightlies).

  72. Wisdom by AeroIllini · · Score: 2, Interesting
    They've really hit on some wisdom here.

    There are several statistics they quoted which I have suspected for a long time, but only now can confirm with numbers.

    more than half of pages use the target attribute on the a element somewhere.


    I can't begin to describe the frustration I feel when I'm forced to use Internet Explorer and clicking links causes pages to fire up in a million new windows. Whether or not a link opens in a new window, a new tab, or the current window/tab really should be a client-side choice. Webmasters think they're being helpful by letting you separate your workspace into many windows, but they're really just slowing people down. Thank God for Firefox.

    It seems most pages use presentational attributes: the fourth most used attribute across all elements is the table element's border attribute, followed by the height and width attributes on img, followed by <table width="">, <table cellspacing="">, <img border="">, and <table cellpadding="">. Interestingly, though, the most frequently used attribute on the body element (namely bgcolor) is only used on around half of pages, with all the other presentational attributes on body being used even less. One possible explanation is that on average, colors are mostly done using CSS, while layout is mostly done using HTML tables.


    This makes perfect sense. While colors, fonts and styles are pretty much standard in a cross-browser environment, due to many various interpretations of the CSS Box Model, coding layout purely in CSS can be a terrible chore. It's usually much quicker to do a few simple layouts in tables (header, sidebar, content) and use CSS for pretty much everything else.
    --
    For security, the MD5 hash of this message and sig is 09f911029d74e35bd84156c5635688c0.
    1. Re:Wisdom by ergowa · · Score: 1

      On the last comment, which also relates to the lack of valign tags. Of course I'm using tables still.

      On a site I worked on, I needed to middle align an image, and after struggling for way too long to get text and an image to line up alongside each other, I finally just threw it into a table (long story short, there were more issues than I could solve using CSS and DIVs or other elements).

      Until browsers all follow standards and layout tags don't require hacks, I suspect this practice will continue.

  73. Re:Worst use of SVG ever by jamesots · · Score: 2, Funny

    Yeah, and what's the point of using HTML? They could have posted an image of the text to the same effect.

    --
    Ho hum for the life of a bear
  74. Re:Firefox 1.5 by Kelson · · Score: 1

    Works for me. Firefox 1.5 and Opera 9 preview both display the graphs.

  75. Markov Chains by ImaLamer · · Score: 1
    Personally I've had an idea about running, say, a billion web pages through a program that creates markov chains from text strings.

    I've run some text through a free program before to create these. Some are funny, some are just silly, all formed from various Gutenberg texts and a few usenet love stories (text pr0n). Fairy tales, love stories and the bible make an interesting match;
    Little Boy Blue, come, blow your horn!
    The sheep's in the mountain the Lord is a vapour of the tree for the strangers to
    take the book that thou hatest the deeds of the wicked shall decay, and the noise of chariots and horsemen, through the pillar of fire; that he had opened the sixth curtain in the gate, and there are innumerable before him and he heard the voice of a person, if he see that he died of itself, or any peacock gay, So, dearest Jen, if you'll be mine, let us find occasion of word against him?

    and...
    For who among men is not of divers colours

    The saints glorify God for his fiddlers three.

    FOR WANT OF A FEATHER-Birds of a man seduce a virgin

    The childish shall possess them

    Fatness hath covered his face, and shalt let down thy milk

    And when it was well lubricated. This procedure
    wasn't really necessary, but who is the cause which I have sinned, and thou shalt
    fill it

    And if all the heavens, or who gave the cock
    understanding?
     

    Actually I've got a really great idea for a program that would use text and markov chains. It's a little silly, but I wouldn't just give it out to anyone. E-mail me. It wouldn't be a project for the faint of heart and would require something like Google's cache of Internet pages (and Wikipedia content, Gutenberg content, ...) Hint: it's a distributed computing project.
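
    For anyone who hasn't seen how these chains work, here is a minimal word-level sketch. It is not the C program mentioned above; `build_chain`, `generate`, and the sample text are hypothetical names and data chosen for illustration. Each pair of consecutive words maps to the list of words seen right after that pair, and generation repeatedly picks one of those followers at random.

```python
import random
from collections import defaultdict

def build_chain(text, order=2):
    """Map each run of `order` words to the words that followed it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        chain[key].append(words[i + order])
    return chain

def generate(chain, length=30, seed=None):
    """Walk the chain from a random starting key.

    Assumes the chain is non-empty; stops early at a dead end.
    """
    rng = random.Random(seed)
    key = rng.choice(sorted(chain))      # sorted for reproducible seeding
    out = list(key)
    for _ in range(length - len(key)):
        followers = chain.get(key)
        if not followers:                # no word ever followed this key
            break
        out.append(rng.choice(followers))
        key = tuple(out[-len(key):])     # slide the window forward
    return " ".join(out)

sample = ("Little Boy Blue, come, blow your horn! "
          "The sheep's in the meadow, the cow's in the corn.")
chain = build_chain(sample, order=2)
text = generate(chain, length=12, seed=0)
```

    Mixing several source texts into one string before calling `build_chain` is what produces the mash-ups above: wherever two texts share a word pair, the walk can jump from one to the other.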
    1. Re:Markov Chains by SIGFPE · · Score: 1

      He he! I remember doing that for the book Mr Happy on my 16K BBC Micro over 20 years ago. Had to type in the whole book myself. To think that you can now do it for a non-trivial chunk of the complete corpus of published human writing boggles the mind!

      --
      -- SIGFPE
    2. Re:Markov Chains by ImaLamer · · Score: 1

      BTW... let me add that I used CYGWIN to compile and run the C program to do this. If that link means you do work there I can't thank you enough for a great product...

  76. actually, firefox hardly works there by XO · · Score: 1

    You'll probably need Opera, and its Zoom feature, to be able to actually READ anything on those charts. The headers are microscopic, and the charts themselves not much bigger.

    --
    "Champagne for my real friends - and real pain for my sham friends!" http://ericblade.postalboard.com/
    1. Re:actually, firefox hardly works there by Kelson · · Score: 1

      You'll need the Opera 9 preview to show the graphs. 8.51 only manages to show black rectangles.

      If you do manage to view the page in Firefox, the size is actually readable -- it shows the graphs about twice the size at which Opera displays them.

    2. Re:actually, firefox hardly works there by kingturkey · · Score: 1

      The read-able-ness is a matter of opinion and hinges on your resolution. I had to squint a bit to read it. I tried using ctrl+scroll wheel to zoom in, but firefox lacks a zoom and instead just increases the text size. So the graphs got bigger but so did the text to the side which then overlapped the graphs, making it rather useless to zoom in. Also the scroll wheel is inverted, scrolling down makes it bigger and up makes it smaller, what the hell?

    3. Re:actually, firefox hardly works there by XO · · Score: 1

      ah.. well, i'm sitting at 1600x1200, on a 20" monitor, i didn't see much difference between firefox and opera's rendering, i definitely had to use the zoom on opera. i tried with firefox, since it's reflex, but obviously only the text got bigger.

      --
      "Champagne for my real friends - and real pain for my sham friends!" http://ericblade.postalboard.com/
  77. Untitled Document by Kelson · · Score: 1

    On the other hand, you'd be amazed how many pages are called "Untitled Document" or "Page Title".

  78. Of course TITLE is more popular than BR by goynang · · Score: 1

    TITLE is more popular than BR as it's used to create the title of the page - the bit that appears in the browser's title bar. Just about every HTML document will have it. BR is just for line breaks and not necessarily needed (or even ideal in the days of CSS).

    So no real surprise that it is more popular really.

  79. Re:Firefox 1.5 by Kelson · · Score: 1

    The Opera 9 preview displays the graphs, but at a different scale than Firefox.

  80. Re:BR tag? is used in 7 out of 8 pages by TekGoNos · · Score: 3, Informative

    The summary got it wrong,

    the study states that there are more pages using title, than pages using br. NOT that more title tags are used than br tags.

    Approximately 98% of all pages have a title tag and approximately 7 out of 8 pages have (at least one, probably more) br tags.
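
    The difference between the two ways of counting can be sketched like this. This is only an illustration using Python's standard `html.parser`, not the tooling behind the study; `TagCounter`, `tally`, and the two sample pages are made up.

```python
from collections import Counter
from html.parser import HTMLParser

class TagCounter(HTMLParser):
    """Counts how many times each start tag appears in one page."""

    def __init__(self):
        super().__init__()
        self.tags = Counter()

    def handle_starttag(self, tag, attrs):
        self.tags[tag] += 1

def tally(pages):
    """Return (total occurrences per tag, number of pages using each tag)."""
    occurrences = Counter()
    pages_using = Counter()
    for html in pages:
        parser = TagCounter()
        parser.feed(html)
        parser.close()
        occurrences.update(parser.tags)         # every occurrence counts
        pages_using.update(parser.tags.keys())  # each page counts once per tag
    return occurrences, pages_using

# Two made-up pages: both have a title, only one uses br (three times).
pages = [
    "<title>One</title><p>a<br>b<br>c<br>d</p>",
    "<title>Two</title><p>no breaks here</p>",
]
occurrences, pages_using = tally(pages)
```

    With these samples br wins on raw occurrences (3 against 2) while title wins on pages (2 against 1), which is the distinction the summary blurred.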

    --
    I have discovered a truly remarkable proof for my post which this sig is too small to contain.
  81. Re:Opera also supports SVG by Bogtha · · Score: 1

    Opera and Firefox only support some bits of SVG. Currently released versions of Opera can't handle the SVG in the article, although the latest beta can.

    --
    Bogtha Bogtha Bogtha
  82. Re:Strangely... by masklinn · · Score: 1

    hit F5, the graphics are hella slow and sometimes don't load at first.

    --
    "The way we can tell it's C# instead of Haskell is because it's nine lines instead of two." -- wadler
  83. Re:Worst use of SVG ever by idonthack · · Score: 1

    HTML loads quickly, renders quickly in all browsers, and is more scalable than an image of the text. SVG has many of the same bonuses, but only a couple browsers support it, and nobody has complete support.

    --
    Why is it that when you believe something it's an opinion, but when I believe something it's a manifesto?
  84. Set-Cookie2 insecure? by tedhiltonhead · · Score: 2, Interesting
    The linked site claims the Set-Cookie header is "considered insecure":
    The Set-Cookie header (which is one of the ten most-used headers) is present on about two orders of magnitude more pages than the Set-Cookie2 header (despite the former being considered insecure).
    After glancing over the RFC for Set-Cookie2, I can't see where it says Set-Cookie is "insecure". Google turns up nothing useful. Does anybody know more about this?
    1. Re:Set-Cookie2 insecure? by hixie · · Score: 2, Informative

      Yeah, I misspoke on this. Set-Cookie is insecure (due to domain-crossing problems -- should a cookie sent to a.b.c get sent to z.b.c? Depends on "b" and "c" in ways that depend on month-to-month political changes around the globe), but as far as I can tell, Set-Cookie2 is also insecure. I had thought it fixed this, but apparently not.
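
      To make the a.b.c versus z.b.c question concrete, here is a rough sketch. The suffix set is a made-up, deliberately tiny stand-in for a real, maintained list of registry-controlled suffixes, and `registrable_domain` and `may_share_cookie` are hypothetical helpers, not anything from the cookie RFCs.

```python
# Hypothetical stand-in for a maintained list of registry-controlled suffixes.
ASSUMED_PUBLIC_SUFFIXES = {"com", "uk", "co.uk"}

def registrable_domain(host, suffixes=ASSUMED_PUBLIC_SUFFIXES):
    """Return the longest known suffix plus one label, or None if unknown."""
    labels = host.lower().split(".")
    # Candidates are tried longest first, so "co.uk" wins over "uk".
    for i in range(len(labels)):
        if ".".join(labels[i:]) in suffixes:
            if i == 0:
                return None  # the host itself is a bare suffix
            return ".".join(labels[i - 1:])
    return None

def may_share_cookie(host_a, host_b):
    """True when both hosts sit under the same registrable domain."""
    a = registrable_domain(host_a)
    b = registrable_domain(host_b)
    return a is not None and a == b

same_owner = may_share_cookie("a.example.com", "z.example.com")
different_owner = may_share_cookie("a.foo.co.uk", "z.bar.co.uk")
```

      Whether two hosts may share a cookie depends entirely on the contents of that set, which is the part that changes with registry policy and cannot be derived from the hostnames alone.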

  85. Re:Firefox 1.5 by bigbadbuccidaddy · · Score: 1

    My Firefox 1.5 is on XP, I'll bet you're on Linux. I checked the Firefox bugzilla, someone reported a bug today referencing this same google site. Their behavior was different than mine; their graphs wouldn't show up until they did a Print Preview, but Firefox didn't crash. Further testing with my Firefox has crashed on many other SVGs.

    My Opera is 8.51 and renders the graphs as black rectangles.

    And I would like to express special thanks to the user who moderated my original comment as flamebait, but without knowing who they are I cannot.

  86. counting pages with not elements by miro+f · · Score: 1

    as far as I could see, this didn't compare the numbers of br but the amount of pages that contain br. if you compare numbers I'm sure there are more <br> than <title>, but of course if you compare the number of pages that contain them then of course <title> is more popular as it should be on every page.

    this is clear because is ranked higher than which shouldn't be if they were counting elements instead of pages with elements

    --
    being vague is almost as cool as doing that other thing...
  87. Re:Firefox 1.5 by Kelson · · Score: 1

    My Firefox 1.5 is on XP, I'll bet you're on Linux.

    Yes, but I just tried it on my XP box and it works there too. There must be something else going on.

  88. Noindex, Nofollow by Anonymous Coward · · Score: 0

    How come the META tag didn't show how many times the "ROBOTS" name showed up?

    1. Re:Noindex, Nofollow by hixie · · Score: 1

      It did. It's third on the "name" chart, fourth on the combined chart. Or did I misunderstand your question?

  89. some handy titbits by pbhj · · Score: 1

    This review is quite interesting (from a web dev's POV).

    There are also some handy little bits of info: Lists of most used attributes and tags could give an indication as to which tags Google will use and which will just be thrown out.

    Statements like: "More pages use the completely worthless <meta name="revisit-after"> than use the <em> element!"; appear to be dropped in on purpose as hints for less experienced devs. Similarly "Next we have two name values: keywords, which these days is mostly useless" on http://code.google.com/webstats/2005-12/metadata.html suggests that I can stop worrying that perhaps Google finds even a smidge of value in this data.

    Then there's bits like "One area of future study would be to see what these attributes are used for: is onunload used mostly by Web applications for legitimate purposes, or is it used more by hostile sites to show pop-unders?" which suggest that if you're using onunload legitimately your pagerank is about to take a nose dive!!?

    I'd not come across pingback and "link rev" before.

    Thanks for all the fish.

  90. Re:Strangely... by bigbadbuccidaddy · · Score: 1

    My similar comment got moderated down as Flamebait, then Offtopic. Also I wonder if the Anonymous Coward with the Genius IQ who posted is also the moderator?

  91. Re:Firefox 1.5 by bigbadbuccidaddy · · Score: 1

    Agreed; I will try to figure out what's wrong for the betterment of Firefox.

  92. Re:Worst use of SVG ever by jdub_dub · · Score: 1

    Is there something wrong with trying to encourage the masses to install SVG support in the browser? Is there anything wrong with the standard that implies that it should not come standard with browsers?

  93. Fix for Firefox 1.5 by bigbadbuccidaddy · · Score: 3, Informative

    If your Firefox 1.5 doesn't display the graphs, or crashes, do the following as suggested by the Google webstats author:

    Apparently there's a problem in Firefox 1.5 regarding SVG images if you
    had SVG in the registry. Try following the steps described here:

          https://bugzilla.mozilla.org/show_bug.cgi?id=303581#c3

  94. Re:The author's never been around proxy servers, e by Anonymous Coward · · Score: 0
    • he's talking about XHTML being sent as HTML. this is done for IE's sake, not proxies'
    • no need for scare quotes, it is incorrect
  95. Re:Strangely... by Billosaur · · Score: 1

    Thanks for the advice. Now that I'm at home, it loads just fine. Connection at work must have been slow.

    --
    GetOuttaMySpace - The Anti-Social Network
  96. Re:I used to work with Ian Hickson by hixie · · Score: 1

    lol.

  97. Re:BR tag? CSS, duh! by hixie · · Score: 1

    As other people pointed out, I meant the HTTP Link: header, not the HTML element.

    But as to who wrote the study... well... I'm on the CSS working group. And the WHAT working group. Make of that what you will.

  98. Re:BR tag? CSS, duh! by hixie · · Score: 1

    Actually, Mozilla has supported it for about 5 or 6 years now. Still, yeah, the other browsers, not so much. In fact it was dropped from the HTTP spec due to lack of implementations.

  99. Re:Firefox 1.5 by Anonymous Coward · · Score: 0

    I did a (web developer plugin) "miscellaneous"->"small screen rendering" view switch for getting it to work. Using the svg urls directly also worked. It just didn't work out of the box as it should.

  100. Re:Firefox 1.5 by hixie · · Score: 1

    Actually the type pseudo-attribute is optional on <?xml-stylesheet?>; see the errata.

  101. Re:Poor style by Google ... you're right-ish by pbhj · · Score: 1

    10k is quite big, but not for an image that can be scaled to any resolution.

    Also, I suspect that if Google use mod_gzip (or whatever it's called) then the benefit of svgz wouldn't exist. The 10k was the size of the file stored on my comp: gzipped it's 1544k (so I assume that is the transmitted size).
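
    A quick way to see the effect for yourself, as a sketch: an .svgz file is just gzip-compressed SVG, so compressing the markup with Python's standard `gzip` module gives a feel for what either approach saves. The `make_svg` helper and the bar heights are invented for illustration and have nothing to do with the report's actual graphs.

```python
import gzip

def make_svg(bars):
    """Build a small bar-chart SVG from a list of bar heights (made-up data)."""
    rects = "\n".join(
        f'  <rect x="{i * 12}" y="{100 - h}" width="10" height="{h}" fill="steelblue"/>'
        for i, h in enumerate(bars)
    )
    return (
        '<svg xmlns="http://www.w3.org/2000/svg" width="600" height="100">\n'
        f"{rects}\n</svg>\n"
    )

# Arbitrary heights, only there to give the file some bulk.
svg_bytes = make_svg([(i * 7) % 90 + 5 for i in range(50)]).encode("utf-8")
compressed = gzip.compress(svg_bytes)

raw_size = len(svg_bytes)
gz_size = len(compressed)
```

    SVG is repetitive text, so `gz_size` comes out well below `raw_size`; whether the compression happens once on disk (svgz) or on the fly in the server makes no difference to the bytes on the wire.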

  102. if(Post==Old_And_Tired) GOTO Mod_Down by LordOfTheNoobs · · Score: 1

    Can't forget that second sign now. Who knows what damage you could cause? Posters might actually have to be original.

    --
    They're there affecting their effect.
  103. Re:Poor style by Parent by Anonymous Coward · · Score: 0

    Slashdotters shouldn't aim for writing for one person, but as many as possible.

    Parent is doing the exact opposite of what they should be doing.

    They're doing what led us into this shitty English situation in the first place; targeting specific people instead of the public.

    Can anyone tell me what's here that can't be said in Chinese?

    Even if it'd mean less understandability for the user, they should at least graciously fall back to a more popular language than English.

    How does Parent's post look in Chinese, Hindi, Arabic, or Russian under default configurations?

    If this is what Parent sometimes wishes to do, write comments to push a specific language, they're no better than Hitler.

  104. Re:Poor style by Google ... you're right-ish by radarsat1 · · Score: 1
    Also, I suspect that if Google use mod_gzip (or whatever it's called) then the benefit of svgz wouldn't exist. The 10k was the size of the file stored on my comp: gzipped it's 1544k (so I assume that is the transmitted size).

    You mean 1544 bytes of course.. :) But we all knew that.

    I really can't wait for SVG to take over. So glad it's starting to get some respect...

  105. something like.... by Anonymous Coward · · Score: 0

    "non-page-based device for the web"

    something like a line that scrolled by, with user adjustable speed, so that an unlimited amount of text could be displayed in a small area? Like a super marquee? I like it. Used to own a speed reader trainer that did that (maxed it out eventually at 2400 WPM)

  106. Re:BR tag? CSS, duh! by iamlucky13 · · Score: 1

    Theoretically most of the places where the BR tag is used it should be replaced with block elements like the P tag, and margins and/or padding specified in CSS where appropriate, but I've found a few spots on my own sites where I haven't been able to get consistent appearances between IE and other browsers without falling back on it.

    Interesting point about the link header. As far as I remember, that's the preferred method to load a stylesheet, rather than @import.

    By the way, I typed <br> 4 times in the course of writing this post. I won't taint my own website, but slashdot is fair game.

  107. You have a point there. by finelinebob · · Score: 1

    You actually have a billion points there.

  108. OT -- sig by Anonymous Coward · · Score: 0

    Life's a bitch...

  109. Safari with SVG by MochaMan · · Score: 1

    Or download a nightly build of Safari with SVG (for those who're not afraid of beta).

    1. Re:Safari with SVG by Ilgaz · · Score: 1

      As licensed owner of Omniweb (modified webkit), I wait for the 5.5 beta and give it a break until it ships.

      So I use the "generic" and "widely supported" Apple Safari not to mess with Firefox incompatabilities and the fact that it is NOT a native OS X program.

      This is the exact reason why I "flamed" the Gecko-only purpose of the report.

      Thanks for the link though. BTW, I heard it doesn't work with that nightly either.

  110. Re:BR tag? Yes, BR's OK, here. by Domo-Sun · · Score: 1

    Except for the blind that need to browse the web with screenreaders. HTML 3 doesn't have the semantic tags that later versions of HTML brought.

    Oh yes. Won't you think of the blind? And won't you think of the children?

    I call shenanigans! Please tell me how using BR is going to mess things up for the blind, because I'm reading all of this with a screen reader, and it's working for me. Please don't simply chant the mantra, try and prove your point.

    Yes, we all know that you're 1337 because you read the spec and you're quoting from it, but we're on a messageboard, it's not like a personal webpage. It's sort of idiotic to have to nest everything in verbose BLOCKQUOTE that requires paragraph nesting, just for a few sentences on /.. I mean, some of these quotes are actually smaller than the tags required to nest them.

    Oh, and then there's the box bugs in old browsers when you try to use DIV for layouts. So now you have all these boxes nested and they're going to create lots of weird gaps.

    I love web development and CSS to death, but it seems like people just don't get when it's ok to break the rules, or they think it's semantic when their page validates.

    Think of semantics like this: There needs to be a contrast between different elements. Maybe it would be best suited with BLOCKQUOTE (I wish we had HTML3's BQ), but as long as there's contrast between the elements, then it's not the end of the world. [DIV class=heading] is an example of little contrast, or rather, the contrast is not in the document.

    Arguing about whether we should use BLOCKQUOTE or I is like arguing whether someone should have used a comma, or semicolon. Or should we say 32 semicolons vs 7 commas? Heck, you could accomplish the same thing by simply saying, "CRCulver said:" and "crabpeople said:", and I might do that if it was a small sentence.

    An aural browser, presumably would read italic differently, just as my user CSS files are written to display italic differently. Any aural browser developer who doesn't do it that way is just stupid.

  111. Re:Opera also supports SVG by mce · · Score: 1

    I just installed the Adobe plugin in a Mozilla 1.7.x tree and get the same effect. Sample SVGs that I find elsewhere on the web work fine, though.

  112. Re:Not so fast - I'm pulling up mostly blank pages by Anonymous Coward · · Score: 0

    Yes. It sucks.

    I am so fed up with Google. They could as easily have done this with normal GIF, JPG, or PNGs. Of course, that's not so "cool", is it?

    Now I have to have a certain version of a certain browser to look at some images? This forum would react very different had that company from Redmond done this.

  113. Re:Not so fast - I'm pulling up mostly blank pages by hixie · · Score: 2, Informative

    It has nothing to do with "cool"; SVG happens to be easier for us to produce than bitmaps, and anyone who is going to be able to read this report and view graphics will be using an SVG-capable browser. The fact that it found bugs in every SVG browser out there is merely a bonus, it means that SVG support will get better.

    We used standards. It's not our fault if there was only one released browser that supported those standards well enough for you to be able to see the graphics.

  114. Re:Worst use of SVG ever by idonthack · · Score: 1

    There's nothing wrong with encouraging. What's wrong is the fact that we're entirely excluded until some body out of our control includes it into the browser we use. And it's hardly standard if only a couple browsers are capable of displaying it, and (from what I hear) not very well either.

    --
    Why is it that when you believe something it's an opinion, but when I believe something it's a manifesto?
  115. OmniWeb by MochaMan · · Score: 1

    Amusingly, I'm also a licenced OmniWeb (sometimes) user. Until Safari came along, it was without question, the best browser by far for Mac OS X. When Safari picked up tabs, I switched and stayed away until about the OW 5.0 timeframe. Since then I switch between OmniWeb and Safari on and off and keep my bookmarks in del.icio.us.

    I also think it's fantastic that the OmniGroup releases their basic frameworks as open source. Very nice gesture to the community.

  116. I'm feeling violated by Sontas · · Score: 2, Insightful

    1 billion pages! Talk about a violation of privacy! The justice department is only asking for a random sample of 1 million addresses and the search results for any 1 week period. This guy gets access to 1 billion pages via the google repository (whatever that is), conducts detailed analysis of the contents of those pages, and nary a word of dissent from the vast Slashdot audience.

  117. Re:Poor style by Google ... you're right-ish by pbhj · · Score: 1

    ::chuckle:: ;0)>