Slashdot Mirror


Google Tags Content Creators

bizwriter writes "Google announced that it will support authorship HTML tags, a way to associate Web content with the individuals who create it. Suddenly, search engines know when one person was responsible for a body of work, no matter where content appears on the Web. If Google incorporates this into page relevance and ranking, as it is considering, the result could change the balance of power between those who create and those who publish."

67 comments

  1. Where's the incentive for publishers? by m50d · · Score: 1

    So you can add these tags that mean google will direct people to the original author rather than your click-through blog - but why would you?

    --
    I am trolling
    1. Re:Where's the incentive for publishers? by Anonymous Coward · · Score: 1

      So you can add these tags that mean google will direct people to the original author rather than your click-through blog - but why would you?

      Because anything that helps put Gawker Media out of business is OK by me.

      More seriously, because if I'm reading your blog's link to an article, it's because I want your commentary on the article. I might want the Fark thread about it, but I certainly don't want Gawker's take on BoingBoing's post about that dude on Reddit who read a NASA press release. If you're just linkwhoring for cash, go for it, but if a blog author is actually trying to provide informed commentary on something, it generally behooves one to link to a primary source, if available.

    2. Re:Where's the incentive for publishers? by Anonymous Coward · · Score: 0

      SEO purposes. If google downranks pages that don't.

    3. Re:Where's the incentive for publishers? by m50d · · Score: 1

      So the SEO types will lie in them, just like they do with meta tags.

      --
      I am trolling
  2. Article Explained by pinkushun · · Score: 4, Informative

    It is made to sound more uncontrolled than it is. This is what really happens:

    The markup uses existing standards such as HTML5 (rel="author") and XFN (rel="me") to enable search engines and other web services to identify works by the same author across the web.

    This is handy, allowing search engines to find content by a specific author. It's not like Google will automatically decide what content links to which author.
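
    To make the markup concrete, here is a minimal Python sketch of how a crawler might pull rel="author" and rel="me" links out of a page. The class name, function name, and sample URLs are made up for illustration; this is not Google's crawler, just the standard-library html.parser applied to the two rel values quoted above.

```python
from html.parser import HTMLParser


class RelLinkCollector(HTMLParser):
    """Collect href values of <a>/<link> elements with rel="author" or rel="me"."""

    def __init__(self):
        super().__init__()
        self.links = {"author": [], "me": []}

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        # rel is a space-separated token list, e.g. rel="me nofollow"
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        for token in rel_tokens:
            if token in self.links and href:
                self.links[token].append(href)


def extract_authorship_links(html_text):
    """Return {"author": [...], "me": [...]} for one HTML document."""
    collector = RelLinkCollector()
    collector.feed(html_text)
    return collector.links


# Hypothetical page fragment, for illustration only
sample = """
<p>Written by <a rel="author" href="/authors/jane">Jane Example</a></p>
<p>Also at <a rel="me" href="http://other.example.org/jane">my other profile</a></p>
"""
links = extract_authorship_links(sample)
```

    Nothing here decides who the author really is; it only collects what the page itself claims.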

    We can't expect Google to give purely weighted search results based on this either. More like they will keep their existing page rankings, and include this extra author meta-data in specialized searches.

    We know that great content comes from great authors, and we’re looking closely at ways this markup could help us highlight authors and rank search results.

    The bnet article seems to overdramatize it, possibly due to a lack of understanding of what this means for content creators.

    Or do I also have the wrong idea?

    1. Re:Article Explained by imamac · · Score: 1

      Parent=better summary. Thank you.

    2. Re:Article Explained by somersault · · Score: 2

      I agree that they probably won't use it in search rankings, otherwise everyone will just copy the current number 1 "best author" in their tags..

      --
      which is totally what she said
    3. Re:Article Explained by tepples · · Score: 1

      It's also a crime in some countries to put a fraudulent notice of authorship on a work. For example, in the United States, see 17 USC 1202-1205.

    4. Re:Article Explained by Balthisar · · Score: 1

      Yes, but does that apply to the source code or to the displayed content? Copyright law doesn't seem to support HTML tags, whereas a direct statement "Copyright 2011 by Firstname Lastname" passes muster.

      (Note that in the USA we all know you don't need a copyright statement to have the copyright. That's not what this is about.)

      --
      --Jim (me)
    5. Re:Article Explained by drinkypoo · · Score: 0

      Yes, but does that apply to the source code or to the displayed content?

      I just checked, and the answer is in the link provided to you. But I'm not going to tell you what the answer is, because that would be enabling your asshat behavior.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    6. Re:Article Explained by swillden · · Score: 1

      Yes, but does that apply to the source code or to the displayed content?

      I just checked, and the answer is in the link provided to you. But I'm not going to tell you what the answer is, because that would be enabling your asshat behavior.

      By my reading of the law... it makes no distinction between source or displayed content, but I see nothing in the law that would prohibit a copyright holder from claiming that someone else was the author. Perhaps some other law would, particularly if the claim could be construed as defamation, but I don't see anything in copyright law that addresses this issue.

      --
      Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
    7. Re:Article Explained by badkarmadayaccount · · Score: 1

      Why not GPG sign it?

      --
      I know tobacco is bad for you, so I smoke weed with crack.
  3. Authorship Tag by imamac · · Score: 1

    The authorship link doesn't work for me so it may answer this, but...what's to stop me from "borrowing" someone's author tag and bumping up my site on the search results?

    1. Re:Authorship Tag by DZign · · Score: 2

      probably nothing.. as well as another site copying your site can just remove your tag and replace it with theirs, claiming they're the original author..

    2. Re:Authorship Tag by Anonymous Coward · · Score: 0

      The authorship link doesn't work for me so it may answer this, but...what's to stop me from "borrowing" someone's author tag and bumping up my site on the search results?

      You are blocked by the need for a reciprocal link, or for being part of the same site.

    3. Re:Authorship Tag by TaoPhoenix · · Score: 1

      The full power of the Copyright SWAT team. Or Slander & Libel.

      Summarizing you, you're talking about putting Respected_Author tags on 4chan posts.

      --
      My first Journal Entry ever, in 8 years! http://slashdot.org/journal/365947/aphelion-scifi-fantasy-horror-poetry-webzine
    4. Re:Authorship Tag by archen · · Score: 1

      Which is exactly what will happen. Current link farms will cross pollinate each other and it will be nearly impossible to tell who really wrote anything. Least likely will be the person who did write the original content.

    5. Re:Authorship Tag by DZign · · Score: 1

      Replying to myself: seems it has to be reciprocal to work. So that's stopping someone from linking to an official author.

      You need rel=me on both sites linking to each other.
      http://www.google.com/support/webmasters/bin/answer.py?answer=1229920

      Now I wonder - it's an html5 tag. Should I already implement it on my own website which isn't html5, or would google then just ignore it?
      I can already put it on my own site, blog, facebook, .. but if it's going to be ignored then I won't bother..

    6. Re:Authorship Tag by Tacvek · · Score: 1

      Google's engine does not distinguish between the various versions of HTML. As long as Google successfully detects the page as html (and it is quite good at determining that), you can use any feature from any version and Google could not care.

      For what it is worth, this markup is also valid HTML 4, but HTML 4 simply does not define the meaning of the "me" or "author" values of the rel attribute, while HTML 5 does define the meaning (although I have not actually verified that).

      --
      Stylish sheet to fix many problems in Slashdot's D3: https://gist.github.com/801524
  4. What could possibly go wrong? by TaoPhoenix · · Score: 2

    Oh dear me, am I missing something?

    So you can totally spoof random people's names into any webpage? So searches for author=Obama come up with doctored pics of Osama-Obama slash or something?

    --
    My first Journal Entry ever, in 8 years! http://slashdot.org/journal/365947/aphelion-scifi-fantasy-horror-poetry-webzine
    1. Re:What could possibly go wrong? by NNKK · · Score: 1

      Oh dear me, am I missing something?

      So you can totally spoof random people's names into any webpage? So searches for author=Obama come up with doctored pics of Osama-Obama slash or something?

      Thanks for the imagery, but what is it that makes you think you can't _already_ claim any random person wrote something? Do you think the normal non-tag text in an HTML document is under a magic spell that prevents misattribution?

    2. Re:What could possibly go wrong? by Anonymous Coward · · Score: 0

      No one is saying this is somehow immune to spam or abuse. This is orthogonal to other measures taken to control (search engine) spam. Well-behaved sites can use this to make the author information explicitly visible to Google and other crawlers. It should be fairly easy to completely discard the author information from sites with bad standing, and still use the information from sites with a clean record.

    3. Re:What could possibly go wrong? by Anonymous Coward · · Score: 0

      So searches for author=Obama come up with doctored pics of Osama-Obama slash or something?

      [Interior, White House]
      "DAMN YOU WIKILEAKS! Those were supposed to be private!"

    4. Re:What could possibly go wrong? by antifoidulus · · Score: 1

      Or you could attach the name of your arch nemesis to the goatse picture....

    5. Re:What could possibly go wrong? by Combatso · · Score: 1

      I'm waiting to see some of my more ugly friends getting tagged as "goatse" by Facebook's face detection

    6. Re:What could possibly go wrong? by Anonymous Coward · · Score: 0

      Actually the article does mention a mechanism against abuse. Every content page has to link to a profile page on the same site, and profile pages on different sites have to reciprocally link to each other.

      Someone trying to spoof Obama won't be able to because they won't be able to place their content on whitehouse.gov, or set up a reciprocal link on http://www.whitehouse.gov/administration/president-obama.
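
      The reciprocal-link rule can be sketched in a few lines of Python. This assumes a crawler has already produced a mapping from each profile page to the rel="me" targets found on it; the function name and all URLs below are hypothetical, and how Google actually evaluates the links is not described in this thread.

```python
def is_reciprocal(profile_a, profile_b, me_links):
    """Return True only if each profile page lists the other as a rel="me" target."""
    return (profile_b in me_links.get(profile_a, set())
            and profile_a in me_links.get(profile_b, set()))


# Hypothetical crawl result: profile URL -> set of rel="me" targets found on it
me_links = {
    "http://blog.example.com/authors/jane": {"http://news.example.org/staff/jane"},
    "http://news.example.org/staff/jane": {"http://blog.example.com/authors/jane"},
    # A spoofer can point at the real profile, but cannot make it link back
    "http://spam.example.net/authors/fake": {"http://news.example.org/staff/jane"},
}

genuine = is_reciprocal("http://blog.example.com/authors/jane",
                        "http://news.example.org/staff/jane", me_links)
spoofed = is_reciprocal("http://spam.example.net/authors/fake",
                        "http://news.example.org/staff/jane", me_links)
```

      The one-way link from the spoofing site fails the check because the real profile never links back to it.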

  5. A tag in the HTML source? It can be ripped... by mfarah · · Score: 1

    If this is implemented via tags in the HTML itself, it can be easily detected and stripped by content thieves, can't it?

    If I copy the entire body of work of, say, the War Nerd, and set up a copycat blog ("the war geek"), how can these tags (which I've already modified) tell this is a blatant rip-off?

    --
    "Trust me - I know what I'm doing."
    - Sledge Hammer
    1. Re:A tag in the HTML source? It can be ripped... by Lunaritian · · Score: 1

      That's probably true. But if I understood this right, the point is to make the authors more visible on the internet - for example if I find a blog I like, I can easily find more writings by the same author, no matter what site they're on.

    2. Re:A tag in the HTML source? It can be ripped... by grumbel · · Score: 1

      Judging from the Google blog this doesn't sound much like a rip protection, but more as a way to allow searches like "Show me everything else the author of this particle has written". That said, rip protection should be possible, when they would mark the first page that they find with content as special and then everything with the same content as copy.

    3. Re:A tag in the HTML source? It can be ripped... by fuzzyfuzzyfungus · · Score: 2

      They can't. The fact that it is just basic HTML means that detect-and-strip will be downright trivial; but there is nothing(outside of the darkest fantasies of the "trusted computing" set) that could actually stop such activity.

      It seems like this falls into the category of 'potentially useful incremental change'. It isn't resistant to rip-offs(but neither was the status quo) and it makes it somewhat easier for good-faith actors to make a pertinent piece of metadata easily accessible. The metadata dreams of the 'semantic web' types seem doomed to founder in a morass of epistemological horrors; but tagging a few bits of metadata that people are obviously interested in seems quite sensible.

      More robust (not entirely bulletproof) solutions would certainly be possible; but they would involve much greater changes to the way web browsers work, and the workflow of common authoring mechanisms. For instance, asymmetric-key crypto and document signing would, if widely used by authors and sensibly interpreted by web browsers and other document/media viewing applications, make authorship claims harder to falsify. (You could still falsely claim to have authored somebody else's work, just strip their signature and substitute your own; but you could no longer falsely claim that somebody else was the author of a given work, since you wouldn't be able to sign it as them.) If you added cryptographically verified timestamps from one or more "trusted" sources, you could go one step further and allow people to demonstrate that they were the first to sign something (which would still be vulnerable to rip-offs by scrapermedia LLC programmatically scooping up every unsigned document that some poor noob puts on the web and automatically 'first-signing' it; but would make stripping and re-signing much easier to detect in general).

      Such changes, though, would, unlike the HTML tag, involve serious overhaul of how the browser works, how much 'normal people' use crypto(and protect their private keys), and the features supported by authoring software. This doesn't mean that it would be a bad thing(in fact, it would have other interesting side-benefits); but it would Not be an easy move to make.

      (The side benefit of such a change for browsers would be that it would allow you to make the browser cache immensely more powerful and useful: in order to support cryptographic verification of authored elements and then integration of those elements with stylesheets and other webpage/CMS goo, browsers would have to have a generic capability to retrieve, cryptographically validate, and then integrate "packages" of material. This capability could also be applied to things like CSS stylesheets, javascript libraries, etc. Hypothetically, for instance, instead of a page simply specifying a javascript library, and a location on the server from which to retrieve it, a page could specify the library, its SHA-whatever hash, and its signer, along with at least one URL at which to obtain it. If the browser already has an object with an identical SHA hash (even if downloaded when visiting some entirely different domain, not uncommon for semi-standard stuff like jquery) it could skip retrieval. This would also allow page authors to link to 3rd party locations without fear of tampering that could compromise their pages. Given the increasing prevalence of large, resource-heavy web applications, use of 3rd party CDNs, and similar, giving browsers the ability to securely cache 'packages' and then use them to construct pages, free from concerns about cross-domain attacks, and giving page authors the ability to securely invoke 3rd party resources, without risk of after-the-fact tampering, would be quite handy.)
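
      A rough Python sketch of the hash-keyed cache idea, assuming SHA-256 as the "SHA-whatever" and a caller-supplied fetch function standing in for the network. The class, the URLs, and the payload are all invented for illustration, and the signer check mentioned above is left out entirely; only the "same hash, skip retrieval, reject tampered copies" part is shown.

```python
import hashlib


class HashCache:
    """Cache resources by the SHA-256 digest of their bytes, not by URL."""

    def __init__(self):
        self._store = {}
        self.fetch_count = 0

    def get(self, expected_sha256, url, fetch):
        cached = self._store.get(expected_sha256)
        if cached is not None:
            # Same digest already held, possibly from another domain: skip retrieval
            return cached
        data = fetch(url)
        self.fetch_count += 1
        actual = hashlib.sha256(data).hexdigest()
        if actual != expected_sha256:
            # The bytes served do not match what the page author asked for
            raise ValueError(f"hash mismatch for {url}")
        self._store[expected_sha256] = data
        return data


# Hypothetical library payload and fetcher, for illustration only
payload = b"function lib() { return 1; }"
digest = hashlib.sha256(payload).hexdigest()


def fake_fetch(url):
    return payload


cache = HashCache()
first = cache.get(digest, "http://cdn-one.example.com/lib.js", fake_fetch)
second = cache.get(digest, "http://cdn-two.example.org/lib.js", fake_fetch)
```

      The second request names a different URL but the same digest, so it is served from the cache without calling the fetcher again.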

    4. Re:A tag in the HTML source? It can be ripped... by msuarezalvarez · · Score: 0

      That must be an amazing particle to get you that interested!

    5. Re:A tag in the HTML source? It can be ripped... by jfengel · · Score: 1

      If you include the host domain in the digital signature, you'd be able to prevent people from re-hosting the work (or at least detect it and ignore copies). You'd still need the priority system you suggested to identify THE author (otherwise, as you say, somebody could rip and re-sign the content for a new host).

      It's probably too much work for the benefit you'd get, but it might be worth the experiment, and Google is exactly the people to do that experiment. It means a vast amount of crunching, possibly too much once everybody (including every spammer) is signing their pages. Enc

    6. Re:A tag in the HTML source? It can be ripped... by WuphonsReach · · Score: 1

      That's probably true. But if I understood this right, the point is to make the authors more visible on the internet - for example if I find a blog I like, I can easily find more writings by the same author, no matter what site they're on.

      Unless the author has a common name like John Doe...

      The only way a tag like this *might* work would be to make the tag value a public-key signature of the content enclosed inside the tag. Which would allow you to see that content A was signed by key XYZ, as was content B and C, but not D.

      This will get abused, just like meta tag keywords got abused.

      --
      Wolde you bothe eate your cake, and have your cake?
  6. Looks abusable to me by mrsam · · Score: 1

    If somehow it's discovered that a particular author earned a high pagerank, what exactly would prevent linkfarms from tagging that author on every one of their pages?

    1. Re:Looks abusable to me by xveg · · Score: 1

      Google is not that dumb, the article is just wrong.

      From google

      This tells search engines: "The linked person is an author of this linking page." The rel="author" link must point to an author page on the same site as the content page. For example, the page http://example.com/content/webmaster_tips could have a link to the author page at http://example.com/authors/mattcutts. Google uses a variety of algorithms to determine whether two URLs are part of the same site. For example, http://example.com/content, http://www.example.com/content, and http://news.example.com can all be considered as part of the same site, even though the hostnames are not identical.
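
      For a feel of what "same site" could mean, here is a deliberately naive Python sketch that compares the last two labels of each hostname. That rule is an assumption made up for this example: Google only says it uses "a variety of algorithms", and this shortcut gives wrong answers for suffixes like co.uk. The function names are hypothetical.

```python
from urllib.parse import urlparse


def naive_site_key(url):
    """Return the last two hostname labels, e.g. 'example.com' (naive assumption)."""
    host = (urlparse(url).hostname or "").lower()
    labels = host.split(".")
    return ".".join(labels[-2:])


def same_site(url_a, url_b):
    """Treat two URLs as the same site if their naive site keys match."""
    return naive_site_key(url_a) == naive_site_key(url_b)


content_page = "http://example.com/content/webmaster_tips"
author_page = "http://example.com/authors/mattcutts"
news_page = "http://news.example.com"

content_matches_author = same_site(content_page, author_page)
content_matches_news = same_site(content_page, news_page)
```

      A real implementation would need something like a public-suffix list rather than counting labels.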

  7. publisher or re-publisher? by sgt+scrub · · Score: 1

    Most people add their HTML to a server in one way or another. Isn't that publishing? It isn't like there are private web sites with articles that were written by an author then transferred to HTML to be posted to the web. Oh wait. No. AOL isn't that way any longer.

    --
    Having to work for a living is the root of all evil.
  8. not even that obvious by r00t · · Score: 1

    I pick a respected author, perhaps academic, who writes about similar things as me. I publish my crap whitepaper claiming to be him. It's likely that no human will notice the deception. Depending on my goals, the human-readable text of the whitepaper will claim the author to be him or me.

    1. Re:not even that obvious by TaoPhoenix · · Score: 1

      Oh, of course.

      I used a little humor. But yes, you absolutely have a clear case - you submit something in an intelligent style, and the first pass no one notices, until it accidentally gets picked up and then they slam the original creator.

      What for example if that math paper that got hosed last week was *spoofed*? It's bad enough if the original author goofed, but since he got pulverized for "not checking", what if it was a classy defamation attack?

      --
      My first Journal Entry ever, in 8 years! http://slashdot.org/journal/365947/aphelion-scifi-fantasy-horror-poetry-webzine
  9. John Smith by r00t · · Score: 1

    Lots of people have common names. You could be a Michael or a Mary or a Mohammed or a Jennifer or a William.

    1. Re:John Smith by Anonymous Coward · · Score: 0

      That's why we have these magical things called "last names."

    2. Re:John Smith by JosJuice · · Score: 0

      The same problem exists when you put a generic copyright notice on anything. First name+last name is usually as far as you'll get, both when using old methods and this new method.

  10. Hooly majoly! by Anonymous Coward · · Score: 0

    Google have put the meta desc UNDER the url.
    I CAN'T COPE!

  11. Fraudulent authorship notices by tepples · · Score: 2

    what's to stop me from "borrowing" someone's author tag

    Federal law, as I pointed out in another comment.

    1. Re:Fraudulent authorship notices by Anonymous Coward · · Score: 0

      what's to stop me from "borrowing" someone's author tag

      Federal law, as I pointed out in another comment.

      Didn't know the federal law had jurisdiction in the UK.

    2. Re:Fraudulent authorship notices by Anonymous Coward · · Score: 0

      That's offset by Google weighting results originating from the user's country. Re-attribution from a UK site for instance would probably be listed below the real author's US site if the person searching is from the US. Or vice versa.

  12. Boy. and how by unity100 · · Score: 1

    will you prevent publishers from modifying that tag on the fly? it's just a simple text replacement operation.

    1. Re:Boy. and how by icebraining · · Score: 1

      Who says it's meant to prevent it?

  13. Locke and Demosthenes by bitflippant · · Score: 1

    I was wondering when it would be possible to quote and requote the amazing debate that will change our society as we know it and transform us all into peace loving philanthropists who respect life. Oh wait! that debate happened already in irc chat.

  14. FINALLY! by Xacid · · Score: 1

    We'll get to find out who Goatse REALLY is.

  15. to those of you saying by circletimessquare · · Score: 1

    that it will be easy to randomize/ spoof/ rip off, and a stupid tag doesn't change anything:

    FIRST APPEARANCE of author tag means something. and no, it doesn't mean i can change the publish date on the file to June 1st, 1896 and always be the first author: when did SEARCH ENGINES first see content XYZ with author tag ABC?

    that's case closed, right there. you can't spoof this system, unless you have a time machine, or you can hack google

    now, if anyone rips off your content, you will be able to point to google's independent records and say "google says i wrote it first, you're ripping me off"
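
    a rough Python sketch of the "who did the search engine see first" record, assuming the engine keys on an exact hash of the text and stamps it with its own crawl time. the class name, URLs, and timestamps are invented; nothing here is known to match what google actually stores, and an exact hash only catches word-for-word copies.

```python
import hashlib


class FirstSeenRegistry:
    """Remember the first (author page, crawl time) recorded for each exact text."""

    def __init__(self):
        self._first = {}

    def record(self, content, author_url, seen_at):
        key = hashlib.sha256(content.encode("utf-8")).hexdigest()
        # setdefault keeps the first recorded sighting; later ones do not overwrite it
        return self._first.setdefault(key, (author_url, seen_at))


registry = FirstSeenRegistry()
article = "Some original article text."

# Hypothetical crawl events, in the order the crawler happened to visit the pages
original = registry.record(article, "http://example.com/authors/jane", 1000)
copycat = registry.record(article, "http://scraper.example.net/authors/bob", 2000)
```

    the copycat's later sighting returns the original record, but only because the crawler reached the original page first; the page's own claimed publish date is never consulted.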

    down the road, this could even replace the copyright system, since this is basically how copyright currently works

    --
    intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
    1. Re:to those of you saying by rwv · · Score: 1

      In that case, ripped off content of sites that Google scans hourly will get credit while real authors who maintain sites that are scanned less frequently won't get the credit they deserve. The new SEO will be "use blogger" which gets scanned (at least by Google) when you press "Publish". Unless Google can collaborate with other sites which allow users to publish data for the "first published" data? Does WordPress have hooks for such a collaboration? Would such a system be able to track plagiarism that is changed/tweaked a bit by a derivative author? I believe derivative content creators already have ways of giving credit to where it's due. I'd sure love to see original content (if such a thing exists in this day and age) and credit-where-its-due content get promoted while the copiers and derivative cheapskates get buried.

    2. Re:to those of you saying by circletimessquare · · Score: 1

      you're talking about some pretty fringe time cases

      besides, the problem is easily corrected: if you write something valuable to you that you fear someone will rip off, you ACTIVELY submit the page to the search engines, rather than waiting for them to be passively scanned

      --
      intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
    3. Re:to those of you saying by Anonymous Coward · · Score: 0

      What are you talking about, http://www.google.com/addurl ? On that page I don't see any guarantee that Google crawls the submitted pages immediately.

    4. Re:to those of you saying by circletimessquare · · Score: 1

      we're talking about a whole new system here, that google just put in place

      so either google is really concerned about properly attributing sources, and guarantees the timestamp on a submission

      or google just added support for the author="" attribute, and all their work means nothing

      besides, you really believe there's no timestamp record on their addurl page?

      google, the people who track everyone and everything?

      --
      intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
    5. Re:to those of you saying by Anonymous Coward · · Score: 0

      What would they use it for? Timeline:
      - SEO tells Google about url U at time T
      - SEO finds a website that just published new cool content
      - SEO copies the content to url U
      - Google scans url U, declares SEO the original author based on timestamp T?

      Is your IQ frigging negative?

    6. Re:to those of you saying by circletimessquare · · Score: 1

      hey, asshole: it's a new system, give it time. i'm glad you've decided everything already for all of us. don't be such a blowhard

      --
      intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
  16. Links to URL, not name by Mathinker · · Score: 1

    See details here, where it is explained that all works authored by someone in a domain should be linked to a unique author page at that domain, and that authors can associate/link their author pages between various domains using reciprocal linking.

  17. What about plagiarism? by mcgrew · · Score: 1

    Will this help or hurt? A little before the turn of the century I researched Quake and Quake II console commands, tested them all, and wrote short descriptions of how to use them and what they did. It was copied on dozens of other web sites, word for word, usually with no attribution and usually with someone else's name on it.

    Meta tags were badly misused to spam search engines. And what if you're putting content on someone else's site and have no control over the meta tags?

    1. Re:What about plagiarism? by Curunir_wolf · · Score: 1

      Will this help or hurt? A little before the turn of the century I researched Quake and Quake II console commands, tested them all, and wrote short descriptions of how to use them and what they did. It was copied on dozens of other web sites, word for word, usually with no attribution and usually with someone else's name on it.

      I'm not sure that would even be covered by copyright law. You aren't allowed to copyright "facts" or "factual data". Maybe if your "short descriptions" were long enough, or expounded on the command beyond being a simple summary, it could be considered an original work. But for the most part, a simple compilation or list of factual information is not considered a copyrightable work.

      --
      "Somebody has to do something. It's just incredibly pathetic it has to be us."
      --- Jerry Garcia
    2. Re:What about plagiarism? by jvkjvk · · Score: 1

      I, on the other hand, believe it would be.

      Here's the original authorship:

      wrote short descriptions of how to use them and what they did

      Or are you saying that technical help documentation cannot be copyrighted?

      I imagine there are a few other people who would disagree with that as well.

      Note - this is entirely separate from a discussion on what *should* be able to be copyrighted, much less what goals we wish with the laws and whether they accomplish those goals.

      Regards.

    3. Re:What about plagiarism? by mcgrew · · Score: 1

      The data can't be copyrighted, but its presentation is. If you write a book about chemistry I can read it, learn from it, and write my own chemistry book using the facts from your book as long as I present those facts in my own words. The plagiarists copied the entire thing whole cloth, even using the same IP address I used in one of the examples. Although my question here is about plagiarism rather than copyright infringement (I had no problem with someone republishing it provided they gave me credit and a link to the original, which a few folks did), it was certainly copyright infringement. Were I a greedhead I could have probably hired a lawyer and gotten rich.

    4. Re:What about plagiarism? by pinkushun · · Score: 1

      Well then let me thank you for those lists, muchly appreciated! :: Q1 fan

  18. That's because the UK has its own counterpart by tepples · · Score: 2

    Didn't know the federal law had jurisdiction in the UK.

    That didn't stop your Parliament from enacting its own counterpart to this legislation in 2003, as section 296ZG of British copyright law.

  19. Re:Claim by TaoPhoenix · · Score: 1

    Because this is an Author Tag! (Cue the Serious Stern Face.)

    Of course twerps can claim stuff. So far people can just laugh stuff off.

    Now the obvious use of the tag is for the copyright police... they're gonna try to make the author tag a statement almost akin to under oath. So all those tv show clips on youtube that don't have the network=author tag are instant slam-bait.

    But now the more dangerous case is when Da Gov wants to do False Flag cases, and posts pics of Democrats sharing lingerie, and they put "Author=___Congressman", they fire it away as a "political hit and run" and leave him explaining to the masses that "it wasn't him, I didn't lick".

    Remember all these break-in cases? If the hacker breaks into your account, and posts stuff on your account with you as the Author, same thing. "Dammit, that's not my Brittany-CookieMonster mashup!"

    In short, by making a tag out of it all, it's a case of something truly awful.

    --
    My first Journal Entry ever, in 8 years! http://slashdot.org/journal/365947/aphelion-scifi-fantasy-horror-poetry-webzine
  20. Gaming the system? by ResidentSourcerer · · Score: 1

    If this were used for ranking, then I would expect web masters to attribute articles to Big Names.

    I would hope that Google would have a policy of fingerprinting the articles. Most people's writing style is sufficiently unique that claiming that someone else wrote Foo is fairly obvious on analysis.

    I hope also that there is a search tool so that I can find all articles attributed to me.

    And suppose that Slashdot and phpBB support this tag so that I can find all the posts by a given author.

    --
    Third Career: Tree Farmer Second Career: Computer Geek First Career: Teacher, Outdoor Instructor, Photographer.