Slashdot Mirror


Hulu Munging HTML With JS To Protect Content

N!NJA writes "Hulu has started encoding the HTML that they send to people's browsers, and then decoding it using JavaScript before rendering it. [...] They then run the character stream through a series of JavaScript functions to convert it back into plain text before pushing it into your browser using DHTML. That's quite a lot of effort just for fun, so I assume that is to stop screen scrapers from parsing content." I really can't understand all this effort. Boxee displayed the Hulu advertising perfectly. I suspect Alec Baldwin is to blame.

6 of 281 comments

  1. Cat & Mouse. by 0100010001010011 · · Score: 5, Informative

    The XBMC guys already made a plugin after the last hulu change. It'll take a few hours and a new one will be made.

    Especially if you SEND the user all the info they need, how hard is it to decode functions? There are crackers out there that take decoded assembly to figure out how to bypass DRM, what makes Hulu think their implementation will be any more difficult?

    1. Re:Cat & Mouse. by tweek · · Score: 4, Informative

      It has nothing to do with piracy. It has to do with revenue from cable company contracts. The problem the "content providers" had was that via Boxee and other set-top PCs, people could forgo cable altogether and that would be a huge chunk of lost revenue. Hulu is popular but the ad revenue from Hulu is nothing compared to the money the cable companies pay "content providers".

      * I quote "content providers" because Hulu liked to use that phrase when Boxee was shut out. The fact of the matter is that Hulu is co-owned by two of these "content providers" so in essence, Hulu *IS* the "content provider".

      --
      "Fighting the underpants gnomes since 1998!" "Bruce Schneier knows the state of schroedinger's cat"
  2. Phase One is Over by wonkavader · · Score: 5, Informative

    TunerFreeMCE couldn't scrape the data. Mission accomplished. Oh, wait... Tada:

    "Update- version 2.6.7 is now available to download to work round this new tactic."

    And now, I suppose, there will be a DMCA attack as phase two.

  3. Re:Dumb question here by ynef · · Score: 5, Informative

    Yes, in fact, HtmlUnit is my preferred browser simulation library in Java for this very reason: it allows you to write very easy to understand Java code, and it uses Rhino as a JavaScript interpreter. Completely brilliant, and yet few people know about it.

  4. Re:Huh? by AKAImBatman · · Score: 4, Informative

    The particular situation here deals with compressed/encoded HTML in an effort to prevent screen-scraping. This leaves two options for screen scrapers:

    Option 1
    1) Figure out how the decoder works
    2) Replicate the decoder functionality in the screen scraper
    3) Parse the decoded HTML
    4) Make changes as the encoding scheme changes
    5) ???
    6) Profit!

    Option 2
    1) Link a JavaScript engine like SpiderMonkey, Rhino, V8, or SquirrelFish into the screen scraper
    2) Run the JavaScript to decode the HTML
    3) Parse the decoded HTML
    4) ???
    5) Profit!
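
To make Option 1 concrete, here is a minimal Python sketch of steps 2 and 3. The thread never describes the actual Hulu encoding, so the scheme below (XOR with a single-byte key, then base64) is purely an assumption for illustration, and the names KEY, encode_html, decode_html, LinkCollector and scrape_links are all hypothetical. The point is only that once the scraper has worked out the decoder, replicating it and parsing the result takes a few lines.

```python
import base64
from html.parser import HTMLParser

# Hypothetical single-byte key. This is NOT Hulu's real scheme, only a stand-in.
KEY = 0x5A


def encode_html(html):
    """Stand-in for the server side: XOR every byte, then base64 the result."""
    raw = bytes(b ^ KEY for b in html.encode("utf-8"))
    return base64.b64encode(raw).decode("ascii")


def decode_html(blob):
    """Option 1, step 2: the site's decoder re-implemented inside the scraper."""
    raw = base64.b64decode(blob)
    return bytes(b ^ KEY for b in raw).decode("utf-8")


class LinkCollector(HTMLParser):
    """Option 1, step 3: parse the decoded HTML and collect link targets."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def scrape_links(blob):
    """Decode an obfuscated page and return every href found in it."""
    parser = LinkCollector()
    parser.feed(decode_html(blob))
    parser.close()
    return parser.links


# Made-up sample page, encoded the way the hypothetical server would send it.
page = '<ul><li><a href="/watch/1">Episode 1</a></li><li><a href="/watch/2">Episode 2</a></li></ul>'
blob = encode_html(page)
print(scrape_links(blob))  # ['/watch/1', '/watch/2']
```

Step 4 is where the cat-and-mouse cost lands: every time the site changes its encoding, decode_html has to be rewritten by hand. Option 2 avoids that by running the site's own decoder in an embedded engine.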

  5. Re:Brand dilution guys.... by MightyYar · · Score: 4, Informative

    They are being knuckleheads. Their "website" is analogous to a traditional TV channel and Boxee is analogous to a set-top cable box. You'd still get the Hulu ads, still get the Hulu branding.

    To be fair, it seems like Hulu would very much like to be on Boxee - the distaste for the content providers' policies is palpable on their blog.

    --
    W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.