Slashdot Mirror


Google Accused of Trying To Patent Public Domain Technology (bleepingcomputer.com)

An anonymous reader shares a report: A Polish academic is accusing Google of trying to patent technology he invented and that he purposely released into the public domain so companies like Google couldn't trap it inside restrictive licenses. The technology's name is Asymmetric Numeral Systems (ANS), a family of entropy coding methods that Polish assistant professor Jarosaw (Jarek) Duda developed in the early 2000s, and which is now hot tech at companies like Apple, Google, and Facebook, mostly because it can improve data compression from 3 to 30 times. Duda says that Google is now trying to register a patent that includes most of the ANS basic principles. Ironically, most of the technology described in the patent, Duda said he explained to Google engineers in a Google Groups discussion from 2014. The researcher already filed a complaint, to which WIPO ISA responded by calling out Google for not coming up with "an inventive contribution over the prior art, because it is no more than a straightforward application of known coding algorithms." A Google spokesperson refused to comment, and the mystery remains surrounding Google's decision to patent something that's in the public domain since 2014.

6 of 101 comments (clear)

  1. Re:Not necessarily Google, per se. by ShanghaiBill · · Score: 4, Informative

    This could also be a case where the employee, seeking to get rewarded for a patent, hid the prior art from Google.

    Unlikely. Most companies FORBID engineers from searching for prior art. That is done by the legal dept, not engineering.

    If you let engineers search for prior art, you open yourself up to lawsuits for intentional infringement. If you have an explicit policy against patent searches, you can always claim "Hey, we didn't know".

  2. Re:They'll get the patent by kanweg · · Score: 4, Informative

    No, if you look at the Search report it says that all claims are not New, so not patentable, in view of an article by Duda.

    https://worldwide.espacenet.co...

    Your prejudice shows. ....

  3. Re:What's the problem? by JesseMcDonald · · Score: 3, Informative

    Without a patent he can't (successfully) sue them for using it, no, but since he released it into the public domain we must assume that he didn't intend to do that anyway. He wanted people to use it. The fact that the technique was well-known before Google attempted to patent it does mean that their patent application is invalid due to prior art. You can't just patent a technique someone else invented, even if it is in the public domain.

    --
    "The state is that great fiction by which everyone tries to live at the expense of everyone else." - Bastiat
  4. ANS is published. by DrYak · · Score: 5, Informative

    Sounds like he didn't actually register a patent,

    No indeed, he didn't.

    but simply declared that his idea was public domain. The article isn't clear on exactly how he did so.

    Wut ? It's right there even in the TFS on /. : he published the stuff back in the 2000s.

    The patent office won't necessarily count that as prior art, unless it's formally published.

    If you google a bit around :
    - arXiv:0710.3861 - "Optimal encoding on discrete lattice with translational invariant constrains using statistical algorithms" first published in 2007 (that's the bat-shit crazy stuff that only a few mathematicians managed to understand but lay ground for the whole stuff)
    - arXiv:0902.0271 - "Asymmetric numeral systems" first published in 2009 (second paper, where he re-visited these concept, and which spawned, among other the FSE - Finite State Entropy - implementation of tANS that is used by Yann Collet's Zstd - recently moved to facebook).
    - arXiv:1311.2540 - "Asymmetric numeral systems: entropy coding combining speed of Huffman coding with compression rate of arithmetic coding" first published in 2013 and cites actual implementation such as ycollet's fse.

    All these papers (which also cite actual real-world implementations) all predate Google's patent filing.

    To actually prevent a company from monopolizing the idea, the most effective strategy would be to actually patent it and put it under a copyleft patent license... that is, patentleft.

    Or you know, just publish it.
    Like everybody else does in the academic world.
    Formal publication DOES COUNT as prior art in most sane parts of the world.

    That's also why Range-Coding, the predecessor of tANS and cousin of the patented arithmetic coding isn't patented itself: it was published.

    Disclaimer: I've worked on entropy encoders for the compression of genomic data as part of the PoSeNoGap project.

    --
    "Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
  5. Prior Art. by DrYak · · Score: 4, Informative

    Also, releasing something into the public domain means abandoning all rights to it. So rather than ensuring Google can't patent it,

    He did so by publishing it (look my other post with arxiv refs).
    These publications constitute prior art.
    Google CANNOT patent it be cause by now, 2017, this techniques have been known for 10 years.
    (Including successful implementation by Yann Collet's FSE and another one by one of the coders of the Farbrausch demo team).

    he ensured that he has no standing to sue.

    He can technically challenge the patent on ground of prior art.

    --
    "Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
  6. Nit picking : NOT 3-30x better but *faster*. by DrYak · · Score: 5, Informative

    Just to nitpick :
    tANS (table Assymetric Numeral Systems) such as the FSE (Finite State Entropy) implementation by Yann Collet, at the hearth of Zstd compressor (now Facebook's) are NOT 3-30x better than other modern post-Huffman entropy encoder, such as binary-arithmetic encoding or range-encoding.
    They are much *FASTER*. By lots.

    They all boil down to the same logic:
    do not use a fixed code-word book like Huffman (which is thus limited to integer number of bits).

    but try to get as close as Shanon's theory predict the necessary bits, by subdividing number space.

    range encoding works by dividing an arbitrary big number (usually an extremely long binary number).
    For each symbol, you split this range in sub-ranges. More frequent symbol get a wider sub-range, more rarer symbol get a narrower one.
    You pick the sub-range corresponding to the next symbol in the text you need to encode.
    Then you keep the same work by subdividing *THAT* sub-range by the probabilities of symbol occuring in the position after that.
    - Encoding relies a lot on multiplications (there isn't such a thing as a direct multiplication in transistors. Instead you implement it in microcode by combining shifting and adding. modern CPU have big shift-adders units, so they can manage in chunks of 64bits - on modern Intel it's a few cycles delay, 1 more if you want the upper 64bits of a 128bits product).
    - Decoding relies on division (rules of three to "zoom" into the subranges), and division are fucking slow (again no such things as "division" with transistors. Instead you implement it in microcode, and most modern CPU tend to do it bit-by-bit meaning it's fucking slow. - on modern Intel it's dozens of cycles delay).
    This thing was published in 1979. (And thus isn't patentable)
    It's provable that it approaches arbitrarily close the Shanon limit. (all the symbols in the stream occupy a total of bits that is close the invert log2 of their respective frequencies).

    Arithmetic Coding is a cousin technique patented more or less in the same era.
    In can be seen as a special sub-case of range-encoding where the range is [0;1] and you subdivide it in fraction.
    It's most often implemented as binary arithmetic encoding.
    Input symbol are converted into a bit stream.
    Then for each bit, the fraction is divided in two half. e.g.: if each bit has a 50:50 chance, the [0;1] is subdivided at 0.5
    You chose the range under or above this mid-point depending if the bit is 0 or 1 and move forward to the next bit and sudivide the fraction based on the next probability (e.g.: mid point at 0.25).
    Given that there are only 2 symbols, and therefore only a mid-point fraction to keep track of, its implementation is a little bit simpler to follow.
    But because this works on *every single bit* of the input stream, you can guess it's either slow (on CPU - hence CABAC versus CAVLC in H264 videos) or require high-frequency on hardware implementation.
    As it is basically a variation of Range Encoding it can achieve similar arbitrarily close to optimum encoding.

    Then come Mr. Duda with his ANS.
    They are basically range encoding turned on its head - to understand you must basically look at range encoding's bit the other way around.
    - Range encoding works by subdividing an arbitrarily big number. ANS work by build a progressively bigger number.
    - Range encoder usually works by writing out the most significant (high bits) away and then shift the working values. - ANS works by considering the least significant bit (the low part).

    And here comes the real magic :
    - With range-entropy: if you have 2 symbols with 50:50 chances, you subdivide the range in 2 halfs. With a different probability you still end up with 2 adjacent sub-ranges of different lenght.
    - With ANS, because you figuratively "reverse the bits" : if you have 2 symbols with 50:50 chance, you'll find them alternating in

    --
    "Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]