Slashdot Mirror


ZeoSync Makes Claim of Compression Breakthrough

dsb42 writes: "Reuters is reporting that ZeoSync has announced a breakthrough in data compression that allows for 100:1 lossless compression of random data. If this is true, our bandwidth problems just got a lot smaller (or our streaming video just became a lot clearer)..." This story has been submitted many times due to the astounding claims - Zeosync explicitly claims that they've superseded Claude Shannon's work. The "technical description" from their website is less than impressive. I think the odds of this being true are slim to none, but here you go, math majors and EE's - something to liven up your drab dull existence today. Update: 01/08 13:18 GMT by M : I should include a link to their press release.

25 of 989 comments

  1. 100:1 ? I don't think so... by Mr Thinly Sliced · · Score: 5, Insightful

    They claim 100:1 compression for random data. The thing is, if that's true, then let's say we have data A (size 1,000,000).

    compress(A) = B

    Now, B is 1/100th the size of A (size 10,000), but it too is random, right?

    On we go:
    compress(B) = C (size is now 100)
    compress(C) = D (size 1).

    So everything compresses into 1 byte.

    Or am I missing something.
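
The repeated-compression argument can be tried against a real lossless compressor. The sketch below is only an illustration: it uses Python's standard zlib module (nothing to do with ZeoSync's unpublished method), and the helper name compress_repeatedly is made up for this example. It feeds the output of each round back in as the next input and records the sizes.

```python
import os
import zlib


def compress_repeatedly(data: bytes, rounds: int) -> list[int]:
    """Return the data size before compression and after each zlib round."""
    sizes = [len(data)]
    for _ in range(rounds):
        # Each round compresses the previous round's output.
        data = zlib.compress(data, 9)
        sizes.append(len(data))
    return sizes


if __name__ == "__main__":
    # Random bytes: no redundancy for the compressor to exploit.
    print("random bytes:", compress_repeatedly(os.urandom(100_000), 3))
    # All zeros: maximally redundant, so the first round shrinks it a lot.
    print("all zeros:   ", compress_repeatedly(bytes(100_000), 3))
```

With random input, each round should come out slightly larger than the one before, because zlib has to add its own framing around data it cannot shrink. The all-zeros input collapses on the first round, which is the only kind of "random" data a 100:1 figure is plausible for.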

    Mr Thinly Sliced

    1. Re:100:1 ? I don't think so... by oyenstikker · · Score: 5, Funny

      Maybe they'll be able to compress their debt to $1 when they go under.

      --
      The masses are the crack whores of religion.
    2. Re:100:1 ? I don't think so... by arkanes · · Score: 5, Insightful

      I suspect that when they say "random" data, they are using marketing-speak random, not math-speak random. Therefore, by 'random', they mean "data with lots of repetition like music or video files, which we'll CALL random because none of you copyright-infringing IP thieving pirates will know the difference"

    3. Re:100:1 ? I don't think so... by MikeTheYak · · Score: 5, Insightful

      It goes beyond bullshit into the realm of humor:

      ZeoSync has developed the TunerAccelerator(TM) in conjunction with some traditional state-of-the-art compression methodologies. This work includes the advancement of Fractals, Wavelets, DCT, FFT, Subband Coding, and Acoustic Compression that utilizes synthetic instruments. These are methods that are derived from classical physics and statistical mechanics and quantum theory, and at the highest level, this mathematical breakthrough has enabled two classical scientific methods to be improved, Huffman Compression and Arithmetic Compression, both industry standards for the past fifty years.

      They just threw in a bunch of compression buzzwords without even bothering to check whether they have anything to do with lossless compression...

    4. Re:100:1 ? I don't think so... by Bandman · · Score: 5, Funny

    I get the idea that this part of the algorithm is perfected by them... it's the decompressor that's giving them fits...

      Step 1: Steal Underpants
      Step 3: Profit!

      We're still working on step 2

  2. Time for a new law of information theory? by Anonymous Coward · · Score: 5, Funny

    The odds on a compression claim turning out to be true are always identical to the compression ratio claimed?

  3. Tech details from the crappy Flash-only website by bleeeeck · · Score: 5, Informative
    ZeoSynch's Technical Process: The Pigeonhole Principle and Data Encoding Dr. Claude Shannon's dissertation on Information Theory in 1948 and his following work on run-length encoding confidently established the understanding that compression technologies are "all" predisposed to limitation. With this foundation behind us we can conclude that the effort to accelerate the transmission of information past the permutation load capacity of the binary system, and past the naturally occurring singular-bit-variances of nature can not be accomplished through compression. Rather, this problem can only be successfully resolved through the solution of what is commonly understood within the mathematical community as the "Pigeonhole Principle."

    Given a number of pigeons within a sealed room that has a single hole, and which allows only one pigeon at a time to escape the room, how many unique markers are required to individually mark all of the pigeons as each escapes, one pigeon at a time?

    After some time a person will reasonably conclude that:
    "One unique marker is required for each pigeon that flies through the hole, if there are one hundred pigeons in the group then the answer is one hundred markers". In our three dimensional world we can visualize an example. If we were to take a three-dimensional cube and collapse it into a two-dimensional edge, and then again reduce it into a one-dimensional point, and believe that we are going to successfully recover either the square or cube from the single edge, we would be sorely mistaken.

    This three-dimensional world limitation can however be resolved in higher dimensional space. In higher, multi-dimensional projective theory, it is possible to create string nodes that describe significant components of simultaneously identically yet different mathematical entities. Within this space it is possible and is not a theoretical impossibility to create a point that is simultaneously a square and also a cube. In our example all three substantially exist as unique entities yet are linked together. This simultaneous yet differentiated occurrence is the foundation of ZeoSync's Relational Differentiation Encoding(TM) (RDE(TM)) technology. This proprietary methodology is capable of intentionally introducing a multi-dimensional patterning so that the nodes of a target binary string simultaneously and/or substantially occupy the space of a Low Kolmogorov Complexity construct. The difference between these occurrences is so small that we will have for all intents and purposes successfully encoded lossley universal compression. The limitation to this Pigeonhole Principle circumvention is that the multi-dimensional space can never be super saturated, and that all of the pigeons can not be simultaneously present at which point our multi-dimensional circumvention of the pigeonhole problem breaks down.

  4. The proofs in the pudding. by neo · · Score: 5, Funny

    ZeoSync said its scientific team had succeeded on a small scale in compressing random information sequences in such a way as to allow the same data to be compressed more than 100 times over -- with no data loss. That would be at least an order of magnitude beyond current known algorithms for compacting data.

    ZeoSync announced today that the "random data" they were referencing is a string of all zeros. Technically this could be produced randomly and our algorithm reduces this to just a couple of characters, a 100 times compression!!

  5. Re:how can this be? by Rentar · · Score: 5, Funny
    I'm going to agree with you here. If there's no pattern in the data, how can you find one and compress it. The reason things like gzip work well on c files (for instance) is because C code is far from random. How many times do you use void or int in a C file? a lot :)

    So a Perl program can't be compressed?

  6. Re:Current ratio? by radish · · Score: 5, Informative


    For lossless (e.g. zip, not jpg, mpg, divx, mp3 etc etc) you are looking at about 2:1 for 8-bit random, much better (50:1?) for ascii text (e.g. 7-bit non-random).

    If you're willing to accept loss, then the sky's the limit, mp3 @ 128kbps is about 12:1 compared to a 44k 16bit wave.

    --

    ---- One button is the power button, the other is the Bender voice button: "Bite My Shiny Metal Ass"

  7. Some background reading: by Quixote · · Score: 5, Interesting

    Section 1.9 of the comp.compression FAQ is good background reading on this stuff. In particular, read the "WEB story".

  8. Egads... by RareHeintz · · Score: 5, Funny
    ZeoSync said its scientific team had succeeded on a small scale...

    The company's claims, which are yet to be demonstrated in any public forum...

    ...if ZeoSync's formulae succeed in scaling up...

    Call the editors at Wired... I think we have an early nominee for the 2k2 vaporware list.

    ZeoSync expects to overcome the existing temporal restraints of its technology

    Ah... So even if it's not outright bullshit, it's too slow to use?

    "Either this research is the next 'Cold Fusion' scam that dies away or it's the foundation for a Nobel Prize," said David Hill...

    Somehow I think this is going to turn out more Pons-and-Fleischmann than Watson-and-Crick. Almost anytime there's a press release with such startling claims but no peer review or public demonstration, someone has forgotten to stir the jar.

    When they become laughingstocks, and their careers are forever wrecked, I hope they realize they deserve it. And I hope their investors sue them.

    I should really post after I've had my coffee... I sound mean...

    OK,
    - B

    1. Re:Egads... by RareHeintz · · Score: 5, Funny
      Of course! What was I thinking? Why not just use a table lookup of every possible sequence of bytes of any length?

      See you all later - I have some coding to do!

      OK,
      - B

  9. They are using time travel! by harlows_monkeys · · Score: 5, Funny
    From one of the things on their site: Although currently demonstrating its technology on very small bit strings, ZeoSync expects to overcome the existing temporal restraints of its technology and optimize its algorithms to lead to significant changes in how data is stored and transmitted (emphasis added).

    Using time travel, high compression of arbitrary data is trivial. Simply record the location (in both space and time) of the computer with the data, and the name of the file, and then replace the file with a note saying when and where it existed. To decompress, you just pop back in time and space to before the time of the deletion and copy the file.

  10. Directed evolution by HalfFlat · · Score: 5, Funny

    They're looking for investment money?

    Just think of it as an innumeracy tax on venture capitalists.

  11. Re:how can this be? by tjansen · · Score: 5, Informative
  12. Re:Current ratio? by markmoss · · Score: 5, Informative

    whats the current ratio? I would take the *zip algorithms as a standard. (I've seen commercial backup software that takes twice as long to compress the data as Winzip but leaves it 1/3 larger.) Zip will compress text files (ASCII such as source code, not MS Word) at least 50% (2:1) if the files are long enough for the most efficient algorithms to work. Some highly repetitive text formats will compress by over 90% (10:1). Executable code compresses by 30 to 50%. AutoCAD .DWG (vector graphics, binary format) compresses around 30%. Back when it was practical to use PKzip to compress my whole hard drive for backup, I expected about 50% average compression. This was before I had much bit-mapped graphics on it.

    Bit-mapped graphic files (BMP) vary widely in compressibility depending on the complexity of the graphics, and whether you are willing to lose more-or-less invisible details. A BMP of black text on white paper is likely to zip (losslessly) by close to 100:1 -- and fax machines perform a very simple compression algorithm (sending white*number of pixels, black*number of pixels, etc.) that also approaches 100:1 ratios for typical memos. Photographs (where every pixel is colored a little differently) don't compress nearly as well; the JPEG format exceeds 10:1 compression, but I think it loses a little fine detail. And JPEGs compress by less than 10% when zipped.

    IMHO, 100:1 as an average (compressing your whole harddrive, for example), is far beyond "pretty damn good" and well into "unbelievable". I know of only two situations where I'd expect 100:1. One is the case of a bit-map of black and white text (e.g., faxes), the other is with lossy compression of video when you apply enough CPU power to use every trick known.
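
The fax-style scheme described above (send a colour, then how many pixels of it follow) is run-length encoding. A minimal sketch, assuming a made-up 1728-pixel scan line where "W" is white and "B" is black; the function names are hypothetical and this is plain RLE, not the exact coding a real fax machine uses.

```python
from itertools import groupby


def rle_encode(pixels: str) -> list[tuple[str, int]]:
    """Collapse each run of identical pixels into a (colour, run length) pair."""
    return [(colour, len(list(run))) for colour, run in groupby(pixels)]


def rle_decode(runs: list[tuple[str, int]]) -> str:
    """Expand (colour, run length) pairs back into the original pixel string."""
    return "".join(colour * count for colour, count in runs)


# Hypothetical scan line: mostly white paper crossed by two thin black strokes.
scanline = "W" * 700 + "B" * 12 + "W" * 30 + "B" * 8 + "W" * 978
runs = rle_encode(scanline)

# Lossless: decoding gives back exactly the same pixels.
assert rle_decode(runs) == scanline
print(len(scanline), "pixels ->", len(runs), "runs:", runs)
```

The saving comes entirely from the long white runs. A photograph, where neighbouring pixels rarely match exactly, would produce nearly one run per pixel and gain nothing.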

  13. Not possible by Eivind · · Score: 5, Informative
    Someone already pointed out that repeated compression would give infinite compression with this method. But there's another easy way to show that no compressor can ever manage to shrink all messages.

    The proof goes like this:

    • Assume someone claims a compressor that will compress any X-byte message to Y bytes where Y<X
    • There are 2^(8*X) possible messages X bytes long.
    • There are 2^(8*Y) possible messages Y bytes long.
    • Since Y is smaller than X, this means that no 1 to 1 mapping between the two sets can exist, because they're not equally large.
    You see this simply if I claim a compressor that can compress any 2-byte message to 1 byte.

    There are then 65536 possible input messages, but only 256 possible outputs. So it is mathematically certain that 99.6% of the messages cannot be represented in 1 byte (regardless of how I choose to encode them).

    These claims surface every so often. They're bullshit every time. It's even a FAQ entry on comp.compression.
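
The 2-byte-to-1-byte case is small enough to check by brute force. The sketch below is an illustration only: worst_collision, keep_first_byte and xor_bytes are hypothetical names, and the two toy "compressors" stand in for any scheme that maps 2 bytes to 1. It feeds every possible 2-byte message through a compressor and reports how many inputs end up sharing the most crowded output.

```python
from collections import Counter
from typing import Callable


def worst_collision(compress: Callable[[bytes], bytes], input_len: int = 2) -> int:
    """Run every input_len-byte message through compress and return the
    largest number of distinct inputs that map to the same output."""
    outputs = Counter(
        compress(n.to_bytes(input_len, "big")) for n in range(256 ** input_len)
    )
    return max(outputs.values())


def keep_first_byte(message: bytes) -> bytes:
    """Toy 2:1 'compressor': throw the second byte away."""
    return message[:1]


def xor_bytes(message: bytes) -> bytes:
    """Toy 2:1 'compressor': XOR the two bytes together."""
    return bytes([message[0] ^ message[1]])


print("2-byte inputs: ", 256 ** 2)
print("1-byte outputs:", 256 ** 1)
print("keep_first_byte, inputs per output:", worst_collision(keep_first_byte))
print("xor_bytes, inputs per output:      ", worst_collision(xor_bytes))
```

Whatever function is plugged in, 65536 inputs have to share 256 outputs, so some output must stand for at least 256 different messages and the decompressor cannot know which one was meant.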

  14. Re:how can this be? by ergo98 · · Score: 5, Informative

    Well firstly I'd say the press release gives a pretty clear picture of the reality of their technology: It has such an overuse of supposedly TM'd (anyone want to double check the filings? I'm going to guess that there are none) "technoterms" like "TunerAccelerator" and "BinaryAccelerator" that it just is screaming hoax (or creative deception), not to mention a use of Flash that makes you want to punch something. Note that they give themselves huge openings such as always saying "practically random" data: What the hell does that mean?

    I think one way to understand it (Because all of us at some point or another have thought up some half-assed, ridiculous way of compressing any data down to 1/10th -> "Maybe I'll find a denominator and store that with a floating point representation of..."), and I'm saying this as not a mathematician or compression expert : Let's say for instance that this compression ratio is 10 to 1 on random data, and I have every possible random document 100 bytes long -> That means I have 6.6680144328798542740798517907213e+240 different random documents (256^100). So I compress them all into 10 byte documents, but the maximum variations of a 10-byte document is 1208925819614629174706176 : There isn't the entropy in a 10-byte document to store 6.6680144328798542740798517907213e+240 different possibilities (it is simply impossible, no matter how many QuantumStreamTM HyperTechTM TechoBabbleTM TermsTM) : You end up needing, tada, 100 bytes to have the entropy to possibly store all variants of a 100 byte document, but of course most compression routines put in various logic codes and actually increase the size of the document. In the case of the ZeoSync claim though they're apparently claiming that somehow you'll represent 6.6680144328798542740798517907213e+240 different variations in a single byte : So somehow 64 tells you "Oh yeah, that's variation 5.5958572359823958293589253e+236!". Maybe they're using SubSpatialQuantumBitsTM.
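
The figures in this comment can be reproduced with exact integer arithmetic. A small sketch, where the helper name min_inputs_sharing_an_output is made up for the example: it counts the possible documents at each size and, by the pigeonhole principle, how many inputs must share at least one output.

```python
def document_count(size_in_bytes: int) -> int:
    """Number of distinct documents of exactly this many bytes."""
    return 256 ** size_in_bytes


def min_inputs_sharing_an_output(input_bytes: int, output_bytes: int) -> int:
    """Pigeonhole bound: some output must be shared by at least this many inputs."""
    inputs = document_count(input_bytes)
    outputs = document_count(output_bytes)
    return -(-inputs // outputs)  # ceiling division on exact integers


print(f"100-byte documents: {document_count(100):.4e}")
print(f"10-byte documents:  {document_count(10)}")
print(f"10:1  -> inputs per output: {min_inputs_sharing_an_output(100, 10):.4e}")
print(f"100:1 -> inputs per output: {min_inputs_sharing_an_output(100, 1):.4e}")
```

Python integers are arbitrary precision, so the counts are exact; only the printing rounds them to scientific notation.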

  15. Re:how can this be? by Erik Hensema · · Score: 5, Funny

    Perl source is as close to truly random data as possible.

    --

    This is your sig. There are thousands more, but this one is yours.

  16. Re:I think their investment model requires pigeons by softsign · · Score: 5, Interesting
    I'm not sure if I understand your point, but from what I do understand, it seems to me you are missing it.

    If you look at this sequence as a one-dimensional series: 00101101, it's pretty hard (at least for a processor) to distinguish a pattern there... it's a pseudo-random sequence. But if I paint it this way, in 2d: (0,0) (1,0) (1,1) (0,1), I can step back and see a square with sides of length one.

    AFAIK, what these people are claiming is that they've developed a way to step WAY back, to n-dimensions, and have patterns emerge from seemingly random data.

    It's not the random-number generation that's significant here... it's the purported ability to compress a seemingly random sequence. RLE typically doesn't fare very well with pure random data because it only looks for certain types of redundancy.

    If I haven't missed the boat here, it's really a very interesting achievement.
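
The 1D-to-2D reading of 00101101 can be written down directly. This is only a sketch of the regrouping used in the example above, with hypothetical function names; it says nothing about what ZeoSync actually does.

```python
def bits_to_points(bits: str) -> list[tuple[int, int]]:
    """Read a bit string two bits at a time as (x, y) coordinates."""
    if len(bits) % 2 != 0:
        raise ValueError("need an even number of bits")
    return [(int(bits[i]), int(bits[i + 1])) for i in range(0, len(bits), 2)]


def points_to_bits(points: list[tuple[int, int]]) -> str:
    """Inverse of bits_to_points: write the coordinates back out as bits."""
    return "".join(f"{x}{y}" for x, y in points)


points = bits_to_points("00101101")
print(points)                  # the four corners of a unit square
print(points_to_bits(points))  # the same eight bits again
```

The regrouping is reversible and one-to-one, so the 2D view still takes eight bits to write down. It makes a pattern easier to see when one is there; on its own it does not make the data any shorter.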

  17. Re:ZeoTech Scientific Team fake? by King Babar · · Score: 5, Informative
    Okay, the mysterious Dr. Wlodzimierz Holtzinski doesn't get a single hit on Google.

    Well, that's because they mis-spelled his name. Seriously, I bet they are really trying to refer to Wlodzimierz Holsztynski, who posts to Polish newsgroups from the address "sennajawa@yahoo.com". His last contribution to the one Usenet thread that mentions "zeosync" and his name uses the word "nonsens" a lot, also the phrase "nie autoryzowalem", and the sentence "Bylem ich konsultantem, moze znowu bede, a moze nie, z nimi nie wiadom." Somebody who really knows Polish could probably have a field day with this and other posts...

    I'm getting the idea that some people on the scientific team might be better termed "random people we sent email to who actually responded once or twice".

    --

    Babar

  18. Anyone remember the OWS hoax? by wberry · · Score: 5, Interesting

    Back in 1991 or 1992, in the days of 2400 bps modems, MS-DOS 5.0, and BBS'es, a "radical new compression tool" called OWS made the rounds. It claimed to have been written by some guy in Japan and use breakthroughs in fractal compression, often achieving 99% compression! "Better than ARJ! Better than PKzip!" Of course all my friends and I downloaded it immediately. Now we can send gam^H^H^Hfiles to each other in 10 minutes instead of 10 hours!

    Now I was in the ninth grade, and compression technology was a complete mystery to me then, so I suspected nothing at first. I installed it and read the docs. The commands and such were pretty much like PKzip. I promptly took one of my favorite ga^H^Hdirectories, *copied it to a different place*, compressed it, deleted it, and uncompressed it without problems. The compressed file was exactly 1024 bytes. Hmm, what a coincidence!

    The output looked kind of funny though:
    Compressing file abc.wad by 99%.
    Compressing file cde.wad by 99%.
    Compressing file start.bat by 99%.
    etc. Wait, start.bat is only 10 characters, that's like one bit! And why is *every* file compressed by 99%? Oh well, must be a display bug.

    So I called my friend and arranged to send him this g^Hfile via Zmodem, and it took only a few seconds. But he couldn't uncompress it on the other side. "Sector Not Found", he said. Oh well, try it again. Same result. Another bug.

    So I decided that this wasn't working out and stopped using OWS. Their user interface needed some work anyway, plus I was a little suspicious of compression bugs. The evidence was right there for me to make the now-obvious conclusion, but it didn't hit me until a few *weeks* later when all the BBS sysops were posting bulletins warning that OWS was a hoax.

    As it turns out, OWS was storing the FAT information in the compressed files, so that when people do reality checks it will appear to re-create the deleted files, as it did for me. But when they try to uncompress a file that actually isn't there or has had its FAT entries moved around, you get the "Sector Not Found" error and you're screwed. If I hadn't tried to send a compressed file to a friend I might have been duped into "compressing" and deleting half my software or more.

    All in all, a pretty cruel but effective joke. If it happened today somebody would be in federal pound-me-in-the-ass prison. Maybe it happened then too...

    (Yes, this is slightly off-topic, but where else am I going to post this?)

    --
    LAMP hosting on Debian, SSH, no bandwidth cap, PayPal accepted - http://secondbrainhosting.com/
  19. Re:ZeoTech Scientific Team fake? by Evacuator · · Score: 5, Informative

    With my limited understanding of Polish I can add that he talks about the nonsense of him being in the scientific team. He also states that his name was used without any authorisation and he points out that the whole affair is only for hustling the money from investors.

    --
    Human beeing is just an advanced, self-learning machine.
  20. Simple, it can't be by nusuth · · Score: 5, Insightful
    I have been pretty late to this thread, and I'm sorry if this is redundant. I just can't read all 700 posts.

    1:100 average compression on all data is just impossible. And I don't mean "improbable" or "I don't believe that"; it is impossible. The reason is the pigeonhole principle. For simplicity, assume that we are talking about 1000-bit files. Although you can compress some of these 1000-bit files to just 10 bits, you cannot possibly compress all of them to 10 bits, since 10 bits allow just 1024 different configurations while 1000 bits call for representations of 2^1000 different configurations. If you can compress the first 1024, there is simply no room to represent the remaining 2^1000 - 1024 files.

    ...And that is assuming the compression header takes no space at all...

    So every lossless compression algorithm that can represent some files with other files shorter than the originals must expand some other files. Higher compression on some files means the number of files that do not compress at all is also greater. An average compression rate other than 1 is only achievable if there is some redundancy in the original encoding. I guess you can call that redundancy "a pattern." Rar, zip, gzip etc. all achieve less than 1 compressed/original length on average because there is redundancy in the originals: programs that have some instructions and prefixes with common occurrence, pictures that are represented with a full dword although they use a few thousand colors, sound files almost devoid of very low and very high numbers because of recording conditions, etc. No compression algorithm can achieve a ratio of less than 1 averaged over all possible strings. It is a simple consequence of the pigeonhole principle and cannot be tricked.
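
The counting behind this argument can be put as a bound on how many files are allowed to shrink at all. A minimal sketch, with a hypothetical helper name: there are only 2^(n-k+1) - 1 bit strings of length n-k or less, so at most that many of the 2^n files of n bits can lose k or more bits under any lossless scheme.

```python
from fractions import Fraction


def fraction_shrinkable(n_bits: int, saved_bits: int) -> Fraction:
    """Upper bound on the share of n-bit files that any lossless scheme
    can shorten by at least saved_bits bits."""
    # Count every bit string of length 0 .. n_bits - saved_bits.
    shorter_strings = 2 ** (n_bits - saved_bits + 1) - 1
    return Fraction(shorter_strings, 2 ** n_bits)


for saved in (1, 8, 16, 990):
    bound = fraction_shrinkable(1000, saved)
    print(f"1000-bit files shrinkable by >= {saved} bits: at most {float(bound):.3e}")
```

Saving even one byte is possible for fewer than 1 file in 128, and the 1000-bit-to-10-bit case from the example is possible for a share of files so small it is hard to write down. Real compressors do well only because real files are nowhere near uniformly spread over all possible bit strings.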

    --

    Gentlemen, you can't fight in here, this is the War Room!