ZeoSync Makes Claim of Compression Breakthrough

how can this be? by posmon · 2002-01-08 01:14 · Score: 3, Informative

even lossless compression still relies on redundancy within the data, normally repeating patterns of data. surely 100-1 on TRUE random data is impossible?

--

update comments set karma=-1, reason='offtopic' where sid=26315

Re:how can this be? by jrockway · 2002-01-08 01:21 · Score: 4, Insightful

I'm going to agree with you here. If there's no pattern in the data, how can you find one and compress it. The reason things like gzip work well on c files (for instance) is because C code is far from random. How many times do you use void or int in a C file? a lot :)

Try compressing a wav or mpeg file with gzip. Doesn't work too well, because the data is "random", at least in the sense of the raw numbers. When you look at patterns that the data forms (i.e. pictures, and relative motion), then you can "compress" that.
Here's my test for random compression :)

$ dd if=/dev/urandom of=random bs=1M count=10
$ du -ch random
11M random
11M total
$ gzip -9 random
$ du -ch random.gz
11M random.gz
11M total
$

no pattern == no compression
prove me wrong, please :)
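
The same experiment can be repeated without touching the disk. What follows is a minimal sketch, assuming Python's standard zlib module (the same DEFLATE family gzip uses) is an acceptable stand-in for gzip; the C-like sample line and the helper name ratio are made up for illustration.

```python
import os
import zlib


def ratio(data: bytes) -> float:
    """Compressed size divided by original size, using zlib at level 9."""
    return len(zlib.compress(data, 9)) / len(data)


# Hypothetical stand-in for a C source file: one short line repeated many times.
c_like = b"int main(void) { int i; for (i = 0; i < 10; i++) { } return 0; }\n" * 2000

# Same number of bytes, but drawn from the OS random source.
noise = os.urandom(len(c_like))

print(f"C-like text: {ratio(c_like):.3f}")
print(f"urandom:     {ratio(noise):.3f}")
```

The repetitive text should shrink to a tiny fraction of its size, while the random bytes should come out no smaller than they went in (the container adds a few bytes of overhead).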

--
My other car is first.
Re:how can this be? by Rentar · 2002-01-08 01:26 · Score: 5, Funny

I'm going to agree with you here. If there's no pattern in the data, how can you find one and compress it. The reason things like gzip work well on c files (for instance) is because C code is far from random. How many times do you use void or int in a C file? a lot :)

So a Perl program can't be compressed?
Re:how can this be? by Shimbo · 2002-01-08 01:27 · Score: 3, Interesting

They don't claim they can compress TRUE random data, only 'practically random' data. Now the digits of Pi are a good source of 'practically random' data for some definition of the phrase 'practically random'.
Re:how can this be? by harlows_monkeys · 2002-01-08 01:36 · Score: 3, Interesting

I realize that what I'm about to propose does not work. The challenge is to figure out why.
Here's a proposal for a compression scheme that has the following properties:
1. It works on all bit strings of more than one bit.
2. It is lossless and reversible.
3. It never makes the string larger. There are some strings that don't get smaller, but see item #4.
4. You can iterate it, to reduce any string down to 1 bit! You can use this to deal with pesky strings that don't get smaller. After enough iterations, they will be compressed.
OK, here's my algorithm:
Input: a string of N bits, numbered 0 to N-1.
If all N bits are 0, the output is a string of N-1 1's. Otherwise, find the lowest numbered 1 bit. Let its position be i. The output string consists of N bits, as follows:
Bits 0, 1, ... i-1 are 1's. Bit i is 0. Bits i+1, ..., N-1 are the same as the corresponding input bits.
Again, let me emphasize that this is not a usable compression method! The fun is finding the flaw.
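
For anyone who wants to poke at it, here is a rough sketch of the scheme as described, assuming bit strings are Python str values of '0'/'1' with index 0 as bit 0. The names step, unstep and passes_to_one_bit are invented for this sketch. It also counts how many passes a string needs before it is down to one bit, which is a strong hint about where the catch is.

```python
def step(bits: str) -> str:
    """One pass of the proposed scheme; bits[0] is bit 0."""
    if len(bits) < 2:
        raise ValueError("defined only for strings of more than one bit")
    i = bits.find("1")                # lowest-numbered 1 bit
    if i == -1:                       # all zeros: output N-1 ones
        return "1" * (len(bits) - 1)
    # bits 0..i-1 become 1, bit i becomes 0, the rest is copied
    return "1" * i + "0" + bits[i + 1:]


def unstep(bits: str) -> str:
    """Inverse of step, showing that a single pass really is reversible."""
    i = bits.find("0")
    if i == -1:                       # all ones: it came from N+1 zeros
        return "0" * (len(bits) + 1)
    return "0" * i + "1" + bits[i + 1:]


def passes_to_one_bit(bits: str) -> int:
    """Count how many passes it takes to get down to a single bit."""
    count = 0
    while len(bits) > 1:
        bits = step(bits)
        count += 1
    return count


for start in ("11", "111", "1111"):
    print(start, "->", passes_to_one_bit(start), "passes")
```

Each pass behaves like decrementing a counter, so the number of passes grows exponentially with the length of the string, and the decompressor has to be told that number. Writing the pass count down takes about as many bits as the original string did.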
Re:how can this be? by Dr_Cheeks · 2002-01-08 01:46 · Score: 3, Insightful

If the data was represented a different way (say, using bits instead of bytesize data) then patterns might emerge...
With truly random data there's no pattern to find, assuming you're looking at a large enough sample, which is why everyone else on this thread is talking about the maximum compression for such data being 1:1. However, since "ZeoSync said its scientific team had succeeded on a small scale" it's likely that whatever algorithm they're using works only in limited cases.
Shannon's work on information theory is over half a century old and has been re-examined by thousands of extremely well-qualified people, so I'm finding it rather hard to accept that ZeoSync aren't talking BS.

--
Re:how can this be? by s20451 · 2002-01-08 01:46 · Score: 3, Informative

Of course patterns occur in random data. For example, if you toss a fair coin for a long time, you will get runs of three, four, or five heads which recur from time to time. The point is that in random, noncompressible data, the probability of occurrence for any given pattern is the same as the probability of any other pattern.

--
Toronto-area transit rider? Rate your ride.
Re:how can this be? by tjansen · 2002-01-08 01:57 · Score: 5, Informative

Yawn... see the comp.compression FAQ, compression of random data
Re:how can this be? by ergo98 · 2002-01-08 02:17 · Score: 5, Informative

Well firstly I'd say the press release gives a pretty clear picture of the reality of their technology: It has such an overuse of supposedly TM'd (anyone want to double check the filings? I'm going to guess that there are none) "technoterms" like "TunerAccelerator" and "BinaryAccelerator" that it just is screaming hoax (or creative deception), not to mention a use of Flash that makes you want to punch something. Note that they give themselves huge openings such as always saying "practically random" data: What the hell does that mean?

I think one way to understand it (because all of us at some point or another have thought up some half-assed, ridiculous way of compressing any data down to 1/10th -> "Maybe I'll find a denominator and store that with a floating point representation of..."), and I'm saying this as neither a mathematician nor a compression expert: Let's say for instance that this compression ratio is 10 to 1 on random data, and I have every possible random document 100 bytes long -> That means I have 6.6680144328798542740798517907213e+240 different random documents (256^100). So I compress them all into 10-byte documents, but the maximum number of variations of a 10-byte document is 1208925819614629174706176 (256^10): There isn't the entropy in a 10-byte document to store 6.6680144328798542740798517907213e+240 different possibilities (it is simply impossible, no matter how many QuantumStreamTM HyperTechTM TechoBabbleTM TermsTM): You end up needing, tada, 100 bytes to have the entropy to possibly store all variants of a 100-byte document, but of course most compression routines put in various logic codes and actually increase the size of the document. In the case of the ZeoSync claim though they're apparently claiming that somehow you'll represent 6.6680144328798542740798517907213e+240 different variations in a single byte: So somehow 64 tells you "Oh yeah, that's variation 5.5958572359823958293589253e+236!". Maybe they're using SubSpatialQuantumBitsTM.
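
The numbers above are easy to check with arbitrary-precision integers. A small sketch, with variable names chosen just for this example:

```python
# Count every possible document of a given length in bytes.
docs_100 = 256 ** 100   # all 100-byte documents
docs_10 = 256 ** 10     # all 10-byte documents

print(f"{docs_100:.4e}")   # scientific notation for the big one
print(docs_10)

# If every 100-byte document had to map to some 10-byte document,
# this many inputs would have to share each output on average.
inputs_per_output = docs_100 // docs_10
print(f"{inputs_per_output:.4e}")
```

Any output shared by more than one input cannot be decompressed unambiguously, which is the whole problem.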
Re:how can this be? by Erik+Hensema · 2002-01-08 02:22 · Score: 5, Funny

Perl source is as close to truly random data as possible.

--
This is your sig. There are thousands more, but this one is yours.
Re:how can this be? by FlatEarther · 2002-01-08 04:41 · Score: 4, Funny

It is possible despite the many (uninformed) negative comments that have appeared concerning this truly amazing breakthrough in compression technology. I, myself, using my own patented compression technology - The Shannon-Transmogrificator (TM) have managed to compress the entire Reuters article to a mere 4 ASCII characters (!), with essentially no loss in meaning: 'C', 'R', 'A', 'P'. I wonder if anyone can improve on this ?

100:1 ? I don't think so... by Mr+Thinly+Sliced · 2002-01-08 01:14 · Score: 5, Insightful

They claim 100:1 compression for random data. The thing is, if that's true, then let's say we have data A (size 1,000,000)

compress(A) = B

Now, B is 1/100th the size of A, right, but it too is random, right (size 10,000).

On we go:
compress(B) = C (size is now 100)
compress(C) = D (size 1).

So everything compresses into 1 byte.

Or am I missing something.
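
What actually tends to happen when a real compressor is fed its own output can be sketched like this. It assumes Python's zlib as the compressor and a seeded pseudo-random string over four letters as the input (both are choices made for this example, not anything ZeoSync described).

```python
import random
import zlib

# Deterministic test input: 50,000 symbols from a 4-letter alphabet,
# so each byte carries only about 2 bits of information.
rng = random.Random(0)
data = bytes(rng.choices(b"acgt", k=50_000))

sizes = [len(data)]
for _ in range(4):
    data = zlib.compress(data, 9)   # compress the previous output again
    sizes.append(len(data))

print(sizes)
```

The first pass removes the redundancy; after that the output already looks random to the compressor, so further passes should stay about the same size or grow slightly instead of heading toward one byte.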

Mr Thinly Sliced

Re:100:1 ? I don't think so... by oyenstikker · 2002-01-08 01:19 · Score: 5, Funny

Maybe they'll be able to compress their debt to $1 when they go under.

--
The masses are the crack whores of religion.
Re:100:1 ? I don't think so... by Xentax · 2002-01-08 01:20 · Score: 3, Informative

No...the compressed data is almost certainly NOT random, so it couldn't be compressed the same way. It's also highly unlikely any other compression scheme could reduce it either.

I'm very, very skeptical of 100:1 claims on "random" data -- it must either be large enough that even being random, there are lots of repeated sequences, or the test data is rigged.

Or, of course, it could all be a big pile of BS designed to encourage some funding/publicity.

Xentax

--
You shouldn't verb words.
Re:100:1 ? I don't think so... by arkanes · 2002-01-08 01:21 · Score: 5, Insightful

I suspect that when they say "random" data, they are using marketing-speak random, not math-speak random. Therefore, by 'random', they mean "data with lots of repetition like music or video files, which we'll CALL random because none of you copyright-infringing IP thieving pirates will know the difference"
Re:100:1 ? I don't think so... by Rentar · 2002-01-08 01:22 · Score: 3, Interesting

This is a proof (though I doubt it is a scientifically correct one) that you can't get lossless compression with a constant compression factor! What they claim would be theoretically possible if 100:1 were an average, but I still don't think this is possible.
Re:100:1 ? I don't think so... by MikeTheYak · 2002-01-08 01:25 · Score: 5, Insightful

It goes beyond bullshit into the realm of humor:

ZeoSync has developed the TunerAccelerator(TM) in conjunction with some traditional state-of-the-art compression methodologies. This work includes the advancement of Fractals, Wavelets, DCT, FFT, Subband Coding, and Acoustic Compression that utilizes synthetic instruments. These are methods that are derived from classical physics and statistical mechanics and quantum theory, and at the highest level, this mathematical breakthrough has enabled two classical scientific methods to be improved, Huffman Compression and Arithmetic Compression, both industry standards for the past fifty years.

They just threw in a bunch of compression buzzwords without even bothering to check whether they have anything to do with lossless compression...
Re:100:1 ? I don't think so... by Mr+Thinly+Sliced · 2002-01-08 01:34 · Score: 4, Funny

Not only that, but I just hacked their site and downloaded the entire source tree. Here it is:

01101011

Pop that baby in an executable shell script. It's a self-extracting
./configure
./make
./make install

Shh. Don't tell anyone.

Mr Thinly Sliced
Re:100:1 ? I don't think so... by swillden · 2002-01-08 01:48 · Score: 4, Funny

So everything compresses into 1 byte.
Duh, are you like an idiot or something?
When you send me a one-byte copy of, say, The Matrix, you also have to tell me how many times it was compressed so I know how many times to run the decompressor!
So everything compresses to *two* bytes. Maybe even three bytes if something is compressed more than 256 times. That's only required for files whose initial size is more than 100^256, though, so two bytes should do it for most applications.
Jeez, the quality of math and CS education has really gone down the tubes.

--
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
Re:100:1 ? I don't think so... by pmc · 2002-01-08 02:12 · Score: 4, Funny

Duh, are you like an idiot or something?

You're the moron, moron. When you get the one byte compressed file, you run the decompressor once to get the number of additional times to run the decompressor.

What are they teaching the kids today? Shannon-shmannon nonsense, no doubt. They should be doing useful things, like Marketing and Management Science. There's no point in being able to count if you don't have any money.
Re:100:1 ? I don't think so... by Bandman · 2002-01-08 02:12 · Score: 5, Funny

I get the idea that this part of the algorithm is perfected by them... it's the decompressor that's giving them fits...

Step 1: Steal Underpants
Step 3: Profit!

We're still working on step 2

--
Check out my sysadmin blog!
Re:100:1 ? I don't think so... by Happy+Monkey · 2002-01-08 03:43 · Score: 3, Informative

You then need to add one bit of data to tell whether you've compressed it or not.

--
__
Do ya feel happy-go-lucky, punk?
Re:100:1 ? I don't think so... by biobogonics · 2002-01-08 05:45 · Score: 3, Insightful

I suspect that when they say "random" data, they are using marketing-speak random, not math-speak random. Therefore, by 'random', they mean "data with lots of repetition like music or video files, which we'll CALL random because none of you copyright-infringing IP thieving pirates will know the difference"

Actually, if you change the domain you can get what appears to be impressive compression. Consider a bitmapped picture of a child's line drawing of a house. Replace that by a description of the drawing commands. Of course you have not violated Shannon's theorem because the amount of information in the original drawing is actually low.

At one time commercial codes were common. They were not used for secrecy, but to transmit large amounts of information when telegrams were charged by the word. The recipient looked up the code number in his codebook and reconstructed a lengthy message: "Don't buy widgets from this bozo. He does not know what he is doing."

If you have a restricted set of outputs that appear to be random but are not, i.e. white noise sample #1, white noise sample #2 ... all you need to do is send 1, 2... and voila!
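
The codebook idea fits in a few lines. This is only an illustrative sketch: the second codebook entry and the function names encode and decode are made up, and the point is that both sides must already hold the same list.

```python
# Shared in advance by sender and recipient; this is where the information lives.
CODEBOOK = [
    "Don't buy widgets from this bozo. He does not know what he is doing.",
    "Ship the order at once.",   # hypothetical second entry
]


def encode(message: str) -> int:
    """Replace a known message by its codebook number (ValueError if unknown)."""
    return CODEBOOK.index(message)


def decode(code: int) -> str:
    """Look the number up again on the receiving side."""
    return CODEBOOK[code]


code = encode(CODEBOOK[0])
print(code, "->", decode(code))
```

A 68-character message travels as one small number, but only messages already in the book can be sent, so nothing here contradicts Shannon.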
Re:100:1 ? I don't think so... by grytpype · 2002-01-08 05:47 · Score: 3, Funny

I just ran another compression pass on that, and i got:

BS

--
- Have a picture
Re:100:1 ? I don't think so... by swillden · 2002-01-08 06:05 · Score: 3, Funny

I don't need to encode the number of compressions, every decompression consists of decompressing 256 times.
I think you mean at most 256 times. Supposing I had to perform 10 compressions to compress to a single byte. After you had decompressed 10 times, you'd have the data. The next decompression would make some other file 100 times larger than the Matrix. So if you could recognize the correct file when you saw it, I could avoid transmitting the decompression count.
So, I just have to prepend a string saying "This is it!" before compressing!
Also, it occurred to me after my previous posting (and to another poster, I saw) that if we can compress to a single byte, why not to a single bit? This is a great advance, which I believe I shall patent quickly before that other poster does, because now I can give you my copy of The Matrix over the phone! I can just tell you if it's a 1 or 0. For that matter, I don't even have to tell you -- you can just try both possibilities!
So my question now is, does the decompressor only produce strings of bits that exist somewhere and were once compressed, or does it produce anything? Can I just think "I want a great term paper..." and then try decompressing both 1 and 0 until I get it (in no more than 8 or ten iterations of the decompressor, 'cause I want a paper, not a novel).

--
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
Re:100:1 ? I don't think so... by zhensel · 2002-01-08 06:28 · Score: 3, Funny

Quantum theory has everything to do with compression. Inside sources have revealed that this compression scheme works on the uncertainty principles key to quantum physics. You see, any string of 100 bits has a distinct probability of being compressible to a single bit. Of course, this means that this compression scheme will produce bogus results 99.999999% of the time, but think of the wonder of compression realized the other .000001% of the time! Furthermore, the system requirements for their technology are as follows: x86 PC running WindowsXP (to take advantage of DirectX in wickedly rendering the fractals necessary for the compression), a particle accelerator, and a heavy dose of optimism combined with a complete lack of skepticism.

Time for a new law of information theory? by Anonymous Coward · 2002-01-08 01:15 · Score: 5, Funny

The odds on a compression claim turning out to be true are always identical to the compression ratio claimed?

Tech details from the crappy Flash-only website by bleeeeck · 2002-01-08 01:15 · Score: 5, Informative

ZeoSynch's Technical Process: The Pigeonhole Principle and Data Encoding Dr. Claude Shannon's dissertation on Information Theory in 1948 and his following work on run-length encoding confidently established the understanding that compression technologies are "all" predisposed to limitation. With this foundation behind us we can conclude that the effort to accelerate the transmission of information past the permutation load capacity of the binary system, and past the naturally occurring singular-bit-variances of nature can not be accomplished through compression. Rather, this problem can only be successfully resolved through the solution of what is commonly understood within the mathematical community as the "Pigeonhole Principle."

Given a number of pigeons within a sealed room that has a single hole, and which allows only one pigeon at a time to escape the room, how many unique markers are required to individually mark all of the pigeons as each escapes, one pigeon at a time?

After some time a person will reasonably conclude that:
"One unique marker is required for each pigeon that flies through the hole, if there are one hundred pigeons in the group then the answer is one hundred markers". In our three dimensional world we can visualize an example. If we were to take a three-dimensional cube and collapse it into a two-dimensional edge, and then again reduce it into a one-dimensional point, and believe that we are going to successfully recover either the square or cube from the single edge, we would be sorely mistaken.

This three-dimensional world limitation can however be resolved in higher dimensional space. In higher, multi-dimensional projective theory, it is possible to create string nodes that describe significant components of simultaneously identically yet different mathematical entities. Within this space it is possible and is not a theoretical impossibility to create a point that is simultaneously a square and also a cube. In our example all three substantially exist as unique entities yet are linked together. This simultaneous yet differentiated occurrence is the foundation of ZeoSync's Relational Differentiation Encoding(TM) (RDE(TM)) technology. This proprietary methodology is capable of intentionally introducing a multi-dimensional patterning so that the nodes of a target binary string simultaneously and/or substantially occupy the space of a Low Kolmogorov Complexity construct. The difference between these occurrences is so small that we will have for all intents and purposes successfully encoded lossley universal compression. The limitation to this Pigeonhole Principle circumvention is that the multi-dimensional space can never be super saturated, and that all of the pigeons can not be simultaneously present at which point our multi-dimensional circumvention of the pigeonhole problem breaks down.

Is this April 1st? by tshoppa · 2002-01-08 01:16 · Score: 3, Informative

This has *long* been an April 1st joke published in such hallowed rags as BYTE and Datamation for at least as long as I've been reading them (20 years).

The punchline to the joke was always along the lines of

Of course, since this compression works on random data, you can repeatedly apply it to previously compressed data. So if you get 100:1 on the first compression, you get 10000:1 on the second and 1000000:1 on the third.

The proof's in the pudding. by neo · 2002-01-08 01:19 · Score: 5, Funny

ZeoSync said its scientific team had succeeded on a small scale in compressing random information sequences in such a way as to allow the same data to be compressed more than 100 times over -- with no data loss. That would be at least an order of magnitude beyond current known algorithms for compacting data.

ZeoSync announced today that the "random data" they were referencing is a string of all zeros. Technically this could be produced randomly and our algorithm reduces this to just a couple of characters, a 100 times compression!!

The pressrelease by grazzy · 2002-01-08 01:20 · Score: 4, Informative

ZEOSYNC'S MATHEMATICAL BREAKTHROUGH OVERCOMES LIMITATIONS OF DATA COMPRESSION THEORY

International Team of Scientists Have Discovered
How to Reduce the Expression of Practically Random Information Sequences

WEST PALM BEACH, Fla. - January 7, 2001 - ZeoSync Corp., a Florida-based scientific research company, today announced that it has succeeded in reducing the expression of practically random information sequences. Although currently demonstrating its technology on very small bit strings, ZeoSync expects to overcome the existing temporal restraints of its technology and optimize its algorithms to lead to significant changes in how data is stored and transmitted.

Existing compression technologies are currently dependent upon the mapping and encoding of redundantly occurring mathematical structures, which are limited in application to single or several pass reduction. ZeoSync's approach to the encoding of practically random sequences is expected to evolve into the reduction of already reduced information across many reduction iterations, producing a previously unattainable reduction capability. ZeoSync intentionally randomizes naturally occurring patterns to form entropy-like random sequences through its patent pending technology known as Zero Space Tuner(TM). Once randomized, ZeoSync's BinaryAccelerator(TM) encodes these singular-bit-variance strings within complex combinatorial series to result in massively reduced BitPerfect(TM) equivalents. The combined TunerAccelerator(TM) is expected to be commercially available during 2003.

According to Peter St. George, founder and CEO of ZeoSync and lead developer of the technology: "What we've developed is a new plateau in communications theory. Through the manipulation of binary information and translation to complex multidimensional mathematical entities, we are expecting to produce the enormous capacity of analogue signaling, with the benefit of the noise free integrity of digital communications. We perceive this advancement as a significant breakthrough to the historical limitations of digital communications as it was originally detailed by Dr. Claude Shannon in his treatise on Information Theory." [C.E. Shannon. A Mathematical Theory of Communication. Bell System Technical Journal, 27:379-423, 623-656, 1948]

"There are potentially fantastic ramifications of this new approach in both communications and storage," St. George continued. "By significantly reducing the size of data strings, we can envision products that will reduce the cost of communications and, more importantly, improve the quality of life for people around the world regardless of where they live."

Current technologies that enable the compression of data for transmission and storage are generally limited to compression ratios of ten-to-one. ZeoSync's Zero Space Tuner(TM) and BinaryAccelerator(TM) solutions, once fully developed, will offer compression ratios that are anticipated to approach the hundreds-to-one range.

Many types of digital communications channels and computing systems could benefit from this discovery. The technology could enable the telecommunications industry to massively reduce huge amounts of information for delivery over limited bandwidth channels while preserving perfect quality of information.

ZeoSync has developed the TunerAccelerator(TM) in conjunction with some traditional state-of-the-art compression methodologies. This work includes the advancement of Fractals, Wavelets, DCT, FFT, Subband Coding, and Acoustic Compression that utilizes synthetic instruments. These are methods that are derived from classical physics and statistical mechanics and quantum theory, and at the highest level, this mathematical breakthrough has enabled two classical scientific methods to be improved, Huffman Compression and Arithmetic Compression, both industry standards for the past fifty years.

All of these traditional methods are being enhanced by ZeoSync through collaboration with top experts from Harvard University, MIT, University of California at Berkley, Stanford University, University of Florida, University of Michigan, Florida Atlantic University, Warsaw Polytechnic, Moscow State University and Nankin and Peking Universities in China, Johannes Kepler University in Lintz Austria, and the University of Arkansas, among others.

Dr. Piotr Blass, chief technology advisor at ZeoSync, said "Our recent accomplishment is so significant that highly randomized information sequences, which were once considered non-reducible by the scientific community, are now massively reducible using advanced single-bit-variance encoding and supporting technologies."

"The technologies that are being developed at ZeoSync are anticipated to ultimately provide a means to perform multi-pass data encoding and compression on practically random data sets with applicability to nearly every industry," said Jim Slemp, president of Radical Systems, Inc. "The evaluation of the complex algorithms is currently being performed with small practically random data sets due to the analysis times on standard computers. Based on our internally validated test results of these components, we have demonstrated a single-point-variance when encoding random data into a smaller data set. The ability to encode single-point-variance data is expected to yield multi-pass capable systems after temporal issues are addressed."

"We would like to invite additional members of the scientific community to join us in our efforts to revolutionize digital technology," said St. George. "There is a lot of exciting work to be done."

About ZeoSync

Headquartered in West Palm Beach, Florida, ZeoSync is a scientific research company dedicated to advancements in communications theory and application. Additional information can be found on the company's Web site at www.ZeoSync.com or can be obtained from the company at +1 (561) 640-8464.

This press release may contain forward-looking statements. Investors are cautioned that such forward-looking statements involve risks and uncertainties, including, without limitation, financing, completion of technology development, product demand, competition, and other risks and uncertainties.

In this house we obey the 2nd law of thermodynamics by tshoppa · 2002-01-08 01:26 · Score: 3, Insightful

From the Press Release:

This press release may contain forward-looking statements. Investors are cautioned that such forward-looking statements involve risks and uncertainties, including, without limitation, financing, completion of technology development, product demand, competition, and other risks and uncertainties.

They left out Disobeying the 2nd law of Thermodynamics!

Re:Current ratio? by CaseyB · 2002-01-08 01:30 · Score: 3, Informative

but whats the current ratio?

For truly random data? 1:1 at the absolute best.

Re:Current ratio? by radish · 2002-01-08 01:30 · Score: 5, Informative

For lossless (e.g. zip, not jpg, mpg, divx, mp3 etc etc) you are looking at about 2:1 for 8-bit random, much better (50:1?) for ascii text (e.g. 7-bit non-random).

If you're willing to accept loss, then the sky's the limit, mp3 @ 128kbps is about 12:1 compared to a 44k 16bit wave.
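
The mp3 figure can be sanity-checked with a little arithmetic. This sketch assumes "44k 16bit wave" means 44.1 kHz, 16-bit, stereo PCM; the stereo part is an assumption, since the post does not say.

```python
SAMPLE_RATE_HZ = 44_100      # assumption: "44k" means the CD rate of 44.1 kHz
BITS_PER_SAMPLE = 16
CHANNELS = 2                 # assumption: stereo
MP3_BITRATE_BPS = 128_000    # 128 kbps

# Uncompressed PCM bit rate, then the ratio against the mp3 bit rate.
pcm_bitrate_bps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS
ratio = pcm_bitrate_bps / MP3_BITRATE_BPS

print(f"{pcm_bitrate_bps} bps PCM vs {MP3_BITRATE_BPS} bps mp3 = {ratio:.1f}:1")
```

Under those assumptions it works out to roughly 11:1, in the same neighborhood as the figure quoted above.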

--

---- One button is the power button, the other is the Bender voice button "Bite My Shiny Metal Ass"

Been there, done that... by color+of+static · 2002-01-08 01:31 · Score: 4, Informative

There seems to be a company claiming to exceed, go around, or obliterate Shannon every few years. In the early 90's there was a company called Web (before the WWW was really around by a year or so). They made claims of compressing any data, even data that had already been compressed. It is a sad story that you should be able to find in either the comp.compression FAQ or the renewed deja archives. It basically boils down to this: as they got closer to market, they found some problems... you can guess the rest.
This isn't limited to the field of compression of course. There are people that come up with "unbreakable" encryption, infinite gain amplifiers (is that gain in V and I?), and all sorts of perpetual motion machines. The sad fact is that compression and encryption are not well understood enough for these ideas to be killed before a company is started or staked on the claims.

Some background reading: by Quixote · 2002-01-08 01:34 · Score: 5, Interesting

Section 1.9 of the comp.compression FAQ is good background reading on this stuff. In particular, read the "WEB story".

On the contrary! by Simon+Tatham · 2002-01-08 01:35 · Score: 3, Insightful

Quite the contrary: if they had claimed to be achieving 100:1 compression on truly random data, they would be provably talking total rubbish. Consider the number of possible bit strings of length N. Now consider the number of possible bit strings of length N/100. There are fewer of the latter, right? Therefore, if you can compress every length-N string into a length-N/100 string, at least two inputs must map to the same output. Hence, you can't uniquely recover the input from the output - and the compression cannot be lossless.
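
The collision argument can be made concrete by brute force on short strings. A minimal sketch, where find_collision and the toy "compressor" that keeps the first half of its input are both invented for illustration:

```python
from itertools import product


def find_collision(compress, n):
    """Return two different n-bit strings with the same output, or None."""
    seen = {}
    for bits in product("01", repeat=n):
        s = "".join(bits)
        out = compress(s)
        if out in seen:              # some earlier input already produced this
            return seen[out], s
        seen[out] = s
    return None


def toy_compress(s: str) -> str:
    """Hypothetical 2:1 'compressor': just keep the first half of the bits."""
    return s[: len(s) // 2]


a, b = find_collision(toy_compress, 4)
print(a, "and", b, "both compress to", toy_compress(a))
```

Whatever function is plugged in for compress, if every output is shorter than the input, the search has to find a pair: there are simply not enough short strings to go around.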

The fact that they hedge and talk about "practically" random sequences is the only thing that makes it possible they're telling the truth!

Not random data by edp · 2002-01-08 01:36 · Score: 4, Redundant

ZeoSync is not claiming to reduce random data 100-to-1. They are claiming to reduce "practically random" data 100-to-1, and Reuters appears to have misreported it. What "practically random" data should mean is data randomly selected from that used in practice. What ZeoSync may mean by "practically random" is data randomly selected from that used in their intended applications. So their press release is not mathematically impossible; it just means they've found a good way to remove more information redundancy in some data.

The proof that 100-to-1 compression of random data is impossible is so simple as to be trivial: There are 2^N files of length N bits. There are 2^(N/100) files of length N/100 bits. Clearly not all 2^N files can be compressed to length N/100.
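
The same count gives an upper bound on how many files any lossless scheme could shrink by a given amount. This is a sketch of that bound only; max_fraction_shrunk is a name invented here, and it counts every bit string of the target length or shorter as a possible output.

```python
from fractions import Fraction


def max_fraction_shrunk(n_bits: int, saved_bits: int) -> Fraction:
    """Upper bound on the share of n-bit files that can lose saved_bits or more."""
    # Number of distinct bit strings of length 0 .. n_bits - saved_bits.
    shorter_outputs = 2 ** (n_bits - saved_bits + 1) - 1
    return Fraction(shorter_outputs, 2 ** n_bits)


# 100-byte files (800 bits) squeezed 100:1 down to 8 bits or fewer.
bound = max_fraction_shrunk(800, 792)
print(float(bound))
```

For 800-bit files the bound is below 2^-790, so a scheme that hits 100:1 on more than a vanishing sliver of truly random inputs cannot be lossless.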

Egads... by RareHeintz · 2002-01-08 01:36 · Score: 5, Funny

ZeoSync said its scientific team had succeeded on a small scale...

The company's claims, which are yet to be demonstrated in any public forum...

...if ZeoSync's formulae succeed in scaling up...

Call the editors at Wired... I think we have an early nominee for the 2k2 vaporware list.

ZeoSync expects to overcome the existing temporal restraints of its technology

Ah... So even if it's not outright bullshit, it's too slow to use?

"Either this research is the next 'Cold Fusion' scam that dies away or it's the foundation for a Nobel Prize," said David Hill...

Somehow I think this is going to turn out more Pons-and-Fleischmann than Watson-and-Crick. Almost anytime there's a press release with such startling claims but no peer review or public demonstration, someone has forgotten to stir the jar.

When they become laughingstocks, and their careers are forever wrecked, I hope they realize they deserve it. And I hope their investors sue them.

I should really post after I've had my coffee... I sound mean...

OK,
- B

--
http://www.bradheintz.com/
- updated

Re:Egads... by RareHeintz · 2002-01-08 02:24 · Score: 5, Funny

Of course! What was I thinking? Why not just use a table lookup of every possible sequence of bytes of any length?
See you all later - I have some coding to do!
OK,
- B

--
http://www.bradheintz.com/
- updated

What is compression by Vapula · 2002-01-08 01:37 · Score: 3, Interesting

Compression, after all, is removing all redundancy from the original data.

So, if there is no redundancy, there is nothing to remove (if you want to remain lossless).

When you use some text, you may compres by remving some letter evn if tht lead to bad ortogrph. That is because English (like other languages) is redundant. When compressing some periodic signal, you may give only one period and state that the signal then repeats. When compressing bytes, there are specific methods (RLE, Huffman trees, ...)

But, in all these situations, there was some redundancy to remove...

A compression algorithm may not be perfect (it usually has to add some info to tell how the original data was compressed). Then, recompressing with another compression algorithm (or sometimes, the same will do the trick) may improve the compression. But the information quantity inside the data is the lower limit.

Now, take a true random data stream of n+1 bits. Even if you know the value of the first n bits, you can't predict the value of bit n+1. In other words, there is no way to express these n+1 bits with n (or fewer) bits. By definition, true random data can't be compressed.

And, to finish, a compression ratio of 1:100 can be easily achieved with some data... take a sequence of 200 bytes at 0x00... It may be compressed to 0xC8 0x00. Compression ratio is really only meaningful when comparing different algorithms compressing the same data stream.
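The 0xC8 0x00 example reads like a plain (count, value) run-length encoding, so here is a minimal sketch under that assumption. The function names and the one-byte run counter are my own choices, not something the post specifies.

```python
def rle_encode(data: bytes) -> bytes:
    """Encode data as (run length, byte value) pairs, runs capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes([run, data[i]])
        i += run
    return bytes(out)

def rle_decode(encoded: bytes) -> bytes:
    """Invert rle_encode by expanding each (run length, byte value) pair."""
    out = bytearray()
    for k in range(0, len(encoded), 2):
        count, value = encoded[k], encoded[k + 1]
        out += bytes([value]) * count
    return bytes(out)

zeros = bytes(200)                 # 200 bytes of 0x00
print(rle_encode(zeros).hex())     # c800

no_runs = bytes(range(200))        # 200 bytes with no repeats at all
print(len(rle_encode(no_runs)))    # 400
```

The same encoder that gets 100:1 on the zeros doubles the size of the input that has no runs, which is why a ratio quoted without the data stream says very little.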

Might be possible... but I doubt it... by Zocalo · 2002-01-08 01:39 · Score: 3, Interesting

Reading through the press release it seems to imply that they take the "random" data, massage the data with the "Tuner" part, then compress it with the "Accelerator" part. This spits out "BitPerfect" which I assume is their data format. It's this "massaging" of the figures where it's going to sink or swim.

Take very large prime numbers and the like, huge strings of almost random numbers that can often be written as a trivial (2^n)-1 type formula. Maybe the massaging of the figures is simply finding a very large number that can be expressed like the above with an offset other than "-1" to get the correct "BitPerfect" data. I was toying around with this idea when there was a fad for expressing DeCSS code in unusual ways, but ran out of math before I could get it to work.
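A quick way to poke at the "power of two plus an offset" idea is sketched below. Everything in it (the function names, picking the nearest power of two, the 128-bit test size) is my own assumption about what such a scheme might look like, not anything ZeoSync has described.

```python
import secrets

def nearest_power_offset(x: int):
    """Write x (> 0) as 2**n + offset, using the power of two nearest to x."""
    if x <= 0:
        raise ValueError("x must be positive")
    hi = x.bit_length()      # smallest n with 2**n > x
    lo = hi - 1              # largest n with 2**n <= x
    n = hi if (1 << hi) - x < x - (1 << lo) else lo
    return n, x - (1 << n)

def offset_bits(x: int) -> int:
    """Bits needed for the magnitude of the offset (sign not counted)."""
    _, offset = nearest_power_offset(x)
    return abs(offset).bit_length()

mersenne = (1 << 127) - 1
print(nearest_power_offset(mersenne))    # (127, -1)

# A typical 128-bit value (top bit forced on so it really is 128 bits long).
typical = secrets.randbits(128) | (1 << 127)
print(offset_bits(typical), "bits of offset for a 128-bit number")
```

The special number collapses to a tiny (n, offset) pair, but for a typical value the offset is nearly as long as the number itself, so this particular form saves next to nothing on arbitrary data.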

The above theory may be bull when it comes to the crunch, but if it could be made to work, then the compression figures are bang in the ball park for this. They laughed at Goddard remember? But I have to admit, I think replacing Einstein with the Monty Python foot better fits my take on this at present...

--
UNIX? They're not even circumcised! Savages!

What happens when you run it backwards? by sprag · 2002-01-08 01:42 · Score: 4, Funny

A thought just occurred to me: If you can do 100:1 compression and compress something down to, say, 2 bytes, what would 'ab' expand to? My thought is "ZeoSync Rulz, Suckas"

They are using time travel! by harlows_monkeys · 2002-01-08 01:44 · Score: 5, Funny

From one of the things on their site: Although currently demonstrating its technology on very small bit strings, ZeoSync expects to overcome the existing temporal restraints of its technology and optimize its algorithms to lead to significant changes in how data is stored and transmitted (emphasis added).

Using time travel, high compression of arbitrary data is trivial. Simply record the location (in both space and time) of the computer with the data, and the name of the file, and then replace the file with a note saying when and where it existed. To decompress, you just pop back in time and space to before the time of the deletion and copy the file.

Directed evolution by HalfFlat · 2002-01-08 01:47 · Score: 5, Funny

They're looking for investment money?

Just think of it as an innumeracy tax on
venture capitalists.

ZeoTech Scientific Team fake? by dannyspanner · 2002-01-08 01:48 · Score: 4, Insightful

For example, at the top of the list Dr. Piotr Blass is listed as Chief Technical Adviser from Florida Atlantic University. But he seems to be missing from the faculty. Google doesn't turn up much on the guy either. Hmmm.

I've not even had time to check the rest yet.

Re:ZeoTech Scientific Team fake? by King+Babar · 2002-01-08 04:11 · Score: 5, Informative

Okay, the mysterious Dr. Wlodzimierz Holtzinski doesn't get a single hit on Google.

Well, that's because they mis-spelled his name. Seriously, I bet they are really trying to refer to Wlodzimierz Holsztynski, who posts to Polish newsgroups from the address "sennajawa@yahoo.com". His last contribution to the one Usenet thread that mentions "zeosync" and his name uses the word "nonsens" a lot, also the phrase "nie autoryzowalem", and the sentence "Bylem ich konsultantem, moze znowu bede, a moze nie, z nimi nie wiadom." Somebody who really knows Polish could probably have a field day with this and other posts...
I'm getting the idea that some people on the scientific team might be better termed "random people we sent email to who actually responded once or twice".

--
Babar
Re:ZeoTech Scientific Team fake? by Evacuator · 2002-01-08 06:06 · Score: 5, Informative

With my limited understanding of Polish I can add that he talks about the nonsense of him being in the scientific team. He also states that his name was used without any authorisation and he points out that the whole affair is only for hustling the money from investors.

--
Human beeing is just an advanced, self-learning machine.

The real "Pigeon hole principle" by richieb · 2002-01-08 01:52 · Score: 3, Informative

If I recall my set theory properly the "Pigeon Hole Principle" simply states that if you have 100 holes and 101 pigeons, when you distribute all the pigeons into all holes, there will be at least one hole with at least two pigeons.

I don't recall any of this crap about pigeons flying out of boxes. Or am I getting old?

--
...richie - It is a good day to code.

Re:Current ratio? by markmoss · 2002-01-08 02:01 · Score: 5, Informative

whats the current ratio?

I would take the *zip algorithms as a standard. (I've seen commercial backup software that takes twice as long to compress the data as Winzip but leaves it 1/3 larger.) Zip will compress text files (ASCII such as source code, not MS Word) at least 50% (2:1) if the files are long enough for the most efficient algorithms to work. Some highly repetitive text formats will compress by over 90% (10:1). Executable code compresses by 30 to 50%. AutoCAD .DWG (vector graphics, binary format) compresses around 30%. Back when it was practical to use PKzip to compress my whole hard drive for backup, I expected about 50% average compression. This was before I had much bit-mapped graphics on it.

Bit-mapped graphic files (BMP) vary widely in compressibility depending on the complexity of the graphics, and whether you are willing to lose more-or-less invisible details. A BMP of black text on white paper is likely to zip (losslessly) by close to 100:1 -- and fax machines perform a very simple compression algorithm (sending white*number of pixels, black*number of pixels, etc.) that also approaches 100:1 ratios for typical memos. Photographs (where every pixel is colored a little differently) don't compress nearly as well; the JPEG format exceeds 10:1 compression, but I think it loses a little fine detail. And JPEGs compress by less than 10% when zipped.

IMHO, 100:1 as an average (compressing your whole harddrive, for example), is far beyond "pretty damn good" and well into "unbelievable". I know of only two situations where I'd expect 100:1. One is the case of a bit-map of black and white text (e.g., faxes), the other is with lossy compression of video when you apply enough CPU power to use every trick known.

Their claims are 100% accurate by Mr+Z · 2002-01-08 02:05 · Score: 3, Interesting

Their claims are 100% accurate (they can compress random data 100:1) only if (by their definition) random data comprises a very small percentage of all possible data sequences. The other 99.9999% of "non-random" sequences would need to expand. You can show this by a simple counting argument.

This is covered in great detail in the comp.compression FAQ. Take a look at the information on the WEB Technologies DataFiles/16 compressor (notice the similarity of claims!) if you're unconvinced. You can find it in Section 8 of Part 1 of the FAQ.

--Joe

--
Program Intellivision!

team members by loudici · 2002-01-08 02:07 · Score: 3, Interesting

navigating through the flash rubbish you can reach a list of team members that includes steve smale from berkeley and richard stanley from MIT who both are existing senior academics.

so either someone has lent their names to weirdoes without paying attention or there is something of substance hidden behind the PR ugliness. after all the PR is aimed toward investors, not toward sentient human beings, and is most probably not under the control of the scientific team.

--
Dev elpizw tipota, dev phoboumai tipota eimai lephteros http://euclidian.org

Re:No Way... by CaseyB · 2002-01-08 02:07 · Score: 3, Insightful

It will (probably) get smaller, a reduction is more likely the bigger the file is.

It "probably" will not.

The reason is that in a random stream you may get repeating patterns (although you may not), and it's these repeating patterns which deflate uses.

Any encoding that saves space by compressing repeating data, also adds overhead for data that doesn't repeat -- at least as much overhead as you saved on the repetition, over the long run.

There ain't no such thing as a free lunch.

How to compress ANY data to one bit by jd · 2002-01-08 02:08 · Score: 3, Funny

Simply have the bit big enough. Let's say you're using one of those old-fashioned binary computers, and want to compress everything to 1/Nth the size. No problem, you simply need a bit with 2^N states. Everything then fits on that single bit.

(Of course, this DOES create all sorts of other problems, but I'm going to ignore those, because they'd go and spoil things.)

--
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)

Infinite monkey compression. by Sobrique · 2002-01-08 02:09 · Score: 4, Funny

Don't bother compressing it, just delete it, and then get an infinite number of monkeys on an infinite number of typewriters to re-produce the original.

It's rare to see such a baldfaced scam by Thagg · 2002-01-08 02:10 · Score: 4, Interesting

I was wondering as I read the headline and summary on slashdot "how can these sleazeballs possibly promote this scam, because it would be easy to show counterexamples?" This shows, once again, that I lack the imagination and chutzpah of a real con artist.

The beauty of this scam is that ZeoSync claims that they can't even do it themselves, yet. They've only managed to compress very short strings. So, they can't be called to compress large random files because, well gosh, they just haven't gotten the big file compressor to work yet. So, you can't prove that they are full of shit.

Beautiful flash animation, though. I particularly like the fact that clicking the 'skip intro' button does absolutely nothing -- you get the flash garbage anyway.

thad

--
I love Mondays. On a Monday, anything is possible.

Not possible by Eivind · 2002-01-08 02:12 · Score: 5, Informative

Someone already pointed out that repeated compression would give infinite compression with this method. But there's another easy way to show that no compressor can ever manage to shrink all messages.

The proof goes like this:

Assume someone claims a compressor that will compress any X-byte message to Y bytes where Y<X
There are 2^(8*X) possible messages X bytes long.
There are 2^(8*Y) possible messages Y bytes long.
Since Y is smaller than X, this means that no 1 to 1 mapping between the two sets can exist, because they're not equally large.

You see this simply if I claim a compressor that can compress any 2-byte message to 1 byte.

There are then 65536 possible input-messages, but only 256 possible outputs. So it is mathematically certain that about 99.6% of the messages can not be represented in 1 byte. (regardless of how I choose to encode them)
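The 2-byte case is small enough to brute-force. Here is a sketch that runs a made-up 2-byte to 1-byte "compressor" over every possible input and groups the collisions. The XOR function is purely a stand-in I picked for illustration, and any other function would hit the same wall.

```python
from collections import defaultdict
from itertools import product

def toy_compress(msg: bytes) -> bytes:
    """Hypothetical 2-byte -> 1-byte 'compressor': XOR of the two bytes."""
    return bytes([msg[0] ^ msg[1]])

# Group every possible 2-byte message by the 1-byte output it produces.
buckets = defaultdict(list)
for a, b in product(range(256), repeat=2):
    msg = bytes([a, b])
    buckets[toy_compress(msg)].append(msg)

inputs = 256 ** 2
outputs = len(buckets)
print(inputs, outputs)                       # 65536 256

# A decompressor can return only one message per output, so at most
# `outputs` inputs can ever round-trip correctly.
lost = inputs - outputs
print(f"{100 * lost / inputs:.2f}% of inputs cannot be recovered")   # 99.61%
```

Swapping in a cleverer toy_compress changes which messages collide, not how many.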

These claims surface every so often. They're bullshit every time. It's even a FAQ-entry on comp.compression

From the press release: Huh? by mblase · 2002-01-08 02:22 · Score: 3, Interesting

Existing compression technologies are currently dependent upon the mapping and encoding of redundantly occurring mathematical structures, which are limited in application to single or several pass reduction. ZeoSync's approach to the encoding of practically random sequences is expected to evolve into the reduction of already reduced information across many reduction iterations, producing a previously unattainable reduction capability. ZeoSync intentionally randomizes naturally occurring patterns to form entropy-like random sequences through its patent pending technology known as Zero Space Tuner™. Once randomized, ZeoSync's BinaryAccelerator™ encodes these singular-bit-variance strings within complex combinatorial series to result in massively reduced BitPerfect™ equivalents. The combined TunerAccelerator™ is expected to be commercially available during 2003.

Now, I'm not as geeky as some, but this looks suspiciously like technobabble designed to impress a bunch of investors and provide long-term promises which can easily be evaded by the end of the next fiscal year. I mean, if they really did have such a technology available today, why is it going to take them an entire twelve months to integrate it into a piece of commercial software?

Re:No Way... by Eivind · 2002-01-08 02:24 · Score: 3, Insightful

Get yourself some random data (real random is of course somewhat hard to find! but the output from a crypto-strength RNG is OK) and zip it. It will (probably) get smaller, a reduction is more likely the bigger the file is.

Bullshit. There will be patterns, but the point is, all patterns are equally likely, so this does not help you. Don't believe me? Test it yourself. Pull, say, a megabyte from your /dev/random (this will take a while!) and then try to compress it with all the compressors on your machine: Zip, Compress, Bzip, you name it.

The odds are very high (as in 99.999% ++) that none of the compressors will manage to shrink the file by a single byte. In fact they will probably all cause it to grow very slightly.
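If you'd rather not wait on /dev/random, the experiment can be sketched in a few lines of Python. This version assumes os.urandom is an acceptable stand-in for the random source, and it uses the standard library's zlib (the deflate method used by zip and gzip), bz2 and lzma modules in place of the command-line tools.

```python
import bz2
import lzma
import os
import zlib

def compressed_sizes(data: bytes) -> dict:
    """Return the compressed size of data under three stdlib compressors."""
    return {
        "zlib/deflate": len(zlib.compress(data, 9)),
        "bz2": len(bz2.compress(data, 9)),
        "lzma/xz": len(lzma.compress(data)),
    }

random_data = os.urandom(1 << 20)     # one megabyte from the OS random source
sizes = compressed_sizes(random_data)
for name, size in sizes.items():
    change = size - len(random_data)
    print(f"{name:12s} {size:8d} bytes ({change:+d} vs. the original)")
```

Each format still has to spend a few bytes on headers and block markers, so the expected result is a small growth for all three.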

Re:Compression to one bit by kzinti · 2002-01-08 02:29 · Score: 3, Informative

Seriously though, the comp.compression FAQ [faqs.org] is really worth a read, especially question #9 [faqs.org]

YES! Ditto. Seconded. Somebody mod this guy up.

Here's a bit to whet your appetite:

9.1 Introduction

It is mathematically impossible to create a program compressing without loss
*all* files by at least one bit (see below and also item 73 in part 2 of this
FAQ). Yet from time to time some people claim to have invented a new algorithm
for doing so. Such algorithms are claimed to compress random data and to be
applicable recursively, that is, applying the compressor to the compressed
output of the previous run, possibly multiple times. Fantastic compression
ratios of over 100:1 on random data are claimed to be actually obtained.

Such claims inevitably generate a lot of activity on comp.compression, which
can last for several months. Large bursts of activity were generated by WEB
Technologies and by Jules Gilbert. Premier Research Corporation (with a
compressor called MINC) made only a brief appearance but came back later with a
Web page at http://www.pacminc.com. The Hyper Space method invented by David
C. James is another contender with a patent obtained in July 96. Another large
burst occured in Dec 97 and Jan 98: Matthew Burch applied
for a patent in Dec 97, but publicly admitted a few days later that his method
was flawed; he then posted several dozen messages in a few days about another
magic method based on primes, and again ended up admitting that his new method
was flawed. (Usually people disappear from comp.compression and appear again 6
months or a year later, rather than admitting their error.)

Other people have also claimed incredible compression ratios, but the programs
(OWS, WIC) were quickly shown to be fake (not compressing at all). This topic
is covered in item 10 of this FAQ.

Re:No Way... by radish · 2002-01-08 02:50 · Score: 3, Funny

*Reads FAQ* *Blushes*

OK, so I went the "negligible housekeeping route". Maybe I should get a job in the patent office. ;-)

--

---- Den ene knappen er powerknapp, den andre er Bender voice knapp "Bite My Shiny Metal Ass"

Re:I think their investment model requires pigeons by softsign · 2002-01-08 03:31 · Score: 5, Interesting

I'm not sure if I understand your point, but from what I do understand, it seems to me you are missing it.

If you look at this sequence as a one-dimensional series: 00101101, it's pretty hard (at least for a processor) to distinguish a pattern there... it's a pseudo-random sequence. But if I paint it this way, in 2d: (0,0) (1,0) (1,1) (0,1), I can step back and see a square with sides of length one.
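To make the reinterpretation concrete, here is a tiny sketch that reads the same eight bits as (x, y) pairs and checks that they trace a unit square. The helper names are mine, and this only illustrates the change of viewpoint; it says nothing about how ZeoSync's method works.

```python
def bits_to_points(bits: str):
    """Read a bit string two bits at a time as (x, y) coordinates."""
    if len(bits) % 2 or set(bits) - {"0", "1"}:
        raise ValueError("expected an even-length string of 0s and 1s")
    return [(int(bits[i]), int(bits[i + 1])) for i in range(0, len(bits), 2)]

def traces_unit_square(points) -> bool:
    """True if four distinct points form a closed path of unit-length steps."""
    if len(points) != 4 or len(set(points)) != 4:
        return False
    closed = points + points[:1]
    return all(abs(x1 - x2) + abs(y1 - y2) == 1
               for (x1, y1), (x2, y2) in zip(closed, closed[1:]))

points = bits_to_points("00101101")
print(points)                        # [(0, 0), (1, 0), (1, 1), (0, 1)]
print(traces_unit_square(points))    # True
```

Note that "a unit square" is only a shorter description if the decoder already knows which squares, starting corners and directions to expect, so the new viewpoint does not save bits on its own.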

AFAIK, what these people are claiming is that they've developed a way to step WAY back, to n-dimensions, and have patterns emerge from seemingly random data.

It's not the random-number generation that's significant here... it's the purported ability to compress a seemingly random sequence. RLE typically doesn't fare very well with pure random data because it only looks for certain types of redundancy.

If I haven't missed the boat here, it's really a very interesting achievement.

Anyone remember the OWS hoax? by wberry · 2002-01-08 05:02 · Score: 5, Interesting

Back in 1991 or 1992, in the days of 2400 bps modems, MS-DOS 5.0, and BBS'es, a "radical new compression tool" called OWS made the rounds. It claimed to have been written by some guy in Japan and use breakthroughs in fractal compression, often achieving 99% compression! "Better than ARJ! Better than PKzip!" Of course all my friends and I downloaded it immediately. Now we can send gam^H^H^Hfiles to each other in 10 minutes instead of 10 hours!

Now I was in the ninth grade, and compression technology was a complete mystery to me then, so I suspected nothing at first. I installed it and read the docs. The commands and such were pretty much like PKzip. I promptly took one of my favorite ga^H^Hdirectories, *copied it to a different place*, compressed it, deleted it, and uncompressed it without problems. The compressed file was exactly 1024 bytes. Hmm, what a coincidence!

The output looked kind of funny though:
Compressing file abc.wad by 99%.
Compressing file cde.wad by 99%.
Compressing file start.bat by 99%.
etc. Wait, start.bat is only 10 characters, that's like one bit! And why is *every* file compressed by 99%? Oh well, must be a display bug.

So I called my friend and arranged to send him this g^Hfile via Zmodem, and it took only a few seconds. But he couldn't uncompress it on the other side. "Sector Not Found", he said. Oh well, try it again. Same result. Another bug.

So I decided that this wasn't working out and stopped using OWS. Their user interface needed some work anyway, plus I was a little suspicious of compression bugs. The evidence was right there for me to make the now-obvious conclusion, but it didn't hit me until a few *weeks* later when all the BBS sysops were posting bulletins warning that OWS was a hoax.

As it turns out, OWS was storing the FAT information in the compressed files, so that when people do reality checks it will appear to re-create the deleted files, as it did for me. But when they try to uncompress a file that actually isn't there or has had its FAT entries moved around, you get the "Sector Not Found" error and you're screwed. If I hadn't tried to send a compressed file to a friend I might have been duped into "compressing" and deleting half my software or more.

All in all, a pretty cruel but effective joke. If it happened today somebody would be in federal pound-me-in-the-ass prison. Maybe it happened then too...

(Yes, this is slightly off-topic, but where else am I going to post this?)

--
LAMP hosting on Debian, SSH, no bandwidth cap, PayPal accepted - http://secondbrainhosting.com/

"practically random data" by hackerhue · 2002-01-08 05:36 · Score: 3, Funny

The output from a pseudo-random number generator is usually considered "random enough for practical purposes." So if you define "practically random data" as "data that is random enough for practical purposes," you can compress it by storing the random seed and the string length. ;-)
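In the spirit of the joke, a minimal sketch of the seed-and-length "compressor". It assumes the data really did come from Python's random.Random with a known seed, and the function names are made up for the occasion.

```python
import random

def generate(seed: int, length: int) -> bytes:
    """Produce `length` pseudo-random bytes from a seeded generator."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(length))

def seed_decompress(packed) -> bytes:
    """'Decompress' a (seed, length) pair by re-running the generator."""
    seed, length = packed
    return generate(seed, length)

original = generate(seed=42, length=100_000)
packed = (42, 100_000)               # the entire "compressed file"

assert seed_decompress(packed) == original
print(f"{len(original)} bytes restored from the pair {packed}")
```

That is well past 100:1, but only because the stream carries no more information than its seed. Hand the same scheme a file that did not come out of this generator and there is no seed to store.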

I think I can beat their 100:1 compression ratio with this scheme.

--

To get something done, a committee should consist of no more than three persons, two of them absent.

Re:how can this be? Answer: BitPerfectTM by Alsee · 2002-01-08 05:55 · Score: 4, Insightful

Note the results are "BitPerfectTM", rather than simply saying "perfect". They try to hide it, but they are using lossy compression. That is why repeated compression makes it smaller, more loss.

"Singular-bit-variance" and "single-point-variance" mean errors.

The trick is that they aren't randomly throwing away data. They are introducing a carefully selected error to change the data to a version that happens to compress really well. If you have 3 bits, and introduce a 1 bit error in just the right spot, it will easily compress to 1 bit.

000 and 111 both happen to compress really well, so...

000: leave as is. Store it as a single zero bit
001: add error in bit 3 turns it into 000
010: add error in bit 2 turns it into 000
011: add error in bit 1 turns it into 111
100: add error in bit 1 turns it into 000
101: add error in bit 2 turns it into 111
110: add error in bit 3 turns it into 111
111: leave as is. Store it as a single one bit.
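The table above amounts to a majority vote over the three bits. Here is a sketch of that reading, with function names of my own choosing; whether ZeoSync does anything like this is pure speculation on this thread's part.

```python
def lossy_compress(bits: str) -> str:
    """Map a 3-bit string to one bit by majority vote."""
    if len(bits) != 3 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly three characters, each 0 or 1")
    return "1" if bits.count("1") >= 2 else "0"

def lossy_decompress(bit: str) -> str:
    """Expand the stored bit back to the nearest 'nice' string, 000 or 111."""
    return bit * 3

for value in range(8):
    original = format(value, "03b")
    stored = lossy_compress(original)
    restored = lossy_decompress(stored)
    flipped = sum(a != b for a, b in zip(original, restored))
    print(f"{original} -> {stored} -> {restored}  ({flipped} bit(s) changed)")
```

Only 000 and 111 survive the round trip intact. The other six inputs each come back with one bit changed.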

They are using some pretty hairy math for their list of strings that compress the best. The problem is that there is no easy way to find the string almost the same as your data that just happens to be really compressible. That is why they are having "temporal" problems for anything except short test cases.

Basically it means they *might* have a breakthrough for audio/video, but it's useless for executables etc.

-

--
- - You can't take something off the Internet! That's like trying to take pee out of a swimming pool.

Simple, it can't be by nusuth · 2002-01-08 06:19 · Score: 5, Insightful

I have been pretty late to this thread, and I'm sorry if this is redundant. I just can't read all 700 posts.

1:100 average compression on all data is just impossible. And I don't mean "improbable" or "I don't believe that"; it is impossible. The reason is the pigeon hole principle. For simplicity, assume that we are talking about 1000-bit files. Although you can compress some of these 1000-bit files to just 10 bits, you cannot possibly compress all of them to 10 bits, as 10 bits give just 1024 different configurations while 1000 bits call for representations of 2^1000 different configurations. If you can compress the first 1024, there is simply no room to represent the remaining 2^1000 - 1024 files.

...And that is assuming the compression header takes no space at all...

So every lossless compression algorithm that can represent some files with other files shorter than the original must expand some other files. Higher compression on some files means the number of files that do not compress at all is also greater. An average compression rate other than 1 is only achievable if there is some redundancy in the original encoding. I guess you can call that redundancy "a pattern." Rar, zip, gzip etc. all achieve a compressed/original length ratio of less than 1 on average because there is redundancy in the originals: programs that have some instructions and prefixes that occur commonly, pictures that are represented with a full dword although they use a few thousand colors, sound files almost devoid of very low and very high numbers because of recording conditions, etc. No compression algorithm can achieve a ratio of less than 1 averaged over all possible strings. It is a simple consequence of the pigeon hole principle and cannot be tricked.

--

Gentlemen, you can't fight in here, this is the War Room!

Confirmed with my Polish speaking coworkers by Ewann · 2002-01-08 07:00 · Score: 3, Informative

We have three native Polish speakers in my office. I asked one of them to translate the professor's reply. She said the gist of it is that he was upset they released his name, he didn't authorize any information release, etc. Apparently didn't deny or confirm the truth of the information but said something about having "more important things in my career" or something like that (not verbatim quote).
