Tighter Video Compression With Wavelets
RickMuller writes: "There is a Caltech Press Release here that talks about a new 3D video compression algorithm by Caltech's Peter Schroeder and Bell Labs' Wim Sweldens that they claim is 12 times smaller than MPEG4 and 6 times smaller than the previously best published algorithm. The algorithm uses wavelets for the data compression.
Potential applications in real estate (digital walk-throughs of houses) are cited in the article. Anyone figure out a way to wire this stuff up to Q3 Arena yet? The results were presented in a talk at SIGGRAPH 2000 in New Orleans."
"You'll also be able to see how it will look after you knock out a wall, repaint the rooms, and drop in new furniture from a 3-D catalogue"
...but will it allow wireframe/noclip mode, so I can track the plumbing, electrical, and network connections through the walls?
--
"It's tough to be bilingual when you get hit in the head."
this can only mean one thing - download full length hollywood films which are only 50 megs instead of 300!
hooray!
--
when the rain comes, they run and hide their heads. they might as well be dead.
a new 3D video compression algorithm by Caltech's Peter Schroeder and Bell Labs' Wim Sweldens that they claim is 12 times smaller than MPEG4 and 6 times smaller than the previously best published algorithm.
That's great that the algorithm is smaller, but what we really want is smaller data.
---
Interested in the Colorado Lottery or Powerball games?
check out http://colotto.com
This isn't going to make movies any smaller to download.
What it's going to do is make 3D worlds smaller to download.
It's not the compression technique that will allow you to view in complete 3D the inside of a house, but the fact that you can record a 3D model of a house and still have it small enough to download.
The biggest improvement would probably be for VRML-type technologies. And it's not going to make Quake faster, but it could possibly let someone on a 28.8 use a customized skin that can be quickly sent to all other computers. Most people download Quake worlds before they start playing rather than on the fly. -Kashent
They are comparing polygon mesh compression with video compression. Sounds like apples-to-oranges to me. Also sounds like it will have no effect on video compression, and it will have limited impact on rendering time.
I say limited, because you still need to draw those polygons. However, one nice feature of wavelets (at least for images) is that you can easily extract just enough data for displaying at a particular resolution. If that property holds for polygon meshes, then you should be able to draw only as many polygons as are useful for your display resolution.
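To make the "extract just enough data for a particular resolution" point concrete, here is a minimal sketch for a 1-D signal, assuming the plain Haar wavelet. The article's mesh wavelets are more involved, and the function names below are made up for illustration.

```python
def haar_forward(signal):
    """Full Haar decomposition of a list whose length is a power of two.

    Returns [overall average, coarsest detail, ..., finest details].
    """
    data = list(signal)
    details = []
    while len(data) > 1:
        averages = [(data[i] + data[i + 1]) / 2 for i in range(0, len(data), 2)]
        diffs = [(data[i] - data[i + 1]) / 2 for i in range(0, len(data), 2)]
        details = diffs + details  # coarser details end up in front
        data = averages
    return data + details


def reconstruct(coeffs, samples):
    """Rebuild a `samples`-point approximation from the first `samples` coefficients."""
    data = coeffs[:1]
    while len(data) < samples:
        diffs = coeffs[len(data):2 * len(data)]
        finer = []
        for a, d in zip(data, diffs):
            finer += [a + d, a - d]
        data = finer
    return data


signal = [9, 7, 3, 5, 6, 10, 2, 6]
coeffs = haar_forward(signal)
half_res = reconstruct(coeffs, 4)  # uses only the first 4 coefficients
full_res = reconstruct(coeffs, 8)  # uses all 8 and recovers the input
```

The half-resolution version is just the pairwise averages of the original, and it never touches the finest coefficients. Whether the same trick carries over cleanly to polygon meshes is exactly the open question in the comment above.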
I have included a sample of the technology compressing "The Matrix" below:
1
As well as Quake 3 demo:
0
Note: Also decreases viewing time, increasing the ability of the user to consume more media.
-- Bird in the Bush: The Renewable Energy Blog http://www.birdinthebush.org
Nope. 3d data is already smaller than 2d data. You have a set of textures (already present in a 2d movie, except that you don't tend to save space by knowing where/how they're tiled) and a set of meshes. Everything is included ONCE and then re-used later when it's needed without having to redownload if you're streaming.
For instance, if you look at the content stuff that comes with a 3d package (probably not a good example) you have ~500k of files which are used to create animations which, even when compressed (albeit with shitty indeo compression) take up over a megabyte.
Finally, if you took a look at the objects, motion files, and textures for Toy Story and then compared that to what a full-resolution MPEG-4 movie of the entire feature would be like, I think you'd come to the conclusion that the 3d data is more space-efficient than the 2d data. Of course, you have to walk the line between processing time needed to render on the remote end, and amount of data you're going to send.
So really, I can't see what they've actually accomplished here, and it's definitely apples and oranges to compare anything for 3d data with a 2d video standard. Perhaps they are misusing the term three-dimensional.
On a side note, a coworker says it would be possible to send vertex coloring data which you've prerendered, and then have the user's system do all the easy rendering tasks, overlaying the vertex coloring. This would let you have good dynamic lighting effects (Radiosity, anyone?) and still be able to keep the bandwidth low. If anyone does this after reading this post, you owe me two copies of the server and the content creation software :)
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
MPEG-4 can be used for 3D content. The Web 3D consortium is currently working on the project. I assume this is what they were comparing to.
As anybody who has ever seen the stupendous quality of high-res wavelet images knows, this compression is pretty amazing. Although there are probably other (and, I might add, more legitimate) uses for this, the first thing that comes to mind is porn videos. Seriously! The adult market is one of the only internet ventures making money, and this new compression just helps out the US economy some more. (Not to mention the fact that if wavelet compression allows streaming quality comparable to DVD, then you can cancel your subscription to the Spice channel ;)
A deep unwavering belief is a sure sign you're missing something...
I know this article specifically refers to using wavelets for compressing 3D, but it should be usable for video too. FYI, wavelet compression is a transform-coding method (not to be confused with fractal compression, which is a separate technique). If you've seen the wavelet demos, you know just how much better than JPEG wavelet compression looks. Using the same process, it is also possible to compress a sequence of frames as in a video. This is demonstrated in some of the wavelet demos floating around the net. Right now they're in black and white, and are pretty small, but it is conceivable that they'll get better. A big problem with wavelet compression is the CPU cost. Even single images can take a second or two to decompress. However, that can probably be offset by hardware decompression mechanisms, and it doesn't seem that there will be a shortage of CPU power anytime soon. If you've ever seen how well wavelets let you compress images 50:1 with so little quality degradation, you'll understand what an impact this method could have on computer video.
A deep unwavering belief is a sure sign you're missing something...
I just finished writing a proposal to NASA for some instruments on the Solar Probe spacecraft. That's a pretty telemetry-constrained mission. We tested a proprietary wavelet-compression algorithm at 50:1 on 14-bit images (yes, that's about a quarter-bit per pixel) and even at that level it's very hard to tell the difference between compressed and uncompressed images with the naked eye. (The algorithm seems to work by quantizing the sizes of features in the image).
At that level of compression, a 30Hz stream of 6bit-per-channel 640x480 images would only require just over 3Mbps of bandwidth -- and that's without taking any advantage of the relationship between frames. It's easy to believe that another factor of 50 could come out of a combination of more aggressive compression and either differential encoding or 3-D wavelets. We could end up with full-motion, full-rate video being squirted through 60kbps connections.
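For anyone who wants to check the arithmetic, here is a quick sketch. It assumes three color channels (the post only says 6 bits per channel), and the helper name is made up.

```python
def stream_bitrate(width, height, bits_per_pixel, fps, ratio):
    """Bits per second for frames coded independently at `ratio`:1 compression."""
    return width * height * bits_per_pixel * fps / ratio


BITS_PER_PIXEL = 3 * 6  # assumption: 3 channels at 6 bits each

raw = stream_bitrate(640, 480, BITS_PER_PIXEL, 30, 1)    # about 165.9 Mbps
first = stream_bitrate(640, 480, BITS_PER_PIXEL, 30, 50)  # about 3.3 Mbps
second = first / 50                                       # about 66 kbps

print(f"raw: {raw / 1e6:.1f} Mbps")
print(f"50:1: {first / 1e6:.2f} Mbps")
print(f"another 50:1: {second / 1e3:.1f} kbps")
```

So "just over 3Mbps" and "60kbps" both check out to within rounding, given the three-channel assumption.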
In most natural or real-world data (i.e. images, geometry data, etc.) the information at a given point in the data is very highly dependent on the data at nearby points. Thus, there is a certain amount of redundancy in the data, and this redundancy is spatially localized. The concept in transform coding is to apply some transformation (either linear or nonlinear; the wavelet transform and Fourier transforms are linear) to this data to reduce the statistical redundancy.
Even after applying the transform, you haven't saved anything in terms of the space required to store the data; all you've done is change the basis used to represent the data. Now you take the transformed data and place it into a bunch of bins, each of which is identified with an integer. At this stage, called quantization, you are modifying the information present, because the best accuracy with which you can recover the data is given by the width of the bins. Finally, you take the sequence of integers and apply a lossless coding scheme to it to reduce the number of bits required to represent the stream of integers. The compression happens at this stage. Wavelets do a better job than the blocked discrete cosine transform (used in JPEG) at reducing the statistical redundancy of the input data; thus wavelet-based image compression compresses more efficiently than JPEG.
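The three stages (transform, quantize, lossless coding) can be sketched in a few lines. This is a toy under stated assumptions: a single level of a Haar-style transform stands in for a real wavelet transform, and instead of an actual entropy coder it just measures the empirical entropy of the integer stream as a rough estimate of what a lossless coder would need. All names are hypothetical.

```python
import math
from collections import Counter


def haar_pairs(samples):
    """One level of a Haar-style transform: pairwise averages, then differences."""
    avg = [(samples[i] + samples[i + 1]) / 2 for i in range(0, len(samples), 2)]
    diff = [(samples[i] - samples[i + 1]) / 2 for i in range(0, len(samples), 2)]
    return avg + diff


def inverse_haar_pairs(coeffs):
    """Undo haar_pairs: each (average, difference) pair gives back two samples."""
    half = len(coeffs) // 2
    out = []
    for a, d in zip(coeffs[:half], coeffs[half:]):
        out += [a + d, a - d]
    return out


def quantize(coeffs, step):
    """Drop each value into an integer-labelled bin of width `step` (the lossy part)."""
    return [round(c / step) for c in coeffs]


def dequantize(bins, step):
    """Map each bin label back to the centre of its bin."""
    return [b * step for b in bins]


def entropy_bits(symbols):
    """Empirical Shannon entropy of the symbol stream, in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(n / total * math.log2(n / total) for n in counts.values())


signal = list(range(64))  # a smooth ramp: neighbours are highly correlated
step = 1.0

bins = quantize(haar_pairs(signal), step)
restored = inverse_haar_pairs(dequantize(bins, step))

print("bits/symbol without transform:", entropy_bits(quantize(signal, step)))
print("bits/symbol with transform:   ", entropy_bits(bins))
print("max reconstruction error:     ", max(abs(a - b) for a, b in zip(signal, restored)))
```

On the ramp, all the difference coefficients land in the same bin, so the integer stream is cheaper to code after the transform, and the reconstruction error stays within the bin width.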
What Schroeder and Sweldens have done is take a very general, widely applicable method for constructing wavelet transforms (known as the lifting scheme, invented by Sweldens) and adapt it for representing mesh nodes and connectivity information, i.e. geometry (which incidentally could just as easily be higher-dimensional data). Thus they have a wavelet transform for geometry. They achieve compression by taking the EZW coding scheme, developed for coding wavelet coefficients of images, and applying it to their geometry wavelets.
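For a feel of what lifting looks like, here is a minimal 1-D sketch of the integer Haar transform built from a predict step and an update step. This is not the mesh scheme from the paper, just the simplest textbook instance of lifting, and the function names are hypothetical.

```python
def lift_forward(samples):
    """One lifting level on an even-length list of integers."""
    even, odd = samples[0::2], samples[1::2]              # split
    detail = [o - e for e, o in zip(even, odd)]           # predict odd from even
    smooth = [e + d // 2 for e, d in zip(even, detail)]   # update even to a rounded average
    return smooth, detail


def lift_inverse(smooth, detail):
    """Undo the lifting steps in reverse order; exact for integers."""
    even = [s - d // 2 for s, d in zip(smooth, detail)]   # undo update
    odd = [d + e for e, d in zip(even, detail)]           # undo predict
    out = [0] * (2 * len(even))
    out[0::2], out[1::2] = even, odd                      # merge
    return out


samples = [9, 7, 3, 5, 6, 10, 2, 6]
smooth, detail = lift_forward(samples)
assert lift_inverse(smooth, detail) == samples
```

The appeal is that every step is undone by running the same arithmetic backwards, so invertibility comes for free no matter what predictor you plug in. That is what makes the construction easy to adapt to odd domains such as meshes.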
It should be very nice for low-bitrate storage and transmission of geometry, as well as successive-refinement transmission (i.e. the 3-d data gets better and better looking as more bits arrive).
All is Number -Pythagoras.
Sure, wavelets are O(n), FFT is O(n log(n)).
But the FFT has a much better constant, and so is generally faster on real-world data sets.
The real win with wavelets isn't speed, it is the match to the real world data. A sharp boundary in the FFT has to have a "long tail" in the coefficients, causing Fourier transforms to suffer from things like the Gibbs effect. Wavelets allow you to make a deliberate tradeoff between smoothness and sharp boundaries. So more information is in fewer coefficients.
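A small experiment illustrates the "more information in fewer coefficients" point. It assumes a Haar wavelet and a 64-sample step edge, and uses a naive O(n^2) DFT rather than an FFT to stay dependency-free. The names are made up.

```python
import cmath


def haar(signal):
    """Full Haar decomposition of a power-of-two-length list."""
    data = list(signal)
    details = []
    while len(data) > 1:
        avg = [(data[i] + data[i + 1]) / 2 for i in range(0, len(data), 2)]
        dif = [(data[i] - data[i + 1]) / 2 for i in range(0, len(data), 2)]
        details = dif + details
        data = avg
    return data + details


def dft(signal):
    """Naive discrete Fourier transform (same coefficients an FFT would give)."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def count_significant(coeffs, tol=1e-9):
    """Number of coefficients whose magnitude exceeds `tol`."""
    return sum(1 for c in coeffs if abs(c) > tol)


step_edge = [0.0] * 20 + [1.0] * 44  # one sharp boundary

haar_count = count_significant(haar(step_edge))
dft_count = count_significant(dft(step_edge))
print("nonzero Haar coefficients:", haar_count)
print("nonzero DFT coefficients: ", dft_count)
```

A single edge can touch at most one Haar detail coefficient per level, so the Haar count is bounded by log2(n) + 1, while the edge smears across nearly all of the Fourier coefficients.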
BTW a lot of the better wavelet algorithms (eg wavelet packets) are no longer O(n). Why not? Because they allow you to dynamically choose the best representation out of a family of representations. That extra freedom requires processing time...
Cheers,
Ben
My usual seat in the cluetrain is at A HREF="http://pub4.ezboard.com/biwethey.ht
With wavelets, at a very basic level there are too many options. Wavelet researchers don't talk about "a" wavelet transform; they have entire families of wavelet transform algorithms to argue over. Each is better in different circumstances.
This makes standardization harder. There are a lot of tradeoffs. Do we go with the one that works better on smooth data? Or on boundaries? The one which is symmetric so that the errors it produces tend to be harder for the human ear to pick up? Or the one which is orthogonal, giving it a ton of nice mathematical properties? Shall we have a simple wavelet transform? Or a dynamic wavelet packet transform? Do we work from the most significant bit of data to the least? Do we try to order the data in some way? (The first allows for bandwidth to determine the compression level chosen, the second is key for streaming output.)
The basic idea of a wavelet is very flexible. So you get a lot of choices, none of which is obviously better than the others. This makes it hard to decide which should be made a standard...
Cheers,
Ben