More On The BBC's Codec 'Dirac'
TioHoltzman writes "El Reg is reporting about a new codec that is built on top of wavelet technology and seems to offer performance that is "roughly in line with the Video Codec 9" from Microsoft. The project has been released as open source on SourceForge. This looks like it might be really interesting." (Previously mentioned a few weeks back.)
The Sourceforge page says that Dirac uses arithmetic coding. Aren't there patents on arithmetic coding? I thought that was the problem with using JBIG for bilevel images, and why most free compressors use Huffman coding or the like.
Last time I checked, wavelet compression methods were burdened by many patents: google search. What does that mean for users of the codec?
"I love my job, but I hate talking to people like you" (Freddie Mercury)
Call me a zealot, but I think things are better off open source, doubly so in the case of codecs. I mean, it's a media encapsulation. If a codec is open, then the potential for cross-platform success is much better. Potential for profit may go down, but I'm talking innovation, not wallets.
Only the purest of souls seek enlightenment. Everyone else just wants power.
Does adding a little note saying "we covered this a few weeks ago" always get the editors off the hook for posting the same article twice? ;)
Am I the only one who thinks that Dirac sounds like some sort of monster from the Dr. Who series?
BBC to Put Entire Radio & TV Archive Online
Spam Vikings await.
The standard way to compress both audio and video is with the Discrete Cosine Transform, or DCT. MPEG audio and video are based on DCT.
The basic idea of DCT is to transform the data into a series of waves, which tends to concentrate the data. Then you throw away part of the data, and then use lossless encoding on what is left. If you just threw away pixels, the result would be obvious in an image; but if you throw away part of the wave specification data, the results are not as obvious.
With DCT, consistent data sets compress very well (e.g., a blue sky or a white wall). Pictures with lots of sharp little edges (e.g., a field of blades of grass) compress much less well.
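For anyone who wants to see the "transform, throw part away, transform back" idea in action, here is a rough sketch in plain Python. It is not taken from any real codec: the function names, the 8-sample signals, and the choice to keep 4 of 8 coefficients are all made up for illustration. It uses a straightforward 1-D orthonormal DCT-II and compares a smooth ramp (think blue sky) with an alternating pattern (think blades of grass).

```python
import math

def dct(signal):
    """Orthonormal 1-D DCT-II of a list of numbers (slow O(n^2) version)."""
    n = len(signal)
    out = []
    for k in range(n):
        total = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, x in enumerate(signal))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * total)
    return out

def idct(coeffs):
    """Inverse of dct() above."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(total)
    return out

def lossy_roundtrip(signal, keep):
    """Keep only the first `keep` (lowest-frequency) coefficients."""
    coeffs = dct(signal)
    truncated = coeffs[:keep] + [0.0] * (len(coeffs) - keep)
    return idct(truncated)

def max_error(a, b):
    """Largest absolute difference between two equal-length sequences."""
    return max(abs(x - y) for x, y in zip(a, b))

# Hypothetical test signals, one row of 8 "pixels" each.
smooth = [100 + i for i in range(8)]       # gentle gradient
edgy = [0, 255, 0, 255, 0, 255, 0, 255]    # lots of sharp little edges

smooth_err = max_error(smooth, lossy_roundtrip(smooth, 4))
edgy_err = max_error(edgy, lossy_roundtrip(edgy, 4))
print("smooth error:", smooth_err)
print("edgy error:", edgy_err)
```

Throwing away half the coefficients barely touches the ramp but wrecks the alternating pattern, which is the sky-versus-grass difference in miniature. A real encoder quantizes coefficients rather than simply zeroing them, so treat this only as a toy.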
My understanding is that potentially wavelets will compress even better than the DCT. However, they are not enough better to be a huge win at the moment.
steveha
lf(1): it's like ls(1) but sorts filenames by extension, tersely
As for wavelet compression being a novel codec, what about Apple's Pixlet technology?
Some drink at the fountain of knowledge. Others just gargle.
This is from 1998.
http://www.seyboldreports.com/SRIP/wavelet/
Regardless of patents etc. it doesn't matter that there is something as good as a Microsoft codec. Unless there is a perceived advantage, unfortunately it isn't going to become widely adopted because the huge mass marketing machine that is Microsoft is pushing its technology and making it the easy to use default.
You only have to look at Mozilla/Firebird which have finally matured into reasonably solid stable products. Netscape innovated, then lost market share and IE got a foothold. Now it doesn't matter to most companies that there is once again a good alternative in Mozilla because it only has a small marketshare. In the case of MP3, it took more of a foothold earlier on but we're already seeing movement towards proprietary formats.
The only way that the open source community is going to do well here is to provide a single coherent product without branches that is trivial to install and use for the average non-technical computer user. Unfortunately the very nature of open source and free software makes this difficult, because you have to reach a consensus amongst a diverse range of very intelligent people with very different political agendas. Choosing a single united front is a huge challenge.
Forget the codec for a moment. If I want to install the latest client operating system from Microsoft there is only 1. (This is the ideal - I know we've had Me/98/XP running concurrently but that's still only 3). How many Linux distributions exist - each version with its quirks and styles? It may be fantastic from the point of view of evolution of the software. It's not going to get users switching over.
These posts express my own personal views, not those of my employer
While wavelet coding doesn't offer a breathtaking advantage in data rate vs. quality, it does appear to lend itself to a simpler implementation than DCT does, and unlike MPEG, which is very intensive on the encoder, it places symmetrical burdens on encoder and decoder.
It was a core assumption in the design of MPEG that the world market for encoders was quite small (where have we heard that theme before???) Clearly, the assumption was false, and one advantage of switching to a wavelets technology would be reduced cost per unit for encoders.
--- Bill
Many many people do not understand how the government can tax a TV set, and I can admit I am sometimes in that crowd, but let us also recognize that the Beeb is perhaps the most important source of news that exists on this insane mudball today, regardless of how they get it to you (and more ways is better). I hope that whatever the Beeb does is a huge success. It has to be. Or the sky will fall & crush us all to death. That I am not kidding about......Bush just thanked Rumsfeld for torturing people. Up is down & down is up. And Amerikans are mostly OK with this.
But the BBC isn't the government - it's public service broadcasting at its best (though it's not as good as it might be, since it feels the need to justify the license fee by playing the ratings game and filling the schedule with mindless drivel). The BBC has been at the forefront of broadcast engineering development since the 1920s, and I'm happy to see them contributing to the world once more.
And the top rate of income tax over here isn't 50%, it's 40% - I wish it was 50% for high earners, then perhaps they'd have less disposable income to push house prices beyond the reach of the rest of us.
oh brave new world, that has such people in it!
There are lots of great or just good enough codecs out there. Having an open source codec would be great, but the biggest problem today is not getting the best/freest codec but instead is making it available from the average browser. From a practical point of view, it might be more worthwhile resigning oneself and exerting effort to make common formats (Windows, Quicktime) work well from a Linux computer (from my understanding the Mplayer plugin won't stream Windows/Quicktime).
Not that this type of research should be discontinued, of course, but from the numerous projects I've been involved in that used streaming media, common availability was the biggest problem... we often had to produce video for Windows, Quicktime and Real. There are some environments (technophobes, corporations, and government) where you can't install a new plugin.
In fact I think a Java based media streaming applet might be a great solution, since Java has pretty good saturation (although *sigh* there is no entirely free software or open source Java implementation at this moment).
(This is an excerpt from the book 'Surely You're Joking, Mr. Feynman!' and is for everyone here who has, or hasn't, heard of Paul Adrien Maurice Dirac, the namesake of this new codec. It also conveniently fits in with the two articles about Japan that made their way onto Slashdot today.)
..."
While in Kyoto I tried to learn Japanese with a vengeance. I worked much harder at it, and got to a point where I could go around in taxis and do things. I took lessons from a Japanese man every day for an hour.
One day he was teaching me the word for "see." "All right," he said. "You want to say, 'May I see your garden?' What do you say?"
I made up a sentence with the word that I had just learned.
"No, no!" he said. "When you say to someone, 'Would you like to see my garden?' you use the first 'see.' But when you want to see someone else's garden, you must use another 'see,' which is more polite."
"Would you like to glance at my lousy garden?" is essentially what you're saying in the first case, but when you want to look at the other fella's garden, you have to say something like "May I observe your gorgeous garden?" So there's two different words you have to use.
Then he gave me another one: "You go to a temple and you want to look at the gardens..."
I made up a sentence, this time with the polite "see."
"No, no!" he said. "In the temple, the gardens are much more elegant. So you have to say something that would be equivalent to 'May I hang my eyes on your most exquisite gardens?'"
Three or four different words for one idea, because when I'm doing it, it's miserable; when you're doing it, it's elegant.
I was learning Japanese mainly for technical things, so I decided to check if this same problem existed among the scientists.
At the institute the next day, I said to the guys in the office, "How would I say in Japanese, 'I solve the Dirac equation'?"
They said such-and-so.
"OK. Now I want to say, 'Would you solve the Dirac equation?' -- how do I say that?"
"Well, you have to use a different word for 'solve,'" they say.
"Why?" I protested. "When I solve it, I do the same damn thing as when you solve it!"
"Well, yes, but it's a different word -- it's more polite."
I gave up. I decided that wasn't the language for me, and stopped learning Japanese.
Nothing to do with the government. The BBC is granted a charter from Parliament, but is not government run or funded. The BBC is funded by a compulsory license fee for owning equipment capable of receiving and decoding their broadcasts, such as a TV or tuner card. Basically it's a tax on virtually every household and business in the UK. There is a discount for black & white TVs, pensioners and those with vision-based disabilities. In the 'old days' you used to need a 'wireless licence' as well for radios!
This means that when information is dropped in each block (according to the compression required), the edges of blocks suffer in a way unrelated to the edge of adjacent blocks. The result -- as the quality decreases, the edges between blocks become more and more obvious, and the whole image becomes 'blocky'.
I believe this is one way that wavelet technology improves -- the individual wavelets are spread over the whole image, without regard for any blocks, and so the compression degrades much more gracefully.
As you say, the DCT converts each 8x8 block into a series of cosine waves, both horizontally and vertically in the block. Then, when it needs to reduce the space, it drops the higher-frequency coefficients first -- this is why sharp edges, with lots of high frequency information, suffer most. (You tend to find that lower-frequency coefficients try to compensate, giving the characteristic ripples near sharp edges.) Areas that are relatively smooth, with only low-frequency information to start with, suffer much less.
Another way JPEG loses information is by colour. The human eye is much more sensitive to fine changes in brightness than it is to fine changes in colour; so the picture is transformed from RGB into a brightness channel and two colour channels, and the brightness channel gets a greater share of the limited space. It's quite interesting, if you're, er, interested in that sort of thing...
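As a rough illustration of the brightness/colour split, here is a small Python sketch. The conversion constants are the full-range BT.601 ones commonly used with JPEG (JFIF); everything else (the function names, the four sample pixels, and averaging chroma over horizontal pairs) is an assumption made for the example, not a description of any particular encoder.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 conversion, as commonly used with JPEG/JFIF."""
    y = 0.299 * r + 0.587 * g + 0.114 * b                # brightness
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b     # blue-difference
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b     # red-difference
    return y, cb, cr

def subsample_pairs(channel):
    """Halve a channel's resolution by averaging neighbouring pairs.

    Assumes an even number of samples.
    """
    return [(channel[i] + channel[i + 1]) / 2
            for i in range(0, len(channel) - 1, 2)]

# A hypothetical row of four RGB pixels.
row = [(200, 30, 30), (210, 40, 35), (20, 20, 220), (25, 30, 210)]

ys, cbs, crs = (list(channel) for channel in
                zip(*(rgb_to_ycbcr(*pixel) for pixel in row)))

# Brightness keeps full resolution; the two colour channels are halved.
cbs_half = subsample_pairs(cbs)
crs_half = subsample_pairs(crs)

values_before = 3 * len(row)
values_after = len(ys) + len(cbs_half) + len(crs_half)
print(values_before, "values before,", values_after, "values after")
```

In this toy row the brightness channel keeps all four samples while each colour channel drops to two, so a third of the numbers are gone before any DCT step happens.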
Ceterum censeo subscriptionem esse delendam.
Pixlet is designed for real-time editing, so it has minimal artifacts and no interframe compression. Dirac is for broadcast, so it is much more aggressive about compression and can take advantage of motion compensation and other computationally expensive compression techniques.
You are right, however, that wavelets are not at all a new compression technology. People started playing with them at least 10 years ago and JPEG-2000 uses wavelets for still photo compression. I think that the computational load has prevented their use in video until recently.
It needed some improvements (more searching), and had some faults: around when it came out, it took a 600MHz Alpha (the fastest processor at that time, or darn near it) 24 hours for a 30-sec clip, because it used brute force. The quality was good, though, and the other compression types all produced much larger files, and some looked worse. The problem is the difficulty in finding the fractals that will work. Recreating the image is relatively easy.
The great thing about wavelets is how they work at arbitrary resolution without much of a performance hit. Edges look like edges. Since you can basically make a general description of an image and just keep adding more detailed wavelets until you've got the compression/quality ratio you're looking for, you can define quality however you'd like. One of the ideas for JPEG2000 is to have a field in image tags to specify how much of the image a browser should download, so you'd only have to keep one copy on the server. (By the way, where the hell is JPEG2000?)
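The "general description first, detail later" idea can be sketched with the simplest wavelet there is, the Haar wavelet. This is only a toy in plain Python, not what JPEG2000 or Dirac actually use: the function names and the 8-sample signal are invented, and the input length is assumed to be a power of two.

```python
def haar_forward(signal):
    """Multi-level Haar transform.

    Returns [overall average, coarsest detail, ..., finest details].
    Assumes len(signal) is a power of two.
    """
    data = list(signal)
    details = []
    while len(data) > 1:
        avgs = [(data[i] + data[i + 1]) / 2 for i in range(0, len(data), 2)]
        diffs = [(data[i] - data[i + 1]) / 2 for i in range(0, len(data), 2)]
        details = diffs + details   # coarser details end up in front
        data = avgs
    return data + details

def haar_inverse(coeffs):
    """Rebuild the signal from haar_forward() output."""
    data = coeffs[:1]
    pos = 1
    while pos < len(coeffs):
        diffs = coeffs[pos:pos + len(data)]
        pos += len(data)
        rebuilt = []
        for avg, diff in zip(data, diffs):
            rebuilt.extend([avg + diff, avg - diff])
        data = rebuilt
    return data

def progressive(signal, keep):
    """Reconstruct using only the first `keep` coefficients."""
    coeffs = haar_forward(signal)
    return haar_inverse(coeffs[:keep] + [0.0] * (len(coeffs) - keep))

def squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

signal = [9, 7, 3, 5, 6, 10, 2, 6]   # hypothetical row of pixel values
for keep in (1, 2, 4, 8):
    approx = progressive(signal, keep)
    print(keep, approx, squared_error(signal, approx))
```

With one coefficient you get a flat average of the whole row; each further batch of coefficients sharpens it, and with all eight you get the original back exactly. A receiver can stop reading at whatever point the quality is good enough, which is the one-copy-on-the-server idea.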
The above just takes advantage of spatial similarity (if a pixel is one color, its neighbors are probably similar), but you can also take advantage of temporal similarity (if a pixel is one color in this frame, it's probably a similar color in the next one). You can also do motion compression, though when you get to that level of optimization you generally lose the symmetry between sender and receiver resource consumption. Of course, that might just be another CS dissertation away.
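Temporal similarity is easy to show with plain frame differencing. The sketch below is a deliberately naive Python example with made-up frames and function names. It stores the first frame whole and every later frame as a per-pixel difference from the one before, and it does no motion search at all, so it is nowhere near what a real video codec does.

```python
def encode_deltas(frames):
    """Store the first frame as-is, then per-pixel differences."""
    encoded = [list(frames[0])]
    for prev, cur in zip(frames, frames[1:]):
        encoded.append([c - p for p, c in zip(prev, cur)])
    return encoded

def decode_deltas(encoded):
    """Rebuild the frames by adding each delta to the previous frame."""
    frames = [list(encoded[0])]
    for delta in encoded[1:]:
        frames.append([p + d for p, d in zip(frames[-1], delta)])
    return frames

# Three hypothetical 4-pixel frames in which very little changes.
frames = [
    [10, 10, 10, 10],
    [10, 10, 12, 10],
    [10, 10, 12, 11],
]

encoded = encode_deltas(frames)
zeros = sum(d == 0 for delta in encoded[1:] for d in delta)
print(encoded)
print("zero deltas:", zeros)
```

Most of the deltas come out as zero, and long runs of zeros are exactly what a lossless coder squeezes well. Note that in this toy the encoder and decoder do about the same amount of work; it is the motion search that tips the balance toward the encoder.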
WARNING: there is a trojan on your
There are lots of great or just good enough codecs out there. Having an open source codec would be great, but the biggest problem today is not getting the best/freest codec but instead is making it available from the average browser.
Yes, and why are so few codecs available? Two reasons: (1) most codecs out there are a software engineering mess and hence hard to integrate into anything, and (2) most of them are heavily covered by patents and copyrights so people can't just write a plug-in and distribute it.
Something like Dirac holds the promise of letting people create simple, self-contained, freely distributable players that either play stand-alone or can be easily plugged into browsers. Furthermore, the same is true for encoders, allowing people to create content more easily.
And, unlike MPEG encoders, which have lots of weird parameters and flags, Dirac looks like it is simple enough that making high-quality encodings does not require a Ph.D.
In fact I think a Java based media streaming applet might be a great solution, since Java has pretty good saturation (although *sigh* there is no entirely free software or open source Java implementation at this moment).
Well, even there, a simpler format can help: something like Dirac is probably a whole lot easier to re-implement in Java than something like MPEG4.