Will Compression Be Machine Learning's Killer App? (petewarden.com)
Pete Warden, an engineer and CTO of Jetpac, writes: When I talk to people about machine learning on phones and devices I often get asked "What's the killer application?". I have a lot of different answers, everything from voice interfaces to entirely new ways of using sensor data, but the one I'm most excited about in the near-term is compression. Despite being fairly well-known in the research community, this seems to surprise a lot of people, so I wanted to share some of my personal thoughts on why I see compression as so promising.
I was reminded of this whole area when I came across an OSDI paper on "Neural Adaptive Content-aware Internet Video Delivery". The summary is that by using neural networks they're able to improve a quality-of-experience metric by 43% if they keep the bandwidth the same, or alternatively reduce the bandwidth by 17% while preserving the perceived quality. There have also been other papers in a similar vein, such as this one on generative compression [PDF], or adaptive image compression. They all show impressive results, so why don't we hear more about compression as a machine learning application?
All of these approaches require comparatively large neural networks, and the amount of arithmetic needed scales with the number of pixels. This means large images or video with high frames-per-second can require more computing power than current phones and similar devices have available. Most CPUs can only practically handle tens of billions of arithmetic operations per second, and running ML compression on HD video could easily require ten times that. The good news is that there are hardware solutions, like the Edge TPU amongst others, that offer the promise of much more compute being available in the future. I'm hopeful that we'll be able to apply these resources to all sorts of compression problems, from video and image, to audio, and even more imaginative approaches.
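To put rough numbers on the "scales with the number of pixels" point, here is a back-of-the-envelope sketch. The resolution (1080p at 30 fps), the per-pixel cost of 5,000 operations, and the 30 billion ops/s CPU budget are all illustrative assumptions, not figures from the article or the cited papers; they were picked only to show how a roughly tenfold gap can arise.

```python
def ops_per_second(width: int, height: int, fps: int, ops_per_pixel: int) -> int:
    """Arithmetic operations per second for a per-pixel workload."""
    return width * height * fps * ops_per_pixel

# Assumed workload: 1080p video at 30 frames per second.
WIDTH, HEIGHT, FPS = 1920, 1080, 30

# Hypothetical network cost per pixel (an assumption for illustration).
OPS_PER_PIXEL = 5_000

# Hypothetical CPU budget standing in for "tens of billions" of ops/s.
CPU_BUDGET = 30_000_000_000

required = ops_per_second(WIDTH, HEIGHT, FPS, OPS_PER_PIXEL)
ratio = required / CPU_BUDGET

print(f"pixels per second: {WIDTH * HEIGHT * FPS:,}")
print(f"required ops/s:    {required:,}")
print(f"ratio to budget:   {ratio:.1f}x")
```

Doubling the resolution in each dimension or the frame rate scales the requirement linearly with the pixel count, which is why the same network that is comfortable on thumbnails can be out of reach for video.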
I thought the Pied Piper platform already uses ML to improve compression, right?
Violence is the last refuge of the incompetent. Polar Scope Align for iOS
Perhaps you should get a patent for this vague description of a mathematical expression. We have learned with the right Supreme Court justices, you can circumvent established case law.
When I talk to people about it, I get a "Shut up Nerd. Don't try to weasel out of the real issue. It is your turn to pay a round."
Don't fight for your country, if your country does not fight for you.
- Hey phone manufacturers, increase CPU performance 20x and then license our ML compression algorithm for a 17% space saving!
- Thanks, but we'll increase storage space by 17% instead.
- Oops, haven't thought of that. Have a nice day good sir.
A quick remark: this could theoretically work for content delivery but it will be unsuitable for video archival where picture fidelity and lossless transfer are paramount.
There’s your compression.
Compress itself into a singularity.
caption : compel
So the article claims. Where is it improving rapidly? I interact with the Google Assistant and Alexa on a regular basis, and they seem to be just as limited and non-discerning as they have always been. I still have to speak slowly to them, while articulating carefully. And it still is the case that it does not take much in the way of background sound to throw them out of kilter. Thus, where is voice recognition improving rapidly?
It should be obvious that neural nets are great at this sort of thing. That's how our brains record and recall events. They're not registering a stream of pixels or waveforms and zipping them up, they're registering chained concepts. Every time we remember, we piece these concepts back together, so yes, there is a lot of "imagination" filling in data in even our most detailed memories. But just like that can work for our brains, it can work for computers.
"What is the difference between a Ponzi Scheme and an Investment Bank?" -- Jon Stewart
"Through out"? Really? Well, at least you can spell "through".
Preserving perceived quality by which metrics? Comcast recently moved to downgrading the quality of the HD cable programming it provides. Some people see a significant problem with the downgraded video, especially during action video such as sport events. Yet Comcast says that ~the perceived quality is the same.~ So I ask again, how is "perceived quality" going to be measured? And by whom? By those who want to push out the new technology for monetary gain, or by those who are subject to the inferior results of the new technology?
Yeah I saw that 10 microseconds after hitting submit, typical me.
I think you are more than a little confused if you think the Supreme Court has anything to do with patents...
This is what partisanship does to your brains kids. Don't be partisan, learn how systems actually work, and offer thoughtful critiques instead of running around in a blind panic saying things that are outright wrong everywhere you go.
"There is more worth loving than we have strength to love." - Brian Jay Stanley
I saw a paper on audio stream compression recently that was doing good compression. The quality was not quite there yet.
Of course, he'll blame it on his Chinese copy of gramma, rly?
I, too, am looking forward to a better age of compression.
Current compression algorithms compress a fairly average way based on a fairly typical use-case for each file type.
This leads to situations where certain structures in some files straight-up don't compress well at all.
You can tell this when you chain multiple compression systems together: you can usually shave off a few more bytes. And that's using yet another average compressor on top of another.
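If you want to check the chaining claim on your own files, a minimal sketch using Python's standard-library compressors looks like this. The sample data and the choice of zlib followed by lzma are assumptions for illustration; whether the second stage actually saves anything depends heavily on the input, so measure rather than trust the rule of thumb.

```python
import lzma
import zlib

def chain_sizes(data: bytes) -> dict:
    """Compress with zlib, then compress that output again with lzma."""
    first = zlib.compress(data, 9)
    second = lzma.compress(first)
    return {
        "original": len(data),
        "zlib": len(first),
        "zlib+lzma": len(second),
    }

def roundtrip(data: bytes) -> bytes:
    """Undo the chain in reverse order to confirm nothing was lost."""
    chained = lzma.compress(zlib.compress(data, 9))
    return zlib.decompress(lzma.decompress(chained))

# Hypothetical sample input; swap in the bytes of a real file to test it.
sample = b"the quick brown fox jumps over the lazy dog " * 2000

for name, size in chain_sizes(sample).items():
    print(f"{name:>10}: {size} bytes")
```

Both stages here are lossless, so the round trip must return the input exactly. That is a different trade-off from the learned, lossy codecs the article is talking about.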
But as you say, we really need solid accelerators for hardware compression. It's been talked about for years but now we are finally getting to the kinda core-counts needed to make it at least half-way feasible at the business level.
Even ray-tracing is beginning to become practical in some scenes. In a few major generations of processor design, it'll be even more common.
Exciting times in computing hardware are ahead. Good industry to be in right now before it gains some real momentum. Specialized processors are making a comeback!
Compare the compressed and original picture. They have just put in another guy who happens to be in the DB and looks kinda similar.
Now imagine Comcast's reduced-plan contract: Netflix is permitted provided that every actor looks like Brad Pitt.
It should be obvious that neural nets are great at this sort of thing. That's how our brains record and recall events. They're not registering a stream of pixels or waveforms and zipping them up, they're registering chained concepts.
Video codec A: Han shot first!
Video codec B: Really? That's not how I remember it.
I don't care if it's 90,000 hectares. That lake was not my doing.
1. Caching.
Remember, all that matters is the bandwidth of the most constrained point you traverse. And that's typically going to be near the servers, as point-to-point connections are highly inefficient.
If you place frequently-accessed content downstream, much nearer the recipient, you eliminate the need to use any of the pipes along the constricted regions.
Experiments I did on this in the 1990s, when the problem was at its worst, showed that you could get a 60-fold improvement in quality of experience, well beyond anything compression can achieve.
2. Multicast
Most people are used to a few seconds' delay before streaming starts. If N people request the same content over a 5 second period, then delaying the first person by 5 seconds won't be perceived as abnormal. You're now transmitting one copy per path. This is an ideal way to populate the aforementioned caches; it would be useful for server-based content only if no caches exist.
3. Fractal compression
Technically, wavelet. Used in the BBC's Schrodinger codec. Produces far better compression than typical codecs.
4. Better pipes
Most of the rest of the world is already operating at bandwidths between 100-10,000x that common in the U.S., where cable companies have either prosecuted those offering higher speeds or driven heavy equipment through their cables. It is time to stop accepting this as the cost of doing business.
The U.S. should mandate 50 Gbps to the home, the highest speed available elsewhere on a large scale. If you're going to be the best, you have to be the best. The U.S. should not be an also-ran. A fat tree is impossible at those speeds. Realistically, I doubt you could get a block to handle more than 200 Gbps. Which is adequate.
Tier 1 is fine, just have a mesh network. Current SDM transmission rates are 111 Tbps, which means you can support 555 blocks of houses off a single seven-core cable. GMING can supply the info on how you'd actually get that to work on a metro level.
You're really not going to need to do a whole lot of compression at those speeds.
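On the multicast point (item 2 above), the batching idea can be sketched in a few lines. This is only an illustration of the grouping logic, assuming a window that opens at the first request and closes 5 seconds later; the function name and the sample timestamps are hypothetical, and a real deployment would also need actual multicast routing support.

```python
def batch_requests(request_times, window=5.0):
    """Group request timestamps (seconds) into batches.

    A batch opens at its first request; any request arriving less than
    `window` seconds after that opening time joins the same batch and
    shares a single transmission.
    """
    batches = []
    for t in sorted(request_times):
        if batches and t - batches[-1][0] < window:
            batches[-1].append(t)
        else:
            batches.append([t])
    return batches

# Hypothetical arrival times for one piece of content.
arrivals = [0, 1, 4.9, 5, 7, 20]
batches = batch_requests(arrivals)

print(f"unicast streams:   {len(arrivals)}")
print(f"batched streams:   {len(batches)}")
for batch in batches:
    print(f"  start at t={batch[0] + 5.0}: serves {len(batch)} viewer(s)")
```

The cost is that the first requester in each batch waits the full window, which is the delay the comment argues viewers already tolerate.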
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
fukrdts udntndmchnlrng
"Through out"? Really? Well, at least you can spell "through".
To be fair, "out" is also spelled correctly. ... but I sadly could not find the referenced quote.
Also sadly, as "AC" I will never see any follow-up post. As you were..
Let's differentiate classifier vs compressor.
There are two kinds of compressors: lossy and lossless.
Yup, they are the same thing. The difference is the seed key: is it publicly known or not? That's it. So will AI build back doors into itself to allow for recoverable keys? Who controls that feature?
Screenplay -> Neural Network -> Motion Picture
I mean, let's be honest, if you tell me to imagine "a white room and Brad Pitt is standing in the middle of it wearing a blue jacket", does it really matter that in my mind he looks like he did in Fight Club and in your mind he looks like he did in Meet Joe Black? Look at how much data we saved by sharing text instead of an image.
Meh, compression. AI, machine learning, network technologies, et al.; the heavy lift for machine learning will tackle friction points that consume hard money, real brain power and lay waste to precious resources - Debug
A reverse revolution against entrenched momentum, human language and bridgeway to complexities
Skynet is Machine Learning's killer app
Once you get above a Weisman score of 5 or 6 the visible effect on the application becomes negligible. Nothing to see here.
Most CPUs can only practically handle tens of billions of arithmetic operations per second,
Got to wonder how a computer scientist from 30 years ago would react to this casual statement. I'm going with something along the lines of, "Great Scott!"
You are in a twisty maze of processor lines, all alike.
There is a lot of hype here.
Strong AI will be ML's killer app. Perhaps both figuratively and literally.
I first read it as "compassion". Such disappointment at realising my mistake....
translation, OCR, Facial Recognition, speech-to-text, text-to-speech, personal assistants, chess/go/any-other game, predictive typing, energy management, ...