Check the article and the webpage; CAPTCHAs that work from a word list appear to be vulnerable to attacks that compute a confidence for each word in the wordlist -- this is how the current generation of CMU CAPTCHAs can be machine-solved 90%+ of the time -- and the new CAPTCHAs at captcha.net use random letters instead of a wordlist.
captcha.net also has "demonstration" image recognition CAPTCHAs, where the user must look at a picture of a cat and choose the radio button marked "cat". That would certainly require familiarity with English. On the other hand, it would require familiarity with English on the part of the CAPTCHA's *intended* audience as well:)
.. is that they can be brokered. If you give me a puzzle, *I* don't have to solve it; all I have to do is induce someone, somewhere, to solve it, and give me the answer. That means I can set up a CAPTCHA-solving factory in Taiwan, or field a porn site where users pay for their pictures in CAPTCHA answers. (*My* CAPTCHAs, the ones my script was assigned to answer in order to make Paypal transactions, not new ones I made up on the spot.)
Suppose that a human can solve your CAPTCHA in an average of five seconds. Suppose unskilled labor costs $6/hour. Then it costs a bit under a cent to find the solution to your CAPTCHA, assuming that I want to solve at least a few thousand a day. As a result it is impractical to protect a service worth more than a penny with a CAPTCHA.
Actually unskilled labor costs far less than $6/hour in some parts of the world, so if CAPTCHAs see wide use the value of the services they can protect is even lower. A tenth of a cent?
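The arithmetic behind those figures, as a quick sketch. Only the five seconds and the $6/hour come from the numbers above; the function name and the $0.60/hour wage are made-up illustrations.

```python
def cost_per_solve(seconds_per_solve: float, hourly_wage: float) -> float:
    """Labor cost, in dollars, of having a human solve one CAPTCHA."""
    return hourly_wage * seconds_per_solve / 3600.0

# Figures from the text: five seconds per solve at $6/hour.
print(f"${cost_per_solve(5, 6.00):.4f} per solve")

# Hypothetical cheaper labor at a tenth of that wage (an assumed figure).
print(f"${cost_per_solve(5, 0.60):.4f} per solve")
```

At $6/hour this comes out a bit under a cent, and at a tenth of the wage it comes out a bit under a tenth of a cent.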
CAPTCHAs should be seen as a proof-of-work mechanism, like "hash cash", not as an oracle that can determine whether a transaction was initiated by a human or a machine. Unlike proof-of-work schemes that burn CPU time, the value of a CAPTCHA won't be inevitably halved every 18 months by Moore's law; on the other hand, it could be suddenly reduced to zero by breakthroughs in image processing.
Encrypt the password n times, with n different keys, each of which can be brute-forced in an expected time t, say some fraction of a second. By making n sufficiently large, the chance that the total time required to brute-force all n keys and recover the password will differ by more than d from the expected time t*n can be made arbitrarily small?
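A rough simulation of that idea, under one loud assumption: each key's crack time is independent and uniform on [0, 2t], so its mean is t. One caveat the simulation makes visible: it's the spread *relative to t*n* that shrinks (roughly like 1/sqrt(n)). For a fixed per-key t, the absolute spread of the total still grows with n, so d has to be read as a fraction of the total, or t has to shrink as n grows.

```python
import random
import statistics

def total_crack_time(n: int, t: float, rng: random.Random) -> float:
    """Total time to brute-force n independent keys.

    Assumption: each key takes a time uniform on [0, 2t] (mean t), as when
    the right key is equally likely to be anywhere in the keyspace.
    """
    return sum(rng.uniform(0.0, 2.0 * t) for _ in range(n))

def relative_spread(n: int, t: float = 0.1, trials: int = 500, seed: int = 1) -> float:
    """Standard deviation of the total time, as a fraction of t*n."""
    rng = random.Random(seed)
    totals = [total_crack_time(n, t, rng) for _ in range(trials)]
    return statistics.pstdev(totals) / (t * n)

for n in (1, 10, 100, 1000):
    print(n, round(relative_spread(n), 3))
```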
> It's actually quite likely that Vorbis infringes on several of Fraunhoffer/Thompson's patents
Actually, no. Monty, Xiphophorus, and iCast did their homework, and also hired some clueful technology IP lawyers to look things over, some of whom I've met -- these guys are sharp, and they grok psychoacoustics.
Have you read the actual patents? If you work out exactly what is being claimed, FhG doesn't actually own the farm, as I've heard it told.
Yeah, I was skeptical too, but I've been convinced.
Note, also, this: FhG has never actually claimed any infringement by Vorbis. FhG's lawyers could certainly read the source if they wanted to know.
1) Go to www.cogentco.com.
2) Note the "100 Mbps for $1000/Month" banner on the right side.
3) Click on Wholesalers next to "I Want Cogent Access" on the menu
4) Watch the banner change to "100 Mbps for $3000/Month"
You expect people to drop their prices.. but cut them to a third?
Simple -- fingerprint the audio *after* it comes out of the mp3 player. But yeah, you can always make your own private format, and just encrypt the data. That's fine for sharing stuff with your friends, but doesn't help too much if you want to put the songs on Napster.
As for time compression, yeah, that's one of the distortions you have to make it robust to, one way or another, just like volume change and mp3 compression. I have some strategies in mind for this but haven't run a lot of these kinds of tests.
> the info might be stored modulated on a 50 hertz signal that we can't hear
No. It's pattern matching, not steganography. Tuneprint doesn't change the audio in any way. Rather, you essentially send your mp3 to a tuneprint server and ask the server 'what do you think this sounds like?' and the server says 'oh I know, it's kruder & dorfmeister remixing bomb the bass's bug powder dust'. of course you don't send the whole track, just the 'fingerprint' that uniquely identifies it, but you get the idea.
That means you don't have to modify tracks beforehand. That means you can use it on all the stuff on napster right this instant. And that also means that there's no watermark that a sufficiently clever attacker can strip. (Instead, an attacker would want to subtly change the audio so that the fingerprint is fooled but quality isn't degraded. Psychoacoustics gives us lots of tools to try to stop people from doing this.)
okay, I just want to point something out really quick: if you take the cryptographic hash of a mp3, then you can fingerprint every unique mp3 (not song) in the world only once and keep it in a database. don't have to recalc it each time. you can take the hash of any mp3 you find and know that you have the same mp3 trusted-authority had when they fingerprinted it.
ah, you say, but the clients will just lie about the hash of the mp3 files they're serving! well, I was thinking about that, and I think I can see a really simple way to design a 'challenge hash' algorithm. the server asks for a random 1k block from the file, and the client has to send that block and send proof that that data, combined with the rest of the data in the file, could possibly hash together to give the hash the client sent originally. the client can only do this if it's true. now, all you have to do is stop the client from saying one thing to the server and something else to everybody else. presumably you do this by making the protocol to randomly check up on the mp3's you're serving the same as the protocol to download one of the mp3's you're serving.
these are just random schemes; i haven't tried them or really even thought them through. maybe if I have time someday:)
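For what it's worth, one well-known construction with the 'send a block plus proof that it fits the hash' property is a Merkle (hash) tree. That's a swapped-in technique, not necessarily what the scheme above had in mind, and the file-level hash becomes the tree root rather than a plain hash of the whole file. A minimal sketch; SHA-256 and all the function names are assumptions:

```python
import hashlib

BLOCK = 1024  # the "1k block" the server asks for

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_levels(data: bytes) -> list[list[bytes]]:
    """Hash tree over 1k blocks: leaf hashes first, single root hash last."""
    blocks = [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)] or [b""]
    level = [h(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:                 # odd count: duplicate the last node
            level = level + [level[-1]]
            levels[-1] = level
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def prove(levels: list[list[bytes]], index: int) -> list[bytes]:
    """Client side: sibling hashes on the path from block `index` to the root."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])
        index //= 2
    return proof

def verify(root: bytes, block: bytes, index: int, proof: list[bytes]) -> bool:
    """Server side: does this block, at this position, hash up to the root?"""
    node = h(block)
    for sibling in proof:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root
```

The client announces the root once; each challenge then costs one block plus about log2(number of blocks) hashes, and a client that doesn't actually hold the matching data can't produce a block that verifies.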
That was the first thing we tried: have a set of 'classifiers', each of which makes a yes-no decision about the spectrum it sees over a short time interval. Hash together the results of all of the classifiers, and poof. The classifiers were built by automatically analyzing a lot of music to find criteria that were stable but widely distributed.
The problem is wobble (aka fencepost error). What happens if the original, undistorted version of the track classifies as a one with a given classifier, but is really close to flipping over to a zero given just a little push? You lose, that's what, and have to fingerprint both possibilities and put them into the database. And usually there are many different opportunities for wobble if you're checking enough different criteria to build a useful fingerprint. So the next thing to try was only using the foo most 'confident' (unwobbly) classifiers in the hash. But then of course selecting those is wobbly. So you 'debounce' it by having different 'enter set of current classifiers' and 'leave set of current classifiers' criteria. Still too wobbly, and now the code is a complete mess.
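A toy illustration of why wobble hurts. This is not the real classifiers: it assumes each one is just 'is this feature above a threshold', and every name and number here is made up. Each classifier that sits within the distortion margin of its threshold doubles the number of fingerprints you'd have to store.

```python
from itertools import product

def classify(features, thresholds):
    """One yes/no bit per classifier: is the feature above its threshold?"""
    return tuple(int(f > t) for f, t in zip(features, thresholds))

def possible_fingerprints(features, thresholds, margin):
    """Every bit pattern reachable if each feature can shift by up to `margin`."""
    options = []
    for f, t in zip(features, thresholds):
        if abs(f - t) <= margin:
            options.append((0, 1))          # wobbly: could land either way
        else:
            options.append((int(f > t),))   # stable under this much distortion
    return list(product(*options))

features = [0.90, 0.51, 0.10, 0.48]   # hypothetical feature values
thresholds = [0.50] * 4
print(classify(features, thresholds))
print(len(possible_fingerprints(features, thresholds, margin=0.05)))
```

With k wobbly classifiers out of the set, that's 2**k database entries for one short stretch of audio.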
So everything was redesigned to use a different, far simpler, and more elegant approach. The downside is that we now assume the server is a little intelligent and knows how to do a bit of fuzzy matching. See rant posted elsewhere.
To put a song into the database, you break the song into itty bitty pieces, fingerprint each piece, and put each little fingerprint -- labeled with the track it came from and the offset into the track -- into the database. So what you're getting back from the server is a series of messages like "I think you're at offset foo into track bar." The server will be wrong sometimes, maybe even a lot of the time, because of samples that we haven't correctly dealt with, silence, the inherent loopiness of music, distortions that tuneprint isn't sufficiently robust to, gnomes, etc, but only for the correct matching track will the "offset foo, offset foo+n, offset foo+2n" series make sense.
Anyway, so you can identify song boundaries either by the length of the track in the database based on your current offset, or by just waiting until the track names you're getting from fingerprinting the little bits change (and you pick up the "offset foo, offset foo+n,..." pattern again.)
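One simple way to pick out the "offset foo, offset foo+n, offset foo+2n" run from noisy answers is to let every answer vote for the track start it implies. This is only a sketch of the idea, not Tuneprint's actual client logic; the function name, the fixed `step` between pieces, and the sample data are all assumptions.

```python
from collections import Counter

def best_alignment(submatches, step):
    """Find the (track, start offset) that the most submatches agree on.

    submatches: one (track, offset) guess per consecutive piece of audio.
    step: distance between consecutive pieces (assumed constant here).
    """
    votes = Counter()
    for i, (track, offset) in enumerate(submatches):
        # A correct guess for piece i implies the audio began at offset - i*step.
        votes[(track, offset - i * step)] += 1
    (track, start), count = votes.most_common(1)[0]
    return track, start, count

# Made-up server answers: three consistent guesses and one unlucky match.
guesses = [("bar", 4096), ("bar", 5120), ("baz", 777), ("bar", 7168)]
print(best_alignment(guesses, step=1024))
```

Wrong answers scatter their votes across unrelated starting points, so only the real track piles up a count. A song boundary shows up when the winning (track, start) pair changes.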
Hiya. My name is Geoff and Tuneprint is my baby which some excellent and astonishing friends at MIT are helping me deliver.
I'd already been up all night when the story was posted at 7am. I'm going to try to stumble my way through a few points, get some breakfast, and try to answer people's questions as soon as I can get to it.
First of all, this is not a hoax. Wow, hair triggers:) Yeah, I was sleep deprived whilst writing most of the website. Yeah, the barcode in the logo is '31337 24816'.. get it.. eleet powers of two. eleet two-to-the-n's. eleet two-n's. eleet tunes. yeah. well. you had to be there. and jamie's to blame for the 24816 pun:) Don't hold it against us that we're not suits.
The general idea is pretty simple. We take the input audio. We condition it (adjust it to a known sampling rate and volume.) We pass it through the psychoacoustic model (it's about a notch more complicated than what you'd see in a mp3 encoder, which ain't saying much. This is all stuff that was mostly hashed out decades ago.) This model effectively strips the parts of the sound you can't hear -- the desired result being that even if the audio has been compressed or manipulated subaudibly, the result is still the same. Okay, so the net result of all of this is a vector that covers a very small segment (fraction of a second) of audio. We stack several of these vectors (possibly separated in time by a bit) side-by-side to get a big vector. Then we do completely boring and standard and well-understood statistical and pattern-matching stuff on the vector to make it smaller and more palatable for the server -- think of it as lossy compression. Then it goes off to the server. The server is about equal in complexity to a text search engine. (I say this fully realizing that I have only a vague impression how Google works. It's certainly a lot more complicated than the obvious hash-table-of-sorted-lists stuff.) It finds the database vector that's the best match in a fairly boring but efficient way. (No, it does not involve searching through all tracks one by one, no more than Altavista searches through all web pages one by one every time you want to find some porn.) Call the result a submatch. Back at the client, the whole process is repeated a bunch more times, generating a stream of submatches ("Radiohead offset 0.. Radiohead offset 1024 or 16384.. Slashdot's Gr34test Hits 5262324.. Radiohead offset 3072..") from the input audio stream. Then, the client looks at the submatches and tries to figure out what the input audio was and where the song boundaries are (did somebody really stick in a sample from Slashdot's Gr34test Hits, or was that just an unlucky match?)
See? Not magic. It's a challenging problem, but not an impossible problem. The reason that this doesn't exist right now is not that generations of scientists have tried and failed, but rather that people didn't care too much until lately and nobody's gotten off their ass and done anything about it yet. I like big but approachable problems, which is one of the reasons I'm excited about this.
FOR ALL OF YOU WHO FELL ASLEEP THROUGH THAT: YOU CANNOT ADD AN INAUDIBLE TONE TO THE MUSIC AND BREAK TUNEPRINT. THE FINGERPRINT IS BASED ON THE LARGE-SCALE PSYCHOACOUSTIC FEATURES OF THE MUSIC. IF MP3 ENCODERS CAN DO IT, SO CAN WE. Maybe not perfectly, but enough to have a fighting chance. THAT'S THE WHOLE POINT HERE.
jen is telling me to go to breakfast but I want to say one more thing, which is that y'all should also pay attention to the second of our two goals as listed in the FAQ, which is to get this tech and access to a nice, well-maintained central database out into the hands of everybody, commercial and open source, major label and independent, so that people can go do lots of cool stuff with it. I don't want this to end up controlled by a single organization that permits its use only in ways that further its private agenda.
Hint: I know that there are sekrit batcave startups that are working on the same thing, because we're starting to bump into them.
Oh yeah. Also like I say in the FAQ, it's not done. No promises. I like the current algorithm; it reflects the wisdom of throwing several other stabs away in disgust. I like the very limited performance data we have. I like the mathematical theory. We haven't scaled it very far yet, though, and it may all come toppling down. In which case we'll pick up the pieces and try again. But I'm confident we'll pull off something cool, because, well, 70% of what we want to do isn't that hard. The other 30% is a bitch and will require cleverness, work, and chutzpah, but even the 70% is going to be a damn useful tool. And this project has started to catch the eyes of some pretty f*cking brilliant technical people, in my opinion, so I think we're all over that 30%.
breakfast now. more later:)
geoff
PS: if you've emailed me in the past few days, and I haven't gotten back to you, I'm sorry -- things are pretty hectic around here. I really hope to burn through the backlog this afternoon before I get to the slashdot stuff. thanks:)
Ever wonder if you get a nice warning email before you show up on slashdot? the answer would be 'no':p
okay, first of all, arithmetic coding is an entropy coder like huffman coding, not a replacement for a front-end probability model (which is basically what LZ and friends boil down to.)
it's true that AC is closer to optimal than huffman coding. but by optimal we're talking about a particular kind of information theoretical optimal, having to do with generating output bits with maximal entropy, that is completely different from "generates the smallest files" optimal.
why people still use huffman coding instead of arithmetic coding in almost all applications:
(1) AC is slower and more complicated than Huffman
(2) the patent issues suck hard (eg JPEG can use AC instead of Huffman as a backend, but nobody bothers because of licensing)
(3) the gain is small enough that it's usually not worth the effort
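To put a number on the gap, here's a sketch that compares the average Huffman code length against the entropy bound, which is what an arithmetic coder approaches for the same probability model. The two distributions are made-up examples, not measurements from any real format.

```python
import heapq
import math

def entropy_bits(probs):
    """Shannon entropy in bits per symbol: the floor for any entropy coder."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def huffman_lengths(probs):
    """Code length in bits for each symbol under a Huffman code."""
    # Heap entries: (probability, tie-breaker, symbols in this subtree)
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    tie = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1   # every symbol in the merged subtree sits one bit deeper
        heapq.heappush(heap, (p1 + p2, tie, s1 + s2))
        tie += 1
    return lengths

def huffman_average(probs):
    """Expected bits per symbol when coding with the Huffman code."""
    return sum(p * n for p, n in zip(probs, huffman_lengths(probs)))

for probs in ([0.5, 0.25, 0.125, 0.125], [0.9, 0.1]):
    print(probs, round(entropy_bits(probs), 3), round(huffman_average(probs), 3))
```

When the probabilities are powers of 1/2, Huffman hits the bound exactly. The gap only gets large when one symbol dominates, because Huffman can't spend less than one whole bit on a symbol.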
the idea that people should start with AC or Huffman coding and then build a statistical model is profoundly silly. while it's true that a statistical model *exists* that will give you theoretically optimal compression, determining what it is, let alone computing it, is far, far more trouble than it's worth in even mildly complicated cases. instead, you put the data through an invertible transform first, to 'decorrelate' it and simplify the statistical model into something you can deal with. it's much easier to put the transform before the entropy coding instead of inside the statistical model.
this is how all modern compression algorithms that I can think of work (transforms in parentheses): gif, pkzip, gzip, and friends (lempel-ziv), bzip (burrows-wheeler), jpeg (2d discrete cosine), mp3 (filterbanks and scalefactor tricks), aac (filterbanks, modified discrete cosine, prediction, and other tricks),...
of course there's extra magic that happens inside lossy formats.
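A tiny demonstration of the transform-first idea. Delta coding stands in for the fancier transforms listed above (it isn't one of them, just the simplest invertible decorrelating transform I can think of), and the ramp signal is made up. The zero-order entropy is what a simple per-symbol model feeding Huffman or AC would see.

```python
import math
from collections import Counter

def zero_order_entropy(symbols):
    """Bits per symbol if each symbol is coded independently of its neighbors."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def delta_encode(values):
    """Invertible transform: replace each byte with its difference from the previous one."""
    prev, out = 0, []
    for v in values:
        out.append((v - prev) % 256)
        prev = v
    return out

def delta_decode(deltas):
    """Inverse of delta_encode."""
    prev, out = 0, []
    for d in deltas:
        prev = (prev + d) % 256
        out.append(prev)
    return out

ramp = [i % 256 for i in range(1024)]   # highly correlated toy signal
deltas = delta_encode(ramp)
assert delta_decode(deltas) == ramp     # nothing was lost
print(zero_order_entropy(ramp), zero_order_entropy(deltas))
```

The raw ramp looks like pure noise to a per-symbol model (every byte value is equally common), while after the transform nearly every symbol is the same and the same dumb model compresses it to almost nothing.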
#i-opener-linux experimentation reveals the interface to be a standard serial port plus four extra lines, two for incoming phone line and two for outgoing phone line.
so, you've got a COM1 port to play with. no bus.
sorry:(
it has a USB port though. lots of stuff can fit in a usb port. like a $40 USB to ethernet adaptor. also available in wireless.
it's not only vegetarian, it's vegan. as in, no animal products. thus non-dairy cheeses and gluten (in its seasoned form sold as "fake meat.")
the funny thing is.. why does scott adams avoid mentioning this? it's nowhere on the website or in any of the press releases. it's like he's trying to sneak veganity past the unwashed masses. I suppose that's what the "nobody knows how to eat healthily" and "make the world a better place" doublespeak is about.
the other funny thing: my vegan friends tell me that caseinate (one of the main ingredients in the "non-dairy cheese") is milk-derived and not vegan-safe. maybe this is some kind of synthetic casein? maybe he's too vegan for real cheese, but not too vegan for artificial cheese with milk protein?
as far as people pointing out that it's not really a "complete day's nutrition," it's worth noting that the only things they don't have 100% US RDA of are the things you normally get much too much of. it is *just hard* to live in america and consume less than 100% of your recommended fat, protein, sodium etc intakes -- this is called dieting, and it's not something coders are known for. you wouldn't eat just a burrito in a whole day -- you'd grab some chips and jolt or something. one of these dilberitos plus a serving or two of unhealthy junk food will give you a great approximation of the rda's.
TOC (the protocol that TiK uses, the protocol that AOL released specs for) is a pidgin, text version of the binary OSCAR protocol that is used by the "mainstream" AIM client. (AOL has a gateway box somewhere that speaks TOC on one side and OSCAR on the other. All this effort is because OSCAR is a much more powerful protocol and you could do lots of interesting stuff that AOL wasn't quite intending yet if you had the specs to it.)
Also, OSCAR uses a weird, AOL-internal application framework similar to what's used in the AOL client. (In the beta version, you can actually pop up a console window and avail yourself of AIM's command language, complete with aliases and other random "features.") It's based on dynamically-loaded modules that talk to each other according to a special message-passing architecture.
This would lead me to think that any code contributed to TiK would be of limited utility to the main AIM developers. No temptation to violate the GPL.
[The information in this post comes from, umm, staring at the AIM login screen really hard and meditating. Yeah, that's the ticket.]
Re:How to accept credit cards ... on R.I.P. Linuxbox · Score: 1
There are lots of places on the web (of widely varying reputations) where you can sign up and get online approval for a merchant account, often with better rates than you'd get at your brick-and-mortar bank. They direct-deposit (ACH) the money into whatever account you specify. Expect to pay from around 2% (if the plastic card and signed receipt passes through your hands) to 3-5% (if you take the order over the phone or net), plus a couple dimes a transaction, subject to a $10-$15 monthly minimum or so.
That'll get you a Merchant ID number that targets your bank account. Then you need software/hardware. You can buy credit card terminals that swipe a card and dial out over a phone line for a few hundred bucksish, or look around for something used. Or you can get Windoze software and a modem to do it. For a server solution (that you can develop against), the name brand is ICVerify, and it's quite expensive.
Or check out www.hks.net -- they sell Unix (incl linux) flavored software with full APIs (and lots of different bindings.)
Re:There are deadbeats in every profession on R.I.P. Linuxbox · Score: 1
Wow. Interesting stuff. If you ever put any of your contracts up on the web I'd be very interested to see them..
Under what pretext do you sue somebody when the fine is written into the contract? (Or is the whole point of your story that you don't?)
The load average is the average number of processes *eligible to run* -- the number of processes fighting for the CPU, as opposed to, say, waiting for hard drives to spin or data to arrive from the network. (Linux is a partial exception: it also counts processes in uninterruptible sleep, which usually means blocked on disk I/O.) So, for example, starting the d.net client immediately increases your load average by 1.0, because d.net is always available to suck up any free CPU cycles.
Run top and you can see the actual percentage CPU time use.
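If you'd rather read the numbers from a script, here's a small sketch. The sample line follows the /proc/loadavg layout on Linux (three averages, then running/total processes, then the last PID); the helper names are made up for illustration.

```python
import os

def parse_loadavg(line: str) -> tuple[float, float, float]:
    """Pull the 1-, 5-, and 15-minute load averages out of a
    /proc/loadavg-style line such as '1.05 0.70 0.52 2/180 4321'."""
    one, five, fifteen = (float(field) for field in line.split()[:3])
    return one, five, fifteen

def load_per_cpu(load: float, cpus: int) -> float:
    """Runnable processes per CPU; above 1.0 means processes are queueing."""
    return load / cpus

try:
    one, five, fifteen = os.getloadavg()   # available on Unix-like systems
    print(f"1-min load {one:.2f} across {os.cpu_count()} CPU(s)")
except (AttributeError, OSError):
    print("load average not available on this platform")
```

Note that these are counts of processes, not percentages: a load of 1.0 on a one-CPU box with an always-hungry background client tells you nothing about who is actually getting the cycles.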
Re:What an amazingly bad idea on Beaming Money · Score: 1
Yeah. Instead, money would electro-magically disappear into the ether. If you never sync your PDA (or reinstall/wipe the memory/whatever periodically), you can transfer all you want and never pay anything..
Umm, dude, read the site.
Simple -- fingerprint the audio *after* it comes out of the mp3 player. But yeah, you can always make your own private format, and just encrypt the data. That's fine for sharing stuff with your friends, but doesn't help too much if you want to put the songs on Napster.
As for time compression, yeah, that's one of the distortions you have to make it robust to, one way or another, just like volume change and mp3 compression. I have some strategies in mind for this but haven't run a lot of these kinds of tests.
> the info might be stored modulated on a 50 hertz signal that we can't hear
No. It's pattern matching, not stenography. Tuneprint doesn't change the audio in any way. Rather, you essentially send your mp3 to a tuneprint server and ask the server 'what do you think this sounds like?' and the server says 'oh I know, it's kruder & dorfmeister remixing bomb the bass's bug powder dust'. of course you don't send the whole track, just the 'fingerprint' that uniquely identifies it, but you get the idea.
That means you don't have to modify tracks beforehand. That means you can use it on all the stuff on napster right this instant. And that also means that there's no watermark that a sufficiently clever attacker can strip. (Instead, an attacker would want to subtly change the audio so that the fingerprint is fooled but quality isn't degraded. Psychoacoustics gives us lots of tools to try to stop people from doing this.)
okay, I just want to point something out really quick: if you take the cryptographic hash of a mp3, then you can fingerprint every unique mp3 (not song) in the world only once and keep it in a database. don't have to recalc it each time. you can take the hash of any mp3 you find and know that you have the same mp3 trusted-authority had when they fingerprinted it.
ah, you say, but the clients will just lie about the hash of the mp3 files they're serving! well, I was thinking about that, and I think I can see a really simple way to design a 'challenge hash' algorithm. the server asks for a random 1k block from the file, and the client has to send that block and send proof that that data, combined with the rest of the data in the file, could possibly hash together to give the hash the client sent originally. the client can only do this if it's true. now, all you have to do is stop the client from saying one thing to the server and something else to everybody else. presumably you do this by making the protocol to randomly check up on the mp3's you're serving the same as the protocol to download one of the mp3's you're serving.
these are just random schemes; i haven't tried them or really even thought them through. maybe if I have time someday :)
That was the first thing we tried: have a set of 'classifiers', each of which makes a yes-no decision about the spectrum it sees over a short time interval. Hash together the results of all of the classifiers, and poof. The classifiers were built by automatically analyzing a lot of music to find critera that were stable but widely distributed.
The problem is wobble (aka fencepost error). What happens if the original, undistorted version of the track classifies as a one with a given classifier, but is really close to flipping over to a zero given just a little push? You lose, that's what, and have to fingerprint both possibilities and put them into the database. And usually there are many different opportunities for wobble if you're checking enough different criteria to build a useful fingerprint. So the next thing to try was only using the foo most 'confident' (unwobbly) classifiers in the hash. But then of course selecting those is wobbly. So you 'debounce' it by having different 'enter set of current classifiers' and 'leave set of current classifiers' criteria. Still too wobbly, and now the code is a complete mess.
So everything was redesigned to use a different and far simpler and elegant approach. The downside is that we now assume the server is a little intelligent and knows how to do a bit of fuzzymatching. See rant posted elsewhere.
To put a song into the database, you break the song into itty bitty pieces, fingerprint each piece, and put each little fingerprint -- labeled with the track it came from and the offset into the track -- into the database. So what you're getting back from the server is a series of messages like "I think you're at offset foo into track bar." The server will be wrong sometimes, maybe even a lot of the time, because of samples that we haven't correctly dealt with, silence, the inherent loopiness of music, distortions that tuneprint isn't sufficiently robust to, gnomes, etc, but only for the correct matching track will the "offset foo, offset foo+n, offset foo+2n" series make sense.
Anyway, so you can identify song boundaries either by the length of the track in the database based on your current offset, or by just waiting until the track names you're getting from fingerprinting the little bits change (and you pick up the "offset foo, offset foo+n, ..." pattern again.)
Hiya. My name is Geoff and Tuneprint is my baby which some excellent and astonishing friends at MIT are helping me deliver.
I'd already been up all night when the story was posted at 7am. I'm going to try to stumble my way through a few points, get some breakfast, and try to answer people's questions as soon as I can get to it.
First of all, this is not a hoax. Wow, hair triggers :) Yeah, I was sleep deprived whilst writing most of the website. Yeah, the barcode in the logo is '31337 24816'.. get it.. eleet powers of two. eleet two-to-the-n's. eleet two-n's. eleet tunes. yeah. well. you had to be there. and jamie's to blame for the 24816 pun :) Don't hold it against us that we're not suits.
The general idea is pretty simple. We take the input audio. We condition it (adjust it to a known sampling rate and volume.) We pass it through the psychoacoustic model (it's about a notch more complicated than what you'd see in a mp3 encoder, which ain't saying much. This is all stuff that was mostly hashed out decades ago.) This model effectively strips the parts of the sound you can't hear -- the desired result being that even if the audio has been compressed or manipulated subaudibly, the result is still the same. Okay, so the net result of all of this is a vector that covers a very small segment (fraction of a second) of audio. We stack several of these vectors (possibly separated in time by a bit) side-by-side to get a big vector. Then we do completely boring and standard and well-understood statistical and pattern-matching stuff on the vector to make it smaller and more palatable for the server -- think of it as lossy compression. Then it goes off to the server. The server is about equal in complexity to a text search engine. (I say this fully realizing that I have only a vague impression how Google works. It's certainly a lot more complicated than the obvious hash-table-of-sorted-lists stuff.) It finds the database vector that's the best match in a fairly boring but efficient way. (No, it does not involve searching through all tracks one by one, no more than Altavista searches through all web pages one by one every time you want to find some porn.) Call the result a submatch. Back at the client, the whole process is repeated a bunch more times, generating a stream of submatches ("Radiohead offset 0.. Radiohead offset 1024 or 16384.. Slashdot's Gr34test Hits 5262324.. Radiohead offset 3072..") from the input audio stream. Then, the client looks at the submatches and tries to figure out what the input audio was and where the song boundaries are (did somebody really stick in a sample from Slashdot's Gr34test Hits, or was that just an unlucky match?)
See? Not magic. It's a challenging problem, but not an impossible problem. The reason that this doesn't exist right now is not that generations of scientists have tried and failed, but rather that people didn't care too much until lately and nobody's gotten off their ass and done anything about it yet. I like big but approachable problems, which is one of the reasons I'm excited about this.
FOR ALL OF YOU WHO FELL ASLEEP THROUGH THAT: YOU CANNOT ADD AN INAUDIBLE TONE TO THE MUSIC AND BREAK TUNEPRINT. THE FINGERPRINT IS BASED ON THE LARGE-SCALE PSYCHOACOUSTIC FEATURES OF THE MUSIC. IF MP3 ENCODERS CAN DO IT, SO CAN WE. Maybe not perfectly, but enough to have a fighting chance. THAT'S THE WHOLE POINT HERE.
jen is telling me to go to breakfast but I want to say one more thing, which is that y'all should also pay attention to the second of our two goals as listed in the FAQ, which is to get this tech and access to a nice, well-maintained central database out into the hands of everybody, commercial and open source, major label and independent, so that people can go do lots of cool stuff with it. I don't want this to end up controlled by a single organization that permits its use only in ways that further its private agenda.
Hint: I know that there are sekrit batcave startups that are working on the same thing, because we're starting to bump into them.
Oh yeah. Also like I say in the FAQ, it's not done. No promises. I like the current algorithm; it reflects the wisdom of throwing several other stabs away in disgust. I like the very limited performance data we have. I like the mathematical theory. We haven't scaled it very far yet, though, and it may all come toppling down. In which case we'll pick up the pieces and try again. But I'm confident we'll pull off something cool, because, well, 70% of what we want to do isn't that hard. The other 30% is a bitch and will require cleverness, work, and chutzpah, but even the 70% is going to be a damn useful tool. And this project has started to catch the eyes of some pretty f*cking brilliant technical people, in my opinion, so I think we're all over that 30%.
breakfast now. more later :)
geoff
PS: if you've emailed me in the past few days, and I haven't gotten back to you, I'm sorry -- things are pretty hectic around here. I really hope to burn through the backlog this afternoon before I get to the slashdot stuff. thanks :)
Ever wonder if you get a nice warning email before you show up on slashdot? the answer would be 'no' :p
okay, first of all, arithmetic coding is an entropy coder like huffman coding, not a replacement for a front-end probability model (which is basically what LZ and friends boil down to.)
...
it's true that AC is closer to optimal than huffman coding. but by optimal we're talking about a particular kind of information theoretical optimal, having to do with generating output bits with maximal entropy, that is completely different from "generates the smallest files" optimal.
why people still use huffman coding instead of arithmetic coding in almost all applications:
(1) AC is slower and more complicated than Huffman
(2) the patent issues suck hard (eg JPEG can use AC instead of Huffman as a backend, but nobody bothers because of licensing)
(3) the gain is small enough that it's usually not worth the effort
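to put a rough number on point (3), here's a small sketch that builds Huffman code lengths for a made-up four-symbol distribution and compares the average code length to the Shannon entropy, which is the floor an ideal arithmetic coder approaches. the distribution and function names are assumptions picked for illustration, not data from any real format.

```python
import heapq
import math

def huffman_lengths(probs):
    """Return {symbol: code length in bits} for a Huffman code over probs."""
    # Heap entries: (probability, tiebreak, {symbol: depth so far}).
    heap = [(p, i, {s: 0}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        p1, _, d1 = heapq.heappop(heap)
        p2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (p1 + p2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

probs = {"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}  # hypothetical distribution
lengths = huffman_lengths(probs)
huffman_bits = sum(probs[s] * lengths[s] for s in probs)
entropy_bits = -sum(p * math.log2(p) for p in probs.values())
gap = huffman_bits - entropy_bits  # the most an ideal entropy coder could save per symbol
```

for this distribution the gap works out to a few hundredths of a bit per symbol. it gets much bigger when one symbol is overwhelmingly likely, since huffman can't spend less than one whole bit on a symbol.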
the idea that people should start with AC or Huffman coding and then build a statistical model is profoundly silly. while it's true that a statistical model *exists* that will give you theoretically optimal compression, determining what it is, let alone computing it, is far, far more trouble than it's worth in even mildly complicated cases. instead, you put the data through an invertible transform first, to 'decorrelate' it and simplify the statistical model into something you can deal with. it's much easier to put the transform before the entropy coding instead of inside the statistical model.
this is how all modern compression algorithms that I can think of work (transforms in parentheses): gif, pkzip, gzip, and friends (lempel-ziv), bzip (burrows-wheeler), jpeg (2d discrete cosine), mp3 (filterbanks and scalefactor tricks), aac (filterbanks, modified discrete cosine, prediction, and other tricks).
of course there's extra magic that happens inside lossy formats.
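a toy illustration of the transform-then-entropy-code idea, as a sketch: delta coding stands in here for the real transforms listed above (it's not what any of those formats actually use), and the order-0 entropy stands in for what a simple Huffman or arithmetic coder with a per-symbol model would see. the function names and test data are made up for the example.

```python
import math
from collections import Counter

def order0_entropy(symbols):
    """Bits per symbol under a model that only knows symbol frequencies."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def delta_encode(data):
    """Invertible transform: replace each byte with its difference from the previous one."""
    out, prev = [], 0
    for x in data:
        out.append((x - prev) % 256)
        prev = x
    return out

def delta_decode(deltas):
    """Inverse of delta_encode."""
    out, prev = [], 0
    for d in deltas:
        prev = (prev + d) % 256
        out.append(prev)
    return out

ramp = [i % 256 for i in range(1024)]    # highly correlated, but every byte value is equally common
deltas = delta_encode(ramp)

before = order0_entropy(ramp)            # looks incompressible to a per-symbol model
after = order0_entropy(deltas)           # nearly all symbols are now the same value
```

the raw ramp costs a full 8 bits per symbol under a frequency-only model, while the transformed data costs almost nothing, and `delta_decode` gets the original back exactly. the entropy coder didn't get any smarter; the transform just handed it data whose statistics are easy to model.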
damn, i'm offtopic. must be late.
this is the funniest thing I've seen on /. in a long time. subtle and brilliantly written.
#i-opener-linux experimentation reveals the interface to be a standard serial port plus four extra lines, two for incoming phone line and two for outgoing phone line. :(
so, you've got a COM1 port to play with. no bus. sorry
it has a USB port though. lots of stuff can fit in a usb port. like a $40 USB to ethernet adaptor. also available in wireless.
the funny thing is.. why does scott adams avoid mentioning this? it's nowhere on the website or in any of the press releases. it's like he's trying to sneak veganity past the unwashed masses. I suppose that's what the "nobody knows how to eat healthily" and "make the world a better place" doublespeak is about.
the other funny thing: my vegan friends tell me that caseinate (one of the main ingredients in the "non-dairy cheese") is milk-derived and not vegan-safe. maybe this is some kind of synthetic casein? maybe he's too vegan for real cheese, but not too vegan for artificial cheese with milk protein?
as far as people pointing out that it's not really a "complete day's nutrition," it's worth noting that the only things they don't have 100% US RDA of are the things you normally get much too much of. it is *just hard* to live in america and consume less than 100% of your recommended fat, protein, sodium etc intakes -- this is called dieting, and it's not something coders are known for. you wouldn't eat just a burrito in a whole day -- you'd grab some chips and jolt or something. one of these dilberitos plus a serving or two of unhealthy junk food will give you a great approximation of the rda's.
TOC (the protocol that TiK uses, the protocol that AOL released specs for) is a pidgin, text version of the binary OSCAR protocol that is used by the "mainstream" AIM client. (AOL has a gateway box somewhere that speaks TOC on one side and OSCAR on the other. All this effort is because OSCAR is a much more powerful protocol and you could do lots of interesting stuff that AOL wasn't quite intending yet if you had the specs to it.)
Also, OSCAR uses a weird, AOL-internal application framework similar to what's used in the AOL client. (In the beta version, you can actually pop up a console window and avail yourself of AIM's command language, complete with aliases and other random "features.") It's based on dynamically-loaded modules that talk to each other according to a special message-passing architecture.
This would lead me to think that any code contributed to TiK would be of limited utility to the main AIM developers. No temptation to violate the GPL.
[The information in this post comes from, umm, staring at the AIM login screen really hard and meditating. Yeah, that's the ticket.]
There are lots of places on the web (of widely varying reputations) where you can sign up and get online approval for a merchant account, often with better rates than you'd get at your brick-and-mortar bank. They direct-deposit (ACH) the money into whatever account you specify. Expect to pay from around 2% (if the plastic card and signed receipt passes through your hands) to 3-5% (if you take the order over the phone or net), plus a couple dimes a transaction, subject to a $10-$15 monthly minimum or so.
That'll get you a Merchant ID number that targets your bank account. Then you need software/hardware. You can buy credit card terminals that swipe a card and dial out over a phone line for a few hundred bucksish, or look around for something used. Or you can get Windoze software and a modem to do it. For a server solution (that you can develop against), the name brand is ICVerify, and it's quite expensive.
Or check out www.hks.net -- they sell Unix (incl linux) flavored software with full APIs (and lots of different bindings.)
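For a feel of how the fee structure quoted above adds up, here is a rough sketch. The specific numbers (3% rate, 20 cents per transaction, $10 monthly minimum) are assumptions picked from within the ranges mentioned, the function name is made up, and real merchant agreements will have more line items than this.

```python
def monthly_fees_cents(sale_amounts_cents, rate_bps=300, per_txn_cents=20,
                       monthly_min_cents=1000):
    """Estimate one month's merchant fees, working in integer cents.

    rate_bps: percentage fee in basis points (300 = 3%), an assumed value.
    per_txn_cents: flat per-transaction fee, an assumed value.
    monthly_min_cents: minimum monthly charge, an assumed value.
    """
    percentage = sum(amount * rate_bps for amount in sale_amounts_cents) // 10000
    flat = per_txn_cents * len(sale_amounts_cents)
    # You pay whichever is larger: what you actually ran up, or the minimum.
    return max(percentage + flat, monthly_min_cents)

slow_month = monthly_fees_cents([2000] * 10)    # ten $20 sales
busy_month = monthly_fees_cents([2000] * 100)   # a hundred $20 sales
```

With these assumed numbers, ten $20 sales only run up $8.00 in fees, so the $10 minimum applies; a hundred of them cost $80.00, or 4% of the $2000 taken in.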
Wow. Interesting stuff. If you ever put any of your contracts up on the web I'd be very interested to see them..
Under what pretext do you sue somebody when the fine is written into the contract? (Or is the whole point of your story that you don't?)
The load average is the average number of processes *eligible to run* -- the number of processes fighting for the CPU, as opposed to, say, waiting for hard drives to spin or data to arrive from the network. So, for example, starting the d.net client immediately increases your load average by 1.0, because d.net is always available to suck up any free CPU cycles.
Run top and you can see the actual percentage CPU time use.
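One refinement to "immediately": the reported figure is a damped average, so it climbs toward 1.0 instead of jumping there. Here is a small simulation of that, assuming the scheme Linux uses for the 1-minute figure (an exponentially decaying average sampled every 5 seconds). The function name and parameters are made up for illustration, and other Unixes may compute it differently.

```python
import math

def simulate_load(runnable, seconds, interval=5.0, window=60.0, start=0.0):
    """Simulate an exponentially damped load average.

    runnable: constant number of processes eligible to run.
    interval: assumed sampling period in seconds.
    window: averaging time constant in seconds (60 for the 1-minute figure).
    """
    decay = math.exp(-interval / window)
    load = start
    for _ in range(int(seconds / interval)):
        # Each sample blends the old average with the current runnable count.
        load = load * decay + runnable * (1.0 - decay)
    return load

after_one_minute = simulate_load(1, 60)     # one always-runnable process, e.g. d.net
after_five_minutes = simulate_load(1, 300)
```

Under these assumptions the 1-minute load average from a single CPU hog reaches roughly 0.63 after a minute and is within about 1% of 1.0 after five.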
Yeah. Instead, money would electro-magically disappear into the ether. If you never sync your PDA (or reinstall/wipe the memory/whatever periodically), you can transfer all you want and never pay anything..