Slashdot Mirror


The Ultimate Weapon Against Censorship?

Erik Moeller writes "David Madore, mathematician at ENS, describes a method that might be the ultimate weapon in the battle against Internet censorship. In his paper A method of free speech on the Internet: random pads he introduces a system of so-called pads, chunks of random data that are used to encrypt controversial information.

Every byte in the source file is XOR'd with exactly one byte in the random file. The result file, by itself, is totally indistinguishable from white noise, provided that the pad used is truly random. Madore now suggests that users store pads on different servers and use several of them in combination to encrypt data.
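The XOR step described above can be sketched in a few lines. This is a minimal illustration, not Madore's actual scripts: the helper names `xor_bytes` and `split_into_pads` are made up for this example, and Python's `secrets` module stands in for a "truly random" source, which is an assumption.

```python
import secrets


def xor_bytes(*blocks: bytes) -> bytes:
    """XOR equal-length byte strings together, byte by byte."""
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    out = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def split_into_pads(message: bytes, count: int) -> list:
    """Return `count` pads whose XOR equals `message`.

    The first count-1 pads are fresh random bytes; the last one is the
    message XOR'd with all of them, so it also looks like noise.
    """
    if count < 2:
        raise ValueError("need at least two pads")
    random_pads = [secrets.token_bytes(len(message)) for _ in range(count - 1)]
    final_pad = xor_bytes(message, *random_pads)
    return random_pads + [final_pad]


message = b"controversial information"
pads = split_into_pads(message, 3)   # could be stored on three different servers
recovered = xor_bytes(*pads)         # XORing all of them restores the message
```

Any proper subset of the pads XORs to noise; only the full set gives the message back.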

An FTP or WWW site that stores one of the pads could argue that it is only storing random noise, and another might do the same. It would be mathematically impossible to prove them guilty of storing illegal information (unless there is a way to prove that one pad was created after the other). Only by combining the two (or more) files am I able to retrieve the original controversial information. The critical parts are the links to the pads I need to obtain the information, but those might be traded on a distributed system like Gnutella or FreeNet. Plus, links take very little space and can be relocated easily to freespace ISPs.

The concept is a little more complicated than my summary here, so please read the paper (and mirror it, it's GPL'd!). There are already scripts and programs to create pads and restore the original files (including a GUI program for Win32). I might add that the idea of pad encryption is fairly old, already used in WWII -- its advantage is that it is mathematically safe if the pads are truly random and only used once, thus its name "One Time Pad"."

37 of 181 comments

  1. You can go further with secret sharing. by Paul+Crowley · · Score: 4

    "Secret sharing" allows you to break a piece of data (usually a secret key) into N "shares", such that you only need M %lt; N shares to reconstruct the secret, but such that you don't have sufficient information to reconstruct the secret with M-1 shares (ie it's not just impractical, it's information-theoretically impossible). This means you could extend the scheme to keep working even if one or more of the participating sites go offline.

    However, I don't believe any such scheme will work. If it turns out that existing law is insufficient to prosecute participants, they'll extend the law so that acting in a way that could facilitate such a scheme is illegal, and that will include participating in FreeNet, Gnutella, the Eternity service, or whatever. That's why we need both the technology and the data havens.
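The M-of-N secret sharing described above is usually done with Shamir's scheme. The following is a minimal sketch of that scheme, not the C program mentioned later in the thread. The prime `PRIME` and the function names `make_shares` and `recover` are choices made for this example.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime; the secret must be smaller than this


def make_shares(secret: int, m: int, n: int) -> list:
    """Split `secret` into n shares, any m of which reconstruct it."""
    if not (1 <= m <= n) or not (0 <= secret < PRIME):
        raise ValueError("need 1 <= m <= n and 0 <= secret < PRIME")
    # Random polynomial of degree m-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(m - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner's rule, modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover(shares) -> int:
    """Lagrange-interpolate the polynomial at x = 0 to get the secret back."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # Modular inverse via Fermat's little theorem (PRIME is prime).
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret


shares = make_shares(123456789, m=3, n=5)
from_first_three = recover(shares[:3])
from_other_three = recover([shares[0], shares[2], shares[4]])
```

With fewer than `m` shares, every candidate secret remains equally consistent with what you hold, which is the information-theoretic property the comment refers to.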

    1. Re:You can go further with secret sharing. by roman_mir · · Score: 2

      First of all, this is not a new idea, and I cannot imagine why it would be allowed to GPL it or license it otherwise. I guess it is all in the implementation details.

      During WWII, messages were sent back and forth that could only be decoded if the receiving party knew what 'key' was used to encrypt the data. The 'key' could be a well-known bestseller, a book, a letter, or any piece of paper with words on it.
      All the encryption does in this case is randomly find an occurrence of each letter (case does not matter) on a page and put the relative position of that occurrence, instead of the letter itself, into the encrypted document. Since the 'key' can contain many (literally thousands) of the same letter repeated in various words (say it's a book: how many letters 'a' could you find in it?), the message cannot be decrypted without knowing exactly which text was used to encrypt it.

      For example, I could use the text above to encrypt the message "FIRST POST" as "1 16 3 17 24 9 87 7 102 5". Note that 'S' is coded as '17' in "FIRST" but as '102' in "POST", and it could be anything else. Imagine using a book as a key: for each letter you could put a page number, a line number and the position of the letter within the line.
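The position cipher described above can be sketched as follows. The key text here is an invented sample. The numbering convention (1-based, counting every character including spaces) is an assumption for this example and need not match the numbers quoted in the comment.

```python
import secrets
from collections import defaultdict


def build_index(key_text: str) -> dict:
    """Map each letter (case-insensitive) to the 1-based positions where it occurs."""
    index = defaultdict(list)
    for pos, ch in enumerate(key_text, start=1):
        if ch.isalpha():
            index[ch.lower()].append(pos)
    return index


def encode(message: str, key_text: str) -> list:
    """Replace each letter with the position of a randomly chosen occurrence."""
    index = build_index(key_text)
    out = []
    for ch in message:
        if not ch.isalpha():
            continue  # this sketch simply drops spaces and punctuation
        positions = index.get(ch.lower())
        if not positions:
            raise ValueError(f"key text has no letter {ch!r}")
        out.append(secrets.choice(positions))
    return out


def decode(numbers, key_text: str) -> str:
    """Look each position up in the same key text."""
    return "".join(key_text[n - 1].lower() for n in numbers)


# Hypothetical key text; in practice both parties would agree on a book.
key_text = ("During the war, messages were sent back and forth; the key could "
            "be any book or letter. Proof of posting is left to the reader.")
numbers = encode("FIRST POST", key_text)
plain = decode(numbers, key_text)
```

Because the occurrence is picked at random, encoding the same message twice will usually give different numbers, while decoding stays deterministic.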

      This would be the same idea as the scheme suggested in the article above and this idea is not new at all.

    2. Re:You can go further with secret sharing. by David+A.+Madore · · Score: 2

      Speaking of secret sharing, I just wrote a little portable C program to do just that. You can find it at this place (all explanations on use are given within the source file itself). It's really cute.

  2. This doesn't make any sense by Eric+Sharkey · · Score: 2

    This doesn't make any sense. Sure, the pads are random, you can distribute the pads, but you still need to distribute the information that combining certain pads in a certain way gives you a certain message.

    If you could censor the delivery of the message, you could censor the delivery of the list of pads needed to create the message.

    All you're doing is putting the information into a new form. It's the pad list which becomes the important piece of information here and it's precisely the pad list which is completely unprotected by this scheme.

    It sounds pretty useless to me.

  3. One security weakness by Kiwi · · Score: 2
    One security weakness I see is that an attacker can keep track of the pad database, keeping a note of the dates all pads are added to the database. This way, they can determine the location of at least one 'guilty' pad--the most recently uploaded pad in a set of pads containing undesirable material.

    With this attack in mind, I really don't see what these pads give us that the traditional cypherpunk techniques, such as the anonymous mailers, freenet, etc. don't give us.

    - Sam

    --

    The secret to enjoying Slashdot is to realize that it should not be taken too seriously.

  4. White Noise by B.+Samedi · · Score: 2

    What I don't get is why someone would be storing white noise on their server. I mean come on. The argument that it's not encrypted data and just white noise is kind of a flimsy one to use against inspectors or what not. Why in the world would you be wasting storage space with white noise unless it's something important? Maybe I just don't get it.

    1. Re:White Noise by TetsuoShima · · Score: 2

      What I don't get is why someone would be storing white noise on their server. I mean come on. The argument that it's not encrypted data and just white noise is kind of a flimsy one to use against inspectors or what not. Why in the world would you be wasting storage space with white noise unless it's something important? Maybe I just don't get it.

      And you don't think that's a scary thing: having to justify the existence of ANY file on your hard drive to ANYONE!? That sounds entirely horrifying to me.

      Me: "I just had the file on my hard drive, sir"

      Judge: "For what reason?"

      Me: "I dunno, I just wanted to see what it would look like"

      Judge: "Well, the state deems that it appears too random, and since you can't offer an acceptable explanation for its use, we have to assume it was for illegal purposes."

      Scoff now, but it's been happening since the beginning of time.

      If I want to sit there and read from /dev/random all day (not the best choice for real 'white noise', granted), NOTHING about that points to any illegal, or even 'suspicious', activity. It's one man, piping data to a file. When any incarnation of that, random or ordered, is considered illegal, I'm moving out.

    2. Re:White Noise by Darchmare · · Score: 2

      Why not? Just relabel the white noise as, say, a Metallica MP3 and nobody could tell the difference.

      Then again, you might just open up another can of worms entirely...

      - Jeff A. Campbell
      - VelociNews (http://www.velocinews.com)

      --

      - Jeff
  5. Re:Another idea gone... :) by Betcour · · Score: 2

    Have a look at http://freenet.sourceforge.net ... this is what you are looking for.

  6. Ultimate Weapon Against Censorship?! by hypergeek · · Score: 4
    While this, in conjunction with Freenet may make censorship more difficult, and possibly more tricky from a PR point of view, it's a simple matter for a large governmental body to find and stamp out all the freenet-type servers in its jurisdiction.

    The best weapon against censorship is getting the general public rallied to your cause. Slinking around in the underground only makes you look more criminal to the average joe, and easier for any censorial body to sway public opinion against you. (Remember the panic about "hackers" from the early 90s to the present?)

    Failing that, though, the second best weapon, IMO, would be true anonymity. Would it be possible to have host addresses spontaneously, randomly generated, encrypted, and routed to the destination in a kind of virtual circuit?

    Then, when the connection is terminated (or even beforehand if constant generation of new addresses is part of the scheme), the address is discarded, never to be used again (except perhaps by coincidence).

    If someone wants to communicate, um, "nonymously" (as opposed to anonymously, of course :), they'd simply use digital signatures, but anonymity would be the default.

    Unfortunately, I'm not sure exactly how the non-addressing scheme would be implemented, and it would be of limited use to servers (which would require static addresses anyway), but with a shared client/server mechanism such as Gnutella, Freenet, (or OpenCOLA, for that matter ;), you could have a "swarm" network. Like a swarm of insects, you can definitely see that the swarm's there, you can tell when one insect bites you, but you can't track down that individual insect, as it gets lost in the swarm again.

    Or something like that. (It's 1:50-ish a.m., so I'm not exactly bright-eyed and bushy-tailed... :)

    --
    Stay up hacking each weekend. Sleep is for the week.
  7. NSA & Venona by Detritus · · Score: 4

    The NSA and its predecessors have been attacking problems like this for over fifty years. You take a bunch of intercepted messages, select two messages, overlay one message on top of another, subtract or exclusive-or the messages, look for a non-random result, shift or rotate one of the messages by a character or code group, and repeat. Continue until each message has been compared to every other message. The statistical anomalies indicate that two messages were encrypted with the same pad or additive. The NSA used this method to detect Soviet messages that had been encrypted with the same one-time-pad. The Soviets ran short of one-time-pads during World War II and issued duplicate pads to AMTORG and the KGB. It was also used to break naval codes that used a code book and random additive from a second book. Using multiple files makes the problem larger but the same techniques can be used.
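Why pad reuse shows up statistically can be seen in a toy example. When two messages share a pad, XORing the ciphertexts cancels the pad and leaves the XOR of the two plaintexts, which is far from random. The sketch below is only an illustration of that one effect, not a reconstruction of the historical method. The sample messages and the crude "top bit" statistic are invented for the example, and the shift-and-repeat step is left out.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def high_bit_fraction(data: bytes) -> float:
    """Fraction of bytes with the top bit set; about 0.5 for random data."""
    return sum(byte >> 7 for byte in data) / len(data)


SIZE = 256
msg1 = b"MEET THE COURIER AT THE USUAL PLACE AT NOON".ljust(SIZE, b".")
msg2 = b"SHIPMENT DELAYED, AWAIT FURTHER INSTRUCTIONS".ljust(SIZE, b".")

pad_a = secrets.token_bytes(SIZE)
pad_b = secrets.token_bytes(SIZE)

# Same pad used twice: the pad cancels out of the overlay.
same_pad_overlay = xor_bytes(xor_bytes(msg1, pad_a), xor_bytes(msg2, pad_a))

# Two different pads: the overlay still looks like noise.
diff_pad_overlay = xor_bytes(xor_bytes(msg1, pad_a), xor_bytes(msg2, pad_b))

reused_score = high_bit_fraction(same_pad_overlay)   # ASCII XOR ASCII never sets the top bit
fresh_score = high_bit_fraction(diff_pad_overlay)    # hovers around one half
```

Comparing every stored file against every other with a statistic like this is the "look for a non-random result" step. Real attacks use better statistics and also try the overlays at different offsets.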

    --
    Mea navis aericumbens anguillis abundat
  8. Re:This is old news by arcade · · Score: 2

    I thought the same, until I actually read the referred article. :) This is an idea about a free-speech network, à la FreeNet, not an OTP system.


    --
    "Rune Kristian Viken" - arcade@kvine-nospam.sdal.com - arcade@efnet

    --
    "Rune Kristian Viken" - http://www.nwo.no - arca
  9. I'm not impressed. by rjh · · Score: 4

    I am an InfoSec professional IRL, but I am not speaking for my employer, yadda yadda, this is not professional advice, insert standard disclaimer.

    First: I've never heard of this fellow. I don't recall seeing his name in any of the crypto journals. I don't recall seeing any particularly clever attacks from him in the past. Protocol design is tough; it is an exceedingly nontrivial task, even harder than designing new algorithms. Anyone who says they have a great new protocol is most likely lying, unless they're a Tuchmann, a Coppersmith or a Schneier.

    Always assume all new protocols are full of it, until enough time and attacks have gone by to give confidence that the protocol is only mostly full of it.

    Second: this system is not secure. Repeat after me: a one-time pad is secure as long as it's only used once. A birthday attack is orders of magnitude more likely than he's making it out to be, because Net traffic is not uniform; certain places tend to be "hot" and others "cold".

    Let's have a thought experiment. Let's say Slashdot begins to implement this system, and has a few thousand "pad blocks" available. This means a few hundred megabytes of purely random data--let's completely ignore the practical difficulties of purely random data for now and just assume we can do it.

    When Alice decides to store something unpopular and encrypts it with Pad(s) alpha, beta and gamma, so that Bob and Charlie can read it later, what's Alice going to do? -- Probably use one of the first twenty pads listed. Why? Because people are lousy at choosing random numbers. If you ask someone to pick a number at random, they're most likely to pick a number between one and ten, not one and fifty billion. Things that are at the head of a list get selected more often than those that aren't.

    Let's say that Slashdot randomizes these pads, though, so they always come up in an unpredictable order. (Never mind the practical difficulties in how to do this in the first place. It's a thought experiment. Just keep alive in the back of your mind the fact that (a) we've had to create hundreds of megabytes of purely random numbers, and (b) we have to present them to people in a purely random way.)

    After some mathematics, Alice's super-secret Neiman-Marcus cookie recipe is now pretty much totally obscured. She posts the recipe to a Website, and then tells Bob and Charlie, "Psst! I posted the information to this site. Find pads with IDs of [she recites their IDs] and use that to recover the information!"

    At that point the secret police storm in, having been eavesdropping on the entire conversation. They throw Alice, Bob and Charlie in jail. They go to the website, pull the information, get the pads and read the Neiman-Marcus Cookie Recipe for themselves. Guess what? This protocol has completely, totally and utterly failed.

    The naive response is to say "well, they wouldn't say it in the open... they'd use encrypted email to share the pad IDs!" Okay, fine. All that's happened is the encrypted email is the weak link in the security; if that goes, the entire scheme falls apart.

    Now recall those two extremely thorny problems from before. Hundreds of megabytes of purely random data are very hard to come by, and purely random presentation of random data is very hard to do. Add in the implementation weaknesses to the weakness of the communications channel between Alice, Bob and Charlie, and you've got a protocol which has very little merit.

    This protocol solves a problem which doesn't exist, as far as I can tell. Now, admittedly, I'm not the sharpest knife in the drawer and I'm also bone tired and I could be totally misunderstanding what the goal of the protocol is.

    But for a secret-sharing protocol, or as a way to securely store information in a way which is deniable, it's pretty dismal.

    1. Re:I'm not impressed. by Pig+Hogger · · Score: 2

      At that point the secret police storm in, having been eavesdropping on the entire conversation. They throw Alice, Bob and Charlie in jail. They go to the website, pull the information, get the pads and read the Neiman-Marcus Cookie Recipe for themselves. Guess what? This protocol has completely, totally and utterly failed.

      Not at all. The protocol did what it wanted to do: it told whoever wanted the cookie recipe where to find it, and they found it.


      --
      Here's my mirror

    2. Re:I'm not impressed. by Dirtside · · Score: 2
      Generating "purely random data" (or, as someone put it, practically random data) ain't that hard, even several hundred megabytes of it:

      1. Set up a webcam pointing at a lava lamp.
      2. Turn on the lava lamp.
      3. Take a screenshot every fraction of a second, take the bitmap sequence and XOR sections of the image together.
      Voila, random numbers. The probability of generating subsequent screenshots with identical bit values is nil. Especially so the higher the color depth/resolution of each image is... You could easily get a hundred K of random data from each image, and you can get (let's say) 10 of those a second. 1 megabyte of random data per second. Now just run the program for ten minutes...
      --
      "Destroy science and religion. Science would re-emerge exactly the same; but not religion." - Penn Jillette, paraphrased
  10. Re:An Insightful Post by muldrake · · Score: 2

    Ok, just about no one seems to have read and understood Madore's page, so I'll summarize his idea: when two people independently serve statistical "white noise" (which just happens to XOR to controversial material), it is ridiculous for either to be convicted.

    I understand this legal argument, but it's a rather highly technical legal argument. Suppose the DA decides to prosecute anyway and has some imbecile willing to testify to your guilt?

    Ok, at this point you then have to find yourself an expert witness to testify at a price of a couple grand a day. So then the DA hires a lot more "experts" to shout down your expert. So now you are paying massive legal expenses on doctored-up kiddie porn created by a crooked DA.

    The jury will be told that obviously you are some kind of criminal because otherwise why would you be doing something like this in the first place. Anyone who knows anything about the Internet or even has an AOL account will be excluded from the jury. Then any jury you have, presuming you can even afford lawyers, will already be drooling idiots, and will be pummeled into submission by a parade of trained circus ponies and clowns with seltzer water.

    To counter this you will have to spend every penny you ever had, and indenture yourself into slavery for your lawyers. Then the idiot jury will probably find you guilty anyway.

    That's assuming you get a trial. They could just invoke the name of Mitnick and deny you bail, and lock you up in solitary until you agree to waive your right even to have a bail hearing. Then they won't let you examine any of the "evidence" in your case and will generate a few gigabytes of crap. When you finally get the right to examine it, they'll print out tens of thousands of pages of binaries on a dot-matrix printer and let you look at it with a flashlight for five minutes a day in a dark room.

    All this is well and good as a mathematical exercise, but the real trick in creating a security system is to have one which is so ubiquitous that having it won't even seem suspicious.

    Because even looking suspicious is enough to get demonized these days. And what's the legal excuse? Ooooooh, we need to protect the CHILDREN. They'll use it for CHILD PORN!

    (IMO fuck the children, but that's not good politics. Anyone using this system will be portrayed ipso facto as some sort of pervert or molestor, and PGP already does this stuff fine.)

    (Oh and I forgot. While this is all going on a bunch of idiots will be posting on slashdot, ohhhh, but he's a criminal, hell with him.)

  11. The Real Ultimate Weapons Against Censorship... by istartedi · · Score: 4

    ...are a strong social framework, a tradition for the respect of individual rights, and a rational government working in harmony.

    Stop looking for technological fixes to problems that aren't technological.


    The regular .sig season will resume in the fall. Here are some re-runs:
    --
    For all intensive purposes, "whom" is no longer a word. That begs the question, "who cares"?
    1. Re:The Real Ultimate Weapons Against Censorship... by Pig+Hogger · · Score: 2
      It can be stated much simpler: a well-educated population in a democracy that doesn't listen only to SIGs.

      --
      Here's my mirror

  12. Two weaknesses + fixes by XNormal · · Score: 2

    The first weakness is that it is easy to poison the repositories with pads with false names. The pad names should be made self-verifying by using a hash of the entire pad as a name (e.g. md5).
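The self-verifying names suggested above amount to content addressing, which can be sketched as below. MD5 is used only because the comment names it; it is now known to have practical collision attacks, so a modern hash such as SHA-256 (`hashlib.sha256`) would be the natural swap. The function names are invented for this example.

```python
import hashlib
import secrets


def pad_name(pad: bytes) -> str:
    """Name a pad by the MD5 digest of its full contents, as a hex string."""
    return hashlib.md5(pad).hexdigest()


def verify_pad(name: str, pad: bytes) -> bool:
    """Check that a downloaded pad really matches the name it was requested under."""
    return pad_name(pad) == name


pad = secrets.token_bytes(1024)
name = pad_name(pad)

# A repository that serves a poisoned pad under the same name is caught.
tampered = bytes([pad[0] ^ 1]) + pad[1:]
genuine_ok = verify_pad(name, pad)
tampered_ok = verify_pad(name, tampered)
```

A side effect of naming by a hash of the whole pad is that the name no longer reveals the pad's first bytes.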

    The second problem is that the keyspace is too small. The obvious solution would be to encrypt the data. This way the "URL" for the information would be the names of pads to XOR plus the encryption passphrase. The encryption format should have no headers and be indistinguishable from random data without the passphrase. A good candidate would be CipherSaber.

    The system's biggest advantage is that it is ridiculously simple and uses existing tools. This makes it very transparent.

    ----

    --
    Stop worrying about the risks of nuclear power and start worrying about the risks of not using nuclear power.
  13. You can go further with almost any current method! by orpheus · · Score: 3

    This method described has almost no merit at all.

    The article had so many technical, philosophical, mathematical and other misconceptions (just a few listed below) that it could pass for a modestly well-crafted troll. It had 'something for everyone' (i.e. anyone should be able to poke *some* hole in it, with a moment's thought), making it both an 'obvious troll' and 'good bait'.

    At first, I thought the author was sincere, but then I noted that he actually reversed and misrepresented its flaws as *strengths* (e.g. 'the birthday effect' in namespace collisions)

    How did this article get on the front page of SlashDot? <sarcasm> Is it supposed to be a sly social analysis, a wry deconstructionist experiment or dry Gallic humor? I wonder why it's under censorship rather than crypto -- could it be footnote #6, below? Must be, else this submission would never be the cream of the crop. </sarcasm>

    [1] "Free speech" is only meaningful when it can be widely heard. Perfect encryption without public decryption is like locking yourself in a trunk and throwing away the key. If every Joe Sixpack and Dexter Tapedglasses can read your message without prior arrangement, so can Joe Gannon and Janet Reno. If JS and DT can't read it, it ain't 'free speech', it's 'private communications'.

    [2] The only privacy insight here is the obvious fact that "encrypted files may look like garbage" (regardless of encryption method). However, *cleverly* encrypted files, e.g. steganography, may look like something utterly harmless. Which approach is safer/more secure for the originator, the storage site, and the recipient? Especially in the light of laws like England's mandatory key surrender on (proper) demand. Someday, keeping massive porn databases may be your duty as a patriot! ;-> How else can we stop the jackbooted thugs from finding/blocking our 21st century Federalist Papers?

    [3] While independently assigned padnames of 8 bytes may offer 2^64 names, there is a 50% chance of collision after relatively few pads are generated (i.e. millions). The birthday problem the article mentions doesn't suggest high freedom from collisions (as he implies), it means collisions are much likelier than we expect: if there are 24 people in a room, it's *probable* (>50%) that there'll be a birthday collision (shared birthday) even though there are 366 possible days in the dataspace. He cites this as proof that collisions will *not* be a problem.

    [4] The system loses the ability to decode as more random pads are created/shared and collisions begin to occur. Since pad generation is uncontrolled, this method would become an information hole -- if you used the 'wrong' #6930d3ed740d54de for a given file, you'd get gibberish -- yet all pad #6930d3ed740d54de are equally valid. The system he calls "A whole Mess 'O' Pads" would degrade to "A Whole Mess" (of bits) -- an effective information hole.

    [5] At its best, this inept rendition of a one-time pad is a Geek Pig Latin [GPL??], reducing the encryption value of a theoretically UNBREAKABLE 128K one-time pad to a *theoretical* maximum of 2^(64*n) combinations [where n is the number of OTPs XOR'd together] and a minimum that is no more than x^n combinations for brute force cracking [where x = number of published pads, n = number of XORs].

    You can best think of it as a poor key generation method, where the true key is not the 128K pad, but the far shorter 'instructions' -- the keynames to XOR together. The example he gave (6 XORs, 8 byte keynames) amounts to the same security as XORing against a 384 bit key, as far as a brute force attack is concerned. This is the same security as XORing against "Netscape Engineers are Weenies! They really are!!" (48 bytes)

    [6] Perhaps this article can be most charitably read as an experiment in information darwinism, but not in the Dawkinsian 'meme' sense: the speaker who uses this method is 'too dumb to be listened to' (and silenced by disappearing into the 'Whole Mess'O'Bits) -- akin to the sardonic 'too dumb to live'. (This is supported by his assertion that he is not sure free speech is a good thing)
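The birthday estimates in point [3] can be checked numerically. The sketch below assumes pad names behave like independent, uniformly random 64-bit values. That is an assumption, and skewed or non-random names would collide sooner. The function names are invented for this example.

```python
import math


def exact_birthday(people: int, days: int = 365) -> float:
    """Exact chance that at least two of `people` share a birthday."""
    p_distinct = 1.0
    for k in range(people):
        p_distinct *= (days - k) / days
    return 1.0 - p_distinct


def collision_probability(items: int, space: int) -> float:
    """Approximate chance of any collision among `items` uniform picks from `space`."""
    # 1 - exp(-n(n-1)/(2N)), written with expm1 to stay accurate for tiny values.
    return -math.expm1(-items * (items - 1) / (2.0 * space))


def items_for_half(space: int) -> float:
    """Approximate number of picks at which a collision becomes 50% likely."""
    return math.sqrt(2.0 * space * math.log(2.0))


room_of_23 = exact_birthday(23)                       # just over one half
half_point_64bit = items_for_half(2**64)              # on the order of 2^32
million_pads = collision_probability(10**6, 2**64)    # chance with a million pads
```

Under these assumptions the classroom figure holds up (23 people already give slightly better than even odds), but for 64-bit names the 50% point comes out near five billion pads rather than in the millions, and a million pads collide with a probability on the order of a few in a hundred million.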

    --

    If you can go to bed, knowing you did a valuable thing today, you're very lucky. If you can't... it's not bedtime

  14. One Time Pad Snake Oil by Effugas · · Score: 5

    *Sigh*

    Everybody loves the One Time Pad.

    Can't imagine why. It's like, couple words out of Shannon saying a system can be provably uncrackable, as long as it's far too annoying to actually use, and people convert that to:

    Let's just make it not annoying to use.

    Problem is, the security comes from that annoyance, and degrades ungracefully: Very, very ungracefully. As in, the moment one pad gets compromised, or even reused, boom. Game over. You're done.

    Compound that by having key material retrieved by the encryptor over a network (as this system depends on), and you're even more done. Let's analyze what's going on here a bit.

    All cryptosystems are essentially engines for extracting the secrecy from a set of data. Secrecy is something even more intangible than the raw data that itself is secret; a very large quantity of information can be stored and transferred, but a secret can only be transferred if that data can be understood. Cryptography essentially works by allowing the comprehensibility of data--though not the data itself--to be extracted and simplified down to some other piece of data.

    Now, often that data can be much, much smaller. Broadbridge Media, for instance, takes direct advantage of this for reasonably secure mass data distribution of music videos on CDs--some large ciphertext gets mass distributed on CDs or DVDs, while a small, personalized transaction over the Internet allows an individual to retrieve the key which decrypts the ciphertext into plaintext. The mass data is moved, but remains incomprehensible until a relatively tiny amount of key material is transmitted to the destination host.

    Madore's system is somewhat similar; he still has a chunk of extracted secrecy composed of a "recipe of pads" which, when XOR'ed together, reveal the plaintext. This recipe can be as small as literally two pads; an innocent "complete works of Shakespeare" page and some extension thereof.

    First problem? Madore gets his pad indexes from the first couple of bytes of whatever pad he's come across. PGP has survived reasonably well with a 2^32 complexity attack against its public keyspace indexes (it's called the DEADBEEF attack); Madore's system however is likely to find collisions in everyday use.

    It never ceases to amaze cryptographers that, for all the functionality of the fixed-output, one-way hash (password storage, small indexes to arbitrarily sized inputs), people don't use them. There really aren't that many flat out solved problems in all of crypto; this is one of them. IF YOU'RE NOT STORING YOUR PASSWORDS AS EITHER MD5 OR SHA-1 HASHES, YOU'RE WAITING TO GET HACKED. *sigh*

    Anyway, beyond that small chunk of data which gives the recipe of which block to use, there's also the censorworthy-but-XOR-obfuscated block which will supposedly diffuse itself throughout the network. Whereas Broadbridge got its incomprehensible data out the door on CDs, Madore's system invokes the distributed nature of many, many XORable keyblocks to hide which block on the network is the actual censor-worthy block.

    But how many blocks do I need to use for a recipe? Suppose I have 200 random blocks to choose from, and I download one block of random key material. Wait. Let's say I'm really paranoid, and I generate my own random block to XOR against, and upload it to a server. OK. So I've gotten my single block to XOR against, I do so, and I upload my data-containing block to the padservers.

    I've already lost.

    Whether I downloaded my keyblock from the network, or uploaded it to the network, anybody sniffing my network traffic will see the exact block I used to encrypt against. They'll either watch it leaving the keyserver or going back in.

    Worse, let's assume there was no sniffer--just 201 random blocks, any two of which can be XORed together to reach plaintext. The complexity isn't one of fifty billion, it's 201*201, or a good 40,401 operations. Use of two pads isn't particularly specified...but then, use of this as a viable encryption system isn't particularly specified either. You can tell, by this line:

    "Your first task is to locate an announcement stating that the data you want are recoverable by XORing such a set of pads."

    Oh, that's all.

    "Go find your key."

    Obviously, with no special complexity applied to locating your key, there's nothing that separates You As Reader from You As Censor. And, since whoever determines a key used *once* for secret information determines it for all time...boom.
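The 201-block scenario above is small enough to brute-force directly. The sketch below builds 200 random blocks plus one block hiding a message, then tries every pair. The "looks like text" test is a crude stand-in invented for this example, and index 57 for the partner pad is arbitrary.

```python
import itertools
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def looks_like_text(data: bytes) -> bool:
    """Crude plaintext test: every byte is printable ASCII or common whitespace."""
    return all(32 <= b < 127 or b in (9, 10, 13) for b in data)


def find_pairs(pads) -> list:
    """Try every unordered pair of pads; return the index pairs that XOR to text."""
    hits = []
    for (i, a), (j, b) in itertools.combinations(enumerate(pads), 2):
        if looks_like_text(xor_bytes(a, b)):
            hits.append((i, j))
    return hits


secret = b"Neiman-Marcus cookie recipe: flour, sugar, butter, chocolate"
pads = [secrets.token_bytes(len(secret)) for _ in range(200)]
pads.append(xor_bytes(secret, pads[57]))  # the one data-carrying block, index 200

hits = find_pairs(pads)
pairs_tried = len(pads) * (len(pads) - 1) // 2
```

Counting unordered pairs gives 20,100 trials, half of the 201*201 upper bound quoted above, and the search finishes almost instantly. A random pair passing the text test by accident is astronomically unlikely at this message length.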

    But, let's be fair. Madore's goal mainly seems to be to give websites the capability to host information they can't recognize. Freenet did this; Madore doesn't actually even come close. Among other things, the system isn't particularly fault tolerant. Good secret sharing systems allow m-of-n functionality, i.e. retrieval of any m number of shares from n total (like 3-of-5) reveals the data. This system? Any block is missing--and there doesn't need to be more than two--and your data is gone. Loss of a single pad archive is likely to cause some data to disappear forever. Ouch.

    Honestly, I'm putting too much energy into this. Madore writes the following:

    The pads, of course, are just named by their 16-hex-digit names (thus, strictly speaking, the announcement makes it possible to recover the first eight characters of the data; but that should not be a problem).

    Any cryptosystem which leaks information about the plaintext in the key material never should have left the drawing boards. I congratulate Madore on noticing this, of many flaws in his design, but this really is Bad Crypto. It's timely, and it's useful, and it'll hopefully prevent people from falling for other Pad scams by sheer nature of the /. reaction, but it's still Bad Crypto.

    *Sigh* At least he wasn't trying to sell us anything.

    Yours Truly,

    Dan Kaminsky
    DoxPara Research
    http://www.doxpara.com

    1. Re:One Time Pad Snake Oil by Effugas · · Score: 2

      mdpopescu--

      If you've got a cogent point to add, please, do so. I don't hold the monopoly on clues; I expect to fuck up pretty harshly in my life. It's part of crypto; you fuck up.

      This was billed as a means of encryption; it fails miserably in that regard. Key material is retrieved over a network, or is compromised when it is submitted to a network. Methodologies of dealing with files greater than 128 KB aren't even mentioned. Recipes end up causing a single block to be the non-innocent one. No block that is innocent really is functionally that.

      And so on! Really, I'd love a better response. Crypto's what I do, and I wrote the previous rant on not *too* much sleep. You've gotta admit, Madore's system just isn't very good crypto, but if I missed the reasons why it isn't, I'm all ears.

      Yours Truly,

      Dan Kaminsky
      DoxPara Research
      http://www.doxpara.com

    2. Re:One Time Pad Snake Oil by Effugas · · Score: 2

      Hartwell--

      There are two components here:

      Information Hiding, via Encryption.
      Secret Sharing, via Split Chunks and Recipes.

      As an encryption system, this fails. Madore admits this. But it's still an encryption system in one very classical sense: You have one block which is equal to ciphertext.

      Not two, not three, not m of n.

      One.

      And it's one block, which never changes. One block, which can be easily identified. One block, which is dependent upon network-retrieved keying material.

      There are far, far better ways of doing steganography, secret sharing, and cryptography as a whole. That's my point.

      --Dan

  15. Re:You can go further with almost any current meth by Pig+Hogger · · Score: 2

    [1] "Free speech" is only meaningful when it can be widely heard. Perfect encryption without public decryption is like locking yourself in a trunk and throwing away the key. If every Joe Sixpack and Dexter Tapedglasses can read your message without prior arrangement, so can Joe Gannon and Janet Reno. If JS and DT can't read it, it ain't 'free speech', it's 'private communications'.

    (For convenience, let's call the act of getting the pads and XORing them together "schkroping").

    Not at all. If you describe such a text as being available by schkroping together, say, 95FE35321DA3, 95843938475894, 3948382830405, 409530404950 and 28305049394 (presumably each pad being locatable by its "name"), you'd get a schkroping browser which will get the information you want just as (insert your favourite HTML browser name) does, except that the URL would be the names of the various pads constituting the information.

    Hey! Let's invent a new URL type: shkrp://(pad 1),(pad 2),(pad 3),...,(pad n)
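
    A rough sketch of what such a schkroping client might do, assuming the comma-separated URL form proposed above. The shkrp:// scheme is this post's invention, and parse_shkrp, schkrope and the in-memory store standing in for the network are all hypothetical.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("length mismatch")
    return bytes(x ^ y for x, y in zip(a, b))


def parse_shkrp(url: str) -> list[str]:
    """Extract the pad names from a shkrp://name1,name2,... URL."""
    prefix = "shkrp://"
    if not url.startswith(prefix):
        raise ValueError("not a shkrp URL")
    return [name for name in url[len(prefix):].split(",") if name]


def schkrope(url: str, fetch_pad) -> bytes:
    """Fetch every pad named in the URL and XOR them all together."""
    pads = [fetch_pad(name) for name in parse_shkrp(url)]
    return reduce(xor_bytes, pads)


# Stand-in for the network: a dict mapping pad names to pad contents.
message = b"hi"
pad1 = bytes([0x01, 0x02])
pad2 = xor_bytes(message, pad1)
store = {"0001": pad1, "0002": pad2}

result = schkrope("shkrp://0001,0002", store.__getitem__)
```

    In a real client fetch_pad would go out to whichever servers hold the pads; here it is just a dictionary lookup.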

    [3] While independently assigned padnames of 8 bytes may offer 2^64 names, there is a 50% chance of collision after relatively few pads are generated (i.e. millions). The birthday problem the article mentions doesn't suggest high freedom from collisions (as he implies), it means collisions are much likelier than we expect: if there are 24 people in a room, it's *probable* (>50%) that there'll be a birthday collision (shared birthday) even though there are 366 possible days in the dataspace. He cites this as proof that collisions will *not* be a problem.

    However, here, you're right. There WILL be name collisions when you just take the first n bytes of the pad to identify it. But what can we do? If we take the last n bytes of the pad, we'll have the same problem. Even if we XOR them together, or if we XOR the CRC of the pad over that.

    Ultimately, it would seem that the only real unique key would have to be the pad itself!!!! Which hardly solves the problem at hand...

    The method could sure be greatly improved by the million eyeballs now looking at it; how about incorporating it in freenet, as the author suggests????


    --
    Here's my mirror

  16. How does this help censorship? by raygundan · · Score: 2

    It seems to me that this system is just as easily censored as the existing internet. Somebody has to host the information that tells which pads are put together to make the real data, right? It wouldn't be any harder to censor this "list of pads" than it would be to censor the unencoded file itself in the first place.

    It appears that all this method does is move the point of censorship from the document contents to the "list of pads" required to build the document from the random data stored on various servers.

    Unless I'm missing something when I read through the document, I don't think that this really gains us anything and at the same time it makes it really freaking difficult to put a file out there. Maybe if it were automated, it would make a nice extension to an anonymous DFS system for file sharing, but you shouldn't rely on it to prevent censorship.

  17. Enhancement by Nicolas+MONNET · · Score: 2

    Servers should not list what they have (except maybe to mirrors); they should just return what is requested.

  18. Re:An Insightful Post by Pig+Hogger · · Score: 2

    That's assuming you get a trial. They could just invoke the name of Mitnick and deny you bail, and lock you up in solitary until you agree to waive your right even to have a bail hearing. Then they won't let you examine any of the "evidence" in your case and will generate a few gigabytes of crap. When you finally get the right to examine it, they'll print out tens of thousands of pages of binaries on a dot-matrix printer and let you look at it with a flashlight for five minutes a day in a dark room.

    You should have said "a dot matrix printer with a faded ribbon with holes and creases"...


    --
    Here's my mirror

  19. Re:IANAL, but I'm a math/CS graduate by Signail11 · · Score: 2

    The OP is correct (and you've missed a rather subtle point). The OP said "It suffices to build a (roughly) square matrix containing the prefix of all the pads we wish to include in the analysis, run Gaussian elimination, and then see if there is a dependency with the file." The key word in that sentence is the word "square". Moreover, it is eminently possible to use more than just the 64 bit prefixes of the files; if one uses say 3,000,000 (where 3,000,000=the total number of pads), the total size of the matrix is well within the bounds of conventional techniques, to say nothing of SGE or BL/BC.
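
    For the curious, a toy sketch of that dependency search, assuming each pad prefix is packed into a Python integer (one bit per column). It is plain Gaussian elimination over GF(2), not the structured or block methods mentioned above, and find_dependency is a name invented here.

```python
def find_dependency(pads: list[int], target: int):
    """Return indices of pads whose XOR equals target, or None if none exists.

    Each pad prefix (and the target) is an int used as a bit vector over GF(2).
    """
    # highest set bit -> (reduced vector, bitmask of pad indices that built it)
    basis = {}
    for i, vec in enumerate(pads):
        combo = 1 << i
        while vec:
            top = vec.bit_length() - 1
            if top not in basis:
                basis[top] = (vec, combo)
                break
            bvec, bcombo = basis[top]
            vec ^= bvec      # eliminate the leading bit
            combo ^= bcombo  # remember which pads were mixed in
    # Reduce the target against the basis, tracking the pads used.
    vec, combo = target, 0
    while vec:
        top = vec.bit_length() - 1
        if top not in basis:
            return None      # target is not in the span of the pads
        bvec, bcombo = basis[top]
        vec ^= bvec
        combo ^= bcombo
    return [i for i in range(len(pads)) if (combo >> i) & 1]


pads = [0b1010, 0b0110, 0b0011]
hit = find_dependency(pads, 0b1001)   # 0b1010 ^ 0b0011
miss = find_dependency(pads, 0b0001)  # not reachable from these pads
```

    Here hit names pads 0 and 2, and miss is None. With real pads the vectors would be far longer prefixes, but the elimination is the same.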

  20. I think you've misunderstood, & your sums are bad. by Paul+Crowley · · Score: 2

    The point of the method is to make it easy to collect the information, while making it difficult to blame the publishers. Janet Reno is supposed to be able to read it; this is supposed to make it more difficult, legally speaking, to get the information offline. I don't think it'll work but it's not utterly mad. It's not exactly unobvious either.

    Your sums are wrong for point 3 as well. If you want a chance on the order of 50%, you'll have to generate around 2^32 pads; that's more like billions than millions. I still think that's too small, but hey, move to a 160-bit identifier (perhaps the SHA-1 of the pad?) and you won't get collisions.
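
    For anyone who wants to check the sums, a quick sketch using the standard birthday approximation p ≈ 1 - exp(-n(n-1)/(2N)) with N = 2^bits. The function names are mine.

```python
import math


def collision_probability(n_items: int, space_bits: int) -> float:
    """Approximate chance that n_items random names of space_bits bits collide."""
    space = 2 ** space_bits
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-n_items * (n_items - 1) / (2 * space))


def items_for_probability(p: float, space_bits: int) -> float:
    """Approximate number of names needed to reach collision probability p."""
    return math.sqrt(2 * (2 ** space_bits) * math.log(1 / (1 - p)))


n_half = items_for_probability(0.5, 64)           # pads needed for a 50% chance
p_millions = collision_probability(3_000_000, 64)  # chance with 3 million pads
n_half_160 = items_for_probability(0.5, 160)       # same question, 160-bit names
```

    Under this approximation n_half comes out at roughly 5 billion pads for 64-bit names, and three million pads give a collision chance of a few parts in ten million.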
    --

  21. Fighting the wrong battle by sansbury · · Score: 2

    In a country such as China, merely maintaining a Freenet server or collection of pads for this scheme would likely be declared a capital offense. And since the authorities are willing to monitor every drip of water that flows through the pipes, they will see when you send that PGP-signed message, and arrest you. Whether they can crack the message or not is in most cases irrelevant.

    What is needed here is a form of encryption in plain sight that doesn't say, "look at me I'm a cypherpunk" when you use it. What about this-

    1. Take a copy of an innocuous 8-10k JPEG file from some large public site. Say some cute little kitty-cat from Pets.com or that sort of thing.

    2. Use a program that takes a small text message, maybe a few dozen words- "The police chief practices Falun Gong and will warn you if trouble is coming."- and embeds them into the JPEG file by, say, flipping a handful of color values around ever so slightly.

    3. Send the munged image to the recipient in an innocuous email- "Isn't this kitty so cute!!! :-)" While indistinguishable to the naked eye, a simple comparison of the differences between the file sent and the publicly available image file would reveal differences.

    4. The crypto here need not be so strong, because the point is to focus on making the sending of the message look as innocuous as possible, and to create plausible deniability for the receiver.

    5. Now the only problem is to get the decoding software installed where it needs to be. I don't know what the right answer here would be.
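
    Steps 2 and 3 might look roughly like the sketch below. Big assumption: it works on raw bytes (think uncompressed pixel values), because a real JPEG is compressed and flipping bytes in the file would corrupt it. embed and extract are illustrative names, not an existing tool.

```python
def embed(cover: bytes, message: bytes) -> bytes:
    """Flip the low bit of one cover byte for every 1 bit in the message."""
    bits = [(byte >> i) & 1 for byte in message for i in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover too small for message")
    out = bytearray(cover)
    for pos, bit in enumerate(bits):
        out[pos] ^= bit  # changes the value by at most 1
    return bytes(out)


def extract(cover: bytes, stego: bytes, n_bytes: int) -> bytes:
    """Recover the message by diffing the received data against the public original."""
    bits = [(c ^ s) & 1 for c, s in zip(cover, stego)][: n_bytes * 8]
    return bytes(
        int("".join(map(str, bits[i:i + 8])), 2)
        for i in range(0, len(bits), 8)
    )


cover = bytes(range(256))  # stand-in for the public image data
secret = b"meet at noon"
stego = embed(cover, secret)
recovered = extract(cover, stego, len(secret))
```

    The receiver needs the same public original and the message length; how those get agreed on is the same open problem as step 5.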

    Anyway, just my two cents. Take it FWIW.

    -cwk.

  22. Some replies to various criticisms by David+A.+Madore · · Score: 5

    Hi. I'm the author of the page in question, and victim unaware of the Slashdot effect (well, not truly unaware: Erik Moeller, who posted the story, was kind enough to notify me in time). I received many emails about it, all of which I've read, as well as a good many posts in the current discussion. I can't possibly reply to them all, but I'll try to answer some of the most frequent or important comments here.

    First note that the page was written in February (2000/02/19 to 2000/02/23 to be precise), so it is not new. However, I do not claim any kind of originality, nor paternity of the idea: it is a small variation on the protocol described in section 6.3 ("Anonymous Message Broadcast") of Bruce Schneier's book on cryptography. In any case, I think it is pretty obvious in the first place. I am merely suggesting a few practical ideas to make it workable. There is nothing great or revolutionary about anything, and I never made that claim.

    One thing should be made clear from the start: the whole idea is not about obscuring what the data is (i.e. it is not strictly speaking cryptography) but about who is sending the data. And, even more specifically, it is about making legal conviction impossible so long as the presumption of innocence is maintained (whether the presumption of innocence still means anything in these dark days is another question :-/ ); thus, it is normal that the story appeared on Slashdot's "Your Rights Online" section.

    Please also note that I am not making a political statement. This is not a libertarian manifesto. I am not stating that you should use this system to send out assassination messages against the President / the Prime Minister / the King / the Pope / <insert your favorite assassination victim here>; I am merely stating that you can, and that this is none of my business.

    Many have pointed out that my suggested way of naming pads is bad. That's true: using the MD5 (or SHA1 or any other kind of hash) signature would be a better idea. But it doesn't really matter all that much what the pads are named unless we want the system to be resistant to malicious tampering, which was not one of my avowed goals. Indeed, we can get this almost for free, so we might as well. Let's say we could have a symlink pointing from pad_md5_whatever.dat to the pad of the given md5 for each pad in each repository, and "combination recipes" could be given with these links so as to make them resistant to tampering.
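
    A small sketch of hash-based naming, assuming the name is just the hex digest of the whole pad and a recipe is a list of such names. The pad_<algorithm>_<digest>.dat pattern follows the symlink idea above; pad_name, pad_filename and verify_recipe are hypothetical helpers.

```python
import hashlib


def pad_name(pad: bytes, algorithm: str = "sha1") -> str:
    """Name a pad by the hex digest of its full contents."""
    return hashlib.new(algorithm, pad).hexdigest()


def pad_filename(pad: bytes, algorithm: str = "md5") -> str:
    """File name in the style of pad_md5_whatever.dat."""
    return f"pad_{algorithm}_{pad_name(pad, algorithm)}.dat"


def verify_recipe(recipe: list[str], fetched: list[bytes]) -> bool:
    """Check that every fetched pad really hashes to the name in the recipe."""
    return len(recipe) == len(fetched) and all(
        pad_name(pad) == name for name, pad in zip(recipe, fetched)
    )


pad = bytes(16)
recipe = [pad_name(pad)]
ok = verify_recipe(recipe, [pad])
tampered = verify_recipe(recipe, [b"\x01" + pad[1:]])
```

    A pad that was swapped or altered no longer matches its name, so a recipe built from digests detects tampering without any extra machinery.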

    Similarly for secret sharing: my idea was not to have a system which is hard to censor (there are other, far better, solutions for this), but to have one which is hard to track.

    Another thing I should make quite clear is that the system in itself is not used to hide data: it is used to hide the origin of data. This is why all comments on the "OTP is secure as long as the pad is truly one-time" line, or all remarks to the effect that it is trivial to find all relevant data among the padset, are quite true but completely irrelevant. If you want to hide the data on top of hiding the origin, then you use a traditional cipher; for example, you encrypt your data using Blowfish and you use that data (the ciphertext, which for all intents and purposes is random) as input to the pad system. So long as you don't release the key, nobody can tell that there's Blowfish-encrypted data hidden in the pad system. The two are completely orthogonal. (It is true that my remark about the difficulty of finding "recognizable data" in the pad system is very misleading and irrelevant. I should remove that: never mind that part.) As for my comment about the birthday effect, it is merely about accidental collisions, not at all about malicious action.

    Somebody asks what is wrong with storing all pads in the same place since anyone can download them all. That is true, but that is beside the point. The point is that as long as a site does not have a complete set of pads yielding readable data, it is not, by itself, breaking any law, and all it is distributing is white noise; whereas if it stores one complete set of pads, then it is distributing the forbidden document in some form. Naturally, if someone wants to collect a complete set of pads, it is a good idea; but to distribute it is dangerous.

    Finally, there is the central question of whether the legal argument (which is the crux of the matter) holds water. Presumably it doesn't, but that will at least prove one thing: the argument shows that any kind of law restricting free speech contradicts the presumption of innocence. Some have pointed out that one could monitor the pad system, and the last pad published in a set of pads would always be the culprit: this is not true, because it might have been delayed, or it might be provably innocent (which implies the former, actually), and you can never quite be sure.

    Imagine the following scenario: someone points out on some Usenet group that eight publicly available pads, when XORed together, give something like DeCSS code. Judge summons the 'someone' in question, who claims that he just noticed that by randomly XORing pads together; not unconvincing, so judge lets the guy go. Then judge summons the pad owners. Starts with the most recently published pad: but the owner explains "look, my pad is just an encryption using the key 'foobar' of the first 128kb of (some standard transcription of) Shakespeare's Tempest; the idea had been floating around for some time, I just decided to publish it". Judge checks statement: it's true. So apparently the data was "published" earlier than was thought, it just took some time to come out; that makes things rather difficult to track. Second owner similarly points out that his pad is just a sequence of decimals of pi in binary. Third owner is in a country over which judge has no jurisdiction, so nothing to do there. Fourth and fifth owners seem to have created their pads at the very same time, and both state obstinately that they generated pure white noise (following, say, a story on Slashdot about pads being a great idea). Sixth owner says he generated his pad by XORing a dozen other pads with an innocent message (which he shows to judge). Seventh owner refuses to answer judge's question. Eighth owner posted his pad before DeCSS even appeared, so must be innocent (or is he really?). Now what does judge do? Convict some owners? All? None? Problem is, judge is impressed with first poster's proof, and can't run the risk of convicting someone who might afterward prove that his pad was innocent. Presumption of innocence.

    Even if judge merely issues an injunction that the pads be taken off the network, every owner appeals on the ground that the pads were reused in making some other messages (innocuous ones) and that removing them would be a serious breach of first amendment (or whatever you call this thing about free speech).

    Anyhow, this is the summary: there's nothing new or revolutionary about the whole pad system; in fact, it's pretty trivial. But it does make one point: that information is fundamentally delocalized and that any attempt to pinpoint it or to find a culprit will fail. For better or for worse.

    1. Re:Some replies to various criticisms by David+A.+Madore · · Score: 3

      Yes, but the most recently created pad is not necessarily the culprit. It can be a good strategy to create a provably innocent pad (I explained how this can be done in various ways), XOR it with the rest and delay its publication until much after the others. If anyone tries to pull the "latest created pad is the culprit" argument on you, then you show he's a fool by explaining the way it was created (you can really make someone look like a fool if he tries to condemn you for publishing a sequence of the decimals of pi or an encrypted version of a part of the Bible!).
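
      To make the strategy concrete, here is a sketch under one assumption: the "provably innocent" pad is derived from a public seed with SHAKE-256, standing in for pi digits or an encrypted public-domain text. The seed and variable names are invented for the example.

```python
import hashlib
import secrets
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("length mismatch")
    return bytes(x ^ y for x, y in zip(a, b))


def innocent_pad(public_seed: bytes, length: int) -> bytes:
    """A pad anyone can regenerate from a public seed, so its origin is checkable."""
    return hashlib.shake_256(public_seed).digest(length)


data = b"controversial!!"
noise = secrets.token_bytes(len(data))
innocent = innocent_pad(b"The Tempest, first 128kb", len(data))

# The pad that actually absorbs the data is computed now and published early.
closing = reduce(xor_bytes, [noise, innocent], data)

# Publish noise and closing first; hold the innocent pad back and release it last.
recovered = reduce(xor_bytes, [noise, closing, innocent])
```

      Whoever blames the last pad published ends up blaming the one whose construction can be reproduced by anyone from the seed.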

  23. Randomness is available, and selecting is easy by jsm · · Score: 2
    Problems (a) and (b) are easily solved:

    (a) In a Slashdot discussion a few weeks ago, someone pointed out that Intel and possibly other CPUs provide an analog white-noise random data source, providing something like 75K/second of random data.

    (b) If you need a random number between 1 and 50 billion, then use rand(). Humans should never try to pick random numbers on their own; there are too many biases and patterns.
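
    As a hedged illustration of (b), assuming Python's secrets module (which draws on the operating system's entropy source) rather than a C-style rand(), since many rand() implementations return fewer bits than a 50-billion range needs. The helper names are mine.

```python
import secrets


def random_in_range(low: int, high: int) -> int:
    """Uniform integer in [low, high], both ends included."""
    return low + secrets.randbelow(high - low + 1)


def random_pad(n_bytes: int) -> bytes:
    """A pad of n_bytes taken from the OS entropy source."""
    return secrets.token_bytes(n_bytes)


pick = random_in_range(1, 50_000_000_000)
pad = random_pad(128 * 1024)  # 128kb, the pad size discussed in this thread
```

    Whether OS-provided randomness is "truly random" enough for a one-time pad is a separate argument; this only takes the human out of the loop.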

  24. Can you spell conspiracy? by www.sorehands.com · · Score: 2
    In the CPHack case, the judge said "in active concert."

    There is conspiracy, where one hand does not need to know what the other hand is doing. They just need to have a common purpose: publishing prohibited data. And 3 or more of these can be considered RICO.

    Instead of worrying about bypassing the law, why not fight it and change it?

    Recognizing some of these lawsuits as abusive, slapp enough of the companies that bring them.

    If you slapp a company hard enough, the others would stop doing this. That is why I am fighting Mattel. When I win, and I will, I am wanting a large enough sum to make sure that other companies flinch when they think about trying to shut someone up with abusive litigation.

  25. Related idea by Elvii · · Score: 3

    I've come up with/been inspired with an idea to "encrypt" virtually any data, being near totally unbreakable unless you torture the sender/receiver of that data. It's not pad/block based, it can be used with or without a computer, and the numerics/codes it uses are unbreakable by brute force, look random, yet they're not random or patterned.

    Can answer simple questions, but going to hold off on full blown explanation until mid-week when I have full sample code/implementation. It's not a hard system, just no time this weekend. Watch my site for more info as the week goes on, if you're interested.

    bash: ispell: command not found

    --
    This sig left intentionally blank.
  26. Excellent Point by FreeUser · · Score: 2
    Excellent Point.

    There is more than one kind of censorship:

    • Outright Government (Federal) Censorship (e.g. it is illegal to possess kiddie porn, to publish classified material, etc.)
    • Outright Government (State and Local) Censorship (e.g. Cincinnati's witch hunt of the Mapplethorpe exhibit, Larry Flynt, etc.)
    • Structural Censorship (e.g. Copyright prevents people from publishing another's work without permission, allowing the Church of $cientology to silence many citations of its works by critics, trademark laws restrict how one may refer to a corporate entity, etc.)
    • Institutional Censorship ("We won't display/print/publish that, it would offend too many, cause a lawsuit, etc.")
    • Corporate Censorship (threats of lawsuits, often based on dubious claims of trademark or copyright infringement with little or no legal basis, i.e. Legal Thuggery)
    • Social Censorship ("We don't like your kind around here!")


    I've probably missed some other forms of censorship, but you get the idea.

    Clearly, there is no technological solution that will solve all of these forms of censorship, and as others have pointed out, no technological solution can substitute for political involvement in preventing these kinds of abuses.

    Nevertheless, this sort of thing, coupled with a FreeNet infrastructure, could at least alleviate both Institutional (ISPs) and Corporate Censorship by making it too expensive to pursue. It won't win the war, but it could be decisive in a few important battles.
    --
    The Future of Human Evolution: Autonomy
  27. Aliens by Hard_Code · · Score: 2

    [spaceship lands on the burning ruins of a once flourishing planet]

    [2 aliens come out of the ship]

    Alien1: Wow...this planet is in ruins, but from the wreckage I can guess that once a prosperous and flourishing culture lived here.

    Alien2: No...I searched all recorded data and only found meaningless random garbage. Let's go home.

    [aliens enter ship and fly away]

    --

    It's 10 PM. Do you know if you're un-American?