Slashdot Mirror


Usenet Encoding: yEnc

Motor writes "Anyone remotely interested in Usenet binary newsgroups must have noticed the spread of yEnc. yEnc is an encoding scheme for Usenet binaries which avoids the enormous (30-40%) bloat associated with the schemes currently in use - which all have to produce 7-bit data to stop ancient news servers from choking. A good thing, surely? Well, not according to some people. The guy has some good points about yEnc and standards, but I can't help thinking that "standards" people have endlessly discussed better encoding schemes, and nothing has come out of it. yEnc may not be perfect, but it works and it's here - hence the rapid adoption. What do you think?"
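The 30-40% figure can be checked with a little arithmetic on how the 7-bit schemes pack their lines. The following is a rough sketch using Python's standard `binascii.b2a_uu` and `base64.encodebytes`; the helper name `overhead` is made up for illustration, and the lines here end in a bare LF, so CRLF on the wire would add slightly more.

```python
import base64
import binascii

def overhead(encoded_len: int, raw_len: int) -> float:
    """Return the size increase of the encoded form as a percentage."""
    return 100.0 * (encoded_len - raw_len) / raw_len

# uuencode packs 45 raw bytes into one line: a length character,
# 60 printable characters, and a newline.
uu_line = binascii.b2a_uu(bytes(45))

# Base64 (as used by MIME) packs 57 raw bytes into 76 characters plus a newline.
b64_line = base64.encodebytes(bytes(57))

print(f"uuencode: {overhead(len(uu_line), 45):.1f}%")
print(f"base64:   {overhead(len(b64_line), 57):.1f}%")
```

Both schemes spend four output characters on every three input bytes, so roughly a third of the growth is built in before any per-line overhead is counted.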

3 of 417 comments

  1. Re:Screw luddites by Reality Master 101 · Score: 0, Flamebait

    I hope you will be more considerate in the future.

    Not a chance. It's not my fault you have a bad Internet connection, and I don't think it's reasonable to hold back progress for some proportion of people.

    If this needs a solution, then the solution is to implement a "strip" protocol at the Usenet server level. But the solution is NOT to penalize everyone else who doesn't want to live in a world of monospace fonts.

    --
    Sometimes it's best to just let stupid people be stupid.
  2. Counterpoints to all of Jeremy Nixon's main points by Harumuka · Score: 2, Flamebait

    Uuencoding relies on searching for "magic strings" in the message body of a Usenet post. This is unreliable, error-prone, and has already led to problems with certain client software. It is absolutely the wrong way to go about tagging message content, because what you really want is something reliably machine-readable and precisely specified. However, yEnc also relies upon magic strings in the body.

    There is no reason to despise magic strings. They work, and cannot ever occur in the user data. All yEnc magic strings start with =y, = being the escape character. Ctrl-Y does not need to be encoded, so yEnc is free to use =y for its own purposes (e.g. =ybegin, =yend). Jeremy Nixon continues his misguided rant...
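To make the escape mechanism concrete, here is a minimal sketch of the basic yEnc byte transform as it is commonly described: add 42 to each byte, and escape the few resulting values that would confuse a news server by writing = followed by the value plus 64. The names `yenc_encode` and `yenc_decode` are made up for this example, and the sketch leaves out line wrapping, the =ybegin/=yend lines, CRC checks, and any extra escaping that real encoders apply at line edges.

```python
ESCAPE = ord("=")
# Encoded values that must never appear raw: NUL, LF, CR, and the escape itself.
CRITICAL = {0x00, 0x0A, 0x0D, ESCAPE}

def yenc_encode(data: bytes) -> bytes:
    """Apply the basic yEnc transform to raw bytes (no line wrapping)."""
    out = bytearray()
    for byte in data:
        value = (byte + 42) % 256
        if value in CRITICAL:
            # Emit the escape character, then the shifted value.
            out.append(ESCAPE)
            value = (value + 64) % 256
        out.append(value)
    return bytes(out)

def yenc_decode(encoded: bytes) -> bytes:
    """Reverse yenc_encode: undo escapes, then subtract the offset of 42."""
    out = bytearray()
    stream = iter(encoded)
    for value in stream:
        if value == ESCAPE:
            # The byte after the escape character carries the shifted value.
            value = (next(stream) - 64) % 256
        out.append((value - 42) % 256)
    return bytes(out)

every_byte = bytes(range(256))
encoded = yenc_encode(every_byte)
print(len(encoded), yenc_decode(encoded) == every_byte, b"=y" in encoded)
```

Because = only ever appears in the payload as the start of an escape pair, and none of the four escaped values maps to y, the sequence =y cannot turn up in encoded data under this scheme.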
    With a uuencoded multi-part post, client software typically uses the Subject line of the post to attempt to determine the filename, and to tell where the segment falls in the sequence. This is obviously a terrible way to do it.

    No, using the subject line is not obviously a terrible way to determine filenames, segments, and anything else. I find it very convenient to see in the subject line exactly what my yEnc files will be saved as, how big they are, and how many parts they come in. Nixon says "Sure, it works out most of the time, but it is imprecise and error prone (especially when spaces are used in filenames)". This is blatantly false nonsense. Quotes reliably allow clients to discern the filename. It's not "imprecise and error prone" by any stretch of the imagination.
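As an illustration of the quoting argument, here is a small sketch of how a client might pull a quoted filename and part counter out of a subject line. The subject layout used here (a quoted name, the word yEnc, then a part counter in parentheses) is an assumption for the example rather than a quotation from any specification, and `parse_subject` is a hypothetical helper name.

```python
import re

# Assumed layout: free text, "filename", the word yEnc, then (part/total).
SUBJECT_PATTERN = re.compile(
    r'"(?P<name>[^"]+)"\s+yEnc\s+\((?P<part>\d+)/(?P<total>\d+)\)'
)

def parse_subject(subject: str):
    """Return filename, part and total from a subject line, or None if no match."""
    match = SUBJECT_PATTERN.search(subject)
    if match is None:
        return None
    return {
        "name": match.group("name"),
        "part": int(match.group("part")),
        "total": int(match.group("total")),
    }

info = parse_subject('Summer trip "holiday photos.zip" yEnc (3/12)')
print(info)
```

Spaces inside the quotes are handled without trouble. A filename that itself contains a double quote would only be captured in part by this pattern.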

    When non-ascii characters are used in message headers, software currently just has to guess what they mean. Jürgen's filename specification cannot even be used to reliably reproduce his own name.

    I give them that. Non-US-ASCII data in headers is a pain, and large, powerful standards bodies need to agree on a character encoding standard. Oh wait, they already did - Unicode!


    but gives no method to specify a filename which happens to contain quotes, which is not uncommon

    False again. I've never had a filename containing quotes on my Windows box. If we expect newsgroup standards to reach everyone, we must use the lowest common denominator. Similar to how ISO 9660 used 8.3 filenames, but on a higher level.

    And the bandwidth savings? That's an illusion. A smaller encoding scheme gives us exactly one benefit: faster downloads and uploads for the users

    Which is exactly what the creators of yEnc intended.

    Meanwhile, the transition creates confusion for the users

    They mean "AOL users" of course. Usenet hasn't had a new encoding format in 6 years; it's about time. Adopting this format should be as easy as switching from Napster to OpenNap to Morpheus to Grokster to Blubster and so on.



    When Jürgen found that going through an actual standardization process within MIME would take time, he chose to ignore MIME in favor of getting something out there right away.

    I don't blame him. Jürgen is a coder, not a politician. I would have done the same thing.


    In short, yes, I agree yEnc needs to be more polished. But the point is it works right now, and it's working great. It filled a gap in Usenet; it scratched an itch, to borrow an ESRism. Once yEnc is standardized as Y.32049 Annex D or whatever those standards organizations call it, we will use it. Until then, yEnc forever!

    --
    What do you think of MusicCity now?
  3. Consider the source by JohnA · Score: 2, Flamebait

    I'm not sure if it is still true, but I know that Jeremy Nixon (the author of the article) worked at Supernews (now ReMarq) as one of their chief engineers. Not to be jaded, but it stands to reason that he would be against a technology that will decrease the data transferred by customers who pay by the gigabyte.