Usenet Encoding: yEnc
Motor writes "Anyone remotely interested in usenet binary newsgroups must have noticed the spread of yEnc. yEnc is an encoding scheme for usenet binaries which avoids the enormous (30-40%) bloat associated with the schemes currently in use - which all have to produce 7-bit data to stop ancient newsservers from choking. A good thing, surely? Well, not according to some people. The guy has some good points about yEnc and standards, but I can't help thinking that "standards" people have endlessly discussed better encoding schemes, and nothing has come out of it. yEnc may not be perfect, but it works and it's here - hence the rapid adoption. What do you think?"
The article raises some interesting points about why yEnc shouldn't be adopted... none of which will probably keep the community from adopting it, however. If it's here, and being used, that is a whole lot more inertia than common sense can usually gain. Er, Betamax, anybody?
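To put rough numbers on the 30-40% bloat mentioned in the submission, here is a minimal Python sketch that compares base64 and uuencode with a yEnc-style transform. It is an illustration, not the reference implementation: `yenc_like` is a hypothetical helper that only models the byte transform as I understand it from the public yEnc draft (add 42 modulo 256, escape NUL, LF, CR and `=`). It skips the `=ybegin`/`=yend` header lines and the CRC, and the 128-character line length is an assumption.

```python
import base64
import binascii


def uuencode_body(data: bytes) -> bytes:
    """uuencode body lines only (no begin/end lines), 45 input bytes per line."""
    return b"".join(
        binascii.b2a_uu(data[i:i + 45]) for i in range(0, len(data), 45)
    )


def yenc_like(data: bytes, line_len: int = 128) -> bytes:
    """Hypothetical sketch of a yEnc-style transform (assumed from the draft).

    Each byte is shifted by 42 modulo 256. The four critical values
    (NUL, LF, CR, '=') are written as '=' followed by the value shifted
    by a further 64. Headers and CRC are deliberately left out.
    """
    out = bytearray()
    col = 0
    for b in data:
        c = (b + 42) % 256
        if c in (0x00, 0x0A, 0x0D, 0x3D):
            out.append(0x3D)             # escape marker '='
            out.append((c + 64) % 256)   # escaped value
            col += 2
        else:
            out.append(c)
            col += 1
        if col >= line_len:
            out += b"\r\n"               # line break between output lines
            col = 0
    return bytes(out)


def overhead(encoded: bytes, raw: bytes) -> float:
    """Fractional size increase of the encoded form over the raw bytes."""
    return len(encoded) / len(raw) - 1.0


# Sample input in which every byte value is equally common.
raw = bytes(range(256)) * 400
for name, encoded in [
    ("base64", base64.encodebytes(raw)),
    ("uuencode", uuencode_body(raw)),
    ("yEnc-like", yenc_like(raw)),
]:
    print(f"{name:10s} overhead: {overhead(encoded, raw):.1%}")
```

On input like this, the two 6-bit encodings land in the 30-40% range, while the yEnc-style output grows by only a few percent. The price is that the output is 8-bit data, which is exactly what the standards argument in this thread is about.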
Instead of breaking the standard, code it right, wait till it matures, and use it then.
"It's such a fine line between stupid and clever" -- David St. Hubbins, Spinal Tap
I had scripts that automatically got some a.b* newsgroups, but after the invention of this bastardized yEnc piece of crap, all my scripts are broken, and I'm 2 months behind on data for our clients.
:) Only e-biz that's thriving still.
BTW, I work for a pr0n site
There was a market for this thing, and it spread like wildfire. It's too bad that no one made a better spec and program (the author alludes that there was plenty of time to do this). yEnc meets the "GOOD ENOUGH" criteria, thus it will be used, shitty, non-robust standard or not.
I'm all for standardisation... but sometimes it takes _forever_ to get something standardized. If someone writes a better product, they generally don't want to wait for it to be declared a standard, especially with something like uuencoding which has been around as long as usenet, and isn't going to be replaced in a hurry unless someone comes out and waves a product around yelling "hey try this. it works better". Ogg Vorbis isn't a standard by any means. Hell, it is still on RC3. _but_ a lot of people are using it because it has far better sound compression than mp3. You don't hear people complaining that Vorbis has jumped the standardisation process do you?
Personally I can't see why we can't just send the data as 8-bit binary. uuencode and similar encoding formats should have died out with UUCP years ago, since there is no physical reason why 8bits can't be sent over the wire anymore.
The question is: if the author of the article knew what it would take to implement a better binary encoding method, why didn't he do so? He even claimed that he had tried something like yEnc years ago, so he should be technically capable of coming up with a better way. So I urge him: stop whining and create a better competing format instead. Even if he doesn't have enough time to commit to such a project, he could just find some able bodies through the internet to do the actual work and act as an advisor or technical assistant to the project. yEnc is still in its infancy, so it is not too late to do something right.
Despite its problems, XMODEM took off because it filled a need, just as yEnc does. Nixon's complaint that shrinking files by 35% won't make Usenet any smaller because people will just post more files is beside the point; it's like saying getting a 35% salary increase won't help your finances because you'll just buy more stuff with the extra money. Most people want that extra 35%, and Jürgen stepped up to the plate and delivered it.
Thankfully, as far as I know, nobody railed against Ward Christensen the way Nixon does against Helbing. XMODEM's problems became obvious and the solution was to introduce YMODEM and then ZMODEM. XMODEM is still around, but its successors (and of course serial IP) have pretty much supplanted it. Ward's initial efforts are still deeply appreciated.
Yes there's the problem of legacy software, but a protocol that's only been around for a few weeks or months can't have that much of a legacy. The only programs that currently support yEnc are the ones whose maintainers react pretty fast to new developments, and those maintainers are likely to also quickly pick up any revisions/fixes to yEnc.
So the solution Nixon should be calling for is not a years-long bureaucratic standardization process that will get yEnc 1.3 entrenched while the standardization is happening. The solution is to fix yEnc's problems and release a new version as fast as possible, before the old version gets spread around too widely.
In one sentence, standards ARE important because they allow for the most people to get the most benefit.
I work in an industry that relies heavily on standards, and my job deals specifically with standards. Making sure that WE follow standards, and making sure that other vendors follow standards.
Sure, they're slow to develop. But they're the best for interoperability, and that's crucial. In my line of work (for a major Mobile Phone System NSS provider), I have to deal with other providers that have to follow the same standards we do. That allows both of our products to communicate. This gives the end consumer (i.e., Cingular, Sprint, etc.) the option to buy from different vendors. This forces us to make better products. This forces us to be more efficient. This forces our competitors to do the same thing. In the end, everybody wins.
The other alternative is what I see as the Micro$oft approach: Standards be damned, I'm going to do it this way, and f*ck everybody else. It's the same approach that gives you security holes in your browser, because, well, who needs the standards?
I can't believe I'm reading comments like "well, it's here and it works so what's the problem?"
The problem is the future.
The problem is the inability to send an SMS from a CDMA service like Sprint to a GSM one like Voicestream. That's what happens when you blow off standards.
The problem is the inability to read an M$ Word doc that was sent to a Linux user.
Ignoring standards and going off on your own (especially, going off BADLY on your own) just divides us.
Good standards help us all. They give us better products. They lower costs.
CD-Rs. FireWire. PCI. Countless others.
Besides, as the article begins by asking: Just what problem were they trying to solve?
The big savings on binaries is coming from .PAR files.
If you don't know what they are, then you haven't been on usenet for a while.
But essentially, it allows you to stripe your sets with parity so that you can lose up to "n" posts and the PAR programs can rebuild the missing pieces.
I believe this has helped the backbone tremendously.
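For anyone who hasn't met parity sets, the idea can be sketched in a few lines of Python. This is a deliberately simplified illustration, not how the PAR tools are implemented: as far as I know, real PAR files use Reed-Solomon codes, which is what lets them recover up to "n" lost posts. A single XOR parity block, shown here, can rebuild exactly one missing part. The function names are hypothetical, and all parts are assumed to be the same length.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(parts):
    """Build one parity block covering all parts of a post set."""
    return xor_blocks(parts)


def rebuild_missing(parts, parity):
    """Return the full set, given parts with exactly one entry set to None."""
    missing = [i for i, part in enumerate(parts) if part is None]
    if len(missing) != 1:
        raise ValueError("single XOR parity can rebuild exactly one missing part")
    present = [part for part in parts if part is not None]
    restored = list(parts)
    # XOR of every surviving part with the parity block yields the lost part.
    restored[missing[0]] = xor_blocks(present + [parity])
    return restored


parts = [b"part-one", b"part-two", b"part-3!!"]
parity = make_parity(parts)
damaged = [parts[0], None, parts[2]]   # the second post never arrived
print(rebuild_missing(damaged, parity) == parts)
```

The bandwidth point is that one extra parity post can stand in for whichever post goes missing, so nobody has to ask for a repost of the whole set.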
Of course, by doing this, yEnc is now a de facto standard, just like MS Word doc format . . . And what a good standard that is.
"Especially for file sharing"
No no no. Usenet is good for everything because a web site can be stopped with a single letter from an aggrieved party.
Try it sometime. Try setting up a website posting some copyrighted Scientology stuff. It will be up there about 24 hours.
Now, upload the same stuff to Usenet. It's out there, and NOBODY can stop it.
Do you understand why that makes usenet infinitely more powerful than the web?
Let's see...Napster is dead. Morpheus is dead. Morpheus part 2/gnutella is a zombie (perhaps it's just me - I have yet to have it actually pull down a file for me. It keeps telling me the other end is busy or something.) Even when they were alive, you'd get halfway through and someone would cut you off...or you'd find out that the 80M download that took a whole day was actually mislabeled.
On Usenet...sure - you don't get to search, and finding a file involves posting a request and hoping someone fulfills it, but you get bandwidth - assuming you want to pay for it...you get files that are there (assuming you have decent retention) and not dependent upon someone being online. And unless you have a crappy server, you don't get halfway through a download and someone decides to kick you off. And 99% of the time, what something is posted as is what it is.
As for the replication - well, there is no one point of failure. As well, you don't have one site getting the shit hammered out of it either. I pay $9/mo for usenet...I get three fast servers to choose from and some high GB limit.
If I had to go to the same server as everyone else, you'd have the same problem that Moviefone had when Star Wars tickets went on sale online - DOA - all with nice corporate control of content.
It should be pointed out that this site, linked from yEnc's own website, goes into more detail about the technical flaws of yEnc. The fact that it's linked from yEnc's own site is proof that the author is at least familiar with the concerns that people have with his implementation.
I personally still find it difficult to argue against the article author's point that THERE WAS NO RUSH to force yEnc out the door in such an unpolished form. After so many years of waiting for something better, why ignore the recommendations of those you are trying to help?
< tofuhead >
It is still the dark of night.
The key theme here is that people on usenet whine. A Universal Truth, as it were.
- A.P.
"Remember when the U.S. had a drug problem, and then we declared a War On Drugs, and now you can't buy drugs anymore?"
If, by any chance, you're transferring things over a modem (V.42bis's LZW) or an ssh VPN (zlib's deflate) or possibly other types of links, then you're probably not going to notice a difference anyway. The systematic encoding inefficiency that goes with base64 and uuencoding results in a substantial lack of entropy that will be picked up on and exploited by good compression algorithms. The end result won't be quite as good as having efficient encoding to begin with, of course, but it will be in the same ballpark. There's no way it'll be anywhere near a 33% difference.
This sounds like something that would have been useful 15 years ago before compression was widely used, and when people were still writing newsreaders. Now it looks like a waste of time and an excuse to get people to "upgrade" their software.
As copyright owner of this comment, I authorize everyone to defeat any technological measure which limits access to it.
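The compression argument above is easy to try out. The sketch below is a rough experiment, not a measurement of any real modem or ssh link. It assumes the payload is already-compressed binary data (modelled here as pseudo-random bytes), encodes it with base64, and then runs zlib's deflate over the encoded text the way a compressing link layer might. `link_sizes` is a hypothetical helper name.

```python
import base64
import random
import zlib


def link_sizes(raw: bytes) -> dict:
    """Sizes of the raw payload, its base64 form, and the deflated base64 form."""
    encoded = base64.encodebytes(raw)
    return {
        "raw": len(raw),
        "base64": len(encoded),
        "base64_deflated": len(zlib.compress(encoded, 9)),
    }


# Pseudo-random bytes stand in for an already-compressed binary (assumption).
rng = random.Random(0)
sample = bytes(rng.getrandbits(8) for _ in range(100_000))

sizes = link_sizes(sample)
for name, size in sizes.items():
    print(f"{name:16s} {size:8d} bytes  ({size / sizes['raw']:.2f}x raw)")
```

Because base64 text only uses 64 symbols plus line breaks, deflate can squeeze most of the encoding overhead back out. That only helps on links that actually compress, and it does nothing for the disk space a news server needs to store the encoded article.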
The "he's against it because it saves bandwidth" argument makes no sense. If it saves users a little bandwidth, it saves Supernews many many times that much bandwidth, lowering their costs (which means they don't have to charge users as much to provide the same service). It also saves disk space, meaning Supernews doesn't have to buy new disks quite as soon. And a good bit of Supernews' business is in the corporate (outsourced ISP) service, which they don't charge by the gigabyte (they have speed caps, not monthly download quotas).
The problem is that any savings are just an illusion; this is just a momentary blip in the growth of Usenet. Since yEnc doesn't have the 100% market penetration that uuencode and MIME have, people are more likely to post binaries in multiple formats, causing storage and bandwidth needs to increase, not decrease.
Can *anyone* look at the uuencoded, MIME-encoded, and other similarly mangled-into-6-bit, 70-character-per-line standards, and honestly tell me that Usenet was designed with binary file transmission in mind?
There are no Usenet binary transmission standards, just a few different hacks to make it work. If this guy's new hack makes it work better, good for him.
I lost sympathy about here:
A smaller encoding scheme gives us exactly one benefit: faster downloads and uploads for the users. It is not going to make Usenet smaller. It is not going to allow servers to increase retention. Do you really think people aren't going to post more, if they can do it faster? Of course they are. They're always going to post more, with or without yEnc [...] big deal.
So what he's saying is, in effect: "this system changes nothing, and is of no benefit, except that it makes more data available on the Usenet and gives users faster uploads and downloads. So it's worthless."
This guy obviously hasn't had to use a metered dial-up account for a while. A 33% saving on transfer times is an enormous benefit. I feel quite insulted by the way he seems to think it's of no importance, as if my time and money aren't worth anything. "What's the rush" indeed! I'd happily tear up MIME and MD5 tomorrow if it would speed up my transfers by a third.
If yEnc is so widespread, it can only be because there's a demand for it. And if there's a demand for it, why the hell shouldn't programmers support it? Last time I checked, RFC's weren't enforced by law. The Net has seen a million non-standard hacks, and has, for the most part, assimilated the good ones and outlived the bad. yEnc is by no means the worst, and it brings real benefits to tens of thousands of people every day. I say leave it alone - or if you have to oppose it, at least oppose it constructively, for Christ's sake!
More, smaller messages -- what difference do you think that would possibly make?
My answer to yEnc is brewing, and frankly, I'm not arrogant enough to push anything which is "my" answer to yEnc. There are a lot of people out there who know what they are talking about and it would be stupid not to listen to them. So, don't look for "my" answer to yEnc... look for "the" answer to yEnc, developed not by one hacker but by a group of people who know what they're doing.
Jeremy
In particular, there is no yEnc RFC, and yEnc does not use MIME, which is the agreed-upon standard for encoding binary attachments. Yes, uuencode is a gross grandfathered format, but it is still 7-bit clean.
Releasing problematic, improperly specified encodings that break internet protocols is not being a good citizen. "It works" is a poor justification. It does not work, and it breaks compliant software.
-Kevin
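"7-bit clean" has a concrete meaning that a short Python check can show. The sketch below compares one line of real uuencode output with a yEnc-style byte shift. The shift by 42 modulo 256 is my understanding of yEnc's core transform from its public draft and should be read as an assumption, and `is_7bit_clean` is a hypothetical helper.

```python
import binascii


def is_7bit_clean(data: bytes) -> bool:
    """True if every byte fits in 7 bits (values 0 through 127)."""
    return all(b < 0x80 for b in data)


payload = bytes(range(256))

# One uuencoded line (b2a_uu accepts at most 45 input bytes per call).
uu_line = binascii.b2a_uu(payload[:45])

# yEnc-style offset (assumed from the draft): add 42 modulo 256, no escaping.
shifted = bytes((b + 42) % 256 for b in payload)

print("uuencode 7-bit clean:", is_7bit_clean(uu_line))
print("yEnc-style 7-bit clean:", is_7bit_clean(shifted))
```

The shifted output contains bytes with the high bit set, so any software or transport that assumes 7-bit article bodies may mangle it. That is the compliance complaint in a nutshell.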
This just reminds me of the napster data format. Anybody ever read the reverse engineered specs? It's scary. It looks like it was designed by a monkey. And not a smart one.
yEnc sounds like a good idea, and a horribly bad implementation.
Because Usenet works, and an "efficient P2P system" does not exist.
It used to be that someone did something useful, then the community, through use choices, adopted it as standard. Then, if there were flaws, these would be ironed out with an updated standard, usually all or mostly backwards-compatible with the original implementation. It's gotten to where new standards are useless, either because companies (like, say, RealNetworks or MS) refuse to submit their protocols/formats for public use/review, or because the standards committees (say, for Java (before it was pulled) or the W3C) argue for years without actually doing anything.
I, for one, am happy to see a useful format publicly available.
-- Two men say they're Jesus. One of them must be wrong. - Dire Straits
Your essay is the best summary I've seen so far of the reasons not to use yEnc. You have done a service to those of us who have been annoyed with yEnc -- now we don't have to explain it to anyone, we can just point them to your essay.
So, be it resolved that yEnc leaves much to be desired.
However, if yEnc is the impetus which actually gets the community moving toward implementing a good, solid standard, then it will have served its purpose. Perhaps if we had had yEnc 5 years ago, we would have a standard already. But we didn't, and now we must pay the piper.
Since people aren't going to give up the advantages of yEnc without a substitute, the priority going forward is clear: to develop a better standard. If it truly is better (and not simply another hack) then ensuring its wide adoption shouldn't be too much of a problem. If, however, people can't be persuaded to switch, so much the worse for Usenet -- but no point in dwelling on doomsday scenarios. As you say, the cat is out of the bag, and all we can do is damage control.
In the very early 1990s I was using subnet, which has since been assimilated by Usenet, with UUCP and wazoo.
Even way back then there was vital interest in more efficient file transfer. uudecode simply sucked. Everybody agreed on this.
But what has happened in the last 12 years?
Nothing.
There are now more powerful uudecode implementations, but I haven't really seen any practical innovation in the basics.
I totally agree that yEnc is a quick shot with no thought put into it at all, and it's weird and so on. But the Usenet gods haven't done anything worth mentioning in the last 12 years, and they won't in the next five years either. They lost it and are now crying about not being asked. Sad fate, but graveyards are half full of indispensable people and half full of people who dispensed with them.
"Life is short and in most cases it ends with death." Sir Sinclair
7-bit died BEFORE THE PC. That's over 20 years ago. I seriously doubt there are any PDP-11s still on the internet, and AFAIK nothing else felt the need to pack three 6-bit characters into a 3 character file extension. Apart from that, it was a 16 bit machine and did 8-bit chars. Almost all other 7-bit hacks were dead before the PDP-11 was even launched. (except POCSAG pagers - now there is a seriously SAD protocol!)
The best answer is don't compress for transmission - let the modem/imodem/NIC/whatever do it IF THERE IS A SUITABLE STANDARD, and if not, let the appropriate standards committee fix it, cos they have the tools for negotiating a compatible standard at run time.
Sent from my ASR33 using ASCII