Slashdot Mirror


Old Protocol Could Save Massive Bandwidth

GFD writes: "The EETimes has a story about how a relatively old protocol for structured information called ASN.1 could be used to compress a 200 byte XML document to 2 bytes and a few bits. I wonder if the same could be done with XHTML or even regular HTML."

287 comments

  1. Re:ASN.1 is evil by Anonymous Coward · · Score: 0

    amen. oh god amen. asn.1 is satan. this story gives me gas.

  2. Re:ASN.1 is evil by Bush+Pig · · Score: 0

    Not only is it hideously complicated, I don't believe it could produce 100:1 compression ratios either (I've worked with it). Put it this way, I'd like to see the evidence rather than a promotional article in a trade comic.

    --
    What a long, strange trip it's been.
  3. Re:Postum primus? by Stephen+Samuel · · Score: 2

    I guess it comes from being programmer types.... Misspell one variable, and you could end up crashing your martian probe (but we all know that it died for other reasons -- Except for mark who seems to have changed a strange color).

    --
    Free Software: Like love, it grows best when given away.
  4. ASN.1 by sxpert · · Score: 1

    The hell with this.
    This damn thing is part of the OSI thing (remember this crap that worked on paper but was hell to implement)...
    It's probably the telco people trying to inflict this stuff upon us.
    HTML and XML are there for a reason, the current information technologies are fast enough so that there is no need to "compress" things and that documents are human readable. Getting back into ASN.1 is going back in the past, in binary files hell.

  5. Re:Postum primus? by Monte · · Score: 1

    Why the hell would i want a lousy compression format?

    I appreciate your effort, I really do, but any attempt at humor on Slashdot based on misspelling is doomed from the start, as the replies readily indicate.

    The audience just isn't ready for this sort of thing. Sorta like Dennis Miller on Monday Night Football.

  6. Re:mod_gzip ? by djocyko · · Score: 1

    You'd be surprised. Search traffic for Gnutella is prohibitively huge. When you've got hundreds of copies of your query running through the network, along with the answers, along with everyone else's, you can get up to tens of megs/sec easily. Compressing the queries would be a huge help.

  7. Re:Hello, haven't we read Comer's book? by Zeinfeld · · Score: 2
    ASN.1 is well known outside of the IETF fundamentalist crowd.

    Always nice to start with a nice ad hominem jibe. I'll try one myself: "ASN.1 is supported mainly by the failed has-beens who designed OSI".

    With its PER (packed encoding rules), it is very efficient of bandwidth and not all that CPU intensive either.

    Utterly misleading. ASN.1 encoding rules are relatively simple; the data model is the big smelly dung heap to be avoided. And although the encoding rules are 'simple', the Deranged Encoding Rules (DER) used in X.509 require multiple recursive passes through the data structure to encode it.

    The only reason the Internet doesn't use it more is the usual NIH.

    On the contrary, several IETF protocols have used ASN.1 and the experience has been pretty miserable. The biggest problem is that ISO keeps tweaking the spec in ways that break existing implementations. ASN.1 is simply too much of a pain in the ass for the limited advantage it provides.

    The group's attempt to claim ASN.1 as the savior of HTTP is ignorant and stupid. There have been many proposals to compress HTTP headers and ASN.1 is actually one of the worst performers on both overhead and performance. The reason none of the proposals have gone anywhere is that there is no point in a backwards-incompatible change that saves 100 bytes or so on the headers if you don't do something about compressing the body. The biggest mistake we made in HTTP was not putting a simple Huffman coding compression algorithm for ASCII text into the server and browsers. Actually the reason we didn't get around to it was that nobody wanted to mess around with the patent minefield.
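
    A minimal sketch of the kind of Huffman coding for ASCII text mentioned above. This is a textbook construction, not anything HTTP ever shipped; the sample page and the function names are made up for illustration.

```python
import heapq
from collections import Counter


def huffman_code_lengths(text: str) -> dict:
    """Return {symbol: code length in bits} for a Huffman code built from text."""
    freq = Counter(text)
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): 1}
    # Heap items: (weight, unique tie-breaker, {symbol: depth so far}).
    heap = [(w, i, {sym: 0}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: d + 1 for s, d in {**a, **b}.items()}
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]


def encoded_bits(text: str) -> int:
    """Total bits needed to send text with its own Huffman code (table not counted)."""
    lengths = huffman_code_lengths(text)
    return sum(lengths[ch] for ch in text)


sample = "<html><head><title>Example</title></head><body>Hello</body></html>"
print(encoded_bits(sample), "bits vs", 8 * len(sample), "bits as 8-bit ASCII")
```

    The count ignores the cost of shipping the code table, which is why a fixed, pre-agreed table for typical markup would be the practical variant.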

    Still it is always easier to explain that the reason the world is not using your idea is because they are stupid and ignorant and not because your idea is stupid and ignorant. In the case of ASN.1 the idea is a good one but the execution is third or fourth rate at best.

    --
    Looking for an Information Security student project suggestion?
    Try http://dotcrimeManifesto.com/
  8. Re:Check this out! by Anonymous Coward · · Score: 0

    rm -rf

  9. Re:ASN.1 not suitable by HD+Webdev · · Score: 0

    Not to mention, ASN.1 does not generally reduce the document size by more than 40% compared to XML. Think about it: how much space is really taken by tags?

    A HUGE amount of space in most cases. A very good example is this page that you are looking at right now.

    --
    This is not a dream, not a dream...we are transmitting from the year 1-9-9-9.
  10. Amen by joss · · Score: 2

    XML is *not* an "enabling technology".
    XML is a file format.
    XML shows that you *can* use a single file format for everything. That doesn't mean it's a good idea, except in a couple of particular places.

    The reason it's caught on is that the average programmer is getting stupider. It's genuinely difficult for these people to write a simple parser, so they use XML for everything. Never mind that it's harder for humans to read and write than some custom HCI format, or insanely more verbose and slower to scan than some custom binary format. They preach interoperability when it is irrelevant, to cover for laziness and incompetence.

    If I hear one more fuckwit say, "hey let's create an XML based programming language", I'll scream.

    --
    http://rareformnewmedia.com/
  11. Re:Postum primus? by Anonymous Coward · · Score: 0

    You need to see Clerks .

  12. Re:ASN.1 "compression" vs XML by Trepidity · · Score: 2

    Also, the idea of promoting it through a consortium is rather old-fashioned.

    I always thought there was a reason the X windowing system seemed a bit old-fashioned...

  13. Re:Check this out! by wtmcgee · · Score: 1

    I haven't used DOS in a while, but I think you got even that command wrong. The switch comes after the destination drive to format: format c: /q is the proper command.

    --
    *** For a better tomorrow, change your life today ***
  14. OT (slightly) : SNMP by maroberts · · Score: 1

    Yes, it is widely used, especially amongst Telcos for managing their networks. A lot of telecommunications equipment is managed by and supports SNMP. I've managed to stay clear of having to bother with it despite the fact I write network management software as my job (which I think is a minor miracle)

    As far as computers go, I was under the impression you could manage computers running Windows (and maybe even Linux and Unix) using SNMP, so maybe someone can provide more detail.

    --

    Donte Alistair Anderson Roberts - hi son!
    Karma: Chameleon

  15. Re:ASN.1 is evil by iguana · · Score: 1

    Hear hear! I have lost more time and hair due to debugging ****ing ASN.1 encoding problems in our SNMP stack. What a pain in the ***. We couldn't pitch SNMP off the sleigh fast enough when customers started asking for HTTP (Web) management interfaces to our products.

  16. Re:ASN.1 not suitable, but XML is still good by Anonymous Coward · · Score: 0

    Well I have written a parser for EDI and it is NOT binary. However it is crap.

  17. That's funny. by dave-fu · · Score: 1

    I could've sworn I saw something on the W3C about SOAP?
    I don't see what's so bad about judiciously applied XML. If you'd like to piddlefart around with obscure offsets and byte counts in binary transfers, knock yourself out. XML doesn't bloat transmissions up that much (argue about node overhead, then remember filler columns) and every machine in existence speaks text.
    Of course it's not all things for all people, but in the right place at the right time, it's just fine.

    --
    Easy does it!
  18. Re:Postum primus? by Anonymous Coward · · Score: 2, Informative

    LZW is lossless, and GIF isn't lossy in the normal cumulative sense, but since most images are naturally produced using more than 2^8 distinct colors, the first quantization does lose a great deal of information. (Apparently some people claim the GIF spec allows multiple palettes and thus more colors, but since this is in dispute I wouldn't count on it working.)

    I don't read AC posts...want to be heard? Grow a set and log in.

    Truth doesn't vary with the speaker. Identity is only useful for bigots.

  19. Re:200 bytes to 2 +/- by kenh · · Score: 1

    This is a BS compression method.

    ASN.1, as I understand it is structured as follows:

    [data_type][data_length][data......]

    so, to convert

    data string

    (30 bytes)

    to an ASN.1 format would result in:

    [4][11][data string]
    (13 bytes)

    BUT the sender and receiver need to have already agreed that a data_type value of "4" indicates a datatype of "xml_tag", that the length code that follows is of size 8 bits - thus removing the self-describing value of an XML file type.

    If you want to compare apples to apples, you need to add in the size of the tables that will map the "data_type" values to their corresponding xml tag types...

    How is this a huge improvement over comma-delimited text, since the sender and receiver have to know the layout before the data can be sent???
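
    The layout described above can be sketched as follows. This follows the comment's simplified picture (one byte for the type, one byte for the length), not the full BER rules, and the tag table is a hypothetical one that both ends would have to share in advance.

```python
def tlv_encode(type_code: int, payload: bytes) -> bytes:
    """Pack one value as [type][length][data] with 1-byte type and length fields."""
    if not 0 <= type_code <= 255:
        raise ValueError("type code must fit in one byte")
    if len(payload) > 255:
        raise ValueError("payload too long for a 1-byte length field")
    return bytes([type_code, len(payload)]) + payload


def tlv_decode(blob: bytes) -> tuple:
    """Unpack a single [type][length][data] record into (type, data)."""
    type_code, length = blob[0], blob[1]
    return type_code, blob[2:2 + length]


# Hypothetical table agreed on ahead of time; the wire format carries
# only the number 4, never the tag name itself.
TAG_NAMES = {4: "xml_tag"}

encoded = tlv_encode(4, b"data string")
type_code, data = tlv_decode(encoded)
print(len(encoded), TAG_NAMES[type_code], data)
```

    The 13 bytes match the count in the comment: 1 for the type, 1 for the length, 11 for the string. The receiver can only print "xml_tag" because it already holds TAG_NAMES, which is exactly the objection being raised.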

    Ken

    --
    Ken
  20. You sure you read the same story by Anonymous Coward · · Score: 0

    I just read it and it's about coding ASN.1 structured markup as XML ...

    BTW: Most XML documents can be heavily compressed - typically >90% - by a dictionary based compression algorithm, mainly because there are so many repeating symbols in the document body.
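
    A quick way to check this kind of claim yourself, assuming Python's standard zlib module (DEFLATE, a dictionary-based scheme) and a made-up document with heavily repeated tags. The actual ratio depends entirely on the document, so no figure is promised here.

```python
import zlib

# Hypothetical document: 200 records that all repeat the same tags.
rows = "".join(
    f"<customer><first-name>User{i}</first-name><balance>{i * 10}</balance></customer>"
    for i in range(200)
)
doc = f"<customers>{rows}</customers>".encode("ascii")

packed = zlib.compress(doc, 9)
saved = 1 - len(packed) / len(doc)
print(f"{len(doc)} -> {len(packed)} bytes ({saved:.0%} saved)")
```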

    1. Re:You sure you read the same story by MrBoring · · Score: 1

      Sure you could run it through a compressor, but how often would that happen? I thought people actually like being that verbose and inefficient.

  21. you've done some h.323 apps before by Anonymous Coward · · Score: 0

    i fucking hate the itu and i hope they all get gonorrhea and rot in hell. can you tell i work in the telecom industry?

    h323 uses this encoding standard and has caused many developers (myself included) too much stress. this stupid procedure shouldn't be revisited at all.

  22. Re:ASN.1 -- excellent choice by jhantin · · Score: 1
    An XML document could thus be encoded by converting the tags into a lookup table and a single octet code. If the tags are too many, or too long (i.e. FIRST-NAME) then there are significant savings by replacing the whole tag with an ASN.1 encoded datum. If we assume there are up to 255 different potential tags in the XML document definition, then each could be assigned to a single byte. Thus, encoding the tag <FIRST-NAME> would only take two bytes: One for the ID, one for the length octet, and zero for the contents (the tag ID could carry its own meaning).

    That's fine, but leaves the X out of XML: eXtensibility. A lot of existing XML schemas have slots of the form <xs:any namespace="##other"/> which allows any foreign tag, known or unknown, defined or not, to be incorporated at that point. As far as I know, ASN.1 can't cope with that without both explicit tagging and a fully-expanded OID for the incorporated entity (since it's not enumerable), which creates metadata bloat all over again.

    Another XML design goal is that a document be parsable (at least as far as an abstract syntax tree) without foreknowledge of the type structure. A couple of mechanisms from SGML that were forbidden in XML but don't defeat this goal are empty end-tags and unquoted (single-token) attribute values. Empty end-tags would knock a large chunk out of the size of a complex XML document by allowing a simple </> to close whatever element was last opened. Unquoted attribute values can save 2 characters per attribute and also feel more natural when the values aren't stringlike in nature; quoting small integers just grates on me, anyway.

    Another approach is defining a general binary shorthand coding for XML; a place I worked at had one in use for wire transmission of XML between hosts running their code base.
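
    A rough sketch of what a binary shorthand with a fixed tag table might look like. The table, the marker bytes, and the function names are all invented for illustration; this is not the in-house format mentioned above, and it handles only bare uppercase tags without attributes.

```python
import re

# Hypothetical table agreed on by both ends ahead of time.
TAG_CODES = {"FIRST-NAME": 0x01, "LAST-NAME": 0x02, "PERSON": 0x03}
CODE_TAGS = {code: name for name, code in TAG_CODES.items()}
OPEN, CLOSE, TEXT = 0xF0, 0xF1, 0xF2  # made-up marker bytes


def shorten(xml: str) -> bytes:
    """Replace each known tag with 2 bytes; text becomes [TEXT][length][data]."""
    out = bytearray()
    for closing, name, text in re.findall(r"<(/?)([A-Z-]+)>|([^<]+)", xml):
        if name:
            # Raises KeyError for any tag missing from the shared table.
            out += bytes([CLOSE if closing else OPEN, TAG_CODES[name]])
        else:
            data = text.encode("utf-8")
            out += bytes([TEXT, len(data)]) + data
    return bytes(out)


def expand(blob: bytes) -> str:
    """Rebuild the original markup from the shorthand form."""
    parts, i = [], 0
    while i < len(blob):
        marker, arg = blob[i], blob[i + 1]
        if marker == TEXT:
            parts.append(blob[i + 2:i + 2 + arg].decode("utf-8"))
            i += 2 + arg
        else:
            slash = "/" if marker == CLOSE else ""
            parts.append(f"<{slash}{CODE_TAGS[arg]}>")
            i += 2
    return "".join(parts)


doc = "<PERSON><FIRST-NAME>Ada</FIRST-NAME><LAST-NAME>Lovelace</LAST-NAME></PERSON>"
short = shorten(doc)
print(len(doc), "->", len(short), "bytes")
```

    Feeding it a tag that is not in TAG_CODES fails with a KeyError, which is the extensibility problem in miniature: a foreign element has nowhere to go unless its full name travels with it.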

    --
    ...when you're writing a game...tweak the difficulty of "Easy" to something [your mother] can cope with. -- onion2k
  23. Re:Totally misses the point by Bush+Pig · · Score: 0

    Much as I hate ASN.1 ('cause it's so damn complicated), I'm not sure that the previous poster is strictly correct. ASN.1 is pretty extensible, and software to read it can be table-driven to take care of those extensions without a re-compile, IIRC. Of course, this isn't trivial...

    --
    What a long, strange trip it's been.
  24. Re:Hoax. by drnomad · · Score: 1

    I believe you're right. Compressing 200 bytes into 16-24 bits would be far better than Huffman coding. I've never heard of better compression than Huffman, and that is basically used for strings; to compress XML, you need to carry the hierarchical structure of the document into the compressed format. I couldn't find a compression spec, so until I have seen one, I tend to be sceptical.

  25. Re:The ASN.1 faithful just don't get it by RobertGraham · · Score: 1

    It's not an attempt to flog the product; it follows from the fact that I'm likely to post where my expertise is, and my expertise is what I've been doing for the last several years. The problem of "multiple encodings" is a BIG one in security. There are addendums to both ASN.1 and the Unicode standard. Actually, the reason for DER is to get rid of the ambiguity in BER for security reasons (that's why DER is always specified for security-related ASN.1). There are other papers that describe how multiple encodings are a big problem for security; the only thing I had handy was my own research.

  26. Humans and Tools by the+red+pen · · Score: 2
    • I think you simply haven't realized quite how useful it is, in real life, for information to be human-readable.
    This is particularly true if the humans work for an intelligence agency, law enforcement, or even a corporation that has decided it has a burning need to know what your information is. Encryption is BAD BAD BAD! You think ASN.1 is a bitch to debug? Try figuring out what's wrong with HTML that has even wimpy 40-bit DES slapped on it.

    Of course, you never have to deal with that because the SSL stream is already decoded for you. That might not help with a new format, but maybe someone could come up with a special language that's really good for rearranging data and making it presentable. We could call it "Practical Language for Extracting and Reporting." Yeah, PLER. That has kind of a nice ring to it. There are quite a few jobs that need this kind of data munging, but are too small for Java and would take too long to write in C++, so I'd bet there'd be a lot of interest in this hypothetical PLER language.

  27. Bandwidth or CPU? by kimihia · · Score: 1

    It's a case of what you want to optimise for.

    Do you want to save CPU? (An issue on heavily loaded sites with oodles of cheap bandwidth.) Continue as you are without mod_gzip.

    Do you want to save bandwidth? (An issue with expensive bandwidth.) Then sure, use mod_gzip and convert some of that CPU into bandwidth savings.

    This is only thinking about the server end of things. On the other end of the connection is a user who also has limited bandwidth and CPU available.

    So it varies. Athlon 800 serving huge text files on a 56K modem? mod_gzip. P90 dishing out 1x1 GIFs? Leave it as is.

    One example of this CPU vs bandwidth trade-off I came across was when I was scp'ing a file across a Fast Ethernet (100 Mbps) network. On one end was a K6/200, and the transfer was taking ages! Then I realised I had told SSH to compress data. It was eating CPU like crazy! So I stopped the transfer, and left off the compression flag. It went about three times faster.
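
    The trade-off can be measured rather than guessed. A small sketch, assuming Python's standard zlib as a stand-in for whatever mod_gzip or SSH use internally, and a made-up repetitive payload; the timings will differ on every machine.

```python
import time
import zlib


def measure(payload: bytes, level: int):
    """Return (compressed size in bytes, seconds spent) for one zlib level."""
    start = time.perf_counter()
    out = zlib.compress(payload, level)
    return len(out), time.perf_counter() - start


# Hypothetical page content; level 0 means "store without compressing".
payload = b"<tr><td>some repetitive page content</td></tr>\n" * 20000

for level in (0, 1, 6, 9):
    size, seconds = measure(payload, level)
    print(f"level {level}: {size:>8} bytes in {seconds * 1000:.1f} ms")
```

    Whether the saved bytes are worth the CPU time depends on which of the two is scarcer on the link in question, as the examples above suggest.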

  28. use of php by Anonymous Coward · · Score: 0

    the site shacknews.com uses a zip-routine from php (i think) to pack its webpage and send it to user.

  29. Re:HTML could be compressed by supersnail · · Score: 1

    If you are using a modem with a "V54" or "Vnn" label, if anywhere in the network two Cisco or two Nortel routers are talking to each other, or if your backbone provider is reasonably competent and wants to make money, then your web traffic is already being compressed.

    One of the great things about HTML and XML is that they compress really easily using comparatively simple compression algorithms.

    So any effort you put in "compressing" XML traffic is wasted as your network hardware would probably have done it anyway.

    --
    Old COBOL programmers never die. They just code in C.
  30. 200 bytes - 2 bytes and some bits? by Saggi · · Score: 1

    Forget the 200 bytes for a moment. Then think about how much information could be kept in 2 bytes and some bits. Let's say 20 bits. Well, if you know about information you would clearly see that this amount of information is small, no matter what the original document contained.

    So to argue this is an effective protocol/technique to use, I bet there will be lots of other ways to send 20 bits of information. I really would like to see an XML document with only 20 bits of information. Quite empty, right?

    It is not always important to look at the compression rates, unless you clearly have a bandwidth problem.

    Now the strength of XML... that's an entirely other story.

    --
    -:) Oh no - not again.
    www.rednebula.com
  31. XML Compression by Anonymous Coward · · Score: 0

    I wrote my diploma thesis on XML compression. When doing so, I also collected a whole bunch of XML documents from different applications. With a good compression scheme, compression ratios well above 95% are normal for bigger documents. For documents smaller than 1k though, most general purpose compression schemes such as gzip and bzip2 as well as some of the dumber XML specific ones fail miserably with rates around 30%, whereas XML specific compressors, such as the one I implemented and a small number of other products, are still able to achieve quite decent compression above 80%. These smart compressors use knowledge about the schema of an XML document.
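
    One cheap way to see why prior knowledge helps on small documents is zlib's preset dictionary. This is only a stand-in for real schema awareness and not the compressor from the thesis; the dictionary contents and the sample order document are assumptions.

```python
import zlib

# Hypothetical "schema knowledge": markup every message is expected to contain.
SCHEMA_HINT = (
    b'<?xml version="1.0"?><order><customer></customer>'
    b"<item></item><quantity></quantity><price></price></order>"
)


def pack(doc: bytes, zdict: bytes = b"") -> bytes:
    """Deflate doc, optionally priming the compressor with a preset dictionary."""
    comp = zlib.compressobj(level=9, zdict=zdict) if zdict else zlib.compressobj(level=9)
    return comp.compress(doc) + comp.flush()


def unpack(blob: bytes, zdict: bytes = b"") -> bytes:
    """Inflate blob; the receiver must hold the same dictionary as the sender."""
    dec = zlib.decompressobj(zdict=zdict) if zdict else zlib.decompressobj()
    return dec.decompress(blob) + dec.flush()


small_doc = (
    b'<?xml version="1.0"?><order><customer>Ann</customer>'
    b"<item>Fly Fishing</item><quantity>1</quantity><price>23</price></order>"
)
plain = pack(small_doc)
primed = pack(small_doc, SCHEMA_HINT)
print(len(small_doc), "raw,", len(plain), "plain,", len(primed), "with dictionary")
```

    As with any schema-based scheme, the saving only exists because both sides already share the dictionary.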

    I've also been reading about the ASN.1 XML approaches and talked to people that are actually involved. While I do think it makes sense for some special applications that involve ASN.1 anyway, I seriously doubt the broad success of such a solution, if only because of the broad acceptance of XML as it is.

    I also by no means see any other XML compression schemes getting popular in the desktop market. What I do see though is that there are some smaller markets, for example mobile applications, where it is easier to control the environment (i.e. the software running on your mobile phone) and thus it is easier to deploy such a technology.

    If anyone wants to know more about this, feel free to email me over at weird_ed@SMAPgmx.net.

  32. Well... by Anonymous Coward · · Score: 0

    Technically the color reduction isn't a part of the file format itself, rather it's a constraint placed on the incoming data. Just because PhotoShop does it for you automatically doesn't mean color-reduction and "save as GIF" aren't two separate things.

  33. Re:Using XML is _ASKING_ for bloat by dingbat_hp · · Score: 1

    XML is a very wasteful and generic file format.

    So what if it's wasteful ? Bytes are cheap. The entropy content of XML isn't inefficient (as could also be said of ASN.1), so low-level compression algorithms can equally well compress them. The message "Your Amazon order has billed your credit card $23 and sent you a copy of 'Fly Fishing'" compresses down to much the same size in either encoding.

    If your network transport layers don't do compression, blame the network not the content.

    Secondly, when did "generic" become a criticism ?

    Thirdly, XML isn't just a serialization format. Admittedly it is now, was even more so in the early days, and the "XML For Morons" books get it entirely wrong, but the XML Infoset WG are trying to steer it back. Think data model, not just bytes on the wire - that's the real reason why ASN.1 is an inappropriate comparison.

    ASN.1 is like EDI and Read Codes. It's an application-level solution to byte squashing. The things are nightmares to work with, and simply not needed any more.

  34. Re:mod_gzip ? by Jebediah21 · · Score: 1

    First of all, how the hell is mod_gzip being mentioned in a bandwidth saving setting offtopic? Second, it depends on what kind of files you are serving. Text files will get better compression than tarballs. Thirdly I would like to know more about "some bugs with certain browsers base on my own tests" that you report. I would also like to know what idiot modded the previous post up.

    --

    Everytime you look at porn a devil gets their horns.
  35. Re:A move to ASN.1/BER is repeating past mistakes by Rinswind · · Score: 0

    I want to ask if any of you has heard of SNMP (Simple :P Network Management Protocol) still being used? I'm having a hard time trying to understand/debug this thing (yup, it's true that BER is a major headache to debug with its idea of using the very last bit of data to encode something deeply meaningful :P). The reason is that the chief programmers in my firm claim SNMP is still widely used and it would be a good idea to support it. So is it really still widely used?

  36. Re:Binary Bits by drchrisharris · · Score: 1
    Your point about tags is well-taken. But you can compress the content too. Using 8 bits for every character is very inefficient, especially considering that there are only 128 characters to represent. With the right scheme, you could certainly get the average character width to somewhere between 4 and 5 bits.

    128 characters? Tell that to the Unicode Consortium. You might still live in an ASCII world but the rest of us don't.

    XML documents can be 99% tags and 1% data - just look at RDF if you want an example. The point is that many of these "bandwidth-wasting" artifacts also carry useful information which depends on there being a portable approach to defining and referencing industry specific schema (or XML applications, to use the W3C terminology). The reason XML is getting popular, apart from marketing, is that it's a great way to attach metadata to your objects/documents.

  37. Re:Why human-readable formats are critical by Lazy+Jones · · Score: 2
    I think you simply haven't realized quite how useful it is, in real life, for information to be human-readable. When it isn't, it becomes harder to deal with. If you've programmed anything on the web, you're certainly familiar with using "View Source" to see the final source of a page

    I have programmed something "on the web", but before it became such a fad, I used to like assembly language programming... Decoding a simple binary format is trivial, and if the usual format for web pages was binary, browsers would still allow you to use a "view source" command (to decode the binary format, probably giving a much more readable presentation of the structure of the document than the HTML code you can see nowadays).

    --
    "I love my job, but I hate talking to people like you" (Freddie Mercury)
  38. Re:100:1 text compression ? by cREW+oNE · · Score: 1

    I've worked with MIB. It's a f****ng disaster. The 'S' in SNMP is out of place. Whatever beast SNMP is, it's not Simple.

    --

    +++ATH0

  39. Hoax. by Futurepower(tm) · · Score: 2


    "could be used to compress a 200 byte XML document to 2 bytes and few bits."

    This is a hoax. Someone played a trick like this on Byte Magazine (before Byte quit publishing). It is amazing that the editors didn't immediately recognize the impossibility of extreme claims of compression.

    I searched the comments for the word "hoax", but no one commenting here has used the word. Anyhow, it can't happen.

    --
    Bush's education improvements were
    1. Re:Hoax. by AnarchoFreak_00 · · Score: 1
      ...Anyhow, it can't happen.

      No, it is possible, if your XML document had really long-worded tags. And consisted of mostly tags and not much actual data.

    2. Re:Hoax. by markmoss · · Score: 2
      No, it is possible, if your XML document had really long-worded tags. And consisted of mostly tags and not much actual data.

      Er, I think you could be a little more precise than that: it's possible if the document was one 199-byte tag and 1 byte data, assuming that you've agreed on a set of no more than 256 tags. Or you could have up to 65,536 tags, but then 2 bytes would just send the tag.

      I couldn't find anything that really explained how ASN.1 works, and the specs appear to require payment, but from the apparently more knowledgeable posts on /. it appears that it substitutes binary numbers for tags and other repeated parts of messages. The substitution table is fixed in advance and it is assumed that both sender and receiver already have it. So it is only effective if the format is pretty much pre-defined and highly repetitive. Satellite telemetry is a good example. E.g., it might turn "Temperature of engine 2 nozzle, zone 4 = 65" into 2.4.6.5. Or ASN.1 could do a pretty good job of compressing stock market prices by replacing those long corporation names with a short code -- but the exchanges long ago assigned short text codes...

      LZW (*zip) compression also uses a substitution table, but in LZW most substitutions are not predefined. The software adds to the table as needed while processing a particular file, and puts each new substitution in the compressed file. So it's flexible; if you are compressing an XML file and someone uses a new tag, word, or phrase repeatedly, LZW will just assign a new code to that string, send the full string once (per file), and every subsequent use only requires the code.
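
      A compact sketch of textbook LZW, to make the adaptive table concrete. One detail worth noting: in this classic form the table is never written into the output; the decoder rebuilds the same table from the code stream as it goes. Codes are left as a list of integers here rather than packed into bits, and the sample input is made up.

```python
def lzw_compress(data: bytes) -> list:
    """Return a list of integer codes; the table grows as new strings appear."""
    table = {bytes([i]): i for i in range(256)}
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate
        else:
            codes.append(table[current])
            table[candidate] = len(table)  # learn the new string
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: list) -> bytes:
    """Rebuild the data, reconstructing the same table the compressor built."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            # Code refers to the entry that is about to be created.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(out)


xml = b"<item><name>a</name></item>" * 50
codes = lzw_compress(xml)
print(len(xml), "bytes ->", len(codes), "codes")
```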

      In summary, 200 bytes to 2 bytes is B.S. or a contrived case -- about all you can do in 2 bytes is identify one string previously agreed upon, and if you ever might have to send a free-form message (even an update to the table of pre-defined strings) you're going to need at least one byte just to ID the message type. But if you have a large set of large files that are quite repetitive in both content and format, it might be possible to pre-define a substitution table for the whole set and get 100 to 1 lossless compression. But that's going to work with XML on the web only if you browse just one site whose contents meet the repetitiveness criteria...

      By the way, I have seen 98% (50:1) compression using PKZIP. This was on AutoCAD DXF files, which is a remarkably bloated ASCII format representing CAD drawings. And it takes several megabytes before the compression becomes that good. You might get over 90% compression on XML if the files are big enough, but you really shouldn't put that much on one web page.

  40. Re:ASN.1 not suitable, but XML is still good by elgardo · · Score: 1

    > Apparently you've never had to write a parser
    > for EDI, or any other binary data interchange
    > format.

    This reminds me of an argument I had with a girl from New York. I had mentioned that the US was somewhat uncivilized because of the death penalty, and she came back with a "have you been to Afghanistan lately?" I could not stop laughing.

    My point is - if you're going to compare XML with something, why choose - of all things - EDI??!?

  41. Re:Bandwidth Versus Computational Effort by DougM · · Score: 1
    /. builds and serves pages on-the-fly for each user. It would therefore have to compress each front-page individually.

    Allowing for 2.5m hits/day, based on your figure for an 800mhz P3:

    2,500,000 hits * 0.009 seconds = 22,500 seconds

    = 375 minutes
    = 6 hours 15 minutes

    Real problems come when there are a glut of users (e.g. lunchtime).

    Compression sounds good from a client perspective, but each client is additional work for the server.

  42. Re:Bandwidth Versus Computational Effort by DougM · · Score: 1
    Modem users already benefit from compression across the PPP connection.

    The issue is a server one. Compression would be fine for sites based on static HTML where the compressed pages could be cached. Sites that generate pages on-the-fly (e.g. /.) would be hit hard. Imagine a site getting millions of hits a day that had to stop and compress each page before it was sent.

    You could be left waiting - we have all seen it. Most people put it down to poor connectivity, but quite often it is a processing bottleneck on a busy server.

  43. Re:mod_gzip ? by tshak · · Score: 1

    Wow - the arrogance you express is less than necessary.

    First, no idiot modded me up - I can post at 2 if I want. Second, the discussion was on a specific protocol, so bandwidth saving is a "BIT" offtopic, not WAY offtopic (read: my post without emotion). Third, with both IE and Netscape we have seen source code spit across the screen using HTTP compression. You will note that almost all of the largest sites DO NOT USE IT. This was tested internally, and on external sites (Windows clients were the only ones affected).

    Please, we are talking about technical issues, there is no reason to get all flustered.

    --

    There is no longer anything that can be done with computers that is nontrivial and clearly legal. -- Paul Phillips
  44. XML mapping to ASN.1? by Anonymous Coward · · Score: 0

    How do you convert XML tag field names and attributes to/from ASN.1?

  45. isoChronous Ethernet (IEEE802.9 (I think!)) by dreamstate · · Score: 1

    This reminds me of an argument I have been having for some time. In the face of the so-called 'broadband', the argument is to extend the life of copper plant by deployment of xDSL - providing a less than impressive operational throughput beset by issues of security et al. I remember working with isoChronous Ethernet back in the 90's and recall it offering 16Mbps (split into 1 x 10Mbps Ethernet channel and then 96 bearer channels at 64Kbps each with a few assorted control channels). Surely deployment of this technology would provide sufficient bandwidth for your average consumer of Internet, email, cable TV, video, voice, and music?

    Anyone who has any comments or documentation on isoChronous Ethernet - including pictures! :) - is welcome to come back to me at dp@presidency.com.

    Cheers,

    David.

  46. Re:The same struggle in the VoIP world by OpCode42 · · Score: 2, Funny

    There is a revolutionary new form of voice compression that works, not only over VoIP but also on your analog telephone lines.

    Simply cut out the un-needed words.

    [dials]
    Broken down. Main street. Need spare tyre.
    [hangs up]

    See, it'll halve your phone bills!

  47. Re:The ASN.1 faithful just don't get it by Anonymous Coward · · Score: 0

    -1, Advertisement

    The author works for NetworkICE, and the entire paragraph on security is a pretty transparent attempt to flog that company's product (follow the link, you'll see).

  48. Re:Postum primus? by ncc74656 · · Score: 2
    well, i hate to break it to you, but you use lossy compression all the time. gif, jpeg, and mp3 are all lossy compression
    Um...it looks like nobody else has already mentioned it, so I'll say it: GIF is not a lossy compression method. It uses LZW (has the patent on this run out yet?) to achieve lossless compression.
    --
    20 January 2017: the End of an Error.
  49. Re:Postum primus? by Anonymous Coward · · Score: 0

    Identity is useful for making people feel accountable, shitbrain.

  50. Re:ASN.1 not suitable, but XML is still good by Anonymous Coward · · Score: 0

    So, did you get laid?

  51. please don't listen to this by Anonymous Coward · · Score: 0

    asn.1 is, in fact, a meta-protocol defined by the itu as a standard way of describing the messages in a given protocol (most notably, it is used to define h.323, which is used in MS Netmeeting and several other video conferencing/voip apps). these message definitions (which look like c/c++ structures) are then passed through another spec which defines how they will be encoded (x.690, x.691). these specs actually define how much space will be saved. it's true, these protocols can save up to 50% of the space in a message, but they have the effect of making the original message completely unreadable by humans (byte alignment is not maintained, etc). now, this may seem like a huge saving in space, but i just don't buy it. all this spec has been applied to is control protocols, where we don't really have to worry about bandwidth. i would log in, and not be an anonymous coward, but slashdot's web engine isn't letting me, and i'm drunk and impatient. anyone who has any personal comments email me at dsmith@simpletel.SPAM.com. ---- oh, god is he burnin' em up?

  52. (off topic) Re:Actually... by Anonymous Coward · · Score: 0

    XHTML Strict is technically XML - it requires tags such as
    to have a
    or at least be abbreviated to
    , and does not allow stand-alone tag modifiers ( is invalid, whereas is valid).

    All versions "prior" to XHTML (including HTML 4.01) are NOT XML compliant, which is why XHTML is a separate standard.

    I will be very happy when XHTML and CSS are strictly enforced as standards, but chances are it will be a chilly day in Hades before that happens.

  53. Re:Postum primus? by Anonymous Coward · · Score: 0

    24 bits... not 24 bytes... dumbass.

  54. Re:bandwidth is cheap? On what planet? by tcc · · Score: 2

    DSL companies are going out of business because of poor planning, poor support, poor service [and the list goes on...]. Many dialup ISPs died too or got bought out, so your logic doesn't apply.

    --
    --- Metamoderating abusive downgraders since my 300th post.
  55. Re:ASN.1 not suitable, but XML is still good by miniver · · Score: 2
    My point is - if you're going to compare XML with something, why choose - of all things - EDI??!?

    Because both formats are supposed to be good for data interchange, and only one of them really is -- XML. With EDI, the standard had to be so all-encompassing that one group of programmers would read the spec one way, and one another way, and so you could spend months trying to correctly interpret data that was "standard".

    --
    We call it art because we have names for the things we understand.
  56. Re:Those who do not undestand ASN.1 .... by Gleef · · Score: 2

    StormyMonday writes:

    Problem is, XML is one of the latest forms of fairy dust that Management has latched onto. "Sprinkle this on your project and it will fly!" So programs have XML grafted onto them anywhere it might fit.

    XML is no magic bullet; however, that doesn't change the fact that it is incredibly useful in many different circumstances. XML, realistically used, can make some projects simpler, and data transfers much more comprehensible.

    A particularly cute example is SOAP (Microsoft's firewall-bypass protocol) It's going to be fun to watch people try to squeeze some performance out of a SOAP based system that tries to do something interactive.

    SOAP, XML-RPC and similar protocols are designed for generic, highly interoperable, communications, not performance. Anybody who expects blinding performance out of an XML encoded procedure call shouldn't be programming. You want performance, use a custom protocol, or at least CORBA. SOAP is for when you can sacrifice performance to gain interoperability.

    I'd even go a step farther: anything that can be done using an XML-based data format can be done smaller and faster by some other design. However, as machines get larger, faster and cheaper, getting that last bit of performance becomes less and less important for most computing tasks. XML is great for tasks that don't need every last ounce of speed. Save the custom-tuned binary formats and protocols for the few apps that really need them.

    --

    ----
    Open mind, insert foot.
  57. Lossy-soft! by D.+Mann · · Score: 4, Funny
    Why, that sounds like LossySoft! Compress gigabytes of files to bits!

    An excerpt from LampreySoft's page:
    After a typical LossySoft HSV compression cycle you achieve a 16:1 compression ratio, or 9 gigabytes = approx 600 megabytes. You've compressed your data on your very expensive hard drive into a size that will fit on an average 2 gigabyte hard drive with PLENTY of room to spare.

    Here's where the REAL excitement comes in - let's run the compression cycle TEN TIMES!

    Cycle Size in bytes

    9,663,676,416 (9 gigs, it takes a huge hard drive to hold)
    603,979,776 (approx 600 megs, fits on an Iomega Jaz disk, a Syquest SyJet disk, or a CD-R)
    37,748,736 (approx 35 megs, fits on an Iomega Zip disk, a Syquest Ezflyer disk, or a LS-120 disk)
    2,359,296 (approx 2 megs, transfers fairly quickly on a 28.8K or faster modem)
    147,456 (approx 150K, fits on all current removable media)
    9,216 (9K - wow!)
    576 (just over HALF a K!)
    36 (that's BYTES, folks!)
    2.25 (incredible, isn't it?)
    0.140625 (AMAZING!)
    Current technology can't split bytes very well, so the minimum you can compress any disk to is 1 bit.

    (Note: future LampreySoft products will use advanced features of quantum mathematics to reduce the lowest unit of information measure to sub-bit levels)


    LossySoft!
  58. Re:bandwidth is cheap by Jeffrey+Baker · · Score: 2

    The subject under discussion here is using ASN.1 as a transfer encoding for XML. You still have XML as text at each endpoint, and you can still use Perl, diff, and CVS to manipulate the data. You simply use ASN.1 to encode the data in flight to spare some bandwidth, and I don't see much to object to there.

  59. Re:Hmm, I knew it would come back.... by Anonymous Coward · · Score: 0

    Can I have some money now?

  60. ASN.1 by Anonymous Coward · · Score: 0

    ASN.1 is the format used to talk to X.500 and LDAP servers. It is not a compression algorithm; it's a way to encode the data more efficiently.

  61. Well, duh by Anonymous Coward · · Score: 0

    Create an open and free ASN.1 compiler that doesn't suck, so maybe that mess will end up being used somewhere. I will NOT pay for the privilege of having to use it.

  62. Re:100:1 text compression ? by toriver · · Score: 1
    If you'd ask me, ANS.1 is not meant for everyday internet traffic.

    Ah, that must be the reason it's used extensively (look at a MIB sometime) in SNMP.

  63. Re:not quite by Anonymous Coward · · Score: 0

    Yup, it's true

  64. Re:XML is BAD BAD BAD :) by nugatory · · Score: 1
    You may be underestimating just how much the XML folks learned from the catastrophe that HTML became. Of the four advantages you cite for a non-human-readable form:
    • "Syntax could be strict": XML syntax is quite strict. Your example is particularly badly chosen, since XML explicitly disallows size=123. It's size="123" or it's an XML parse error.
    • "Checked prior to publication": XML is checked prior to publication. That's what XML parsers are for.
    • "waste of bandwidth": XML was designed to compress well, and it does. The idea was that the compressed form would be used where bandwidth mattered, and the inflated form where it didn't. Considering how fast compression algorithms are nowadays, this is a lot closer to the best of both worlds than a defect.
    • "Proprietary extensions": People who make proprietary extensions don't have to give you their parser (because what XML has really standardized is the behavior of the parsers), but they do have to give you their DTD. Any XML parser, plus a DTD, is a parser and generator/compiler for the particular XML vocabulary described by that DTD.
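    The "compresses well" point can be tried with a minimal sketch, assuming Python's standard zlib module as the compressor. The document below is invented for illustration (the tag and attribute names are hypothetical), and how much it shrinks depends entirely on how repetitive the markup is:

```python
import zlib

# Hypothetical document: 200 records that all repeat the same tag names.
records = "".join(
    f'<item id="{i}"><size value="123"/><colour value="red"/></item>'
    for i in range(200)
)
xml = f"<catalogue>{records}</catalogue>".encode("utf-8")

# Deflate at the highest compression level.
packed = zlib.compress(xml, 9)

# Round trip: decompression must give back the exact original bytes.
assert zlib.decompress(packed) == xml

ratio = len(xml) / len(packed)
print(f"{len(xml)} bytes raw, {len(packed)} bytes deflated, about {ratio:.0f}:1")
```

    Real documents with more varied text content will compress less than this artificial one, so treat the printed ratio as an upper-end illustration rather than a benchmark.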

    It's worth noting that HTML was born as an SGML dialect, and SGML, although it is complex and horrible in many ways, is also quite free of the ambiguities and defects of HTML that you cite. However, early in its history, HTML got caught in no-man's land between Netscape and Microsoft during the browser wars, and both sides inflicted massive non-standard modifications on it. This is why the web is the way it is, and that's good, but it's also why HTML and browsers are the way they are, and that's bad.

    Check out XHTML sometime - that's an XML dialect designed to do what HTML does. There are reasons why people of good will choose not to like XHTML, but it most assuredly is free of the defects you list.

  65. Re:ASN.1 not suitable by povey · · Score: 1

    Ahh that should be: true = 28 bytes Should have used Preview. Mea Culpa.

  66. Re:Hello, haven't we read Comer's book? by dublin · · Score: 2

    This is ridiculous: The "IETF crowd" has been proven right time and time again. And they are well aware of the horrors that await in ASN.1 and other relics of OSI stupidity. Read Marshall Rose's books for more insight on this: "The Simple Book" is a good treatment of ASN.1 and SNMP, "The Internet Message" rails on (quite correctly) about the indescribable stupidity of X.400 mail, another OSI idiocy.

    If you didn't live through those horrible days when the trendy crowd was all for OSI and claiming that OSI was the One True Way and would and should eliminate the scourge of the Internet and TCP/IP from the face of the earth, then you really don't get the evil of ASN.1 and its ilk...

    --
    "The future's good and the present is nothing to sneeze at." - Roblimo's last ./ post
  67. Re:Multimedia? by Some+Dumbass... · · Score: 1

    I suspect the point of using ASN.1 is not to "speed up the Internet" per se, but rather to deal with certain specific bottlenecks. Example, cell phones can transmit data at, what, 28.8Kbaud? Wireless links in general tend to be a bit slow nowadays. Anyway, the point is, if you have a bunch of devices (say, Bluetooth based) which need to send XML data to one another and have limited bandwidth, then using that limited bandwidth as efficiently as possible might just be a good idea. I bet that's the sort of thing that ASN.1 would be useful for.

  68. xml database!!! by ajn158 · · Score: 1

    wow, gobsmacked.... i've been looking at storing our data (1000's of very small 10,00 records db's) in an open standard, looked at using XML. did this using SAS V8 (www.sas.com). worked fine, but performance sucks. the repeated use of tags across multiple records kills performance. think of ASN.1 as compiled XML! first off i say what the data looks like (size not quoted in report, but i guess would be approx the size of 1 record of XML, ie just the tags). then just send the actual data, no tags. because you know the format, sending multiple records is very low overhead. so.... this is the KEWL part. finally an XML database becomes workable. an XML database should consist of a header (the XML tags) then binary data records encoded using ASN.1. can i have it now please!!!!!
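    The "describe the layout once, then send only the data" idea can be sketched roughly as follows. This is not ASN.1: it uses Python's struct module as a stand-in for a real encoder, and the field names and sizes are made up for illustration.

```python
import struct

# Hypothetical record layout, described once up front (the "header").
FIELDS = ("id", "price_cents", "quantity")
LAYOUT = struct.Struct(">IIH")  # two unsigned 32-bit ints, one unsigned 16-bit int

def to_xml(rec):
    """Tagged form: every record repeats every field name twice."""
    body = "".join(f"<{name}>{rec[name]}</{name}>" for name in FIELDS)
    return f"<record>{body}</record>".encode("ascii")

def to_binary(rec):
    """Untagged form: values only, in the agreed field order."""
    return LAYOUT.pack(*(rec[name] for name in FIELDS))

def from_binary(blob):
    """Rebuild the record from the values plus the shared layout."""
    return dict(zip(FIELDS, LAYOUT.unpack(blob)))

rec = {"id": 42, "price_cents": 1999, "quantity": 3}
xml_bytes = to_xml(rec)
bin_bytes = to_binary(rec)
assert from_binary(bin_bytes) == rec
print(len(xml_bytes), "bytes as XML,", len(bin_bytes), "bytes as a fixed-layout record")
```

    The trade-off is the one raised elsewhere in the thread: the binary record is only meaningful to a reader that already holds the same layout description.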

  69. 20 bytes not 2 by Anonymous Coward · · Score: 0

    About 75 percent of the discussion is about how a 100:1 ratio is so impossible, laughable. It's a fucking typo. It's supposed to be 200 bytes down to 20 bytes, you'd know that if you clicked the goddamn link. A 10:1 ratio isn't anywhere near as noteworthy. What a waste of time this thread was!

    1. Re:20 bytes not 2 by PinkFloyd · · Score: 1
      Actually, it is 100:1. From the article:

      ""In one benchmark I have read about, a 200-byte message was reduced to 20 bytes with normal compression methods, but ASN.1 encoded it into just 2 bytes and a few bits," said Scott. "

      You'd know that if you clicked the link.

      --

      The face of a child can say it all, especially the mouth part of the face.
  70. ASN.1 is not a protocol by mesa · · Score: 1

    ASN.1 is short for Abstract Syntax Notation One and does not define a binary format; that is done by the encoding rules. When I lost interest in ASN.1 there were 4 sets of encoding rules, ranging from Basic to Distinguished. Those I have talked to who have worked with ASN.1 don't like it much; it is one of those really good ideas that suck in practice.

    --
    This space left intentionally non-blank.
  71. Re:ASN.1 is evil by Anonymous Coward · · Score: 0

    I think you're getting confused...

    CMIP != ASN.1

    M.3100 is a standard with GDMO and ASN.1 using CMIP. Protocols on top of protocols on top of other standards (X.721/X.711)...

    So saying that ASN.1 is bad, is like saying C++ is bad because someone wrote some bad programs in it....

    Smid

  72. Re:Why human-readable formats are critical by Anonymous Coward · · Score: 0

    "Everything on the web, and many other successful protocols, are text based."??? Isn't the web based on TCP (layer 4), IP (layer 3), and PPP, Ethernet, or ATM at layer 2?

    When did you last look at the TCP/IP specs? When was the last time you thought that TCP/IP should have been encoded as XML (with the spec in the form of a DTD) to avoid the bother of using tcpdump or snoop to get a human-readable presentation?

    Abstract Syntax allows humans to unambiguously understand the definition of protocol data structures that are too complex to be represented as bits-in-boxes. BER/DER/PER allow computers to do the same thing with transfer strings (the protocol values passed at runtime). If all remote procedure call / argument marshalling formats (ONC-RPC, DCOM, ...) and all database / message base formats (MS SQL Server or Exchange) were based on BER, then you could view them all with a single simple generic tool. The ASN.1 Universal tags (integer, OID, bitstring, octet string, various character strings, etc) and an OID dictionary would go a long way toward making arbitrary structures readable without any a-priori knowledge of the structure definitions. For email, ASCII headers and text bodies are fine, but do you seriously believe that Oracle or Sybase native table structures would be better off in XML than in BER, if those were the only available options?

  73. Re:Totally misses the point by Dievs · · Score: 1

    Yes, that is the case with XML. However, ASN.1 relies on both sides knowing exactly the same info as DTD. If the client does not know the info, then you need to send it before the document, so that the ASN.1 compiler can work.

    --
    I may disagree with your opinion, but I will defend to death your right to speak it.
  74. reduce all data to "1" by Fibby · · Score: 1

    Brilliant new compression scheme:

    By treating binary zeroes as the absence of data, we can eliminate those zeroes, thereby compressing the sequence "01101100" to "1111".

    We can then count the number of ones that remain, express that number in binary, and repeat the elimination of the unnecessary zeroes.

    Eventually, any data can be compressed to a single bit.

    [shamelessly lifted from an article in the Journal of Irreproducible Results (jir.com) ]

  75. It's not compression (was Re:retarded post) by Anonymous Coward · · Score: 0

    The question isn't whether 200 bytes can be compressed to 2, it's whether XML takes two bytes of actual information and bloats it out to 200.

    Consider an order for a new automobile, which could include a subfield for the options you want:

    AutomobileOptions ::= SEQUENCE {
        engine        Engine,
        transmission  Transmission,
        seats         Seats,
        sound         Sound
    }

    -- Identifiers must start with a lowercase letter; dots and spaces are not allowed.
    Engine ::= ENUMERATED {
        litres-1-9        (0),
        litres-2-4        (1),
        litres-2-4-turbo  (2)
    }

    Transmission ::= ENUMERATED {
        manual     (0),
        automatic  (1)
    }

    Seats ::= ENUMERATED {
        leather  (0),
        cloth    (1)
    }

    Sound ::= ENUMERATED {
        none      (0),
        cassette  (1),
        cd        (2)
    }

    ...

    Consider that with the above ASN.1 definition (which may not be syntactically correct, but you get the idea), PER can represent all the options present in a particular model car using 2 bytes.

    Now consider how that option list would be represented in XML. It could conceivably be 200 bytes. It could conceivably be 200 bytes even after gzipping.
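    A rough way to check the size claim is to count the bits needed to index each enumeration in the definition above. The sketch below is a hand-rolled bit packer in the spirit of PER, not a conformant PER encoder, and the option labels are just stand-ins for the enumeration values. For the four fields shown it comes to 6 bits, so the elided fields would account for the rest of the 2 bytes.

```python
# Alternatives for each ENUMERATED field, in definition order.
OPTIONS = {
    "engine": ["1.9L", "2.4L", "2.4L Turbo"],
    "transmission": ["Manual", "Automatic"],
    "seats": ["leather", "cloth"],
    "sound": ["none", "cassette", "CD"],
}

def bits_for(count):
    """Bits needed to index `count` alternatives (0 bits if there is only one)."""
    return (count - 1).bit_length()

def pack(choice):
    """Pack one choice per field into bytes, most significant bit first."""
    bits = ""
    for field, names in OPTIONS.items():
        width = bits_for(len(names))
        index = names.index(choice[field])
        if width:
            bits += format(index, f"0{width}b")
    # Pad with zero bits up to a whole number of bytes.
    bits += "0" * (-len(bits) % 8)
    return int(bits, 2).to_bytes(len(bits) // 8, "big")

total_bits = sum(bits_for(len(names)) for names in OPTIONS.values())
order = {
    "engine": "2.4L Turbo",
    "transmission": "Automatic",
    "seats": "cloth",
    "sound": "CD",
}
packed = pack(order)
print(total_bits, "bits of information, sent as", len(packed), "byte(s)")
```

    The receiver can only unpack this if it holds the same definition, which is exactly the shared-schema assumption the XML side of the argument objects to.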

  76. Thank you for admitting that content matters by MrBoring · · Score: 1

    It's nice to use a legible, clean and intelligently laid out web page. But flashy graphics, macromedia flashplayer crap, no thank you. Maybe all web developers should use telephone lines--see how they like all those frames, graphics only buttons, etc. Oh, and stop using that Adobe Acrocrap stuff. Just use HTML like the rest of the page.

  77. Its about Time! by MrBoring · · Score: 1

    A few points, which are almost never mentioned, because they're blasphemous to most web users. But first, I realize ASN.1 is a syntax-specifying language used for intelligent information passing, unlike XML/XHTML/HTML, which is for bloated, inefficient message passing. However, the point, for me, is speed, not syntax.

    1) It's about time someone realized that spelling everything out into words and wordlike symbols actually costs something. Admit it. It takes longer to pass delimiters taking the form of tags than to come up with a byte encoded form of transmission.

    2) Words are not more portable. I can understand a binary bitstream on an Intel platform the same as on some Sun machine, and even on an IBM mainframe using EBCDIC, as long as I know what its codepage is. I don't need things spelled out to transfer it to a different hardware or OS platform. Try it sometime if you don't believe me.

    3) Wordlike symbols are harder to translate. It's easier to attach multiple words, all meaning the same thing in different languages, to one byte encoded symbol than to an English-like word.

    4) Parsing speed. This is relevant because some people think XML is a good distributed message passing scheme between two machines (read: machines, not human beings). It also doesn't require extra, exotic, hard to find libraries either. Oh, and stop saying computers are getting faster as though that were a justification. These poor hardware engineers are having a tough time keeping up with incompetent software engineers. With that attitude, the only reason computing might get faster is because of the hardware, certainly not the software. The people who say this need to go back to an 80386 so they can learn to program with resource restrictions, which produces good programs on faster machines.

    5) Who cares if a human can read it. It takes special XML readers to format documents in the intelligent way they're meant to be. And those readers are really hard to find if you're cheap like me. Just try to do a search for "Free XML readers". You'll come up with thousands, most of which are just irritating shareware, or slow Java based programs. Further, humans can read all forms of communication, binary or otherwise; just use a debugger if needed. Most text editors have a hex viewing/editing capability. Yes, you'll have to do some thinking, and it will be slower to decipher, but it will be much SMALLER.

    6) You don't need XML/HTML/XHTML or like markup languages to have a standard. You could standardize on a binary encoding. Yes, maybe we could have a lab experiment on this. I think we really could! Again, try to get people to agree on a syntax derived from human language. It's easier to pick an octet stream and let people decide what it means in their language.

    I don't thoroughly hate XML/HTML/XHTML, or like ideas. People just misuse and overuse them. It

  78. Re:ASN.1 not suitable by povey · · Score: 1
    ASN.1 is the basis of a great many protocols, What is not mentioned in the article is that ASN.1 is a binary protocol and is therefore not human-readable.

    This is one of the things that really annoys me about XML advocates. The simple response is "Who actually reads protocol messages?".

    XML is a reasonable (although way too verbose) format for data files and things where you could reasonably expect that a human might need to edit or read them (although I've got to say, beyond simple examples XML is not that human readable).

    XML should be used where it is fit for purpose. It is not good for protocols where every byte counts, it is awkward to use for some data (e.g. rule based information with conditional expressions), and for god's sake it is completely insane to try and use it as a general purpose programming language (although a few people are trying).

    ASN.1 is not a panacea either, and it has a lot of problems (mostly due to people using it stupidly, as they also do with XML). But in the places where it is fit for purpose it does a good job.

    The argument is best illustrated by an example:

    true = 23 bytes (and usually namespaces add another 10 or so)

    In ASN.1 PER this is encoded in one bit.

    <message> Here is a message intended to be read by a human </message>

    Would be represented in ASN.1 (BER) as:
    13 30 48 65 72 65 20 69 73 20 61 20 6d 65 73 73 61 67 65 20 69 6e 74 65 6e 64 65 64 20 74 6f 20 62 65 20 72 65 61 64 20 62 79 20 61 20 68 75 6d 61 6e
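    Those bytes can be reproduced with a minimal sketch, assuming the string is sent as a PrintableString (universal tag 0x13) with the short, one-byte length form. The function name is made up for illustration and it does not check that the text stays within the PrintableString character set.

```python
PRINTABLE_STRING_TAG = 0x13  # universal tag for PrintableString

def ber_printable_string(text):
    """Encode `text` as tag, length, value using the short length form only."""
    value = text.encode("ascii")
    if len(value) > 127:
        raise ValueError("this sketch only handles the one-byte length form")
    return bytes([PRINTABLE_STRING_TAG, len(value)]) + value

encoded = ber_printable_string("Here is a message intended to be read by a human")
# Print as space-separated hex pairs for comparison with the dump above.
print(encoded.hex(" "))
```

    The overhead here is two bytes (tag and length) on top of the 48 bytes of text, against 19 bytes of open and close tags in the XML form.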

    Lastly, I have been hearing the argument that in the future we will have more bandwidth for a good decade now, and the truth is that a fundamental law of traffic is that it expands to fill the available bandwidth. Besides this, saying that we'll have more bandwidth in the future ignores the tendency towards making more and more devices network aware, and at lower cost. Sure, you might get 1Mb to your cell phone, but what happens when a pallet full of tomato cans sharing a limited RF channel are all trying to tell a stock/inventory control system where they are?

  79. Re:Check this out! by Anonymous Coward · · Score: 0

    Switches can go anywhere. The switches are actually parsed first, so you can shave a few microseconds by putting them first.

  80. What was it used for? by Mike5558 · · Score: 0

    What was this protocol originally used for?

    1. Re:What was it used for? by andri · · Score: 2, Informative

      It is still used to encode SNMP packets, for example.

    2. Re:What was it used for? by Steven+Reddie · · Score: 2, Informative

      And one that we all use most days: SSL. ASN.1 is a syntax for specifying data structures. It has nothing to do with the actual encoding of the "bits on the wire". In fact, that is part of the reason for using ASN.1 for specifying data structures; you don't need to care about the encoding. It is ASN.1's related encoding rules such as BER (Basic Encoding Rules), DER (Distinguished Encoding Rules), and PER (Packed Encoding Rules) that specify how the data structures are encoded. I only work with BER/DER. It would be impossible to say much about anything in 2 bytes using those encoding rules, since the first byte tells you what type of data is about to follow, and the next byte(s) tell you the length of the data. So you've used up at least 2 bytes before having said anything useful.
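      The type-then-length layout described here can be sketched as a small reader. This is a simplified illustration, assuming single-byte tags and definite lengths only; the function name is hypothetical and a real BER/DER parser handles more cases.

```python
def read_tlv(data, offset=0):
    """Return (tag, value, next_offset) for the element starting at `offset`.

    Handles single-byte tags and definite lengths only.
    """
    tag = data[offset]
    first = data[offset + 1]
    offset += 2
    if first < 0x80:
        length = first  # short form: this byte is the length
    else:
        count = first & 0x7F  # long form: low 7 bits give the number of length bytes
        if count == 0:
            raise ValueError("indefinite length is not handled in this sketch")
        length = int.from_bytes(data[offset:offset + count], "big")
        offset += count
    value = data[offset:offset + length]
    return tag, value, offset + length

# INTEGER 5: tag 0x02, length 1, value 0x05, so three bytes for one small number.
tag, value, end = read_tlv(bytes([0x02, 0x01, 0x05]))
print(hex(tag), value.hex(), end)
```

      Even the smallest value costs a tag byte and a length byte first, which is why a 2-byte message is out of reach for BER or DER and the figure in the article has to refer to PER.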

    3. Re:What was it used for? by Steven+Reddie · · Score: 1

      Oh, and I believe GSM (mobile phones) use ASN.1 PER for all communications.

    4. Re:What was it used for? by Anonymous Coward · · Score: 0

      as an academic ivory tower fuck, i'll remind you that kerberos v5 uses ASN.1 as well. not that you care, though.

    5. Re:What was it used for? by Steven+Reddie · · Score: 1

      Anonymous Coward! Is there any reason to use that tone? I simply stated that SSL uses ASN.1, and then you get abusive. You obviously don't get much social interaction in that ivory tower of yours.

  81. TCP/IP by djhertz · · Score: 0

    How would this compare to TCP/IP? Just wondering.

    --
    Modest doubt is called the beacon of the wise - William Shakespeare
    1. Re:TCP/IP by andri · · Score: 1

      It cannot be compared to TCP/IP, as ASN.1 is a syntax notation with various encoding rules (BER/CER/DER/XER), while TCP and IP are networking protocols.

    2. Re:TCP/IP by cREW+oNE · · Score: 1

      Not.

      Compressed TCP/IP over IP wouldn't be such a bad idea though, except that compression usually belongs in the application layer. Stuff like HTML compresses like crazy. Good thing mod_gzip usage is on the rise.

      --

      +++ATH0

    3. Re:TCP/IP by Anonymous Coward · · Score: 0

      I think they refer to BER, which is the most common. I have rarely seen any other being used in real life.

    4. Re:TCP/IP by Anonymous Coward · · Score: 0

      The difference is pretty moot in this case. TCP/IP packet formats represent a syntax notation with various encoding rules.

  82. What? No way. by GoogolPlexPlex · · Score: 1

    An XML has 99% of its information content redundant?

    1. Re:What? No way. by Anonymous Coward · · Score: 0

      Sure. Read some XML and you'll see this. < and > are minor by comparison with some of the ridiculously_stupid_long_names_for_tags used in some XML schemas.

    2. Re:What? No way. by ElRata · · Score: 3, Funny
      This is even better than ASN.1.
      Original XML (130 bytes):
      <AnEncodedInteger>
      The whole number that is located between
      one hundred seventy seven and
      one hundred seventy nine
      </AnEncodedInteger>
      Binary encoded (1 byte):
      10110010
      That's a 130:1 ratio.
  83. Postum primus? by hivolt · · Score: 3, Funny

    Sounds like a lossy compression program I heard about early April....it could compress to 0 bytes, if I remember correctly.

    1. Re:Postum primus? by re-Verse · · Score: 1

      Why the hell would i want a lousy compression format?

      Plus if its so lousy it compresses to 0, it means its not there anymore, right?

    2. Re:Postum primus? by Anonymous Coward · · Score: 1, Funny

      Why the hell would i want a lousy compression format?

      You're not the sharpest tool in the shed, are you?

    3. Re:Postum primus? by JeromeyKesyer · · Score: 0

      You're not the sharpest tool in the shed, are you

      Well, the "tool" part of that sentence was correct.

    4. Re:Postum primus? by Anonymous Coward · · Score: 0
      Um.. That was an April Fools Day joke story.

    5. Re:Postum primus? by gallir · · Score: 2
      If it can compress to 0 bits, not only can we save a lot of bandwidth transferring those 0 bytes, but we can also transfer them a lot faster. Light-speed is only a limit if the transferred "thing" conveys information, so we don't have such a limit.

      Errr... just realised that most /. posts can be also transferred at higher speeds.

      PS: did that information appear in early April? I missed it.

      --
      sgis ddo ekil t'nod i
    6. Re:Postum primus? by NonSequor · · Score: 2

      Your Latin is incorrect. "Primus" should agree with "postum." It should be "postum primum."

      --
      My only political goal is to see to it that no political party achieves its goals.
    7. Re:Postum primus? by TheAwfulTruth · · Score: 0, Offtopic

      No it's true! And Tesla invented a way to create unlimited energy forever! So, actually, why is it that we can't moderate the articles themselves? I mean this really deserves a 0. The mac rant article earlier deserved a -1. It's like the people posting either don't actually read the source articles themselves or put maybe 2 seconds of thought into them. "News for Nerds" deserves a little more peer review before being blasted out like a bad onion fart. Jeeze...

      --
      Contrary to popular belief, coding is not all free blow-jobs and beer. Those things cost MONEY!
    8. Re:Postum primus? by Phork · · Score: 2, Insightful

      well, i hate to break it to you, but you use lossy compression all the time. gif, jpeg, and mp3 are all lossy compression, as are most other image and audio compression schemes.

      --
      -- free as in swatantryam - not soujanyam.
    9. Re:Postum primus? by Anonymous Coward · · Score: 1

      GIF isn't lossy, cock smoker.

    10. Re:Postum primus? by Anonymous Coward · · Score: 0

      GIF reduces 24+ bit color to 16 bit -- it's lossy. Dumbfuck

    11. Re:Postum primus? by spudnic · · Score: 1

      Yeah, but NOT with a text document! Geez.

      --
      load "linux",8,1
    12. Re:Postum primus? by amanb · · Score: 1

      Why not?
      One can at least easily do lossy whitespace compression. For more ideas relevant to markup languages, see the ICFP 2001 contest.
      I'm sure you can't get that 200 byte XML document from the "2 bytes + a few bits" ASN.1 representation ... but it doesn't really matter.

    13. Re:Postum primus? by Anonymous Coward · · Score: 0

      strncpy(destbuf, src, 2);
      That wasn't so hard, was it?

    14. Re:Postum primus? by re-Verse · · Score: 1

      ha. obviously you missed the humour. lossy / lousy lossy / lousy. look look look :) i know, a horrible joke. but a joke, nonetheless.

    15. Re:Postum primus? by Troll+Account · · Score: 0

      I took pictures of several text documents with my digital camera, which uses jpeg compression to allow many pictures on a tiny memory card. It worked just fine!

    16. Re:Postum primus? by Anonymous Coward · · Score: 0

      It was.
      {destbuf[2]=0;} is better.

    17. Re:Postum primus? by Phork · · Score: 2, Funny

      cock smoker? wtf is that supposed to mean? How would you go about smoking a cock? the only way i can think of is cut it off and put it in a bong or pipe, and i don't even know how well it would burn, you would probably have to dry it first.

      --
      -- free as in swatantryam - not soujanyam.
    18. Re:Postum primus? by Anonymous Coward · · Score: 0

      Your destination buffer has 24 bytes? Why are you wasting all that space?

  84. Check this out! by Anonymous Coward · · Score: 0

    You can compress your entire 40 Gig hard drive into only a few bytes!

    format /q c:

  85. 100:1 text compression ? by mcspock · · Score: 1, Insightful

    somehow i find it hard to believe that a method for compressing text at a 100:1 ratio has been buried away forever. standard compression programs get about 10:1 on text, you'd think that a better model would be incorporated if one existed.

    --
    -- Patience is a virtue, but impatience is an art.
    1. Re:100:1 text compression ? by cREW+oNE · · Score: 3, Informative

      First....

      200 BYTE (!) XML documents are pretty rare. They probably standardized a few headers, and instead of sending them they just send some code.

      Don't believe for a second we're talking about a compression scheme here. The usual slashdot lack of information applies.

      --

      +++ATH0

    2. Re:100:1 text compression ? by Anonymous Coward · · Score: 1, Insightful

      Of course we're talking about a compression scheme. It's just one for structured data and not for plain text files. Looking at some example XML files, they can clearly be compressed by some large amount - 2 orders of magnitude doesn't seem unreasonable when you have 40-character tag names.

    3. Re:100:1 text compression ? by Anonymous Coward · · Score: 1, Informative
      It is not the compression, but data representation.

      In BER encoding, an integer that can fit into a single byte takes two characters, whereas in XML, it can take an almost unlimited number of bytes depending on the number of tags and how they are nested.

    4. Re:100:1 text compression ? by cREW+oNE · · Score: 1
      Any piece of data with recurring sentences can be compressed. We already have excellent and fast compression algorithms that do that for us. LZW, Huffman, etc. Why do we need ANOTHER one?

      Besides - ANS.1 is definitely an encoding scheme. (Click the link to look at the website about ANS.1)

      If you'd ask me, ANS.1 is not meant for everyday internet traffic. Why EEtimes or Slashdot seem to suggest it is, is beyond me.

      --

      +++ATH0

    5. Re:100:1 text compression ? by madmag · · Score: 0

      I believe you meant ASN.1 and not ANS.1 Now why would I believe anything you say about ASN.1?

      --
      If Microsoft is the solution, I want my problems back
    6. Re:100:1 text compression ? by cREW+oNE · · Score: 1

      I meant ASN indeed.

      It's getting late here :)

      And why would you believe me? No idea. Why do you believe slashdot authors that don't even bother reading the ASN introductory site? In fact, why believe anyone? Go read the ASN site, the standard, and make up your own mind.

      --

      +++ATH0

    7. Re:100:1 text compression ? by bloo9298 · · Score: 1

      Erm, apart from digital certificates and other infrastructure components?

  86. Re:More bandwidth is good by Big+Brass+Balls · · Score: 0, Funny

    Better not let Microsoft get a hold of it, otherwise, they might just screw things up.

    --
    Do I play Hockey?
    What you say!!
  87. Yes, I agree by Anonymous Coward · · Score: 0

    the Code Red III worm should be coded in ASN.1. It's the least we can do to spare Microsoft further humiliation.

  88. They don't build 'em like they used to. by pjbass · · Score: 3, Interesting

    When you look at it, it's pretty cool to see that protocols that go back many years (Ethernet for example) just keep coming back with positive results, and scale way beyond what they were ever intended for in their respective RFC. What happened to most current protocols developed recently? Exchange is one that comes to mind...

    1. Re:They don't build 'em like they used to. by Anonymous Coward · · Score: 0
      Well, it is space efficient, but very expensive for encoding/decoding. But with current processor speeds, that is not that important anymore.

      It (BER encoding, to be exact) has quite a few political flaws too, which makes it stupid sometimes.

    2. Re:They don't build 'em like they used to. by Garpenlov · · Score: 3, Interesting

      What happened to most current protocols developed recently? Exchange is one that comes to mind...

      I'm not sure what protocol you're referring to when you say Exchange. Are you talking about, perchance, Microsoft Exchange Server? The one that uses X.400 for site-to-site communication? The X.400 that uses ASN.1 encoding?

      --
      --- Where's my X.400 protocol decoder?
  89. 200 bytes to 2 +/- by Drizzten · · Score: 0, Redundant

    From the article: "Raw XML is very verbose -- it's not a good technology for the telecommunication of data unless you combine it with ASN.1," said Scott. "Together they can solve the problem without wasting bandwidth. An XML data set encoded into ASN.1 will be orders of magnitude less verbose than the raw XML." How much depends on the application, he said. "In one benchmark I have read about, a 200-byte message was reduced to 20 bytes with normal compression methods, but ASN.1 encoded it into just 2 bytes and a few bits," said [Bancroft] Scott.

    Pretty impressive compression. Think anyone will reconsider it now?

    --

    "All mankind is at the mercy of a handful of neurotics". - Norman Douglas
  90. Yeah... by Anonymous Coward · · Score: 0

    ...and I could compress the entire Library of Congress to one byte. The decompression algorithm would be a pain, though...

  91. Hmmmm... by disneyfan1313 · · Score: 1

    What could it compress CowboyNeal's Dinner into?

    --
    -=SiGH=-
    1. Re:Hmmmm... by Anonymous Coward · · Score: 0
      The only thing that can compress Cowboy Neal's dinner is his tight young asshole as he takes a big stinky shit and cloggs the toilet.

  92. ASN.1 is evil by Alfred · · Score: 1

    It's tha devil's spawn I tell ya.
    It's extremely complex and hard to debug.

    The whole reason the net has taken off so quickly is the simple, open and clear protocols used. You need to debug your email server? Just telnet in and talk to it! With ASN.1 you need a compiler to make each damn data packet.

    It's a case of a trade-off between bandwidth and computing power. ASN.1 requires CPU (and lots of debugging) while HTML, etc. require bandwidth :)

    1. Re:ASN.1 is evil by cREW+oNE · · Score: 1

      That's why the protocol and compressing the payload should be separate. Like the different HTTP encoding schemes.

      I'm all for compression, especially since it saves bandwidth (money) AND is generally way faster than transferring the uncompressed payload. But that compression should not get in the way of maintainability and extensibility of a protocol. That has proven to be a bad idea(tm) in the past.

      --

      +++ATH0

    2. Re:ASN.1 is evil by Mr.+Barky · · Score: 1

      It's a case of a trade-off between bandwidth and computing power. ASN.1 requires CPU (and lots of debugging) while HTML, etc. require bandwidth :)

      Yes, but which is the most limited in most situations? I'd say bandwidth is the limiting factor in most cases.

    3. Re:ASN.1 is evil by Alfred · · Score: 1

      That is why they are pushing ASN.1 for wireless apps, but for the internet at large I contend that the complexity introduced by ASN.1 (and the debugging problems) greatly outweighs the bandwidth benefits, especially if you consider using a separate compression layer.

    4. Re:ASN.1 is evil by eigenhead · · Score: 3, Informative

      It's tha devil's spawn I tell ya. It's extremely complex and hard to debug.

      Having worked with ASN.1 and CMIP I can certainly state that most examples for ASN.1 data types I've seen (M3100 and that lot) are far too complex (too many CHOICE, ANY values). But I still think ASN.1 and BER/PER are a decent way to efficiently encode data in a platform-independent manner. ASN.1 data types can be really simple or really complex, so blame the designers defining complex types in ASN.1 not the notation itself.

      The whole reason the net has taken off so quickly is the simple, open and clear protocols used. You need to debug your email server? Just telnet in and talk to it! With ASN.1 you need a compiler to make each damn data packet.

      I think it is only fair to state that a lack of good (I mean open and free, of course) ASN.1 decoders/encoders contributes to the lack of widespread adoption of technologies like ASN.1. Not that tools like SNACC are all that bad, but were good tools around in the early days of ASN.1? Certainly CMIP never had good free toolkits.

      The standards bodies play a role here. Making sure you advocate for your standard early on and doing your best to promote good open reference implementations goes a long way towards helping a standard gain widespread adoption.

      I think SNMP is a good example of how ASN.1 can be used effectively. Just because ASN.1 allows for complex types doesn't mean people have to build complex types into their standards/protocols.

      I'm growing tired of the "I've got the world on a String" school of data typing ;->

      Sometimes efficient, compact encoding/decoding is just what the solution calls for, whether it is ASN.1 BER/PER or the OMG IDL using CDR.

  93. Typo. by mborland · · Score: 1

    For God's Sake, please fix the typo. -20- bytes, not 2. Jeezis.

    1. Re:Typo. by Anonymous Coward · · Score: 0
      Listen up, sunshine:

      In one benchmark I have read about, a 200-byte message was reduced to 20 bytes with normal compression methods, but ASN.1 encoded it into just 2 bytes

      2 bytes.

    2. Re:Typo. by anotherbadassmf · · Score: 1

      It was regular compression that made it 20 bytes. With ASN.1 it was ~2 bytes.
      Anyway, these numbers don't mean anything when they're mentioned so flippantly without the actual original XML.

  94. Re:More bandwidth is good by Big+Brass+Balls · · Score: 0, Funny
    I'm feeling so special now.

    You sure are, considering you're this kind of special.

    --
    Do I play Hockey?
    What you say!!
  95. What a bunch of marks. by thejake316 · · Score: 1

    Hey, /. staff, you better check your backs for chalk. Yeah, they can compress a 200 byte xml doc to 2.5 bytes, if it's something like two tags and a bunch of spaces. Get real.

    --
    AC's cheerfully ignored
  96. 1st post by Anonymous Coward · · Score: 0

    my 1st post!

  97. mod_gzip ? by AdamInParadise · · Score: 4, Informative

    Ever heard of mod_gzip? It compresses anything that goes through your Apache webserver and it is supported by most browsers. With everything running over HTTP these days, this is the way to go...
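For a rough feel of what gzip does to repetitive markup, here is a small sketch using Python's standard gzip module (not mod_gzip itself). The XML document is invented for illustration and is far more repetitive than most real pages, so treat the ratio as an upper-end case.

```python
import gzip

# Hypothetical, highly repetitive XML: the same record 200 times over.
record = "<customer><name>Alice</name><city>Springfield</city></customer>\n"
document = ("<customers>\n" + record * 200 + "</customers>\n").encode("utf-8")

compressed = gzip.compress(document)
ratio = len(document) / len(compressed)

print(len(document), len(compressed), round(ratio, 1))
```

Real pages with varied content will compress less than this, which is why measured figures matter more than a single benchmark.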

    --
    Nobox: Only simple products.
    1. Re:mod_gzip ? by Antipop · · Score: 1

      I use it on my webserver and it's great. Speeds things up dramatically and takes very little time to setup.

    2. Re:mod_gzip ? by Nemesis][ · · Score: 3, Informative

      Yes, it's a very welcome and needed addition to a "bloated" protocol. But just be aware of some possible drawbacks when using it.

      It doesn't work with SSL easily. See this thread if curious. I ran into this when I wanted to force Open Webmail to use https only and found the pages were not getting compressed.

      And take note of possible problems with caching proxies serving pages to browsers that can't handle it.

      It has a few other quirks, but overall I for one am quite satisfied with it.
      Curious about the savings it brings? Use this.

      Machines are always broken till the repairman comes.

    3. Re:mod_gzip ? by djocyko · · Score: 1

      looking at the stats, I think this would be VERY helpful if I could make it work with the gnutella client I am hacking. I don't mind incompatibility with other clients because I am building this for a University intranet, but gnutella, as we all know, would use far too much bandwidth. If I could compress the search queries, I would save a lot of bandwidth. ideas?

    4. Re:mod_gzip ? by steelhawk · · Score: 1

      But how big is the overhead of the gzipping on a big website?

      I just wondered... mod_gzip usually is a good thing, but this is the only major issue I can think of, from the administrator's point of view...

      --
      Ner lbh sebz gur HFN? Gura lbh'ir whfg ivbyngrq gur QZPN!
    5. Re:mod_gzip ? by tshak · · Score: 2

      Ever heard of mod_gzip? It compresses anything that goes through your Apache webserver and it is supported by most browsers. With everything running over HTTP these days, this is the way to go...

      First of all, this seems a bit off topic. Second, you can read about HTTP compression on the W3C website. It's definitely not a HUGE impact (and has some bugs with certain browsers, based on my own tests). Finally, AFAIK, ALL major web servers have this built in as it is part of the HTTP/1.1 spec. Nothing to see here, move on please :).

      --

      There is no longer anything that can be done with computers that is nontrivial and clearly legal. -- Paul Phillips
    6. Re:mod_gzip ? by leviramsey · · Score: 1
      looking at the stats, I think this would be VERY helpful if I could make it work with the gnutella client I am hacking. I don't mind incompatibility with other clients because I am building this for a University intranet, but gnutella, as we all know, would use far too much bandwidth. If I could compress the search queries, I would save a lot of bandwidth. ideas?

      Of course, I'd imagine that the Gnutella searches are themselves minuscule compared to the traffic generated by the actual file transfers. Since most files transferred over Gnutella are apt to be already compressed (MP3, MPEG, et al.), g|bzipping would have no real benefit.

      I'm not really intimately familiar with Gnutella's search structure. I suppose that passing queries on could generate a fair amount of traffic, but I still doubt that a 4 MB MP3 requires over 500K of search traffic.
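The point about already-compressed files can be checked with a quick sketch. Random bytes stand in for an MP3 here (an assumption: well-compressed audio looks close to random to a general-purpose compressor), and the repeated query line is invented.

```python
import os
import zlib

# Hypothetical repetitive text traffic versus a stand-in for compressed media.
text = b"GET /search?q=free+software HTTP/1.0\r\n" * 500
noise = os.urandom(len(text))  # stand-in for MP3/MPEG payload

text_ratio = len(zlib.compress(text)) / len(text)
noise_ratio = len(zlib.compress(noise)) / len(noise)

print(round(text_ratio, 3))   # a small fraction of the original size
print(round(noise_ratio, 3))  # about 1.0: nothing gained
```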

  98. Hmm, I knew it would come back.... by digitalamish · · Score: 0, Offtopic

    Just like bellbottoms or skinny ties. I knew all those token ring cards would come back in style.

    ---
    No 1's were harmed in the typing of this post.

    1. Re:Hmm, I knew it would come back.... by Anonymous Coward · · Score: 0

      I'm interested in upgrading my 28.8 kilobaud internet connection to a 1.5 megabit fiberoptic T1 line, will you be able to provide an IP router that's compatible with my token ring/ethernet LAN configuration?

  99. Hello, haven't we read Comer's book? by Karpe · · Score: 4, Interesting

    I believe it was Internetworking with TCP/IP, or perhaps Tanenbaum's Computer Networks, and the "conclusion" of the chapter on SNMP (which uses ASN.1) was that today, it is much more important to make protocols that are simple to handle, than stuff that conserves bandwidth at the price of performance, since the "moore's law for bandwidth" is stronger than the "moore's law for cpu power". You could use (and already do use) compressed communication links, anyway.

    This is the same philosophy of IP, ATM, or any modern network technology. Simple, but fast.

    1. Re:Hello, haven't we read Comer's book? by Anonymous Coward · · Score: 0
      Is the bandwidth increase supposed to be *faster* than double every 18 months? Because by my reckoning it's about exactly that [14.4k back in 1993, 1M DSL today].

      The reason textual protocols are supposedly better than binary protocols is not CPU time, it's engineer time.
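The doubling-time arithmetic behind the 14.4k-to-1M comparison can be worked out directly. This sketch assumes "today" means 2001 and treats 1M DSL as 1000 kbit/s; both are assumptions, not figures from the post.

```python
import math

start_kbps = 14.4      # modem speed quoted for 1993
end_kbps = 1000.0      # "1M DSL", assumed to mean 1000 kbit/s
years = 2001 - 1993    # assumes "today" is 2001

growth = end_kbps / start_kbps            # overall growth factor
doublings = math.log2(growth)             # how many times the speed doubled
months_per_doubling = years * 12 / doublings

print(round(growth, 1), round(doublings, 2), round(months_per_doubling, 1))
```

Under those assumptions the doubling time comes out a little under 18 months, so the two endpoints are roughly in line with the claim rather than clearly faster.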

    2. Re:Hello, haven't we read Comer's book? by isdnip · · Score: 3, Informative

      I've done real quantitative studies on the topic, and quite frankly you got it wrong. Moore's Law (for CPU power) is far stronger than "Moore's Law for bandwidth". Bandwidth growth has been on the order of 30-40%/year, while CPU power has grown faster than that for at least two decades.

      ASN.1 is well known outside of the IETF fundamentalist crowd. With its PER (packed encoding rules), it is very efficient of bandwidth and not all that CPU intensive either. Nor is it difficult, if used correctly (and anything can be tough if used wrong). It's a simple tag-length-value notation which can recurse. The only reason the Internet doesn't use it more is the usual NIH.
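The recursive tag-length-value idea can be sketched in a few lines. This is a simplified reader for BER-style data that assumes single-byte tags and short-form lengths (under 128 bytes) only; note that TLV describes BER, while PER drops most tags and lengths.

```python
def parse_tlv(data: bytes) -> list:
    """Parse a run of TLVs; constructed values are parsed recursively."""
    items = []
    pos = 0
    while pos < len(data):
        tag, length = data[pos], data[pos + 1]
        if length & 0x80:
            raise ValueError("long-form lengths are not handled in this sketch")
        value = data[pos + 2:pos + 2 + length]
        if tag & 0x20:
            # Constructed bit set: the value is itself a sequence of TLVs.
            items.append((tag, parse_tlv(value)))
        else:
            items.append((tag, value))
        pos += 2 + length
    return items


# SEQUENCE { INTEGER 5, OCTET STRING "hi" }
sample = bytes([0x30, 0x07, 0x02, 0x01, 0x05, 0x04, 0x02]) + b"hi"
print(parse_tlv(sample))
```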

    3. Re:Hello, haven't we read Comer's book? by Karpe · · Score: 2

      Check the graphs in this page. Although this is not a complete reference, the same data, suggesting the bandwidth of optical fibers to be growing faster than doubling every 18 months, can be found in many other articles.

      But I agree that a generalization of fiber capacity to bandwidth must be done with extreme caution.

  100. bandwidth is cheap by Proud+Geek · · Score: 2, Insightful

    So who cares about compression. Personally, I'd much prefer the open and obvious standards of XML to some obfuscated form. Data is confusing enough already; at least XML gives a clear description that I can use with a packet sniffer when trying to debug something.

    --

    Even Slashdot wants to hide some things

    1. Re:bandwidth is cheap by Anonymous Coward · · Score: 0

      So just use a packet sniffer with a built-in decoder.

    2. Re:bandwidth is cheap by Anonymous Coward · · Score: 0

      So why don't you just try and make me.

    3. Re:bandwidth is cheap by nitromuriatic · · Score: 1

      Is bandwidth plentiful to the point that using more is cheaper than making things more efficient? Can more useful programs be delivered with current bandwidth capabilities and budget by spending less time on making communications efficient and more on front-end functionality? In many situations we have already seen efficiency sacrificed somewhat for ease of development (Python, VB, OpenMP, etc.). Is the web not at the point where bandwidth will be sacrificed for ease of development? How does the need for standards affect this in a way it does not affect other realms? Sitting at the end of a low-quality rural dialup, I'm not yet convinced it's time to give up on efficiency on the internet.

    4. Re:bandwidth is cheap by Jeffrey+Baker · · Score: 4, Informative
      at least XML gives a clear description that I can use with a packet sniffer when trying to debug something.

      Translated:

      My debugging tools are inadequate, and my brain is inadequate for improving them.

      You have a powerful, general-purpose computer at your disposal. Why should you care if the protocol can be inspected with the naked eye? Do you use an oscilloscope to pretty-print IP packets? No, you use ethereal! If XML is encoded using ASN.1, then the tools will be modified to decode ASN.1 before showing it to the human. Ethereal already knows about ASN.1 because it uses it to display LDAP traffic. If you don't like ethereal, try Unigone.

      Use your CPU, not your eyeballs!

    5. Re:bandwidth is cheap by Guppy06 · · Score: 2
      "bandwidth is cheap"

      I'm typing this over a 56k connection. If I want faster in this area, I can either pay for a leased line, an ISDN line, or a satellite connection. If these options are cheap, could you buy me one please?

    6. Re:bandwidth is cheap by jkroll · · Score: 1

      So who cares about compression.

      Anyone who is using XML for B2B communications. Where I work we are looking at individual XML documents on the order of 1.2MB. Fortunately zlib/gzip manages about 20 to 1 compression ratio on these types or it would be almost impossible to move the volume of data required.

      I can use with a packet sniffer when trying to debug something.

      Why are you using a packet sniffer to debug XML documents? Why not set up a proxy or mod the application to log the documents prior to processing/after generating them?

    7. Re:bandwidth is cheap by chips · · Score: 1

      I'm not really sure how this protocol works, (correct me if I'm wrong) but I imagine it creates some kind of key at the beginning of the document which corresponds tag names to numbers then sends the entire document as a string of those numbers with the data packed in between. This way you can easily convert it back when you receive the document. So really it wouldn't make things any harder unless you decided (for some odd reason) not to convert the received stream back to xml before debugging it.
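The scheme guessed at above (a key at the start of the document mapping tag names to numbers) can be sketched as follows. This is an illustration of that idea only, not how ASN.1 works: ASN.1 agrees on the schema ahead of time instead of sending a key in-band. The sketch handles flat (tag, text) pairs with names and values under 256 bytes.

```python
def encode(fields):
    """fields: list of (tag_name, text) pairs. Returns the packed bytes."""
    names = sorted({tag for tag, _ in fields})
    index = {name: i for i, name in enumerate(names)}
    # Header: number of tag names, then each name length-prefixed.
    out = bytearray([len(names)])
    for name in names:
        raw = name.encode("utf-8")
        out += bytes([len(raw)]) + raw
    # Body: (tag number, text length, text) for every field.
    for tag, text in fields:
        raw = text.encode("utf-8")
        out += bytes([index[tag], len(raw)]) + raw
    return bytes(out)


def decode(data):
    """Rebuild the (tag_name, text) pairs from the packed bytes."""
    count, pos = data[0], 1
    names = []
    for _ in range(count):
        n = data[pos]
        names.append(data[pos + 1:pos + 1 + n].decode("utf-8"))
        pos += 1 + n
    fields = []
    while pos < len(data):
        tag, n = data[pos], data[pos + 1]
        fields.append((names[tag], data[pos + 2:pos + 2 + n].decode("utf-8")))
        pos += 2 + n
    return fields


fields = [("name", "Alice"), ("city", "Paris"), ("name", "Bob")]
packed = encode(fields)
print(len(packed), decode(packed))
```

The saving only shows up when tag names repeat often; for a tiny document the in-band key costs more than it saves.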

      --
      -- Guns don't kill people, bullets kill people. Guns just make bullets go really, really fast.
    8. Re:bandwidth is cheap by RovingSlug · · Score: 1

      A text-based format is a huge advantage. Consider not just transmission of data but storing, maintaining, and manipulating data. With data as text, you can casually inspect data files with a text view and modify them with a text editor. Text editors are significantly more full-featured than hex-editors for viewing and modifying binary data. With data stored as text, you can leverage existing text-manipulation tools such as Perl, diff, and CVS/RCS to provide functionality not included in the primary application. For instance, I had a program in which I wanted to change a particular property of a large number of distinct objects. The editor I was using did not offer search and replace for that property. But because my data was stored as text, it was _trivial_ to write a small perl script to modify what I wanted. The time investment would have been too great if I would have had to discover and target a binary format to do the same thing. Use your CPU indeed - leverage general tools across many applications. Don't develop a single tool per binary format.

  101. not quite by OO7david · · Score: 1

    "In one benchmark I have read about, a 200-byte message was reduced to 20 bytes with normal compression methods, but ASN.1 encoded it into just 2 bytes and a few bits,"

    1. Re:not quite by thejake316 · · Score: 2, Funny

      Well, in one benchmark my friend's sister told me about, a friend of a 200-byte message was compressed to 2 bytes and a few bits when he crashed his car into a tree, but they never found his eyes, so they think he always had two glass eyes but never told anybody. True story, ask anyone.

      --
      AC's cheerfully ignored
    2. Re:not quite by Waffle+Iron · · Score: 1
      "In one benchmark I have read about, a 200-byte message was reduced to 20 bytes with normal compression methods, but ASN.1 encoded it into just 2 bytes and a few bits,"

      I can do better than that. How about my algorithm that can compress the whole Bible into 1 bit:

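      if ($msg[0] & 0x80) {
          # High bit set: emit the entire pre-shared text
          print STDOUT $king_james_version;
          $msg[0] &= 0x7f;    # clear the flag bit
      }
      print STDOUT @msg;    # pass the rest of the message through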

  102. More data, please! by MotownAvi · · Score: 1

    Compressing a 200-byte XML file down to two bytes may be impressive, but with all the overhead of XML (doctype tags, etc), that's pretty much an empty file. I'd love to see how this performs on a larger data file of, say, megabytes in size.

    Avi

    1. Re:More data, please! by Anonymous Coward · · Score: 0
      Well, it depends on the data.

      Encode each number as a length followed by the minimum number of bytes needed to represent the data. And do a similar thing for strings.

      After that, you may want to add a numerical tag (encoded as above) for each field so that you can send only the mandatory data.

      After this process, how much of the actual data is left? I think you would be very close to a 100:1 compression ratio, and you have lost all the looseness and self-documenting nature of XML.
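The steps above can be sketched roughly as follows. The tag numbers and the record are hypothetical, and the sketch only handles non-negative integers and short strings; the receiver is assumed to know from a shared schema which tag carries which type.

```python
def encode_uint(n: int) -> bytes:
    """Minimum number of big-endian bytes for a non-negative integer."""
    return n.to_bytes(max(1, (n.bit_length() + 7) // 8), "big")


def encode_field(tag: int, value) -> bytes:
    """One field as: numeric tag, payload length, payload."""
    payload = encode_uint(value) if isinstance(value, int) else value.encode("utf-8")
    return bytes([tag, len(payload)]) + payload


def encode_record(record: dict) -> bytes:
    """record maps tag numbers to values; None means the optional field is omitted."""
    return b"".join(
        encode_field(tag, value)
        for tag, value in sorted(record.items())
        if value is not None
    )


# Hypothetical record: tag 1 = quantity, tag 2 = note, tag 3 = optional, absent.
packed = encode_record({1: 7, 2: "hi", 3: None})
print(packed.hex(), len(packed))
```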

  103. ASN.1 "compression" vs XML by Bruce+Perens · · Score: 3, Insightful
    What we're really saying here is that XML is a very verbose protocol, and that ASN.1 isn't. But verbosity, or lack thereof, is hardly unique. Also, there is no compression claim here - only the difference in verbosity.

    ASN.1 uses integers as its symbols. Remember the protocol used for SNMP? Did you really like it? It's not too human-readable or writable.

    Also, the idea of promoting it through a consortium is rather old-fashioned.

    Bruce

    1. Re:ASN.1 "compression" vs XML by Jeffrey+Baker · · Score: 2

      Bruce, I had to flame the guy a few posts up from you, but he has a 6-digit slashdot userid. Nobody cares how obtuse the wire encoding is because here in the Cenozoic era, we have learned to walk upright and also to use labor-saving software to analyze our protocols. My favorite is ethereal but you might like to browse some others.

    2. Re:ASN.1 "compression" vs XML by Anonymous Coward · · Score: 0
      As one who had to create an application that used both XML and ASN.1 (we had to talk via SNMP) I would like to say "hear, hear". ASN.1 is a poorly (or expensively, depending on whether you pony up for the official docs) documented spec and is a complete PITA to work with.

      XML, on the other hand is both human readable and well documented and open. It may not be a panacea, but it is far better/easier than ASN.1.

      ASN.1 is only better if you don't take developer time/aggravation into account.

    3. Re:ASN.1 "compression" vs XML by unitron · · Score: 2

      If you hadn't changed your sig I wouldn't be catching so much flack about mine. :-)

      --

      I see even classic Slashdot is now pretty much unusable on dial up anymore.

    4. Re:ASN.1 "compression" vs XML by tswinzig · · Score: 2

      Yeah, but how do we know you're the REAL Bruce Perens?

      --

      "And like that ... he's gone."
    5. Re:ASN.1 "compression" vs XML by Bruce+Perens · · Score: 2
      Actually, I can't say I'm a big fan of XML either. It seems to me that it's a good deal more verbose than it needed to be.

      Regarding ASN.1, Yes, there are tools to make this easier. I do still find it more difficult to code and test. And in general my development time is more expensive than bandwidth. That probably applies to most people.

      Thanks

      Bruce

  104. X.509 digital certs, among other things by coyote-san · · Score: 2

    It's still used for many, many things. One of the major current uses is X.509 digital certificate encoding.

    --
    For every complex problem there is an answer that is clear, simple, and wrong. -- H L Mencken
  105. Yes - html and xhtml are ok too by matek · · Score: 1

    According to the linked story, ASN can easily be used to encapsulate html and other text-formats.

    That's what makes it beautiful!

  106. Yuck!!! by Anonymous Coward · · Score: 0

    Same stuff used in SNMP. Have you seen or tried to describe an object with it??? Fine, it's small... It's like saying yeah, there was a version of MS Word for the 128k Mac. Does anyone want to use it just because it's small???? Get that last mile bandwidth up and who cares.

  107. Multimedia? by starseeker · · Score: 3, Interesting

    Isn't most of the bandwidth on the internet consumed by multimedia - images, music files, and the odd video? I have seldom encountered an html file larger than a meg, and even those are in my experience very rare.

    Yes, it would be nice to make the internet move faster with current technology, and I would support this for people on very slow connections. It might also be a boon for servers that get hit hard and often (though I doubt it would stop the Slashdot effect ;-) For the majority of single use internet concerns, however, I just don't see this doing a whole lot.

    Of course, I hope I'm wrong. More effective bandwidth is a Good Thing.

    --
    "I object to doing things that computers can do." -- Olin Shivers, lispers.org
    1. Re:Multimedia? by Anonymous Coward · · Score: 0
      How many XML services do you use per day?

      One or two maybe?

      Assuming XML-based services start becoming popular, XML size is going to become a bigger issue. Until then, of course, it's a non-issue.

    2. Re:Multimedia? by Sir+Robin · · Score: 2, Funny

      I have seldom encountered an html file larger than a meg, and even those are in my experience very rare.

      You've obviously never saved a 5k Word doc in HTML. *sigh*.

      --
      My /. ID is only 5,210 away from Bruce Perens's.
  108. ASN.1 not suitable by cartman · · Score: 5, Informative

    ASN.1 is the basis of a great many protocols, LDAP among them. What is not mentioned in the article is that ASN.1 is a binary protocol and is therefore not human-readable. It may save space for bandwidth-constrained applications. However, bandwidth has a tendency to increase over time. When all wireless handhelds have a megabit of bandwidth, we would sorely regret being tied to ASN.1, as LDAP regrets it now.

    Not to mention, ASN.1 does not generally reduce the document size by more than 40% compared to XML. Think about it: how much space is really taken by tags?
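How much space the tags take depends heavily on the document, so it is worth measuring rather than guessing. A minimal sketch, assuming a crude definition of "markup" as everything between angle brackets (it ignores CDATA sections and comments containing ">"):

```python
import re


def markup_fraction(xml_text: str) -> float:
    """Fraction of characters that sit inside <...> tags."""
    markup = sum(len(m) for m in re.findall(r"<[^>]*>", xml_text))
    return markup / len(xml_text)


# Hypothetical samples: short values with long tag names skew the ratio upward.
print(round(markup_fraction("<a>xx</a>"), 2))
print(round(markup_fraction("<quantity>42</quantity>"), 2))
```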

    It's also worth noting that there is lots of documentation surrounding XML. With ASN.1 you have to download the spec from ITU which is an INCREDIBLY annoying organization and their specs are barely readable and they charge money to look at them, despite the fact that they are supposedly an open organization. The IETF and the W3C are actually open organizations; ITU just pretends to be. ITU does whatever it can to restrict the distribution of their specifications.

    1. Re:ASN.1 not suitable by pegacat · · Score: 5, Informative

      This is pretty much right. I do a lot of work on X500 / ldap / security, and ASN1 is used throughout all this. It does a pretty good job, but as the poster points out, the ITU is a completely brain damaged relic of the sort of big company old boys club that used to make standards. It's very difficult to get info out of them. (Once you get it though, it's usually pretty thorough!)

      As for the 'compression', well, yes, it sorta would be shorter under many circumstances. ASN1 uses pre-defined 'global' schema that everyone is presumed to have. Once (!) you've got that schema, subsequent messages can be very terse. (Without the schema you can still figure out the structure of the data, but you don't know what it's for). For example, I've seen people try to encode X509 certificates (which are ASN.1) in XML, and they blow out to many times the size. Since each 'tag equivalent' in ASN.1 is a numeric OID (object identifier), the tags are usually far shorter than their XML equivalents. And ASN.1 is binary, whereas XML has to escape binary sequences (base64?).
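To show how short a numeric OID is on the wire, here is a sketch of the BER/DER OBJECT IDENTIFIER encoding (first two arcs folded into one value, each value in base 128 with a continuation bit). It assumes the encoded body is under 128 bytes. The OID 2.5.4.3 is the commonName attribute type used in X.509 names.

```python
def encode_oid(dotted: str) -> bytes:
    """Encode a dotted OID string as a BER/DER OBJECT IDENTIFIER."""
    arcs = [int(part) for part in dotted.split(".")]
    first, second, rest = arcs[0], arcs[1], arcs[2:]
    body = bytearray()
    for arc in [40 * first + second] + rest:
        # Base-128 digits, most significant first; high bit marks "more follows".
        chunk = [arc & 0x7F]
        arc >>= 7
        while arc:
            chunk.append((arc & 0x7F) | 0x80)
            arc >>= 7
        body += bytes(reversed(chunk))
    if len(body) > 127:
        raise ValueError("long-form lengths are not handled in this sketch")
    return bytes([0x06, len(body)]) + bytes(body)


print(encode_oid("2.5.4.3").hex())         # commonName
print(encode_oid("1.2.840.113549").hex())  # RSA Data Security arc
```

Five bytes for commonName compares well with an opening and closing XML tag spelling the name out twice.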

      But yeah, ASN.1 is a pain to read. XML is nice for humans, ASN.1 is nice for computers. Both require tooling though: an XML parser or an ASN.1 compiler. ASN.1 can be very neat from an OO point of view, 'cause your ASN.1 compiler can create objects from the raw ASN.1 (a bit like a java serialised object). But I can't see ASN.1 being much chop to compress text documents, there are much better ways of doing that around already (and I thought a lot of that stuff was automatically handled by the transport layer these days?)

      And just for the record... the XML people grabbed a bunch of good ideas from ASN.1, which is good, and LDAP's problems are more that they screwed up trying to do a cut down version of X500, than that they use ASN.1 :-)!

      --
      Wer mit Ungeheuern kämpft, mag zusehn, dass er nicht dabei zum Ungeheuer wird.
    2. Re:ASN.1 not suitable by Madwand · · Score: 1

      Despite the fact that the IETF uses ASN.1 in some of its protocols (notably SNMP), it is widely derided as "asinine one" around here, with the Brain-damaged Encoding Rules (BER). To get around a lot of the stupidity in ASN.1, the IETF uses a very carefully constrained subset of it.

    3. Re:ASN.1 not suitable by vsync64 · · Score: 3, Interesting
      Once (!) you've got that schema, subsequent messages can be very terse. (Without the schema you can still figure out the structure of the data, but you don't know what it's for).

      Heh. How is this different from XML?

      I'm always amused by people that assume XML will be the magic lingua franca of the Internet and everyone will be able to parse every last bit of meaning out of your document just because it's encased in <handwaving><readable by="human"><tags /></readable></handwaving> without ever agreeing on any of those nasty "standards" things. Guess what, people: until we have a solution to the strong AI problem, human readable don't mean squat.

      --
      TO BUY A NEW CAR WOULD MAKE YOU SEXUALLY ATTRACTIVE.
  109. ASN.1 was designed to be efficient by Anonymous Coward · · Score: 4, Informative

    If I remember the history right, ASN.1 was designed during the era of X.25 and charging for every 64-byte packet. I used to use ASN.1 for remote communications in a commercial product, but later changed it to a hybrid of CORBA and XML, mostly due to more modern technologies, and since the actual bandwidth did not cost that much anymore, it did not make sense to keep an old protocol alive. ASN.1 has its drawbacks too: 8 different ways to encode a floating point number. That was for political reasons: everyone involved wanted their own floating point format included, and as a net result, everyone has to be able to decode 8 different formats. An encoding designed by a committee (a stone-age telecom committee, as a matter of fact).

  110. Ouch by Anonymous Coward · · Score: 0

    ASN.1 (and BER encoding) is a bitch to develop with. In my experience, at least.

    Cheers,

    --fred

  111. Missing the point? by MikeyNg · · Score: 2, Insightful

    Bandwidth is cheap now, but it may not be forever. Yes, we'll most likely continue to see order of magnitude increases for years and decades to come, but it'll slow down sometime.

    Also, consider wireless devices. Their bandwidth isn't there right now, and maybe with 3G we'll see a nice increase, but I can see that as a practical application for this type of compression.

    Let's also not forget that even though it's compressed, you can always uncompress it into regular old XML to actually read it and understand it, for you human folks that actually need like LETTERS and stuff! That's it. I'm just going to start writing everything in integers soon. Time to change my .sig!

    --
    Where the wind blows, the tumbleweed goes.
  112. Decoding by Sawbones · · Score: 1

    I don't quite see how this could be decoded perfectly on the other end. I mean suppose I have a single node xml document:

    <docroot>
    <Node_of_type_crap> hi </Node_of_type_crap>
    </docroot>

    But if I NEED that node to be named "Node_of_type_crap" on the end of whatever I'm transmitting it to (rather than some arbitrary bit value), that information is going to have to be transmitted eventually and that will take up space. Not saying this won't be a huge bandwidth saver, but the 200 bytes -> 2 bytes compression can't be that common.

    --

    Ad in classifieds: Pandora's Box (no box) $5
  113. HTML could be compressed by Restil · · Score: 2, Flamebait

    What you would lose is the readability. Any symbol in an html file could be reduced to a byte or less depending on the total number of symbols used. Consider an 80 character line of text with each character a different color. For each character you'd need data approximately equal to:

    <font color="#ff0000">a</font>

    This entire sequence could be compressed into 4 bytes or less, but you would require an html compiler instead of coding it by hand (unless you're one of those crazy people that prefer coding opcodes straight over using C).
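
    A minimal sketch of that packing in Python, assuming the per-character markup is something like a `<font>` tag with a 24-bit color (the tag, the color and the function names are illustrative assumptions, not anything a browser understands):

```python
import struct
from typing import Tuple

def pack_colored_char(ch: str, rgb: Tuple[int, int, int]) -> bytes:
    """Pack one single-byte character plus a 24-bit RGB color into 4 bytes."""
    r, g, b = rgb
    return struct.pack("BBBB", r, g, b, ord(ch))

def unpack_colored_char(data: bytes) -> Tuple[str, Tuple[int, int, int]]:
    """Reverse of pack_colored_char."""
    r, g, b, code = struct.unpack("BBBB", data)
    return chr(code), (r, g, b)

# Assumed example of the hand-written markup for one red letter.
markup = '<font color="#ff0000">a</font>'
packed = pack_colored_char("a", (255, 0, 0))
print(len(markup), "bytes of markup vs", len(packed), "bytes packed")
```

    The packed form is only meaningful to a reader that already knows the layout, which is exactly the readability trade-off described above.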

    The issue with html, and the reason why we don't worry about the inefficiency much is the fact that you could have a rather extensive html file with one link to a single picture, and that picture would easily take up the space of the entire html file.

    -Restil

    --
    Play with my webcams and lights here
  114. Yuck... by Anonymous Coward · · Score: 0

    ASN.1 is horrible. It's horrible to understand, horrible to implement, and horrible to try to decipher in a packet dump.

    1. Re:Yuck... by ariux · · Score: 1

      I actually figured out a usable way to do this. It wasn't pretty, but...

      I got this ASN.1 dumper, but found out that it can't tell where in the data to start (though you can give it an offset). This is even worse because ASN.1 structures tend to encapsulate entire other ASN.1 structures as opaque (to the dumper) "octet strings."

      So I rigged up a script like this:

      #!/usr/bin/perl
      for $i (0..200) {system "dumpasn1 -$i $ARGV[0]";}

      ...to try every possible offset in a reasonable range. I dump the output to a file, then browse through it looking for structure. Feh.

      And those OIDs are the ultimate in separating unique definition from actual meaning.

  115. Bandwidth Versus Computational Effort by DougM · · Score: 2, Insightful
    When the web was lots of static pages and images, and bandwidth was scarce, compression made sense.

    With the current over-supply of domestic bandwidth and the move to database-driven, customised web sites, is it worth spending CPU cycles compressing small data files on-the-fly?

    Most popular websites don't suffer from poor connectivity -- they suffer from too little back-end grunt.

    1. Re:Bandwidth Versus Computational Effort by cREW+oNE · · Score: 1
      With the current over-supply of domestic bandwidth and the move to database-driven, customised web sites, is it worth spending CPU cycles compressing small data files on-the-fly?

      Absolutely. Positively. 100% YES!

      Say you have a fairly large HTML page. (Take, em... slashdot.) You can compress it from, say, 100Kb to about 11Kb in a fraction (a 800Mhz P3 does it in 0.009 seconds) of a second. That saves an enormous amount of bandwidth and speeds up your browsing too. Definitely worth it.
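
      For anyone who wants to try this at home, here is a rough sketch using Python's standard `zlib` module. The page below is a made-up stand-in for a comment-heavy page, so the ratio and timing you see will differ from the figures quoted above:

```python
import time
import zlib

# Hypothetical page: lots of repeated markup with a little varying text.
rows = [
    f'<tr><td class="comment"><a href="/user/{i}">user{i}</a></td>'
    f"<td>Comment number {i} about ASN.1 and XML</td></tr>"
    for i in range(1000)
]
page = ("<html><body><table>" + "".join(rows) + "</table></body></html>").encode("ascii")

start = time.perf_counter()
compressed = zlib.compress(page, 6)      # level 6 is zlib's usual default trade-off
elapsed = time.perf_counter() - start

print(f"{len(page)} -> {len(compressed)} bytes in {elapsed:.4f} s")

# Lossless: the receiver gets the original bytes back.
restored = zlib.decompress(compressed)
```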

      --

      +++ATH0

    2. Re:Bandwidth Versus Computational Effort by donglekey · · Score: 2

      Imagine starting your own website. When you are paying for bandwidth on a site that has a >100KB front page (like slashdot on my configuration) then it is definitely worth it. Not everyone is on broadband and many people won't be for a long long time. Saving bandwidth is always good, whatever the situation. And besides, many many page serves can be had (10,000 a day) off a very inexpensive computer (K6-2 400 Mhz) even on a complex website (scoop driven).

  116. ASN.1 resources on the web. by gd23ka · · Score: 3, Informative

    Actually ASN.1 is a formal way of specifying how to encode data into binary representations like BER, CER, DER and PER which do save bandwidth compared to XML.

    Those of you that want to find out more about ASN.1 can pick up free e-books on ASN.1 here. There's some blatant propaganda in them for OSS Nokalva's ASN.1 compiler, but of course there's also snacc, a GPL'd open source ASN.1 compiler. Snacc however only generates code for encoding to BER, so you might also want to check out the hacked version of snacc from Queensland University of Technology.

    ASN.1 is a base technology for a lot of standards out there like X.509, PKCS and LDAP, the OSI application layer protocols etc.

  117. bah by Anonymous Coward · · Score: 0

    Considering that most XML documents are 99% empty space and .99% repeated tokens, I'm not particularly impressed.

  118. And the two bytes are... by hayz · · Score: 0, Troll

    $34 $32

    (Actually, it's possible to compress *much* more information into these two bytes.)

  119. Oh crap. by G-funk · · Score: 1, Insightful

    This is just crap. Let's say it's two bytes and 4 bits. That means that it can only describe 2^20 different files. With 200 bytes to play with, you can have around 80^200 different xml files (80 was pulled from my ass, 2 uppercase + lowercase + symbols).

    Let's put it this way. 2.5 out of 200 is 80. that means .0125% of all 200 possible byte files, can be compressed down to 2.5 bytes, and that's providing perfect compression.

    I'm sure that with the right sample file LZH will compress it down to just a few bytes too.
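
    The counting argument can be checked with a few lines of Python. This is only a sketch of the pigeonhole reasoning, taking the 80-character alphabet and the 20-bit output from the guesses above as assumptions:

```python
import math

ALPHABET = 80      # rough guess at usable characters per byte
LENGTH = 200       # bytes in the original document
OUTPUT_BITS = 20   # about 2.5 bytes of output

# Work in log2 so the numbers stay printable.
inputs_log2 = LENGTH * math.log2(ALPHABET)   # log2 of 80**200
outputs_log2 = OUTPUT_BITS                   # log2 of 2**20

print(f"possible inputs  ~ 2^{inputs_log2:.0f}")
print(f"possible outputs = 2^{outputs_log2}")
print(f"share of inputs that can get a unique output ~ 2^{outputs_log2 - inputs_log2:.0f}")
```

    A lossless scheme can give at most 2^20 of those inputs their own 20-bit output, so a result like this only works for a tiny, pre-agreed set of documents.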

    --
    Send lawyers, guns, and money!
    1. Re:Oh crap. by Anonymous Coward · · Score: 0

      *Very* few of those 80^200 files of length 200 will be well-formed XML. If you figure that you don't have to include much of the "codebook" since you can refer to an existing DTD by some sort of ID number, it's a bit more plausible.

      You, sir, are an ass.

  120. Better than that... by jmv · · Score: 1, Offtopic

    I can do much better with "delete" and "undelete". DOS rules! It had filesystem compression (format/unformat) long before the others.

  121. Reverse Engineer hax0r3d! by TroyFoley · · Score: 4, Funny

    I figured it out. They do it by removing the data pertaining to popup/popunder banners! 100 to 1 ratio seems about right.

    --
    After I have received the wisdom of good teaching, I will untiringly teach all people. - The Teachings of Buddha
    1. Re:Reverse Engineer hax0r3d! by ahde · · Score: 1
      I thought they substituted simple hex values for all those java-inspired tags.

      <TagForMainDocumentPrimaryHeaderAtTopOfPage>

      becomes 0x23fc1a

      That's not 80 to 1, but if you're on UTF-16, it's an easy 12 to 1 if you count closing tags.

  122. Totally misses the point by coyote-san · · Score: 5, Insightful

    This idea totally misses the point.

    ASN.1 achieves good compression because the designer must specify every single parameter for all time. The ASN.1 compiler, among other things, then figures out that that "Letterhead, A4, landscape" mode flag should be encoded as something like 4.16.3.23.1.5, which is actually a sequence of bits that can fit into 2 bytes because the ASN.1 grammar knows exactly how few bits are sufficient for every possible case.
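
    To make the "how few bits are sufficient" point concrete, here is a small Python sketch. It is not real ASN.1 or PER, just the same idea under an assumed, made-up schema: when both ends know every allowed value of every field, each field can be sent as a fixed-width index.

```python
# Hypothetical schema, fixed in advance and known to both ends:
# each field lists every value it is ever allowed to take.
SCHEMA = [
    ("paper", ["Plain", "Letterhead", "Envelope", "Label"]),
    ("size", ["A3", "A4", "A5", "Letter", "Legal"]),
    ("orientation", ["portrait", "landscape"]),
]

def bits_for(choices):
    """Minimum whole bits needed to distinguish len(choices) values."""
    return max(1, (len(choices) - 1).bit_length())

def encode(record):
    """Pack a record into an integer, one fixed-width index per field."""
    packed, width = 0, 0
    for name, choices in SCHEMA:
        n = bits_for(choices)
        packed = (packed << n) | choices.index(record[name])
        width += n
    return packed, width

def decode(packed):
    """Unpack the fields in reverse order, using the same schema."""
    record = {}
    for name, choices in reversed(SCHEMA):
        n = bits_for(choices)
        record[name] = choices[packed & ((1 << n) - 1)]
        packed >>= n
    return record

record = {"paper": "Letterhead", "size": "A4", "orientation": "landscape"}
packed, width = encode(record)
print(f"{width} bits: {packed:0{width}b}")
```

    Add one new paper type past the fourth and the field widths change, so every decoder in the field has to be updated at the same moment. That is the "cast in stone" problem.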

    In contrast, XML starts with *X* because it's designed to be extensible. The DTDs are not cast in stone, and in fact a well-behaved application should read the DTD for each session and extract only the items of interest. It's not an error if one site decides to extend their DTD locally, provided they don't remove anything.

    But if you use ASN.1 compression, you either need to cast those XML DTDs into stone (defeating the main reason for XML in the first place), or compile the DTD into an ASN.1 compiler on the fly (an expensive operation, at least at the moment).

    This idea is actually pretty clever if you control both sides of the connection and can ensure that the ASN.1 always matches the DTD, but as a general solution it's the wrong idea at the wrong time.

    --
    For every complex problem there is an answer that is clear, simple, and wrong. -- H L Mencken
    1. Re:Totally misses the point by p3d0 · · Score: 1

      I can't believe I just used up my last mod point before reading this. This is the most informative article I have seen here in a long time.

      --
      Patrick Doyle
      I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
    2. Re:Totally misses the point by interiot · · Score: 2

      The ASN.1 compiler only has to run when the DTD changes, if the compiler can output a program that converts XML to ASN.1.

    3. Re:Totally misses the point by sllort · · Score: 1

      This idea is actually pretty clever if you control both sides of the connection and can ensure that the ASN.1 always matches the DTD, but as a general solution it's the wrong idea at the wrong time.

      Agreed. You'd need the equivalent of an ASN.1 "session key" - you'd have to compute the grammar and send it first before you sent the text.

      Which would be way less effective than using standard lossless compression techniques for ASCII, like, say.... gzip.

      The point is that XML wastes bandwidth and storage compared to compressed binary formats, and should never be used as a primary document format except in applications where you have bandwidth and storage to burn.

      Why else would Microsoft use it?

    4. Re:Totally misses the point by Anonymous Coward · · Score: 0
      XML doesn't inherently waste bandwidth - no data format does. Stick a decent compression algorithm in your networking code somewhere and you'll be fine.

    5. Re:Totally misses the point by Anonymous Coward · · Score: 0

      You need a decent compression algorithm on your hard drive, too. And anywhere else you store XML documents, like your cell phone's RAM. That said, you are correct.

    6. Re:Totally misses the point by ttfkam · · Score: 2, Insightful

      What if the XML document is representative of a dynamic aggregate of multiple schemas?

      Say what? Heh heh...

      Let's say you have an XHTML document (one DTD) that contains MathML (another DTD) and some SVG for good measure (third DTD). This would not be handled in your static DTD compile unless you made specific provisions for all of them in a single document. But what if the next document only has one of them used? Or two? Or includes some other one later? Are you going to compile every permutation of DTD that could ever occur?

      This is where the strength of XML is not necessarily compatible with the strengths of ASN.1.

      --

      - I don't need to go outside, my CRT tan'll do me just fine.
    7. Re:Totally misses the point by Anonymous Coward · · Score: 1, Informative

      This is not true. The program creating the message and the program reading the message do not need exactly the same DTD. Anything extra in the message will be ignored when reading it, and fields can be made optional.

  123. bandwidth is cheap? On what planet? by Carnage4Life · · Score: 2

    So who cares about compression. Personally, I'd much prefer the open and obvious standards of XML to some obfuscated form. Data is confusing enough already; at least XML gives a clear description that I can use with a packet sniffer when trying to debug something.

    You're kidding right? Most CS people I know cringe at the fact that XML can more than double the size of a document with largely redundant tags. The only thing to be thankful for is that the documents typically compress very well due to the large number of redundant tags and that HTTP 1.1 supports compression, especially now that XML over HTTP (i.e. web services) is being beaten to death by a lot of people in the software industry. Numerous articles about XML compression also tend to disagree with you that it is not an issue.

    PS: If bandwidth is so cheap how come DSL companies are going out of business and AOL owns Time Warner? This would tend to imply that low bandwidth connections are still the order of the day.

    1. Re:bandwidth is cheap? On what planet? by Garpenlov · · Score: 2

      PS: If bandwidth is so cheap how come DSL companies are going out of business

      DSL companies are going out of business because... bandwidth is so cheap. And it's their own fault.

      and AOL owns Time Warner? This would tend to imply that low bandwidth connections are still the order of the day.

      Why? Are you saying AOL=dialup, and Time-Warner=cable? There's a LOT more to both of those companies than either of those two things...

      --
      --- Where's my X.400 protocol decoder?
  124. Missing the point as to why XML is good by Eryq · · Score: 4, Insightful

    XML, by virtue of being text-based, may be easily inspected and understood. Sure, it's a little bulky, but if you're transmitting something like an XML-encoded vCard versus an ASN.1 encoding of the same info, the bulk is negligible.

    Yes, for mp3-sized data streams, or real-time systems, there would be a difference. But many interesting applications don't require that much bandwidth.

    ASN.1 achieves its compactness by sacrificing transparency. Sure, it's probably straightforward enough if you have the document which says how the tags are encoded, but good documentation of anything is rare as hen's teeth, and not all software companies are willing to play nice with the developer community at large and share their standards documents. And some of them get downright nassssssty if you reverse engineer...

    Transparency is one of the reasons for the rapid growth of the Web: both HTML and HTTP were easy enough to understand that it took very little tech savvy to throw up a website or code an HTTPD or a CGI program.

    Transparency and extensibility also make XML an excellent archival format; so if your protocol messages contain data you want to keep around for a while, you can snip out portions of the stream and save them, knowing that 10 or 15 years from now, even if all the relevant apps (and their documentation) disappear, you'll still be able to grok the data.

    --
    I'm a bloodsucking fiend! Look at my outfit!
    1. Re:Missing the point as to why XML is good by refactored · · Score: 1
      Yes, for mp3-sized data streams, or real-time systems, there would be a difference. But many interesting applications don't require that much bandwidth.
      For embedded octet streams like that you use XML to carry the metadata and an xlink to the appropriate compressed format chunk of data. You don't, for pity's sake, encode the whole stream in XML or ASN.1.
  125. It is not the bandwidth that is so important.. by Anonymous Coward · · Score: 0
    ASN.1 (BER encoding I assume) is very space efficient from X.25 days, at least when compared to XML. But if XML is a reference point, so is CORBA or Sun RPC encoding--even if they take more space than BER, they are much more CPU friendly encodings.

    XML's primary strength lies in the fact that it is a very friendly format for connecting "loosely-connected" systems together. With all the other formats (ASN.1, CORBA, RPC) the systems have to agree on exactly what the data format is. Whilst with XML, the format can vary over time, and the systems can still understand each other.

  126. ISO not free (as in beer) by Coot · · Score: 1

    In order to fully analyze the ASN.1 standard, you have to have a copy of the standards documents and read them. Unfortunately, to do that, I have to pay ANSI/ISO several hundred US dollars. OK, the C and C++ standards are available from ANSI for only 18 USD, but the other standards are much more ... in fact, go to the search page and search for ASN.1 ... see for yourself.

    W3C and Internet STDs and RFCs are freely (as in beer and as in speech) available. This is partly why many of them are so widely adopted.

    If the ASN.1 folks want their standards widely adopted, they first have to make it easy and cheap to get copies of the standards.

    --

    --
    “Doh!”

  127. Will we have 'hardware accelerated' modems? by beowulf_26 · · Score: 1

    From the looks of today's news, it seems that we need to tackle bandwidth issues from both ends (please don't flame me for being too obvious). After reading the NasaWatch article about streaming HDTV, Covad filing for bankruptcy, and finally the rather negative comments to this networking protocol, it seems that we've got a long way to go.

    Obviously, we need more accessible fat pipe and larger bandwidth, which means these things need to be cheaper. Thankfully, with the advent of high-power round lasers (featured in last month's Wired if I'm not mistaken) the equipment for routing optical lines will become much less complicated and FAR cheaper. Which means greater accessibility to broadband and probably a better environment for high-speed providers.

    The second end seems to be developing a new STANDARD protocol. Current ones, while being fairly open and without need for debugging, are nice, but seem rather inefficient. If everyone can agree on a compression scheme for the internet, what is the possibility of seeing hardware accelerated modems? Will we have something akin to hardware DVD decoders, or GeForce 3's for our net access?

    If any of you know of current movements for such technology, I know I'd be interested to hear about them, and I'm sure your fellow /. readers would as well.

    --

    --I hate big sigs.
  128. Re: Leave compression to the hardware by willy_me · · Score: 2
    I agree, leave XML uncompressed. Let modems compress the data - it might not be as efficient but it keeps things simple.

    Willy

  129. ASN.1? by Anonymous Coward · · Score: 0

    You've got to be kidding.

    Compare SNMP, LDAP, and Kerberos (if you've ever worked with implementations of them) with SMTP and NNTP.

  130. ASN.1 -- excellent choice by ciurana · · Score: 4, Informative

    Some people in this forum think that ASN.1 is a replacement for XML; others think of it as a "lossy" compression algorithm. ASN.1 is neither. Read the article and learn a bit about ASN.1 before forming an opinion. Most important, ASN.1 has been an interoperability standard for at least 10 years prior to the introduction of XML.

    ASN.1 is a standard interoperability protocol (ISO IS 8824 and 8825) that defines a transfer syntax irrespective of the local system's syntax. In the scenario described in the article, the local syntax is XML and the transfer syntax is ASN.1. ASN.1 is a collection of data values with some meaning associated with them. It doesn't specify how the values are to be encoded. The semantics of those values are left to the application to resolve (i.e. XML). ASN.1 defines only the transfer syntax between systems.

    ASN.1 codes are defined in terms of one or more octets (bytes) joined together in something called an encoding structure. This encoding structure may have values associated with it in terms of bits rather than bytes. An encoding structure has three parts: Identifier, Length, and Contents octets. Identifier octets are used for specifying primitive or constructor data types. Length octets define the size of the actual content. A boolean is thus represented by a single bit, and digits 0-9 could be BCD encoded. Each encoding structure carries with it its interpretation.

    An XML document could thus be encoded by converting the tags into a lookup table and a single octet code. If the tags are too many, or too long (i.e. FIRST-NAME) then there are significant savings by replacing the whole tag with an ASN.1 encoded datum. If we assume there are up to 255 different potential tags in the XML document definition, then each could be assigned to a single byte. Thus, encoding the tag <FIRST-NAME> would only take two bytes: One for the ID, one for the length octet, and zero for the contents (the tag ID could carry its own meaning).
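
    A toy version of that lookup-table idea in Python, as a sketch only. The tag table and the one-octet length limit are assumptions made for illustration and this is not a conforming BER encoder:

```python
# Hypothetical tag table agreed on by both ends (not taken from any real DTD).
TAG_IDS = {"PERSON": 0x01, "FIRST-NAME": 0x02, "LAST-NAME": 0x03}
ID_TAGS = {tag_id: tag for tag, tag_id in TAG_IDS.items()}

def encode_element(tag, text):
    """One identifier octet, one length octet, then the content octets."""
    content = text.encode("utf-8")
    if len(content) > 255:
        raise ValueError("this sketch only handles content up to 255 bytes")
    return bytes([TAG_IDS[tag], len(content)]) + content

def decode_element(data):
    """Recover (tag name, text) from one encoded element."""
    tag_id, length = data[0], data[1]
    return ID_TAGS[tag_id], data[2:2 + length].decode("utf-8")

xml = "<FIRST-NAME>Ada</FIRST-NAME>"
packed = encode_element("FIRST-NAME", "Ada")
print(len(xml), "bytes as XML,", len(packed), "bytes packed")
```

    An empty `<FIRST-NAME>` element comes out as exactly two bytes here (ID plus a zero length), which matches the count above. The catch is that the receiver needs the same table to turn the ID back into a name.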

    I used to work with OSI networks at IBM. All the traffic was ASN.1-encoded. I personally think this is an excellent idea because ASN.1 parsers are simple and straightforward to implement, fast, their output is architecture independent, and the technology is very stable. Most important, this is a PRESENTATION LAYER protocol, not an APPLICATION LAYER protocol. The semantics of the encoding are left to the XML program. Carefully encoded ASN.1 will preserve the exact format of the original XML document while allowing its fast transmission between two systems.

    http://www.bgbm.fu-berlin.de/TDWG/acc/Documents/asn1gloss.htm has an excellent overview if you're interested.

    Cheers!

    E
    --
    http://eugeneciurana.com | http://ciurana.eu
    1. Re:ASN.1 -- excellent choice by Anonymous Coward · · Score: 0
      I think that as far as encoding itself is concerned, ASN.1 is a good choice--as is CORBA encoding, which is more verbose and more CPU friendly.

      However, XML's strength lies in the not-so-tight coupling of the two parties--they are more free to upgrade themselves and still be able to understand each other.

      Two different worlds here--I would choose CORBA for tightly coupled systems (sorry for that, but ASN.1 is old and not supported in the real world anymore so much), and XML otherwise.

    2. Re:ASN.1 -- excellent choice by the+coose · · Score: 1

      sorry for that, but ASN.1 is old and not supported in the real world anymore so much

      Actually, as previously mentioned, ASN is used in LDAP. Also it is used in the H.245 layer of H.324 (video conferencing over POTS networks) and H.323 (video conferencing over LANs). It is easily implemented as a recursive parser.
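
      For the curious, a recursive parser for a small subset of BER fits in a few lines of Python. This sketch assumes single-octet tags and definite lengths only, so it is nowhere near a full decoder:

```python
def parse_tlv(data, offset=0, end=None):
    """Recursively parse BER-style TLVs (single-octet tags, definite lengths)."""
    end = len(data) if end is None else end
    items = []
    while offset < end:
        tag = data[offset]
        length = data[offset + 1]
        offset += 2
        if length & 0x80:
            # Long form: the low 7 bits give the number of length octets.
            count = length & 0x7F
            length = int.from_bytes(data[offset:offset + count], "big")
            offset += count
        if tag & 0x20:
            # Constructed type: the contents are themselves a run of TLVs.
            value = parse_tlv(data, offset, offset + length)
        else:
            # Primitive type: keep the raw content octets.
            value = data[offset:offset + length]
        items.append((tag, value))
        offset += length
    return items

# SEQUENCE { INTEGER 5, OCTET STRING "hi" }
sample = bytes([0x30, 0x07, 0x02, 0x01, 0x05, 0x04, 0x02]) + b"hi"
parsed = parse_tlv(sample)
print(parsed)
```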

    3. Re:ASN.1 -- excellent choice by tkrotchko · · Score: 1

      "ASN.1 is a standard interoperability protocol (ISO IS 8824 and 8825) that defines a transfer syntax irrespective of the local system's syntax"

      I think everyone understands what ASN.1 is intended for. In this particular application though its being touted as a clever way of compressing XML streams to minimize bandwidth.

      It doesn't appear to be the best general purpose tool to do the job in most cases.

      --
      You were mistaken. Which is odd, since memory shouldn't be a problem for you
    4. Re:ASN.1 -- excellent choice by FrostedChaos · · Score: 1
      First of all-- stop worrying about "cpu-friendliness." In a few years 2 GHz and over processors will be dirt cheap. Open your eyes and realize what most people already see-- that the days of transmitting in plaintext to save processing power on your Commodore 64 are over.

      Secondly, how old any format is has *nothing* to do with its quality. And as the previous poster mentioned, ASN.1 is used a lot, unlike XML itself.

      XML's strength lies in the not-so-tight coupling of the two parties
      What the hell are you talking about? The sender and receiver always have to understand each other.

      Read the comment earlier:
      ASN.1 parsers are simple and straightforward to implement, fast, their output is architecture independent, and the technology is very stable. Most important, this is a PRESENTATION LAYER protocol, not an APPLICATION LAYER protocol. The semantics of the encoding are left to the XML program.

      --
      "Any connection between your reality and mine is purely coincidental." -Slashdot
  131. ASN.1 is Abstract Syntax Notation by Genetically+Enginerd · · Score: 1

    The bare bones of it is that ASN.1 is a language that defines how a data structure will be encoded for transmission over a data link and how it will be decoded at the remote end. There are several different encode/decode schemes (BER,DER,etc.). Consider a C struct. A BER encoding of that struct would contain the data elements of that struct. Each encoded data element will contain a tag, a length, and a value. If you define this C struct in ASN.1 and run the ASN.1 through a compiler, the output is the C code to encode/decode the data from/to the C structure. Recsss...
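
    As a rough illustration of the tag/length/value layout, here is a hand-written Python sketch of what generated encoder code might boil down to. The `Reading` record is a hypothetical stand-in for the C struct, and only INTEGER, UTF8String and SEQUENCE are handled:

```python
from dataclasses import dataclass

@dataclass
class Reading:
    """Hypothetical record standing in for the C struct."""
    sensor_id: int
    label: str

def ber_length(n):
    """Definite length: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body

def ber_tlv(tag, content):
    """Tag octet, length octets, then the content octets."""
    return bytes([tag]) + ber_length(len(content)) + content

def ber_integer(value):
    """INTEGER (tag 0x02) as shortest big-endian two's complement."""
    magnitude = value if value >= 0 else ~value
    size = magnitude.bit_length() // 8 + 1
    return ber_tlv(0x02, value.to_bytes(size, "big", signed=True))

def ber_utf8(text):
    """UTF8String (tag 0x0C)."""
    return ber_tlv(0x0C, text.encode("utf-8"))

def encode_reading(reading):
    """SEQUENCE (tag 0x30) wrapping the encoded fields in declaration order."""
    return ber_tlv(0x30, ber_integer(reading.sensor_id) + ber_utf8(reading.label))

encoded = encode_reading(Reading(sensor_id=5, label="hi"))
print(encoded.hex())
```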

    --
    Does the income I've derived from working with Unix belong to SCO?
  132. Try UDP with bigger packets by Negadecimal · · Score: 1

    Seriously, how much bandwidth do we lose to simple ACKs, NACKs, and packet headers? How often do networks really drop packets that we couldn't use UDP for web applications?

    As for HTML and XML, we could cut ascii data by 20% if we just got rid of useless carriage returns, non-paragraph whitespace, tag quotation marks, HTML comments... just compare the source HTML for Yahoo with CNN.com... BIG difference.

    1. Re:Try UDP with bigger packets by belg4mit · · Score: 0

      Except that tag quotation marks are no longer optional
      but required, thanks to the influence of XML.

      --
      Were that I say, pancakes?
    2. Re:Try UDP with bigger packets by reflective+recursion · · Score: 2, Informative
      Seriously, how much bandwidth do we lose to simple ACKs, NACKs, and packet headers? How often do networks really drop packets that we couldn't use UDP for web applications?
      UDP drops packets enough, that is for sure. The purpose of TCP is to be a _stable_ transport. UDP simply throws messages towards their destination and hopes they hit their target. Say an HTML document is sent via UDP. Say you get 1 packet, miss the 2nd and get the 3rd instead. How does your browser know packet 3 is _not_ packet 2. This also says nothing about the order of packets sent (with UDP packet 3 could arrive before 2 or 1). So then you begin to hack on a protocol that detects the correct order. Then you hack on another protocol that makes sure packets even arrive. Then you will have TCP all over again. :-)
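
      A bare-bones Python sketch of the first hack on that road: sequence numbers so the receiver can reorder packets and notice gaps. The packet format and the `reassemble` helper are made up for illustration, and no real sockets are involved:

```python
def reassemble(packets, total):
    """Rebuild a payload from (sequence_number, data) pairs arriving in any order.

    Returns (payload, missing). The payload is None while anything is missing,
    and missing lists the sequence numbers that need to be sent again.
    """
    received = {}
    for seq, data in packets:
        received.setdefault(seq, data)   # ignore duplicate deliveries
    missing = [seq for seq in range(total) if seq not in received]
    if missing:
        return None, missing
    return b"".join(received[seq] for seq in range(total)), []

# Packet 2 arrives first and packet 1 is lost on the way.
arrived = [(2, b"</html>"), (0, b"<html>")]
payload, missing = reassemble(arrived, total=3)

# After asking for a resend of the missing packet:
arrived.append((1, b"<body>hello</body>"))
full_payload, still_missing = reassemble(arrived, total=3)
print(full_payload)
```

      Asking for the resend, timing out when the request itself gets lost, and not flooding the link while you do it are the parts that turn this into TCP.
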
      As for HTML and XML, we could cut ascii data by 20% if we just got rid of useless carriage returns, non-paragraph whitespace, tag quotation marks, HTML comments... just compare the source HTML for Yahoo with CNN.com... BIG difference.
      Ahh. We finally see that just learning HTML (or in general, web-oriented languages such as VB script and Javascript) does not make a good programmer. If you have never seen a VB program's source code.. well, don't. I don't mean to bash VB (or web) programmers though. The problem with HTML/XML is it is not compiled (like Java machine-independent). I believe this is more to do with the web outgrowing its purpose. It was never designed for graphics, let alone plug-ins, Javascript/Java, frames (should I really continue? :P ).
      --
      Dijkstra Considered Dead
  133. Practice what you preach... by Anonymous Coward · · Score: 1, Funny

    Their web page was generated by M$ frontpage. Does anyone else find it amusing that they tout their amazing compression standard using the most inefficient, bloated html generator available on the planet?

  134. reformat by ahde · · Score: 1
    I thought they substituted simple hex values for all those long-ass java-inspired tags.

    <TagForMainDocumentPrimaryHeaderAtTopOfPage>

    becomes 0x23fc1a

    That's not 80 to 1, but if you're on UTF-16, it's an easy 12 to 1 if you count closing tags.

  135. Actually... by Captain_Frisk · · Score: 2

    I wonder if the same could be done with XHTML or even regular HTML.

    If HTML is written properly, it is XML. Browsers nowadays let you cheat, and mix tags, and ignore quotes, but if the HTML is written to spec, then it is technically XML.

    Captain_Frisk

    1. Re:Actually... by Anonymous Coward · · Score: 1, Informative

      Hmm... slight nuance to be added :)

      If HTML is written properly, it is easily converted to XHTML (and thus XML) by changing a few tags and adding the XML-formalities.
      For example: changing all single tags (<br>) into XML-single tags (<br />), or changing name-only attributes (<td nowrap>) into full attributes (<td nowrap="nowrap">). Check the XHTML 1.0 specs on w3.org for the full story ;) (can't access it at the moment for reasons unknown)

    2. Re:Actually... by Anonymous Coward · · Score: 0

      No, you're wrong.

      <html>
      <body>
      <p>Paragraph 1
      <p>Paragraph 2
      </body>
      </html>

      is valid HTML, but not valid XML.

      <img src="whatever"> isn't valid XML either, it has to be <img src="whatever" />

      You're thinking of XHTML.

    3. Re:Actually... by ttfkam · · Score: 1

      Nope. XHTML is valid XML, but not regular HTML. HTML allows for the <br> tag (no closing tag) and tags such as <td nowrap> where attributes have no clear key/value pair or enclosing quotation marks. *Definitely* not well-formed XML, but perfectly valid HTML.

      --

      - I don't need to go outside, my CRT tan'll do me just fine.
  136. All the excitment over 198 BYTES???? by emu_doogie · · Score: 0

    In today's day and age, I don't think that 198 Bytes saved in a download of any document is worth the trouble of implementation. Even on a 56K connection, this only takes 1 point something seconds to download 200 BYTES... is it really worth the second to put hours and hours of energy into saving that little bit of bandwidth? Granted, when multiplied by millions of users, it is a good chunk of bandwidth, but even better would be to devote this time to better video compression or even image compression. ONE SECOND PEOPLE... ONE SECOND!!!

  137. R. Colin Johnson is a parrot by sarchasm · · Score: 1
    If you are a regular reader of EE Times one thing you will quickly notice how useless this guy's articles are. His stories are almost always about some vaporware or bogus technology, and he doesn't even attempt to overcome his own ignorance with basic fact checking. Just take this claim from the article:

    "In one benchmark I have read about, a 200-byte message was reduced to 20 bytes with normal compression methods, but ASN.1 encoded it into just 2 bytes and a few bits," said Scott.

    Anyone with any basic knowledge of data compression or information theory would see right through this. And he couldn't find one such person to run his story by first?

    --

    ----------------

    Overheard: "Aww, why'd you go and install Windows on a perfectly good machine?"

  138. Cool, this will help me download phat vvarez by NewPaltzCompSciMajor · · Score: 1

    Yo, this will let me compress all my vvares and trade them to my homies online. FU DCMA!

  139. Those who do not understand ASN.1 .... by StormyMonday · · Score: 3, Informative

    are condemned to repeat it. Badly.

    I have had to deal with dozens of binary protocols that do the same thing as ASN.1, and do it worse.

    As to comparisons, XML and ASN.1 are designed for different jobs. Designing a Web page in ASN.1 would be ridiculous. Sending (say) telemetry data encoded in XML is equally ridiculous. I can believe that *data* transmissions could be 100 times larger in XML than in ASN.1. You have the header, DTD, some namespace declarations, and a bunch of nested tags, just to express a couple of numbers.

    Problem is, XML is one of the latest forms of fairy dust that Management has latched onto. "Sprinkle this on your project and it will fly!" So programs have XML grafted onto them anywhere it might fit.

    A particularly cute example is SOAP (Microsoft's firewall-bypass protocol) It's going to be fun to watch people try to squeeze some performance out of a SOAP based system that tries to do something interactive.

    As to the ISO, yeah, they're seriously obnoxious. They tend to go off into their own little world, redefine standard terminology so they're incomprehensible to outsiders, and come up with stuff that can't be implemented. (Nobody uses ASN.1 -- it's unimplementable. When people talk about using ASN.1 for something real, they're talking about a subset. A subset, of course, cannot claim conformance to the standard.) The crowning insult, of course, is that they fund the organization by selling the standards. Hey, it's a standard -- you *have* to buy it!

    "It's all in knowing what wrench to use to pound in the screw."

    --
    Welcome to the Turing Tarpit, where everything is possible but nothing interesting is easy.
  140. After re-reading the title... by Anonymous Coward · · Score: 0

    ...I realized it doesn't say "Old Proctologist Could Save Massive Bandwidth." I almost thought it was a commentary on all the shit you find on the Internet these days.

  141. I'm doubting the accuracy here by Uttles · · Score: 1

    OK, this is a good protocol but let's not exaggerate (sp?). I mean come on, 200 bytes of useful information compressed to 2 bytes? I doubt that compression like that could occur unless you had a string of all 1's followed by all 0's, which doesn't seem very useful.

    --

    ~ now you know
  142. ASN.1 also difficult to implement by pbryan · · Score: 1

    I agree completely!

    It also bears mentioning that ASN.1, BER and DER are all complete hairballs to implement (I'm trying to be nice). The creation and enforcement of ASN.1 encoded streams is beyond the capabilities of typical developers (at least me and a number of highly competent co-workers).

    XML, on the other hand, is human readable and easy for any developer with any SGML-based markup language experience to pick up and implement -- generally, in a matter of hours. Furthermore, there is a wide array of XML-based tools to assist developers in ensuring DTD compliance.

    If it's compression we want, we should seriously consider alternate encoding schemes for XML (tokenizing perhaps?), including obvious compression schemes such as gzip or bzip2. Better yet, how about IPv6 integrated compression?

    --

    My car gets 40 rods to the hogshead, and that's the way I likes it!

  143. A Naive Question by tkrotchko · · Score: 1

    If the goal is a reduction in the size of XML structures, why not compress the XML streams after generation and uncompress before parsing? What's described here is nothing more than an intelligent compression algorithm which is clever, but it seems to require an understanding of the "compression algorithm" before you get it.
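
    A minimal sketch of that compress-then-parse idea in Python, using the standard-library gzip module. The sample document and its element names are made up for illustration. On very small documents gzip's own header overhead can outweigh the savings, so the sample repeats a record to give the compressor something to work with.

```python
import gzip
import xml.etree.ElementTree as ET

# Hypothetical sample document: 50 repeated telemetry-style records.
records = "".join(
    f"<reading><sensor>temp</sensor><value>{20 + i % 5}</value></reading>"
    for i in range(50)
)
xml_text = f"<?xml version='1.0'?><readings>{records}</readings>"
raw = xml_text.encode("utf-8")

# Sender side: compress after generation.
packed = gzip.compress(raw)

# Receiver side: decompress, then hand the bytes to an ordinary XML parser.
unpacked = gzip.decompress(packed)
root = ET.fromstring(unpacked)

print(len(raw), "bytes raw ->", len(packed), "bytes gzipped")
print(len(root), "records parsed after decompression")
```

    The parser never sees the compressed form, so nothing downstream of the decompression step has to change.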

    But it looks like this would horribly complicate something like SOAP, and make every tool that uses XML (including web browsers) even more complicated.

    Then you start thinking about the other emerging XML based standards, and you realize this looks suspiciously like a solution in search of a problem.

    --
    You were mistaken. Which is odd, since memory shouldn't be a problem for you
  144. ASN.1 isn't efficient--for a binary protocol by Anonymous Coward · · Score: 2, Informative

    ASN.1 and a way of encoding ASN.1 (BER is commonly used) produces output that's binary. Encoded like this it represents everything using type, length, and data. So to represent, say, the integer 255 you'd encode it like this, using BER:

    [type byte: ?] [length byte: 1] [value byte: 255]

    So that's three bytes to encode a single byte integer. Great.

    Basically the advantages of ASN.1 are that it's a well defined way to express data types, and it has encodings that are platform neutral. Compared to other fixed-field binary protocols it's fat and not particularly robust (got a length value wrong anywhere? You can't make any sense of the rest of the data). It's a binary protocol, which means you can't just look at the data and understand it, which I see as a huge disadvantage--in my mind the reason the net is big now is because the protocols are straightforward and easy to understand at a glance.

    I work with ASN.1 every day in the guise of SNMP and I've learned to become annoyed with it. Ever see ASN.1 in the form of a mib? Bleeh.

    XML is popular because it's flexible and extendable. You don't really have a prayer of understanding encoded ASN.1 data without the full ASN.1 definition for the data, whereas with XML it's inherently human readable. Maybe there's more to this and it's a good fit, but I am not a big fan of ASN.1.

    - Bill
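
    The type/length/value layout described above can be sketched in a few lines of Python. This is an illustration only, not a full BER encoder: the function name is made up, it handles the universal INTEGER type (tag 0x02) alone, and it supports only short-form lengths.

```python
def ber_encode_integer(value):
    """Encode a Python int as a BER INTEGER: tag 0x02, length, two's-complement content."""
    # BER integers are signed, so one extra bit is always needed for the sign.
    magnitude = value if value >= 0 else ~value
    length = magnitude.bit_length() // 8 + 1
    if length > 127:
        raise ValueError("long-form lengths are left out of this sketch")
    content = value.to_bytes(length, "big", signed=True)
    return bytes([0x02, length]) + content

for n in (5, 127, 255, 300):
    print(n, ber_encode_integer(n).hex(" "))
```

    A value such as 127 comes out as 02 01 7f, the three-byte shape described above. Because the content is signed, 255 needs a leading zero octet and comes out as 02 02 00 ff, four bytes in all, which only strengthens the point about overhead.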

  145. Binary Bits by fm6 · · Score: 2
    ...bandwidth has a tendency to increase over time. When all wireless handhelds have a megabit of bandwidth, we would sorely regret being tied to ASN.1, as LDAP regrets it now.

    Not to mention, ASN.1 does not generally reduce the document size by more than 40% compared to XML. Think about it: how much space is really taken by tags?

    I share your dislike of unnecessary bit squishing. But I have to pick some nits.

    First, you shouldn't assume that available bandwidth will steadily increase. It will take some major breakthroughs -- not just technical, but political and economic -- before there's a megabit internet connection every place where it might be useful. And wide-area wireless networking is in an even worse state. Not to mention that radio spectrum is a finite resource.

    Your point about tags is well-taken. But you can compress the content too. Using 8 bits for every character is very inefficient, especially considering that there are only 128 characters to represent. With the right scheme, you could certainly get the average character width to somewhere between 4 and 5 bits.
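
    One rough way to check the "4 to 5 bits" figure is to measure the Shannon entropy of the character frequencies in a text sample. The sketch below does that in Python; the function name and the sample sentence are made up for illustration, and the number is only a zero-order estimate (it ignores context between characters, which a real compressor would also exploit).

```python
import math
from collections import Counter

def bits_per_char(text):
    """Shannon entropy of the character frequencies in `text`, in bits per character."""
    counts = Counter(text)
    total = len(text)
    return -sum(n / total * math.log2(n / total) for n in counts.values())

# Hypothetical sample; substitute any body of plain text.
sample = "the quick brown fox jumps over the lazy dog and keeps on running"
print(f"{bits_per_char(sample):.2f} bits/char estimated vs 8 bits/char stored")
```

    A frequency-based code such as Huffman coding gets within one bit per character of this estimate, so for ordinary English prose the result tends to land near the range mentioned above.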

  146. we can therefore safely conclude by meekg · · Score: 1

    that there are only 65,536 valid 200-byte XML documents.

  147. This is funny ... by ras · · Score: 2, Informative

    I remember when I first came across ASN.1 years ago. Everybody hated it because the parser was sssooo big and complex. Why not just use a simple ASCII file was a common refrain. Sure ASN.1 was capable of representing just about any data structure in a reasonably compact form, but most information did not need complex data structures to represent it so why does anybody use ASN.1?

    Well a decade or two later we get the ASCII version of ASN.1 - XML. And guess what? It's arguably harder to write a generic parser for XML than it is for ASN.1. (I still have not found a good open source validating parser for XML.) But guess what - everybody is wildly enthusiastic this time round. My how times change!

    Actually ASN.1 and XML in some ways are very similar. They try to solve the same problem - how to represent complex data structures in a generic way. And they do it in a similar way. Because ASN.1 is binary and uses numbers instead of text tags it does use a lot less space to represent the same thing, although the 2 versus 200 bytes claim is at best misleading. Most of the 200 bytes would probably be XML header (DTDs and stuff) which you would not put in the ASN.1 encoding.

    And yes, XML is too fat for some applications. For example, if you are pumping out a 60k row SQL table to your 1000 clients every day you probably would not choose XML. That is why this idea has merit. It could give you the benefits of XML without the fat. To work, someone would have to come up with a standard way of translating a DTD to ASN.1 encoding. I know it's a good idea because I came up with it myself a while back :).

    1. Re:This is funny ... by Anonymous Coward · · Score: 1, Informative

      "I still have not found a good open source validating parser for XML"

      Then you're not looking very hard. Try Xerces.

  148. retarded post by pioneer · · Score: 1

    How did this post make it on slashdot? Ok, 200 bytes compressed to 2 bytes and several bits. Anyone who has studied information theory would know that if you can compress it to x bits then it only has x bits of information... Therefore the 200*8 bits used in XML only contain 2*8+? bits of information. Perhaps this compression is of just some headers needed? But there is no way to compress data that much. And also, the point of XML is that it is a text protocol so that it is human readable. There are already protocols available for binary compression of XML data. This is not an example of some algorithm from the past coming to the rescue and blowing today's technology away. This is a concocted, special case example meant to trick idiots that believe it.
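
    The counting argument behind this can be made concrete: a lossless code of n bits can tell apart at most 2^n different messages. The Python sketch below just does that arithmetic. The helper name is made up, and "a few bits" is assumed to mean 3 purely for the sake of the example.

```python
def distinct_messages(bits):
    """Upper bound on how many different messages a lossless code of `bits` bits can distinguish."""
    return 2 ** bits

two_bytes = distinct_messages(16)
two_bytes_and_three_bits = distinct_messages(16 + 3)  # "a few bits" assumed to be 3
all_200_byte_strings = distinct_messages(200 * 8)

print("2 bytes:", two_bytes)
print("2 bytes + 3 bits:", two_bytes_and_three_bits)
print("200-byte strings:", len(str(all_200_byte_strings)), "decimal digits")
```

    So the 2-byte figure can only hold if sender and receiver already agree that the message is one of a few hundred thousand possibilities, for instance because both sides share a schema in advance.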

  149. Re:Those who do not understand SOAP by JamesOfTheDesert · · Score: 1
    A particularly cute example is SOAP (Microsoft's firewall-bypass protocol). It's going to be fun to watch people try to squeeze some performance out of a SOAP based system that tries to do something interactive.

    SOAP is not strictly from MSFT, though they were part of a team of developers. There are W3C and IETF submissions to "standardize" it. And SOAP uses the HTTP firewall-bypass protocol, like, oh, web pages and CGI scripts.

    I agree that some people have bought into the worst parts of XML hype, but these tend to be the same ones who bought into the Java hype as well, so it has more to do with poor management skills than XML per se.

    --

    Java is the blue pill
    Choose the red pill
  150. Sounds like they're spewing buzzwords... by coupland · · Score: 2

    A 200 byte message reduced to 2 bytes? I don't know ASN.1 but I would have to assume tags are counted, and added to an indexed table. Using variable-length encoding you can squeeze some extra compression out of your algorithm but 100:1 compression? So basically you have a 180-byte XML tag with a single value reduced to a single symbol with an index of 1. Meaning that the "benchmark" is a sham. Add to that the fact that the symbol table obviously wasn't counted in their "compression" technique. I would assume you don't LZ-compress the symbol table (creating a symbol table for a symbol table) so basically what you have is after compression the code goes from 200 bytes to 200 bytes + 2 bytes and a few bits. What a joke. The worst part of all is that I'm sure it achieves fairly good compression on a 100k XHTML document but they have to throw bogus numbers at us thinking we'll go all doe-eyed. Very insulting.
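
    The indexed-table guess above can be tried out directly. The Python sketch below is not ASN.1; it is a toy tag tokenizer written to illustrate the accounting, with made-up function names and a made-up sample document. Each distinct tag is replaced by one byte in the range 0x01 to 0x08, values that cannot appear in XML 1.0 text, so the sketch is limited to eight distinct tags.

```python
import re

TAG = re.compile(r"</?[A-Za-z][^<>]*>")

def build_table(xml_text):
    """Distinct tags in order of first appearance."""
    return list(dict.fromkeys(TAG.findall(xml_text)))

def encode(xml_text, table):
    """Replace every tag with a single byte holding its 1-based table index."""
    if len(table) > 8:
        raise ValueError("this sketch reserves only bytes 0x01-0x08 for tags")
    index = {tag: i + 1 for i, tag in enumerate(table)}
    return TAG.sub(lambda m: chr(index[m.group()]), xml_text).encode("utf-8")

def decode(data, table):
    """Expand the one-byte tokens back into their tags."""
    text = data.decode("utf-8")
    return "".join(table[ord(c) - 1] if 1 <= ord(c) <= len(table) else c for c in text)

sample = "<order><id>7</id><qty>3</qty></order>"  # hypothetical document
table = build_table(sample)
packed = encode(sample, table)
table_size = sum(len(tag) for tag in table)

print(len(sample), "bytes of XML ->", len(packed), "bytes of tokens")
print("plus", table_size, "bytes of table if it has to travel with the message")
```

    For this sample the tokens alone are much smaller than the XML, but tokens plus table are larger than the original. The saving is real only when the table is agreed on ahead of time, which is roughly the role a schema known to both ends plays in the packed ASN.1 encodings.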

    1. Re:Sounds like they're spewing buzzwords... by philipm · · Score: 1, Funny

      I'm wondering why they didn't just byte the bullet and use the patented ZERO.1 compression algorithm which compresses most XML to its true information size. Strangely enough that is usually zero bits.

      And, no, it's not lossy.

  151. No! Not ASN.1! Make it stop! Make it stop! by osgeek · · Score: 2

    After writing an SNMP management console with an ASN.1 parser, I have nightmares about the protocol. Sure, it's very efficient yet flexible, but it makes all sorts of neural connections happen in your brain that are better left open. :P

    Since XML was designed for humans to be able to look at to a certain extent, why not just have a standard compression method that's included with all XML parsers? Whenever you transmit or save the XML file, it should be saved in the compressed format.

  152. GPL'ed ASN.1 encoder/decoder by foo · · Score: 2, Informative

    http://www.fokus.gmd.de/ovma/freeware/snacc/

  153. XML is BAD BAD BAD :) by Lazy+Jones · · Score: 2
    I've never quite understood why some people found the idea to have machines communicate with a data format designed to be readable by humans so intelligent. Because of this oh-so-intelligent idea, we have:
    • lots of broken pages with wrong HTML syntax
    • lots of broken browsers with different ideas about how to interpret HTML
    • a huge amount of bandwidth wasted with unnecessary whitespace and superfluous characters
    A standard format for web content without a human-readable form (i.e. a compact binary encoding) would have many advantages:
    • syntax could be strict, so no ambiguity would be supported (i.e. there would never have been a reason to support things like both size="123" and size=123)
    • the documents/content would be checked prior to publication on the WWW (because a "compilation" step would be needed in case the content was typed in by a human being, and high level libraries / widgets would probably be used in generators for dynamic content)
    • no waste of precious bandwidth!
    • anyone who wanted proprietary extensions for their encoding would have to give you their parser and generator/compiler
    OK, so XML is more strict and extensible than HTML, but it's still based on the irrational notion of encoding things in a human-readable form - trading bandwidth for readability - when in most cases no human will ever look at them.
    --
    "I love my job, but I hate talking to people like you" (Freddie Mercury)
  154. Oh yeah? by Andrewkov · · Score: 2
    ASN.1 could be used to compress a 200 byte XML document to 2 bytes and few bits

    Oh yeah?? I wrote a protocol that can take a 6 MB MP3 file and compress it to under 10 bytes!

    (Some sound quality degradation may occur, use at own risk)

  155. The ASN.1 faithful just don't get it by RobertGraham · · Score: 5, Insightful
    Preface: I've written parsers for ASN.1 (esp. SNMP MIBs, but also generic), BER/DER (same thing), PER, HTML, XML, and while we are at it, XDR and CORBA IDL. I've written a BER decoder that can decode SNMP at gigabit/second speeds.

    There are a vast number of differences between ASN.1 and XML. To think that ASN.1 is in any way related to XML demonstrates that they just don't "get it".

    1. Why not XDR or just raw binary?
    Why not just specify your own binary format for your application? The thing that the ASN.1 bigots don't understand is that in most real-world applications, the ASN.1 formatting provides only overhead but no real-world value. This happens in XML, too, but the value proposition for XML is much clearer. A good example is the H.323 series PER encoding which is just plain wrong: well-documented custom encoding would have been tons better.

    2. DTD or no DTD
    The ASN.1 language is essentially a DTD; it gets encoded in things like BER. The trick is that I can parse "well-formed" XML content without knowing the DTD. This is impossible with current ASN.1 encoding. The idea of DTD-free "well-formed" input and DTD-based "valid" input is at the core of XML. Yes, both ASN.1 and XML format data, but proposing ASN.1 as being a valid substitute means you just don't grok what XML is all about.

    3. Interoperability
    The Internet grew up in an environment that parsers should be liberal in what they receive. This was important in early interoperability, but now is a detriment. For example, it is impossible to write an interoperable HTML parser. XML took the radical zen approach of mandating that any parser that accepts malformed input is BAD. As a result, anybody writing a parser knows the input will be well-formed. There is one-and-only-one way to represent input (barring whitespace), so writing parsers is easy. ASN.1 has taken the opposite approach, there are a zillion ways to represent input.

    As a result, non-interoperable ASN.1 implementations abound. For example, most SNMP implementations are incompatible. They work only "most" of the time. Go to a standard SNMP MIB repository and you'll find that the same MIB must be published multiple times to handle different ASN.1 compilers.

    The long and the short of it is that ASN.1 implementations today are extremely incompatible with each other, whereas XML libraries have proven to be extremely interoperable. Right now, XML has proven the MOST interoperable way to format data, and ASN.1 has proven to be the LEAST.

    4. Bugs
    Most XML parsers have proven to be robust, most ASN.1 parsers have proven to be buggy. You can DoS a lot of devices today by carefully crafting malformed SNMP BER packets.

    5. Security
    You can leverage ASN.1's multiple encodings to hack. For example, my SideStep program shows how to play with SNMP and evade network intrusion detection systems: http://robertgraham.com/tmp/sidestep.html At the same time, ASN.1 parsers are riddled with buffer-overflows.

    Anyway, sorry for ranting. I think XML advocates are a little overzealous (watch carefully your possessions or some XMLite will come along and encode it), but ASN.1 is just plain wrong. The rumor is that somebody threw it together as a sample to point out problems, but it was accidentally standardized. It is riddled with problems, it should be abandoned. An encoding system is rarely needed, but if you need one, pick XDR for gosh sakes.
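
    Points 2 and 3 can be seen with any stock XML parser. The Python sketch below uses the standard-library ElementTree module; the two sample documents are made up. The first is parsed with no DTD or schema supplied, and the second, which has mismatched tags, is rejected outright rather than repaired.

```python
import xml.etree.ElementTree as ET

well_formed = "<reading><sensor>temp</sensor><value>21</value></reading>"
malformed = "<reading><sensor>temp</value></reading>"  # mismatched tags

# Point 2: well-formed input parses without any DTD or schema.
root = ET.fromstring(well_formed)
fields = {child.tag: child.text for child in root}
print(fields)

# Point 3: a conforming parser refuses malformed input instead of guessing.
try:
    ET.fromstring(malformed)
    rejected = False
except ET.ParseError as err:
    rejected = True
    print("rejected:", err)
```

    The element names come back as strings straight from the document, which is what makes the data interpretable without the definition that produced it.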

    1. Re:The ASN.1 faithful just don't get it by haapi · · Score: 2, Insightful

      Well said! A Silly Notation.1 is a hideous encoding scheme. The BER is simply ambiguous -- you don't need to send malformed packets to devices, rather simply send valid BER packets that just aren't right, but still follow the rules, and watch carnage ensue.
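
      One concrete source of that ambiguity is the length field: BER permits both a short form and a long form, so the same value has several valid byte sequences (DER narrows this down to one). The Python sketch below is a deliberately tiny decoder for INTEGER values only, with a made-up function name, written just to show three different encodings decoding to the same number.

```python
def ber_decode_integer(data):
    """Decode a BER INTEGER, accepting short-form and long-form definite lengths."""
    if data[0] != 0x02:
        raise ValueError("not an INTEGER tag")
    first = data[1]
    if first < 0x80:
        # Short form: the byte is the length itself.
        length, offset = first, 2
    else:
        # Long form: low 7 bits say how many length bytes follow.
        n = first & 0x7F
        if n == 0:
            raise ValueError("indefinite length is not allowed for a primitive INTEGER")
        length = int.from_bytes(data[2:2 + n], "big")
        offset = 2 + n
    content = data[offset:offset + length]
    if length == 0 or len(content) != length:
        raise ValueError("truncated or empty content")
    return int.from_bytes(content, "big", signed=True)

encodings = ["020105", "02810105", "02840000000105"]
values = [ber_decode_integer(bytes.fromhex(h)) for h in encodings]
print(values)
```

      A decoder that assumes only the short form will misread the second and third encodings, which is the kind of mismatch that lets one implementation accept what another chokes on.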

      --
      Well, apparently, you only have to fool the majority of people for a little while.
  156. The same struggle in the VoIP world by Lumpish+Scholar · · Score: 2

    Among the voice-over-IP (VoIP) protocols out in the world are H.323 (an ITU-T spec that makes heavy use of ASN.1) and SIP (RFC 2543 et al.).

    H.323 interoperability is tough. Some problems are due to differences in how one entity encodes a piece of data and another decodes it. Many H.323 implementations, um, do not fail gracefully under such circumstances.

    SIP call signalling looks like HTTP. There have been complaints that it's too verbose, and needs to be replaced with something binary. One proposal suggests using a binary encoding. It uses LZW compression and shared "codebooks" (schemas?)

    That's just for call signalling. Both these VoIP protocols (and others) use RTP ("Real-time Transport Protocol") for voice, video, etc.; that's encoded and compressed pretty darned seriously.

    (I'm not speaking for my employer, I'm just speaking my mind.)

    --
    Stupid job ads, weird spam, occasional insight at
  157. Re:ASN.1 not suitable, but XML is still good by miniver · · Score: 2
    I'm always amused by people that assume XML will be the magic lingua franca of the Internet and everyone will be able to parse every last bit of meaning out of your document just because [it's human-readable] without ever agreeing on any of those nasty "standards" things.

    Apparently you've never had to write a parser for EDI, or any other binary data interchange format.

    I'm not going to claim that XML is a magic bullet for data interchange -- but I will attest that human-readable data formats are superior to binary formats when it comes to data interchange. I have lost track of the number of custom parsers I've had to write over the last 15+ years in order to convert data from one system to another, simply because the systems in question didn't have a shared data format. The big wins for XML are that (1) you can visually inspect your before-and-after results, (2) you don't have to write the parser, even if you have to write code to call it, (3) there are actually two sensible APIs to match two very different ways to look at the data, each of which is parser independent, and best of all (4) if you don't have documentation for the schema (or it's misimplemented), you still have a prayer of interpreting the data correctly.
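
    The comment does not name the two APIs. Assuming it means tree-style access (DOM) and event-style access (SAX), the Python sketch below reads the same made-up document both ways with the standard-library xml.dom.minidom and xml.sax modules. The handler class name is hypothetical.

```python
import xml.sax
from xml.dom import minidom

doc = "<order><item>bolt</item><item>nut</item></order>"  # hypothetical document

# Tree style (DOM): load the whole document, then navigate it.
tree = minidom.parseString(doc)
tree_items = [node.firstChild.data for node in tree.getElementsByTagName("item")]

# Event style (SAX): react to elements as the parser streams past them.
class ItemCollector(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.items = []
        self._in_item = False

    def startElement(self, name, attrs):
        if name == "item":
            self._in_item = True
            self.items.append("")

    def characters(self, content):
        if self._in_item:
            self.items[-1] += content

    def endElement(self, name):
        if name == "item":
            self._in_item = False

collector = ItemCollector()
xml.sax.parseString(doc.encode("utf-8"), collector)

print(tree_items, collector.items)
```

    In neither case does the calling code tokenize anything itself, which is point (2) above: the parser is already written.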

    Anyone who's ever had to write an EDI application will *instantly* understand the appeal of XML.

    --
    We call it art because we have names for the things we understand.
  158. No Language Mapping by fpn · · Score: 1

    My biggest beef with ASN.1 is that ASN.1 does not have any language mapping. It only specifies the bits on the wire.

    To interface to it you will need a special library and compiler.

    If you would have a project written in C then you would use an ASN.1-to-C compiler that would generate some kind of lookup table and then you would use that in combination with some special C library to encode/decode the PDUs you send over the network.

    Every ASN.1 library/compiler you use to talk ASN.1 is completely different. So if you want to change it (maybe you don't like the way it decodes a PDU) you most likely will have to rewrite your application.

    In other words, except for the bits on the wire there is no standard interface you can expect.

    CORBA and other standards specify C, C++, ... language mappings, so you don't have to care too much how the tools you use look.

    Also working ASN.1 compilers are quite pricey. (Around $10k for the compiler and a few more grand for the library).

    Look at the price of XML tools... (and there are plenty)

    regards,
    Florian

  159. Why human-readable formats are critical by alienmole · · Score: 4, Insightful
    I think you simply haven't realized quite how useful it is, in real life, for information to be human-readable. When it isn't, it becomes harder to deal with. If you've programmed anything on the web, you're certainly familiar with using "View Source" to see the final source of a page. If you use XML, you've also examined XML data that's been generated by, say, a database server.

    Contrast that with what I'm dealing with right now: I'm using JDBC to access an MS SQL Server. MS bought their SQL Server from Sybase many years ago, and inherited the binary TDS data stream protocol. As efficient as this might be, when you run into problems, you're in trouble. The TDS format is undocumented, so you can't easily determine what the problem might be, whereas a text format would be easy to debug. Anytime you have a binary protocol, you become totally reliant on the tools that are available to interpret that protocol. With text protocols, you're much less restricted.

    Another example of this is standard Unix-based email systems vs. Microsoft Exchange. Exchange uses a proprietary database for its message base, which makes it effectively inaccessible to anything but specialized tools and a poorly-designed API. If your email is stored in some kind of text format, OTOH, there are a wealth of tools that can deal with it, right down to simple old grep.

    The bottom line is that the human-readability (and writability!!) of HTML was one of the major factors in the success of the web. It's no coincidence that everything on the web, and many other successful protocols, such as SMTP, are text-based. To paraphrase your subject line, binary protocols are BAD BAD BAD.

    Calling human-readable formats "irrational" is a bit like Spock on Star Trek calling things "illogical" - what that usually really meant was that the actual logic of the situation wasn't understood. What's irrational is encoding important information, which needs to be examined by humans for all sorts of reasons that go beyond what you happen to have imagined, into a format which humans can't easily read.

    Human-readable formats and protocols will remain important until humans have been completely "taken out of the loop" of programming computers (which means not in the foreseeable future).

    1. Re:Why human-readable formats are critical by alienmole · · Score: 2
      When did you last look at the TCP/IP specs? When was the last time you thought that TCP/IP should have been encoded as XML (with the spec in the form of a DTD) to avoid the bother of using tcpdump or snoop to get a human-readable presentation?

      When was the last time you saw a web page designer or web application programmer dealing with any of this stuff?

  160. Moore's Law giveth, handhelds taketh away by Anonymous Coward · · Score: 0
    If I remember properly, Moore's Law says that every 18 months, a given area (i.e., 1 square millimeter) can hold 50% more transistors. But people have several different uses for that gain:
    • Reducing the size and power and cost of the system (an actual computer on a desk, followed by an actual computer in a handheld)
    • Adding more interfaces: gui's and voice interfaces
    • Programming in C instead of assembly, followed by programming in Perl instead of C
    • Adding ever more complex communication protocols
    • Just plain adding features

    Finally, note that there is no Moore's Law for battery power (measured in watt-seconds/kg).
  161. BFD by r_j_prahad · · Score: 2

    Big fuckin' deal. I compressed an entire Microsoft Operating System into a single byte once. HALT 0

  162. Using XML is _ASKING_ for bloat by alehmann · · Score: 1

    XML is a very wasteful and generic file format. By using a custom binary file format, file sizes could easily be decreased hundredfold. It's a pity that people use XML for reasons of "interoperability" when the only significant gain is that parsing the file format is done in a uniform way. XML wastes CPU time, drive space, and memory by trying to use a generic file format for nongeneral data. It should be shunned.

    1. Re:Using XML is _ASKING_ for bloat by n_jed · · Score: 1

      Amen. XML is wasteful. Too many developers and managers think it's going to solve world peace.

  163. A move to ASN.1/BER is repeating past mistakes by BeardStreet · · Score: 1

    Using ASN.1's companion protocol, the Basic Encoding Rules (BER), to transfer XML would be a mistake! These lessons have already been learned and documented. Experience has shown that text based protocols and data streams are easier to decode, debug, understand, and process in programs. Look no further than the Simple Network Management Protocol (SNMP) which uses ASN.1/BER. How many hours/days of our lives did we spend debugging broken BER implementations? I'd much rather have my XML spend a few more microseconds on the wire than go through the BER exercise.

  164. Compression is easy! by Anonymous Coward · · Score: 0

    I can compress the entire King James Bible down to one byte. It's 0xEF. Anytime you see 0xEF, substitute the King James translation of the Bible.

    Continue on.

  165. Think about it... by fdisk3hs · · Score: 1

    This may not be the way, and bandwidth may be wasted/mostly unused/cheap, but there has to be a more efficient way to move content than what we're doing... If broadband hadn't taken off so quickly (it is slowing down considerably now - Hear about Covad today?) making a more efficient way to move data/content would be a much bigger issue... i.e. most of us would still be using dialup (like me...) and praying for a faster, more effective way to use POTS lines... Instead the web pages are getting bigger, flashier, and more and more crowded all the time. I like a good looking site, but Mercy! The content is still the key...