Slashdot Mirror


Old Protocol Could Save Massive Bandwidth

GFD writes: "The EETimes has a story about a relavtively old protocol for structured information call ASN.1 could be used to compress a 200 byte XML document to 2 bytes and few bits. I wonder if the same could be done with XHTML or even regular HTML."

6 of 287 comments (clear)

  1. mod_gzip ? by AdamInParadise · · Score: 4, Informative

    Ever heard of mod_gzip? It compress anything that goes trough your Apache webserver and it is supported by most browsers. With everything running over http theses days, this is the way to go...

    --
    Nobox: Only simple products.
  2. ASN.1 not suitable by cartman · · Score: 5, Informative

    ASN.1 is the basis of a great many protocols, LDAP among them. What is not mentioned in the article is that ASN.1 is a binary protocol and is therefore not human-readable. It may save space for bandwidth-constrained applications. However, bandwidth has a tendency to increase over time. When all wireless handhelds have a megabit of bandwidth, we would sorely regret being tied to ASN.1, as LDAP regrets it now.

    Not to mention, ASN.1 does not generally reduce the document size by more than 40% compared to XML. Think about it: how much space is really taken by tags?

    It's also worth noting that there is lots of documentation surrounding XML. With ASN.1 you have to download the spec from ITU which is an INCREDIBLY annoying organization and their specs are barely readable and they charge money to look at them, despite the fact that they are supposedly an open organization. The IETF and the W3C are actually open organizations; ITU just pretends to be. ITU does whatever it can to restrict the distribution of their specifications.

    1. Re:ASN.1 not suitable by pegacat · · Score: 5, Informative

      This is pretty much right. I do a lot of work on X500 / ldap / security, and ASN1 is used throughout all this. It does a pretty good job, but as the poster points out, the ITU is a completely brain damaged relic of the sort of big company old boys club that used to make standards. It's very difficult to get info out of them. (Once you get it though, it's usually pretty thorough!)

      As for the 'compression', well, yes, it sorta would be shorter under many circumstances. ASN1 uses pre-defined 'global' schema that everyone is presumed to have. Once (!) you've got that schema, subsequent messages can be very terse. (Without the schema you can still figure out the structure of the data, but you don't know what its for). For example, I've seen people try to encode X509 certificates (which are ASN.1) in XML, and they blow out to many times the size. Since each 'tag equivalent' in ASN.1 is a numeric OID (object identifier), the tags are usually far shorter than their XML equivalents. And ASN.1 is binary, whereas XML has to escape binary sequences (base64?).

      But yeah, ASN.1 is a pain to read. XML is nice for humans, ASN1 is nice for computers. Both require a XML parser/ ASN.1 compiler though. ASN.1 can be very neat from an OO point of view, 'cause your ASN.1 compiler can create objects from the raw ASN.1 (a bit like a java serialised object). But I can't see ASN.1 being much chop to compress text documents, there are much better ways of doing that around already (and I thought a lot of that stuff was automatically handled by the transport layer these days?)

      And just for the record... the XML people grabbed a bunch of good ideas from ASN.1, which is good, and LDAPs problems are more that they screwed up trying to do a cut down version of X500, than that they use ASN.1 :-)!

      --
      Wer mit Ungeheuern kämpft, mag zusehn, dass er nicht dabei zum Ungeheuer wird.
  3. ASN.1 was designed to be efficient by Anonymous Coward · · Score: 4, Informative

    If I remember the history right, ASN.1 was designed during the era of X.25 and charging for every 64 byte packet. I used to use ASN.1 for remote communications in a commercial product, but later changed it to a hybdrid of CORBA and XML, mostly due to more modern techologies, and since the actual bandwith did not cost that much anymore, it did not make sense to keep an old protocol alive. ASN.1 has it's drawbacks too--8 different ways to encode a floating point number. It was a political reason, because everyone involved wanted their own floating point format included, and as a net result, everyone has to be able to decode 8 different formats. A encoding designed by a committee (a stoneage telcom committe as a matter of fact).

  4. Re:bandwidth is cheap by Jeffrey+Baker · · Score: 4, Informative
    at least XML gives a clear description that I can use with a packet sniffer when trying to debug something.

    Translated:

    My debugging tools are inadequate, and my brain is inadequate for improving them.

    You have a powerful, general-purpose computer at your disposal. Why should you care if the protocol can be inspected with the naked eye? Do you use an oscilloscope to pretty-print IP packets? No, you use ethereal! If XML is encoded using ASN.1, then the tools will be modified to decode ASN.1 before showing it to the human. Ethereal already knows about ASN.1 because it uses it to display LDAP traffic. If you don't like ethereal, try Unigone.

    Use your CPU, not your eyeballs!

  5. ASN.1 -- excellent choice by ciurana · · Score: 4, Informative

    Some people in this forum think that ASN.1 is a replacement for XML; others think of it as a "lossy" compression algorithm. ASN.1 is neither. Read the article and learn a bit about ASN.1 before forming an opinion. Most important, ASN.1 has been an interoperability standard for at least 10 years prior to the introduction of XML.

    ASN.1 is a standard interoperability protocol (ISO IS 8824 and 8825) that defines a transfer syntax irrespective of the local system's syntax. In the scenario described in the article, the local syntax is XML and the transfer syntax is ASN.1. ASN.1 is a collection of data values with some meaning associated with them. It doesn't specify how the values are to be encoded. The semantics of those values are left to the application to resolve (i.e. XML). ASN.1 defines only the transfer syntax between systems.

    ASN.1 codes are defined in terms of one or more octets (bytes) joined together in something called an encoding structure. This encoding structure may have values associated with it in terms on bits rather than bytes. An encoding structure has three parts: Identifier, Length, and Contents octets. Id octects are used for specifying primitive or constructor data types. Length octets define the size of the actual content. A boolean is thus represented by a single bit, and digits 0-9 could be BCD encoded. Each encoding structure carries with it it's interpretation.

    An XML document could thus be encoded by converting the tags into a lookup table and a single octect code. If the tags are too many, or too long (i.e. FIRST-NAME) then there are significant savings by replacing the whole tag with an ASN.1 encoded datum. If we assume there are up to 255 different potential tags in the XML document definition, then each could be assigned to a single byte. Thus, encoding the tag <FIRST-NAME> would only take two bytes: One for the ID, one for the length octet, and zero for the contents (the tag ID could carry its own meaning).

    I used to work with OSI networks at IBM. All the traffic was ASN.1-encoded. I personally think this is an excellent idea because ASN.1 parsers are simple and straightforward to implement, fast, their output is architecture independent, and the technology is very stable. Most important, this is a PRESENTATION LAYER protocol, not an APPLICATION LAYER protocol. The semantics of the encoding are left to the XML program. Carefully encoded ASN.1 will preserve the exact format of the original XML document while allowing its fast transmission between two systems.

    http://www.bgbm.fu-berlin.de/TDWG/acc/Documents/as n1gloss.htm has an excellent overview if you're interested.

    Cheers!

    E
    --
    http://eugeneciurana.com | http://ciurana.eu