Slashdot Mirror


IETF Publishes Jabber/XMPP RFCs

stpeter writes "The Internet Engineering Task Force has published the XMPP specifications as RFCs. These documents formalize the core protocols developed within the Jabber open-source community, and publication as RFCs represents a major milestone in acceptance of Jabber technologies. Read on for details."

64 of 248 comments

  1. Market Penetration by The+Snowman · · Score: 4, Insightful

    Good, now hopefully someone with some market clout will pick this up and market an IM program using these protocols to the masses. Jabber may be cool, but it is no MSN or AIM. Both of those have immense market penetration. I have high hopes for this protocol, hopefully someone like IBM will make this happen.

    --
    24 beers in a case, 24 hours in a day. Coincidence? I think not!
    1. Re:Market Penetration by LnxAddct · · Score: 2, Interesting

      or Google? :) I really hope they come out with a GIM.
      -Steve

    2. Re:Market Penetration by mo · · Score: 5, Interesting

      Actually, Jabber has found a very good niche doing behind the scenes work in lots of commercial software. For example, we were using it in my last job writing console video games. We're also looking to use it in a current voip product at my new company. The thing is, most of this work uses LGPL/BSD licensed libraries like iksemel so you'd never know that the underlying protocol driving that video game chat lobby is jabber unless you ran tcpdump on it.

    3. Re:Market Penetration by The+Snowman · · Score: 4, Interesting

      Google makes a great search engine, I do appreciate their Usenet archives, but they do not have the Midas touch. If they made a GIM I am sure they would have a hardcore following, probably the same people using orkut and GMail, but I doubt they could market it to enough people.

      Jabber is much more than a simple IM protocol, but that is where it needs to take root. I doubt even Google can see beyond IM here, but someone needs to (hence the IBM reference). A wise man (the Linux nerd who converted me) once told me (while drunk) that "Perl is the glue that holds Unix together." Jabber could be the glue that holds networks together at the application level (I am not drunk but I wish I was). Cross-platform, standards based (XML), extensible, versatile...

      --
      24 beers in a case, 24 hours in a day. Coincidence? I think not!
    4. Re:Market Penetration by Sheepdot · · Score: 5, Insightful

      Actually, the ease of use and portability is what will eventually make Jabber the new thing. Don't get me wrong, Yahoo, AIM, and Microsoft's alternative have a lot more functionality, but that same functionality can be easily integrated into Jabber. In fact, many have done so already.

    5. Re:Market Penetration by InfiniteWisdom · · Score: 4, Funny

      He did say market clout not a small, but vocal group of followers :)

    6. Re:Market Penetration by vr · · Score: 4, Informative

      If they made a GIM I am sure they would have a hardcore following, probably the same people using orkut and GMail, but I doubt they could market it to enough people.
      It's Google. Sure they could.

      And the cool thing is that with IE, Mozilla, _and_ Opera (in the upcoming 7.60 release) supporting XMLHttpRequest, they could make a web-based IM, without using such nasty stuff as Java or Flash.

    7. Re:Market Penetration by jeif1k · · Score: 2, Interesting

      The thing is, most of this work uses LGPL/BSD licensed libraries like iksemel so you'd never know that the underlying protocol driving that video game chat lobby is jabber unless you ran tcpdump on it.

      In both cases, you are required to acknowledge inclusion of the software. Furthermore, in the case of the LGPL, you are required to offer redistribution of the library to your users (which means that they must know that you are using it).

      In fact, if you want to behave decently, you should (1) acknowledge use of the library in the "About" box, in the game's credits, and in the printed documentation, (2) include the library source on the CD (no reason not to), and (3) let the developers know about your use of their code. This isn't strictly speaking legally required, but it shows that you are a good citizen, and it helps ensure that there will be improvements for you to include in the next game.

    8. Re:Market Penetration by Schreckgestalt · · Score: 2, Interesting
      hopefully someone like IBM will make this happen

      HP do that for us. Not only are they sponsoring various Jabber-related projects, but they also use Jabber for internal IM. I can reach any HP employee using the internal Jabber system (Yes, I work at HP).

    9. Re:Market Penetration by Trejkaz · · Score: 3, Insightful

      The other thing people are always forgetting about XMPP is that it's not just an IM protocol. If someone devises a killer app for it which isn't IM, then we might find everyone using XMPP whether they know it or not. :-)

      --
      Karma: It's all a bunch of tree-huggin' hippy crap!
    10. Re:Market Penetration by helmutjd · · Score: 4, Informative

      Already been done without Java or Flash. (demo).

      Disclaimer: I'm one of the developers for this product.

    11. Re:Market Penetration by pixelcort · · Score: 2, Informative

      I'm trying to recreate the entire web through XMPP:

      nuWeb.org

      Yeah, using an IM standard and some P2P apps. Unlikely, but solves polling and distribution problems.

      I may be out of my mind, but HTTP sucks.

      --
      http://pixelcort.com/
    12. Re:Market Penetration by mattyrobinson69 · · Score: 2, Interesting

      How about this

  2. Re:wooo by kgbspy · · Score: 5, Insightful

    What chance one of the big four (aim/icq/msn/yahoo) adopting these standards? Sorry, I did say standards, so you can discount msn. But if any of the other three did, and there was a greater level of interchangeability between those and jabber because of it, the takeup would be much higher.

    But that's the thing about standards - unfortunately it's always the big players that seem to set the ones that have any major sway.

    --
    ~
    ~
    ~
    -- INSERT --
  3. So how does jabber work then? by B747SP · · Score: 5, Interesting
    OK, I know, I should RTFRFC, but in a nutshell maybe? I tinkered with jabber for a bit, couldn't get my head around how it worked in a big-picture sense (servers, networks thereof, how to find a server with lots of like-minded folks on it, etc, etc). I'm ashamed to say that I've always ended up traipsing back to IRC/ICQ/Yahoo despite that my client and my other client both speak jabber fluently...

    Does someone wanna give a quick HOWTO and/or a pointer to a suitably high-level explanation? Thanks.

    --
    I find your ideas intriguing and I wish to subscribe to your newsletter.
    1. Re:So how does jabber work then? by Hawke666 · · Score: 5, Informative

      In a nutshell, it's pretty similar to e-mail, only without indirect routing between servers, and (partly therefore) less store-and-forward, and definitely less latency. It also includes presence information. See also http://www.jabber.org/oscon/2004/jabber-bootcamp.ppt
      and http://en.wikipedia.org/wiki/Jabber

    2. Re:So how does jabber work then? by Hawke666 · · Score: 2, Informative
    3. Re:So how does jabber work then? by mindstrm · · Score: 5, Informative

      The glossed over version:

      jabber is multi-site, like mail. You don't need a server with like-minded people... all jabber users globally can chat with each other (unless, you know, you set up a private jabber server, etc.. same with email)

      The protocol is open and extensible, and supports the idea of extended transports.. so the jabber server can act as a gateway for msn/icq/aol/foo/bar/baz. My jabber server deals with all of this... my yahoo/aim/icq/msn contacts are all stored on my jabber server.. I just sign in with whatever jabber client I want, and it all just works.
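The "multi-site, like mail" point can be made concrete with addressing. What follows is only a minimal sketch, assuming the usual `user@domain/resource` address shape; the names `JID`, `parse_jid` and `next_hop` are invented for illustration and do not come from any Jabber library.

```python
from typing import NamedTuple, Optional


class JID(NamedTuple):
    """A Jabber address split into its parts (hypothetical helper type)."""
    local: Optional[str]     # the "user" part, absent for a bare server address
    domain: str              # the server responsible for this address
    resource: Optional[str]  # identifies one particular connected client


def parse_jid(text: str) -> JID:
    # Everything after the first "/" is the resource.
    bare, slash, resource = text.partition("/")
    # Within the bare part, the first "@" separates user from domain.
    local, at, domain = bare.partition("@")
    if not at:
        local, domain = "", bare
    if not domain:
        raise ValueError(f"no domain in address: {text!r}")
    return JID(local or None, domain.lower(), resource if slash else None)


def next_hop(recipient: str, my_domain: str) -> str:
    """Decide, as a server would, whether delivery is local or remote."""
    jid = parse_jid(recipient)
    if jid.domain == my_domain.lower():
        return "deliver locally"
    return f"open a server-to-server connection to {jid.domain}"


if __name__ == "__main__":
    print(parse_jid("romeo@example.net/orchard"))
    print(next_hop("juliet@example.com", "example.net"))
```

As with mail, the domain part alone tells a server where a message has to go, which is why no central directory of users is needed.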

  4. took long enough by js7a · · Score: 3, Insightful

    I remember when IETF drafts took less than six years to make it through to RFC status.

  5. Commoditisation by Colin+Smith · · Score: 2, Insightful

    "What chance one of the big four (aim/icq/msn/yahoo) adopting these standards?"

    Immediately? Very slim.

    However, like almost all of the other standardised protocols they will eventually have to be able to interoperate to survive. In the long term they will adopt a standard protocol or they will vanish.

    --
    Deleted
  6. Propagation by Guidlib · · Score: 4, Interesting

    I don't think Jabber/XMPP will truly propagate until every ISP hands you out an IM address on their XMPP compliant server along with the email they hand out. Hopefully this standardisation process will go a long way to see this happening.

  7. From an ex-Jabber Inc. guy by Erbo · · Score: 5, Insightful
    I know that this represents the culmination of many years of effort on the part of many people, both at Jabber Inc. and in the open-source community, especially Peter Saint-Andre, Jabber architect and evangelist extraordinaire. And, of course, without Jeremie Miller, none of this would even exist.

    To all my former colleagues: this is an historic day for Jabber, for instant messaging, and for the Internet. Congratulations!

    Erbo - Former employee, Jabber Inc., Denver, CO

    --
    Be who you are...and be it in style!
  8. Re:Yay! by DrEasy · · Score: 5, Interesting

    Well, nobody in this thread seems to care so far, but the question is indeed valid: does this mean that Jabber just beat SIMPLE? How will the IETF accommodate these two competing standards?

    --
    "In our tactical decisions, we are operating contrary to our strategic interest."
  9. Needed... bad by SnprBoB86 · · Score: 2, Informative

    Does anyone here actually use the official AIM?

    It gets significantly more bloated and less usable with each version. I have stopped upgrading it and only continue to use it because coupled with Dead AIM it is bearable and necessary as everyone I know uses AIM for IMs. I understand GAIM or Trillian will also connect with AIM, but my point is that AOL is butchering what was once a simple and elegant program.

    --
    http://brandonbloom.name
  10. Industry Support by Guidlib · · Score: 2, Interesting

    It's interesting to note that Apple will be supporting this protocol. Perhaps that will be the start of some big industry backing.

    1. Re:Industry Support by Erbo · · Score: 2, Insightful

      Actually, Apple has been supporting this protocol, in their iChat software. It uses Rendezvous to get two nodes connected, but the message traffic between them is XMPP.

      --
      Be who you are...and be it in style!
  11. Open Source Reference Implementation by augustz · · Score: 4, Insightful

    What Jabber struggles with is a high quality open source reference server implementation that can serve as the center of gravity for server side jabber development.

    Whether it is higher level C# / Java or lower level C++ / C there isn't (yet) a body of software with a lot of developer momentum behind it.

    Jive just released some of their stuff, will be interesting to see how that unwinds.

    If Jabber could get to that gravity producing mass on an open source implementation, I think you'd start to see Jabber expand into reliable messaging, higher volume messaging, presence, communication, BPM and lots of others apps.

    1. Re:Open Source Reference Implementation by ari_j · · Score: 2, Informative

      One problem I had when using Jabber for my honors thesis project was that it doesn't make fully-standard use of XML. The way the protocol assumes namespaces work is incorrect, and a fully-compliant XML parser will not work with Jabber, in my experience.

    2. Re:Open Source Reference Implementation by ari_j · · Score: 2, Informative

      That's a false statement. You have to shut off any kind of even minimal validation for it to parse. The assumptions XMPP makes about namespaces are false.

      The Jabber protocol isn't bad. It's got its quirks, but overall it's okay. The problem is that it relies on a broken version of XML to be useful. Anyone who's written a validating XML parser with the intent of using it to speak Jabber (as I did 3 years ago) can tell you just how bad it is. Things like treating "xmlns" as a regular attribute instead of a namespace declaration, not including subnodes in the correct namespace, barfing if you send a subnode with the correct namespace specified, etc.

      It's for that reason that my relationship with Jabber is love-hate. I love the "presence-aware XML router" nature of it, but I hate that it operates on bad assumptions about XML.
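For readers unfamiliar with the namespace rules being argued about, here is a small sketch using Python's standard `xml.etree.ElementTree`. The stanza is a made-up example, not taken from the thread; it only shows how a namespace-aware parser treats `xmlns` as a declaration rather than an ordinary attribute, and how child elements inherit the default namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical stanza, for illustration only.
STANZA = (
    '<message xmlns="jabber:client" to="juliet@example.com">'
    "<body>hello</body>"
    "</message>"
)

root = ET.fromstring(STANZA)

# The default namespace becomes part of the element name...
print(root.tag)
# ...and "xmlns" does not show up among the ordinary attributes.
print(root.attrib)

# The child inherits the default namespace, so an unqualified lookup misses it
# while a qualified lookup finds it.
unqualified = root.find("body")
qualified = root.find("{jabber:client}body")
print(unqualified, qualified.text)
```

Software that instead compares raw tag names or reads `xmlns` as a plain attribute will disagree with a parser like this one, which is the kind of mismatch the comment describes.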

  12. Re:How about G-Chat by BicycloHexane · · Score: 2, Funny

    Real Time Ads! Next time I cyber I will get distracted by hardcore pr0n text ads!

  13. Re:GJabber by timealterer · · Score: 4, Informative

    Right from the Google corporate philosophy: "Google does search. Google does not do horoscopes, financial advice or chat."

    While it'd be wonderful for Google to come along in its shining armour and rescue us from the oppression of closed IM protocols, I think the fact that not doing chat is right in their official philosophy is worth noting. Of course Apple's iChat will have support for it, in OS X 10.4, and others may well follow... just maybe not Google.

    --
    - Allen Pike
    Altering time, one time at a time.
  14. Not just instant messaging by jgarzik · · Score: 5, Informative

    It's worth pointing out that XMPP is not just for instant messaging.

    XMPP standardizes a method for exchanging structured information streams between autonomous entities -- be they human or automated agents.

    Thus, when you (as an engineer) need to set up a network of programs that all communicate with each other, you don't have to roll your own protocol, XMPP can do it for you.

    Although IRC "botnets" have existed for quite some time, they are typically very primitive and exist mostly in the realm of script kiddies. Further, IRC is unformatted, unstructured, un-standardized text, making it very difficult to parse reliably.

    XMPP allows networked programs to communicate with each other in a "native" language -- data structures -- rather than attempting to glean information from a line of IRC ASCII.

    I'm currently using XMPP for several local applications: backup agents communicating with each other, sending and receiving mon monitors and alerts, an improved (RSS-like) syndication system, and more.

    This ain't your grandfather's IM protocol.

    1. Re:Not just instant messaging by jgarzik · · Score: 3, Insightful

      To elaborate a bit further...

      The reason why XML is so darned handy is that it captures the essence of what makes a computer useful: data structures. XML standardizes parsing (never write a text parser again), leaving only the task of grok'ing data structures to the programmer.

      XMPP takes that a step further: a standardized way two programs may exchange data structures.

      To me, this has all sorts of useful implications, particularly in enterprise installations. Now engineers can stop rolling their own TCP protocols just to get two custom applications to communicate with each other; they can now use XMPP, and exchange data structures.
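The idea of two programs exchanging data structures rather than lines of text might look roughly like the sketch below. It is an illustration under stated assumptions: the namespace `urn:example:backup-status`, the element names `status` and `field`, and both function names are hypothetical and are not defined by XMPP or by the poster's applications. Only the outer `message` element in `jabber:client` follows the protocol's conventions.

```python
import xml.etree.ElementTree as ET

# Hypothetical application namespace, invented for this example.
APP_NS = "urn:example:backup-status"


def encode_status(to: str, fields: dict) -> str:
    """Wrap a flat dict in a message stanza with a namespaced payload."""
    msg = ET.Element("{jabber:client}message", {"to": to})
    payload = ET.SubElement(msg, f"{{{APP_NS}}}status")
    for name, value in fields.items():
        item = ET.SubElement(payload, f"{{{APP_NS}}}field", {"name": name})
        item.text = str(value)
    return ET.tostring(msg, encoding="unicode")


def decode_status(xml_text: str) -> dict:
    """Recover the dict on the receiving side (all values come back as text)."""
    msg = ET.fromstring(xml_text)
    payload = msg.find(f"{{{APP_NS}}}status")
    if payload is None:
        return {}
    return {f.get("name"): f.text for f in payload.findall(f"{{{APP_NS}}}field")}


if __name__ == "__main__":
    wire = encode_status("monitor@example.com", {"host": "db1", "bytes": 1024})
    print(wire)
    print(decode_status(wire))
```

The receiver never scrapes free-form text. It asks the parser for the elements in its own namespace and ignores anything else in the stanza.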

    2. Re:Not just instant messaging by noselasd · · Score: 2, Insightful

      Ok. What have they done to prevent spam here? email/smtp had some design flaws when it comes to identifying senders. Does Jabber "solve" this?

    3. Re:Not just instant messaging by sw155kn1f3 · · Score: 2, Interesting

      Sheesh..
      It's what XMLRPC and SOAP are for. Look ma, RPC over XML via any medium (http, e-mail, ftp etc etc). And it's _widely_ supported.
      Would you please enlighten me why XMPP is better than SOAP?

      --
      - Arwen, I'm your father, Agent Smith.
      - Well, you're just Smith, but my father is Aerosmith!
    4. Re:Not just instant messaging by HyperCash · · Score: 2, Funny

      You know you're getting old when your grandfather has an IM protocol.

      --HC

      --
      So I'm jump'n up and down screaming show me the money.
  15. Great start! by pyrrhonist · · Score: 4, Informative
    This is great news! After years of being an Internet Draft, Jabber finally entered the Internet Standards Track. This is good news for end-users, as a standard IM protocol with a standard presence protocol is exactly what we need to integrate disparate messaging devices like cell phones, VOIP phones, and IM clients. I am totally thrilled about this.

    Since XMPP has been in development for a while, hopefully it shouldn't take too much time for it to climb the Standards Track to full Internet Standard. Right now, XMPP is in the Proposed Standard category, which is the first step (look at the bottom of the list).

    The next level up is Draft Standard. To become a Draft Standard, the RFC has to be a Proposed Standard for at least six months, have two independently developed interoperable implementations, and have had "sufficient" successful use. I think that Jabber is pretty much a shoo-in for this category. Several servers have been in operation for years, from which a large amount of experience with the protocol has been gained, so there shouldn't be any contention about XMPP not being mature. There are many independent implementations, so that shouldn't be an issue either. I don't think there will be any problems getting to Draft Standard in six months.

    The final step in the Standards Process is Internet Standard, where the RFC retains its RFC number, and gets the all important STD series number. A standard needs to be in the Draft Standard category at least four months (or until at least one IETF meeting has occurred, whichever comes later). On the technical side, there needs to be a significant implementation of the protocol and much more experience using it needs to be gained. The level of maturity for Standards is such that the protocol is believed to be beneficial to the community. Again, since XMPP has been in the works for over two years now and there are now commercial implementations, I don't think there is a problem with the usage requirements. Furthermore, as the only open messaging protocol, it has a large value to the Internet. Thus, I think getting Jabber to full standard easily is not out of the question.

    In about a year, we'll have an Internet Standard for IM and presence (and an open one, at that)!

    --
    Show me on the doll where his noodly appendage touched you.
  16. Re:That's cool and stuff, but by Anonymous Coward · · Score: 5, Informative

    It's not an IM client, it's a protocol that can be easily extended to do just about anything. I'm pretty sure there's something out there that can use video/audio conferencing, if not, then one will soon appear if there's enough demand.

  17. Don't you just love 90 page RFC ? by Anonymous Coward · · Score: 2, Interesting

    90 pages.
    As excessively verbose as XML streams it describes.
    Yuck.

  18. Re:Office Use by mindstrm · · Score: 2, Interesting

    Not quite.. jabber is multi-site too.. moreso than any of the other IM clients which require a central server farm to work.

    user@foo.com can message user@bar.com. you can set up your own jabber server and join the global jabber community.

    it works like email.. DNS looks up the domain, finds the appropriate record for the server to use, and then delivers.
    Ultimately, all the other IM systems (msn, aim, etc) are centralized... we rely on one provider. Jabber is completely internet-scale.. infinitely more scalable than the others.. that's one large long-term advantage.
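The DNS step mentioned above can be sketched without touching the network. This is a simplified illustration, assuming the `_xmpp-server._tcp` SRV label and the fallback port 5269 used for server-to-server connections. The record tuples are hypothetical input that a real resolver would supply, and ordering ties by weight is a simplification of the weighted random selection that SRV actually specifies.

```python
from typing import List, Tuple

# (priority, weight, port, target) as returned by some resolver; hypothetical input.
SrvRecord = Tuple[int, int, int, str]

DEFAULT_S2S_PORT = 5269  # assumed fallback when a domain publishes no SRV record


def srv_query_name(domain: str) -> str:
    """Name to look up in DNS for the server that handles `domain`."""
    return f"_xmpp-server._tcp.{domain.rstrip('.')}."


def candidate_hosts(domain: str, records: List[SrvRecord]) -> List[Tuple[str, int]]:
    """Order the hosts to try: lowest priority value first, then heavier weight."""
    if not records:
        # No SRV records: fall back to the domain itself on the default port.
        return [(domain, DEFAULT_S2S_PORT)]
    ordered = sorted(records, key=lambda r: (r[0], -r[1]))
    return [(target.rstrip("."), port) for _prio, _weight, port, target in ordered]


if __name__ == "__main__":
    print(srv_query_name("bar.com"))
    print(candidate_hosts("bar.com", [(20, 0, 5269, "backup.bar.com."),
                                      (10, 5, 5269, "xmpp1.bar.com.")]))
```

This mirrors how mail delivery consults MX records: the sender only needs the recipient's domain, and the domain's own DNS says which machine to contact.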

  19. Simple Explanation by ari_j · · Score: 2, Insightful

    Jabber is a presence-aware XML router. Beyond that, it's imagination-bound.

    1. Re:Simple Explanation by ari_j · · Score: 2, Informative

      I should note that I used Jabber as the network layer of my honors thesis project, a general-purpose distributed computing architecture. If my web server weren't dead this week, I'd share a link with you, but rest assured that Jabber makes things simple that would otherwise not be.

  20. Good for Jabber by marktaw.com · · Score: 4, Insightful

    AIM/ICQ, Yahoo and MSN have no need to adopt open standards, and never will. Yahoo does so much stuff that Jabber doesn't do - Imvironments, Audibles, etc., and more importantly, they want to be proprietary so they can decide whether or not to allow third party clients to connect to their service. Twice in the past year I've been locked out of Trillian because of Yahoo, and once they even caused Trillian to crash completely. I had to wait for an update to Trillian, which was available within 24 hours. Supporting open standards wouldn't let them do that. Remember, running a massive IM server and developing a client doesn't make you money, but showing ads does, and Yahoo brilliantly works these in as Imvironments.

    Imvironments and Audibles, proprietary smilies, etc. are also strong arguments for using Yahoo's client rather than Gaim or Trillian. I don't get any of those things, and someone with Yahoo will inevitably complain that I'm not in Yahoo, so I have to launch it. Very clever and "viral" of them.

    Jabber will probably never reach the same market penetration as the other IM clients, but that's ok, it's not really competition for them. You use AOL if you want to talk to your friends no matter where they are. You use Jabber because you want complete control over your chat network - who can connect, whether or not you log chats centrally on the server, and who can eavesdrop.

    Jabber can work entirely behind a firewall, so your employees can talk to each other and not worry about revealing trade secrets to someone else sniffing their conversation, or talking to their friends and wasting company time. Or you use Jabber because you're conducting business you don't want someone else to find out about. For example, Google might want to use Jabber to communicate because MSN, Yahoo and AOL are their direct competitors and could listen in to their conversations.

    You also use Jabber because you deal with clients and need an audit trail. By logging conversations centrally on a server, you can produce an audit trail superior to even email. Being centrally located, if you trust that nobody's tampered with it, you get chat logs that prove what was said when to who, and what the response was. This is similar to centralized web-based trouble ticket systems.

    So, while Jabber may have many mechanical similarities to the other IM clients, the actual uses and needs it fulfils are somewhat different.

    1. Re:Good for Jabber by kyhwana · · Score: 2, Insightful

      Maybe you like all that flashy crap, but I use IM (gaim in windows in this case) for just IM and (sometimes) file transfer. I don't want imvironments "smilies", "audibles" and all that other crap.
      (It also lets you ignore all the fonts/colours/etc that other people have set, which is a godsend, so I don't have to try to read red text on a purple background or something equally hideous)

      For me, all the crap is a strong argument for NOT using yahoo's client.

      --
      My email addy? should be easy enough.
  21. Re:That's cool and stuff, but by spikerini · · Score: 3, Informative

    Don't worry. Audio/Video chat is currently being implemented in the Psi jabber client using Jabber/Helix. It shouldn't take too long before it's finished.

  22. Re:That's cool and stuff, but by JThundley · · Score: 3, Funny

    And you could be saving hundreds of dollars on car insurance by switching to voip! I mean! FUCK!

  23. XMPP Still Broken by Anonymous Coward · · Score: 5, Interesting
    XMPP has a serious design flaw in that it does not implement a framing protocol that helps the software that parses the protocol to separate messages. This might not be an obvious problem to the casual programmer, but if you are going to make scalable implementations that can multiplex thousands of connections, this is a very serious problem indeed.

    I'll give an example:

    Imagine the HTTP protocol for persistent connections. Let's imagine for a moment that all HTML instances are well formed and that the only other file type to be transferred is JPEG images. Now imagine that responses came without HTTP headers describing the nature of the response as well as the size. Content-length is really important. It dictates the amount of processing the software needs to do to determine when it has read a whole element of the protocol. This is an _IO_ operation and you should NOT have to parse during pure IO.

    You might say "well, if it is HTML, then just parse it and see where it ends, and if it is a JPEG, heck you just parse that and see where it ends".

    No proper framing.

    Now imagine you are writing an HTTP Cache server which needs to do this for tens of thousands of connections simultaneously. Hard? Of course it is. Hard to do right at least. (We leave the kindergarten solutions to freshman students).

    The problem hinges on the fact that in most scalable implementations, you do not follow the one-thread-per-connection paradigm, hence you need to be able to process input in chunks. Given that you are processing many connections at the same time, you want to minimize context for each connection; ie. the amount of state you have to keep around to make sense of the data.

    The only way to securely know that the data you've read so far contains a valid element is to try and parse it. If you were able to consume an element, fine, if not, you have to read more data and try to parse the entire thing all over again. (Also, now you need to figure out how much you consumed, and thus, how much of the input buffer you can throw away).

    Of course, you could make your own primitive XML parser which can infer stanza boundaries, but everyone who has written an XML parser that is reasonably standards compliant knows that this is not easy. In fact, it is a significant project unto itself.

    It is not like this is a new problem. Just look at BEEP (or whatever it is called now). The designers of BEEP quickly realized just how incredibly clumsy a protocol that does not do proper framing is, so they added framing to an XML protocol, and hey presto, you have a protocol that is a lot easier to implement correctly AND efficiently. Or HTTP, for that matter.

    I feel that the Jabber team didn't do their homework, and I am incredibly disappointed that IETF didn't have someone flag these problems. The fact that it has been so many years since they started working on this, and that they have not stumbled across this themselves does not bode well for the Jabber team.

    Let's hope they do the right thing now and add proper framing to their protocol. This way it becomes much easier to implement correct and really scalable servers, and we might be able to get a usable standard that can be used for large-scale IM.
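To make the framing argument concrete, here is a minimal sketch of what length-prefixed framing buys the IO layer. The wire format (a 4-byte big-endian length before each payload) is a hypothetical one chosen for illustration. It is not part of XMPP or BEEP, and the function names are invented.

```python
import struct
from typing import List

HEADER = struct.Struct("!I")  # hypothetical 4-byte big-endian length prefix


def frame(payload: bytes) -> bytes:
    """Sender side: announce the payload size before the payload itself."""
    return HEADER.pack(len(payload)) + payload


def extract_frames(buffer: bytearray) -> List[bytes]:
    """Receiver side: pull out every complete frame, leave the remainder.

    No parsing of the payload is needed to find message boundaries; the
    only per-connection state is the unconsumed tail of `buffer`.
    """
    frames = []
    while len(buffer) >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer, 0)
        end = HEADER.size + length
        if len(buffer) < end:
            break  # frame not complete yet, wait for more bytes
        frames.append(bytes(buffer[HEADER.size:end]))
        del buffer[:end]
    return frames


if __name__ == "__main__":
    data = frame(b"<message/>") + frame(b"<presence/>")
    buf = bytearray(data[:20])        # the second frame arrives only partially
    print(extract_frames(buf))
    buf += data[20:]
    print(extract_frames(buf))
```

With a scheme like this, deciding whether a whole message has arrived is a length comparison, which is the property the comment says is missing from a bare XML stream.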

    1. Re:XMPP Still Broken by vidarh · · Score: 4, Insightful
      You're assuming that parser state is going to be larger than keeping an input buffer large enough to keep a complete message.

      In the real world, XML is a very verbose protocol, and in most cases it is trivial to store the incoming data in a less space consuming format. Using a SAX parser that is reasonably efficient, the only state you will need to keep track of is namespace declarations and open tags - that is highly unlikely to be much data, and certainly unlikely to get anywhere close to closing the gap between the maximum size of a well parsed data set and the maximum allowed size of a message. As a consequence, a well written server should REDUCE state by parsing as you go, not increase it, and only a complete moron would keep trying to parse the message over and over again from the beginning each time.

      Even if this wasn't the case, a well written system would run into bandwidth limitations and IO limitations of the server long before memory limitations in any sane configuration - memory and CPU are cheap, good IO subsystems aren't.

      As a benefit of this approach, by the time all the data has arrived from the client, you have the message in a much more efficient representation. In fact, in many cases you might have enough information even before you have received the complete message that you may not even need to store the rest of the message as it comes in.

      The idea that you need the complete message before you start doing work on it is flawed - it implies that during sudden bursts of activity, your system will sit mostly idle until complete messages have been received, and then suddenly be swamped with processing, instead of spreading the processing cost over the whole time it takes to receive a message, which could potentially be a "long time" for many clients on slow connections.

    2. Re:XMPP Still Broken by vidarh · · Score: 2, Informative
      I don't use Java unless I'm forced to :-) And I don't have any code that I could share without violating past employment contracts, no, but I can describe the approach - there's nothing novel in it, it is simply a combination of two techniques: 1) Doing buffered multiplexing IO by maintaining a buffer per connection and doing non blocking reads of as much data as possible in a single syscall and 2) throwing away redundant information during a parse to "compress" the read information to only what your application actually needs. I don't even remember where I picked up these techniques - I consider them blindingly obvious, and it's how I've been writing networking code for the last ten years at least.

      First of all, a long stream of Jabber messages is irrelevant - you won't be handling more than the greater of your IO buffer (and don't try to tell me 100MB would be a sensible size) or one XML stanza and parts of another at most - so unless you're leaking memory the length of the stream has absolutely no bearing. The IO buffer size is a tradeoff between performance and memory usage, but can be fairly large since in a multiplexed environment it will be shared between all the connections when you are parsing the message as you go.

      If you want to parse the message at the end, however, you'd need to retain a copy of the message.

      Second, a SAX parser needs to maintain only very limited state between calls - it needs the XML namespace declarations and a list of open tags, and enough space to hold a single incomplete token. I'm sure you can find implementations that keep all kinds of extra around, but I've both used and written parsers that have made do with just the above.

      The general approach then is to maintain a fixed size IO buffer, fill it as much as possible with a single non blocking read (which btw. is another reason to consider interspersing processing with the reads, as you for some workloads otherwise end up context switching to fill up only small parts of the buffer at a time), feed it to the SAX parser. If any events are fired, your code looks at the data and throws away any parts it doesn't need, or stores it in whatever more compact form is convenient for you. If the message is complete, process whatever stored data you have gathered from it, and throw it away afterwards. Repeat as long as the connection remains active.

      Since XML usually is extremely redundant, it is normally trivial to come up with an internal representation that is more space efficient than raw XML, but many applications (including XMPP) don't even need all the data from many messages - often you only need to store parts of the data you are receiving. If space is really important for you, and you're dealing with limited XML vocabularies, this is particularly true, as "compressing" the XML can be very efficient.

      Applying this to multiplexed IO is exactly the same as for the single connection case. You maintain the state required per socket. Each time the socket becomes readable, you fill the buffer as much as possible, and feed the SAX parser.
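
      A rough sketch of the approach described above, using Python's expat binding as a stand-in for a SAX parser. The class name StanzaStream, the callback on_stanza and the choice of which fields to keep ("kind", "from", "body") are all made up for illustration; a real server would keep whatever its application needs. The per-connection state is just the parser, a depth counter and the partly built stanza.

```python
import xml.parsers.expat


class StanzaStream:
    """Per-connection state: an incremental parser, a depth counter,
    and a compact record of the stanza currently being received."""

    def __init__(self, on_stanza):
        self.on_stanza = on_stanza      # hypothetical callback, one call per stanza
        self.depth = 0                  # number of currently open elements
        self.current = None             # compact form of the stanza in progress
        self._in_body = False
        self.parser = xml.parsers.expat.ParserCreate()
        self.parser.StartElementHandler = self._start
        self.parser.EndElementHandler = self._end
        self.parser.CharacterDataHandler = self._text

    def _start(self, name, attrs):
        self.depth += 1
        if self.depth == 2:
            # A direct child of the stream root starts a new stanza.
            self.current = {"kind": name, "from": attrs.get("from"), "body": ""}
        elif self.depth == 3 and name == "body":
            self._in_body = True

    def _text(self, data):
        # Keep only the text we care about; everything else is dropped.
        if self._in_body:
            self.current["body"] += data

    def _end(self, name):
        if self.depth == 3 and name == "body":
            self._in_body = False
        elif self.depth == 2:
            # Stanza complete: hand it over and forget it.
            self.on_stanza(self.current)
            self.current = None
        self.depth -= 1

    def feed(self, chunk):
        # chunk is whatever the last non-blocking read returned.
        self.parser.Parse(chunk, False)


if __name__ == "__main__":
    stream = StanzaStream(print)
    data = (b"<stream:stream xmlns:stream='http://etherx.jabber.org/streams' "
            b"xmlns='jabber:client'>"
            b"<message from='a@example.org'><body>hi</body></message>")
    # Simulate reads that split the XML at arbitrary byte offsets.
    for i in range(0, len(data), 7):
        stream.feed(data[i:i + 7])
```

      For multiplexed IO you would keep one such object per socket (for instance in a dict keyed by file descriptor) and call feed() whenever that socket becomes readable.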

      No, it's not guaranteed to always save space - if you NEED to do processing on the raw XML for whatever reason, or NEED all the data before you can do processing that allows you to throw some of it away (after passing it on to another client or server for instance), you could apply compression techniques to your IO buffer without parsing the XML and might end up doing better, but I've yet to come across a single real life application where that would have been a useful approach.

      That said, I'd not usually spend time doing any of this except as a side effect of the convenience of storing the data in a more useful form than raw XML, as memory is cheap and developer time isn't. Given relative component costs it's almost always IO performance that is the most cost effective to optimize in the kind of systems I've dealt with.

  24. The Future of IM by Scott+Robinson · · Score: 2, Insightful

    I'm surprised everyone thinks Jabber is DOA. It's no MSN, AIM, or Yahoo. However, it's not supposed to be.

    Currently, Jabber is an open IM standard with tools available now. It has been receiving large rollouts for corporate use, and plenty of people use it exclusively for IM. (Myself, recently, included.)

    In the future, instant messaging will become more important. Be it text, audio, video, or something new, Jabber (with its XML base) can theoretically support it nicely.

    And the worry about numbers isn't something I have. It's fairly simple scalability math. For example, if every cellphone/mobile device comes online and even a quarter of them use instant messaging, the AIM/MSN/Yahoo userspace will be completely swamped.

    1. Re:The Future of IM by borud · · Score: 2, Insightful
      It has to have more going for it than being free.

      It has to be good as well.

      Free is pointless if it is not good as well, and I am not convinced that from a technical point of view, Jabber is quite what everybody was/is hoping for.

  25. XMPP Framing problem fixed or not by borud · · Score: 2, Informative
    I just threw a cursory glance at the XMPP specification and I still can't see any fixes for the framing problem.

    I had a look at Jabber years ago, but what put me off what is now known as XMPP was that it didn't solve the problem of framing stanzas. The only way to determine the borders of a stanza, and thus when you have read enough to successfully parse it, was by parsing the content.

    When you write a high-performance multiplexing server (for any protocol) you wish to minimize the state associated with each session or connection. I am not sure this is necessarily easy for Jabber. Its lack of proper framing dictates that you need to do some serious thinking about how to end up not wasting a lot of memory and CPU. Not really important if your server has ~100 clients, but when you want to accommodate millions of clients (as must be the goal for any large ISP when choosing an IM architecture), these things translate into dollars.

    As someone else pointed out: BEEP solves the framing problem, as does HTTP.
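
    To make the contrast concrete, here is a minimal sketch of explicit framing. It is neither BEEP's nor HTTP's actual wire format; the 4-byte length prefix and the names frame() and FrameReader are assumptions chosen only to show the idea that the receiver knows where a message ends without looking inside it.

```python
import struct


def frame(payload: bytes) -> bytes:
    """Prefix the payload with its length as a 4-byte big-endian integer."""
    return struct.pack("!I", len(payload)) + payload


class FrameReader:
    """Per-connection state is a single byte buffer; no content parsing needed."""

    def __init__(self):
        self.buf = bytearray()

    def feed(self, chunk: bytes):
        """Append newly read bytes and return every complete payload so far."""
        self.buf += chunk
        frames = []
        while len(self.buf) >= 4:
            (size,) = struct.unpack_from("!I", self.buf, 0)
            if len(self.buf) < 4 + size:
                break  # the rest of this frame has not arrived yet
            frames.append(bytes(self.buf[4:4 + size]))
            del self.buf[:4 + size]
        return frames
```

    With this kind of framing the server can skip, forward or queue a message by byte count alone, which is the property the XMPP stream does not give you.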

    How do you solve the framing problem in XMPP? How would you go about designing a multiplexing implementation that can handle, say, 1000 connections on an 800MHz P3 without burning a lot of CPU?

    (The figure was chosen because I've observed a hub IRC server handle 7-800 client connections and 4 servers on IRCNet while only consuming about 10% CPU in steady state)

  26. Crypto by daserver · · Score: 2, Informative

    Was really excited seeing RFC 3923: End-to-End Signing and Object Encryption for the Extensible Messaging and Presence Protocol. I thought the Jabber people were only making RFCs for the basic protocol.

    Sadly I don't think there are any clients supporting it yet?

  27. All kinds of office by JavaPriest · · Score: 2, Interesting

    IM is even used in warfare.

    A good example of this is the CTF-50 Case Study done by OFT. The types of capabilities they used to increase Mission Effectiveness (i.e. Instant Messenger, Web-logs, basic Portal) would be available directly from Core Information Services.

    The study doesn't say which IM protocol/client was used. The value of IM over phone/radio was having a history of what was communicated.

  28. Re:Yay! by lordpi · · Score: 3, Interesting

    I was thinking the exact same thing. My only guess is that SIP/SIMPLE doesn't have the same amount of 'corporate' backing to push it through the standards process? Although, from other, recent articles, I was led to believe that SIP had made some inroads in VoIP and P2P... So it is a surprising development.

  29. Re:Yay! by Trejkaz · · Score: 2, Interesting

    As far as I understand it, both standards are attempting to map themselves to CPIM (RFC-3862). And I'm pretty sure there is already at least one working gateway from Jabber to SIMPLE, so the two can co-exist in practice anyway.

    In the end I hope it's the developers who get the say over which one stays and which one goes. If they get intimidated by the ironic nature of SIMPLE (it's not simple!), and every developer decides to use Jabber/XMPP instead, then all the best apps should inevitably be based on Jabber. That would pull in the most users, and they would win.

    About the worst thing that could happen would be for Microsoft to back SIMPLE, write some shitty apps for it, and force them down the throats of the users of their OS. Which... is probably what's going to happen, since Microsoft have been supporting SIP for some time now.

    --
    Karma: It's all a bunch of tree-huggin' hippy crap!
  30. Re:IRC by pyrrhonist · · Score: 3, Informative
    I still wish they would have just improved on IRC. IRC has been around since the late 1980s, and was a significant improvement over talk.

    Yeah, IRC has some nice features, and it was the way to do IM before there was such a thing as IM (talk and write be damned). All the cool kids were using it.

    Unfortunately, its adoption as a standard ran into some issues:

    • RFC 1459, the Internet Relay Chat Protocol RFC was placed into the "Experimental" category.
    • Many programs implemented special improvements that were eventually collectively released as RFCs 2810 through 2813. These RFCs, though, were marked as "Informational".
    • The IRC Client-To-Client Protocol (CTCP) for sending structured data between clients was released as an Internet Draft, but was never made an RFC.
    I think the real killer of IRC as a standard was the release of RFC 2779, "Instant Messaging / Presence Protocol Requirements". IRC just wouldn't fit this model without a major overhaul, and at that point, you have to question whether it would be worth trying to do that without sacrificing compatibility. It was probably easier to just write a new standard.

    How does it compare to Jabber? Well, IRC is much simpler (try to write IRC with netcat, then try XMPP).

    At its base level, yes, it's definitely easier. You can do most of what you need for IRC with just a Telnet client. This is kind of fun actually.
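
    To give a feel for how little there is to it, here is a small sketch of splitting one IRC line into its parts, following the general message layout from RFC 1459 (optional ":prefix", a command, space-separated parameters, and an optional trailing parameter introduced by " :"). The function name parse_irc_line is made up, and the sketch ignores the RFC's corner cases.

```python
def parse_irc_line(line):
    """Split a raw IRC line into (prefix, command, params)."""
    line = line.rstrip("\r\n")
    prefix = None
    if line.startswith(":"):
        # The prefix names the sender and runs up to the first space.
        prefix, _, line = line[1:].partition(" ")
    if " :" in line:
        # Everything after " :" is a single trailing parameter with spaces.
        head, _, trailing = line.partition(" :")
        params = head.split()
        params.append(trailing)
    else:
        params = line.split()
    command = params.pop(0)
    return prefix, command, params


if __name__ == "__main__":
    print(parse_irc_line(":alice!a@host PRIVMSG #chan :hello there\r\n"))
    print(parse_irc_line("PING :irc.example.net\r\n"))
```

    The same line-at-a-time format is what makes typing the protocol by hand into Telnet or netcat practical.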

    --
    Show me on the doll where his noodly appendage touched you.
  31. Penetration? Uncle Sam? Ewwwww by Lord+Prox · · Score: 2, Interesting

    Also, I seem to remember something about NASA and FEMA choosing Jabber for their needs, and IBM's CapWIN law enforcement / first response flash network using Jabber.

  32. A business nervous system by Colin+Smith · · Score: 2, Interesting

    You're absolutely right. Jabber could be the nervous system of future businesses. I've been putting inter-application communication systems together using NNTP servers, given the costs of traditional middleware systems. It's quite a lot of work and the data formats are simple, but it works fairly well. Jabber, though, would be faster and could be more standardised and more ubiquitous.

    e.g.
    http://www.archeus.plus.com/colin/middleware/

    --
    Deleted
  33. Re:Indeed. Using an XML based protocol is a farce. by dangermouse · · Score: 4, Informative
    XML is when something has to be human readable and unless its for the benefit of some line tapping hacker who the hell is going to read IM packets?

    No, it's not. If you'd ever developed with XML, you'd know human-readability is not a major reason to use it.

    Not only is XML bloated and so sucks up bandwidth (important if you're still on dial up) but its slow to parse and generally ugly.

    XML compresses amazingly well. I have an OpenOffice spreadsheet that's 25MB in uncompressed XML. Zipped up, as OpenOffice files are, it's about 150k. That's an extreme example, but grab any xhtml web page and gzip it.
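
    If you want to try this yourself, a quick sketch with Python's zlib module follows. The markup is synthetic and deliberately repetitive (the element names are made up, not OpenOffice's real format), so the ratio it prints will be better than what a typical real document gives; the point is only that repeated tags and attribute names cost very little once compressed.

```python
import zlib

# Synthetic, highly repetitive XML: 200 rows of 50 identical cells.
cell = '<cell type="float" value="0"><text>0</text></cell>'
document = ("<row>" + cell * 50 + "</row>\n") * 200

raw = document.encode("utf-8")
packed = zlib.compress(raw, 9)

print("uncompressed bytes:", len(raw))
print("compressed bytes:  ", len(packed))
print("ratio: %.0f to 1" % (len(raw) / len(packed)))
```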

    "But its for developers!" someone shouts. I'm sorry? Just how dumb a developer do you have to be to not be able to grok some efficient binary protocol? "But its a standard" someone else shouts. No it isn't. XML is a shell , you can fill it with any old shit and just because something else is "XML based" doesn't mean it will understand it.

    Yes, but XML is a standard shell. Data encoded in XML can be parsed, looked up, accessed, transformed, and represented in code using off-the-shelf toolkits which are extremely good at doing all of those things. You don't have to fuck about writing a parser and a lexer, you can just grab some stuff off Jakarta and go to work on your application instead of its IO format. Furthermore, XML is extensible (that's what the X is for)... if your format requires additional information in the future, or needs to act as a carrier for another format's info, that's already taken care of. Probably a good thing for a message-passing protocol, don't you think?

    Using XML for IM is a clear case of jumping on the bandwagon for no reason other than the sheep mentality coming to the fore.

    Funny, my first thought when I saw your post was "oh look, another cynical-but-wise wrong-tool-for-the-job anti-XML post".

  34. Re:Indeed. Using an XML based protocol is a farce. by Just+Some+Guy · · Score: 2, Informative
    XML is a shell

    Full stop, end of story. XML is nothing more or less than a structured way to store data. What would they get by not using XML, other than having to write their own container format, their own parser, their own editors, their own portable libraries to deal with it, and their own inevitable screwups that happen every single time someone decides to reinvent the wheel?

    Since it's pretty clear that writing ad-hoc parsers for structured data is an obsolete practice, what else could they have used? EDI?

    No, they chose to use the established standard that can take advantage of the optimized and field-tested libraries that are already in widespread use. Frankly, inventing their own representational language would've been the naive alternative that would have resulted in Yet Another Unused Instant Messaging Protocol. They were fortunately more far-sighted than yourself and we now have something useful to show for it.

    --
    Dewey, what part of this looks like authorities should be involved?
  35. Re:Indeed. Using an XML based protocol is a farce. by dangermouse · · Score: 2, Insightful
    Then what is? Because I can't think of any other good reasons.

    Fortunately, you don't have to. I provided you with a brief list later in my reply.

    Thats a bit like saying you can make a go-kart go really fast if you try. Yeah great , but why not just buy a car in the first place then?

    You've got your comparison backward. Your whole argument was that a car (XML), which is larger but is more versatile, wasn't as small as a go-kart (compact, binary format). My point was that if you want, you can negate the size difference while retaining the versatility of the car, so your argument is moot.

    So what , its still just a shell! So you can download some parser to parse it. Oh well great, that saves a weeks development time. And slows down the whole product whenever its run. Hmm , great tradeoff. Not.

    It's again clear that you've never actually developed with XML. If you really care about speed (or need to reduce memory use), you work with SAX streams instead of DOM or other object models. You might take a speed hit compared to working with byte-delimited chunks of binary data, but it will be of a scale you're certainly not going to care about in message-passing, which tends to be a sparsely executed operation anyway.

    I'm also beginning to wonder whether you've actually got a job, as saving a week's development time is often the difference between whether the project gets done or not. In the context of XMPP, this could be a major factor in adoption of the protocol-- bear in mind that's a week's development time saved for every implementation of the protocol.

    No , theres nothing special about "messages" that means they all have to use a standard format. Why shoehorn everything into the same dumb standard? Horses for courses...

    Bear in mind that I'm talking about the messaging protocol (carrier format), not the payload. If your protocol requires changes, isn't it good to be able to add information without necessarily breaking older implementations of the protocol? Wouldn't it be good if they could simply ignore information relating to features they don't support? You can't do that in a byte-delimited binary format without careful and specific pre-planning, the effort of which may be wasted if you're not sufficiently prescient.
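
    A small sketch of what that looks like in practice, using Python's xml.etree.ElementTree. The function read_message and the extension elements x-priority and thread-hint are hypothetical and not taken from XMPP; the "old" reader only knows about the "to" attribute and the body element, and anything added later simply passes it by.

```python
import xml.etree.ElementTree as ET


def read_message(xml_text):
    """An 'old' implementation: it only understands 'to' and <body>."""
    root = ET.fromstring(xml_text)
    return {"to": root.get("to"), "body": root.findtext("body", default="")}


# A message as the original version of the protocol would send it.
old_style = "<message to='a@example.org'><body>hello</body></message>"

# The same message after a later revision added elements the old reader
# has never heard of (both element names are invented for this example).
new_style = (
    "<message to='a@example.org'>"
    "<body>hello</body>"
    "<x-priority level='high'/>"
    "<thread-hint>t1</thread-hint>"
    "</message>"
)

if __name__ == "__main__":
    print(read_message(old_style))
    print(read_message(new_style))
```

    Both calls return the same result, because the reader looks elements up by name rather than by byte offset, so the extra children never get in its way.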

    Absolute fastest speed and optimum compactness are not everything, and are usually pretty far down on the list of requirements even for an application-level network protocol. They are almost always trumped by minimizing development effort, maximizing extensibility and maintainability, and standards compliance (yes, even of "shells"). If this weren't the case, we'd all be writing everything in C and doing pointer math on arrays of gobbledygook all the time.