Slashdot Mirror


Exploring Apache's SOAP Serialization APIs

Irish writes "This IBM developerWorks article discusses the theoretical underpinnings of SOAP's type system. It's a good article for anyone who wants to learn more about SOAP's programmatic support or to simply better understand Apache's SOAP."

17 of 147 comments

  1. A nice SOAP client/server can be found: by Macaw2000 · · Score: 4, Informative
  2. Bruce Schneier has said: by Gis_Sat_Hack · · Score: 5, Interesting

    Implementation of Microsoft SOAP, a protocol running over HTTP precisely so it could bypass firewalls, should be withdrawn. According to the Microsoft documentation: "Since SOAP relies on HTTP as the transport mechanism, and most firewalls allow HTTP to pass through, you'll have no problem invoking SOAP endpoints from either side of a firewall." It is exactly this feature-above-security mindset that needs to go. It may be that SOAP offers sufficient security mechanisms, proper separation of code and data. However, Microsoft promotes it for its security avoidance.

    source:
    http://www.counterpane.com/crypto-gram-0202.html

  3. SOAP ain't so 'S'imple no mo by Anonymous Coward · · Score: 4, Informative

    XML RPC is simple - it has a 4 page specification. SOAP is, well, not so simple. SOAP started out simple, but then committees got a hold of it. Try reading the specification - it's well over 100 pages long - and all legalese. All this crazy namespace and XSLT stuff only adds to its bloat. Surely we all can find a compromise between the simplicity of XML RPC and the robustness of SOAP. I have read that Don Box himself is questioning the SOAP protocol, or at the very least the HTTP transport it is coupled with.

    1. Re:SOAP ain't so 'S'imple no mo by gnovos · · Score: 5, Funny

      XML RPC is simple - it has a 4 page specification. SOAP is, well, not so simple. SOAP started out simple, but then committees got a hold of it. Try reading the specification - it's well over 100 pages long - and all legalese.

      The "Simple" in SOAP is like the Green in Greenland... it's there to keep the non-Vikings out... Look for Complex Hyperbolic Interface Protocol, that'll be the one that M$ ACTUALLY uses...

      --
      "Your superior intellect is no match for our puny weapons!"
  4. Re:Bruce Schneier has said: by steve_l · · Score: 5, Informative

    Bruce hasn't looked at the protocol enough; he is being paranoid.

    Well, doing SOAP callbacks into the firewall is hard because you have to have an accessible endpoint; for this reason you can't do SOAP callbacks over HTTP. But some of the other transports (SMTP, Jabber) do work and go through firewalls like nobody's business.

    Another issue is that you can't tell whether the message is good or bad from the header; it will always be a POST, and the same endpoint/URL could be used for everything from a side-effect-free get to a malicious buffer-stomping write.

    You need to look inside the XML payload, and, being XML, that means understanding XML...string matching is not enough, not when you can disguise stuff with escaping, UTF or Unicode formats, etc.
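To illustrate the point about escaping, here is a minimal Python sketch. The payload, the `cmd` element, and the blocked word are all made up for illustration. A naive byte-level filter misses a word hidden behind an XML character reference, while a filter that parses the XML first sees the decoded text.

```python
import xml.etree.ElementTree as ET

# Hypothetical payload: "drop" is disguised by writing 'd' as the character reference &#100;
payload = b"<request><cmd>&#100;rop table users</cmd></request>"

def naive_filter_blocks(raw: bytes, word: bytes = b"drop") -> bool:
    """String matching on the raw bytes, as a simple filter might do."""
    return word in raw

def parsing_filter_blocks(raw: bytes, word: str = "drop") -> bool:
    """Parse the XML first, then inspect the decoded text of every element."""
    root = ET.fromstring(raw)
    return any(word in (el.text or "") for el in root.iter())

print(naive_filter_blocks(payload))    # the raw bytes never contain "drop"
print(parsing_filter_blocks(payload))  # the parsed text does
```

A real application-level filter would need to understand the service's message schema as well; this only shows why substring matching on the wire bytes is not enough.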

  5. Blobs by igrek · · Score: 5, Informative

    There's one potential problem with SOAP - sending binary objects. You can't insert binary in XML, so the options are:
    - encode binary to 7-bit (hex, etc.) or
    - send it "outside" of the XML, as MIME attachment
    The article mentions these.

    However, there's one more way to do it - the new DIME protocol. It's explained in this article:
    DIME: Sending Binary Data with Your SOAP Messages
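The first option (encoding the binary as text) can be sketched in a few lines of Python. This is only an illustration using base64 from the standard library; the `attachment` element and its `encoding` attribute are hypothetical names, not something defined by SOAP or the article.

```python
import base64
import xml.etree.ElementTree as ET

blob = bytes(range(256))  # arbitrary binary data, including NUL and other control bytes

# Sender: base64-encode the bytes so they become plain ASCII text inside the element.
elem = ET.Element("attachment", {"encoding": "base64"})  # hypothetical element/attribute names
elem.text = base64.b64encode(blob).decode("ascii")
wire = ET.tostring(elem)

# Receiver: parse the XML and decode the text back to the original bytes.
decoded = base64.b64decode(ET.fromstring(wire).text)

print(len(blob), len(elem.text))  # base64 emits 4 output characters per 3 input bytes
```

The size overhead of roughly one third is the price of this approach, which is part of the motivation for sending attachments outside the XML.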

    1. Re:Blobs by Citizen+of+Earth · · Score: 4, Insightful

      <![CDATA[ BINARY DATA HERE BASE 64 ENCODED ]]>

      They should include an additional escape mechanism to solve this problem directly, like (to borrow the horrid CDATA syntax):

      <![BDATA[10 xxxxxxxxxx]]>

      where the "BDATA" means "Binary Data", and the "10" is the number of binary bytes in ascii decimal followed by a single space followed by the indicated number of bytes of raw binary.

      Honestly, what kind of rabid theoreticians designed XML anyway and didn't include a mechanism for raw binary? Were they thinking that people would encode images like:

      <img:image>
      <img:row>
      <img:pixel>
      <img:red>0xFF</img:red>
      <img:green>0xFF</img:green>
      <img:blue>0xFF</img:blue>
      <img:opacity>0xFF</img:opacity>
      </img:pixel>
      ...
      </img:row>
      </img:image>


      Of course, with XML you also face an enormous lexical-scanning cost. One can easily derive a fully-interoperable totally-equivalent binary encoding for XML; perhaps one day people will realize that it's not efficient to pass everything around in text. Imagine spending all day parsing a big array of real numbers encoded in text rather than slurp/swapping raw 8-byte IEEE doubles.
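The size side of this argument is easy to see in a small Python sketch. The `<v>` wrapper element is a made-up stand-in for whatever markup a real message would use, and the numbers are arbitrary.

```python
import struct

values = [i / 7.0 for i in range(1000)]

# Text encoding: each number written out in decimal, wrapped in a hypothetical <v> element.
as_text = "".join(f"<v>{v!r}</v>" for v in values).encode("ascii")

# Binary encoding: raw 8-byte IEEE 754 doubles in big-endian ("network") order.
as_binary = struct.pack(f">{len(values)}d", *values)

# Decoding the binary form is a single unpack call, with no lexical scanning.
decoded = list(struct.unpack(f">{len(values)}d", as_binary))

print(len(as_text), len(as_binary))
```

The binary form is always exactly 8 bytes per value; the text form depends on how many digits each number needs plus the markup around it.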

  6. Redundant Post xml-rpc is by far better by codepunk · · Score: 4, Interesting

    SOAP is nothing more than a poorly designed and implemented version of xml-rpc. Try getting two SOAP services talking together one time. Interop does not exist in the SOAP world. Take a look at xml-rpc for some libs that work (without the hype).

    Let's see 2 page spec vs 200, come on people wake up!
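For a sense of how small an XML-RPC call is on the wire, here is a minimal sketch using Python's standard `xmlrpc.client` module. The method name `add` and its arguments are made up for illustration.

```python
import xmlrpc.client

# Serialize a call to a hypothetical remote method "add" with two integer arguments.
request_xml = xmlrpc.client.dumps((2, 3), methodname="add")
print(request_xml)

# The same module parses the payload back into (params, method name).
params, method = xmlrpc.client.loads(request_xml)
```

The whole request is a `methodCall` element holding a method name and a list of typed parameters, with no envelope, namespaces, or service description.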

    --
    Got Code?
  7. SOAP's popularity will be its problem by conan_albrecht · · Score: 5, Insightful

    I just wrote an article on this. SOAP gets past firewalls because it *looks* like web traffic (at least, HTTP traffic). That's great because most firewalls let HTTP traffic on port 80 through.

    However, once admins realize that we programmers are sending our services (which are inherently a security issue) through port 80, they'll likely start filtering SOAP.

    One of the reasons that RMI and CORBA are firewalled is because they provide remote access to *objects* that might be powerful and that can certainly execute behavior within the trusted environment. SOAP does exactly the same thing, only it looks like HTTP traffic.

    Yes, SOAP can be detected very easily by firewalls. Therefore, I'm predicting that as it becomes popular, many admins won't let it through as easily as it gets through today.
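As a rough illustration of how cheap that detection can be, here is a heuristic sketch in Python. It relies on the SOAP 1.1 HTTP binding, which requires a `SOAPAction` header on SOAP requests; the function name and the dictionary-of-headers interface are assumptions for the example, not any real firewall's API.

```python
def looks_like_soap(method: str, headers: dict) -> bool:
    """Heuristic sketch: flag an HTTP request as SOAP 1.1 from its headers alone."""
    # HTTP header names are case-insensitive, so normalize them first.
    names = {name.lower() for name in headers}
    # SOAP 1.1 over HTTP is a POST that carries a SOAPAction request header.
    return method.upper() == "POST" and "soapaction" in names

print(looks_like_soap("POST", {"Content-Type": "text/xml", "SOAPAction": '""'}))
print(looks_like_soap("POST", {"Content-Type": "application/x-www-form-urlencoded"}))
```

A client that deliberately omits or disguises the header would slip past a check like this, so it only catches well-behaved SOAP traffic.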

    My $0.02.

  8. Oh common by apankrat · · Score: 4, Insightful

    You certainly can preach about the 'feature-above-security mindset that needs to go' for as long as you want, but when it comes to the product not working at your biggest customer's site because of the firewall setup, and they are not willing to mess with it just to try out yet-another-beta proggy, you will consider SOAP, stunnel, httptunnel and anything else that will get you closer to the goal.

    I agree that positioning SOAP as a firewall-transparent protocol is .. err .. may get interpreted incorrectly by less experienced members of comp.sci society, but .. hey! .. you can misuse almost everything.

    .. and (not re: your post, but a thread head) XML-based marshalling? Give me a break ... Once you start tuning the performance, you realize that the bottleneck is often exactly in the freaking SOAP layer with its bloated XML data encoding. You certainly can compress it, but what's the need for XML there, then?

    --
    3.243F6A8885A308D313
  9. OMG by zephc · · Score: 5, Funny

    "XMI (XML Metadata Interchange) is an OMG standard for sharing UML models among applications"

    What about the OMFG ROFL standard?

    --
    "I would say that 99 per cent of what my father has written about his own life is false." - L. Ron Hubbard Jr.
  10. Re:Bruce Schneier has said: by microTodd · · Score: 4, Insightful

    If you use SSL with either Basic Authentication or some PKI mechanism then you could somewhat trust your client anyways.

    Also, some SOAP/Servlet containers don't run on port 80; they run on port 8080 or something like that. Just because it uses HTTP doesn't mean it's using port 80.

    Besides, shouldn't your public web server be in the DMZ anyways, and your SOAP application server inside the firewall? So why are you allowing all port 80 traffic inbound?

    --
    "You cannot find out which view is the right one by science in the ordinary sense." - C.S. Lewis on Intelligent Design
  11. Re:Bruce Schneier has said: by mmusn · · Score: 5, Informative

    By that argument, let's get rid of HTTP. I mean, HTTP invokes remote procedures on the web server, in the form of servlets and CGI scripts. In different words, SOAP is no less secure than HTTP. If your firewall passes HTTP to the wrong internal servers, you have a security problem, no matter whether you are running SOAP or not.

  12. Re:yes you "can" but its silly by Matts · · Score: 4, Informative

    Actually no, CDATA sections don't win you anything beyond not having to escape the < and & characters. You still can't send non-character data within a CDATA section, such as \0, FF, or BEL bytes.
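This is easy to check with any conforming parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the element name `a` and the sample bytes are arbitrary.

```python
import base64
import xml.etree.ElementTree as ET

def parses(doc: bytes) -> bool:
    """Return True if the parser accepts the document as well-formed."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False

# '<' and '&' are fine inside CDATA without escaping...
ok_markup = parses(b"<a><![CDATA[ 1 < 2 && 3 ]]></a>")
# ...but NUL and BEL are not legal XML 1.0 characters, CDATA or not.
ok_nul = parses(b"<a><![CDATA[\x00]]></a>")
ok_bel = parses(b"<a><![CDATA[\x07]]></a>")
# Encoding the same bytes as base64 text keeps the document well-formed.
ok_b64 = parses(b"<a>" + base64.b64encode(b"\x00\x07\xff") + b"</a>")

print(ok_markup, ok_nul, ok_bel, ok_b64)
```

So CDATA only changes how markup characters are treated; arbitrary bytes still have to be encoded as text or sent outside the XML.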

    --

    Matt. Want XML + Apache + Stylesheets? Get AxKit.
  13. Corba over HTTP(S)? by Baki · · Score: 4, Insightful

    One of the advantages of Soap is proclaimed to be that it runs over HTTP (available everywhere) and also it is buzz-word compliant (XML).

    One could also run Corba over HTTP. Corba can use any transport medium. IIOP was only intended to be one of many possible, and if firewalls etc are really the problem, then why not run Corba over HTTP?

    I don't see any other 'advantages' from Soap over Corba. WSDL is an XML format describing the service. Why should it be better than IDL? Both can be parsed by machines and read by humans. With DII (dynamic invocation interface) one can build in generic Corba-over-HTTP client functionality in any program (such as a webbrowser).

    Really, what's new? What's wrong with Corba? Implementing a Corba service in a language such as Java (which takes care of memory management issues and integrates very well with IDL) is trivial. Writing clients even more so.

  14. Re:Bruce Schneier has said: by Zeinfeld · · Score: 4, Insightful

    Bruce says many things he really should not and often with far less thought than he should. You would think that someone who spends so much time talking to journalists would understand the way his pronouncements are taken.

    The reason that Bruce is quoted so often on security is that he returns journalists' calls within an hour or two and gives a quotable quote by the deadline.

    I discussed the SOAP paper with Bruce and Adam. The comment about SOAP was not intended to be taken as gospel, it was simply a throw-away comment added to the end of a section.

    Bruce's security expertise is largely in the area of cryptography. He has not been a player in the network security protocols area. His last foray into that area was his criticism of IPSEC, which was wrong on almost every count according to Steve Bellovin (who knows rather a lot about internet firewalls, having helped invent them).

    The criticisms Bruce makes would be valid if they had not been anticipated. Microsoft has actually developed a very comprehensive security architecture for SOAP and .NET; one of the lead designers was Brian LaMacchia, who some folk will know as the one-time author of the MIT PGP key server.

    A big problem with firewalls is that they are in most cases managed by people whose job is to stop bad things happening; it is not their job to help make useful things happen.

    Another big problem is that they are often used in the manner of a +5 amulet of protection against hackers: the company does not know how they work but hopes they will ward off attacks. My company installs and configures firewalls. It is not uncommon for our PSO to go onsite to re-configure a longstanding installation and find that it is configured for passthrough on all ports.

    If you deploy SOAP you need an application layer firewall. Which coincidentally Microsoft just happened to demonstrate at RSA 2002. Now running a firewall on top of Win2K would be a pretty bad idea; you don't want a full-feature O/S for that type of application. But running a firewall over the NT for embedded systems that is coming soon would be a pretty good idea, particularly with the .NET security framework.

    --
    Looking for an Information Security student project suggestion?
    Try http://dotcrimeManifesto.com/
  15. General Reply by awitod · · Score: 5, Informative

    Hardly ever post here, but I know a thing or two about SOAP and thought you nice folks might find it informative.

    I've done three implementations for three different clients in the last two years. The first integrated an existing UNIX front-end with an existing NT back-end (I know... the real world sure is strange), the second enabled a COM+ app server to talk asynchronously to Apache on Linux, and the third was a port of a Windows Forms app that used DCOM to SOAP for use in a VPN.

    I have to say that I am mostly pleased with SOAP, but that it does have areas that need improving.

    Reading this board, I've noted a couple of misconceptions that seem pretty prevalent.

    1. SOAP is not Simple. Several posters noted that the spec is over 100 pages long. Most of the specification is about the correct formatting of the description language on the server side. Fortunately, both the Microsoft and IBM toolkits provide tools for generating this stuff, and the tools cover 99.9% of what you will ever want to do. As a developer you can use SOAP without ever authoring a wsdl file. Reading the file is not very hard; I was able to write my first working SOAP client implementation within a week of starting. All you need is a good understanding of the HTTP protocol, XML, and your client platform.

    2. SOAP is bloated. Many people (including me) think this when they first see an example of a web service description language (wsdl) file. The key thing to note is that a decently designed client only needs to read it once (using HTTP GET) to understand the service. The actual requests (using HTTP POST) and responses don't have too much adornment and are pretty darn simple. The server will use the wsdl to validate incoming requests and, if it has a decent design, it too only reads it once, at service startup.

    3. XML-RPC is better because it is simpler. XML-RPC is actually very, very similar to the rpc aspect of SOAP. But going back to 1. above the spec is so short because XML-RPC lacks an equivalent to wsdl (a runtime readable description of the service). In other words, XML-RPC requires you to understand the interface at design time. Because of this an XML-RPC solution is more tightly coupled and less flexible than an equivalent SOAP implementation. (This might be acceptable depending on the requirements).

    4. Running over port 80 is bad. In fact, it can be. However, SOAP requests are, generally speaking, HTTP POSTs, so this has less to do with the standard than with the reliability and security of the listener. A good listener will act as an application proxy and reject any shenanigans. A good network design that includes a DMZ with another firewall between the private network and the server is also required for it to be secure. Message level security can use SSL or an alternative.

    5. It isn't standard between vendors. Some validity to this, but I found the differences between M$ and IBM to be very minor and easy to accommodate.
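To give a feel for how little adornment an actual request carries, here is a minimal Python sketch that builds a SOAP 1.1 envelope and the HTTP POST that would carry it. The envelope namespace is the standard SOAP 1.1 one; the endpoint URL, the service namespace, the `getQuote` method and its `symbol` parameter are all hypothetical, and nothing is sent over the network.

```python
import urllib.request
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:stock"                        # hypothetical service namespace
ENDPOINT = "http://localhost:8080/soap/quote"           # hypothetical endpoint URL

def build_envelope(symbol: str) -> bytes:
    """Wrap a hypothetical getQuote(symbol) call in a SOAP 1.1 Envelope/Body."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}getQuote")
    ET.SubElement(call, "symbol").text = symbol
    return ET.tostring(envelope, encoding="utf-8")

def build_request(symbol: str) -> urllib.request.Request:
    """Prepare (but do not send) the HTTP POST that would carry the envelope."""
    return urllib.request.Request(
        ENDPOINT,
        data=build_envelope(symbol),
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            "SOAPAction": '"urn:example:stock#getQuote"',  # hypothetical action URI
        },
        method="POST",
    )

request = build_request("IBM")
print(request.data.decode("utf-8"))
```

Given a real endpoint, passing `request` to `urllib.request.urlopen` would send it. In practice a toolkit generates this plumbing from the wsdl, which is the point made in item 1.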

    There are some problems with running over HTTP though.
    1. It is never as fast as native rpc solutions in my experience. You can cut down on the size of the response by using gzip or deflate with http 1.1, but there is no facility for compression on the inbound side. The need to minimize round trips is directly at odds with this lack of functionality because:

    2. It is stateless, so things like transactions are very difficult to do and force the requests to contain enough info for the server to do something ACID... This accentuates problem number 1.

    3. You can't do callbacks or events from the server to the client. This is strictly a 'you request and I respond' protocol. You can't push from the server to the client with SOAP.
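The gzip saving mentioned in problem 1 can be estimated offline with a short Python sketch. The response body here is invented and deliberately repetitive, as XML result sets tend to be; real savings depend on the payload.

```python
import gzip

# A hypothetical, deliberately repetitive XML response body.
rows = "".join(f"<row><id>{i}</id><status>OK</status></row>" for i in range(500))
response_xml = f"<result>{rows}</result>".encode("utf-8")

compressed = gzip.compress(response_xml)
restored = gzip.decompress(compressed)

print(len(response_xml), len(compressed))
```

Compression shrinks the bytes on the wire but does nothing for the parsing cost on either end, and it does not reduce the number of round trips.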

    Hope you found this informative.