Posted by
Hemos
on from the programming-with-dial dept.
Irish writes "This IBM developerWorks article discusses the theoretical underpinnings of SOAP's type system. It's a good article for anyone who wants to learn more about SOAP's programmatic support or to simply better understand Apache's SOAP."
SOAP ain't so 'S'imple no mo
by
Anonymous Coward
·
· Score: 4, Informative
XML RPC is simple - it has a 4 page specification. SOAP is, well, not so simple. SOAP started out simple, but then committees got a hold of it. Try reading the specification - it's well over 100 pages long - and all legalese. All this crazy namespace and XSLT stuff only adds to its bloat. Surely we all can find a compromise between the simplicity of XML RPC and the robustness of SOAP. I have read that Don Box himself is questioning the SOAP protocol, or at the very least the HTTP transport it is coupled with.
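To give a feel for how small an XML-RPC request is, here is a minimal sketch using Python's standard xmlrpc.client module. The method name "examples.add" and its arguments are made up for illustration; nothing is sent over the network.

```python
import xmlrpc.client

# Hypothetical method name and arguments, chosen only for illustration.
request_xml = xmlrpc.client.dumps((2, 3), methodname="examples.add")

# The whole request is a <methodCall> with a name and a list of params.
print(request_xml)

# Round-trip: loads() gives back the parameters and the method name.
params, method = xmlrpc.client.loads(request_xml)
print(method, params)
```

There is no envelope, namespace, or schema here, which is most of why the spec fits in a few pages.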
Re:SOAP ain't so 'S'imple no mo
by
Anonymous Coward
·
· Score: 2, Informative
XMLRPC bindings exist for Perl, Python, Java, Frontier, C/C++, Lisp, PHP, Microsoft.NET, Rebol, Real Basic, Tcl, Delphi, WebObjects and Zope.
Re:Substantial
by
Anonymous Coward
·
· Score: 3, Informative
Ummm, I've been in the office for like 16 hours and reading is hard and stuff. Could someone please provide me with like 100 words or less or something
Reader's Digest version: SOAP lets you call remote procedures over HTTP (for better or for worse). It has bindings for many languages. It is a text-based (XML) protocol, and is fairly verbose.
Re:Bruce Schneier has said:
by
steve_l
·
· Score: 5, Informative
Bruce hasn't looked at the protocol enough; he is being paranoid.
Well, doing SOAP callbacks into the firewall is hard because you have to have an accessible endpoint... for this reason you can't do SOAP callbacks over HTTP. But some of the other transports (SMTP, Jabber) do work and go through firewalls like nobody's business.
Another issue is that you can't tell whether the message is good or bad from the header; it will always be a POST, and the same endpoint/URL could be used for everything from a side-effect-free get to a malicious buffer-stomping write.
You need to look inside the XML payload, and, being XML, that means understanding XML...string matching is not enough, not when you can disguise stuff with escaping, UTF or Unicode formats, etc.
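A small sketch of why string matching on the raw payload falls short, using Python's standard XML parser. The element name "op" and the value "deleteAll" are invented for the example; the point is only that numeric character references hide the text from a substring check while the parser still hands the application the real value.

```python
import xml.etree.ElementTree as ET

# Hypothetical payload: "deleteAll" with its first two letters written
# as numeric character references (&#100; is "d", &#x65; is "e").
raw = "<op>&#100;&#x65;leteAll</op>"

# What a naive filter sees when it scans the bytes on the wire.
naive_hit = "deleteAll" in raw

# What the application sees after a real XML parse.
parsed_text = ET.fromstring(raw).text

print(naive_hit, parsed_text)
```

A filter that wants to judge the request has to parse it the same way the endpoint will.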
Re:Bruce Schneier has said:
by
mmusn
·
· Score: 5, Informative
By that argument, let's get rid of HTTP. I mean, HTTP invokes remote procedures on the web server, in the form of servlets and CGI scripts. In other words, SOAP is no less secure than HTTP. If your firewall passes HTTP to the wrong internal servers, you have a security problem, no matter whether you are running SOAP or not.
Re:yes you "can" but its silly
by
Matts
·
· Score: 4, Informative
Actually no, CDATA sections don't win you anything beyond not having to escape the < and & characters. You still can't send non-character data within a CDATA section, such as \0, FF, or BEL bytes.
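This is easy to check with any conforming parser. A minimal sketch with Python's standard library (the element name "d" is arbitrary): markup characters are fine inside CDATA, but BEL and form feed still make the document ill-formed.

```python
import xml.etree.ElementTree as ET

def parses(doc: str) -> bool:
    """Return True if the document is accepted by the XML parser."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False

# CDATA lets < and & through without escaping...
ok = parses("<d><![CDATA[a < b & c]]></d>")
# ...but control characters outside the XML character range are rejected.
bel = parses("<d><![CDATA[\x07]]></d>")
ff = parses("<d><![CDATA[\x0c]]></d>")

print(ok, bel, ff)
```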
Re:Bruce Schneier has said:
by
lseltzer
·
· Score: 2, Informative
I wouldn't assume that the documentation is wrong, but that Bruce Schneier is. The quote in the parent accurately reflects what is in Schneier's essay: He makes an unsourced reference to "Microsoft's Documentation". I wonder what else he makes up in the article.
In this article, MS "...demonstrates the creation of SOAP servers and clients that communicate using different transports: sockets, Microsoft Message Queue, the file system, and a custom HTTP listener."
In this one, we find the quote: "The fourth part of the specification defines a binding between SOAP and HTTP. However, this part is also optional. You can use SOAP in combination with any transport protocol or mechanism that is able to transport the SOAP envelope, including SMTP, FTP or even a floppy disk."
At the top of the main developer resources for SOAP page, we find another quote: "SOAP is a lightweight and simple XML-based protocol that is designed to exchange structured and typed information on the Web. The purpose of SOAP is to enable rich and automated Web services based on a shared and open Web infrastructure. SOAP can be used in combination with a variety of existing Internet protocols and formats including HTTP, SMTP, and MIME and can support a wide range of applications from messaging systems to RPC"
Re:WTF is SOAP?
by
Chris+Croome
·
· Score: 3, Informative
I think some of the most interesting things that have been written about SOAP have come out of the REST thesis; probably the best two introductory articles on REST are the ones on XML.com by Paul Prescod: Second Generation Web Services and REST and the Real World.
SOAP or BLOAT?
by
selectspec
·
· Score: 2, Informative
Text-encoding distributed object messaging and remote procedure calls just so you can tunnel over HTTP is the stupidest thing I've ever heard of. If you sent the objects via carrier pigeon you could avoid the firewalls too!
By the time I get a fat multi-megabit pipe out to my cabin in the woods, the internet will be saturated with this BLOAT, and I still won't be able to do much more than send email.
There's one potential problem with SOAP - sending binary objects. You can't insert binary in XML, so the options are:
- encode binary to 7-bit (hex, etc.) or
- send it "outside" of the XML, as a MIME attachment
The article mentions these.
However, there's one more way to do it - the new DIME protocol. It's explained in this article:
DIME: Sending Binary Data with Your SOAP Messages
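For the first option (encoding binary as text), here is a rough sketch of what it costs, using Python's standard base64 and binascii modules. The 768-byte blob is arbitrary test data, not anything from the article.

```python
import base64
import binascii

# Arbitrary binary data, including bytes XML cannot carry directly.
blob = bytes(range(256)) * 3  # 768 bytes

# Two text-safe encodings that can sit inside an XML element.
b64 = base64.b64encode(blob).decode("ascii")
hexed = binascii.hexlify(blob).decode("ascii")

# base64 grows the data by a third, hex doubles it.
print(len(blob), len(b64), len(hexed))

# Decoding on the receiving side restores the original bytes.
restored = base64.b64decode(b64)
```

That size overhead, plus the encode/decode work on both ends, is what attachment schemes like MIME or DIME are meant to avoid.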
A Busy Developer's Guide to SOAP 1.1
HTTP = web
HTTP = webdav
HTTP = SOAP too.
AFAIK it is a protocol devised by Dave Winer from UserLand and Microsoft, it has been rubber stamped by the W3C, and its specification can be found on their site: Simple Object Access Protocol (SOAP) 1.1.
There has been quite a bit of interesting discussion on SOAP on the W3C's Technical Architecture list, see this thread: SOAP breaks HTTP?.
Check out MKDoc a mod_perl CMS
Hardly ever post here, but I know a thing or two about SOAP and thought you nice folks might find it informative.
I've done three implementations for three different clients in the last two years. The first integrated an existing UNIX front-end with an existing NT back-end (I know... the real world sure is strange), the second enabled a COM+ app server to talk asynchronously to Apache on Linux, and the third was a port of a Windows Forms app that used DCOM to SOAP for use in a VPN.
I have to say that I am mostly pleased with SOAP, but that it does have areas that need improving.
Reading this board, I've noted a couple of misconceptions that seem pretty prevalent.
1. SOAP is not Simple. Several posters noted that the spec is over 100 pages long. Most of the specification is about the correct formatting of the description language on the server side. Fortunately, both Microsoft and IBM toolkits provide tools for generating this stuff and the tools cover 99.9% of what you will ever want to do. As a developer you can use SOAP without ever authoring a wsdl file. Reading the file is not very hard; I was able to write my first working SOAP client implementation within a week of starting. All you need is a good understanding of the HTTP protocol, XML, and your client platform.
2. SOAP is bloated. Many people (including me) think this when they first see an example of a web service description language (wsdl) file. The key thing to note is that a decently designed client only needs to read it once (using HTTP GET) to understand the service. The actual requests (using HTTP POST) and responses don't have too much adornment and are pretty darn simple. The server will use the wsdl to validate incoming requests and, if it has a decent design, it too only reads it once, at service startup.
3. XML-RPC is better because it is simpler. XML-RPC is actually very, very similar to the rpc aspect of SOAP. But going back to point 1 above, the spec is so short because XML-RPC lacks an equivalent to wsdl (a runtime readable description of the service). In other words, XML-RPC requires you to understand the interface at design time. Because of this an XML-RPC solution is more tightly coupled and less flexible than an equivalent SOAP implementation. (This might be acceptable depending on the requirements).
4. Running over port 80 is bad. In fact, it can be. However, SOAP requests are, generally speaking, HTTP POST, so this has less to do with the standard than with the reliability and security of the listener. A good listener will act as an application proxy and reject any shenanigans. A good network design that includes a DMZ with another firewall between the private network and the server is also required for it to be secure. Message level security can use SSL or an alternative.
5. It isn't standard between vendors. Some validity to this, but I found the differences between M$ and IBM to be very minor and easy to accommodate.
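To illustrate point 2, here is a rough sketch of what a single SOAP 1.1 request body looks like when built by hand with Python's standard library. The envelope namespace is the SOAP 1.1 one; the service namespace "urn:example:quotes", the operation "GetQuote", and the "symbol" parameter are all hypothetical, and in practice a toolkit would generate this from the wsdl.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:quotes"  # hypothetical service namespace

def build_request(symbol: str) -> bytes:
    """Build a minimal SOAP 1.1 envelope for a made-up GetQuote call."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}GetQuote")
    ET.SubElement(call, f"{{{SERVICE_NS}}}symbol").text = symbol
    return ET.tostring(envelope, encoding="utf-8")

# This payload would be sent as the body of an HTTP POST, with a
# text/xml content type and a SOAPAction header; no request is made here.
payload = build_request("IBM")
print(payload.decode("utf-8"))
```

The request is an Envelope, a Body, and one element per call and argument, which is small next to the wsdl that describes the service.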
There are some problems with running over HTTP though.
1. It is never as fast as native rpc solutions in my experience. You can cut down on the size of the response by using gzip or deflate with http 1.1, but there is no facility for compression on the inbound side. The need to minimize round trips is directly at odds with this lack of functionality because:
2. It is stateless, so things like transactions are very difficult to do, and each request has to carry enough info for the server to do something ACID... This accentuates problem number 1.
3. You can't do callbacks or events from the server to the client. This is strictly a 'you request and I respond' protocol. You can't push from the server to the client with SOAP.
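On the compression point in problem 1, a quick sketch of why gzip helps so much with this kind of payload. The XML below is invented filler, not a real SOAP response, and Python's gzip module stands in for whatever the HTTP server would apply; repetitive tag-heavy text compresses very well.

```python
import gzip

# Made-up, repetitive XML standing in for a verbose response body.
row = "<quote><symbol>IBM</symbol><price>100.00</price></quote>"
body = ("<quotes>" + row * 200 + "</quotes>").encode("utf-8")

# Compress as an HTTP server would for a gzip-encoded response.
packed = gzip.compress(body)

print(len(body), len(packed))

# The client reverses it before parsing the XML.
unpacked = gzip.decompress(packed)
```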
Hope you found this informative.