Sites Rejecting Apache 2?
An anonymous reader writes "Vnunet reports that the low adoption of Apache 2 has caused its producers to advocate freezing development of the open-source Web server until makers of add-in software catch up. Almost six months after the launch of Apache 2, less than one percent of sites use it, due to a lack of suitable third-party modules." I'm not sure where they are getting the freezing Apache development part; there is more talk about forking for 2.1 right now on the httpd mailing list. The article does have it right, though, that until there is a reason to upgrade and the modules are in place, adoption is not going to happen. While the cores of both Perl and PHP are thread-safe, the third-party modules are not. This renders one of the larger reasons to use Apache 2.0, the threaded HTTP support, useless for applications using either of these application layers. It comes down to the question of whether the third-party module writers are better off supporting what is used or what is new.
What is the percentage of sites that actually use third party modules?
I think the fact that it's not being adopted is more because there is no need for the new version from most sites. What they have works and is stable, so there is no reason to upgrade.
Reply . . . let's get it over with.
As soon as they release a stable version for Apache 2 (aka 4.3.0), then I'll look seriously at switching. It's great that Apache 2 has stabilized now, though, as it lets everyone else work around a stable project.
We'll all get to Apache 2, it just takes time to migrate.
My main question is, what would it matter if sites weren't using apache 2.0, isn't it enough that open source software is being used??
Ignore the "p2p is theft" trolls, they're just uninformed
I'll use aolserver (aolserver.sf.net).
It's a stable and tested technology.
For my project I stuck with apache 1.3.x
Here's what would convince me to change.
-- References. Have any high-profile Apache sites migrated? While my sites are small ... it's always nice to know that the big boys have taken the plunge.
-- PHP support. As of 4.2.0, Apache 2 support was experimental. The change log does not show anything which says it's supported.
-- mod_gzip support. This is a big one. mod_gzip makes my sites download extremely fast when users on dialup lines log in. This is especially true for low-bandwidth countries in Asia. mod_gzip support has left me fairly confused .. given that I bothered reading up on some of the early discussions.
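To put a rough number on why compression matters so much over dialup, here is a minimal sketch using Python's standard gzip module. It is not mod_gzip itself, and the page body is a made-up, repetitive chunk of markup, so the ratio you see on real pages will differ.

```python
import gzip

# Hypothetical page body: repetitive markup, as HTML table output often is.
row = "<tr><td class='item'>row</td><td class='price'>0.00</td></tr>\n"
page = (row * 200).encode("utf-8")

# Compress the body the way a gzip-capable server would before sending it.
compressed = gzip.compress(page)

# A gzip-capable browser reverses the step and gets the original bytes back.
restored = gzip.decompress(compressed)

ratio = len(compressed) / len(page)
print(f"original: {len(page)} bytes, gzipped: {len(compressed)} bytes, ratio: {ratio:.2%}")
```

The fewer bytes that cross a slow line, the faster the page shows up, which is the whole appeal for low-bandwidth users.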
Even with all of this.. I'm not likely to change unless there is a perceptible difference in the load / performance stats on my system during the switch.
Until we have a stable PHP, mod_perl, mod_gzip (or whatever they call it these days), and mod_layout, I can't go down the Apache 2 road, as my site needs all these things.....
I see the writer's point. It does appear that the Apache group is pretty much only patching Apache 1.3.x at this point to solve issues, versus improving and/or adding things, so that's probably a good start to get people moving. However, that won't happen till the other things catch up (which, honestly, how long was 2.0.x in beta? They should have been able to work against the dev tree and come out with compatible products, although I am not an Apache developer, so I don't truly know what's involved).
Power Corrupts,Absolute Power Corrupts Absolutely, leaving one person(group)in charge is absolutely corrupt.
As others have pointed out, the 1.3x server is fine already. Why put yourself through the pain of building 2.0, rebuilding PHP et al and worrying about it all working until it's been proven?
By the way, I'd like to know who the hell came up with this god-awful colour scheme?!!
Code, Hardware, stuff like that.
I know Apache does not have any "customers" to support, but why were they so eager to break compatibility for Apache 1.3 modules in Apache 2.0? I know backwards compatibility code isn't sexy, but couldn't they keep the old module API and thunk it to the new API? Then Apache 2.0 could ship with rock-solid mod_php and mod_perl. Let module developers migrate slowly on their own schedule.
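For anyone unsure what "thunking" the old API to the new one would look like, here is a minimal sketch in Python. Both APIs are invented for illustration (`legacy_hello`, `Response`, and `LegacyThunk` are hypothetical names); the real Apache 1.3 and 2.0 module interfaces are C and look nothing like this.

```python
# Hypothetical "old" API: a module is a function that takes a request dict
# and returns a (status, body) tuple.
def legacy_hello(request):
    return 200, "hello from " + request["uri"]


# Hypothetical "new" API: a module is an object with handle(request, response).
class Response:
    def __init__(self):
        self.status = None
        self.chunks = []

    def write(self, data):
        self.chunks.append(data)


class LegacyThunk:
    """Wraps an old-style handler so a new-style server can call it."""

    def __init__(self, old_handler):
        self.old_handler = old_handler

    def handle(self, request, response):
        # Call the old entry point, then translate its result to the new shape.
        status, body = self.old_handler(request)
        response.status = status
        response.write(body)


resp = Response()
LegacyThunk(legacy_hello).handle({"uri": "/index.html"}, resp)
print(resp.status, "".join(resp.chunks))
```

Note that a shim like this only translates calling conventions. It does nothing to make the wrapped module safe to run from several threads at once, which may well be the harder part of the problem.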
Here's an interesting perspective from Ole Eichorn, the CTO of Aperio Technologies:
One of the more significant recent discontinuities occurred with the release of Apache 2.0. Although it has been under-reported, Apache 2.0 is significantly discontinuous (non-backward-compatible) with Apache 1.3. Many webmasters have decided not to upgrade for now, rather than have to recode their custom modules. And many of the custom modules out there are 3rd party, so the resources to make the changes are not readily available.
It is not clear to me why the discontinuity was required. There was no technical reason not to maintain backward compatibility. I think your essay gets it right, the people who made these decisions were not involved in the original development, and were not sufficiently aware of the impact their decisions would have on their developer community. Multi-threading processes, which inspired most of the discontinuity, primarily benefits Windows sites - a small proportion of Apache installations - and most Windows sites use IIS and aren't going to change.
I bet in a few years we'll be able to track Apache's decline as the leading web server back to this point.
cpeterso
I won't switch over to Apache 2 until there's an amiga port of it!
/^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$/i
Apache 1.3 works just fine for lots of basic website needs. Why upgrade just for the sake of upgrading? That's proprietary software's game.
Yeah, one of these days I'll upgrade my webservers, (probably when I decide to do a full install of the latest version of a distro that includes it) but there's no particular rush at the moment.
-- Alastair
I'd say the number one reason why people aren't moving over to Apache2 is PHP's slowness in supporting it.
Yeah, yeah, I hear everyone saying "PHP 4.2 works fine with Apache2." Well, we're not touching it as long as it labels apxs2 support as "experimental."
As a software author, you really need to worry about your own users outpacing you. For instance, if someone likes a feature in Apache 2, and every module they use, except yours, works with Apache 2, people quickly discover that they don't need your module all that much anyhow.
Wasn't that everyone's experience when switching from Windows? You can't get program XYZ for Unix, so you discover that you never really needed it that much anyhow...
As a programmer, it always pays to be everywhere you possibly can. But, when it's open source, programmers don't care what's best for the user, so don't expect it to happen.
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
When distros start shipping 2.0 as standard, everyone will "just use" it. Of course there will be some rejection from stubborn people. 1.3 development will stop and everyone will slowly roll over to 2.0.
pro 2.0:
- threaded stuff is blindingly fast. On most systems, threads are faster than processes.
- other new technologies, like layered content filtering, are great for developers of high-traffic sites.
pro 1.3:
Very many people using Apache use Linux. Linux threads have almost the same performance as processes. Due to a kernel limitation, you can stack only so many threads per process. Plus, the threaded model does not account for stability. One NULL pointer dereference and you're gone. Apache 2.0 of course uses bundles of threads, so you still have the multiprocess model kicking around.
Expect 2.0 to gain popularity on systems like Sun, BSD and Win32, where process handling is relatively expensive. Threads are dirt cheap.
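The stability point can be sketched with separate worker processes. This is a minimal illustration in Python, not how Apache manages its children: `os.abort()` is only a stand-in for a NULL pointer dereference in a C module, and the worker snippets are made up.

```python
import subprocess
import sys

# Hypothetical workers: one dies abruptly, one serves its request normally.
crashing_worker = "import os; os.abort()"
healthy_worker = "print('served request')"


def run_worker(code):
    """Run one worker as a separate process and report how it ended."""
    return subprocess.run(
        [sys.executable, "-c", code], capture_output=True, text=True
    )


crashed = run_worker(crashing_worker)
served = run_worker(healthy_worker)

# The parent is still alive here and can see which worker died.
print("crashed worker exit code:", crashed.returncode)
print("healthy worker output:", served.stdout.strip())
```

With processes, the crash costs you one worker. Had both pieces of code been threads in a single process, the abort would have taken the healthy one down too.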
As everything, things take time. Just like well brewed beer.
cheers.
Make what is new become what is used and the software makers will have no choice but to support it. Simple.
Oh shit! I forgot to click "Post Anonymously"...
That or where it started to fork. If people are unwilling to go 2.x, they'll put the effort into adding new stuff into 1.x. Are we seeing Open Source at work?
Xix.
"Everything is adjustable, provided you have the right tools"
... Why haven't they created a compatibility layer or function or something like that to import the older API modules? Seems kinda fundamental to me.
The RH beta includes Apache 2.0 by default. Expect market share to rise when the new RH ships.
The number one problem with Apache 2 is its reliance on threads, and its assumptions about threading models.
This will certainly not win me friends in the "everything should use threads because it's easier to do linear programming than to build a session reentrant state machine" camp, but...
Threads are useful for SMP scalability, but they aren't very useful for much else (I/O interleaving is adequately handled by most network stacks, the I/O interfaces themselves, and the fact that almost all the bytes being moved are being moved from the server to the client: the protocol is very asymmetric, even if you aren't pushing multimedia files). In most cases, threads are a liability.
Under Windows, they introduce data marshalling issues that have to be accounted for in user code -- not just in the modules which implement interpreters for that user code.
Under UNIX, threads are generally a loss, unless there is specific scheduler support for thread group affinity, when threads are running on the same processor, and CPU negaffinity, when there are multiple processors, to ensure that there is maximal usage of computer resources.
If you do the first, then you have the possibility of starvation deadlock for other applications: basically, it's not possible to do it correctly in the scheduler, you have to do it by means of quantum allocation, outside the scheduler. This means a threading approach such as scheduler activations, async call gates, or a similar technique. If you do the second, then you pay a serious penalty in bus bandwidth any time locality spans multiple CPUs -- in other words, it's useless to use SMP, if you have, for example, a shopping cart session that needs to follow a client cookie around.
Overall, this means that you are much better off using session state objects to maintain session state, rather than using thread stacks to do the same job. This is actually pretty obvious for HTTP, in any case, where requests are handled independently as a single request/response pair, and connection persistence isn't generally overloaded to imply session information (you can't do that because of NAT on the client side, multiple client connections by a browser on the client side, and server load balancing on the server side, etc.).
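As a minimal sketch of what "session state objects" means here, assuming a cookie-style session id: the state lives in a plain object looked up per request, so it does not matter which worker or which connection carries the request. `SessionStore` and `handle_request` are hypothetical names, and a real store would also need expiry and, in a threaded server, locking.

```python
class SessionStore:
    """Hypothetical store: session state lives here, not on a worker's stack."""

    def __init__(self):
        self._sessions = {}

    def get(self, session_id):
        # Create an empty session the first time an id is seen.
        return self._sessions.setdefault(session_id, {"cart": []})


def handle_request(store, session_id, item):
    """Each request is independent; any worker could serve it."""
    session = store.get(session_id)
    session["cart"].append(item)
    return list(session["cart"])


store = SessionStore()
handle_request(store, "cookie-abc", "book")         # first connection
handle_request(store, "cookie-xyz", "lamp")         # a different client
cart = handle_request(store, "cookie-abc", "pen")   # same client, new connection
print(cart)
```

The shopping cart follows the cookie, not the connection or the thread that happened to handle the previous request.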
Overall, this factors out into threads bringing additional pain for module writers, without any significant performance or other benefit, unless you go SMP, and have a really decent threads and scheduler implementation -- which means you are running a recent IRIX or Solaris, which is a really limited fraction of the total web server market.
Frankly, they would have been a lot better off putting the effort into the management of connection state and MTCP or a similar failover mechanism, and worried about NUMA-based scaling, rather than shared memory multiprocessor with particular threads implementation scaling. The cost for what you get out of the switch is just too high.
-- Terry
The "fuss" is they are still labeled as "experimental."
The most powerful features of Apache-based sites aren't features of Apache but of 3rd party modules. PHP, mod_perl, mod_dav, mod_throttle and even Microsoft FrontPage modules contribute significantly to the appeal of Apache. There is an excellent Report on Apache Module Popularity by SecuritySpace.com. In considering this report, you should notice the month-over-month growth in the usage of modules which have not yet been ported to Apache 2. The developers of these modules will most likely respond to customer demands for support of Apache 2, which depends on the Apache Software Foundation's ability to convince customers of the benefits of upgrading to Apache 2. In this respect the marketing of Open Source Software mimics the marketing of traditional commercial software. Let's hope they don't adopt the strategy of some commercial software vendors by simply refusing to provide security fixes or updates to Apache 1.3.x when needed. This would certainly outrage Apache users, but in the case of Open Source would have the secondary effect of promoting forking of the codebase. On the bright side, customers do have a recourse in the case of Open Source, whereas they're left twisting in the wind in the case of commercial products.
--CTH
--Got Lists? | Top 95 Star Wars Line
Apache 2 is, to the best of my knowledge, not distributed with any Linux distributions. The Linux distros won't ship A2 until the third party modules have played catch-up.
Until then, we'll just wait and watch adoption be gradual.
Gradual adoption is great, though. That means that the late adopters can be more sure that the platform is stable and efficient.
Stop the brainwash
We run a simple web site hosting shop. Our main web server is running Apache 1.3.26, mod_ssl, and mod_perl. We host several thousand low-traffic sites. If not for the recent security problems in apache and mod_ssl, we still would probably be running 1.3.6 which worked just fine for our needs. We do realize that eventually we will have to upgrade but that's not our priority. It'll probably happen in about a year.
My 2 eurocents: I run a webhosting company. 1.3 works, and I've waited for 2.0 to stabilize a bit - just like with Linux kernels, I like to skip the first 10 or 15 dot-releases if possible ;-).
(Modules aren't the issue for me - in fact, I've not built the PHP module for 2.x because with all the script kiddies hacking around, I have decided to forward .php requests to a cgi-bin PHP interpreter sitting behind sbox).
Now, we've set up a test platform, and when our customers are happy we'll move it into production in a month or so, but secondary to our 1.3 setup. In about a year, we'll shut down the old setup and 'force-migrate' anyone that's still using it.
Targeting the SME market, we need to provide that sort of stability because my customers typically are not I-want-to-run-the-latest-and-greatest geeks and, having paid a lot of cash for their website, they're happy it runs and they don't care on what version it runs.
I think that most of my colleagues are in the same position, so 1.3 will probably be the major version for at least a year to come.
You increment the left-most number in the release number. So 1.3 is not expected to be compatible with 2.0, and Linux kernel 2.4 is not expected to maintain backward compatibility with 1.0 ;) This makes things much easier to maintain and see at a glance.
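The convention is simple enough to write down. A minimal sketch, assuming dotted version strings and treating "same left-most number" as the only promise being made (`major` and `expect_compatible` are made-up helper names):

```python
def major(version):
    """Return the left-most number of a dotted version string."""
    return int(version.split(".")[0])


def expect_compatible(old, new):
    """By this convention, only releases sharing a major number promise compatibility."""
    return major(old) == major(new)


same_series = expect_compatible("1.3.26", "1.3.27")
across_majors = expect_compatible("1.3.26", "2.0.39")
print(same_series, across_majors)
```

It is a convention about expectations, not a guarantee; plenty of projects break things within a major series too.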
Now as to why they did it, Apache 1.3 is great. I love it, but it is not as cross-platform as it pretends to be (it does not perform well on Windows) and it really is not built for speed. If you need these things, you need multithreading, a better abstraction model so you are not assuming POSIX compatibility (and hence emulating it on Windows), etc. This means you break the compatibility. Pure and simple, but in the end, you get a better product.
Think of Apache 2.0 as Apache-- Next Gen. Not yet widely supported, but when it is, it will be more competitive than 1.3.x because it has a better architecture.
LedgerSMB: Open source Accounting/ERP
And we have trouble: A php script creating a (temporary) file will not be able to use it, because it will be owned by the Apache server, not the owner of the PHP script.
This is not fixed in Apache 2, AFAIK.
-- From Denmark
I wouldn't be surprised if many UNIX users don't ever go for this and Apache 1.x just branches off into a separate project. Apache 2 can turn into some kind of specialized Apache derivative for platforms that just can't handle forking; we shouldn't keep burdening UNIX software with accommodating those other kludgy operating systems.
Did any of you actually understand a word of what he just said?
Very many people using Apache use Linux. Linux threads have almost the same performance as processes. Due to a kernel limitation, you can stack only so many threads per process. Plus, the threaded model does not account for stability. One NULL pointer dereference and you're gone.
So, because of limitations in Linux's kernel design, Apache 2.0 is held back? Interesting. What I wondered when reading your remark quoted above was: Apache can't be the only program which will benefit from multi-threading? I mean: a server with a database system on it will benefit greatly from using threads for query processing. Processes are nice, and I know Unix schedulers mainly schedule processes first and then threads, but if Apache or another program puts the spotlight on a flaw in Linux, why isn't it fixed?
Multi-threading is more efficient than multi-process, so why are Linux kernel designers still on the route to multi-process and not multi-thread? To me, this sounds like a flaw which Linus and friends don't want to solve for some reason.
Never underestimate the relief of true separation of Religion and State.
It is an interesting bit of spin to label the hesitancy of sites to upgrade to Apache 2.0 "rejection".
Apache 2.0 has only recently been released and has not even made it into a large number of server OS distributions (certainly not in the way Apache 1.x has).
After its inclusion in a few OS distributions and after support for mod_p{erl,php} becomes stable, then we will be in a position to judge whether or not it is being rejected, but certainly not now.
There is now a mod_layout port to Apache 2.0, the thing is I have yet to get any feedback from people using it. So, I continue to put effort into the 1.3 version not the 2.0 version.
Until mod_perl is ported I won't be actually using it myself.
You can't grep a dead tree.
Not to put salt in open wounds, but IIS, which uses threads, relies on a concept built into Windows: apartments. You have single-threaded apartments (STA) and multi-threaded apartments (MTA). The webserver itself uses threads for handling requests, and when a certain library is called/opened by the code, that library determines in which apartment style the code is run: in an STA or in an MTA. VB6 COM objects, for example, can't run in an MTA, so they are run in an STA. This is controlled by Windows (as a config param of the COM object). So here you see a combination of both worlds: multi-threaded and safe where it has to be, without the hassle of forcing the developer to write threadsafe code when the code itself isn't multi-threaded, but the environment is.
Of course, there are some issues: when you let the code executed by the request of user A create an object in an STA and move that into a container which can hold both STA's and MTA's, and let code executed by the request of user B access that user A's STA object, you get thread unsafety and possible crap.
However: the OS's functionality offers the option to do it threadsafe and still have multi-threading in full effect. Perhaps a thing to look at for the thread/process guys in the Linux kernel team.
(It has been a long time, but afaik, a simple fork() is not forking off a complete new process, but a child process which runs as a thread inside the mother process, or am I mistaken? If not: why then the thread-safety crap NOW, because a fork() will result in the same issues?)
Never underestimate the relief of true separation of Religion and State.
This is why it's always (usually) a good thing. At the least, the option for it.
Would it be possible to create a patch/module for Apache 2 that allows old modules to be used?
Question
http://www.ironfroggy.com/
As far as I can judge, there are two reasons why people wouldn't adopt Apache 2.0. First of all, Apache 1.3 works Just Fine (WOW) for most sites, and it can therefore be considered wise not to upgrade to a later version which is based on a less-tested code base than what one is currently running.
The other thing is suggested by the author of the original post, and has to do with the fact that Apache 2.0 breaks compatibility with old modules. Downward compatibility is one of the Commandments in software development, and it's quite possible that this is a major reason for admins to be reluctant to switch to Apache 2.0.
Interestingly, both expecting people to upgrade to a product that almost certainly contains yet-to-be-discovered bugs, and breaking compatibility with previous releases are frequently observed in the practices of the Great Stan of Redmond. It may therefore not be surprising that those admins running Apache (rather than It Isn't Secure) would not go with it.
Please correct me if I got my facts wrong.
Well, my server has been running nicely for quite some time now.
I haven't encountered a single problem. Well, except that the default config is more secure and I had to manually change it to run legacy apps.
HTTP/1.1 200 OK
Date: Tue, 10 Sep 2002 08:18:09 GMT
Server: Apache/2.0.39 (Unix) PHP/4.2.2 DAV/2
Last-Modified: Sun, 24 Feb 2002 15:50:43 GMT
ETag: "2d405e-d7-4ac5ac0"
Accept-Ranges: bytes
Content-Length: 215
Content-Type: text/html; charset=ISO-8859-1
echo '[q]sa[ln0=aln80~Psnlbx]16isb572CCB9AE9DB03273snlbxq' |dc
On static and dynamic content it's about 10-15% slower! Apache 2 consumes about 10-20% more CPU time than 1.3 on static content. Prefork and worker show almost the same performance on single-CPU machines. So, when Apache 2 shows at least the same performance, I'll set it up on my server.
There are two very different angles to look at this problem from. Those who hack on linux in their spare time, and those who run mission-critical systems for their living.
Here in my basement, I run Debian Sid. I play with 2.5 series kernels. CVS doesn't faze me. And all is well, for if something gets screwed up, the only loss is my time, and there is more to be gained from the experience than there was lost to it. However, when the sun rises, and I make my way to work, the story is much different.
In the server room at work, I am responsible for the servers that host our clients' websites, email, and DNS records. If something hits a bug, that something malfunctions. Maybe it hiccups, maybe it takes the entire box to its knees. Given my druthers, I'd take the former; after all, if it just hiccups, it doesn't interfere with everything else. Now, I may think that I have a very firm grasp on what is happening on those boxes. I even pretend to think that I have a firm grasp on what is happening on my system here at home. My boss even trusts and respects my judgement. If I decided, for example, to replace our very stable and definitely efficient-enough Apache 1.3.26, PHP, mod_gzip, etc. with Apache 2.x and the corresponding modules, he wouldn't blink an eye. Why? Because he trusts me to make good decisions. I can't think of any better reason to stick with what is known to be stable, versus something that is "cooler," newer, or jus' phatter.
After all, as much as we like to think it, they don't pay us systems administrators to sit there and hack. They pay us to deliver systems that work.
kmem russian roulette: Aquillar> dd if=/dev/urandom of=/dev/kmem bs=1 count=1 seek=$RANDOM
So. Why did they add threading at all? What were the advantages, apart from making the code more complicated and prone to breakage, breaking module interfaces and making modules more difficult to write and making it less portable?
Threading in general is a really really bad idea unless you absolutely need it. Stick with a process model, with IPC if needed, unless you're one of those poor sods who absolutely has to have threading.
In fact, the only engineering idea that could be worse for Apache would be to include C++ code... can you say 'unresolved symbol in xxxxx'? You'd never find two binary-only modules that could be loaded into the same server. I do so love trying to figure out exactly which version of which compiler I have to compile Apache with to link it to the proprietary modules we, unfortunately, have.
Well so what. They're not changing the APIs now, are they? That's right, because now it's the final version, and not experimental code with labels "functionality might change!" all over it. I'd much rather they develop a high-quality, future proof API now than redo everything one year later. (And I'm not saying it's high-quality or anything, I'm not in the position to judge that.)
Switch back to Slashdot's D1 system.
We have two linux web servers (one a backup of the other) running Apache 1.3.26. We've not upgraded to Apache 2.0 as basically what we have works and is pretty much bulletproof. As we don't currently have a spare server we could use for testing purposes, we're leaving it alone. Sometime, perhaps, but not right now.
We do have one Windows machine also running Apache 1.3.26 - basically we needed a Windows web server for some web-based data drivers, and I really didn't want to use IIS for obvious reasons. (Basically, I think Microsoft would be doing themselves a favour by scrapping IIS and taking out a licence on Apache.)
Does anyone know how good Apache 2 is in its Windows variant? I heard it had significant improvements over 1.3.x, so that might be worth upgrading.
"Information wants to be paid"
And just because I use PHP, you assume I don't know Java? I do know Java; we don't use it because of its *insane* memory requirements. Sorry, but just having Tomcat idling costs hundreds of megs of RAM.
I thought java was all about write once, run anywhere.. what's the point of java if you're using it server side? We're not exactly changing webservers every other day. Why not just use C++ or C# instead?
Java is a language without a purpose these days. No one wants to use it client side, and server side just doesn't make sense when compared to the initial purpose of the language (being able to run the same binary anywhere makes no sense when it's only going to be running on your webserver).
cLive
-- Trinity in high heels carrying a whip: The donimatrix - there is no spoonerism
I don't understand why it matters. Apache 1.x is being used by people who are happy with it. Good. They're happy, ASF must be happy they're happy. Apache 2.x is being used by people who are happy to upgrade to it. They're happy, ASF must be happy they're happy. So, where's the problem? Does it matter to ASF that people aren't flocking to use Apache 2? People will migrate as and when they see a need to. This is a good thing, not a bad one. This is why free software is free. No-one is forcing anyone to do anything, but there is more choice. So, who isn't happy? Third party modules will be patched/re-written when there is sufficient need, not just for the sake of it. This is a good thing.
I know, I'll get modded into the basement for asking, but I wonder if Apache 2.x will do any better on intel's new hyperthreading processors.
There's an article here that mentions intel's future offerings and how they will all feature hyperthreading, and while the 25% performance increases must be mostly a marketing scam, I wonder how this new bullet item on the P4 feature list will work out.
Okay, I'm buying some of the hype for the time being, so sue me.
It took a while to get mod_webapp working on FreeBSD (with enough research done that I wasn't opening any new ports to the outside world). But once I was comfortable with the new setup, I was back.
I must admit, it does seem slower sometimes, but that might be because I upgraded to Tomcat 4 at the same time. Since I don't get nearly so much traffic that it makes a difference (it's a hobby site), Apache 2 works fine for me.
I think that a number of the posts here are missing an important point about the introduction of threading in Apache 2 (note: I claim no expert knowledge in the field of threads). Whilst it may be true that Linux's process model is so efficient that threads offer only marginal performance improvements (at the potential cost of less stability, etc.), the same is not true of Windows. IIS has always appeared to run much faster on Windows than Apache ever has - a major factor that might well be the only reason that IIS is still used (after all, IIS's complete lack of security should, if all things were even, mean that no sane sysadmin would even consider running IIS as a webserver). If version 2 allows Apache to run under Windows at the same, or better, performance as IIS (which I believe it does), then this should lead to an increased take-up of Apache on this platform. At the same time, given that this doesn't really impact significantly on its performance on Linux (and arguably improves its scalability in large implementations), then what's the big deal? I think for this reason alone Apache 2 should be supported and encouraged to get the "critical mass" take-up it needs to flourish.
It's too late for me to die young
"What you you mean[...]?"
Windows threading implicitly uses thread local storage for all objects allocated in the context of a thread. In order to instance those objects in another thread, you have to marshal the data via CoCreateFreeThreadedMarshaler().
This isn't obvious in a lot of cases, because programmers have a tendency to do initialization work before they break work off to a worker thread, and the Windows VM system *happens* to leave a mapping for local instance data in the created threads for any instance data which was created prior to the thread being created.
Basically, this means that if I create some connection state objects in a pool, spawn a bunch of worker threads to wait for client requests, and then assign a state object to a client request, I'm fine... I never see the need for marshalling.
BUT... if I need to create session state objects after the threads are created, if that new object causes a new allocation, rather than using the remainder of an existing allocation, then the data space mapping ends up being private to the thread. A second request on the session object will result in a reference to a mapping that doesn't necessarily exist in all threads, only the one in which it was created, unless the object is explicitly marshalled (data copied) between the threads.
Now this is probably not a problem for the Apache programmers, who may have been aware of this, but, more likely, lucked out because of order of creation of state objects and threads (Apache works very hard to front-load state creation, and cache the results). But it IS a problem for application modules that have to make calls into unprotected services.
Probably the most glaring one of these which is going to impact Apache on Windows modules is that the LDAP library itself is not thread-reentrant, because the connection state for the module will not be created before the threads instances are created (this is a common problem with the use of the standard LDAP libraries for programmers on Windows platforms). Because of that, you have to explicitly wrap the LDAP library with a free-threading safe interface which converts concurrent requests into serial requests -- effectively rental or apartment model threading the library, from the application's point of view (apartment model with one occupant, the worker thread allowed to talk to the library).
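The "convert concurrent requests into serial requests" wrapper is a general pattern, so here is a minimal sketch of it in Python rather than in Windows COM terms. `NotThreadSafeClient` is a hypothetical stand-in for a library with unprotected state; it is not an LDAP binding, and a plain lock is a simplification of apartment-model threading.

```python
import threading


class NotThreadSafeClient:
    """Hypothetical stand-in for a library that keeps unprotected state."""

    def __init__(self):
        self.calls = 0

    def search(self, query):
        current = self.calls       # read-modify-write: unsafe if interleaved
        self.calls = current + 1
        return "result for " + query


class SerializedClient:
    """Wrapper that lets only one thread at a time into the library."""

    def __init__(self, client):
        self._client = client
        self._lock = threading.Lock()

    def search(self, query):
        with self._lock:
            return self._client.search(query)


raw = NotThreadSafeClient()
safe = SerializedClient(raw)


def worker(n):
    for i in range(1000):
        safe.search(f"uid=user{n}-{i}")


threads = [threading.Thread(target=worker, args=(n,)) for n in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(raw.calls)
```

The cost is the obvious one: every caller queues behind the lock, so the library becomes a serial bottleneck no matter how many worker threads the server runs.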
The standard LDAP libraries are just one example of a service commonly used in web servers used as services platform integration frameworks.
There are actually a lot of others, because the Windows threading models are not very well understood by most people: they tell you that you need to marshall data, but they don't tell you the reasoning behind it by telling you how the threading actually works; even their own programmers sometimes get it wrong, and they DO have access to the documentation. That makes it very hard to deal with secondary dependent effects automatically -- and I guarantee you that Apache 2.0 does NOT deal with these effects for third party libraries that modules will need to be able to use.
-- Terry
Unless I'm horribly mistaken there exists no build w/SSL, and compiling it myself resulted in a very unstable mess that crashed 1 out of 10 times after requesting an https url.
FreeBSD's current threading is implemented in user space. Although work is under way to move it into the kernel, that work is being done *ONLY* for SMP scaling and quantum utilization efficiencies.
As it stands, it is fully compliant with the POSIX threads standard.
If it is not working for Apache, it is because Apache is not a POSIX compliant threads client implementation.
From looking at the code, we can see this is the case, with the Apache code having an assumption of kernel threads, which you are not permitted by the POSIX standard to assume.
Although I have not yet verified it, an examination of the code *seems* to indicate that it has "the Netscape problem", which is an assumption about scheduling coming back to a given thread in a group of threads after involuntary preemption by the kernel when the process quantum has expired.
In older versions of Netscape, this displayed as a bug in the Java GIF rendering code, which was not thread reentrant, in that if you used a Java application as a web UI, and moved the mouse before all the pictures were loaded, the browser would crash. After I explained this, Netscape corrected their assumption, and the problem went away.
Ignorance of the requirements for writing threaded applications which will work on all POSIX compliant threads implementations is no excuse, nor is it a valid reason for blaming the host OS, unless you make it known what your requirements are, above and beyond the standard contract offered by POSIX, and that you are stricter than an application written to the POSIX interface, without such additional assumptions.
You will find that you have these same problems on MacOS 9 (NOT FreeBSD-derived), MacOS X (uses Mach threads), Mach, Plan 9, VxWorks, OpenVMS, etc.
You will find you do NOT have these problems on systems with implied contracts above and beyond those provided by the POSIX standard: Solaris, UnixWare, Windows, and Linux. You may have *other* problems in Windows, related to implied contracts over virtual address space issues (see other posting).
-- Terry
Who wants to go to 2 and discover you've got your pants down? With the only point of 2.0 being to have an Apache that runs faster on Windows, the only upgrade that makes sense is for Windows admins to go to *nix. Why would *nix admins go to a less-tested, less-secure, less-understood server? Hope the major module writers never port to this bastard child.
"with their freedom lost all virtue lose" - Milton
Yes, I think that was really the first top quality mod for 2.0 I have seen...works fine on my test machine btw
Power Corrupts,Absolute Power Corrupts Absolutely, leaving one person(group)in charge is absolutely corrupt.
Yeah but does your boss trust you enough to give you a raise? Sun Tsu (Beliskner's amendment) - Everybody wants something, if an employee craves trust then give him trust slowly, that way he won't ask for a raise (same as Intel artificially keeps chip development slow so they can milk money from the market).
A caveman dreams of being us, the incalculable power and riches. We dream of being Q, then what?
That's a good question.
I won't argue the legacy code portion of the question; all engineers prefer writing new code to maintaining old code. 8-).
The answer, at least according to me, is that SMP itself is barely a good idea for most OSs, because of scheduler-based CPU affinity implementation. Very few OSs have this right; Sequent (now IBM) and SGI got it right. Everyone else thinks that SMP scalability for Intel breaks down at 4 processors being the point of diminishing returns, but what they are really saying is that they aren't able to address the bus contention scaling issue in software without help.
Hyperthreading puts another tier of preference in there: you don't cache-bust if you move a thread between ALU's in the same CPU, but by the same token, you can't run non-ALU functions simultaneously without stalls.
Basically, this means that you need OS affinity changes, where commercial OSs, and even the free ones, like Linux and FreeBSD already have issues with affinity implementation (Linux implements affinity in the scheduler, leading to the ability of threaded processes to starvation deadlock other processes, but waves a dead chicken to try and work around it, and FreeBSD's implementation is a really rough set of patches that have yet to be integrated).
On top of the OS affinity changes, you have to have compiler help, as well, or the benefit of a Hyperthreaded processor sits at about 20-25% additional, relative to another physical CPU. With compiler help, this "jumps" to an average of 70-75%, relative to another physical CPU.
So basically, it's a relatively cheap way of incrementally increasing performance, if your OS supports the idea. If all CPUs came this way from the factory, then there's a modest win to be had, assuming a direct tie between user space threads and scheduling units. If they don't, then you will pay some penalty on non-Hyperthreaded CPUs, which probably isn't worth the trade-off, if you want to deploy on commodity hardware.
In the Linux case, the person to ask would be Ingo Molnar, who has done much of the work on the scheduler based affinity, and also on the Tux in-kernel HTTP server, including getting it, the TCP stack, the minimal driver, and the data set to live in the confines of a single CPU cache. I don't agree with the approach he has taken in the scheduler, and have discussed it before, but there's no arguing that it works, even if I personally think it makes things more complicated than they ought to be.
I think you will find that he will say that the benefit from Hyperthreading in a single CPU machine in that case is no better than perhaps 15-20%, relative to a non-Hyperthreaded processor, which reinforces the idea that the value of Hyperthreading is mostly in the compiler.
Actually, I've heard the Intel documentation on how to write Hyperthreaded code generating compilers described as "Don't do what GCC does", which seems pretty apt to me, all things considered. 8-).
-- Terry
The build process has been slowed down and, IMO, is entirely broken. Previously I ran the configure script, which took a minute or so, compiled and installed. It worked.
Now I run a monstrous ./configure, which calls itself recursively and takes about ten minutes to complete, at which time any and all warnings have scrolled well past the top of the window. It does not report easy mistakes such as trying to make "so" a shared module until it is almost finished. And the libraries are not linked against the modules properly, so attempting to use a static libssl or libm is not possible.
An upgrade from 1.3.x to 1.3.x+1 took about half an hour. An upgrade from 1.3.x to 2.0.x has taken me the better part of two days, including reinstalling openssl shared so that mod_ssl works at all, for no immediate gain.
I can understand that people do not make the switch.
Is it easily possible to run PHP on Apache2 if one sticks to the non-threaded process model?
One thing that is very nice about Apache 2 is that it is so much easier to get Apache+SSL built and working. Building Apache 1.3 plus SSL plus PHP is still a bit of a chore, though I'm sure the PHP developers wouldn't consider it so.
- jon
Ganymede, a GPL'ed metadirectory for UNIX
with the possible exception of mod_php stability, i
think the single greatest thing that would encourage
the uptake of A2 is open documentation. Doc has
always been a terrible deficiency of Apache, but
in the 1.x series, the community gradually developed
an adequate expertise by force of the demands of
circumstance. But A2 configuration differs in
crucially important ways from A1, and the readily
visible documentation, beginning with the conceptual
model, and continuing on to the detailed syntax and
semantics of configuration directives, is far too
inaccessible to the busy admin at this point.
-I like my women like I like my tea: green-
1) PHP 4.2.x is stable, with libpspell, imap, pg and mysql support.
2) Apache supports the "per child" user definition. The inability to run as any user other than "nobody" is a real limitation. But, if I could define the user/group of a file and have the script run as that user/group, security for web applications would simply skyrocket! (I'd except "root" from that list, tho)
I have no problem with your religion until you decide it's reason to deprive others of the truth.
I went to install apache 2.0 on my home server running FreeBSD 4.4, which isn't all that old. /usr/ports told me that my OS wasn't new enough to install apache 2, so I installed 1.3 instead.
Maybe that's why apache 2 isn't all that popular yet.
At least mafia-owned pizzarias make excellent pizza. Compare to Bill Gates.
I'm aware of the sendfile problem and the "fix".
Sendfile is not an operation for which it's really valid to have the function return until it's complete, so that completion status will be accurate. The assumption here was that kernel threads were being used, so that one thread calling sendfile would not block execution of other threads until the sendfile was complete.
The assumption of kernel threads here, or that sendfile could be call-converted into a non-blocking call plus a context switch to another thread, is what was bogus.
Though it boggles the mind that one would expect sendfile(2) to be POSIX threads safe at all, considering that it's not a system call POSIX standardizes...
-- Terry
The basic problem is thread group affinity.
Basically, the promise of threads is that you will not be paying the equivalent of a full process context switch overhead, because your VM and other process-specific things will not have to change when context switching from one thread in a process and another thread in a process.
On a machine that has 1001 processes, where you are 1 of those processes and have five threads in your thread group (process), you basically have a 4 out of 1004 chance of one of your threads being picked as the next thing to get a quantum when one of your threads makes a blocking call, so that it's no longer runnable.
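The 4-in-1004 figure is easy to check. This back-of-the-envelope sketch assumes every other process is a single runnable entity and that the scheduler picks uniformly among runnable entities; neither assumption is stated in the post, and the variable names are just for illustration.

```python
other_processes = 1000    # the 1001 processes minus our own
our_threads = 5
blocked = 1               # the thread that just made a blocking call

runnable_ours = our_threads - blocked
runnable_total = other_processes + runnable_ours
p_stay = runnable_ours / runnable_total   # chance the next quantum stays in our process

print(f"{runnable_ours}/{runnable_total} = {p_stay:.4f}")   # prints 4/1004 = 0.0040
```

So without affinity, roughly 99.6% of the time the next thing to run is somebody else, and you pay the full process context switch.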
What that means is that you have just reneged on the promise of lower context switch overhead, if you run thread #1, then run "cron", and then run thread #2.
So you have to play favorites, and say "I know "cron" has been waiting a long time, but I just blocked processing on thread #1, and thread #2 is runnable, so I'm going to preferentially run thread #2, because it lets me avoid the VM switch, and the TLB shootdown, and the other overhead of a full process context switch, and therefore lets me keep my promise about threads being lower overhead than processes".
Any time you play favorites, you starve your non-favorites; just like a Robin or Sparrow with a Cuckoo's Egg in its nest.
So then you have to add all sorts of arcane accounting and other crap to avoid the starvation of other processes, and your scheduler becomes very, very complicated.
Compare this with Scheduler Activations, or an async call gate, where you give a quantum to a process -- and the quantum belongs to that process. In this case, your process runs until either there are no more threads to be run, or until its quantum is used up.
Things are actually more complicated than even this; for example, you want a threaded program to compete as multiple processes for quantum, or you are encouraging people to write programs that fork multiple children, instead of threads, in order to allocate themselves more quantum. On the other hand, you want to set some upper bound on the amount of unfair competition a single unprivileged program can engage in, relative to other processes on the system.
If you attack thread group affinity as a scheduler problem, the amount of complexity you introduce is substantial, and there will always be corner cases.
There's actually been a huge amount of research on this; check the NEC CS search engine for "scheduling" and "load balancing" and "parallel".
-- Terry
With respect, the benchmarks people have posted have shown a 10-15% performance degradation in the switch from 1.x to 2.x of Apache.
I agree that your argument is valid for SMP systems... but it assumes that the application that's being used is also threaded, or rewritten to be threaded, and that the libraries it uses are also threaded. Wherever they aren't, you are going to eat the performance at whatever serialization boundary you put there. It might as well be a standard one that allows legacy code to continue to function as it did before.
-- Terry
I understand the argument; however, it's pretty clear that the practice here has diverged significantly from the theory.
The context in which we are making these postings is an observation about the non-adoption of Apache 2.0, whose design was intended to prevent non-adoption for the reasons you state.
Still, here we are.
I'm fully willing to admit that I'm speaking in hindsight, and trying to analyze why people have failed to adopt Apache 2.0. I understand that non-adoption was not the plan; but reiterating the plan won't make the non-adoption unhappen.
-- Terry
Pre-forking, threading, foo, bar, mish, mash... blah..
In the final analysis, all the major apache 1.3 modules will never work correctly, to the point where code for one works well in the other, and vice-versa. The sad truth is that, like with Apache 1.x, the modules will slowly creep to replace the CGI's, and that took a few years to happen, and mainly with mod_perl replacing perl CGI's.
yeah, that might suck donkies, but its the sad way of human nature. We simply want to make it like we used to have it in 1.3, and whatever. This it will never be again. Totally new modules should be written, and used by the upcoming generation of coders, those who are not corrupted by what we older folks have become used to. I'm 26 btw.
For example, the syntax of php is very good, and so are many of its ways of structuring things. But php itself needs to be thrown away as it stands now. Perl cannot speak of good syntax, it is simply one of the ugliest, yet most useful languages there ever was. Yet mod_perl has a good chance of remaining viable on Apache2. This is what confuses most folks, because they don't understand how something that is, to them, the elegant code they write could not work well in another environment. And when your apache module becomes a place that itself is a launch pad for other modules, then what? For example, in php... most folks like to have mysql as a module, or GD, or whatever. However, now you have to wonder whether in Apache2 mysql could be a direct module to Apache2 itself, and php, or perl, just share the common thread. Do you suppose that php, or perl could be written in a way to share their connections to MySQL? no... probably not going to play nice like that.
People just have to get past the notion that their development environment is just plain bad. The people at the Apache foundation knew it, and probably expected this sort of crap; why they want to mess things up in the next release to confound the module writer is beyond me.
It isn't a lie if you believe it.
It depends what you count as Unix. I do not believe QNX has virtual memory, at least not in the sense of paging to disk, and I think QNX is POSIX compliant. In the beginning Unix swapped complete processes, not pages, and that is also not what would normally be called virtual memory.
Finally! A year of moderation! Ready for 2019?
Userland threading and blocking calls do not mix, ever.
The way userland threading works is "call conversion", which trades a blocking system call for a non-blocking system call plus a context switch.
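As a toy illustration of call conversion (not any real threads library), here is a sketch in Python where generators stand in for userland threads. The names reader, writer and run are made up for the example. The read that would have blocked is issued as a non-blocking read; when it reports "would block", the "thread" yields, which is the context switch back to the userland scheduler.

```python
import os
from collections import deque


def reader(fd, received):
    """Userland 'thread': read until EOF, yielding instead of blocking."""
    while True:
        try:
            data = os.read(fd, 4096)     # the converted, non-blocking call
        except BlockingIOError:
            yield                        # would block: switch to another thread
            continue
        if not data:                     # EOF, writer closed its end
            return
        received.append(data)


def writer(fd, messages):
    """Userland 'thread': write each message, then give up the CPU."""
    for message in messages:
        os.write(fd, message)
        yield
    os.close(fd)


def run(threads):
    """Round-robin userland scheduler over generator-based threads."""
    ready = deque(threads)
    while ready:
        thread = ready.popleft()
        try:
            next(thread)                 # run until the thread yields
        except StopIteration:
            continue                     # thread finished, drop it
        ready.append(thread)


read_fd, write_fd = os.pipe()
os.set_blocking(read_fd, False)          # make reads non-blocking
received = []
run([reader(read_fd, received), writer(write_fd, [b"hello ", b"world"])])
os.close(read_fd)
```

If the read were left blocking, the reader would stall the whole process on its first call and the writer would never get to run, which is the whole point of the complaint about blocking calls.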
FWIW, I worked indirectly on the DEC MTS product on VAX/VMS back in 1992 (indirectly, in that I made patches to the Bliss code as a Novell employee on a cooperative project with DEC), and I used the undocumented "liblwp" in SunOS 4.1.2 in the late 1980's, and before that, I used the "sigsched" package in the mid 1980's. All this adds up to me having experience with implementing call conversion schedulers going on 20 years.
The problem with sendfile(2) is that it sends excessively large blocks of data all in one go, and that those sends have to be atomic because they are not restartable.
The *obvious* workaround for the problem is to break the call up based on the size of the object being sent, so that the blocking operations don't block "too long".
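A rough sketch of that chunking workaround, in Python. send_in_chunks, its parameters and fake_send are hypothetical names for illustration; the send callable is assumed to return the number of bytes it actually sent, the way write(2) does. On a platform with os.sendfile you could pass something like lambda off, n: os.sendfile(out_fd, in_fd, off, n), but the demo below uses a fake sender so it runs anywhere.

```python
def send_in_chunks(send, total, chunk_size, yield_cpu=lambda: None):
    """Send `total` bytes as a series of bounded calls.

    send(offset, count) is assumed to return how many bytes it sent.
    yield_cpu() is where a userland scheduler would get to run other threads.
    """
    offset = 0
    while offset < total:
        count = min(chunk_size, total - offset)
        sent = send(offset, count)
        if sent == 0:                    # nothing sent: stop rather than spin
            break
        offset += sent                   # handles short sends as well
        yield_cpu()                      # no single call blocks "too long"
    return offset


calls = []
switches = []


def fake_send(offset, count):
    """Stand-in for the real call: record the request, pretend it all went out."""
    calls.append((offset, count))
    return count


sent = send_in_chunks(fake_send, total=10, chunk_size=4,
                      yield_cpu=lambda: switches.append(1))
print(calls)   # [(0, 4), (4, 4), (8, 2)]
```

Picking chunk_size is the trade-off: small chunks bound the blocking time, large chunks keep more of sendfile's efficiency.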
Another workaround is to have "worker processes", which are used as contexts for the blocking call, to "accommodate" sendfile.
The only canonically *correct* fix is to provide asynchronous interfaces for all synchronous calls, and perform call conversion on them. For sendfile, this either means an aio_ context version, or changing the return value to be the number of bytes out of the range actually sent, and having it work as write(2) does, in terms of non-blocking file descriptors.
By the same token, I could argue that System V message queue receives "should work" -- another patent absurdity, given that such operations are, by definition, synchronous.
-- Terry
Virtual Memory makes sense when real memory is significantly less than addressable memory. With a few caveats, virtual memory gives you the speed of real memory and the size and expense of disk. Sixteen-bit minicomputers have (had?) a 16-bit address space of 64k. Some have had their lives extended by using a VM in reverse to utilize more real memory than virtual memory.
Multics is probably the only OS for which VM was intrinsic. Multics used virtual memory for permanent file storage! Otherwise, Operating Systems tend to act very much the same with or without VM. Obviously, the OS has to *do* something about it, but the only real difference is that the machine looks bigger than it is.
"not from what i see. IMHO, threads makes share things between two processing easier. how can u make a connection pool with many process? and you can do all cpu based work in a group of threads and let one thread do all other i/o based work. These things are impossible with multi process and damn hard with single process FSA."
The classical answer to this is "rfork" or "sfork". But there are others.
As one of the two engineers responsible for adding the ability to share the file table via opening the /proc for the process and setting a flag, so that fork(2) would behave differently in SVR4.2, I can guarantee you that there are other methods of achieving what you want, without threads. Incidentally, this greatly pissed off the threads people at USL, because we didn't use their shiny happy application model: it was inappropriate for the problem we were solving.
Like descriptor table sharing, address space sharing, and SMP scalability, threads are a hammer that is applied to a lot of different problems, on the theory that if all you have is a hammer...
-- Terry
Businesses and people who run their sites aren't going to mess around with upgrading to new versions unless there is a driving reason for it.
If there were only 100 people using this product, they would be enthusiasts, and about 85 of them would probably be running the latest version (heck, probably betas), but there are other factors involved.
The more people who use the product, the less quickly that people will adopt the latest and "greatest" version. Heck... a lot of people are still using Office 97 (those that use M$ products...).
If there is not a business need, then it is a _bad_ idea to change your platform. So, as far as I'm concerned, this is not news-worthy.
T
---- It puts the lotion on its skin or else it gets the hose again. It does this whenever it's told.