Sites Rejecting Apache 2?
An anonymous reader writes "Vnunet
reports on the low adoption of Apache 2 has caused its producers to advocate freezing development of the open-source Web server until makers of add-in software catch up. Almost six months after the launch of Apache 2, less than one percent of sites use it, due to a lack of suitable third-party modules." I'm not sure where they are getting the freezing Apache development part, more talk about forking for 2.1 right now on the httpd mailing list. The article does have it right though that until there is a reason to upgrade and the modules are in place that adoption is not going to happen. While the cores of both Perl and PHP are thread-safe, the third-party modules are not. This renders one the larger reasons to use Apache 2.0, the threaded http support, useless for applications using either of these application layers. It comes down to the question of whether the third-party module writers are better off supporting what is used or what is new.
As soon as they release a stable version for Apache 2 (aka 4.3.0), then I'll look seriously at switching. It's great that Apache 2 has stablized now, though, as it lets everyone else work around a stable project.
We'll all get to Apache 2, it just takes time to migrate.
My main question is, what would it matter if sites weren't using apache 2.0, isn't it enough that open source software is being used??
Ignore the "p2p is theft" trolls, they're just uninformed
PHP, mod_perl, any of the Java servlet modules. These are are third-party (basically the server doesn't ship with them even if they are other ASF projects). Anyone running anything other then a flat HTML site needs at least one of these or something similar.
You can't grep a dead tree.
Here's what would convince me to change.
... its always nice to know that the big boys have taken the plunge.
.. given that I bothered reading up on some of the early discussions.
-- References. Have any high profile apache sites migrated? While my sites are small
-- PHP Support. As of 4.2.0, Apache2 support was experimental. The change log does not show anything which says its supported.
-- Mod_gzip support. This is a big one. Mod_Gzip makes my sites download a extremely fast when users over dialup lines log in. This is true specially for low bandwidth countries in Asia. Mod_gzip support has left me fairly confused
Even with all of this.. I'm not likely to change unless there is a perceptible difference in the load / performance stats on my system during the switch.
I know Apache does not have any "customers" to support, but why were they so eager to break compatibility for Apache 1.3 modules in Apache 2.0? I know backwards compatibility code isn't sexy, but couldn't they keep the old module API and thunk it to the new API? Then Apache 2.0 could ship with rock-solid mod_php and mod_perl. Let modules developers migrate slowly on their own schedule.
Here's an interesting perspective from Ole Eichorn, the CTO of Aperio Technologies:
One of the more significant recent discontinuities occurred with the release of Apache 2.0. Although it has been under-reported, Apache 2.0 is significantly discontinuous (non-backward-compatible) with Apache 1.3. Many webmasters have decided not to upgrade for now, rather than have to recode their custom modules. And many of the custom modules out there are 3rd party, so the resources to make the changes are not readily available.
It is not clear to me why the discontinuity was required. There was no technical reason not to maintain backward compatibility. I think your essay gets it right, the people who made these decisions were not involved in the original development, and were not sufficiently aware of the impact their decisions would have on their developer community. Multi-threading processes, which inspired most of the discontinuity, primarily benefits Windows sites - a small proportion of Apache installations - and most Windows sites use IIS and aren't going to change.
I bet in a few years we'll be able to track Apache's decline as the leading web server back to this point.
cpeterso
The build system in Apache 2, while being vastly improved over the Apache 1 build system, is rather complicated, and has lead to a number of packagers simply not bothering, or having a hell of a time packaging it.
There is no RedHat or Debian packages of Apache 2.0 (offical as in from RedHat or Debian, and part of their stable distribution). There are a few Debian people who are packaging Apache 2.0 (namely Thom May, who is the current package bunny...err...maintainer *grin*), but last I heard they were having a horrible time getting it working, and it's still only in unstable (sid), and hasn't made it to testing (sarge).
If it gets into RedHat and Debian's stable distributions, chances are it'll make a higher percentage mark on site usage. Till then, I don't think things are going to change much.
I won't switch over to Apache 2 until there's an amiga port of it!
/^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$/i
As a software author, you really need to worry about your own users outpacing you. For instance, if someone likes a feature in Apache 2, and every module they use, except yours, works with Apache 2, people quickly discover that they don't need your module all that much anyhow.
Wasn't that everyone's experience when switching from Windows? You can't get program XYZ for Unix, so you discover that you never really needed it that much anyhow...
As a programmer, it always pays to be everywhere you possibly can. But, when it's open source, programmers don't care what's best for the user, so don't expect it to happen.
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
when distros will start shipping 2.0 as standard,
everyone will "just use" it. Of course there would
be some rejection rate, of stubborn people. 1.3
development would stop and everyone would slowly roll over to 2.0.
pro 2.0:
- threaded stuff is blindingly fast. most systems threads are faster then processes
- other new technologies, like layered content filtering are great for developers of hight traffic sites.
pro 1.3:
very very many people using apache use linux. Linux threads are almost same performance as processes. Due to kernel limitation, you can stack only so many threads per process.Plus threaded model does not account for stability. One NULL pointer dereference and you're gone. Apache2.0 of course uses bundles of threads. so you still have multiprocess model kicking around.
Expect 2.0 gain popularity on systems like Sun, BSD and Win32 where processes handling is relatively expesive. Threads are dirt cheap.
As everything, things take time. Just like well brewed beer.
cheers.
That or where it started to fork.If people are unwilling to go 2.x, they'll put the effort into adding new stuff into 1.x. Are we seeing Open Source at work?
Xix.
"Everything is adjustable, provided you have the right tools"
The number one problem with Apache 2 is its reliance on threads, and its assumptions about threading models.
This will certainly not win me friends in the "everything should use threads because it's easier to do linear programming than to build a session reentrant state machine" camp, but...
Threads are useful for SMP scalability, but they aren't very useful for much else (I/O interleaving is adequately handled by most network stacks, the I/O interfaces themselves, and the fact that almost all the bytes being mode are being moved from the server to the client: the protocol is very asymmetric, even if you aren't pushing multimedia files). In most cases, threads are a liability.
Under Windows, they introduce data marshalling issues that have to be accounted for in user code -- not just in the modules which implement interpreters for that user code.
Under UNIX, threads are generally a loss, unless there is specific scheduler support for thread group affinity, when threads are running on the same processor. and CPU negaffinity, when there are multiple processors, to ensure that there is maximal usage of computer resources.
If you do the first, then you have the possibility of starvation deadlock for other applications: basically, it's not possible to do it correctly in the scheduler, you have to do it by means of quantum allocation, outside the scheduler. This means a threading approach such as scheduler activations, async call gates, or a similar technique. If you do the second, then you pay a serious penalty in bus bandwidth any time locality spans multiple CPUs -- in other words, it's useless to use SMP, if you have, for example, a shopping cart session that needs to follow a client cookie around.
Overall, this means that you were much better off using session state objects to maintain session state, rather than using threads stacks to do the same job. This is actually pretty obvious for HTTP, in any case, where requests are handled independently as a single request/response pair, and connection persistance isn't generally overloaded to imply session information (you can't do that because of NAT on the clinet side, multiple client connections by a browser on the client side, and server load balancing on the server side, etc.).
Overall, this factors out into threads bringing additional pain for module writers, without any significant performance or other benefit, unless you go SMP, and have a really decent threads and scheduler implementation -- which means you are running a recent IRIX or Solaris, which is a really limited fraction of the total web server market.
Frankly, they would have been a lot better off putting the effort into the management of connection state and MTCP or a similar failover mechanism, and worried about NUMA-based scaling, rather than shared memory multiprocessor with particular threads implementation scaling. The cost for what you get out of the switch is just too high.
-- Terry
Let's clear up a few things. Yes, PHP support has been somewhat slow in coming, but the main reason is that there is very little motivation for us to rush to support it. This is because most of us really don't see the advantage of 2.0 yet. The threaded mpms don't work at all on FreeBSD due to bugs in the FreeBSD kernel threading code. These are fixed in FreeBSD's CVS, but are not in any released version as far as I know. Also, as was mentioned, PHP itself is threadsafe, for the parts that count anyway, but what about the 100-150 different libraries that PHP can link against? We know some of these are not threadsafe. We also think we know that a number of them are threadsafe. The rest, who knows. Do you want to be the first to discover that a certain library is not threadsafe? Thread safety issues don't tend to show up until you start banging at the server with production-level load. And the errors can be quite subtle and random in nature. These are not PHP libraries we are talking about. These are things like libgd, freetype, libc, libm, libcrypt, libnsf.
Of course, if you run the non-threaded pre-fork mpm, it should be ok. But really, what is the point then? That's why PHP support has been slow going. We develop stuff because we need it ourselves for something. Right now spending a lot of energy on supporting Apache 2 seems somewhat futile. What we need here is a concentrated effort on the part of many different projects to pool their knowledge and generally improve the thread safetyness of all common libraries. I have written a summary and started this work here:
Thread Safety Issues
I would very much appreciate comments and additions to this. I don't think Apache 2.0 is dead in the water, it just needs better overall infrastructure in terms of non-buggy kernels and a push to make all libraries threadsafe before it can really become a viable solution for sites needing dynamic content.
Or, alternatively, we might start pushing the FastCGI architecture more to separate the Apache process-model from the PHP one.
Red Hat's (null) 8.0 beta 3 has 2.0.40. You can probably take the SRPM for it and rebuild it on RH 7.x. I haven't tried it but it should work.
I agree it will get a LOT more use once the Linux and BSD distros start shipping it by default, and once PHP and mod_perl are solidified for it. The Red Hat beta includes both, so they should be about ready.
The most powerful features of Apache based sites aren't features of Apache but of 3rd party modules. PHP, mod_perl, mod_dav, mod_throttle and even Microsoft Frontpage modules contribute significantly to the appeal of apache. There is an excellant Report on Apache Module Popularity by SecuritySpace.com. In considering this report, you should notice the month over month growth in the usage of modulees which have not yet been ported to Apache 2. The developers of these modules will most likely respond to customer demands for support of apache 2, which is dependant of the Apache Software Foundation's ability to convince customers of the benefits of upgrading to Apache 2. In this respect the marketing of Open Source Software mimics the marketing of treditional commercial software. Let's hope they don't adome the strategy of some commercial software vendors by simply refusing to provide security fixes or updates to Apache 1.3.x when needed.This would certainly outrage Apache users, but in the case of Open Source would have the secondary effect of promoting forking of the codebase. On the bright side customers do have a recourse in the case of Open Source, where, they're left twisting in the wind in the case of commercial products.
--CTH
--Got Lists? | Top 95 Star Wars Line
You incriment the left-most number in the release number. So 1.3 is not expected to be compatible with 2.0, and Linux kernel 2.4 is not expected to maintain backward compatibility with 1.0 ;) This makes things much easier to maintain and see at a glance.
Now as to why they did it, Apache 1.3 is great. I love it, but it is not as cross-platform as it pretends to be (it does not perform well on Windows) and it really is not built for speed. If you need these things, you need multithreading, a better abstraction model so you are not assumign POSIX compatibility (and hence emulating it on Windows) etc. This means you break the compatibility. Pure and simple, but in the end, you get a better product.
Think of Apache 2.0 as Apache-- Next Gen. Not yet supported but when it does, it will be more competitive than 1.3.x because it has a better architecture.
LedgerSMB: Open source Accounting/ERP
You are better off using the open_basedir restriction instead of safe_mode for this. Set the open_basedir for each virtual host to that virtual hosts DocumentRoot and then PHP scripts will only be able to open files under that dir.
Of course, both open_basedir and safe_mode are crappy solutions to a problem that needs to be solved higher up. Like with the Apache2 perchild MPM, but that is a long way from being production quality on a couple of different levels.
Did any of you actually understand a word of what he just said?
Not to put salt in open wounds, but in IIS, which uses threads, they use a concept build in Windows: apartments. You have single threaded apartments (STA) and multi-threaded apartments (MTA). The webserver itself uses threads for handling requests and when a certain library is called/opened by the code, that library takes care of in which apartmentstyle the code is ran: in an STA or in an MTA. VB6 com objects f.e. can't run in an MTA, so they are run in an STA. This is controlled by windows (as a configparam of the com object). So here you see a combination of both worlds: multi-threaded and safe where it has to be, without the hassle of forcing the developer to write threadsafe code when the code itself isn't multi-threaded, but the environment is.
Of course, there are some issues: when you let the code executed by the request of user A create an object in an STA and move that into a container which can hold both STA's and MTA's, and let code executed by the request of user B access that user A's STA object, you get thread unsafety and possible crap.
However: the OS's functionality offers the option to do it threadsafe and still have multi-threading in full effect. Perhaps a thing to look at for the thread/process guys in the Linux kernel team.
(It has been a long time, but afaik, a simple fork() is not forking off a complete new process, but a childprocess which runs as a thread inside the mother process, or am I mistaken? (if not: why then the threadsafetly crap NOW, because a fork() will result in the same issues)
Never underestimate the relief of true separation of Religion and State.
As far as I can judge, there are two reasons why people wouldn't adopt Apache 2.0 . First of all, Apache 1.3 works Just Fine (WOW) for most sites, and it can therefore be considered wise not to upgrade to a later version which is based on a less-tested code base than what one is currently running.
The other thing is suggested by the author of the original post, and has to do with the fact that Apache 2.0 breaks compatibility with old modules. Downward compatibility is one of the Commandments in software development, and it's quite possible that this is a major reason for admins to be reluctant to switch to Apache 2.0.
Interestingly, both expecting people to upgrade to a product that almost certainly contains yet-to-be-discovered bugs, and breaking compatibility with previous releases are frequently observed in the practices of the Great Stan of Redmond. It may therefore not be surprising that those admins running Apache (rather than It Isn't Secure) would not go with it.
Please correct me if I got my facts wrong.
Well, my server has been running nicely for quite some time now.
I haven't encountered a single problem, Well, except that the default config is more secure and I had to manually change it to run legacy apps.
HTTP/1.1 200 OK
Date: Tue, 10 Sep 2002 08:18:09 GMT
Server: Apache/2.0.39 (Unix) PHP/4.2.2 DAV/2
Last-Modified: Sun, 24 Feb 2002 15:50:43 GMT
ETag: "2d405e-d7-4ac5ac0"
Accept-Ranges: bytes
Content-Length: 215
Content-Type: text/html; charset=ISO-8859-1
echo '[q]sa[ln0=aln80~Psnlbx]16isb572CCB9AE9DB03273snlbxq' |dc
You're not forking a 15MB process for every concurrent connection. You're creating a PID for every concurrent connection; process memory with regards to fork under most UNIX systems is _copy_on_write_, which means it isnt getting copied until such a time as it is actually written to. There's no real gain in memory.
Multi-threading is NOT more efficient that multi-process. That is a blanket statement and my reply is as well. Linus did something smart in that he considered a process and thread one and the same. However, the upside is that whena Linux "thread" dies it does not kill off the "process". MS does this too when it uses Apartments, etc. In fact COM+ is an exact mirror of the Linux process, thread strategy.
"You can't make a race horse of a pig"
"No," said Samuel, "but you can make very fast pig"
Actually it is the other way around. Linux has the smallest process creation and process switching overhead of any Unix with virtual memory. It is simply not possible for threads to be all that much faster than that. Apache 2 is optimizing something that simply was not all that expensive on Linux in the first place.
Finally! A year of moderation! Ready for 2019?
cLive
-- Trinity in high heels carrying a whip: The donimatrix - there is no spoonerism
FreeBSD's current threading is implemented in user space, although work is under way to move it into the kernel, that works is being done *ONLY* for SMP scaling and quantum utilization efficiencies.
As it stands, it is fully compliant with the POSIX threads standard.
If it is not working for Apache, it is because Apache is not a POSIX compliant threads client implementation.
From looking at the code, we can see this is the case, with the Apache code having an assumption of kernel threads, which you are not permitted by the POSIX standard to assume.
Although I have not yet verified it, an examination of the code *seems* to indicate that it has "the Netscape problem", which is an assumption about scheduling coming back to a given thread in a group of threads after involuntary preemption by the kernel when the process quantum has expired.
In older versions of Netscape, this displayed as a bug in the Java GIF rendering code, which was not thread reentrant, in that if you used a Java application as a web UI, and moved the mouse before all the pictures were loaded, the browser would crash. After I explained this, Netscape corrected their assumption, and the problem went away.
Ignorance of the requirements for writing threaded applications which will work on all POSIX compliant threads implementations is no excuse, nor is it a valid reason for blaming the host OS, unless you make it known what your requirements are, above and beyond the standard contract offered by POSIX, and that you are stricter than an application written to the POSIX interface, without such additional assumptions.
You will find that you have these same problems on MacOS 9 (NOT FreeBSD-derived), MaxOS X (uses Mach threads), Mach, Plan 9, VxWorks, OpenVMS, etc..
You will find you do NOT have these problems on systems with implied contracts above and beyond those provided by the POSIX standard: Solaris, UnixWare, Windows, and Linux. You may have *other* problems in Windows, related to implied contracts over virtual address space issues (see other posting).
-- Terry
The build process has been slowed down and, IMO, gone entirely broken. Previously I ran the configure script, which took a minute or so, compiled and installed. It worked.
Now a run a monstruous ./configure, which calls itself recursively and takes about ten minutes to complete, at which time any and all warnings have scrolled well past the top of the window. It does not report easy mistakes such as trying to make "so" a shared module until it is almost finished. And the libraries are not linked against the modules properly, so attempting to use a static libssl or libm is not possible.
An upgrade from 1.3.x to 1.3.x+1 took about half an hour. An upgrade from 1.3.x to 2.0.x has taken me the better part of two days, including reinstalling openssl shared so that mod_ssl works at all, for no immediate gain.
I can understand that people do not make the switch.
The basic problem is thread group affinity.
Basically, the promise of threads is that you will not be paying the equivalent of a full process context switch overhead, because your VM and other process-specific things will not have to change when context switching from one thread in a process and another thread in a process.
On a machine that has 1001 processes, and you are the 1 process, and you have five threads in your thread group (process), You basically have a 4 out of 1004 chance of one of your threads being picked as the next thing to get a quantum, when one of your threads makes a blocking call, so that it's no longer runnable.
What that means is that you have just reneged on the promise of lower context switch overhead, if you run thread #1, then run "cron", and then run thread #2.
So you have to play favorites, and say "I know "cron" has been waiting a long time, but I just blocked processing on thread #1, and thread #2 is runnable, so I'm going to preferrentially run thread #2, because it lets me avoid the VM switch, and the TLB shootdown, and the other overhead of a full process context switch, and therefore lets me keep my promise about threads being lower overhead than processes".
Any time you play favorites, you starve your non-favorites; just like a Robin or Sparrow with a Cuckoo's Egg in its nest.
So then you have to add all sorts of arcane accounting and other crap to avoid the starvation of other processes, and your scheduler becomes very, very complicated.
Compare this with Scheduler Activations, or an async call gate, where you give a quantum to a process -- and the quantum belongs to that process. In this case, your process runs until either there are no more threads to be run, or until its quantum is used up.
Things are actually more complicated than even this; for example, you want a threaded program to compete as multiple processes for quantum, or you are encouraging people to write programs that fork multiple children, instead of threads, in order to allocate themselves more quantum. On the other hand, you want to set some upper bound on the amount of unfair competition a single unpriviledged program can engage in, relative to other processes on the system.
If you attack thread group affinity as a scheduler problem, the amount of complexity you introduce is substantial, and there will always be corner cases.
There's actually been a huge amount of research on this; check the NEC CS search engine for "scheduling" and "load balancing" and "parallel".
-- Terry
Pre-forking, threading, foo, bar, mish, mash... blah..
In the final analysis, all the major apache 1.3 modules will never work corrects, to the point where code for one works well in the other, and vice-versa. The sad truth is that, like the Apache 1.x, the modules will slowly creep to replace the CGI's, and that took a few years to happen, and mainly with mod_perl replacing perl CGI's.
yeah, that might suck donkies, but its the sad way of human nature. WE simply want to make it like we used to have it in 1.3, and whatever. This it will never be again. Totally new modules should be writen, and used by the upcoming generation of coders, those whom are not corrupted by what we older folks have become used to. I'm 26 btw.
For example, the syntax of php is very good, and so are many of its ways of structuring things. But php itself needs to be thrown away as it stands now. Perl cannot speak of good syntax, it is simply one of the ugliest, yet most usefull languages there ever was. Yet mod_perl has a good chance of remaining viable on Apache2. This is what confuses most folks, because they don't understand how something to them, the elegant code they write, could not work well in another environment. And when your apache module becomes a place that itself is a launch pad for other modules, then what? For example, in php... most folks like to have mysql as a module, or GD, or whatever. However, now you have to wonder that in Apache2, that mysql could be a direct module to Apache2 itself , and php, or perl, just share the common thread. Do you suppose that php, or perl could be writen in a way to share their connections to MySQL, no... probably not going to play nice like that.
People just have to get past the notion that their development environment is just plain bad. The people at the Apache foundation knew it, and probably expected this sort of crap, why they want to mess things up in the next relase to confound the module writer is beyond me.
It isn't a lie if you belive it.