Slashdot Mirror


Are Linux Transactions Slower Than Win2k's?

FullClip asks: "In the July issue of PC Magazine, Red Hat Professional is compared to Windows NT/2000 on the basis of ServerBench, which tests the maximum Transactions Per Second (TPS) for a given number of clients. Red Hat 6.1 (when tweaked) matched the performance of Windows, but showed a terrible decrease in performance at about 24 clients, to a mere 20% of the level that Windows was able to maintain. Somehow this disturbs me. Doesn't Linux perform better than that in client-server environments? If someone can point me to a non-FUD benchmark site, it would be appreciated..." Is this yet another case where benchmarks have been skewed severely to show a deficiency that doesn't exist? Or is this another area where Linux needs improvement? [Updated 6 July 2000 2:15 GMT by timothy] You may want to compare this with the far different results reported by SpecWeb.

23 of 218 comments

  1. Re:6.1? by tzanger · · Score: 3

    That is the biggest problem with the fast pace of Linux upgrades, vendors don't have the luxury of 20 billion bug breakers for their code, they have to spend lots of time verifying that their code works against any upgrades.

    And this is why they used Win2000 instead of WinNT4?

    If you're gonna pit the top of one and the middle of the next, I ain't even gonna look at your benchmark. They used Win2k, so (in my mind) should have used Linux kernel 2.4.1-prewhateveritistoday.

  2. Shocking new benchmarks by cxreg · · Score: 3

    If you find this hard to believe, check out these mind blowing statistics. Who would have ever thought?!

  3. Re:F to the U to the D by be-fan · · Score: 3

    Blah blah blah, this is Mindcraft all over again. People are used to benchmarking. They're used to not having to hyper-tweak the systems they get in. In fact, it's an industry practice to benchmark systems "out of the box", meaning that even something like the Diamond ViperII, which would be a great card with updated drivers, receives a poor review because what shipped wasn't up to snuff. If RedHat is attempting to compete in the commercial marketplace, they have to play the game. Don't ship products untweaked. Take a look at the default workstation install of RedHat 6.1. Why is Sendmail running on my workstation? I don't even use sendmail. Samba? What? The at daemon, I've never used that. Cron, nope, not that either. INET, I'm not serving anything. NFS, I've never even SEEN an NFS drive, much less used one. Now I'm sure there are tweaks that the ZD guys could have done to maximize the performance. Would it increase the TPS that much? Maybe, maybe not. Remember, Win2K is much more multi-threaded than Linux, and tends to stall less on these kinds of things. The point is, if those tweaks exist, they should be part of the default install. It was incredible that until Mandrake, there were no mainstream distros that shipped with hard-drive optimizations on. That's akin to not clicking the DMA button in control panel, something that manufacturers get lambasted for in reviews. If RedHat is going to make it in the "real world" they have to play the game. Believe it or not, polish counts for a lot. In the high end business market, and especially in the desktop (home, business) market, lack of polish can be a deal-breaker for an otherwise great product.

    --
    A deep unwavering belief is a sure sign you're missing something...
  4. NT was designed to do this. by be-fan · · Score: 3

    First, let me say that benchmarks like this are useless. I'm not against benchmarking, mind you, but what I AM against is artificial benchmarks. For example, in testing 3D cards, ZDN still uses an artificial benchmark called WinBench3D. Of course, manufacturers with no morals (ATI, Megabyte, Intel) can optimize for these types of benchmarks, and thus seem faster than they are. However, if you do a real world benchmark, like say test the FPS in Quake, your results are actually valid. If a manufacturer cheats so their card runs actual games faster, then that's actually a good thing. What I'd like to see is a real world server benchmark. Maybe set up a COM+ simulation where actual COM+ applications (for example a database) make actual requests on the server, then measure how many clients the server can handle. Or do something like have the server serve up a database and have scripted clients access the database in a realistic manner. Those are the kinds of benchmarks that really work, but unfortunately, they take actual work. As for NT beating Linux, remember that NT is designed to run stuff like this. Though NT is a microkernel, it runs all its servers in kernel space (though Linux does as well, I think). Also, the kernel has a lot of design concessions that facilitate a really high I/O rate. They're not so good for doing real world tasks, because the OS tends to step on its own toes, but if you're testing raw performance, NT usually wins. But these performance enhancements take their toll on stability and ability to handle high loads. For example, in Windows2K, DirectX has some interface calls implemented into the hardware abstraction layer, which really speeds up performance, but at the cost of stability. That said, WindowsNT really IS a decent OS, and some parts are simply better designed than their counterparts in Linux.

    For the desktop (if you have the RAM), W2K is perfectly stable (because most desktop users reboot at least once every few days) and nicely supports media. Also, Win2K has a good multi-threaded TCP/IP stack that was rewritten from NT4. Despite its faults (ahem, bloat) it does actually have some features that the Linux guys would be wise to look into (ahem, COM). W2K is nowhere near being the end-all be-all of OSs, but neither is Linux. They both have their flaws, and NT has actually improved enough that W2K actually has some uses in the server role! Anything doing small transactions that doesn't need to be particularly stable (for example DNS) would be served well by NT, which responds well to little transactions like this.

    --
    A deep unwavering belief is a sure sign you're missing something...
    1. Re:NT was designed to do this. by be-fan · · Score: 3

      NT "runs all its servers in kernel space"? Do you mean drivers? Services? Services are not run in
      kernel space, although they can be set with a high priority. All drivers are run in kernel space,
      though.
      >>>>>>>>
      NT is a microkernel operating system. In microkernel OSs, servers are processes that provide system services such as networking, I/O, graphics, RPC, etc. In some cases, servers even provide memory management. In most microkernel OSs, these servers are in user space. However, in NT, they run in kernel mode. It's true that drivers run in kernel space, but so do the subsystems that load the drivers. This is a significant difference to most microkernels which have servers and large parts of drivers in userspace. BeOS for example has all servers in userspace, and most drivers are loaded by the kernel. IBM's experimental WorkPlace OS, on the other hand, put drivers mostly in userspace and even put services such as paging in user space. This tended to have a performance hit, and NT avoids it by running servers in kernel mode, even though that is riskier.

      "The kernel has a lot of design concessions that faccilitate a really high I/O rate." Really? Have you
      looked at the code for network and storage?
      >>>>>>>>
      No, but I have looked at design documents that detail the NT architecture. NT was designed for VERY high performance I/O.

      I have written network and storage drivers for NT4/Win2k, and it is not designed to be fast.
      Check out the DDK. Both storage and network use a miniport
      model (SCSI Miniport and NDIS Miniport) with a port driver doing much of the work. To make
      matters worse, Win2k uses WDM for its drivers. WDM tends to add an additional driver object to
      the layered model. Both miniport and WDM are designed to be very general and take control away
      from the driver developer. A call to read a few bytes from the disk goes through so many layers.
      First, the file system drivers, then class.sys, then disk.sys, then scsiport.sys, then
      vendorscsiminiport.sys, then hardware. There can also be any number of filter drivers in the mix.
      WDM allows upper and lower filters for each FDO (Functional Device Object). We got a nice
      performance boost by not using the SCSIMiniport/Class driver interface. Win2k is not designed to
      be fast as much as extendable and general.
      >>>>>>>>>>
      Win2K is definitely not designed to be extendible and general. While WDM may add a lot of overhead to the driver interface, that is not NT's native driver model. Microsoft added WDM to allow drivers for Win98 to work on NT. Also, you cannot deny that the architecture is tuned more to high performance than generality. A lot of critics of NT complained that the architecture was "academically dirty", meaning that a lot of design decisions resulted in a faster but less clean system. For example, Windows 2K has DirectX class integrated into the HAL. Very unclean. NT also runs all services in kernel mode. Again, unclean. The NT microkernel globs up a lot of services that should be in the servers, which furthers performance, but makes the microkernel less general and less extendible. It runs the windowing system in kernel space! How general and extendible is THAT? NT does have a lot of management overhead, true. But it is also designed for raw performance. If you're not changing anything (i.e. simply streaming data off a disk while not doing anything else) it is really fast.

      --
      A deep unwavering belief is a sure sign you're missing something...
    2. Re:NT was designed to do this. by be-fan · · Score: 3

      Okay, if you optimize your card (aside from just outright reporting false numbers) so that it runs Quake and Unreal faster, then you just sped up 90% of the games out there that use similar algorithms. Even if you just speed up QuakeII, you've sped up all the games based on that engine. Manufacturers can't do that because it requires optimizing the whole driver, which is actually legitimate! Also, real world benchmarks are harder to fudge. If you do something to a driver that makes Quake run faster, then chances are that driver also makes all the other 3D games run faster. That's even more relevant for serving. I proposed a benchmark where the test emulates real-world conditions, i.e. the script reflects how a person would actually use the system. If vendors do something to optimize for it, then they'd be optimizing real world usage. That's a good thing.

      --
      A deep unwavering belief is a sure sign you're missing something...
  5. Re:I think I get it by be-fan · · Score: 3

    That could also be a problem. Processes take more time to start than a thread does. Also, it still doesn't mitigate the fact that the TCP/IP stack is single threaded so THAT can stall.

    --
    A deep unwavering belief is a sure sign you're missing something...
  6. Re:Aren't all benchmarks subjective? by Spasemunki · · Score: 3

    I don't think that benchmarking is about deciding what is the 'better' operating system (you know, the one with the big 'S' on its chest). Benchmarking for something like this is about testing several products at the same task, and finding which one is better for that task. In that way, benchmarking is a good step in accepting that different OSes are good at different things. Saying that Red Hat 6.1 does worse on an application serving test doesn't mean that it's time for Bill the Gates to dance in his underwear shouting "Victory!"; if RH61 had won, it wouldn't mean that it was time to throw the first (or last) shovelful of dirt on top of Windows. All it means is: Windows might be better at the particular task of serving applications. It's about finding those individual strengths, not culling the herd.
    Or at least, it ought to be.

    "Sweet creeping zombie Jesus!"

  7. Re:Uh, what sort of client is it? by Entrope · · Score: 3

    Indeed. The ServerBench description page claims it's an "application server" benchmark, but it looks like it's something they implemented themselves. This means that it's almost worthless as a test bench -- it's not very representative of *anything* since server performance varies more with the server program than almost anything else.

    If their protocol is simple enough to make it easy to optimize for different platforms (for example, Win32 vs Unix), it's almost certainly too simple to make an interesting test. If it's a complex protocol, I suspect they optimized the Win32 code a lot more than the Linux code.

  8. Application and non-Application Benchmarks by dingbat_hp · · Score: 3

    This whole bench test is pretty useless. Ziff wanted an "application" benchmark that was cross-platform and didn't rely on applications. What they actually built was so content-free that it simply tests network and OS performance as far as the TCP/IP stack.

    Not surprisingly, they found that the Win TCP stack is quicker than the (known to be single-threaded) Linux stack. QEFD.

    I'd like to see better benchmarks, but I'd much rather see something for simple Corba vs. Corba, or Corba vs. DCOM. SOAP (the Apache approach of deployable handlers) vs. SOAP (Servlets) vs. SOAP (Microsoft's SOAP-on-a-ROPE) would be even more interesting. We're doing something along those lines ourselves - maybe it will be publicly publishable.

    To get the alternative "Useless benchmark shows Linux to be faster than Windows" story go here.

  9. Re:Why is it alwasy Linux v Windows... by Hrunting · · Score: 4

    Why is it always Linux vs. Windows?

    Because that's what Linux advocates trump up. Ever since Linux became 'popular', advocates have been pitting it against the big bad evil Microsoft. Nevermind that until recently, Solaris was just as closed-source and dealt in the same underhanded tricks as Microsoft. Nevermind that they're two completely different types of operating systems aimed at two entirely different classes of people.

    Basically, Linux people want Linux to be able to do everything that Windows can. They want it to be a robust server operating system. They want it to be an easy-to-use client operating system. They want it to run everything. They want to be the monopoly (but a monopoly of choice, not of force). Nevermind that Windows 2000 isn't trumped as the OS for everyone and Windows 98 isn't used in high-end server systems (and yet, advocates want Linux to do all of these tasks, and rule the hand-held market as well). And so, we get tests like this, Win2K vs. Linux, when really, what we should be getting is Win2K vs. Solaris (which I'm quite confident would blow Win2K out of the water).

    Does Linux really want to compete at the levels of AIX and Solaris?

    No, they want to compete with Windows. Windows is the enemy. Sound the alarms, and when Windows does something better than Linux, something is seriously wrong with the world (or so they would have you believe). Perhaps what would be a better suite of tests for Linux is one which isn't a comparison test at all, but rather one which looks for deficiencies so that people can start fixing them and quit debating about whether or not a comparison is valid.

  10. Re:Why is it alwasy Linux v Windows... by ibbey · · Score: 4

    Because that's what Linux advocates trump up. Ever since Linux became 'popular', advocates have been pitting it against the big bad evil Microsoft. Nevermind that until recently, Solaris was just as closed-source and dealt in the same underhanded tricks as Microsoft. Nevermind that they're two completely different types of operating systems aimed at two entirely different classes of people.

    Perhaps a less biased way of saying this is "Because Windows is, arguably, the main competition for Linux. While AIX & Solaris are also viewed as competitors, due to Linux' current weakness in scalability, they are not considered direct competitors."

    Now, that said, I'll respond by saying you're an idiot. Linux & Windows 2k ARE NOT designed for two different types of users. Both are designed for general use, high-end workstations, and low-to-mid-end servers. In particular, in the context of the question, they are designed for EXACTLY the same market.

    As for AIX & Solaris, they are also the competition. But most people who have the budget to run a high-end Unix server have a reason to spend the money (support, a boss that's an idiot, or a need for specialized capabilities or scalability that Linux & Windows don't allow). Linux is rapidly advancing, & is beginning to address the last two issues (scalability & features), but at present it's hard to directly compare Linux to some of the commercial Unixes. And of course, you again need to consider the context. Since the question was specifically in response to a benchmark comparing Linux to Win2k, why would you even expect AIX or Solaris to be brought up?

  11. Re:Mindcraft issues still? by Dan+Kegel · · Score: 4
    I have a little writeup on the history of the wake-one fix (and others) at http://www.kegel.com/mindcraft_redux.html . Looking at Andrea's patch, one important change was

    diff -u linux/net/ipv4/tcp.c:1.1.1.6
    @@ -1575,7 +1575,7 @@
    add_wait_queue(sk->sleep, &wait);
    for (;;) {
    - current->state = TASK_INTERRUPTIBLE;
    + current->state = TASK_INTERRUPTIBLE | TASK_WAKE_ONE;

    Offhand, it looks like that particular change isn't in Red Hat 6.1 or 6.2. I don't know whether this would affect ServerBench performance, though. It's hard to tell without looking at the source.
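
    To put rough numbers on why that one-line change matters, here is a toy Python model (not kernel code, and not taken from Andrea's patch). It assumes N idle processes sleeping on one listening socket, and that each incoming connection wakes either all of them (the behaviour the patch removes) or exactly one (the TASK_WAKE_ONE behaviour). The function name count_wakeups and the figures are made up for illustration.

```python
def count_wakeups(waiters: int, connections: int, wake_one: bool) -> dict:
    """Toy model of `waiters` processes sleeping on one listening socket.

    Assumption: every waiter is idle when a connection arrives. Each
    connection wakes either one sleeper or all of them; only one woken
    process actually gets the connection, the rest go back to sleep.
    """
    total = 0
    for _ in range(connections):
        woken = 1 if wake_one else waiters
        total += woken
    useful = connections  # exactly one process per connection does real work
    return {"total": total, "useful": useful, "wasted": total - useful}


if __name__ == "__main__":
    # Wasted wakeups grow linearly with the number of sleeping processes.
    for n in (4, 24, 48):
        herd = count_wakeups(n, 1000, wake_one=False)
        one = count_wakeups(n, 1000, wake_one=True)
        print(n, "waiters:", herd["wasted"], "wasted (wake-all) vs", one["wasted"], "(wake-one)")
```

    The model ignores scheduler cost and cache effects entirely; it only shows that wake-all does work proportional to the number of sleepers for every connection, which is the shape of problem that could turn into a traffic jam as clients are added.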

  12. Re:what were they doing when changing specs? by 1984 · · Score: 4

    Without going too far into it, I remember discussing a lot of this stuff with the guys doing those tests at the time. Those (fairly) low-down tweaks were attempts to see if the Linux setup was tripping up on something obvious (e.g. trying to auto-negotiate on the NIC) and whether it could be sped up. That was because everyone was really quite shocked at the figures coming out, and went to some trouble -- including talking to Red Hat -- to attempt to eliminate configuration issues and the like, because everyone thought the numbers looked odd. But after a lot of effort, they still looked odd.

    And you don't (or shouldn't) 'root' for any of the platforms you're testing when you benchmark. You go to a reasonable amount of trouble to make sure that you are testing what you think you are (and not some config hiccup that's hamstringing the results). But having done that, you sometimes still get a surprise. That's what happened here.

  13. I think I get it by levendis · · Score: 4

    The server was a dual-proc machine. Win2k has a multi-threaded TCP/IP stack, Linux 2.2.x doesn't. That probably accounts for most of the issue right there - at around 24 users, the single-processor limitation of the Linux TCP/IP stack was reached, and the Win2k machine just split the load up.

    Of course, IANALOLT (I am not Alan Cox or Linus Torvalds), but it seems the most likely explanation to me...

    --
    ---- I made the Kessel Run in under 11 parsecs.
  14. DB2 on other platforms by tjwhaynes · · Score: 4

    This just went up on the TPC website Monday, there is a monster leader in transaction processing price/performance and that is:

    • IBM Netfinity with Intel Xeon processors
    • IBM DB2
    • and Windows 2000.

    You will not believe this unless you see it!

    Yes - but check out the hardware. 32 four-way Pentium Xeons, over a terabyte of disc space, and an obscene amount of RAM. That is not a standard setup, although it was built with standard parts (trust me - I know the team which built it). That is not to say that the DB2 team isn't extremely pleased with this result :-)

    Just because it's running on Windows 2000 does not automatically mean that there might not be better choices for an OS to support this benchmark. It's not even entirely clear to me that Windows NT might not have been faster here, given the benchmarks which MS put out on their own website showing that Windows 2000 does better in limited memory, but is worse than NT above 128MB (and these machines had a lot more than that). Remember that DB2 UDB has a shared-nothing architecture, which means that it scales extremely well, and it is additionally capable of using raw devices, so the OS in question may not have a big impact on performance. And DB2 runs on most platforms out there: OS/2, AIX, HP-UX, Solaris, Linux, Windows 9x/NT/2000, SGI, SCO, Dynix and various 64 bit platforms as well.

    Of course, it would be nice to have some side-by-side benchmarks of DB2 UDB on Windows NT/2000 and DB2 UDB on Linux. There will almost certainly be some benchmarks on Linux sooner or later - since IBM has made Linux available for all its machines, it makes sense to publicise the performance of its flagship DB product on Linux as well.

    Cheers,

    Toby Haynes

    P.S. I work on DB2 UDB development.

    --
    Anything I post is strictly my own thoughts and doesn't necessarily have anything to do with the opinions of IBM.
  15. Testing by Dungeon+Dweller · · Score: 4

    "Each was tested on its own network, with 2 subnets of 24 servers. The Windows network consisted of 48 PIIs, whereas the Linux network had the added advantage of having a Cray Supercomputer making requests at full charge on the 24th node of the first subnet..."

    --
    Eh...
  16. My credibility is fine by tilly · · Score: 5

    I hate people talking how 2.4 will fix everything, 2.2 surely didn't.

    Where did I say that 2.4 will fix everything?

    I said that there is a specific problem, known in 2.2 that has turned up before, that is a potential explanation for this bad result.

    There are other known (and fixed) scheduler problems.

    Encountering any combination of these in 2.2 benchmarks is to be expected. Don't make these out to be more or less than indications that 2.2 had some obvious room for improvement.

    I am sure that 2.4 will have more problems. However many problems that turned up in benchmarking 2.2 have been fixed (because they turned up in benchmarking 2.2), and preliminary benchmarks of 2.4 (eg the recent SpecWeb result where it nearly tripled Windows 2000 on a similar 4 CPU box) indicate this.

    Now will 2.4 be ready for the enterprise, as they like to say? Not really. First of all, until it has been through a few point releases, I would expect some significant bugs. (To be expected in any software.) Aside from that issue, it lacks many manageability tools and a volume manager, more work needs to be done on failover, journaling filesystems are needed, etc. I have been convinced by Larry McVoy's argument that further work on SMP is not needed, NUMA (done through clustering and virtual operating systems) is.

    These are known problems. Work is being done on them. However there will be room for complaint about Linux vs more mature systems for some time to come. However problems are getting solved, and Linux is moving up the food chain, fast.

    Regards,
    Ben

    --
    My usual seat in the cluetrain is at http://pub4.ezboard.com/biwethey.ht
  17. Mindcraft issues still? by tilly · · Score: 5

    An immediate thought.

    The "thundering herd" problem that was identified in Mindcraft and fixed in 2.4, isn't that still present in RedHat 6.1? (BTW calling it "Linux 6.1" really irritated me.) That could explain a sudden drop-off. It is not a problem, not a problem, then suddenly becomes a problem and as soon as you get a slow-down, you get a real traffic jam.

    Just guessing...

    Ben

    --
    My usual seat in the cluetrain is at http://pub4.ezboard.com/biwethey.ht
    1. Re:Mindcraft issues still? by blakestah · · Score: 5

      The "thundering herd" problem that was identified in Mindcraft and fixed in 2.4, isn't that still present in RedHat 6.1? (BTW calling it "Linux 6.1" really irritated me.) That could explain a sudden drop-off. It is not a problem, not a problem, then suddenly becomes a problem and as soon as you get a slow-down, you get a real traffic jam.

      Yeah, the box was dual cpu and dual ethernet card, designed to show the weaknesses of linux networking as of the 2.2 kernels.

      However, as more recent benchmarks show, the soon to be released TUX package (from Redhat, GPLd) does extremely well in multi-cpu multi-ethernet card environments. These changes are likely to become embedded in Apache.

      I'd be really surprised if anyone has an x86 OS that could beat the one Ingo Molnar set up for the SpecBench tests. It more than tripled the Windows machine under unrealistically high loads with flat file service - 4 CPUs, 4 Gigabit ethernet cards.

      There are also issues about scheduling for high loads such as the one in the ZDNet article that have been addressed by a patch from IBM.

  18. Why is it alwasy Linux v Windows... by MosesJones · · Score: 5


    I know this sounds strange but when I'm looking at designing a high transaction application or site I don't even LOOK at Windows or Linux. Does it surprise me that Linux doesn't scale to the enterprise market? No, it's written by individuals for lowish demand systems that they require, rather than by Company A who is implementing for Company B something that will cost several million pounds of development.

    These sorts of tests are IMO unfair to Linux. Should you use NT/W2K or Linux for your high transaction application/site? The choice is more normally "Should I use Tru64, AIX or Solaris?".

    Linux works great for me as a webserver, as a client who takes a limited number of hits at a cheap price. If you want to scale you buy more boxes.

    On the back end use a large end server with lots of RAM that has a massive IO throughput.

    Does Linux really want to compete at the levels of AIX and Solaris ? Why not go for the niche, of cheap, reliable, and easy to scale horizontally.

    --
    An Eye for an Eye will make the whole world blind - Gandhi
  19. Linux Tweaks by rothwell · · Score: 5

    Okay, they set the ethercard to full duplex and increased the queue depth on the scsi card. Fine -- makes sense. Stopped the "atime" updates. Makes sense.

    But they also did this:
    echo 100 5000 640 2560 150 30000 5000 1884 2 >/proc/sys/vm/bdflush

    ... interesting. We're developing a new filesystem, and ended up ignoring bdflush completely to get good performance. Here's what those values mean:

    From /usr/src/linux/Documentation/filesystems/proc.txt:

    Table 2-2: Parameters in /proc/sys/vm/bdflush
    Value (default/tweaked)

    nfract (40/100)
    Percentage of buffer cache dirty to activate bdflush

    ndirty (500/5000)
    Maximum number of dirty blocks to write out per wake-cycle

    nrefill (64/640)
    Number of clean buffers to try to obtain each time we call refill

    nref_dirt (256/2560)
    buffer threshold for activating bdflush when trying to refill buffers.

    dummy (500/150)
    Unused

    age_buffer (3000/30000)
    Time for normal buffer to age before we flush it

    age_super (500/5000)
    Time for superblock to age before we flush it

    dummy (1884/1884)
    Unused

    dummy (2/2)
    Unused

    ... they seem to have changed one of the "dummy" values... wonder why? Other than that, they appear to have increased the interval at which bdflush runs, meaning more stuff is hanging around in memory. It may be that at 24 clients, bdflush is banging on the filesystem too much. I would love to see a graph of disk activity included with the results. Sometimes Linux will go through a silent-storm-silent-storm cycle as bdflush runs on a busy system. It would be interesting to see how a journaled filesystem would perform. I think Reiser does his own buffer-flushing rather than relying on bdflush runs to do it, meaning he has finer control over it. It would also be interesting to see this test run on FreeBSD, which does a better job keeping the disks busy.
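
    For anyone who wants to check the comparison mechanically, here is a small Python sketch that lines the echoed values up against the defaults from the table above and reports which fields moved. The field names and defaults are copied from that table; the labels dummy1/dummy2/dummy3 and the function name diff_bdflush are my own, added only to tell the three unused slots apart.

```python
# Field names and default values as listed in Table 2-2 above.
# "dummy1".."dummy3" are invented labels for the three unused slots.
BDFLUSH_FIELDS = [
    ("nfract", 40),
    ("ndirty", 500),
    ("nrefill", 64),
    ("nref_dirt", 256),
    ("dummy1", 500),
    ("age_buffer", 3000),
    ("age_super", 500),
    ("dummy2", 1884),
    ("dummy3", 2),
]

# The string the testers wrote to /proc/sys/vm/bdflush.
TWEAKED = "100 5000 640 2560 150 30000 5000 1884 2"


def diff_bdflush(line: str) -> list:
    """Return (name, default, new) for every field that differs from its default."""
    values = [int(v) for v in line.split()]
    if len(values) != len(BDFLUSH_FIELDS):
        raise ValueError(
            "expected %d values, got %d" % (len(BDFLUSH_FIELDS), len(values))
        )
    return [
        (name, default, new)
        for (name, default), new in zip(BDFLUSH_FIELDS, values)
        if new != default
    ]


if __name__ == "__main__":
    for name, default, new in diff_bdflush(TWEAKED):
        print("%-10s %6d -> %6d" % (name, default, new))
```

    Seven of the nine fields differ from the defaults, including the first unused slot (500 to 150); the last two are untouched.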

    Tweakers may also be interested in reading /usr/src/linux/Documentation/IRQ-affinity.txt, which describes how to have specific CPUs handle specific IRQs, as the Mindcraft tests did with NT.
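
    As a rough illustration of what that tuning looks like, the Python sketch below builds the CPU bitmask for an IRQ. It assumes the /proc/irq/<n>/smp_affinity interface, which takes a hexadecimal mask with bit 0 standing for CPU0; check the file on your own kernel, since older trees may not have it. The IRQ numbers and the helper name affinity_mask are invented for the example.

```python
def affinity_mask(cpus) -> str:
    """Return a hex bitmask with one bit set per CPU number (bit 0 = CPU0)."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU numbers start at 0")
        mask |= 1 << cpu
    return format(mask, "x")


if __name__ == "__main__":
    # Hypothetical plan for a dual-CPU, dual-NIC box: one NIC IRQ per CPU.
    # IRQ 19 and 20 are placeholders, not values from the test setup.
    plan = {19: [0], 20: [1]}
    for irq, cpus in plan.items():
        # Only prints the commands; nothing is written to /proc here.
        print("echo %s > /proc/irq/%d/smp_affinity" % (affinity_mask(cpus), irq))
```

    The script only prints the commands, so it is safe to run anywhere; whether pinning helps on a given box is something to measure rather than assume.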

  20. 2.2 kernels used by Ingo+Molnar · · Score: 5

    ServerBench is not available in source code, and the testing was done by ZDNet. From what I know about ServerBench, it uses a threaded IO model on NT, but a fork/process model on Linux. The Linux 'solution' is coded by ZDNet, with no possibility for us to influence or comment on the design and approach used. Even under these circumstances we expect the 2.4 Linux kernel to perform significantly better in ServerBench than 2.2 kernels. The 2.2.1x (and late 2.3.x) kernels had some VM problems, and with increasing VM utilization (more clients) this problem could have been triggered.

    SPECweb99 OTOH is a standardized benchmark with full source-code access (ServerBench is distributed as closed binaries), so all SPECweb99 implementation details are visible.

    Nevertheless it's technically possible that ServerBench triggers performance bugs in Linux - we'd love to see the source to fix those bugs ASAP, if they are still present in 2.4.