Can .NET Really Scale?
swordfish asks: "Does anyone have first-hand experience with scaling .NET to support 100+ concurrent requests on a decent 2-4 CPU box with web services? I'm not talking about a cluster of 10 dual-CPU systems, but a single system. The obvious answer is 'buy more systems', but what if your customer says, 'I only have 20K budgeted for the year'? No matter what Slashdot readers say about buying more boxes, try telling that to your client, who can't afford anything more. I'm sure some of you will think, 'what are you smoking?' But the reality of current economics means 50K on a server is a huge investment for small companies. One could argue 5 cheap systems at 3K each could support that kind of load, but I haven't seen it, so inquiring minds want to know!"
"OK, I've heard conflicting opinions about whether or not .NET scales well, and I've been working with it for the last 7 months. So far, from what I can tell, it's very tough to scale, for several reasons:
- Currently there isn't a mature messaging server, and MSMQ is not appropriate as a high-load messaging platform.
- SOAP is too damn heavyweight to scale well beyond 60 concurrent requests on a single-CPU 3 GHz system.
- SQL Server doesn't support C# triggers or a way to embed C# applications within the database.
- The throughput of SQL Server is still around 200 concurrent requests for a single- or dual-CPU box. I've read the posts about the Transaction Processing Performance Council (TPC) results, but get real, who can afford to spend 6 million on a 64-CPU box?
- The clients we target are smallish, so they can't spend more than 30-50K on a server. So where does that leave you in terms of scalability?
- I've been running benchmarks with dynamic code that does quite a bit of reflection, and the performance doesn't impress me.
- I've also compared the performance of a static ASP/HTML page to a web service page, and the throughput goes from 150-200 to about 10-20 on a 2.4-2.6 GHz system.
- To get good throughput with SQL Server you have to use async calls, but what if you have to do sync calls? From what I've seen the performance isn't great (it's OK), and I don't like the idea of setting up partitions. Sure, you can put mirrored RAID on all the DB servers, but that doesn't help me if a partition goes down and the data is no longer available.
- I asked an MS SQL Server DBA about real-time replication across multiple servers and his remark was "it doesn't work, don't use it."
I hate to say it, I've been too long out of the MS development world. That kind of overhead managed to amaze me.
I'm deploying systems right now (some buzzword compliant, some (the more efficient ones) on lowly little open source) that scale to an order of magnitude higher transaction volume at a fraction of the cost. No, none of them are Windows.
No wonder my company has been doing well in a downturn. (Oh, sorry, we're "recovering" now.)
I forget what 8 was for.
Seems to me swordfish is going to be coding it anyway. I'm sure he can figure out how costly it is to retrain him to program Unix/Java.
Having said that, colleges/universities are churning out Java programmers at an alarming rate. And seeing how unemployment is only rising (lots of experienced people on the market) newcomers are really, really cheap! (They're used to living like.. well.. like students!)
Also, is programming for this new-fangled
Now, I agree that finding reasonably adept administrators for Windows is much easier, and cheaper, than finding ace Unix admins. But that doesn't say anything about coders.
If swordfish is doing a feasibility study on this, for Pete's sake, suggest an alternative with less Microsoft in it! Any reason why that server should be
I've got to tell you, it's probably not Java so much as the actual application server and/or the application. We use ATG Dynamo, and with it we get what we pay for: more than 4000 concurrent users (active sessions) per Dynamo instance without any problems. Although with version 6, they've decided to go all J2EE buzzword compliant and complicated the entire setup.
"If one of the servers has a problem - I can remotely fix it over my cell phone connection, and I don't have to charge them travel time. If it was Windows - I'd have to drive there."
What?! You've never heard of any of the following:
-- Terminal Services
-- VNC for Windows
-- Remote Desktop commercial programs
I am sorry, but that is just on crack (and so is whoever modded you "Insightful".) In fact, with Terminal Services and the rdesktop client program, you can even administer a Windows desktop or server from a Linux or Mac box. Yes, you can do remote reboots, remote software patches, remote software upgrades, and pretty much everything else.
There are lots of valid reasons for using Linux/BSD/UNIX, but being ignorant about Windows certainly doesn't help your case.
The kind of loads big firms need to support are on the order of tens of millions of users with millions of transactions a day. By transaction I mean a buy process, which can contain a dozen to a couple hundred individual orders. In other words, the number of complex inserts/updates is tens of millions to hundreds of millions a day.
For example, big firms like Fidelity, Citigroup, Thomson, Vanguard, and Schwab have millions of customers with 100,000-plus portfolio managers. Throughout a given work day, a portfolio manager may generate a couple hundred orders and submit them in one or two batches. This is done because it's cheaper for them.
Can .NET scale well? As others have said, it can if you design it right. For example, if you use MSMQ for its designed job, it works well. If you write your queues for MSMQ with plain hashtables and you don't index the messages, you're unlikely to support 10K+ messages a second. On the other hand, if you write custom queues, profile the messages, index them efficiently, and make sure no other heavyweight stuff sits on the same box, it can scale. Is that easy? No. You have to understand the problem you're trying to solve. Let's say, hypothetically, you have insane performance requirements like 100K+ messages a second for a messaging tier: you're better off using IBM MQSeries. Can you do the same thing with MSMQ? Sure, if you build a bunch of custom stuff, write the messages to a database, index, partition, and load balance. It will probably take you 8-12 months, but you can do it with the right people and good hardware. Would you want to use XML for that messaging system? The answer is obviously no, if you want to keep the CPU and memory loads manageable.
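To make the "index the messages" point concrete, here is a rough in-memory sketch in Python. It is not MSMQ code and uses no MSMQ API; the class name IndexedQueue, the routing key, and the sample messages are all made up for illustration. The idea is only that a secondary index lets you pull the oldest message for a given key without scanning the whole queue.

```python
from collections import defaultdict, deque
from itertools import count

class IndexedQueue:
    """FIFO queue with a secondary index on a routing key (hypothetical design)."""

    def __init__(self):
        self._ids = count()
        self._messages = {}                 # id -> (key, body)
        self._order = deque()               # ids in arrival order; may hold stale ids
        self._by_key = defaultdict(deque)   # key -> live ids in arrival order

    def put(self, key, body):
        msg_id = next(self._ids)
        self._messages[msg_id] = (key, body)
        self._order.append(msg_id)
        self._by_key[key].append(msg_id)

    def take_next(self):
        """Remove and return the oldest message overall, or None if empty."""
        while self._order:
            msg_id = self._order.popleft()
            if msg_id in self._messages:    # skip ids already taken by key
                key, body = self._messages.pop(msg_id)
                self._by_key[key].popleft() # oldest overall is oldest for its key
                return key, body
        return None

    def take_by_key(self, key):
        """Remove and return the oldest message for `key` without a full scan."""
        ids = self._by_key.get(key)
        if not ids:
            return None
        return self._messages.pop(ids.popleft())

q = IndexedQueue()
q.put("acct-1", "buy 100")
q.put("acct-2", "sell 5")
q.put("acct-1", "buy 7")
print(q.take_by_key("acct-2"))  # prints ('acct-2', 'sell 5')
print(q.take_next())            # prints ('acct-1', 'buy 100')
```

A real messaging tier would also need persistence, transactions, and partitioning, which is where the 8-12 months go.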
Many people have claimed they support thousands of transactions. Sure, if all you're doing is an insert into one table. Simple stuff, right? Financial transactions like those in trading systems do a heck of a lot more than a simple insert into one table. More often than not, a trade transaction with 100 orders goes into the database, affecting several tables. The middle tier then has to get events and check each order to make sure it is valid and does not violate regulations or other compliance requirements. Sometimes it requires analytics like Tibco or what the industry calls Business Intelligence. Regardless of the server, stuff like analytics takes time (seconds). Obviously, if you're running complex analytics that scan 10 million rows of data with several joins in the query, you're better off using a dedicated analytics (OLAP) server. Can .NET handle 1K analytics requests per second? If it's cached, sure. If the nature of the data is very dynamic, as in real-time trading systems, no way. Doing that is very hard and most people avoid it.
The key here is setting expectations accurately, so your customer knows what is realistic. If you have a hard time communicating that to your customer or management, then find another job.
In other words, if you define a target performance metric and find that a single user can access the system at that speed or better, then the system can be said to "scale" if it still performs within that metric, on average, when 20, or 50, or 100, or however many users are accessing it.
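That definition can be turned into a quick check. The following is only a sketch in Python: the function name, the shape of the input, and the numbers are invented for illustration, and it assumes you have already measured response times at each load level yourself.

```python
from statistics import mean

def max_users_within_target(latencies_by_users, target_seconds):
    """Return the largest tested user count whose mean latency meets the target.

    latencies_by_users maps a concurrent-user count to a list of measured
    response times in seconds (hypothetical measurements from the caller).
    Returns 0 if even the smallest tested load misses the target.
    """
    best = 0
    for users in sorted(latencies_by_users):
        if mean(latencies_by_users[users]) > target_seconds:
            break  # the system stops scaling at this load
        best = users
    return best

# Made-up numbers for illustration only.
measurements = {1: [0.2, 0.3], 20: [0.4, 0.5], 50: [0.8, 0.9], 100: [1.5, 2.0]}
print(max_users_within_target(measurements, target_seconds=1.0))  # prints 50
```

With these invented figures the system "scales" to 50 users against a one-second target and falls over somewhere before 100.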
Of course, due to the nature of computing systems, any system will hit a point where it will no longer scale to the requirements with the same hardware configuration. At that point we start talking about scaling upwards through hardware upgrades. But you can only upgrade hardware so far, at which point, again, the system will reach a hard limit on its ability to perform as required.
The next logical step is to scale through redundancy: If a system can, in whole or (more commonly, as in "multi-tier" web clusters) in part, run concurrently in multiple instances, then you can scale by adding more instances of the system or system components: Multiple web servers, multiple "back ends", multiple database servers, and similar. This kind of organic scaling can be, in practical terms, near-infinite if a system is designed well; for example, multiple mirrored web servers serving static pages will scale indefinitely, whereas a trading system requiring inter-process synchronization through a message-queue system will most likely not.
In scalability terms, efficient redundant clusters are the holy grail. It's the way Google scales, and it's how the core parts of the Internet scale.
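For the easy case, identical instances that share no state, spreading the load can be as simple as round-robin dispatch. Here is a minimal Python sketch; make_instance and RoundRobinPool are hypothetical names, and the "servers" are plain functions standing in for real boxes.

```python
from itertools import cycle

def make_instance(name):
    """Build a stand-in server instance (hypothetical) that labels its replies."""
    def handle(request):
        return f"{name} served {request}"
    return handle

class RoundRobinPool:
    """Hand each request to the next instance in turn."""

    def __init__(self, instances):
        instances = list(instances)
        if not instances:
            raise ValueError("need at least one instance")
        self._next_instance = cycle(instances)

    def dispatch(self, request):
        return next(self._next_instance)(request)

pool = RoundRobinPool([make_instance("web1"), make_instance("web2")])
for page in ["/index", "/about", "/contact"]:
    print(pool.dispatch(page))
```

This only works because the instances are interchangeable. As soon as requests have to be synchronized through a shared queue or database, adding boxes stops being this simple.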
In the context of this Slashdot story, since the poster faces limited possibilities for investing in expensive hardware, he might consider going for the "many cheap boxes" route, if his Microsoft-dominated infrastructure permits it.