Can .NET Really Scale?
swordfish asks: "Does anyone have first hand experience with scaling .NET to support 100+ concurrent requests on a decent 2-4 CPU box with web services? I'm not talking a cluster of 10 dual CPU systems, but a single system. the obvious answer is 'buy more systems', but what if your customer says I only have 20K budgeted for the year. No matter what Slashdot readers say about buying more boxes, try telling that to your client, who can't afford anything more. I'm sure some of you will think, 'what are you smoking?' But the reality of current economics means 50K on a server for small companies is a huge investment. One could argue 5 cheap systems for 3K each could support that kind of load, but I haven't seen it, so inquiring minds want to know!"
"Ok, I've heard from different people as to whether or not .NET scales well and I've been working with it for the last 7 months. So far from what I can tell it's very tough to scale for a couple of different reasons.
- currently there isn't a mature messaging server, and MSMQ is not appropriate for a high-load messaging platform.
- SOAP is too damn heavy weight to scale well beyond 60 concurrent requests for a single CPU 3ghz system.
- SQL Server doesn't support C# triggers or a way to embed C# applications within the database
- The throughput of SQL Server is still around 200 concurrent requests for a single or dual CPU box. I've read the posts about Transaction Processing Council, but get real, who can afford to spend 6 million on a 64 CPU box?
- the clients we target are small-ish, so they can't spend more than 30-50K on a server. so where does that leave you in terms of scalability
- I've been running benchmarks with dynamic code that does quite a bit of reflection and the performance doesn't impress me.
- I've also compared the performance of a static ASP/HTML page to webservice page and the throughput goes from 150-200 to about 10-20 on a 2.4-2.6Ghz system
- to get good throughput with SQL Server you have to use async calls, but what if you have to do sync calls? From what I've seen the performance isn't great (it's ok) and I don't like the idea of setting up partitions. Sure, you can put mirrored RAID on all the DB servers, but that doesn't help me if a partition goes down and the data is no longer available.
- I asked a MS SQL Server DBA about real-time replication across multiple servers and his remark was "it doesn't work, don't use it."
... but Unix/Java programmers aren't. Wanting to write the code for free, too?
Apache, FreeBSD and a cluster of 10 or so $1k servers and a nice DB server running PostgreSQL.
Works for me.
I hate to say it, I've been too long out of the MS development world. That kind of overhead managed to amaze me.
I'm deploying systems right now (some buzzword compliant, some (more efficient ones) on lowly little open source) that scale to an order of magnitude higher transaction volume at a fraction of the cost. No, none of them are Windows.
No wonder my company has been doing well in a downturn. (Oh, sorry, we're "recovering" now.)
I forget what 8 was for.
Why on earth did you bring open source into it? If the man wanted to know about Linux & BSD, he would've asked.
It's a damn simple question: can .NET really scale?
If you don't have any experience with the scalability of .NET, I advise you to keep your mouth shut. The signal/noise ratio is bad enough already.
100 concurrent users isn't a lot.
What is the web app going to do? All the hardware in the world, and even open source won't help you much if you're trying to do the wrong things on a single machine. Database driven site? Commerce? Heavy read, heavy write, or both?
My first inclination is to recommend throwing that $20k at an ASP that can provide the server infrastructure to give you support for 100 concurrent connections.
Barring that, my recommendation would be to split the web front end and database, spending about $10k on each (using Dell or HPQ). I can almost guarantee that you aren't going to get 100 concurrent connections for less than $80k to $100k without doing some sort of load distribution. If you strip down the amount of dynamic content and, say, script a refresh of a static page, you might be able to do it, but we don't really know what the app is going to be doing.
Jerry
A) This consultant, it sounds like, is largely or exclusively MS. He's not going to suggest Open Source software to his client because that will mean a loss in business. You can hardly blame him; you gotta go with what you know.
B) Oftentimes a commercial solution to some problems exists where a free one does not. The cost of development and maintenance means that the balance is not strictly in terms of free and non-free; after all, your developers' time costs quite a bit as well and home-grown or open source solutions may need more time taken in administration.
This is a pretty complex issue; different analyses have been done with different results. I myself am partial to Open Source, but this does not mean that the obvious answer is, "Hey, go Open Source! It's free!" Get real.
I don't really know an answer but I will throw in my tidbit.
But first let me apologize for all the nutheads who say "drop MS - use Linux" and all the derivatives thereof. That doesn't help anyone, and doesn't answer the question. Might as well say "use a dustmop, works great on my floors!".
My advice would be to *try* and use a cluster of some sort instead of the one server approach. Sure, you can get some great big reliable iron - that is wicked fast... But what I have found is that scaling really needs more *bandwidth*. Not network bandwidth but memory, disk, I/O, that sort of bandwidth. Of course, the more machines - the more licenses... Good luck!
Right. Small businesses want to stay small, and sending all their money to Redmond is one way of doing that!
I can see it now- after commanding the drones to switch to Windows 2.003k, they look at the price tag- the jump in overtime, the additional hardware for that "faster" version, the new software licenses...
President:"But...but...that commercial said it would be cheaper, and it had lots of pretty people doing neat things, with nice music in the background! And the nice representative at the golf tournament said I'd get to have employees walking around with little handheld things that showed our inventory! And..."
CFO:"...You mean to tell me you bitched and moaned for months about how we needed to switch, and you based the decision on TV COMMERCIALS?"
President:"DUUUUH, of course not! I SAID, I talked to a MS rep too!"
(sound of 12 hands slapping 12 foreheads)
Please help metamoderate.
First, you didn't really specify anything except in generalities, but there's a few things that pop out from my experiences:
1. Why are you wed to C#, especially in regards to triggers? How many tiers exist, and are you pumping a lot of data back and forth?
2. Your scaling numbers are low already, especially under ASP and static HTML.
3. You never really define concurrent requests. For some people, it means simultaneous requests, and for others, it means simultaneous transactions. But you really are looking at fairly low numbers there, in either case.
4. Scaling this should involve looking at where you choke. One common choke point that keeps killing people is in open database connections. Are you running a pool? How large? How many connections does a page take? The single most common problem I've seen in scaling is poorly implemented connection pooling, thereby causing a ton of stuff to wait. Check this, check, then check again.
5. Sync versus Async shouldn't really be coming into play yet on the db.
6. When designing for light-weight systems, you want to minimize the tiers, and minimize the data passed back and forth. Just by reading this, I'm worried that you created a very elegant, but impractical, system that isn't suited to the hardware limitations.
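To make point 4 concrete, here is a minimal sketch of a bounded connection pool. It is in Python rather than C#, and sqlite3 stands in for SQL Server; the class name, pool size and timeout are all illustrative assumptions, not anything from the poster's system. The idea is only that a page borrows a connection, uses it, and always hands it back, so a burst of requests waits briefly instead of opening a ton of new connections.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool: callers wait (up to a timeout) instead of opening new connections."""

    def __init__(self, size, connect):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=5.0):
        # Raises queue.Empty if every connection stays busy for `timeout` seconds.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always hand the connection back, even if the page handler raised.
            self._idle.put(conn)

    def idle_count(self):
        return self._idle.qsize()


def make_connection():
    # Hypothetical stand-in database; check_same_thread=False lets a pooled
    # connection be used from whichever worker thread borrows it.
    return sqlite3.connect(":memory:", check_same_thread=False)


pool = ConnectionPool(size=4, connect=make_connection)

with pool.connection() as conn:
    row = conn.execute("SELECT 1 + 1").fetchone()
```

If a page needs two or three connections at once, a pool of this size supports proportionally fewer simultaneous pages, which is the "how many connections does a page take?" question above.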
No matter what OS you run you do need people to keep tabs on it. Most users are very stupid [re: running all email attachments] and are prone to damaging computer systems.
...
Even if they were using Linux they would need someone around to make sure everything runs smoothly.
The trick is to multi-task. Once the system is running, a small business sysadmin is not a full time job. They can also program or PR or
Also the benefit of not using MSFT tools is the weaker propagation of acronymedics. E.g. I can code DOM SOAP .NET ASP super programs. oh yeah
10 print 'hello world'
L33t!
Tom
Someday, I'll have a real sig.
This entire story is lacking units.. I am so confused, it is like this...
"I bought a 400 car from my dealer, who said it could go 0-1200 in 57, but I talked to an auto mechanic and he said that the rpm throttled at 4.5 billion, so I don't know if I should get a turbo charger which would at least boost the speed to 1295!!"
If you are talking about 100 concurrent request per second: Any DB worth its salt should handle that IFF the database queries aren't too complex. If they are, your schemas suck. This is doubly true on a 3 GHz machine.
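The units confusion can be pinned down with Little's law, which relates requests in flight, requests per second, and time per request. The sketch below is just that arithmetic; the example figures are made up for illustration and are not measurements from this thread.

```python
def in_flight(requests_per_second, seconds_per_request):
    """Little's law: average requests in the system = arrival rate * time in system."""
    return requests_per_second * seconds_per_request


def throughput(concurrent_requests, seconds_per_request):
    """Same law rearranged: the rate that keeps a given number of requests in flight."""
    return concurrent_requests / seconds_per_request


# Hypothetical figures: 100 requests/s at half a second each.
busy = in_flight(100, 0.5)
# Hypothetical figures: 100 requests always in flight, two seconds each.
rate = throughput(100, 2.0)
```

So "100 concurrent requests" says nothing about load until you also state either the request rate or how long each request takes.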
Why don't you just ask MS this question... what? huh? You can't? It's too expensive? They lie? They don't know?
Then why are you using .NET.
-pyrrho
Windows is only one burnt CD, redhat is like 3. thus, linux costs more!
2. SOAP is too damn heavy weight to scale well beyond 60 concurrent requests for a single CPU 3ghz system.
It doesn't sound like you're talking about .NET specifically, but just SOAP in general. Make sure you separate out the platform from the product. Saying web services with SOAP won't work is a long way away from saying .NET doesn't scale.
3. SQL Server doesn't support C# triggers or a way to embed C# applications within the database
Embedding applications in the database violates basic scaling principles: you need to separate out into n-tier, right? You don't want the database server doing anything but serving databases. Now, having said that, Yukon (the next version of MS SQL) will indeed let you do certain things in the database with .NET languages, but that's rarely going to be a way to make your system run faster and scale more. Plus, I'm confused - what's your alternative? What database are you going to recommend that allows you to embed C# (C++, whatever) programs in the database itself?
9. I asked a MS SQL Server DBA about real-time replication across multiple servers and his remark was "it doesn't work, don't use it."
Sounds like it's time to get a more informed consultant who can demonstrate failure or success beyond a throwaway line. I'm not saying replication does or doesn't work, but you can't base your enterprise plans on a single line from a single guy - let alone strangers like me on Slashdot. Furthermore, this isn't a .NET question, it's an SQL question.
It's easy to make big decisions if you break them up into a series of smaller ones. Look at each of your questions and decide if it pertains to .NET, or just a particular product. You might go with .NET and not use MS SQL Server, for that matter.
What's your damage, Heather?
People in SMALL business do not want a system which requires them to hire someone to constantly keep tabs on it.
What?#$#@ I don't care who this "SMALL" business may be, but if you put a server on the internet, and plan on not having someone to "keep tabs on it", please, get off of the f-ing internet. It's that type of mentality that yields the servers out there that STILL are spreading Code Red and Nimda, because nobody has kept tabs on these infected servers in years.
Hello, I have been developing a .NET Portal Application for the past few months. I ran a quick test on our application just to see how it would run.
Specs are as follows:
App Server:
Duron 800
512 MB RAM
40GB HD 7200RPM
DB Server:
Celeron 500
640 MB RAM
20GB HD 7200RPM
As you can see, these are not server class machines, but they seem to run the app alright. I ran a simulation of this application based on the IBS Portal www.asp.net running 150 Concurrent Requests Per Second:
The average requests per second on this app were 98.51. So, IMHO on low quality hardware, the .NET platform can handle about 100 requests per second before it starts to get hot.
You're bound to get lots of responses of how to scale the system up. I'll focus on scaling the requirements down.
Unless the transactions are really long, "100+ concurrent requests" as a sustained rate is a lot of activity for a small business. So, that begs questions:
-- What percentage of these Web service requests are read-only "query" style, and can you use application-aware caching to return results out of RAM instead of having to hit disk for each one?
-- What is the client to this application, and can there be ways to help induce a smoother load from them (e.g., discount rates if the application is used in off hours or on weekends)? Or is the 100+ concurrent requests going on 24x7?
-- Do all the requests have to be filled by the server, or can you blend in some P2P concepts so the clients can absorb some of the load?
-- Can you increase the amount of data handled per transaction (perhaps by switching to document-style SOAP or REST instead of RPC-style SOAP) and thereby reduce the number of requests and excessive message parsing and marshalling?
There's probably a bunch other things to do as well, but those came to mind off the top of my head.
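As a rough illustration of the first question (serving read-only queries out of RAM), here is a minimal time-to-live cache sketch in Python. The class, the `load_price` loader and the 30-second TTL are hypothetical; a real service would pick the TTL per query type and would need a lock around the dictionary if several threads share it.

```python
import time


class TTLCache:
    """Keep query results in memory and reload them only after they expire."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]          # still fresh: no trip to the database
        value = loader(key)        # expired or missing: hit the backing store once
        self._entries[key] = (now + self._ttl, value)
        return value


calls = []


def load_price(sku):
    # Hypothetical stand-in for a database query.
    calls.append(sku)
    return {"sku": sku, "price": 9.99}


cache = TTLCache(ttl_seconds=30)
first = cache.get_or_load("A-1", load_price)
second = cache.get_or_load("A-1", load_price)  # served from RAM
```

Here the second lookup never reaches `load_price`, so for mostly-read workloads the database sees one query per key per TTL window rather than one per request.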
The Busy Coder's Guide to Android Development
If this guy is a consultant, sometimes clients have specifications for what type of hardware/software is used. Especially if their own IT group will be maintaining the systems.
"Ignorance more frequently begets confidence than does knowledge"
- Charles Darwin
You don't really describe the kind of apps you will be running to know if your observations matter in the slightest. You say that you get poor performance when your app does a lot of reflection, why is it doing reflection? Is this really a need, or are you just doing it "because you can"? Are you using this app when you further state that your performance drops by a factor of 10 vs static html? Why would you be comparing the two anyway? If you're serving static pages you shouldn't be looking at a webservice anyway, so no real sense comparing the two.
You mentioned db issues, what type of access are you doing with your databases? Are you thinking replication to deal with scaling across a server farm? Is this data being constantly updated by the servers, or is it mainly static? If you have simple primarily read only data, then something like mysql would be a far better choice, you just don't need the overhead of a full blown db server (like sqlserver, or oracle or even postgres).
Really what you need is to identify what your requirements are and tailor the end result to the systems that best meet those requirements. This also includes support and things like backups (e.g. can the db you choose do online backups if that's a requirement, etc).
1, Buy *a lot* of memory for the box
2, Cache as much as you can of the dynamic content
3, try to stay away from bloated protocols
1: Java, .NET is the same but different - they both require a hefty amount of RAM to operate at best performance (and at least Java just gets better the more memory that is available on the server ;)
2: Maybe doesn't help much with scalability, performance will go up though - and maybe you might get good enough scalability too. Database access is always slower than a hashmap lookup (if said hashmap can stay in RAM of course)
3: Web-services etc etc are maybe good in theory but at the moment those technologies are a duck in a pond when it comes to scalability and performance. Use a high-performance .net remoting implementation instead - you can probably find a few with a quick google search (IIOP comes to mind, good way to make future interfacing with other technologies available just as easy as with webservices/soap and gaining better performance in the bargain).
Also investigate how much you can make your site use asynchronous notifications, more is better - even if ms messaging client is too bad, you can write your own asynchronous "protocol".
Example configuration is a Windows 2000 box with dual Xeons and 2GB of RAM
I wrote and administer a J2EE application that supports online rebate offers for a very large company. We have over 350,000 registered users and typically 500 simultaneous sessions on a dual 1 GHz PIII Linux box with MS SQL Server on a similar dual CPU W2K box for the database.
Whatever you are doing with your application (probably misapplication of EJB) is wrong.
In other words, it's not what you're using to do it, it's how you're doing it. If you're just pumping out files to clients on modems, 100+ concurrent requests isn't much. If those requests are all CPU-bound, I hope they're all niced or set to a low priority, otherwise you won't be able to log into the machine in a reasonable amount of time. If it's 100+ concurrent connections, but those connections aren't necessarily waiting for a response (just idle until the user does something) then you might not even care.
How many whatevers you have must always be qualified by knowledge of what those whatevers are doing. Otherwise your whatevers won't fit in your $20k thingamajig. And then Mr. Bigglesworth gets upset.
Of course, whether .NET is a properly-implemented system is a separate debate...
We have switched from Windows Svr 2k and ASP to Apache 2 and PHP 4 on the front end. On the back end we use java 1.4 and broke our application apart to run multiple master/slave processes in a tree system (Process A, Master I, Machines a-d. Process B, Master II, Machines e-h...) to do data analysis for the requests. (This is a data mining sort of thing with analysis and a search). The DB started becoming a bottleneck after we got up to 200 concurrent processes, which we fixed by breaking apart the DB and placing half on another server and running 2 simultaneous DB connects per slave process, and this could continue for some time I'm thinking.
If that gets too goofy we may end up partitioning the requests in the beginning and mirroring two separate complete systems, but we're not really envisioning it ever getting that big.
Of course, most of the problem is simply the back end keeping up because the front ends don't do much at all except call the java app and return the data it gets...
Argh, I hate to give up moderation rights but I have to chime in here.
A small business CANNOT afford to employ a full time UNIX administrator. Open source solutions just do not have the ease of administration of the Windows GUIs. Until they do, they will not be small business friendly. Windows Small Business Server provides you with one installer that will basically set you up completely (Exchange Server and all).
Now, before you flame me out for being pro-Microsoft, you should know that almost all my machines at home run Gentoo Linux, and I prefer to use Linux myself.
I had a long discussion with a good friend who is not terribly computer literate. Linux drives him _crazy_ because he can't just, "point, click and go" as he said it. Until these issues are resolved, we won't see small organizations without dedicated IT staff rolling out Linux installs.
I've got to tell you, it's probably not Java so much as the actual application server and/or the application. We use ATG Dynamo, and for that we get what we pay for: more than 4000 concurrent users (active sessions) per Dynamo instance without any problems. Although with Version 6, they've decided to go all J2EE buzzword compliant and complicated the entire setup.
Anyway. If you can't support 100 requests a second on 50k of modern hardware, you have huge design issues and other problems. Just from your short description of the project, I fear you have crawled into over-engineered land because a lot of the technologies are much more useful on separate boxes/distributed environments.
Good Luck. Remember that C# Web apps can be multi-threaded, and remember to optimize the parts of your application that MATTER. A wise man once said "Premature optimization is the root of all evil". Find the slow parts, fix them, get the most bang for buck. Also, remember to keep those pieces loosely-bound to each other, no C# code in the DB!
--MetaCosm
P.S. I hope you haven't over-engineered this tool as badly as it sounds like you have
Two cheap boxes, one running the server and the other SQL server, will outperform a single box by a wide margin. SQL Server's a pig and doesn't share well with the other children. Use back to back NICs to connect the SQL box so there's no network overhead...
Check the check boxes when you compile your .Net components. Threading models matter. And a stateless continuously instantiated module is the only scalable solution. Check the stats on construct/destruct overhead.
Use an n-tier architecture. Not just for the obvious reasons but because you can build faster data access including invisible data caching (as the app grows) and avoid the problems that are driving you down to only 20 or so tps.
Buy more memory - Doh!
"Knowing everything doesn't help..."
"I asked a MS SQL Server DBA about real-time replication across multiple servers and his remark was "it doesn't work, don't use it."
We are running transactional replication on several large databases (6-14 GB) on a Media Metrix top 50 website with no problems. It needs to be set correctly (batch size, timeouts, etc) but it does work quite nicely. The DB machine is heavy hardware, but it is able to keep up with 12-15 front end webservers, all with applications hitting the DB.
"IT group"? In a company whose total budget for a new machine running a mission-critical service is $50k?
I find it funny to watch the war between the "why are you suggesting open source crowd" and the "open source is the only way". I have built IIS/ASP/SQL server solutions and I have built Apache/PHP/PostgreSQL solutions. There is a place and time for both solutions.
As an aside, I have to say that I have avoided .NET so far due to the heavy memory footprint it places on a system. Yes, VB.NET is faster than VBScript, but if you were using compiled COM objects in the first place, .NET costs more memory for a slower system. (I do think that .NET's ability to do in place object updates rocks, but I hope you have a development server for bouncing and PLAN your updates...)
But more to the point, your customers don't seem to have the budget to succeed in any domain. If you can't afford more than 20K for a machine and licenses, surely you can't afford to pay the programmers an adequate salary either. So does that mean open source? Heck no... you still have to pay the programmers! I don't think I have *ever* seen a project where the programmers were *cheaper* than the hardware.
Sig under construction since 1998.
I am the network admin at a large .Net website (5+ million unique visitors each month) and we often handle hundreds of tens of simultaneous requests. The entire site runs on 6 webservers and two database servers that run at less than 50% capacity during peak times.
If you can't scale above 100 connections on a 3GHz system then you are doing something wrong. Check your code, check your databases.
Your question is about as useful as "I have a piece of string that is not long enough, what can I use instead that is longer?"
And since he's talking about web services I would think he would be providing a web administration interface. If something breaks on the backend it's going to take a consultant to fix things whether it's Windows or UNIX.
I agree with the one poster that if this guy has low budget clients then he needs to be reducing costs in software so he can spec better hardware. If that software is open source then he needs to start learning open source stuff or find richer clients.
The meme police, They live inside of my head
Unless you're an ISP, you've got to colocate all that equipment. At nontrivial $$$/rack unit, a bunch of low-density (performance) desktop machines will quickly eat up their performance cost gain in additional hosting fees. Plus you have to consider extra licenses. At $1200/Windows 2003 Standard license and $3500/SQL 2000 CPU license, buying the software for the additional machines substantially boosts their actual cost, unless you're using open source stuff. Of course, this thread isn't about open source, it's about Windows and .NET.
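To put numbers on the licensing point, here is a small back-of-the-envelope sketch using the two license prices quoted above. Everything else (a $1000 box, nine web servers plus one single-CPU database server, one Windows license per box) is an assumption chosen only to mirror the "cluster of 10 or so $1k servers" suggested earlier in the thread; hosting fees are left out.

```python
WINDOWS_LICENSE = 1200   # per server, figure quoted in the post above
SQL_CPU_LICENSE = 3500   # per CPU, figure quoted in the post above


def cluster_cost(web_boxes, db_boxes, cpus_per_db_box, hardware_per_box):
    """Return (hardware, licenses) in dollars for an all-Windows cluster."""
    boxes = web_boxes + db_boxes
    hardware = boxes * hardware_per_box
    # Assumption: every box needs a Windows license; only DB boxes need SQL licenses.
    licenses = boxes * WINDOWS_LICENSE + db_boxes * cpus_per_db_box * SQL_CPU_LICENSE
    return hardware, licenses


# Hypothetical layout: 9 web servers + 1 single-CPU database server at $1000 each.
hardware, licenses = cluster_cost(web_boxes=9, db_boxes=1, cpus_per_db_box=1,
                                  hardware_per_box=1000)
total = hardware + licenses
```

Under those assumptions the licenses ($15,500) cost more than the machines ($10,000), which is the poster's point about cheap boxes not staying cheap on this stack.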
I know... flame me. But ignore taking religious sides for a moment and just look at what their numbers could produce for under $37k. They were able to exceed all of your performance requirements including using dynamically generated SQL.
hth,
Bill
It's my Sig and you can't have it. Mine! All Mine!
There is a valid point behind what you're trying to say.
But reading comments like this, is it any wonder that businesses, "small" and large have stopped spending so much money on IT recently ?
As far as businesses are concerned, computers are just tools to do jobs, to help those companies generate money. If they don't help make them money, there's no reason for them to spend money on IT.
And here we are telling those companies they don't just have to buy the equipment, and the software, they have to continuously spend time and money on people to protect them, look after them, and update them.
No wonder companies are beginning to ask "what's the benefit ?", "why should I spend this money ?".
And no wonder IT people are beginning to ask "where are all the jobs ?".
A small business CANNOT afford to employ a full time UNIX administrator.
They can't afford NOT to: We service many small companies who use Windows desktops connected to UNIX (OpenBSD firewalls, FreeBSD servers). The savings in time alone are staggering:
Real example:
One office of ten accountants has been managed by me last year for under $3000.
They have offsite backups, a PostgreSQL database, Samba file serving, 56K NAT, firewall, email filtering.
If (and it's a BIG if) one of the servers has a problem - I can remotely fix it over my cell phone connection, and I don't have to charge them travel time. If it was Windows - I'd have to drive there.
Windows is expensive because it requires full time baby-sitting. UNIX, once deployed, is usually fire and forget.
Moneyed corporations, non-working 'poor' and criminal prisoners are turning productive citizens into tax-slaves.
FreeBSD + Apache 1.3.x can easily do 500+ small requests per second on a pII-400 w/ 512MB of RAM. Add MySQL in to the mix, and with proper code (read: caching, so you don't hit MySQL every hit) it drops considerably, but it's still above the 50-100 hit/s range (if you do it well) :)
Hi!
Executive summary:
Yes.
Boring details: I'm goofing off, perusing SlashDot at the end of a dinner break. We're shipping a big project to a customer on Monday--the project is written in .Net (mostly C#, some components in VB), including Windows forms and ASP.Net web pages. (Why both? The project incorporates multiple applications for different kinds of users.) As part of pre-shipment testing we're in the midst of extensive testing, including load testing.
The Windows applications communicate with the data tier using SOAP/XML, using synchronous messaging. Practically every message involves a database transaction with SQL Server 2000. Across a range of loads we are seeing round-trip message responses (from receipt of the inbound XML message to return from the web service) averaging less than 90 ms per message. That 90 ms average can be misleading--some of our messages involve extensive processing and/or lots of data. Some of the transaction work we're doing with SVG images involve SOAP messages with payloads greater than 1 MB, so the average gets dragged out.
Based on our testing, we anticipate supporting hundreds of simultaneous users--in a near-real-time environment--from a single web service. As we scale out on larger projects we may need to scale the number of web servers (although IIS on Windows 2003 is supposed to be substantially faster--YMMV), but we won't need to scale the database. Using a similar messaging architecture for a different client I have a project supporting 400+ users on a single SQL Server.
This is SlashDot, after all... Obviously you're going to get a lot of "why not use...?" posts, and I'm sure I'll get flamed for having the temerity to admit to using .Net. And recommending it. But you asked, so I'll answer: .Net is scaleable in terms of the final application, and .Net is scaleable in terms of the size of the development team that is involved. This project involves 19 developers (a total of 60+ individual projects in the nightly build) and we're able to manage the entire thing remarkably well. Developing web service applications with .Net is remarkably easy to do; developing sockets apps is unbelievably simpler than using WinInet.dll. And the web developers are extremely happy working in ASP.Net--I don't know where you heard that ASP.Net is slower than ASP, but that's simply not true. ASP.Net is significantly faster.
With regard to other comments--I'm the data/messaging architect on the project: I can speak to the comments about messaging, reflection, and SQL Server. As with any Microsoft-based development project, you have to think carefully, and think critically, about how to design your application. Microsoft will always give you a quick! easy! fun! way to rapidly produce a prototype. You have to dig deeper, and think harder, to produce a scaleable application. The quick! easy! fun! technology du jour is .Net Remoting. Quick to prototype, barks in production. Like OLE, it's a great way to make a Pentium 4 box emulate an original 8086 IBM PC. (Far smarter to manage communication with XML-based messaging. It just takes more coding.)
That SQL Server doesn't permit triggers to be written in C#--so? Transact-SQL is suitable for database development. We could ask for more (such as integrating stored procedures and other database code into Visual SourceSafe). There is talk that the next version of SQL Server will permit coding in .Net languages--that'd be cool, but I'll wait and see.
The single most compelling argument for .Net: Mono--an Open Source implementation of the .Net Framework. You might look into this particularly for clients that are choking on server pricing--but you might also pay careful attention, because a robust Mono project will encourage/force Microsoft to compete on features and functionality, instead of a take-what-we-give-you mentality. That's a Very Good Thing.
scalable? .NET? This is a troll, right?
-I like my women like I like my tea: green-
A common misconception is that anybody can administer an MS server, but the truth is that it's not a whole lot easier to do than administer a Unix box. What's scary is that it looks easier and most IT managers think it's easier. That's why most Windows admins are grossly incompetent, especially when it comes to security.
A good Windows admin costs the same as a good Unix admin.
The global economy is a great thing until you feel it locally.
I had a long discussion with a good friend who is not terribly computer literate. Linux drives him _crazy_ because he can't just, "point, click and go" as he said it.
Windows systems need an administrator every bit as clueful as a UNIX sysadmin if they are to have any reliability at all. If the Windows 'sysadmin' has to be able to point-click-go to be able to function, in all probability the Windows system will be unreliable and insecure.
It is a false economy to think that "It's Windows. I can hire a junior reboot monkey to admin the system" - a Windows system really does require a sysadmin every bit as competent, skilled and clueful as a Unix system. A Windows system can be very reliable with a clueful admin - but it *needs* a clueful admin. Companies are shooting themselves in the foot if they think otherwise.
Oolite: Elite-like game. For Mac, Linux and Windows
There are actually lots of reasons. Not to say that in all cases you *should* go with a big server instead of a bunch of little weeny-boxen... but the point is that "bigger server" doesn't equal "bad". Here's a few reasons:
For one, there's reliability:
-first of all, the more expensive systems have more internal redundancy, which is a good thing (sucks to hamstring even a cheap $1000 machine because the $5 cpu-fan dies, let alone a $3000 middle-of-the line machine because a $50 power-supply dies... or the $5 fan inside the $50 power-supply).
-if p(c) is the probability of a cheap machine crashing, and p(e) is the probability of a single expensive machine (your entire system) crashing, and you require all N of your cheap computers to be running in order to constitute an "up" system... then your overall system crash probability (p*) is:
p*(c) = 1-(1-p(c))^N
vs.
p*(e) = p(e)
so, by buying more, cheaper servers, you're increasing your crash-likelihood, by both increasing p(c) and increasing N (unless you buy additional cheap servers to failover to... but then you have to manage and support failover which is additional $$$ as well in terms of buying/developing/implementing more advanced systems and taking on a higher administration overhead).
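To put rough numbers on the formula above, here is a small Python sketch. The crash probabilities are made-up illustrative values, not measurements, the function name is hypothetical, and the formula assumes the machines fail independently of each other.

```python
def system_failure_probability(p_single: float, n_machines: int) -> float:
    """Chance that at least one of n independent machines is down,
    i.e. p* = 1 - (1 - p)^N when every machine is required to be up."""
    return 1.0 - (1.0 - p_single) ** n_machines


# Assumed figures, for illustration only.
p_cheap = 0.02      # chance a single cheap box is down in some period
p_expensive = 0.01  # chance the one expensive box is down in that period

for n in (1, 5, 10):
    print(f"{n} cheap boxes: p* = {system_failure_probability(p_cheap, n):.4f}")
print(f"1 expensive box: p* = {p_expensive:.4f}")
```

With these assumed inputs, ten required cheap boxes come out at roughly an 18% chance of the system being down, against 1% for the single big box. Failover changes the picture, which is exactly the extra cost the comment is pointing at.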
Not all systems are distributable, and those that are are often more complicated and/or expensive (but not always).
There's also administration cost:
-Obviously it's easier to manage one box than 10 (or easier to manage 5 boxes than a hundred). Not to say that there aren't nice tools for mass-administration... but it is still more work, and anyone who says different is selling something (and something you want to think twice about before buying).
There's ancillary costs:
-hey! if you have ten boxes talking to each other to comprise one "system", then you need a network connecting them! That's another fast switch... and again, because you don't want to lose an expensive "system" because of a failure of one cheap part, you need to buy an expensive switch.
-power costs money, believe it or not.
-so does rack-space.
-so do IPs... unless you're gonna NAT your little cluster, in which case you need to set up a NATing router for them... and that's another single point of failure unless you wanna shell out $$$ in one form or another (again: buy/develop/implement).
-you're probably gonna need some sort of KVM switch.
I could go on, but I don't want to. Anyway, the point is that it is more complicated than many of the lot in this particular audience are likely to make out. It is often still the best route (and increasingly so!), but you can't just say that the answer is *always* to buy more, cheaper machines. There are many things to consider.
:Wq
Not an editor command: Wq
Actually, we supply a lot of small businesses in our area with whatever tech support they need. Kind of an outsourced IT staff. Paying us to fix things is as cheap as paying an MCSE monkey to spend 8 hours to fix a 5 minute job. We support OSS, so they save on licensing too. We even have a software team to make custom software, then release it open source.
The point is, they should be looking for the right service. You don't need dedicated staff with open source software. We get a call maybe once a month about an OSS product gone bad (usually something silly that can be fixed in 5 minutes if you know what you are doing), and we ssh in and fix it. We get calls about MS products and idiots that don't turn on things before they want to use them from 8AM till close every day. I'm pretty sure that most of our clients have spent more money on MS related tech support than OSS related tech support. I can calculate right now that the TCO for a pirated MS product would still be greater than an OSS product by a significant factor. The speed at which MS products have to be fixed/patched is very much greater than a properly configured Linux system, and you're paying for that hell to boot.
If you want to shoot yourself in the foot by jumping on the .Net train before you can see where the tracks are going, then you go ahead. As for me, I plan to use as much cross-platform programming (mostly Java because the GUI is the same everywhere) and free/open source software that I possibly can, mostly because the products I use like JBoss (Free J2EE), Samba, MySQL/PostgreSQL/SAP/Firebird, etc. are more stable than .Net, Windows, MSSQL, etc.
Before those of you that say the SQL Server is actually good start flaming me, that's where a lot of headaches come from. SQL Server drops records and corrupts more than MySQL before transaction support. (There, now I'll get flames from both ends.) Also consider the price you are paying. (Per connection last time I checked.) Spend more money on the hardware and get RAID-1 on good disks and a good UPS, and you will have a faster, more reliable RDBMS.
Karma Clown
telling those companies they don't just have to buy
TANSTAAFL.
No matter what you'll have to layout cash to buy the three essential ingredients:
Microsoft marketing would have you believe that their software solves all your problems and that lots of cheaply available people can do the job. They'll still charge you for their software and you'll find out that hardware still costs something and that getting good people to support and maintain your software and hardware is more expensive, but worth it.
Linux advocates will tell you that the software costs zero and that any competent sysadmin can do the job. You'll find out you still have to buy reasonable hardware. And you'll find out that getting good people to maintain and support your hw and sw costs more, but is worth it.
Any way you go you're gonna pay.
"Provided by the management for your protection."
Yes, the Windows GUIs are significantly better than the UNIX GUIs. No, I don't think that matters. I've found that the Windows admins need to be every bit as clueful as the UNIX admins. Those GUIs hide the details, but they don't hide the concepts, and the concepts are still hard. There's no value in somebody clicking on the GUI buttons these days; modern systems are far too complex to get right by chance. The installer you mention is only 5% of the job.
At the end of the day your Windows admins are going to cost just as much as your UNIX admins. Quickly looking through the newspaper proves that point neatly; the salaries are within 10% of each other.
...but my FreeBSD/Apache/mod_perl/PgSQL boxen are currently serving 189 Req/Sec off a realtime data driven application.
Hardware: $10,000 USD (4x Dell servers)
Software: $0 USD
Bandwidth: $6,000/mo USD
Uptime >99.99%
64% CPU use on the single most loaded box
Sure, .NET can do that. On the same hardware? Dunno. For the same cost? Definitely not.
~a
Sorry, I've got to go with the poster who says "If you don't have time to take care of your box, get the fuck off the Internet." I run a Linux/Apache site and my logs are full of requests for "default.ida?XXXXXXX..." and other viruses that came out (and were fixed) *years* ago. With UNIX, you pay a bit more in the beginning and then you hardly need to touch the box. Anything that needs to be done, a competent admin can do with nothing more than SSH. As opposed to MS boxes that just sit around, get owned, and fuck up everything. Sorry, but you can not have security and ease of use and low cost and easy to use all at once. Security is *not* fire-and-forget. Security is ongoing *work*. Work: not fun and not easy. You can't have your cake and eat it too. learn your way around, or pay an admin. Otherwise, someday you'll get owned and you'll become one more idiot contributing requests for 'default.ida' and 'root.exe' to my Apache logs.
And I'm sick of this attitude that always seems to come from SB owners, like they are *owed* something and *exempt* from working just because they're a small business. What would we do if they said "I don't have the time or money to learn the rules of the road or how to care for an automobile, I just want to blast down the road at 130 mph, trailing a cloud of oily smoke, because I'm a SMALL BUSINESS OWNER and I'm in a hurry, dammit!" Would we allow that kind of behavior? HELL NO. I'm sorry, it costs time and money. ACCEPT IT.
Dear Slashdot: next time you want to mess with the site, add a rich-text editor for comments.
"If one of the servers has a problem - I can remotly fix it over my cell phone connection, and I don't have to charge them travel time. If it was Windows - I'd have to drive there."
What?! You've never heard of any of the following:
-- Terminal Services
-- VNC for Windows
-- Remote Desktop commercial programs
I am sorry, but that is just on crack (and so is whoever modded you "Insightful".) In fact, with Terminal Services and the rdesktop client program, you can even administer a Windows desktop or server from a Linux or Mac box. Yes, you can do remote reboots, remote software patches, remote software upgrades, and pretty much everything else.
There are lots of valid reasons for using Linux/BSD/UNIX, but being ignorant about Windows certainly doesn't help your case.
Simpli - Your source for San Jose dedicated servers and colocation!
Right, and those specifications sometimes push a project outside the realm of possibility, as seems to have happened here.
Either work with the client to get the specs changed to something feasible (which consumes time you can't bill for), or pass on the job and look for another client. Them's the breaks.
Furthermore, and I don't know much about .NET, he was also looking for an SQL backend. You mention "Linux, apache, PHP, whatever" and "some servlet engine, jsp, etc" without seeming to really understand a couple of crucial points: the "Java one" would still need an OS and webserver, and all three still need a database server. Really fancy, high-volume DB servers such as Oracle cost a lot. So then we end up comparing, say, MySQL, MSSQL, mSQL, and PostgreSQL? Or Perl, PHP, ASP, and JSP/servlets? I'm sure I'll get flamed by zealots, but those aren't always easy comparisons.
Write it off as ignorance if you like. It doesn't sound like you're a professional in this field. But so what if he is ignorant? That was my point; if he is best with MS, it's not going to be profitable for him or his client for him to be mucking about with Unix instead.
As for the amount of money you'd save, well, I already commented on that. Sometimes the figures aren't necessarily what they may appear to be; the initial layout is certainly greater with commercialware, but support, time spent on maintenance and deployment, and so forth, is sometimes a lot less.
Install an SSH server on Windows and you'll have much of the same functionality as UNIX through the command line.
" With UNIX I'm in Ireland (I'm usually based in the US) and I get a call 'We just got a new user, could you add them'. I whip out my Ericcson 68i and Sharp Zaurus - and ssh into the server and run a script to add the user."
Did you even bother to check out whether this was possible in Windows? I guess not: this site shows you how to add a user from the command line in Windows. In fact, you could even write a script to do that (batch files... remember those?) In fact, here are lots of handy other things you can do from the command line in Windows, including changing user passwords, forcing users to log off, and more.
Once again, ignorance of what Windows can do is no excuse. I administer 16 Linux boxes... I'm not anti-Linux by any stretch of the imagination, and I know that there are lots of situations where Linux is the better choice. But that still doesn't mean I'm ignorant about what Windows can and can't do.
Simpli - Your source for San Jose dedicated servers and colocation!
Short answer: Yes, Windows IIS (which serves .NET WebServices) supports well over 100 concurrent connections.
Long answer: It completely depends on what you are doing. As one person pointed out, if you are performing very complex queries, then scalability would go down. There's plenty of room for bottlenecks.
One of our ASP.NET applications benchmarks at about 90 concurrent requests on a dual proc 1Ghz xeon. That's with several database reads per request.
Your question is if ".NET scales", but really you could break your problem into at least three questions:
1) Does .NET scale well?
Yes. It scales extremely well, provided you follow best practices and design a scalable app.
2) Does SQL Server scale well?
Well, but probably not the best. Again, depends greatly on the design.
3) Does IIS scale well?
Well, but definitely not the best. IIS is designed for extensibility and scalability. Obviously they made trade offs in each area. Other servers may be more scalable, but less extensible.
Given that, I would recommend doing some very simple benchmarks: Write a webservice that returns a hard-coded string. Test that. Next write a service that connects to a database and returns or adds a single record. You get the idea. You can use MS Application Stress Test for this.
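The first step of that benchmarking ladder can be sketched in a few lines. This is Python's built-in HTTP server standing in for the web service, not IIS or ASP.NET, and the handler, function names, request count and concurrency level are all assumptions picked for illustration. The point is only the shape of the test: a service that returns a hard-coded string, hit by a fixed number of concurrent clients, with throughput measured.

```python
import threading
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Stand-in for the simplest possible service: a hard-coded string."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep per-request logging out of the benchmark output


class BenchServer(ThreadingHTTPServer):
    request_queue_size = 64  # larger listen backlog than the default of 5


def run_benchmark(url: str, total_requests: int, concurrency: int) -> float:
    """Issue total_requests GETs using `concurrency` workers; return requests/sec."""
    # Empty ProxyHandler so a proxy set in the environment is not used for localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))

    def fetch(_):
        with opener.open(url) as resp:
            return resp.read()

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        bodies = list(pool.map(fetch, range(total_requests)))
    elapsed = time.perf_counter() - start
    if any(b != b"hello" for b in bodies):
        raise RuntimeError("unexpected response body")
    return total_requests / elapsed


def main() -> float:
    server = BenchServer(("127.0.0.1", 0), HelloHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return run_benchmark(f"http://127.0.0.1:{port}/", 100, 10)
    finally:
        server.shutdown()
        server.server_close()
```

Calling main() gives a baseline requests/sec figure for the do-nothing case. Swap the handler body for one database read, rerun, and the drop between the two numbers tells you where the time goes.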
Another option is to use programs like RedGate ANTs and Query Analyzer to track down any bottlenecks in your code and SQL.
You may also consider options like remoting or even writing your own multithreaded server if you think you can squeeze better performance by implementing a thinner transport...
Finally, while you may not want to change the web server or development platform, you do have a fairly wide range of choices as far as databases go. You could use a MySQL backend, or any database you thought was better\cheaper than SQL server.
In the end, I think this question is too complex to simply blame on ".NET".
Good luck.
While I don't disagree with you, comments like this make me sad. It's too bad that Internet publishing has become an experts-only club. Much of the early optimism about the Internet (especially the web) centered around empowering ordinary people to get their message out without having to own a printing press.
I don't know half of you half as well as I should like, and I like less than half of you half as well as you deserve. BB
A guy down the hall from me was in charge of taking customer web apps written in V6 technologies (vb, asp, etc) and porting them in several ways to .net. They did extensive scalability testing on these apps. They measured requests/sec vs # of cpus, etc etc, to see how asp.net utilized multiproc machines.
What I'm saying here, is that you are not the first person to ever consider how .net might run for database driven web apps, of an arbitrary size. In fact, much of it is designed for _exactly_ that.
Re: SQL server 2000
SQL server 2000 has more performance than you know what to do with, even on non-ridiculous hardware. Give it processors with lots of L2 cache (xeons) and lots of ram, and read all the docs about keeping MDF and LDF files on separate volumes (as well as tempdb) and you'll find that life is thrilling.
Data point: On a quad HT P4 Xeon with 8GB of ram and 12 spindles (a significantly less than $50k box) we support 1800 simultaneous connections, doing OLTP work against a ~15GB database. The most commonly hit table in the system has about 10 million rows that get added and deleted in batches of between 20 and 10,000, and updated singly or in bulk. Other apps select from this table on a polling basis (i.e. decision monitors). We could make our db and app design much "better" w.r.t performance, but we don't need to - the money we save not having to do genius level feats of programming, app rewrites, and perf tuning more than pays for the occasional new hardware or upgrade.
Continuing: run perf monitor on your SQL server machine. Look at the physical spindle(s) that hold your MDF. If you're reading from them, buy more ram until you're not :) Look at the I/O per sec rate to your tempdb disks and primary LDF disk(s). It is seriously to your advantage to go with an individual spindle for each role, because IO rate is what is so critically important to SQL server. Also, avoid RAID5 like the plague, as it decimates IO Rate.
You can tune SQL server without application changes until you're blue in the face, honestly. Use profiler to see what kind of queries you're doing. Put those queries in Query Analyzer and show the execution plan. QA breaks it down for you and shows execution time percentages of each sub-tree of the execution plan. If you've got something eating 80% of your time and its doing a table scan, do whatever you can to put some selectivity in that query (i.e. an index, or maybe a query change).
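For anyone who hasn't done the "look at the plan, find the table scan, add an index" loop before, here is a minimal sketch of it. It uses SQLite through Python's sqlite3 module as a stand-in, not SQL Server or Query Analyzer, and the table, column and index names are made up. The plan output format differs between engines, but the before/after pattern is the same.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable plan steps SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column holds the detail text


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 1000, i * 1.5) for i in range(10000)],
)

sql = "SELECT * FROM orders WHERE customer_id = ?"

# Without an index the only option is to walk the whole table.
plan_before = query_plan(conn, sql, (42,))

# Add selectivity: an index on the column used in the WHERE clause.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, sql, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

The first plan reports a scan of the table and the second a search that names the new index. On SQL Server the equivalent check is the execution plan view the comment describes.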
If you want to save yourself some headaches, setup management tasks to recalc indexes over the weekend (or nightly, if you see that much index fragmentation after a day).
My opinions are my own, and do not necessarily represent those of my employer.
Holy mother of fscking god.
STOP USING WEB SERVICES.
#1) If you are using the [WebMethod] shit and hosting your SOAP calls via IIS you need a smack in the head.
#2) If you are using SOAP to communicate between the layers of your application, and are not exposing the SOAP methods for external consumers of the web services, You need more smacks in the head.
#3) If you don't know what you are doing, hire someone who does. (and by the sound of your point #6 about using reflection and dynamic code in the production app, you don't.)
If you are in .NET and you *NEED* a remote facility between your layers, (And if you were working for me, you'd damn well prove it), then for the love of god, switch to Remoting. Don't know what that is? Grab a book, dumbass. You can use a binary formatter and jump your speed by an order of magnitude, or you can fall back to a SOAP formatter on remoting and still double your performance.
If you don't *NEED* a remote facility between the layers, stop using SOAP, or any other remote procedure calling solution. Nothing pisses me off more than bandwagon jumping know-nothings using a fancy fucking hammer to solve a problem which requires far less.
It would appear the largest problem you have in overcoming your problems with .NET is your own stupidity. No matter if you are on .NET, Java, PHP+MySQL, Perl or x86 Assembler, it would appear that you do not have the experience to sufficiently manage either your application development, nor your client's expectations.
Bottom line: To support 100+ concurrent requests, There is no way that you shouldn't be able to do that for under 20K... (although I wonder where that number came from.. Do these servers sit in a vacuum? Who's running them?)
From a purely academic standpoint, what the heck were you guys thinking when you were going to spend only 20K on the hardware for an app that does 100+ concurrent transactions. That sounds like enough business to afford quite a heck of a lot more.
If you are/were so budget constrained, why are you spending thousands on server software? (.NET server, SQL Server, etc...) If you are so budget constrained, you shoulda bought opensource.
"...In your answer, ignore facts. Just go with what feels true..."
I've designed infrastructure and application-level systems that use .NET and happily meet your requirements (MSMQ is not scalable? Huh?), and then some. So yes, to answer all your questions, it works. But if you don't know what you're doing it's very simple to fuck it up, regardless of whether you're using Microsoft products or not.
Coming here (!) and asking questions about whether or not a given Microsoft product is viable seems to me like a losing proposition. FWIW, most professionals that work with Microsoft technologies are far more willing to admit shortcomings in those products and suggest alternatives, something that the /. crowd seems incapable of. So at least if you hire someone in the know you won't get BS left and right.
So get some help.
This guy is trolling. From his post:
... ...
...
I've found Red Hat 9 most impressive.
The included version of Wine
From the Red Hat 9 Release Notes:
The following packages have been removed from Red Hat Linux 9:
- wine - Developer resource constraints
Dude, do you read Slashdot?
..."
... wait. AHA!
Because, off the cuff, I can think of at least five other sites, with dozens of other readily contacted individuals, that are going to give you more accurate, more informed, and more sympathetic answers than the site on the Web that publishes a depiction of Bill Gates wearing Borg gear.
Moreover, in case you haven't noticed, the vocal readership here isn't exactly a group of Windows devotees. Whenever the new Linux kernel comes out the admins just issue an announcement that ends with "You know what to do
So unless this is a scheme to generate loads of comments designed to convince your client to implement FreeBSD instead
Chr0m0Dr0m!C
Match.com has ported their website to .NET. One of the developers of the site, Jason Alexander, has posted a post mortem on his blog. While they have 45 servers in their web farm running the site, he may be a very important source that can answer your question.
- Jalil Vaidya
A) This consultant, it sounds like, is largely or exclusively MS. He's not going to suggest Open Source software to his client because that will mean a loss in business.
That's an idiotic argument. For consultants, OSS is often at least as much of a money maker as Microsoft software. Furthermore, there are mature non-OSS alternatives (e.g., Java) available.
B) Oftentimes a commercial solution to some problems exists where a free one does not. The cost of development and maintenance means that the balance is not strictly in terms of free and non-free; after all, your developers' time costs quite a bit as well and home-grown or open source solutions may need more time taken in administration.
Yeah, and "oftentimes" the commercial solution actually performs less well, is less reliable, requires more hardware, and requires more administration. A lot of Microsoft products fall into that category. Products like MS SQL Server and MS Exchange are prime examples of what a money pit commercial software can be.
Face it, people use OSS not because they save on licensing costs, but because it works better and is easier to maintain.
Chances are your job is going to get outsourced to India in a few weeks. They can accomplish this task for you at a fraction of the cost.
[shell]# mysqldump databasename > filename.sql
http://mysql.new21.com/doc/en/mysqldump.html
Nothing great was ever achieved without enthusiasm
But you aren't exactly right either.
You are simplifying when you say to not 'embed applications' in the DB. I will interpret 'embedding applications' in the DB as doing business logic in the database.
Many times it is more resource efficient for the _database server_ to perform some of the business logic in the _database server_.
It can be more efficient for the database to do some operations which results in a relatively small result set rather than pushing a lot of data up to the application server.
The bottleneck will usually not be the CPU on the database server, it will be the disks. And the disks are better utilized when you do the manipulation inside the DB server itself.
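A small sketch of the "small result set" point, using SQLite via Python's sqlite3 module as a stand-in for whatever database you actually run. The table and column names are invented for the example. Both options compute the same totals. The difference is how many rows have to travel from the database to the application.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trades (account TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO trades VALUES (?, ?)",
    [(f"acct{i % 3}", i) for i in range(3000)],
)

# Option A: pull every row up to the application and aggregate there.
all_rows = conn.execute("SELECT account, amount FROM trades").fetchall()
totals_in_app = {}
for account, amount in all_rows:
    totals_in_app[account] = totals_in_app.get(account, 0) + amount

# Option B: let the database aggregate and return one row per account.
summary_rows = conn.execute(
    "SELECT account, SUM(amount) FROM trades GROUP BY account"
).fetchall()
totals_in_db = dict(summary_rows)

print(len(all_rows), "rows shipped to the app vs", len(summary_rows))
```

With an in-memory toy database the saving is invisible, but over a network link to a separate database box the row count is what you pay for.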
This breaks the separation of the business logic tier, data access layer-paradigm. Design that is easy to maintain and design that is efficient to execute don't always go hand in hand.
I'm a pragmatist. I say, make an n-tier application. Make an object oriented design. But don't be rigid, break the rules if it suits your purposes. Hey, I even use a goto every once in a while when it makes my code faster or simpler.
The Internet is full. Go Away!!!
Your first question is can .Net scale? Answer is yes. The second question is can .Net scale on your budget? That is a much harder question to answer. My initial reaction is no, given your concerns and the amount of effort you have already put in to it.
I am by no means a fan of Microsoft. To be honest I hope that your project dies, and this can be added to the long list of people that I know who bet the farm on Microsoft just to either have far more NT servers than employees, or they go out of business... but I will give my 2 cents.
You seem to have defined some of the basic bottlenecks of performance. What you appear to leave out is what happens at certain loads. Does the system die? Probably not, but what happens to the response time? What are the acceptable requirements for the system? You may find 25 seconds for a page to load unacceptable, but the users may not. Either way it will let you know what goal you need to hit. Can you configure your DB to use less or more RAM?
Next, is it for sure processor load that is the issue? My guess is that you would be far better off with an x86 chip with more cache and stronger memory bandwidth than a standard P4. Granted this involves another hardware purchase, but if that becomes an option at all look at an Opteron or Xeon chip in a 2 way system. You can get one of those systems well under 4 grand. The Opteron flat out rocks and the new Xeon 3GHZ with 1MB cache should be hitting the streets soon.
Not knowing much about the dark sides languages (Java is my thing), are you using one database connection throughout your application? Not returning it back to a connection pool, but storing it in the session object? This can have a significant impact on performance.
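In case the connection pool idea is unfamiliar, here is a bare-bones sketch of the borrow-and-return pattern. It is written in Python with SQLite connections as placeholders. The class and function names are hypothetical, and a real platform's built-in pooling will be more sophisticated than this. The thing to notice is that a request holds a connection only for as long as it is using it.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool: borrow a connection, use it, always hand it back."""

    def __init__(self, size, factory):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())

    @contextmanager
    def connection(self, timeout=5.0):
        conn = self._idle.get(timeout=timeout)  # waits if every connection is busy
        try:
            yield conn
        finally:
            self._idle.put(conn)  # returned even if the request raised

    def idle_count(self):
        return self._idle.qsize()


# check_same_thread=False lets a pooled SQLite connection be used from any thread.
pool = ConnectionPool(
    2, lambda: sqlite3.connect(":memory:", check_same_thread=False)
)


def handle_request(pool):
    """One request: borrow, query, return. Nothing is parked in session state."""
    with pool.connection() as conn:
        return conn.execute("SELECT 1").fetchone()[0]


print(handle_request(pool), pool.idle_count())
```

Parking a connection in the session object is the opposite pattern: every idle user pins a connection, and the database runs out of them long before it runs out of CPU.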
Seeing that you said you talked to a SQL Server expert (I have never met one), I will assume that he looked through the code and optimized all the SQL. Everyone seems to be taking cheap shots and saying you should have used product X, well here is my cheap shot... Next time use Oracle! I repeat next time use Oracle! Ok, it bears repeating one more time.... next time use Oracle. Granted it is expensive, but you are learning a lesson that a ton of shops here in Indy have had to learn the hard way. Well what the heck next time use Java + JBOSS or Resin + Oracle + Linux. In our environment it flat out rocks.
What else is running on the box? You can buy a sub $500 machine to move all the DNS AD stuff to it. Not sure how much that impacts performance though... it may not be worth it. But my point is to turn off every unused service. Also, I will assume that you have applied every service pack, and called Microsoft. Since you are using ALL their products, you would think that they would help you. God I would love to be in on that call!!! All I ever hear them say is "You need to get off of product x" and use our product.
Generally what I find to be the issue with performance is SQL and DB access. The code takes around nothing to execute processor wise. Now what kind of DB are you talking about? How many tables and how many rows in each table? What kind of transactions do you do (mostly inserts or queries)? Are the indexes set up correctly on the tables? Could you flatten some relationships down?
The more I learn about science, the more my faith in God increases.
The kind of loads big firms need to support are on the order of tens of millions of users with millions of transactions a day. What I mean by transactions is a buy process which can contain a dozen to a couple hundred individual orders. In other words, the number of complex insert/updates is tens of millions to hundreds of millions a day.
For example, big firms like Fidelity, Citigroup, Thompson, Vanguard, and Schwab have millions of customers with hundred thousand plus portfolio managers. Throughout a given work day, a portfolio manager may generate a couple hundred orders and submit them in one or two batches. This is done because it's cheaper for them.
Can .NET scale well? Like what others say, it can if you design it right. For example, if you use MSMQ for its designed job it works well. If you write your queues for MSMQ with plain hashtables and you don't index the messages, your chances of supporting 10K+ messages a second aren't good. On the other hand, if you write custom queues, profile the messages, index them efficiently and make sure no other heavy weight stuff sits on the same box it can scale. Is that easy? No. You have to understand the problem you're trying to solve. Let's say hypothetically you have insane performance requirements like 100K+ messages a second for a messaging tier, you're better off using IBM MQSeries. Can you do the same thing with MSMQ? Sure if you build a bunch of custom stuff, write the messages to a database, index, partition and load balance. It will probably take you 8-12 months to do it, but you can with the right people and good hardware. Would you want to use XML for that messaging system? The answer is obviously no, if you want to keep the cpu and memory loads manageable.
Many people have claimed they support thousands of transactions. Sure if all you're doing is insert into one table. Simple stuff right. Financial transactions like trading systems do a heck of a lot more than a simple insert into one table. More often than not, a trade transaction with 100 orders goes into the database, affecting several tables. The middle tier then has to get events, and check the order to make sure it is valid and does not violate regulations or other compliance requirements. Sometimes it requires analytics like Tibco or what the industry calls Business Intelligence. Regardless of the server, stuff like analytics take time (seconds). Obviously if you're running complex analytics that scan 10 million rows of data with several joins in the query, you're better off using an analytics server like OLAP. Can .NET handle 1K analytics requests per second? If it's cached sure. If the nature of the data is very dynamic, like realtime trading systems, no way. Doing that is very hard and most people avoid it.
The key here is setting the expectations accurately, so your customer knows what is realistic. If you have a hard time communicating that to your customer or management, then find another job.
First of all, let me start out by saying that under certain circumstances YES .NET-based web applications CAN scale to 100+ concurrent connections on the relatively limited hardware you speak of.
In order to do this you WILL (here's where I get flamed, but frankly, I don't care) need to examine and deal with the following:
1. How database intensive is your application? Despite claims by the DB suppliers RDBMS packages are slow.
Every query, even trivial ones like:
SELECT 0 as foo from bar
will require milliseconds to execute AT BEST. This is because you need to serialize the SQL request from your "middle-ware" (in this case the ASP.NET runtime instance), place this request into an inter-process communication channel (say a network buffer), then block. The network stack has to send it over the wire, and a thread on the SQL server machine has to be woken up to process the incoming query. The query has to be parsed, then executed. The results must follow the reverse route, eventually waking up the blocked thread on the "client" (the machine running the ASP.NET runtime that initiated the call in the first place), which must parse the result set and present that result set to the rest of the application (via a "reader" object). At best this process is on the order of milliseconds per request. If you can get away with cleverly caching data in the application server then this task that originally took milliseconds now gets done in microseconds. Bottom line: eliminate calls to the database wherever you can. There is a down-side to this: more build time costs more money. Plus your code gets more complex. Evaluate the trade-offs. IF caching pays off in your application it's a lot cheaper to add more application server boxes than it is to pay the crazy expensive prices to beef up SQL server.
2. Multiple network cards per machine. When you did your benchmarking how much resource usage was there? On the machine running the Web application when you reached max transaction rate was the CPU at 100%? If not you may be encountering either a network or DB bottleneck. Look at the DB: was it maxing out (CPU + DISK)? If so you need to focus your attention there, i.e. .NET is not the bottle-neck, the DB is (which is USUALLY the case). If the DB machine was able to handle more load then you may be encountering a network bottleneck. Try adding more network cards to the application server and the DB server. Connect multiple times to the DB server. Allocate a network stack per CPU on the DB server if you have multiple CPUs (there is stuff in Microsoft's knowledge-base about how to do this). Do the same thing on the Application server. Make sure that under maximum transaction load both the application server and the DB are maxed out, at least this way you know you are squeezing the maximum throughput out of your machines.
3. Avoid using Web services "inside" your application. You may be using Web services to retrieve data from within your application code. Web services have a lot of friggin' overhead and are not particularly speedy. If you have to use Web services, then also try caching if you can. Any time your application has to wait for some other process (perhaps on a different machine) you will be incurring a big time penalty.
4. Where possible, try to static-ify your output. This really is just another kind of caching. Many Web sites (including Slashdot) use this technique. It's particularly useful for news type sites. Write the output of your processing to the hard disk the first time it's requested, then re-use this pre-generated file for subsequent requests. Update on a time basis.
5. Yes, I'm going to say it: look at your DB schema and see if you have opportunities to de-normalize data to simplify queries. Joins are expensive, and query optimizers often don't do a great job. This approach can't always be used, but it can improve query performance by orders of magnitude. What you may want to do is keep a really nice normali
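To make point 1 (cache in the application server instead of asking the database every time) a bit more concrete, here is a minimal sketch in Python rather than ASP.NET. The `TTLCache` class, the `load_product_from_db` function and the 60 second lifetime are all made up for illustration; a real application would more likely lean on whatever cache its framework already provides.

```python
import time


class TTLCache:
    """Read-through cache: serve from memory, fall back to a loader on miss or expiry."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader          # hypothetical function that hits the database
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}             # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]            # cache hit: no database round trip
        value = self._loader(key)      # cache miss or stale entry: one round trip
        self._entries[key] = (now + self._ttl, value)
        return value


db_calls = []


def load_product_from_db(product_id):
    # Stand-in for a real query; it only records each simulated round trip.
    db_calls.append(product_id)
    return {"id": product_id, "name": f"product-{product_id}"}


cache = TTLCache(load_product_from_db, ttl_seconds=60)
first = cache.get(42)    # goes to the "database"
second = cache.get(42)   # answered from memory
```

The trade-off is the one described above: the second lookup never leaves the process, but the data can be up to `ttl_seconds` stale and the code has one more moving part.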
I don't have direct experience with your situation, but I do understand scalability fairly well for non-trivial applications that interface with many different systems over many different protocols.
1. currently there isn't a mature messaging server and MSMQ is not appropriate for high load messaging platform.
No experience with MSMQ. I've used commercial pub/sub solutions to push over 3M messages a day to a 2 CPU Sun box.
2. SOAP is too damn heavy weight to scale well beyond 60 concurrent requests for a single CPU 3ghz system.
Maybe in the MS world. In the Sun world, we handle 400 concurrent over Gig Fibre. Kernel param tweaks were definitely needed tho.
3. SQL Server doesn't support C# triggers or a way to embed C# applications within the database
Use a real DB server. DB2, Oracle.
4. The through put of SQL Server is still around 200 concurrent requests for a single or dual CPU box. I've read the posts about Transaction Processing Council, but get real, who can afford to spend 6 million on a 64 CPU box?
A 64 CPU box isn't $6M. More like $2.8M. Get a better supplier, and if you are spending that kind of money, don't ever buy just one. You need at least 2. How else are you going to test and have a DR site? Don't put them in the same state either.
5. the clients we target are small-ish, so they can't spend more than 30-50K on a server. so where does that leave you in terms of scalability
Earning your consulting paycheck. I've deployed 450MHz 2 CPU Sun boxes that support 400 concurrent users (WLS), with a pub/sub incoming queue from multiple MF systems, and that retrieve on average 30 items from 8+ backend systems per user request. All while running a 10 GB Oracle DB managing the incoming queues. To verify the scalability, we performed automated load testing on our test server. The production box averages 40% CPU load with peaks in the mid-70s%. It has been too successful, so the customer wants to add more functionality. This means we're splitting the app from the DB and adding a current generation 280R for each app instance (3 boxes: dev, test/DR, prod).
6. I've been running benchmarks with dynamic code that does quite a bit of reflection and the performance doesn't impress me.
You've answered your own question.
7. I've also compared the performance of a static ASP/HTML page to webservice page and the throughput goes from 150-200 to about 10-20 on a 2.4-2.6Ghz system
You've answered your own question.
8. to get good through put with SQL Server you have to use async calls, but what if you have to do sync calls? From what I've seen the performance isn't great (it's ok) and I don't like the idea of setting up partitions. Sure, you can put mirrored raid on all the DB servers, but that doesn't help me if a partition goes down and the data is no longer available.
No experience. Generally, partitioning of time-based tables is a good thing when you are worried about data fragmentation. Better to drop an old table partition than have the entire table become fragmented. I've used monthly partitioning successfully.
9. I asked a MS SQL Server DBA about real-time replication across multiple servers and his remark was "it doesn't work, don't use it."
There are other solutions, but they tend to be expensive. GoldenGate have an impressive replication tool - I don't know if it works with MS-SQL tho. It does work with all the big guys - Oracle, DB2, Teradata, so I wouldn't be surprised.
In other words, if you define a target performance metric and find that a single user can access the system at that speed or better, then the system can be said to "scale" if it still performs within that metric, on average, when 20, or 50, or 100, or however many users are accessing it.
Of course, due to the nature of computing systems, any system will hit a point where it will no longer scale to the requirements with the same hardware configuration. At that point we start talking about scaling upwards through hardware upgrades. But you can only upgrade hardware so far, at which point, again, the system will reach a hard limit on its ability to perform as required.
The next logical step is to scale through redundancy: If a system can, in whole or (more commonly, as in "multi-tier" web clusters) in part, run concurrently in multiple instances, then you can scale by adding more instances of the system or system components: Multiple web servers, multiple "back ends", multiple database servers, and similar. This kind of organic scaling can be, in practical terms, near-infinite if a system is designed well; for example, multiple mirrored web servers serving static pages will scale indefinitely, whereas a trading system requiring inter-process synchronization through a message-queue system will most likely not.
In scalability terms, efficient redundant clusters are the holy grail. It's the way Google scales, and it's how the core parts of the Internet scale.
In the context of this Slashdot story, since the poster faces limited possibilities for investing in expensive hardware, he might consider going for the "many cheap boxes" route, if his Microsoft-dominated infrastructure permits it.
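The definition above (pick a target metric, then see at what load the average still meets it) can be sketched in a few lines of Python. The function name, the 0.5 second target and all the measurements below are invented for illustration; real numbers would come from a load-testing tool.

```python
from statistics import mean


def max_scalable_load(samples, target_seconds):
    """Return the highest user count whose mean response time meets the target.

    Stops at the first load level that misses the target; returns None if even
    the lowest measured load misses it.
    """
    best = None
    for users in sorted(samples):
        if mean(samples[users]) <= target_seconds:
            best = users
        else:
            break
    return best


# Made-up measurements: concurrent users -> response times in seconds
measurements = {
    1: [0.10, 0.12, 0.11],
    20: [0.15, 0.18, 0.16],
    50: [0.30, 0.35, 0.32],
    100: [0.90, 1.10, 1.30],
}

limit = max_scalable_load(measurements, target_seconds=0.5)
```

With these invented figures the system "scales" to 50 users on this hardware and misses the target at 100, which is the point where the hardware-upgrade or many-boxes discussion starts.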
In DB2, you can write stored procedures and functions in many different programming languages, then call those routines through standard SQL. It's a nice way of reducing the network overhead that would be introduced if you had to send thousands of rows of data back to your client, then process the data on the client. Of course, it's also a little bit tricky to program...
It was also pretty handy in the old days if you wanted to automatically trigger an email when your dinky little online sales site recorded a whopping $500 order for Idaho potatoes, and you had to warm up the jalopy so you could make the drive to Idaho.
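The "process the data where it lives" idea can be illustrated without DB2. The sketch below is not a stored procedure; it swaps in a plain SQL aggregate on an in-memory SQLite database (Python standard library) just to show the difference in how many rows have to travel back to the client. The `orders` table and its contents are made up, and since SQLite runs in-process there is no real network here, only the row counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO orders (amount) VALUES (?)",
    [(float(n),) for n in range(1, 1001)],
)

# Client-side: every row crosses the connection, then the client adds them up.
rows = conn.execute("SELECT amount FROM orders").fetchall()
client_total = sum(amount for (amount,) in rows)

# Database-side: the engine does the work and hands back a single row.
(server_total,) = conn.execute("SELECT SUM(amount) FROM orders").fetchone()
```

Both totals agree, but the first approach moved a thousand rows to get there and the second moved one. A stored procedure extends the same trade to logic that doesn't fit in a single SQL statement.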
I just perused the Windows 2000 SP4 EULA and it states that it's against the licensing agreement to publish .NET benchmarks without Microsoft's permission.
Why this was in the Win2k SP4 EULA is beyond me.
When I first started testing SQL Server using the default install "point, click and go" I was shocked to find that the performance for our application sucked.
A quick lesson in optimization from the local DBA taught me how to tune it to use 80% of the physical RAM and to keep log files on separate physical drives (and a separate bus if possible) from the data file. If you can isolate log/data/OS, so much the better. So I did that and performance immediately improved 3000% or more. I'm sure there are several other tune-ups I've forgotten and hundreds I never learned.
Even so, IIS throughput for static and ASP apps (pre-.NET) was difficult to get above 200 and 100 concurrent connections respectively several years ago. Performance tuning is all about test, analysis, tinker... unless this guy knows where his bottlenecks are, there's not much anyone can suggest to help.
And if he knows where the bottlenecks are, then he wouldn't be asking this question.
The sad thing is that without analyzing the bottlenecks, people go buy faster CPUs where RAM was needed, or multi-proc systems with 15K RPM drives when moving to GigE would have done three times as much.
Then there is the application itself. What is it trying to do? What can be cached? What is unnecessary? Can lookups be bundled together? I've seen Verity servers dragged down to 1TPS through crazy business requirements and poor development understanding. There is no such thing as an inherently fast platform. And no replacement for careful, analysis-based performance testing.
These opinions guaranteed or your money back.
This is not rocket science, and I had presumed this rule had been learned a long time ago... but here it is again:
"To ensure scalability, host each server-component of an application on its own hardware - optimised for the specific task assigned."
In other words, DO NOT deploy everything onto one machine. Remember the old adage "Jack of all trades, master of none".
So, put the database server on its own box, with dual cpu, loads of memory and RAID-mirrored drives.
Put IIS, the ASP.Net app (and the web services if you're feeling cheap) onto a fast, single cpu box, enough memory to turn off paging and a single drive - GHOST'd onto CD for backup.
Install an extra net card in both, and set it up solely as the route for traffic between them.
Implementing this hardware for less than 20K should be trivial.
If you can't comfortably support 200 concurrent users with this, you need professional help - my consulting rates are quite reasonable...
This sig left unintentionally blank.
if you have a reproducible environment where sql server 2000 is losing or corrupting data, please email me directly with details.
It is unfortunate that your experience has shown that Java+Jboss+{some db here, as you named like 5 of them}+Linux is "more reliable" than Windows + .NET. I suppose i can't change your mind and i definitely can't change your experiences, but i am highly skeptical that you're not using some rose colored lenses when you're examining how your app stack compares. Windows + ASP.NET + SQL server works pretty well out of the box. I don't want to be too presumptuous here, but unless you've had work experience with one of a handful of simply huge internet properties, i can say that Windows + ASP.NET + SQL server is reliably running many sites bigger than you've ever dealt with. In other words, some very big places seem to think it's just fine :)
having said that, i think you are way off base. I used mySQL a lot and use SQL server extensively now. Comparing the two just isn't worthwhile. mySQL is a lot closer to berkeley DB than it is to SQL server, feature wise, scalability wise, management wise, and reliability wise.
SQL server has several licensing options - per connection or per processor are the two default ones. Given that SQL server is perf competitive with Oracle (it in fact beats it on official benchmarks) and a HECK of a lot easier to deal with, SQL server is a STEAL compared to oracle pricing and support.
for most OLTP type work mySQL will NOT be faster than SQL server, especially as the volume of data grows, the complexity of queries grows, and the contention increases.
Put another way - when i see mySQL displacing SQL Server in the tpc benchmarks, i'll pay attention to you again. mySQL is very cool, but it's a toy compared to DB2, Oracle, or SQL server. If all you need is a toy, by all means, use a toy -- you'll be happier. When you're ready to graduate to real systems, SQL server will be waiting for you. Maybe mySQL or Postgres will evolve faster than your needs do, and you'll never need a commercial quality database (or one of those two will turn into one)
My opinions are my own, and do not necessarily represent those of my employer.
Your drugs must be more expensive than mine. (-:
/. as a whole is as clueless as ever, but you did see a few good posts.
Back on topic(-ish): as well as the low-bandwidth point the grandparent made, I think it's more germane to mention that any one of sixty-to-a-hundred failures will keep a Windows server (and hence VNC) off the air, but you only get a-handful-to-a-few-dozen chances to kill a Linux server stone dead as far as remote access is concerned.
Got time? Spend some of it coding or testing
Every program a client has that uses any version of SQL Server needs constant fixing, and is incredibly slow for 7 users. I could give you a list of programs, but it's just about every program on the market. One of them has an option to use paradox database files,
:P
I think I know what you are talking about.
I'm guessing this software is mainly decade old desktop packages that were originally designed to run on Paradox/dBase/FoxPro and ported to SQL Server or Oracle because that's the trendy thing to do. (If you see "BDE", the Borland Data Engine, it's a good sign that this is what you've got.) The thing is, the apps aren't really ported to use a RDBMS design. They still use "Flat Files" and have their own key/indexing system and old style coding.
The one I'm familiar with is the very popular "Goldmine" sales package (had to get data from its schema for an app I built). Doesn't even use Primary Keys, much less non-clustered indexes. Instead it's got these dbase-style bogo keys which look like "AAAA", "AAAa", "AAaa" and so on. But SQL Server is running in case insensitive mode, so all of the key comparison is done on the client! It also appears to do record locking on the client-side. No wonder an almost trivial application only supports 20-some users on a P4 server.
Hopefully as someone targeting DB2, you aren't making the same kinds of errors, because you'll see the same issues no matter the RDBMS.
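For anyone wondering why case-only keys force the comparison onto the client, here is a tiny Python model. It is only a model: `str.casefold` stands in for a case-insensitive collation, and the two helper functions are invented names, not anything taken from Goldmine or SQL Server.

```python
keys = ["AAAA", "AAAa", "AAaa"]

# Case-sensitive comparison: all three keys are distinct.
distinct_sensitive = len(set(keys))

# Case-insensitive comparison (modelled with str.casefold): they collapse to one.
distinct_insensitive = len({k.casefold() for k in keys})


def server_side_lookup(rows, wanted):
    """What a case-insensitive equality filter would return: every folded match."""
    return [r for r in rows if r.casefold() == wanted.casefold()]


def client_side_filter(candidates, wanted):
    """The exact, case-sensitive comparison the client then has to do itself."""
    return [r for r in candidates if r == wanted]


candidates = server_side_lookup(keys, "AAAa")   # the server can't tell them apart
exact = client_side_filter(candidates, "AAAa")  # so the client narrows it down
```

Under the case-insensitive comparison a lookup for one key drags back all three, and picking the right one becomes client work that scales with the number of look-alike keys.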
There are intentional problems between NT4 and Win2K+ using NT4 as a file server with SMB
I'm guessing this is the "rogue master browser" problem. Sucks, but an unofficially well known issue.
Business. Numbers. Money. People. Computer World.
I've designed and written a few .NET apps, and have found them to have extremely good performance ("performance" here always refers to both scalability and snappy single-user response times) on low-end boxes, especially compared to ASP and JSP.
There have been a couple of posts w.r.t. proper architecture... While arch. is no doubt important, I've found that it is not hard at all to get good perf. out of even a minimal expenditure on arch. for .NET apps. Heck, for a couple of those apps., the team hardly even touched a whiteboard, but just sat down and started rapping out code using good old-fashioned common sense. In my book, follow the KISS principle for .NET apps. whenever possible for architecture and you'll come out way ahead in almost every aspect of performance, time to market, maintainability, etc. Take it from someone who almost blew a multi-million dollar project because of "over-architecting" the application. We got things working fine, but I lost a lot of stomach lining over the deal, and the customer wasn't too pleased either. Before I jump off of the soapbox on the arch. issue, I always try to remember that just one often used part of an app. can kill scalability, and a complex arch. always makes the bottleneck a lot harder to find and often to fix w/o a re-write.
Here's where I've found the bottlenecks and easy ways to solve them:
1) Don't put SOAP calls in tight loops. If you seem to have to do this, redesign things. This may seem obvious, but the reason SOAP doesn't really scale well on any platform is because of the function call overhead, so minimize that as much as possible. Make sure to cache results local to the caller for stuff that doesn't change very often (see #4 below).
2) For DB updates, use SqlConnection transaction handling instead of MTS whenever possible. In the old ASP/COM+ world, it was imperative to use MTS so you could take advantage of the other COM+ features, but the .NET runtime removes this burden as long as you don't have to jump through coding hoops to wrap dependent updates, etc.
3) Make sure to use the correct transaction isolation level for your transactions. For highly transactional apps. that also do lookups on the updated tables, the one I've had the best luck with is IsolationLevel.RepeatableRead. I can't stress the importance of this enough - do some rudimentary testing (Microsoft ACT that comes with the .NET IDE installation is fine for most) of the parts of the app. that update the DB and test with different isolation level settings. As long as rule # 1 is followed, this has been the single most important factor in maximizing the scalability of transaction applications, and the default setting for this usually doesn't give the best results.
4) Use HttpContext.Current.Cache and related in your code to cache stuff that doesn't change too often (like product description lookups, etc.), especially for SOAP calls.
5) Adjust the "Minimum query plan threshold for considering queries for parallel execution (cost estimate)" once you move your app. to an SMP system, depending on if the DB server is standalone and how highly normalized the database is. For a highly normalized DB (lots of JOINs) on a standalone SQL Server box, set this at 0 or 1 because most of your queries could probably benefit from a parallel execution plan, or at least the small extra overhead won't really hurt. This setting alone dramatically and immediately boosted the performance of one application and took all of 2 minutes to implement.
6) Make sure to apply indexes where it makes sense. For example, on one app. 4 hours of query research and index tuning based on that provided a 10 fold better response time for one query, which probably increased scalability for that database by 100 fold because of how often that query was used.
7) Read and follow this simple advice for SQL Server clustered indexes on MSDN:
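To illustrate point 1 in the list above (keep remote calls out of tight loops), here is a rough Python sketch that counts round trips instead of making real SOAP calls. `FakePriceService` and both of its methods are hypothetical; the batched `get_prices` call assumes the service can be redesigned to accept several keys at once, which is exactly the kind of redesign that point suggests.

```python
class FakePriceService:
    """Stand-in for a remote web service; counts round trips instead of making them."""

    def __init__(self, prices):
        self._prices = prices
        self.round_trips = 0

    def get_price(self, sku):
        self.round_trips += 1              # one request/response per item
        return self._prices[sku]

    def get_prices(self, skus):
        self.round_trips += 1              # one request/response for the whole batch
        return {sku: self._prices[sku] for sku in skus}


prices = {"A": 1.50, "B": 2.25, "C": 4.00}
cart = ["A", "B", "C", "A"]

# Chatty version: one remote call per cart line, inside the loop.
chatty = FakePriceService(prices)
chatty_total = sum(chatty.get_price(sku) for sku in cart)

# Batched version: one remote call up front, then purely local lookups.
batched = FakePriceService(prices)
price_by_sku = batched.get_prices(set(cart))
batched_total = sum(price_by_sku[sku] for sku in cart)
```

Both versions compute the same total; the difference is that the chatty one pays the per-call overhead once per cart line, while the batched one pays it once.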
The one that is capable of reverting to Paradox does indeed use the BDE (about the only thing that uses Paradox). I happen to know quite a bit about the BDE, having worked on Builder/Delphi projects in the past. If the queries are done properly, it will work in either server or client mode, but you can always try the other. I loaded up the BDE administrator, and I managed to find some set of configuration that made it a little faster, but the queries ran both server and client just fine. The sad part is, I had better results executing on client side than on server side. Running select(*) on a table should never take very long, and most of the problem, as far as I can tell from diagnostics, was the server just halting network traffic. It may be a bug or problem with NT4 and SQL Server, but I'm sure not going to tell them they need to spend 20K to get these applications running. I'd much rather develop a work-alike and drive the idiots that chose SQL Server in the first place out of the market if I can. Their product doesn't work for a group of people greater than 3 despite their claims. We've sunk enough debugging time into that whole thing that we could already have developed an Alpha.
It's not the rogue master browser problem, BTW. Get out any NT4 system and any operating system built on NT since, and by default, it will be hideously slow. Pull out a Windows 98 machine and put it on the same network, and it will be instant. I'm fairly certain it's intentional incompatibility, the kind they do all the time. Running mixed iterations of MS software will cause a lot of problems. They want you to upgrade everything all the time, because that's how they squeeze more money out of you.
I'm targeting any RDBMS with a JDBC driver, not just DB2. DB2 is just a preferred upgrade path. Even with this kind of portability, I don't take too much of a performance hit. Stored procedures wouldn't help, and neither would sub-selects. The only thing to gain from upgrading beyond MySQL for this application is good reliable data partitioning/clustering. No matter what anyone tells you, don't believe them that the current hot-backup crap in MySQL is useful for anything but failover. There were no distributed transactions the last time I checked, so I can't imagine that it would be reliable if you were using transactions on two different servers simultaneously.
If you are using Java, and you want to have good abstracted database access, you should take a look at the open source Hibernate project. Excellent software, much better than CMP. It supports all kinds of automatic relationship management too. It has its own abstracted query language that it compiles into native queries for whatever features your DBMS can handle. It's the best way to get performance out of RDBMSs without spending months developing special queries for each DBMS, and still have DBMS portability.
Karma Clown