Slashdot Mirror


The Setup Behind Microsoft.com

Toreo asesino writes "Jeff Alexander gives an insight into how Microsoft runs its main sites. Interesting details include having no firewall, having to manage 650 GB of IIS logs every day, and the use of their as-yet-unreleased Windows Server 2008 in a production environment."

29 of 412 comments (clear)

  1. Re:Beta in production environment. by EvanED · · Score: 5, Informative

    Vista was never meant as a server. Same as XP isn't used as a server, it's Server 2003.

  2. Re:Firewall Schmirewall by great_snoopy · · Score: 5, Informative

    Of course they have a firewall, just watch the difference between a tcptraceroute to a public port (like 80) and tcptraceroute to the same ip but some other port (like 110 pop3 for example). You'll see that packets get dropped at some point indicating a firewall. It's not a RST (port closed) it's just dropping packets for nonpublic services. That is a packet filtering firewall.
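
    A rough sketch of the same check without tcptraceroute, assuming plain TCP connects from Python: a refused connection means a RST came back (port closed), while a timeout means the SYN was silently dropped somewhere on the path, which is the packet-filter behaviour described above. The name `probe_port` and the 3-second timeout are illustrative choices, and unlike tcptraceroute this cannot show at which hop the packets disappear.

```python
import socket


def probe_port(host, port, timeout=3.0):
    """Classify a TCP port by how a connection attempt ends.

    'open'     -> the three-way handshake completed
    'closed'   -> the peer answered with RST (connection refused)
    'filtered' -> nothing came back before the timeout (packet dropped)
    'unknown'  -> some other error, e.g. name resolution failed
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "closed"
    except socket.timeout:
        return "filtered"
    except OSError:
        return "unknown"
```

    Calling `probe_port(host, 80)` and `probe_port(host, 110)` against the same address gives the comparison described above: "filtered" on the second call, rather than "closed", points at a filter in front of the host.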

  3. Re:Firewall Schmirewall by oliderid · · Score: 4, Informative

    from the article:
    "...At this point we still don't use firewalls for MS.COM..."

    and then

    "Router ACLs are in place to block unnecessary ports"

    blocking unnecessary ports is a firewall feature (IMHO ?)

    Anyway it looks quite impressive. I still don't understand how to handle 650 GB of logs :-).

  4. Re:Beta in production environment. by schnikies79 · · Score: 5, Informative

    Funny, but you're wrong. Pro is for networking environments where you need RDP, policies, ability to join a domain, file encryption, etc. Home lacks these.

    --
    Gone!
  5. Re:Microsoft brainwashing by plague3106 · · Score: 4, Informative

    You realize that Win2k3 does turn off most services by default, and Win2k8 takes this even further by not installing them at all.

    Uh, didn't I read an article not too long ago about how the update.microsoft.com site was broken into?

    Link, please?

  6. Re:Beta in production environment. by EvanED · · Score: 3, Informative

    No, the pro version is more intended toward business users. Not servers, but the sort of thing workers have on their desktop. That's why it has tunings for corporate networks and ACLs and quotas and such.

    You can debate the drawbacks and benefits of having so many versions, but XP was never intended to be a substantial server.

  7. Re:Beta in production environment. by Anonymous Coward · · Score: 1, Informative

    No, professional versions offer business-required desktop features that are stripped out of the home version. If it mirrors XP, this would include things like the ability to manage security for accounts on a per-file level.

    But it's not intended for servers, either on Vista or XP, as the GP said.

  8. Re:Firewall Schmirewall by allenw · · Score: 3, Informative

    Large scale log processing isn't hard if you have the right tools. :)

  9. Re:Firewall Schmirewall by truthsearch · · Score: 1, Informative

    MS was (and maybe still is) outsourcing web page caching to Akamai, which is using Linux servers.

  10. akamai by wwmedia · · Score: 3, Informative

    don't forget the whole slew of Linux servers that they use through Akamai to handle the bandwidth;

    it's one reason why doing a lookup on Microsoft servers often shows that they are running Linux. It's also another reason why people point out that Linux is more scalable because even Microsoft can't eat its own dogfood.

  11. Misleading Summary. Total Propaganda by mpapet · · Score: 3, Informative

    1. The asshat highlights they use no firewall, and yet buried deeper in the article is this "Router ACLs are in place to block unnecessary ports" That's the functional equivalent of a firewall.

    2. I get into discussions where tech guys spew traffic numbers and I'm never impressed. Volume creates issues if you want to actually do something with the data, which I doubt they do much beyond running the usual marketing metrics. It's not until you actually shoot for 99.99% service uptime that you begin to comprehend the challenge (on any platform); the traffic itself is not the challenge.

    3. I'm very interested in reading what their hardware budget is like. I get excellent performance out of Linux compared to server 2003 boxes on similar compaq dl380's.

    --
    http://www.maxineudall.com/2010/02/should-economists-be-sued-for-malpractice.html
  12. Re:Supporting by MightyYar · · Score: 4, Informative
    Whoopsie, looks like Akamai uses IIS now - I'm behind the times, I guess:

    % nmap -A -T4 -F -P0 www.microsoft.com
     
    Starting nmap 3.81 ( http://www.insecure.org/nmap/ ) at 2007-12-13 11:48 EST
    Interesting ports on wwwbaytest2.microsoft.com (207.46.19.254):
    (The 1218 ports scanned but not shown below are in state: filtered)
    PORT STATE SERVICE VERSION
    80/tcp open http Microsoft IIS webserver 7.0
    179/tcp closed bgp
    443/tcp open ssl/http Microsoft IIS webserver 7.0
     
    Nmap finished: 1 IP address (1 host up) scanned in 167.891 seconds
    --
    W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
  13. Re:But generally.. by nuzak · · Score: 4, Informative

    The useful distinction between port filtering + ACLs and today's notion of a "firewall" is that the latter is a stateful firewall, doing stateful packet inspection, with policies based on more than just the one packet you're picking a TCP header out of. If you tried to sell a stateless filter as a "firewall" today, you'd be laughed out of the market.
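
    A toy sketch of that distinction, assuming a heavily simplified packet model: the stateless check looks only at the current packet's destination port, while the stateful one also remembers which flows started with a SYN. `Packet`, `stateless_allow` and `StatefulFilter` are invented for illustration; a real stateful firewall also tracks handshake progress, sequence numbers and timeouts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str
    sport: int
    dst: str
    dport: int
    flags: frozenset  # TCP flags set on this packet, e.g. frozenset({"SYN"})


ALLOWED_PORTS = {80, 443}


def stateless_allow(pkt):
    """Router-ACL style: decide from this one packet's header alone."""
    return pkt.dport in ALLOWED_PORTS


class StatefulFilter:
    """Allow non-SYN packets only for flows that were opened with a SYN."""

    def __init__(self, allowed_ports):
        self.allowed_ports = set(allowed_ports)
        self.flows = set()  # (src, sport, dst, dport) tuples seen opening

    def allow(self, pkt):
        if pkt.dport not in self.allowed_ports:
            return False
        key = (pkt.src, pkt.sport, pkt.dst, pkt.dport)
        if pkt.flags == {"SYN"}:
            self.flows.add(key)  # remember the new flow
            return True
        return key in self.flows  # mid-stream packets need a known flow
```

    A stray ACK aimed at port 80 passes `stateless_allow` but is dropped by `StatefulFilter` unless a SYN for the same flow was seen first.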

    And no, I don't see any need to firewall a web farm either.

    --
    Done with slashdot, done with nerds, getting a life.
  14. Re:Swimming in acronym soup... by Anonymous Coward · · Score: 5, Informative

    GFS: Global Foundation Services. Microsoft's big internal network management thing. It's the people who keep the servers up and running for everything facing outward.

    HBI: High Business Impact. Social Security numbers, Passport accounts, etc.

    NLB: Network Load Balancer.

    AV: AntiVirus.
    DoS: Denial of Service
    IIS: Internet Information Services. 'httpd' for Windows.

  15. Re:Microsoft brainwashing by SEMW · · Score: 2, Informative

    Wow, you got (Score:3, Insightful) for smugly saying "Link please?"? Here's a link for ya Google. Learn to look things up for yourself instead of acting like a smug bastard when someone points out the obvious.

    "Link, please?" used in that context is a shortened form of "I've looked around, and can't find the slightest reference to what you mentioned; but rather than assume that you made it up, I am going to give you the benefit of the doubt and assume that it merely, for whatever reason, wasn't well publicised. Thus, would you care to supply any proof of your claim?"

    (BTW, on a related note, burden of proof is on the person who makes the claim. This follows by necessity from the impossibility of proving a negative.)
    --
    What's purple and commutes? An Abelian grape.
  16. Re:Eating dogfood is good by ashridah · · Score: 4, Informative

    Not complaining in TFA, but this is /. -- I just anticipated the howls of the unwashed hordes rightfully bitching about yet another "professional" OS with a markedly unprofessional Teletubbies UI which certainly isn't ready for market yet, all while ignoring MS' internal dogfood consumption. I'll bet if enough Microsofties had eaten Office dogfood you could shut off that fucking control-click "Research" panel easily.

    Nevermind that the UI for 2008 is roughly the same as 2003, only with a more extensive (yet still looking clean and fairly spartan with the eyecandy) set of configuration utilities for roles and features. Just wish I could say the same for the control panel. :)

    As for the 'research' panel... okay, I work here at microsoft, and I own my own copies of office at home, and I have no idea what that is. Of course, I'm hardly an office power user.

    You can bet your bottom dollar that office 2007 is all that's in use around most of the company. As is vista, although it tends to be a mixture of vista, xp and 2003/2008 in most offices, usually for a variety of legacy reasons (maintenance of older projects, testing, etc)

    I've got all but XP myself, but only because I haven't needed it to do my job.

  17. Re:Beta in production environment. by Tim+C · · Score: 2, Informative

    Home has the rdp *client* of course, so you can connect out, but not the rdp *server*. Pro also ships with IIS as an optional installable extra, which Home lacks.

  18. Re:Microsoft brainwashing by plague3106 · · Score: 2, Informative

    Well, first I said "most." Second, it's possible he wrote incorrectly. He might mean "we only run required services."

    But don't believe me though, go install Server 2003 R2 yourself. IIS either isn't installed unless you specify, or it comes locked down to serve ONLY static content. (I know that latter part is the default IIS setup, because I had to go turn everything I needed on).

  19. Re:Beta in production environment. by merreborn · · Score: 2, Informative

    NT4, and win2K both had "Workstation" and "Server" versions. Windows XP had "Home" and "Pro". So it's understandable that you might assume that workstation equates to home, and server equates to pro. However, in actuality, "Pro" is closest to "Workstation", and "Home" is really more of a "Workstation lite", with a lot of the workstation features disabled. Win2K3 is the closest thing to a "XP Server" release that ever came to be -- although it's really not related to XP at all.

  20. Re:Supporting by jimicus · · Score: 2, Informative

    Erm.... nmap always reported the webserver as being IIS, because the nature of Akamai's service is that the webserver reports itself as being whatever's really running on the other side of their network.

    The thing that causes the confusion is if you do an nmap -O, and it guesses the host operating system to be Linux despite running IIS on the web server.
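
    To illustrate why the banner and the OS guess can disagree: the Server header is just a string the HTTP layer chooses to send, whereas nmap -O fingerprints the TCP/IP stack underneath. A minimal local sketch, assuming Python's standard `http.server`; `BannerHandler`, `start_local_server` and `fetch_server_banner` are made-up names, and this only shows a banner being set arbitrarily, not how Akamai's edge servers actually work.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class BannerHandler(BaseHTTPRequestHandler):
    def version_string(self):
        # The value of the Server header is whatever the application says.
        return "Microsoft-IIS/7.0"

    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)  # also emits the Server and Date headers
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the demo quiet


def start_local_server():
    server = HTTPServer(("127.0.0.1", 0), BannerHandler)  # port 0: OS picks one
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_server_banner(host, port):
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", "/")
        resp = conn.getresponse()
        resp.read()
        return resp.getheader("Server")
    finally:
        conn.close()


if __name__ == "__main__":
    server = start_local_server()
    try:
        print(fetch_server_banner("127.0.0.1", server.server_address[1]))
    finally:
        server.shutdown()
        server.server_close()
```

    Whatever OS this runs on, the banner claims IIS, so the header alone says nothing reliable about the machine answering the packets.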

  21. Re:Perhaps the only ones who can do it "right" by Super_Z · · Score: 2, Informative

    MS claims their software is stable and secure. Perhaps it is -- when was the last time microsoft.com was taken down by malevolent hackers?

    # dig www.microsoft.com
    [..]

    ;; ANSWER SECTION:
    www.microsoft.com. 2520 IN CNAME toggle.www.ms.akadns.net.
    toggle.www.ms.akadns.net. 300 IN CNAME g.www.ms.akadns.net.
    g.www.ms.akadns.net. 300 IN CNAME lb1.www.ms.akadns.net.
    lb1.www.ms.akadns.net. 300 IN A 207.46.19.190
    lb1.www.ms.akadns.net. 300 IN A 207.46.192.254
    lb1.www.ms.akadns.net. 300 IN A 207.46.19.254
    lb1.www.ms.akadns.net. 300 IN A 207.46.193.254
    [..]

    # nmap -v -p22 -O 207.46.19.190
    [..]
    Host wwwbaytest1.microsoft.com (207.46.19.190) appears to be up ... good.
    Interesting ports on wwwbaytest1.microsoft.com (207.46.19.190):
    PORT STATE SERVICE
    22/tcp filtered ssh
    Device type: general purpose
    Running: lwIP, Sun Solaris 2.X|7
    OS details: lwIP (Lightweight TCP/IP stack) version lwip-0.5.3-win32, Sun Solaris 2.6 - 7 (SPARC), Sun Solaris 2.6 - 7 x86, Sun Solaris 2.6 - 7 with tcp_strong_iss=0, Sun Solaris 2.6 - 7 with tcp_strong_iss=2

    Nmap run completed -- 1 IP address (1 host up) scanned in 1.806 seconds

    I'm actually out of words at this point.

  22. Re:Microsoft brainwashing by jjrockman · · Score: 2, Informative

    Wow. I'm impressed. Each of these links is either: a) really old, before Windows 2003 Server even existed, or b) about exploits in the DotNetNuke software and not specifically IIS. Troll, FUD, Flamebait, eh? So which one are you guilty of?

    --
    Quit jabbering on the phone while driving. You are not that important.
  23. Re:Supporting by Bri3D · · Score: 2, Informative

    Akamai forwards the header strings from whatever httpd the Akamai network is caching/fronting for.

    http://news.netcraft.com/archives/2003/08/17/wwwmicrosoftcom_runs_linux_up_to_a_point_.html

  24. Re:Firewall Schmirewall by Anonymous Coward · · Score: 1, Informative

    it's actually 650GB compressed. around 10 TB uncompressed.
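
    Taking those two figures at face value, a quick back-of-the-envelope calculation gives the compression ratio and the sustained write rate. This assumes decimal units (1 GB = 10^9 bytes) and traffic spread evenly over the day, which real traffic is not; the constant and function names are just for illustration.

```python
COMPRESSED_BYTES_PER_DAY = 650e9    # 650 GB, figure quoted above
UNCOMPRESSED_BYTES_PER_DAY = 10e12  # ~10 TB, figure quoted above
SECONDS_PER_DAY = 24 * 60 * 60


def compression_ratio(uncompressed, compressed):
    """How many raw bytes each stored byte represents."""
    return uncompressed / compressed


def sustained_mb_per_s(bytes_per_day):
    """Average rate in MB/s if the volume were spread evenly over a day."""
    return bytes_per_day / SECONDS_PER_DAY / 1e6


if __name__ == "__main__":
    ratio = compression_ratio(UNCOMPRESSED_BYTES_PER_DAY, COMPRESSED_BYTES_PER_DAY)
    print(f"compression ratio: about {ratio:.1f}:1")
    print(f"compressed:   {sustained_mb_per_s(COMPRESSED_BYTES_PER_DAY):.1f} MB/s")
    print(f"uncompressed: {sustained_mb_per_s(UNCOMPRESSED_BYTES_PER_DAY):.1f} MB/s")
```

    That works out to roughly 15:1 compression, about 7.5 MB/s of compressed log data and a little under 116 MB/s uncompressed, averaged across the whole farm.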

  25. Re:Firewall Schmirewall by lena_10326 · · Score: 5, Informative

    My question is why are the logs in ASCII text format? When all you want is say the IP [4 bytes], time of day [4 bytes], URI, referrer and return code [do you really care about their browser strings? You are MS after all, just assume it's IE]. Storing an IP as text requires on average 15 bytes, so right there you can shave off 11 bytes with a binary IP. Time of day is worse, a date+time string is like 25 chars. Doesn't seem like much, but multiply the 32 bytes per entry you save by say 50 million hits and that's 1.5Gbyte you saved. That's not counting the white space you can remove, and a simple huffman code you could apply to the URL/referrer.

    Logging in fixed format is not more efficient than variable format text files (unless we're talking about transactions but we're not). Let's assume you're logging the basics: IP address, Timestamp, Return code, URI and we'll look at logging in fixed format then variable format.

    [abcd]    [timestamp]    [code]    [URI]
    4 bytes   8 bytes        1 byte    50 bytes
    (you actually need 2 bytes for the HTTP return code, but let's ignore that)

    Every record will require 63 bytes (and we'll round up to 64 for proper word alignment). So, if we log 1000 messages, we will consume 64,000 bytes total.

    Ok. Now for text logging with space delimiters. We have 3 options below, each requiring slightly less space than the previous. We'll run totals for each.

    123.567.890.123 YYYYMMDDHHMMSS x URI...............\n
    16 bytes 15 bytes 2 bytes 50 bytes 1 byte

    123.567.890.123 1197572382 x URI...............\n (UNIX time)
    16 bytes 11 bytes 2 bytes 50 bytes 1 byte

    1235678901231197572382xURI...............\n (UNIX time)
    12 bytes 10 bytes 1 byte 50 bytes 1 byte

    16 + 15 + 2 + 50 + 1 = 84 bytes * 1000 = 84,000 bytes
    16 + 11 + 2 + 50 + 1 = 80 bytes * 1000 = 80,000 bytes
    12 + 10 + 1 + 50 + 1 = 74 bytes * 1000 = 74,000 bytes

    Wow. Fixed binary format kicks variable text format's ass. Wrong. This assumes the URI (or message) block will always occupy 50 bytes. It will not. Let's go right down the middle and assume it averages 25 bytes and we'll recalculate.

    16 + 15 + 2 + 25 + 1 = 59 bytes * 1000 = 59,000 bytes
    16 + 11 + 2 + 25 + 1 = 55 bytes * 1000 = 55,000 bytes
    12 + 10 + 1 + 25 + 1 = 49 bytes * 1000 = 49,000 bytes

    Variable text format almost always beats fixed binary format for logging. That's why Microsoft (and the rest of the world) stores log files as text. Plus, it's far easier to manage and debug when you can slice and dice the files with standard command line tools.
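
    The comparison above can be sketched in a few lines, assuming Python's `struct` module for the fixed layout. One deliberate difference from the numbers above: this version spends 2 bytes on the return code instead of 1 byte plus a pad byte, which lands on the same 64 bytes per record. `fixed_record` and `text_record` are illustrative names, and the sample values in the usage note are made up.

```python
import ipaddress
import struct

FIXED_URI_WIDTH = 50  # bytes reserved for the URI in the fixed layout

# Big-endian, no implicit padding:
# 4-byte IP, 8-byte timestamp, 2-byte return code, 50-byte URI field.
FIXED_LAYOUT = struct.Struct(f">4sQH{FIXED_URI_WIDTH}s")


def fixed_record(ip, timestamp, code, uri):
    """Pack one entry into the fixed-width binary layout.

    The URI is NUL-padded (or truncated) to FIXED_URI_WIDTH bytes,
    so short URIs cost as much as long ones.
    """
    ip_bytes = ipaddress.IPv4Address(ip).packed
    return FIXED_LAYOUT.pack(ip_bytes, timestamp, code, uri.encode("ascii"))


def text_record(ip, timestamp, code, uri):
    """Format one entry as a space-delimited text line with UNIX time."""
    return f"{ip} {timestamp} {code} {uri}\n".encode("ascii")


def total_bytes(encode, entries):
    """Total size of all entries under the given encoder."""
    return sum(len(encode(*entry)) for entry in entries)
```

    With a 25-character URI, for example `text_record("192.168.100.200", 1197572382, 200, "/downloads/details.aspx?x")`, the text line is 57 bytes against 64 for `fixed_record` with the same arguments; the gap flips once URIs approach the reserved 50 bytes.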

    One more thing. I know what you might be thinking. We're logging URLS, which will probably consume the majority of the 50 byte allotment. Most developers will calculate an average width size and double it, so no matter what we'll still be filling about 50% of the message section.

    Last point. If I were to use your example, the savings with text logging would even be greater. 2 URLS would be stored, both consuming about 50% of their data block. IP address, timestamp, URI, Referrer URI, Return Code. There's also a bunch of other little optimizations you can do such as storing the domain, year, month, and day in the filename rather than in the data or dropping the least significant byte in the HTTP return code.

    --
    Camping on quad since 1996.
  26. Re:But generally.. by Kalriath · · Score: 3, Informative

    No, because you'd have to go to considerable effort to configure it in such a way that what you say would actually happen. Hell, even my Windows Server 2003 machine is still running stable and virus/spyware free after about five years (or so).

    --
    For a site about things like basic rights, Slashdot users sure do like to censor "dissent".
  27. Re:Firewall Schmirewall by DeadBeef · · Score: 2, Informative
    Sounds like you just made up some definitions in your head (or worse, follow some other deluded sod's mantra) for some fairly well-worn terminology and then decided to go on a crusade to harass the unbelievers.

    Firewall is not a synonym for stateful filter like you imply later on in this thread. For some data to support my statement, the firewall entry at Wikipedia says:

    "A firewall is a dedicated appliance, or software running on another computer, which inspects network traffic passing through it, and denies or permits passage based on a set of rules."

    It then goes on to classify firewalls into first, second and third generation (the first being what you called "Port blocking").

    In retrospect IPHBT. Oh well.

    --
    I am a lawyer and this constitutes legal advice and I shall indemnify you against any losses arising from taking it.
  28. Re:Microsoft brainwashing by Kalriath · · Score: 3, Informative

    Actually, when you first boot Windows Server it pops up with the "Configure Your Server" page, and an extra note that until you've set up roles on it, nothing will work. As in, it hasn't started IIS, it hasn't started AD, it hasn't even started Terminal Services. And until you've picked which ones you want to run, it won't even allow inbound connections whatsoever!

    --
    For a site about things like basic rights, Slashdot users sure do like to censor "dissent".