Slashdot Mirror


The Setup Behind Microsoft.com

Toreo asesino writes "Jeff Alexander gives an insight into how Microsoft runs its main sites. Interesting details include having no firewall, having to manage 650 GB of IIS logs every day, and the use of their yet unreleased Windows Server 2008 in a production environment.

4 of 412 comments (clear)

  1. Re:Firewall Schmirewall by Anonymous Coward · · Score: 3, Interesting

    Anyway it looks quite impressive. I still don't understand how to handle 650 GB of logs :-).

    My question is why are the logs in ASCII text format? When all you want is say the IP [4 bytes], time of day [4 bytes], URI, referrer and return code [do you really care about their browser strings? You are MS after all, just assume it's IE].

    Storing an IP as text requires on average 15 bytes, so right there you can shave off 11 bytes with a binary IP. Time of day is worse, a date+time string is like 25 chars. Doesn't seem like much, but multiply the 32 bytes per entry you save by say 50 million hits and that's 1.5Gbyte you saved. That's not counting the white space you can remove, and a simple huffman code you could apply to the URL/referrer.

    Heck, just piping the binary IP/date and ASCII URL/referrer through gzip [or use libz's gzPrintf() etc...] could make a large difference as well.

    Point is, bragging about 650GB/day logs is not really impressive when you're "doing it wrong" (tm). That's like bragging about how much you cut your face while shaving.

  2. Re:Swimming in acronym soup... by loconet · · Score: 3, Interesting

    Interesting, I thought I was the only one. Why is it that every time I read about Microsoft related technology it's always an acronym salad. Not even commonly used acronyms either, they use acronyms for their own way of calling technology xyz. It's almost like they do it on purpose ..

    --
    [alk]
  3. Re:Beta in production environment. by ashridah · · Score: 4, Interesting

    Which we do on a regular basis. Every few weeks I see emails going around from higher-ups asking us to test their team's RC or beta stuff at home for them, and the project I'm working on has been dependent on VS2008 since beta2. Everyone here has their favourite project they like to keep tabs on. I've got longhorn server 2008 running on one of my machines here.

    That said, the choice to use longhorn server in production isn't actually a bad one. It's really, REALLY stable. I keep hearing (from people both inside and outside the company) that it's more stable than 2003 is (and 2003 has the benefits of multiple service packs). It's also a lot more configurable about what it runs, and how much of it it enables when it's installed. I wouldn't bet the entire stable on it, but I'd be willing to put money on it getting a place.

    All in all, it's pretty sweet, if you look at it from the sysadmin perspective. Also, the stuff you can setup when you couple it with vista is really nice (from a security standpoint, particularly). That said, some of that functionality is being backported to XP with SP3 or whatever.

  4. Re:Beta in production environment. by ashridah · · Score: 4, Interesting

    Because at least Unix has conventions.

    Conventions are a nice way of saying "that's the way it's always been, so that's the way it stays." Windows has similar problems left over from legacy, going all the way back to CP/M. Yes, this sucks, but so does some conventions in unixland. Just ask a Solaris 10 admin how much it sucks when your upstream vendor breaks decades-long convention.

    Really? Ok, lets open up C:\Windows on one of our Windows servers. Hmmm a folder named "$hf_mig$". I suppose you know what that means or what convention that follows? Or C:\Windows\adam. Kinda looks like it might be some directory tools. Maybe ADAM = Active Directory AdMinistration? What's that doing there anyway? I could keep going down the list. I suppose there is a very good reason why there are .BMP files in C:\Windows? Desktop wallpapers? Come on. I wonder if they're related the other brilliantly named files such as SET2.tmp and SET3.tmp in that same directory. And don't get me started on the insanity that is C:\Windows\System32. Hardly a single file/folder that doesn't use 8.3 naming. I haven't clue what have that stuff is doing there.

    You're not looking in the right place. Microsoft, love it or hate it, worked out a long time ago that 'filename' and 'metadata' aren't necessarily the same thing. The filename and path are just handy locational indexes, and don't necessarily need to mean *anything*. Sure, a DLL can, and often, for newer stuff, IS far longer than 8.3, but it wasn't until later versions of NT (3.5/4.0, I don't remember my history too well) that support for it kicked in well enough, and there's some legacy stuff around. You don't break legacy just because it's fun. Microsoft gets this right, even if they had to tread over it a fair bit in vista, and add some nasty hacks to deal with most of the fallout.

    Anyway, as I was saying, you're not looking in the right place. Case study: C:\windows\system32\apss.dll: Microsoft(r) InfoTech Storage System Library.
    Problem solved. (it's not at all difficult to use something like powershell (or possibly other tools) to just print this out in a souped up version of ls with a little scripting, I might add, just like I can do a few similar scripting tricks on my debian system to tell you who owns the copyright to 90% of .so's in /usr/lib.)

    Want another one?

    c:\windows\System32\bitsigd.dll: Background Intelligent Transfer Service IGD Support

    Oh look, another one, fully named.

    Of course, this starts to fall down when the file doesn't contain metadata, but that's a problem for, say, XML schema files in /usr/share/ on linux too. The organisation might be a bit better, but not by much. The saving grace there is that I have dpkg to work shit out for me. .NET goes even further. You can register as many different versions of a namespace as you like, and .NET will do the mapping for you if you request a specific version.

    First of all, I was only talking about superficial organization. And if you want to see something nice, have a look at OS X some time. Not only is the System (/System) well organized, but most applications are neatly self contained in /Applications/Some.app. They usually don't spew files all over the place when installed. You know where the term DLL Hell comes from, don't you?

    Yes. I do. .NET does a good job of solving this quite nicely. Adds public/private keys into the mix too, plus a bunch of other mechanisms. .NET isn't just for C# either. It deals with VB, C++, and (ahahahha) J# too.
    I will admit that the mac platform is neatly arranged, but their QA seems to have gone to the toilet right now. A place that windows' QA has emerged from rather nicely, I should mention.

    As for random stuff appearing in random places, try dealing with commercial software. Even on linux, the developers will put shit in strange places. Open