Slashdot Mirror


Software Engineering at Microsoft

an_mo writes "A link to a Google-cached document is floating around some mailing lists containing some info about Microsoft software engineering. In particular the document contains juicy bits about the development of a large project like NT/2K. Some examples: Team size went from 200 (NT3.1) to 1400 (Win2k). A complete build of Win2k takes 8hrs on a 4-way PIII and requires 50GB of hard drive space. Written/email permission required for checkins by the build team." The HTML version on Usenix's site is much nicer than Google's auto-translated version.

32 of 461 comments

  1. Re:I don't know much about build times.. by Osty · · Score: 4, Insightful

    So for something like Windows 2000 is that a long time?

    It's long-ish, but not overly long. For a comparison that you may be more familiar with, consider the time it takes to compile the Linux kernel, your chosen libc, other libs you'll eventually need (say, gtk and/or qt, etc), X, GNOME or KDE, some apps (xmms, xine, a couple editors, etc), and probably 8 or 9 other things I'm forgetting right now. You'll probably come up with a similar number (probably smaller, but there's also probably less code in all the above tools).

    That's not to say it can't be made faster. I don't know whether that time was on a multi-threaded compile or not, but I'd sure hope so given that their build machines were 4-way machines. Also, note that they didn't say what speed the P3s were. 4 P3-500s will surely compile slower than 2 p3-1.2GHzs. Nor did they say if those were Xeons or not (larger cache is better for compiling). The obvious solution is to throw hardware at the issue, but there are other things that can be done like incremental building, better sync/drains for multi-threaded compiles, more efficient compilers and build scripts, etc.

  2. Re:What a waste of time and money! by cpeterso · · Score: 4, Insightful


    Only the NT build lab needs to rebuild everything. Individual developers only need to build their feature's DLL and EXE files.

  3. Single point of failure by PingXao · · Score: 5, Insightful
    1 defect stops 1400 devs, 5000 team members!
    I would think this would lead to a situation where CYA would become a way of life. Sure, even the best developers will make an occasional mistake. The document notes that a successful culture needs to recognize that mistakes will happen, but if ONE defect is going to shut down 5,000 people, I know I wouldn't want to be the one everybody is pointing their fingers at. I can imagine the circus atmosphere when the blame-shifting and the search for the guilty goes into high gear.
    1. Re:Single point of failure by gwernol · · Score: 5, Insightful

      I would think this would lead to a situation where CYA would become a way of life.

      I don't think so - he's talking about build breaks (i.e. code that won't compile). These are automatically detected and the culprit is auto emailed. Under source code control there is nowhere to hide from this because you know whose code broke the build.

      The only CYA you can do is not check in broken code. This is a good thing :-)

      Runtime errors don't stop 5000 team members.

      --
      Sailing over the event horizon
    2. Re:Single point of failure by WaKall · · Score: 2, Insightful

      With proper branching in your source repository, you can isolate different areas of change, and thus keep build breakages limited to subsets of developers.

      With regards to isolating who broke a build, that would require a clean build for each and every checkin, which just isn't practical in terms of hardware resources. A more practical solution is to grab tip, build, if fail -> indicate all checkins since last green build. This gives you a bigger culprit set, but it's MUCH cheaper in terms of hardware.
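
      The "grab tip, build, if fail" idea above can be sketched roughly like this. This is only an illustration, not anything from the presentation: the Checkin record, the culprit_set function and the build_ok callback are all made-up names, and a real build lab would plug in its own source control and build tooling.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Checkin:
    """Hypothetical record of one checkin (not a real source-control API)."""
    revision: int
    author: str


def culprit_set(
    checkins: Sequence[Checkin],
    last_green: int,
    build_ok: Callable[[int], bool],
) -> List[Checkin]:
    """Build only the tip; if it fails, suspect every checkin after last_green."""
    if not checkins:
        return []
    tip = max(c.revision for c in checkins)
    if build_ok(tip):
        return []  # tip builds cleanly, so there is nobody to notify
    # One build was run, so the break cannot be pinned on a single checkin.
    return [c for c in checkins if c.revision > last_green]


if __name__ == "__main__":
    history = [
        Checkin(101, "alice"),
        Checkin(102, "bob"),
        Checkin(103, "carol"),
        Checkin(104, "dave"),
    ]
    # Pretend revision 103 introduced the break: every build from 103 on fails.
    suspects = culprit_set(history, last_green=101, build_ok=lambda rev: rev < 103)
    print([c.author for c in suspects])  # → ['bob', 'carol', 'dave']
```

      Only one build is spent per cycle, which is the hardware saving; the cost is that bob and dave get flagged along with carol.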

    3. Re:Single point of failure by gwernol · · Score: 3, Insightful

      With proper branching in your source repository, you can isolate different areas of change, and thus keep build breakages limited to subsets of developers.

      Agreed, and we know that their SCCS was broken in this respect.

      With regards to isolating who broke a build, that would require a clean build for each and every checkin, which just isn't practical in terms of hardware resources. A more practical solution is to grab tip, build, if fail -> indicate all checkins since last green build. This gives you a bigger culprit set, but it's MUCH cheaper in terms of hardware.

      Again going back to the article, we're talking about their daily builds, which will be clean. The compilers will spit out failure information that can be easily traced back to the culprit.

      This is how many large (i.e. OS-sized) projects work - regular clean builds, usually once per day, with auto emailing of break information to those responsible. One group I worked in also required you donate some chocolate to a central "fund" available to all the engineers when you broke the build. A fun way of encouraging people to compile against clean sources before checking in.

      --
      Sailing over the event horizon
  4. Actually quite strange by rmassa · · Score: 2, Insightful

    I'm a linux user, but most MS people I know hail win2k as the best microsoft OS ever. So this presentation seems to be kind of strange, pointing out things (increased build time, developers/testers) that illogically seem to create a better OS.

    1. Re:Actually quite strange by Anonymous Coward · · Score: 1, Insightful

      What exactly is illogical?

      That compiling 9M LOC (NT3.51) took less time than compiling 29M LOC (W2k)?

      Or that scaling to 1200 developers demanded some changes in the process management?

      In case you didn't notice, that was an article about software engineering, not about software politics, OS wars or stuff like that. Not every comp related article on /. has to be about that.

      Although you can probably be forgiven for thinking so.

  5. Re:standard linux praise... by DeepZenPill · · Score: 2, Insightful

    I sorta doubt that 8hrs was spent only compiling the win2k "kernel." Don't forget that the GUI, networking and a ton of other crap is built in. I'm sure compiling an entire linux system consisting of apps equivalent to those which are built into windows would take a hell of a lot of time too.

  6. Re:A recipe for disaster by plierhead · · Score: 5, Insightful

    I venture to guess, however, that your company is somewhat smaller than Microsoft, is held together by shared enthusiasm and the exhilaration of short term releases, and that you don't face many of the problems that any large company, not just the Borg, does. I would never defend the quality of MS products but anyone who has worked on large products with many existing customers in a large software company like an Oracle, Microsoft or IBM will understand that it is simply impossible to only hire expert programmers whose work never needs to be checked by anyone else and who don't need any supervision.

    Some of your other statements are rather sweeping. Some parts of UML - such as object modelling - are very useful indeed and can act as highly rigorous sources for a lot of code and database generation or automated access. Others (like Use cases IMHO) suck and are of little use to programmers, though more in communication with PHBs and business types.

    A lot of what you say is very true for small focused teams working in their bedrooms/garages/garrets but much less so for any large software developer who sells software for money. Your "expert-driven" approach would never work at a Microsoft.

    Your last point, that OSS produces better results, is probably true. Certainly it's more cost-efficient. But does it produce profitable companies that make heaps of money? Maybe you don't like the idea of that. But most of the rest of the world, including your gray-haired neighbour who plans to retire on the proceeds of his portfolio, does.

    --

    [x] auto-moderate all posts by this user as insightful

  7. Re:A recipe for disaster by MAXOMENOS · · Score: 5, Insightful

    UML and other modelling fads. My former employer required the use of 65-page UML diagrams for the simplest command-line utilities. Why? Because it was popular, and the investors liked to make sure we were buzzword-compliant. UML is designed for non-technical audiences, and as such it flies in the face of the engineering goals it is designed to solve.

    I've found UML, or at least quasi-UML, useful; any time I design a system I draw a quick UML sketch just to help me think about what's involved. Unless, that is, it's something really dead simple .. something equivalent to a homework assignment. Sometimes most of the really hard work goes into a good UML diagram, and the rest becomes easy.

    But despite this, I can't help but reflect on your statement in utter horror. What the hell kind of UML diagram does one put together for, say, ls? Or cd? Or a numerical calculation?

    Code review. Code review is a power trip and best, and a drain on morale at worst. If a programmer cannot be trusted to develop excellent code, he should be replaced with somebody who can. It's a tight labor market on the developers' side, so incompetent programmers should be spending their time reading O'Reilley books instead of playing games and looking at porn in their parents' basement.

    I disagree with you on two fronts. One, I've always found code review beneficial for a project. Weaker coders learn good habits; stronger coders teach good habits; bugs not visible to some become visible to others; the general quality of code improves. People who can't deal with constructive criticism of their code make for bad team-mates.

    Secondly, I've never met anyone who became a good programmer by reading books, even books as high quality as O'Reilly's. I learned to code by writing code and reading others' code. The books make handy references, but sticking to books is akin to trying to learn to write well by reading the dictionary.

    Large, geographically concentrated development teams. The best work is emphatically not done by 1400 people in the Redmond campus. The best work is done by culling experts of individual niche areas from around the globe. Not surprisingly, this is the model that Linux and most Open Source software uses, and that is why OSS is phenominally successful compared with any of its proprietary competition.

    Most of Microsoft's problems can probably be directly attributed to the size of its development team. MS project designers might do well to re-read The Mythical Man-Month (if they never read it, they have no business being project designers, IMO).

  8. Your design process is the real disaster recipe... by javabandit · · Score: 4, Insightful
    My former employer required the use of 65-page UML diagrams for the simplest command-line utilities. Why? Because it was popular, and the investors liked to make sure we were buzzword-compliant. UML is designed for non-technical audiences, and as such it flies in the face of the engineering goals it is designed to solve. What's good for the suits isn't necessarily good for the engineers.
    I'm not sure how to say this nicely, but you are a moron. You actually think that UML and design diagrams are only for suits? That is ridiculous. Just because your former employer was a complete idiot and required obscene amounts of UML diagrams for small things doesn't make the whole concept a farce.

    Good engineering (of any kind) starts with design... a plan. I'm glad you don't build skyscrapers or airplanes.
    These stand in the way of progress like no other corporate "bad habit." Requiring programmers to have a supervisor (often a non-technical PHB) "sign off" on their code prior to the commit is ludicrous. Developer time costs $20-40 an hour - should that time be wasted pursuading co-workers to check in and approve their code, or should it be spent doing actual development?
    Oh boy. So you basically are thinking... what... that code should be reviewed after it has already hit QA or something? Or perhaps we shouldn't review code at all?

    Here's a clue. If a developer is costing 20-40 per hour writing CRAPPY code... THAT is a far worse waste of time than taking a little time... reviewing the code... and correcting it if necessary.

    Development isn't just writing code any way you want. You want things to be very solid, standardized, and consistent before it gets into beta. Using your way... you'd never know if the code was good or not. Apparently... to you... if it works... ship it!
    Code review is a power trip and best, and a drain on morale at worst. If a programmer cannot be trusted to develop excellent code, he should be replaced with somebody who can.
    What? How do we know if the code is bad? We have to REVIEW it? What if the developer doesn't understand a certain design pattern and implemented it incorrectly? Hell... what if a bug or flaw is discovered during the review process?

    These are all common issues in everyday development. It doesn't necessarily mean the developer is BAD. Rather... the developer is HUMAN.

    Although... with your lack of a code review process, lack of system design process, and lack of formal check-in process... I am surprised that any decent code gets written at all.
    The best work is emphatically not done by 1400 people in the Redmond campus. The best work is done by culling experts of individual niche areas from around the globe. Not surprisingly, this is the model that Linux and most Open Source software uses, and that is why OSS is phenominally successful compared with any of its proprietary competition.
    You're comparing apples and tractors. Financial gain or customer/user base size are NOT measures of good code, excellent development standards, or strong design processes. Although, I'm not certain you will understand what I'm saying here.

    There is some excellent open-source software out there. Likewise, there is some excellent proprietary software out there.

    And there is crappy software out there, too... for both worlds. Whether or not something is open source or proprietary says nothing about how it is written or how well it is designed.

    This obviously is a huge troll that I'm feeding here.
  9. Re:A recipe for disaster by WaKall · · Score: 2, Insightful

    I think you're way off-base on the Code Review comment. At my job, code reviews take roughly 3-5% of the time I spend writing code, and very often find problems (in having my own code reviewed, or reviewing someone else's code).

    And even if no bugs are found, it helps to have another pair of eyes go over code for readability. It may make sense to you, but it may not to someone else. When you leave and that code needs fixing, NOBODY will understand it because they don't have preconceived notions about how it operates.

    The BEST thing you can do to improve quality of code, and of your developers, is to code review before every checkin. Find someone who is more/as intelligent as you, and have them scour your code while sitting next to you. At worst, they'll understand the code. At best, you'll find a bug before it goes out there, or you'll learn something new about the language/library you're using that you didn't know before.

  10. Re:A recipe for disaster by marick · · Score: 5, Insightful

    # Code review. Code review is a power trip and best, and a drain on morale at worst. If a programmer cannot be trusted to develop excellent code, he should be replaced with somebody who can. It's a tight labor market on the developers' side, so incompetent programmers should be spending their time reading O'Reilley books instead of playing games and looking at porn in their parents' basement.

    No, no, no. Code-review is VERY USEFUL. No, it won't catch architecture mistakes (necessarily). No, it won't catch design mistakes. Hopefully you already know how to design before you get your first software job.

    What code-review catches is the annoying things that the best developers tend to think don't matter so much. Style-differences from company practices. Naming conventions not being followed. Poorly chosen variable-names. Lack of documentation.

    In short, code-review makes your code more maintainable. Your company may not use it, but that doesn't make it useless.

  11. Can't pull IE from Windows, huh? by davebo · · Score: 5, Insightful

    Microsoft claims IE can't be separated from the OS. Yet, the presentation points out the code is broken into 16 sub-projects, largely isolated from each other, and separately buildable.

    Two of those projects were "INetCore" and "INetServices".

    So why can't you just build 2K without those 2 subprojects, or just stubs inserted for the functions declared in those projects?

    1. Re:Can't pull IE from Windows, huh? by hyoo · · Score: 3, Insightful

      Separately buildable does not imply separately runnable.
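
      A toy sketch of that distinction, in Python rather than anything resembling the real Windows code. INetCoreStub, Shell, open_url and show_help are invented for illustration; nothing here reflects how the actual INetCore subproject is structured.

```python
class INetCoreStub:
    """Hypothetical stand-in for a removed subproject: same names, no behavior."""

    def open_url(self, url: str) -> bytes:
        raise NotImplementedError("INetCore was stubbed out of this build")


class Shell:
    """Hypothetical component written on the assumption that the real code exists."""

    def __init__(self, inet) -> None:
        # Wiring up the stub works fine: every name the shell needs resolves.
        self.inet = inet

    def show_help(self) -> str:
        # The missing behavior only surfaces when this path actually runs.
        return self.inet.open_url("help://index").decode()


def try_show_help(shell: Shell) -> str:
    """Run the dependent code path and report whether it survived."""
    try:
        return shell.show_help()
    except NotImplementedError as exc:
        return f"runtime failure: {exc}"


if __name__ == "__main__":
    shell = Shell(INetCoreStub())  # "builds" without complaint
    print(try_show_help(shell))
```

      The stub satisfies everything that is checked when the pieces are put together, and fails only once something calls into it, which is presumably the gap between "buildable" and "runnable" here.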

  12. Xeons by xrayspx · · Score: 3, Insightful

    They have to be Xeons, AFAIK, non-Xeon Intel CPUs won't do 4-way. And even if you CAN do 4 way on regular PIII's, which you cannot, MS wouldn't, they would have Xeons.

    I'm imagining this machine to be a Compaq 6400r or the like, from the timeframe of the build it's probably 550s or 700(?), since they have a very close relationship to Compaq for servers.

  13. Re:read a book by selan · · Score: 2, Insightful

    I really enjoyed reading Showstopper. It's very well written and tells interesting stories about the people behind NT. I was surprised by the amount of work and testing that went into NT. Actually raised my opinion of the lowly Microsoft coders (not the brass, though). The book also goes into the sordid history of how Microsoft shafted IBM and OS/2 by making NT for Windows only. Very good read.

  14. I see you missed the point by Anonymous Coward · · Score: 2, Insightful

    If you go through the whole presentation, you will see that when the Windows 2000 team was working as a single centrally-co-ordinated group, it was slowing them down and causing stagnation.

    It was only when they broke up into smaller, relatively independent groups, that they managed to regain some of the earlier productivity from their NT days.

    And that's how Linux development works -- many small, relatively independent development groups.

    With Windows, Microsoft even ties the applications into the Operating System, with the purpose of making the use of non-MS applications "a jolting experience" (as an internal Microsoft memo said about IE versus Netscape).

    But with Linux, everything is developed to standardized interfaces. That makes it possible for the Kernel development to progress independently of the GUI development, which is independent of the development of the Desktop Managers, which is independent of the development of the Applications, which is independent of Distribution Packaging, and so on.

    Even within the larger projects, such as the Kernel, or Mozilla, the work is divided into smaller, relatively independent modules.

    And that's one of the reasons why Linux development is progressing so much faster than Windows.

  15. The numbers aren't that large by Twillerror · · Score: 3, Insightful

    Remember that Windows 2000 is essentially everyone who is working on the Linux kernel, basic distribution, and X. If the number includes Explorer, which could be likened to Mozilla, and includes management, testers, and all the design specialists (people who do research to make it user friendly, or handicap accessible), I would think it's pretty small.

  16. NT kernel problem is not software engineering by g4dget · · Score: 4, Insightful
    Compared to the rest of Windows, the NT kernel seems reasonably well engineered. The problem I think is that the end product is a combination of features that marketing thinks really need to go in there for their feature check lists, and pet ideas of the developers/researchers.

    UNIX and Linux are different. UNIX (at least Research UNIX) was constrained by its paradigms: it was vigorously policed by its developers. For Linux, something doesn't make it into the kernel unless it really scratches an itch that a lot of people have--the feedback is immediate and direct: no interest, no developers.

    Microsoft software development doesn't operate in a competitive market of ideas (let alone a competitive market), it doesn't have a paradigm to focus it, and it doesn't even have resource constraints to focus it. It's nice that they make the software engineering work out, but the end result still is mediocre at best.

  17. Re:Your design process is the real disaster recipe by Bouncings · · Score: 3, Insightful
    Apparently... to you... if it works... ship it!
    If I had a frag each time I heard a manager say something almost verbatim to that, I would be Quake champion of the universe. :) "If it works, ship it" is the creed of all of corporate America, not just with software. Remember that article a while ago on why software sucks? It's because programmers are rarely allowed to write software that doesn't suck. It's a mandate. Quality control is something us programmers have wanted for a long time, along with the occasional chance to refactor, to document, and to test. Such luxuries are never afforded to us, but we always get the blame from the users when PHBs force each step of the development process prematurely forward, if not skip steps entirely.
    --
    -- Ken Kinder ken@_nospam_kenkinder.com http://kenkinder.com/
  18. Re:What no sacrifices to the gods? by SysKoll · · Score: 2, Insightful

    Indeed. You need to sacrifice at least the mythical all-redhair goat if you want to get 3 days of uptime with NT5.0 a.k.a. Win2000

    I saw a sig saying: Taking software security advice from Microsoft is like taking airline security advice from Bin Laden.

    I disagree. Bin Laden proved that he knew a lot about airline security (and how to defeat it).

    For all its self-congratulation, MS still does not know how to achieve code quality in a large software project. They do a lot of wide-and-shallow useability studies, but they pay as much attention to reliability testing as Hollywood pays attention to scriptwriters (i.e., not a lot. Remember the old joke? "How do you spot a blonde would-be actress in a movie cast? She sleeps with the writer.")

    -- SysKoll
    --
    Mad science! Robots! Underwear! Cute girls! Full comic online! http://www.girlgeniusonline.com/

  19. Re:MS Coders by God!+Awful · · Score: 3, Insightful

    Whuh... Did I just hear a slashdot geek call Microsoft employees even bigger geeks? I'd take what your friend told you with a grain of salt, especially since it doesn't make much sense. I actually worked at Microsoft for a few months as a university student, and my impression was that the workers there were pretty normal, as far as coders go. I've met a lot weirder people since then.

    I was working on a pretty trivial part of NT, so the build system didn't affect me. However, when you walked around the halls you could see who checked in code that broke the build because they would have a "build breaker award" taped to their office window. It seemed to be in good fun, but I suppose it could result in a CYA mentality.

    Also, I remember there being problems with source control, like the article mentioned, though not specific to NT. I seem to remember that Word Viewer used a different codestream from Word and the sample files in the SDK are merely very out-of-date versions of some of the small apps that ship with Windows.

    -a

  20. Wrong point of reference by Anonymous Coward · · Score: 1, Insightful

    most MS people I know hail win2k as the best microsoft OS ever

    It's their least bad OS ever. Still closer to that end of the spectrum than the other.

  21. Microsoft Found Solutions to Their Problems by AaronLuz · · Score: 5, Insightful

    Given the tone of most of the comments here, one might think that the slides merely reveal Microsoft's errors. In fact, they indicate what problems the company faced scaling their NT development team from 200 to 1400 programmers and their solutions. The conclusion is, "With the new environment in place, the team is working a lot like they did in the NT 3.1 days with a small, fast moving, development team."

    As Linux grows, it is headed for the same sorts of problems. The open source movement can learn a lot from Microsoft's struggles. The fact that Linus opted to use a new source control system -- just as Microsoft realized that their in-house system was not up to the task and so switched -- gives me hope.

    P.S. May we please have better summaries for the articles on the front page?

  22. Unfortunately, the goals were changed by Anonymous Coward · · Score: 2, Insightful

    The original team's goals, as you listed them, look great, especially for a server OS, and the positive results showed in NT 3.51.

    Unfortunately, Mr. Gates was not as technically competent as the original NT team, and he provided a new set of goals for NT 4.0:

    1. Snazzy GUI - For a server OS!

    2. Good GUI performance - Run video drivers in kernel mode (ring zero).

    3. Cool new features - Good games platform.

    4. Other considerations, such as reliability and compatibility, are secondary.

    The fact that Bill Gates overrode the goals of the original team made it impossible for NT 4.0 to compete with Unix in the high-end server market.

    Now, even if Microsoft happened to get it right with Windows 2000, no one will believe them.

    Unfortunately, if you look at Netcraft's measurements, you will find that the major W2K sites have uptimes averaging less than 10 days, while the major Linux sites tend to have uptimes of around 100 days. That, combined with all the security holes in Microsoft's software, will tell you that they still don't have it right.

  23. Why sad? by alienmole · · Score: 3, Insightful

    Not everyone can know everything. Why discriminate against good information based on its age?

  24. Re:Old news... by inkfox · · Score: 3, Insightful
    Guys, the PowerPoint slides for the Lucovsky presentation has been publicly downloadable for almost 2 years. I always find it sad when Slashdot reports something old as something new.
    It was still probably news to most here. And it's interesting. Both make it a good story.
    --
    Says the RIAA: When you EQ, you're stealing bass!
  25. Re:A recipe for disaster by DuranDuran · · Score: 4, Insightful
    > Code review is a power trip and best, and a drain on morale at worst.

    Can you see the spelling error you made in this sentence? Did you mean to make that error?

    If you can't even type error-free prose, how could you be expected to create error-free code?

    People make errors. Code review helps reduce the effect of those errors.

    --
    "You can justify anything by putting it in quotes, adding a famous name and making it a sig" - Albert Einstein
  26. Re:SourceDepot = Perforce != VSS by Dom2 · · Score: 2, Insightful

    And rightly criticized if I might add. Too many comments obscure the code and duplicate what it says a lot of the time. Good comments are an art form.

    -Dom

  27. Re:SourceDepot = Perforce != VSS by anshil · · Score: 3, Insightful

    This is not in any way funny.

    A lot of Windows people really believe that if they use gcc as their compiler (or use bison, or edit the code with vim, and so on), their code has to be GPL. It's FUD, and jokes like this only HURT GNU, GPL, Linux. This kind of humor is just too expensive, as people who don't know the background actually believe this kind of stuff; it's fear from the FUD they heard from the MCSEs.

    --
    Karma 50, and all I got was this lousy T-Shirt.