Slashdot Mirror


Sandia Studies Botnets In 1M OS Digital Petri Dish

Ponca City, We love you writes "The NY Times has the story of researchers at Sandia National Laboratories creating what is in effect a vast digital petri dish able to hold one million operating systems at once in an effort to study the behavior of botnets. Sandia scientist Ron Minnich, the inventor of LinuxBIOS, and his colleague Don Rudish have converted a Dell supercomputer to simulate a mini-Internet of one million computers. The researchers say they hope to be able to infect their digital petri dish with a botnet and then gather data on how the system behaves. 'When a forest is on fire you can fly over it, but with a cyber-attack you have no clear idea of what it looks like,' says Minnich. 'It's an extremely difficult task to get a global picture.' The Dell Thunderbird supercomputer, named MegaTux, has 4,480 Intel microprocessors running Linux virtual machines with Wine, making it possible to run 1 million copies of a Windows environment without paying licensing fees to Microsoft. MegaTux is an example of a new kind of computational science, in which computers are used to simulate scientific instruments that were once used in physical world laboratories. In the past, the researchers said, no one has tried to program a computer to simulate more than tens of thousands of operating systems."

30 of 161 comments

  1. Life imitates XKCD by Tackhead · · Score: 5, Interesting

    Once again, life imitates XKCD: Network.

    1. Re:Life imitates XKCD by The_mad_linguist · · Score: 4, Informative

      Well, given that XKCD was imitating an old hacker competition...

    2. Re:Life imitates XKCD by dintlu · · Score: 3, Informative

      Goes to show that ideas are a dime a dozen.

      Implementing something like this is what makes the news.

  2. I've got an easier way by iamapizza · · Score: 3, Insightful

    what is in effect a vast digital petri dish able to hold one million operating systems at once in an effort to study the behavior of botnets

    If they've set up this mini-internet and have set up this botnet, then the easiest way to understand its behavior would be to look at the source code.

    --
    Always proofread carefully to see if you any words out.
    1. Re:I've got an easier way by Sta7ic · · Score: 3, Insightful

      Just like the easiest way to understand how a dog works is to dissect them.

      In short, no. You can figure out how some of the parts work, but there's a lot within complex software that is non-deterministic, whether for internal, external, or thoroughly inadvertent reasons on either side. Just because you _think_ you know what it's doing doesn't mean it'll act the way you expect it to.

      Also, see http://xkcd.com/397/

    2. Re:I've got an easier way by caramelcarrot · · Score: 5, Insightful

      Simple rules can give rise to complex behaviour. Who knows what the botnet might do? It could have harmonic resonances, it could have phase changes at critical infection rates, it could do all sorts of interesting and complex behaviour. Looking at the source code won't tell you any of this.

    3. Re:I've got an easier way by voidphoenix · · Score: 3, Insightful

      You can't study emergent behavior by studying source code. Even within one host, the interactions between malware, applications and every piece of the OS would already have emergent properties. Magnify by tens of thousands to millions (exponentially, not additively or multiplicatively), and the sheer complexity of the entire system would overwhelm our ability to understand it.

      We have ~100 billion neurons and ~100 trillion synapses. At 2^N - N - 1 subgroups, how many pieces before the system's complexity outruns our brain's processing power? A network of 47 pieces has ~140 trillion subgroups. With several million pieces...
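The arithmetic in the comment above can be checked with a few lines of Python. This is only a sketch: the function name `subgroup_count` is made up for illustration, and the 100 trillion synapse figure is simply the round number quoted above.

```python
def subgroup_count(n: int) -> int:
    """Subsets of n pieces that contain at least two members: 2^n - n - 1."""
    return 2**n - n - 1


SYNAPSES = 100 * 10**12  # ~100 trillion, the round figure quoted in the comment

# Find the smallest network whose subgroup count exceeds the synapse figure.
n = 1
while subgroup_count(n) <= SYNAPSES:
    n += 1

print(n, subgroup_count(n))
```

Under these assumptions the loop stops at 47 pieces, the network size the comment cites, with roughly 140 trillion subgroups.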

    4. Re:I've got an easier way by swillden · · Score: 4, Insightful

      If it's unclear what the code does, run it in a debugger and control the inputs. Step through the code line by line. If the debugger doesn't do everything you want, write a better debugger.

      Is that right?

      Here, I'll describe a program so simple it can be coded in under 100 lines, and can be fully specified in a few sentences, then ask you a question about its behavior. It should be easy, right?

      There is a 100x100 grid of cells. Each cell is in one of two states "live" or "dead". Each cell has 8 neighbors, the cells horizontally, vertically and diagonally adjacent (the edges of the grid "wrap", so this is true even for edge cells). Each "generation", the state of the cells is updated according to the following rules:

      1. Any live cell with fewer than two live neighbours becomes dead.
      2. Any live cell with more than three live neighbours becomes dead.
      3. Any dead cell with exactly three live neighbours becomes live.
      4. All other cells remain unchanged.

      That's it. Now, given an initial state of the grid, tell me what the state is after 100, 500 and 1000 generations. Further, tell me whether or not any patterns of live cells will survive across generations. Will patterns repeat? Can patterns move? Interact?

      Amazing complexity can arise from very simple rules. In this case (known as Conway's Game of Life, if you hadn't recognized it), the above rules contain enough power that if you make the grid infinite in size, the result is a Turing-complete computation system. In addition, the shifting patterns it creates are bewildering in their number, complexity and behavior.
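The rules described above really do fit in well under 100 lines. Here is a minimal Python sketch, assuming the grid is stored as a set of live `(row, col)` coordinates; the names `step`, `run` and `SIZE` are illustrative choices and not anything from the comment.

```python
from collections import Counter

SIZE = 100  # the grid is SIZE x SIZE and wraps at the edges


def step(live, size=SIZE):
    """Advance one generation; `live` is a set of (row, col) live cells."""
    # Count, for every cell, how many of its 8 wrapped neighbours are live.
    counts = Counter(
        ((r + dr) % size, (c + dc) % size)
        for r, c in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # Exactly 3 neighbours: the cell is live next generation (birth or survival).
    # Exactly 2 neighbours: the cell keeps its current state.
    # Anything else: the cell is dead next generation.
    return {cell for cell, k in counts.items() if k == 3 or (k == 2 and cell in live)}


def run(live, generations, size=SIZE):
    """Apply `step` repeatedly and return the final set of live cells."""
    for _ in range(generations):
        live = step(live, size)
    return live


# A "glider", one of the small patterns that moves across the grid.
glider = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}
after_four = run(glider, 4)
print(sorted(after_four))
```

After four generations the glider reappears shifted one cell down and one cell to the right, which answers "can patterns move?" for at least this one starting state. Answering the same question for an arbitrary initial grid still means running it.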

      Now scale that up to thousands of lines of code. Granted, not code specifically chosen to create interesting interactions, but still 2-3 orders of magnitude more complex. Further, code that itself lives in and interacts with a complex and varied ecosystem of other code, some of which is trying to detect the code and kill it -- so the code is written to be self-modifying, to "mutate" a bit, after a fashion. Also add in the ability to migrate between "ecosystems", reproduce, receive deliberate external updates and instructions, etc.

      Simulation is the only way to get a handle on this sort of thing. And that's why the very smart people who designed and built the world's first million-machine simulator decided to do it.

      --
      Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
  3. They can't afford an MSDN subscription? by n0tWorthy · · Score: 3, Funny

    Then they can run 1 million copies without a subscription.

    --
    "Be kind, for everyone you meet is facing a great battle." - Philo of Alexandria -
    1. Re:They can't afford an MSDN subscription? by Anonymous Coward · · Score: 3, Informative

      Someone marked this as 'funny' but it is true. Read the license: it is per user. If you're creating a cluster with THOUSANDS of nodes and testing things, you are perfectly within your rights to do this. You can even get most of the different versions of the OS going: 98, 98se, 95 (shudder), ME (double shudder), NT4, 2k, XP, Vista, 7, etc., with different versions at different patch levels.

      http://msdn.microsoft.com/en-us/subscriptions/cc150618.aspx

      They lost me at Wine, as that would not truly create the environment they are trying to describe.

      I have had up to 100 desktops all going from 10 MSDN licenses (10 users), with different levels of the OS to test install and different configurations. They probably don't even need a very high level of it.

  4. Is that really a windows environment? by damn_registrars · · Score: 5, Interesting

    I understand not wanting to buy 1M Windows licenses; I am of the persuasion that is not inclined to buy 1 license.

    However, the summary seems to claim that Wine == Windows environment. I don't see how they are analogous in this sense. In particular, if you are trying to understand botnet behavior, you need infected botnet systems. Is there a way to make Wine vulnerable to the infections that frequently hit Windows systems?

    --
    Damn_registrars has no butt-hole. Damn_registrars has no use for a butt-hole.
    1. Re:Is that really a windows environment? by mcrbids · · Score: 5, Insightful

      I don't see how they are analogous in this sense. In particular, if you are trying to understand botnet behavior, you need infected botnet systems. Is there a way to make Wine vulnerable to the infections that frequently hit Windows systems?

      WINE is an implementation of the Win32 API. Since the *target* of WINE is to emulate Windows, then in order to be successful, it must implement the bugs as well. So the better WINE is, the better it runs *ALL* Windows software - including the viruses and malware!

      I would assume (ass + u + me) that they've done enough unit testing on the particular botnet software in question to determine its compatibility with WINE, and so long as this compatibility is sufficient, then this could be a very useful test environment. It's the botnet being studied, not Windows itself!

      Another example: Windows 2000. I build data management software. I test with Windows 2000. Not because Win2000 is an example of the latest greatest from MS, but because it costs me nothing extra and runs nicely in a VM. Since the only O/S features I care about are those that are already present in Win2000, it creates a very useful test environment despite lacking many pieces present in later OS versions.

      --
      I have no problem with your religion until you decide it's reason to deprive others of the truth.
    2. Re:Is that really a windows environment? by Anonymous Coward · · Score: 4, Insightful

      I can't possibly imagine how a simulation of millions of instances of your software infecting itself would be good PR.

    3. Re:Is that really a windows environment? by MaskedSlacker · · Score: 4, Informative

      I think you're misunderstanding what they are doing. They are not studying in-the-wild worms. They are trying to build theoretical models of botnets and how they propagate through networks--this is the equivalent of computer simulations of viral epidemics. You don't need to simulate what the virus does in a person to study how it spreads through a population.
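To make the epidemic analogy concrete, here is a toy propagation model in Python. It is purely a sketch under invented assumptions and not the Sandia team's method: the random peer lists, the infection probability `p_infect` and the function name `simulate_spread` are all hypothetical, and an infected host never recovers.

```python
import random


def simulate_spread(nodes=1000, contacts=4, p_infect=0.2, steps=30, seed=1):
    """Return the number of infected hosts after each step of a toy model."""
    rng = random.Random(seed)
    # Hypothetical topology: every host knows `contacts` randomly chosen peers.
    peers = [[rng.randrange(nodes) for _ in range(contacts)] for _ in range(nodes)]
    infected = {0}  # host 0 is the initial infection
    history = [len(infected)]
    for _ in range(steps):
        newly = set()
        # Each infected host tries once per step to infect each of its peers.
        for host in sorted(infected):
            for peer in peers[host]:
                if peer not in infected and rng.random() < p_infect:
                    newly.add(peer)
        infected |= newly
        history.append(len(infected))
    return history


history = simulate_spread()
print(history)
```

Nothing in this model depends on what the malware does inside a host; only the contact structure and the per-contact infection chance matter, which is the point the comment makes. Varying `contacts` and `p_infect` is one way to explore how quickly the infected count grows.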

    4. Re:Is that really a windows environment? by amicusNYCL · · Score: 3, Insightful

      The research isn't to determine how Windows reacts to a botnet. They're trying to figure out how the botnet itself communicates and spreads. Or, more specifically, what the botnet looks like as it is spreading. Windows is just the platform that they're running the botnet on (sort of), but they don't really care how Windows reacts to it.

      In other words, they're studying the botnet itself, not the infrastructure it runs on.

      --
      "Our two-party system is like a bowl of shit looking at itself in a mirror." - Lewis Black
    5. Re:Is that really a windows environment? by hairyfeet · · Score: 3, Insightful

      As an old greybeard PC repairman I can tell you that Windows bugs are screwing around with the guts of Windows more than any tweaked Wine could ever replicate. I don't see why they wouldn't just pony up for MSDN where they could then run all the real Windows versions they wanted and then get more realistic results. This seems like they are going pretty far out of their way to keep from spending a buck, when the cost of that monster PC makes being so "penny wise, pound foolish" seem extra crazy to me.

      But IMHO you aren't gonna see how a real botnet works without running real unpatched Windows boxes. I used to keep a box here in the shop for dropping bugs on to find the best ways to clean them (before cleaning got to be pointless) and the amount of crap some of these bugs were screwing with was just mind blowing, we are talking fake .tmp files, stuff hidden in places like program files/ windows media player, a couple that would even rip out different windows system files and replace them with their own hacked versions, just really crazy stuff. But since Wine is primarily a very tiny subset of the Windows subsystem I really don't see how they are gonna get any real results from this.

      If it was just some guys playing in their basement I would think "okay...maybe cool" but spending the amount they did on that "MegaTux" makes it just nuts not to buy an MSDN and run a real simulation. I feel this is a moment where we need the late Graham Chapman to come out in his military uniform and tutu and demand that they cease and desist for being just too silly, because spending all that cash to study Windows botnet behavior and then cheaping out on a ...what? $600 MSDN license? It is just too silly.

      --
      ACs don't waste your time replying, your posts are never seen by me.
    6. Re:Is that really a windows environment? by Antique+Geekmeister · · Score: 3, Funny

      Lease time on one of the larger botnets?

  5. Wine? by Facegarden · · Score: 4, Insightful

    I understand using WINE to avoid license fees, but wouldn't that potentially hinder the results of the experiment? I suppose that if they knew what functionality was needed by the botnet, they could be sure WINE provided what they needed, but it also seems like they might be able to work out a deal with MS to get a free site license for use in this test only, since it betters the computing world in general, which ultimately benefits Microsoft?

    Seems like a few phone calls might go a long way, if they get a hold of the right people.
    -Taylor

    --
    Worldwide Military budgets: $2100 billion. Worldwide Space Exploration budgets: $38 billion. Really, world? Really?
  6. But -- how can you infect it? by Nefarious+Wheel · · Score: 3, Funny

    My first thought meme was "Yes, but does it run Linux?" ("Megatux". Duh.) Then I thought - hang on, how can you develop a botnet that runs on Linux in the first place? And if you did, how would it reflect the nature of real botnets if those millions of operating systems weren't running NT4 or variants?

    Then it got surreal - I imagined all those bots emulating the Game of Life, with little dots flashing on and off, and little gliders and factories...

    Ok, I'll go back to work now.

    --
    Do not mock my vision of impractical footwear
  7. A few notes from Ron Minnich by coreboot · · Score: 5, Informative

    Hi, Ron here. Just thought I would mention a few things.
    I love the "life imitates xkcd" aspect. :-)
    We're well aware that Wine is not quite enough to run many Windows bots. Until a year or so ago, however, there was a researcher in North Carolina running Storm under Wine, but he told me that that effort ended when Storm added a kernel driver. We've got some ideas in that area. We expect that implementing them will cost less than 1 million Vista licenses.
    I was surprised to find I have become a cybersecurity expert! What I really am is an HPC expert who is using HPC tools and resources to build a system for studying cybersecurity phenomena on a millions-of-nodes scale.
    Doing anything with a million of something gets interesting fast. There's a lot of interesting challenges.
    Thanks
    ron

    1. Re:A few notes from Ron Minnich by PCM2 · · Score: 3, Interesting

      Well Ron, since you're here, I'm curious whether you had in fact tried to approach Microsoft for a free site license. You could explain to them that you're doing security research in a unique environment and that you'd be willing to share your results with them, etc. I could even imagine a distorted PR spin where the fact that all this major security research is being done on Windows shows that Windows is clearly the dominant operating system, blah blah...

      Or if Microsoft doesn't see the value of the kind of information your research could yield, maybe someone like Symantec would be willing to buy a license and donate it to you (if that's even possible, given EULAs etc.)?

      --
      Breakfast served all day!
    2. Re:A few notes from Ron Minnich by coreboot · · Score: 5, Informative

      We will probably approach MS at some point, if it appears to be necessary, and see if they are interested. I do have friends there who might be interested in what we're doing.
      The biggest limit we've found on the VM side is memory footprint of the VM guests, and it's very easy to control that with Linux; harder with Windows. We have some ideas in that area too, but it's way too early to speculate on them.
      But from my point of view, it is a lot easier to do this kind of work in Linux than in Windows (I have done NT drivers in a past life), not least because of the openness of the environment. Hence, I'd rather try to find a way to make it all work on Linux.
      Consider this work the beginning of the story; it's not even chapter 1, maybe it's the preface. There's a lot of work left to do. There's a lot we still don't know.
      thanks
      ron

  8. Re:WINE by monopole · · Score: 5, Funny

    I hope Microsoft issues a statement that only Genuine Windows software can fully support viruses and malware in an effective fashion.

  9. I would guess it wouldn't be a problem at all by Sycraft-fu · · Score: 5, Interesting

    I work for a university and MS is extremely generous with academic licensing. When it is for academics, like education or research, it is actually no cost. For infrastructure it does cost, but not very much. I bet if they asked MS, MS would give them all the licenses they needed for little or no cost.

    For that matter, they might be eligible for volume licensing. That is where you pay a fixed yearly fee and get unlimited use of the software it covers. Often that is based on total academic headcount, which might not be very much.

    Regardless, if they asked I'd give good odds MS would figure out a way to offer them a good deal.

    I'm also with you that if you want to study something, you need to run it on the actual environment. Wine is a neat idea and a neat goal, but anyone who has made use of it for more than simple testing will tell you that it has some serious issues. Not only do things not run, worse is that they'll run but not completely correctly. For a user this might be fine, something works in a bit of an unexpected way, you just work around it. For research though, it could mean your conclusion is invalid.

  10. Re:WINE by Eighty7 · · Score: 5, Funny

    In other news, Miguel de Icaza said that he believes botnet support is a good idea. Linux should support malware because Microsoft is going to win anyway, so Linux had better be prepared if it doesn't want to be locked out of the future markets, and presented a beta version of the software. Members of the Mono project are participating in the standardization.

  11. Re:Wine on Linux? by geegel · · Score: 3, Interesting

    Not necessarily.

    You might want indeed at some point to emulate an internet chock-full of unpatched machines, but other times you will probably want only a percentage of them to be this way, or you might want to study a particular vector of infection, or concurrent vectors of infection to see how they interact. The combinations are endless and so will probably be the number of WINE flavors used.

    --
    right...
  12. What about Norton Antivirus? by node+3 · · Score: 5, Interesting

    What about Norton Antivirus? Specifically they should run a second experiment with a simulation of 1 million systems running Norton Antivirus, and compare the results of the first test to see which has the greatest adverse effect...

  13. Not exactly. by khasim · · Score: 4, Insightful

    A patent on an IMPLEMENTATION of an idea is a good thing.

    A patent on an idea itself ... that's stupid. And that's what we're stuck with today.

  14. Old News... by davevr · · Score: 5, Funny

    There is already a system running somewhere around 420 million windows machines in a semi-private walled-off version of the internet, with no license fees paid to Microsoft, hosting several botnets and just about every virus under the sun.

    It is called "China".

  15. Linux for a reason... by Entropius · · Score: 4, Informative

    The researcher posted up above saying he's an HPC researcher, not a computer security guy, and in that context using Wine makes sense.

    HPC people typically study emergent behavior -- how a lot of nodes interacting by simple rules generate complicated phenomena. The challenge is coming up with the simple rules in a form that accurately captures whatever leads to the emergent behavior you want to model. In this case, "actually being Windows so all the viruses work exactly right" is less important than getting a lot of nodes running to capture the interesting behaviors of viruses spreading through a large network.

    Supercomputing is difficult on Windows. I'm at a computational physics conference now, and everything runs on Linux just because it's bloody *easier* to make everything go. I doubt many people here would even know *how* to run our models on a Windows supercomputer.

    Performance issues aside, my guess is that the fellow chose Linux because the computer *already* ran Linux.