Slashdot Mirror


SGI And /Massive/ Linux Machine

Thanks to some of the folks from SGI for sending us some information about their latest project. Pretty interesting project -- the largest configuration has 10 PCI busses (busi?) with 24 scsi controllers and 10 disks. And wait'll you see the rest of the stats.

Hi all,

Just thought I would send out a note outlining the state of the mips64 port. Ralf, Ulf and I have been actively working for the past few months to bring up Linux on the SGI ccNUMA machines.

The executive summary: we have achieved multiuser boot on o200 and o2000s. The largest configuration is a 32p, 16-node machine (only approx 4G worth of memory was populated over the 16 nodes; the system can take 4G * 16 nodes worth of memory). This machine has 10 PCI busses, with 24 SCSI controllers and 10 disks. (Sample output is at OSS SGI.)

If you are interested in the system architecture and details of the port, read on. The o2000s use the R10000 series of MIPS processors. Each machine is made up of modules; each module has 4 node boards, with a maximum of 2 CPUs and 4G of memory on each node, plus IO boards and routers. In a module, the two alternate node boards are each connected to an XBOW. Each XBOW may be connected on the other side to a number of PCI busses, which is what the IO boards connect to. Apart from this, there are routers in the system that provide connection paths between all memory and all CPUs, to create a true CC-NUMA architecture.
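As a rough check on the numbers above, here is a minimal sketch that derives the maximum configuration from the per-module and per-node limits given in the post. The function and class names are made up for illustration; only the three constants come from the text.

```python
from dataclasses import dataclass

# Limits quoted in the post.
NODES_PER_MODULE = 4      # each module has 4 node boards
MAX_CPUS_PER_NODE = 2     # max 2 CPUs per node
MAX_MEM_GB_PER_NODE = 4   # max 4G of memory per node


@dataclass
class Capacity:
    modules: int
    max_cpus: int
    max_mem_gb: int


def capacity(nodes: int) -> Capacity:
    """Upper bounds for a machine with the given number of node boards."""
    # Round up: a partially filled module still counts as a module.
    modules = -(-nodes // NODES_PER_MODULE)
    return Capacity(
        modules=modules,
        max_cpus=nodes * MAX_CPUS_PER_NODE,
        max_mem_gb=nodes * MAX_MEM_GB_PER_NODE,
    )


if __name__ == "__main__":
    # The 16-node machine from the executive summary.
    print(capacity(16))
```

For the 16-node machine this gives 4 modules, 32 CPUs (the "32p" above) and a 64G memory ceiling, of which only about 4G was populated.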

On the software side, we are still struggling with compiler and binutils issues. The kernel itself is 64 bits, created by cross compiling on an ia32 box. We have not attempted 64 bit user program compilation or execution. The root disk is currently very close to the MIPS/Indy root disks. The architecture specific code uses the CONFIG_DISCONTIGMEM code to support memory on all nodes. The architecture specific NUMA features currently are: 1. replicate the kernel text on all nodes, so that no one node becomes a memory hot spot (unfortunately, the kernel data has to reside on only one node). 2. replicate low level exception handler code on all nodes. The architecture code also turns on CONFIG_NUMA to take advantage of node-local page allocations. (A CONFIG_NUMA patch that I have been submitting to Linus was put into the kernel in test6-pre1.) For more information on NUMA and ongoing work, refer to this document.
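To illustrate what "node-local page allocation" means in general terms, here is a toy sketch: a request is served from the requesting CPU's own node when it has free pages, and falls back to another node otherwise. This is not the kernel's code. The `Node` class, `alloc_page` function, and the fallback order (nearest node id first) are all assumptions made for the example.

```python
class Node:
    """A memory node with a simple count of free pages (hypothetical model)."""

    def __init__(self, node_id, free_pages):
        self.node_id = node_id
        self.free_pages = free_pages


def alloc_page(nodes, local_id):
    """Allocate one page, preferring the node with id `local_id`.

    Returns the id of the node that served the request, or None if
    every node is out of free pages.
    """
    # Assumed fallback policy: try nodes in order of id distance from the
    # local node. The local node has distance 0, so it is always first.
    for node in sorted(nodes, key=lambda n: abs(n.node_id - local_id)):
        if node.free_pages > 0:
            node.free_pages -= 1
            return node.node_id
    return None


if __name__ == "__main__":
    nodes = [Node(0, 1), Node(1, 0), Node(2, 2)]
    print(alloc_page(nodes, 2))  # served locally by node 2
    print(alloc_page(nodes, 1))  # node 1 is empty, falls back to a neighbour
```

The point of the local-first order is that a CPU touching memory on its own node avoids a trip through the routers; the fallback only keeps allocations from failing while remote memory is still free.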

The purpose of doing this port is to boot Linux on the bigger systems that we have, in order to do cpu/memory scalability studies. This also lets us do NUMA performance work in the future. Another advantage is being able to leverage this work on the upcoming SGI CC-NUMA Itanium boxes, which will be an SGI supported product. Initial results from scalability studies using mips64 are documented at the OSS SGI site.

Kanoj

2 of 72 comments

  1. Someone... by enneff · · Score: 5
    ...give this guy a fat ip pipe and a gnutella node! This machine has 10 PCI busses, with 24 scsi controllers and 10 disks. !!!!!


    nf

  2. A perfect machine for render land and other uses by tolldog · · Score: 5

    This is a great machine for rendering or any other application that is both CPU and memory bound.

    Some jobs do not parallelize well, such as individual frame rendering. With 24 boxes, the 5+ minute overhead of loading the scene file, plus the memory spent on loading the textures and the geometry, would be incurred on each machine, costing you 24 times the overhead of doing it on one machine. Trying to do this with a "quasi" shared memory system would remove that hideous overhead, but would kill the network.
    Doing this on a NUMA box fixes all of those problems. The memory is shared. The procs all look like one machine. The system runs smooth and well.

    This is why SGI is still in the large graphics server environment. People want individual frames done fast.

    The benefit of this being a linux box and not Irix....
    I, a huge linux vs. irix advocate, struggle to see why this would be good. Most of the apps that I would use are built for Irix first and then Linux (like Maya's renderer). I can see where others might have custom apps to use this, but the code would probably port to Irix just as easily as it would to Linux on the MIPS.

    It is a step in the right direction, IA64 NUMA boxes running linux. The ultimate in render farm machines.

    --
    -I just work here... how am I supposed to know?