Linux Gains Support for NUMA

Posted by CowboyNeal on Thursday January 30, 2003 @07:57PM from the big-iron-at-your-hip dept.

soosterh writes "CNet has an article about a NUMA patch from IBM. It says that the improvement adds some support in Linux for nonuniform memory access, or NUMA, a design for higher-end servers with many processors. Linus Torvalds, the original creator of the operating system and still its top authority, accepted the update this month into version 2.5, the current test version of the software."

Imagine a beowulf... by Gordonjcp · 2003-01-30 20:01 · Score: 3, Informative

Seriously, this is something that will close one of the last remaining gaps between Linux and Solaris. Not that it will do much good for 99% of users out there, but if you need this, you *really* need it.
And AMD... by addaon · 2003-01-30 20:02 · Score: 5, Informative

And, of course, this also means support for the Hammer architecture, which is (smaller-scale) NUMA. Each processor in an x86-64 system has its own memory bus, so the time to access memory depends on whether the memory is directly connected to a given processor or whether another processor needs to mediate, which is the definition of NUMA.

--

I've had this sig for three days.
NUMA by Anonymous Coward · 2003-01-30 20:09 · Score: 5, Informative

NUMA

Short for Non-Uniform Memory Access, a type of parallel processing architecture in which each processor has its own local memory but can also access memory owned by other processors. It's called non-uniform because the memory access times are faster when a processor accesses its own memory than when it borrows memory from another processor.

NUMA computers offer the scalability of MPP and the programming ease of SMP.
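To make the "non-uniform" part concrete, here is a minimal Python sketch of a toy two-node machine. The latency figures and the names `LOCAL_NS`, `REMOTE_NS`, `access_cost`, and `total_cost` are made up for illustration; they are not measurements from any real system.

```python
# Toy model of NUMA access cost. All numbers are illustrative assumptions.
LOCAL_NS = 100   # assumed latency when a CPU reads memory on its own node
REMOTE_NS = 150  # assumed latency when the read must cross to another node


def access_cost(cpu_node, mem_node):
    """Return the assumed latency (ns) for a CPU on cpu_node reading mem_node."""
    return LOCAL_NS if cpu_node == mem_node else REMOTE_NS


def total_cost(cpu_node, page_nodes):
    """Sum the assumed latency of touching each page once from cpu_node."""
    return sum(access_cost(cpu_node, node) for node in page_nodes)


if __name__ == "__main__":
    pages = [0, 0, 0, 1]         # three pages on node 0, one on node 1
    print(total_cost(0, pages))  # prints 450 (mostly local accesses)
    print(total_cost(1, pages))  # prints 550 (mostly remote accesses)
```

The same process touching the same pages pays a different price depending on which node it runs on, which is the whole point of the definition above.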
What about the feature freeze? by Screaming+Lunatic · 2003-01-30 20:18 · Score: 3, Informative

I thought there was a feature freeze. There must have been some NUMA code in the kernel already, and this cleans up the loose ends.
Someone correct me if I'm wrong.
1. Re:What about the feature freeze? by greppling · 2003-01-30 20:27 · Score: 5, Informative
  
  The NUMA-aware scheduler was merged recently despite the feature freeze. The patch was considered non-intrusive (and safe for non-NUMA architectures). Feature freeze is not code freeze.
  See the good discussion in the LWN article on this topic.
Re:32/64 by larien · 2003-01-30 20:36 · Score: 2, Informative

I'd imagine it's mainly for 64-bit as that's the kind of systems which tend to ship with NUMA (usually with MIPS or Itanium). Without knowing more, I couldn't comment as to whether it will work under 32-bit or not, but I can't see how it would be so limited.
Also, I seriously doubt if any desktop machine will use NUMA; it's primarily about systems which use system boards, where there are CPUs & RAM on a board which slots into the system & a CPU can access memory on a local board faster than that on other boards. Desktops tend to use one "system board" (i.e. the motherboard) so there isn't the difference in speed for accessing the data.
Re:A lot of talk about NUMA by Twirlip+of+the+Mists · 2003-01-30 20:42 · Score: 3, Informative

Oh, for cryin' out loud. Dude, there's this thing called Google. Try it out some time.

That said, I'll give you a hint: non-uniform memory access. If you've got a computer that uses different banks of memory as a single physical address space, then that computer has a NUMA architecture.

If you want to maintain cache coherency across a NUMA system, you have to employ some tricks. These tricks are sufficiently complex to warrant their own name: ccNUMA.

--

I write in my journal
Re:Isn't there some other numa stuff already in? by hansendc · 2003-01-30 20:56 · Score: 2, Informative

NUMA refers to a wide range of features: everything from multipath networking or SCSI, to memory allocation, to placing processes close to "good" memory. This particular patch simply makes processes run on CPUs where they're likely to be close to memory which they will need.
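As a rough illustration of "placing processes close to good memory", here is a toy Python policy that picks a CPU on whichever node holds most of a process's pages. This is not how the kernel patch works; `preferred_node`, `pick_cpu`, and the `cpus_by_node` layout are hypothetical names for the sketch, and it assumes the process has at least one page.

```python
from collections import Counter


def preferred_node(page_nodes):
    """Return the node holding most of the pages (ties go to the lowest node id)."""
    counts = Counter(page_nodes)
    return min(counts, key=lambda node: (-counts[node], node))


def pick_cpu(page_nodes, cpus_by_node):
    """Pick the first CPU on the preferred node (a deliberately naive policy)."""
    return cpus_by_node[preferred_node(page_nodes)][0]


if __name__ == "__main__":
    cpus_by_node = {0: [0, 1], 1: [2, 3], 2: [4, 5]}  # assumed machine layout
    pages = [0, 1, 1, 2]                              # node of each page
    print(preferred_node(pages))          # prints 1
    print(pick_cpu(pages, cpus_by_node))  # prints 2
```

A real scheduler also has to weigh CPU load against locality, so treat this as the locality half of the problem only.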
Re:32/64 by hansendc · 2003-01-30 20:59 · Score: 2, Informative

"I'd imagine it's mainly for 64-bit as that's the kind of systems which tend to ship with NUMA (usually with MIPS or Itanium). Without knowing more, I couldn't comment as to whether it will work under 32-bit or not, but I can't see how it would be so limited."

That is an incredibly naive comment. NUMA systems have been around for quite a while (think Sequent), and the current generation of IBM x440 servers is NUMA. These are all 32-bit Intel architectures.

This patch didn't even address memory, it only dealt with scheduling processes anyway.
Re:Isn't there some other numa stuff already in? by Error27 · 2003-01-30 20:59 · Score: 3, Informative

You are correct. The LWN article on this just became available to non-subscribers and you can read it here: http://lwn.net/Articles/20741/

(BTW. Everyone should subscribe to LWN. It's an exceptional value)
SGI's systems (was Re:32/64) by grey1 · 2003-01-30 21:28 · Score: 3, Informative

The MIPS/Itanium systems the parent refers to are (I assume) the SGI Origin and Altix multiprocessor servers, both 64-bit, the first MIPS/IRIX, the second Itanium/Linux:

Origin

Altix

--
"we demand rigidly defined areas of doubt and uncertainty!"
Re:Linux lacks democracy by Lussarn · 2003-01-30 21:52 · Score: 2, Informative

You do not have to run Linus' stock kernel.

No two vendors ship the same kernel. So in the end it's up to the vendor you use to tweak your kernel. Red Hat's kernels are heavily patched to suit what they believe are their users' needs.

I think that's a good system.
Linus' Acceptance... by sd790 · 2003-01-30 23:12 · Score: 2, Informative

...can be found here.
It's time to stop reading slashdot when... by nibelung · 2003-01-30 23:35 · Score: 4, Informative

they are copying Linux-related news from CNET.
Recent Patch Modification by LJPeixoto · 2003-01-31 00:15 · Score: 5, Informative

"More recently, the NUMA scheduler patch has been reworked (by Martin Bligh, Erich Focht, Michael Hohnbaum, and others) around a simple observation: most of the NUMA problems can be solved by simply restricting the current scheduler's balancing code to processors within a single node. If the rebalancer - which moves processes across CPUs in order to keep them all busy - only balances inside a node, the worst processor imbalances will be addressed without moving processes into a foreign-node slow zone. A simple (three-line) patch which did nothing but add the within-node restriction yielded most of the benefits of the full NUMA scheduler; indeed, it performed better on some benchmarks. Real-world loads, however, will require a scheduler which can distribute processes evenly across nodes. Occasionally it is necessary, even, to move processes to a slower node; a lot of CPU time on a lightly-loaded node will give better performance than waiting in the run queue on a heavily-loaded node. So a bit of complexity had to be added back into the new scheduler to complete the job."

Extracted from:
http://lwn.net/Articles/20741/
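The two-level idea in the quote (balance inside a node first, cross nodes only when the imbalance is large) can be sketched in Python. This is a toy model, not the kernel code: the run queues are plain lists, and `busiest_and_idlest`, `balance_within_node`, `balance_across_nodes`, and the `threshold` cutoff are all assumptions made for illustration.

```python
def busiest_and_idlest(loads):
    """Return (key with highest load, key with lowest load) from a {key: load} dict."""
    return max(loads, key=loads.get), min(loads, key=loads.get)


def balance_within_node(queues, node_cpus):
    """Move one task from the busiest to the idlest CPU of a single node.

    queues: {cpu_id: [task, ...]}; node_cpus: CPU ids belonging to that node.
    Returns True if a task was moved.
    """
    loads = {cpu: len(queues[cpu]) for cpu in node_cpus}
    src, dst = busiest_and_idlest(loads)
    if loads[src] - loads[dst] < 2:  # a move would not reduce the imbalance
        return False
    queues[dst].append(queues[src].pop())
    return True


def balance_across_nodes(queues, nodes, threshold=2):
    """Move one task between nodes, but only when the average per-CPU load
    differs by at least `threshold` (an assumed cutoff)."""
    avg = {n: sum(len(queues[c]) for c in cpus) / len(cpus)
           for n, cpus in nodes.items()}
    src_node, dst_node = busiest_and_idlest(avg)
    if avg[src_node] - avg[dst_node] < threshold:
        return False
    src = max(nodes[src_node], key=lambda c: len(queues[c]))
    dst = min(nodes[dst_node], key=lambda c: len(queues[c]))
    queues[dst].append(queues[src].pop())
    return True


if __name__ == "__main__":
    nodes = {0: [0, 1], 1: [2, 3]}
    queues = {0: ["a", "b", "c"], 1: ["d", "e", "f"], 2: [], 3: []}
    print(balance_within_node(queues, nodes[0]))  # prints False (node 0 is even)
    print(balance_across_nodes(queues, nodes))    # prints True (node 1 is idle)
    print(queues[2])                              # prints ['c']
```

With the cross-node step removed you get the "three-line patch" behavior described above: good locality, but node 1 would sit idle while node 0 queues up work.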
Re:ram question then by larien · 2003-01-31 00:55 · Score: 3, Informative

Er, why use a hard drive as RAM when you can just add loads of swap space? The VM will handle that space more efficiently if it knows it's hard disk rather than RAM.
However, the main way you might be able to add RAM over and above the motherboard's limit is via some kind of PCI card with DIMMs on it. I'm not sure how that would work over PCI (even 66MHz/64-bit) or how it would work at a lower level, but it might get by some limits. The limits OP was asking about may be of the order of trying to get over 1GB of RAM for some simulation code. Of course if you need over 1GB of RAM, buy a system which supports it.
In any event, from what people are saying, the NUMA patch is a change to the scheduler, to ensure that processes run on the CPU nearest the RAM bank storing the data. I don't think it addresses trying to add RAM from other sources (either disk or hypothetical PCI card).
NUMA support was improved, not added by slavemowgli · 2003-01-31 03:25 · Score: 3, Informative

Contrary to what is said in the post, NUMA support has been in Linux for quite a while already. The recent patches accepted by Linus merely add NUMA awareness to the scheduler, which, while certainly being a prerequisite for Linux being used on production NUMA boxen, is not at all required for NUMA support in general.

--
quidquid latine dictum sit altum videtur.
Re:Not just for big iron by Merlin42 · 2003-01-31 03:31 · Score: 2, Informative

Actually the HT implementation in the P4/Xeon chips does not act as you suggest in 1. When doing HT the cache is cut in half and each virtual CPU gets a half cache ... which is probably the main reason HT can yield inferior performance for some applications.

There is a very good reason for doing it this way. The P4 cache uses VIRTUAL addresses, so if each virtual CPU is executing in a different virtual address space (which is allowed) then you need a way to differentiate which cache lines belong to each virtual CPU, since they might very well both reference, let's say, virtual address 0xDEADBEEF, which translates into a different physical address (and hence different data). Intel engineers went with the simple solution of splitting the cache in two, instead of adding an extra tag to each cache line, which would have created extra overhead/latency on every cache access.
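The aliasing argument can be shown with a small Python model. It only illustrates the reasoning in this comment and makes no claim about how the real P4 cache is built; the page tables, addresses other than 0xDEADBEEF, data values, and function names are all invented for the sketch.

```python
# Two hypothetical address spaces map the same virtual address to different
# physical addresses, each holding different data.
VADDR = 0xDEADBEEF
page_tables = {
    0: {VADDR: 0x1000},  # virtual CPU 0
    1: {VADDR: 0x2000},  # virtual CPU 1
}
memory = {0x1000: "data-A", 0x2000: "data-B"}


def load(cache, key, cpu, vaddr):
    """Return the cached value for key, filling from memory on a miss."""
    if key not in cache:
        cache[key] = memory[page_tables[cpu][vaddr]]
    return cache[key]


def read_untagged(cache, cpu, vaddr):
    """Shared cache keyed by virtual address only: the two CPUs alias."""
    return load(cache, vaddr, cpu, vaddr)


def read_tagged(cache, cpu, vaddr):
    """Shared cache keyed by (cpu, vaddr): the extra tag separates entries."""
    return load(cache, (cpu, vaddr), cpu, vaddr)


def read_split(halves, cpu, vaddr):
    """Each virtual CPU owns one half of the cache, so no extra tag is needed."""
    return load(halves[cpu], vaddr, cpu, vaddr)


if __name__ == "__main__":
    shared = {}
    print(read_untagged(shared, 0, VADDR))  # prints data-A
    print(read_untagged(shared, 1, VADDR))  # prints data-A (wrong for CPU 1)
    halves = {0: {}, 1: {}}
    print(read_split(halves, 0, VADDR))     # prints data-A
    print(read_split(halves, 1, VADDR))     # prints data-B
```

Both the tagged and the split variants return the right data; the trade-off described above is that splitting costs capacity while tagging costs a wider comparison on every lookup.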

I apologize for overusing the word virtual ... but I really couldn't help it too much. It just seems to be an overused word in CS/EE.

--
Thoughts on tech, Software Engineering, and stuff