Slashdot Mirror


Linux May Need a Rewrite Beyond 48 Cores

An anonymous reader writes "There is interesting new research coming out of MIT which suggests current operating systems are struggling with the addition of more cores to the CPU. It appears that the problem, which affects the available memory in a chip when multiple cores are working on the same chunks of data, is getting worse and may be hitting a peak somewhere in the neighborhood of 48 cores, when entirely new operating systems will be needed, the report says. Luckily, we aren't anywhere near 48 cores and there is some time left to come up with a new Linux (Windows?)."

9 of 462 comments

  1. Original Source and Actual Paper by eldavojohn · · Score: 5, Informative

    It appears that the problem, which affects the available memory in a chip when multiple cores are working on the same chunks of data, is getting worse and may be hitting a peak somewhere in the neighborhood of 48 cores, when entirely new operating systems will be needed, the report says.

    Seriously? You picked that over my submission?

    I submitted this earlier this morning; I guess my submission was lacking. But if you're interested in the original MIT article and the actual paper (PDF):

    eldavojohn writes "Multicore (think tens or hundreds of cores) will come at a price for current operating systems. A team at MIT found that as they approached 48 cores their operating system slowed down. After activating more and more cores in their simulation, a sort of memory leak occurred whereby data had to remain in memory as long as a core might need it in its calculations. But the good news is that in their paper (PDF), they showed that for at least several years Linux should be able to keep up with chip enhancements in the multicore realm. To handle multiple cores, Linux keeps a counter of which cores are working on the data. As a core starts to work on a piece of data, Linux increments the number. When the core is done, Linux decrements the number. As the core count approached 48, the amount of actual work decreased and Linux spent more time managing counters. But the team found that 'Slightly rewriting the Linux code so that each core kept a local count, which was only occasionally synchronized with those of the other cores, greatly improved the system's overall performance.' The researchers caution that as the number of cores skyrockets, operating systems will have to be completely redesigned to handle managing these cores and SMP. After reviewing the paper, one researcher is confident Linux will remain viable for five to eight years without need for a major redesign."
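    For anyone who wants to see the "local count, occasionally synchronized" idea in miniature, here is a rough Python sketch. It is not the kernel patch from the paper: the class name `SloppyCounter`, the `threshold` parameter, and the `sync_ops` tally are all made up for illustration, and the `core` index is only a stand-in for a real CPU, since Python threads are not pinned to cores. It assumes each local slot is only ever touched by its own core.

```python
import threading


class SloppyCounter:
    """Illustrative per-core counter; names and threshold are hypothetical."""

    def __init__(self, num_cores, threshold):
        self.threshold = threshold
        self.local = [0] * num_cores   # one private slot per core
        self.global_count = 0          # shared value, guarded by the lock
        self.lock = threading.Lock()
        self.sync_ops = 0              # how often the shared value was touched

    def add(self, core, delta):
        # Fast path: only this core's private slot is modified.
        self.local[core] += delta
        if abs(self.local[core]) >= self.threshold:
            self._flush(core)

    def _flush(self, core):
        # Slow path: fold the local count into the shared one.
        with self.lock:
            self.global_count += self.local[core]
            self.sync_ops += 1
        self.local[core] = 0

    def approximate(self):
        # Cheap read; may lag behind by up to threshold - 1 per core.
        return self.global_count

    def exact(self):
        # Expensive read: has to look at every core's slot.
        with self.lock:
            return self.global_count + sum(self.local)


counter = SloppyCounter(num_cores=4, threshold=8)
for i in range(100):
    counter.add(i % 4, 1)   # spread 100 increments over 4 "cores"
```

    The trade-off is that the shared counter is touched only on a flush instead of on every increment, at the cost of a cheap read that can be slightly stale.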

    I don't know, guess I picked a bad title or something?

    Luckily we aren't anywhere near 48 cores and there is some time left to come up with a new Linux (Windows?).

    Again, seriously? What does "(Windows?)" even mean? As you pass a certain number of cores, modern operating systems will need to be redesigned to handle extreme SMP. It's going to differ from OS to OS but we won't know about Windows until somebody takes the time to test it.

    --
    My work here is dung.
    1. Re:Original Source and Actual Paper by klingens · · Score: 5, Interesting

      Yes it is lacking: it's too long for a /. "story". Editors want small, easily digested soundbites, not articles with actual information.

    2. Re:Original Source and Actual Paper by eudaemon · · Score: 5, Informative

      I just laughed at the "we aren't anywhere near 48 cores" comment - there are already commercial products with more than 48 cores now. I mean even a crappy old T5220 pretends to have 64 CPUs due to the 8 CPU, 8 thread design.

    3. Re:Original Source and Actual Paper by spazdor · · Score: 5, Insightful

      The very act of summarization constitutes an act of commentary. You're saying "I think the pertinent parts of this story are these, and the most important questions raised are those."

      A good summary invites commentary and frames the questions in a way which makes for better discussion, but don't for a second imagine the OP ought to be value-neutral (if such a thing could even exist.)

      --
      DRM: Terminator crops for your mind!
    4. Re:Original Source and Actual Paper by dAzED1 · · Score: 5, Informative

      and YET...that's irrelevant, because as many people have pointed out, the problem is the cores that share L2 cache. There have been large systems with many, many processors for a long time, some of which run Linux. The problem that was described was 48 cores on a single die, sharing the same cache. Sun's die-to-die tech isn't relevant to this problem, nor is putting more than 6 8-core CPUs in a single system.

  2. Error in their math by El_Muerte_TDS · · Score: 5, Funny

    They have an off-by-one error in their math, it's actually 9 times a 6-core CPU. So, at 42 cores a rewrite is needed.

  3. Re:What are they talking about by jd · · Score: 5, Informative

    What they are talking about really reduces to a variant of Amdahl's Law, but simply put, scaling is always non-linear. There will be overheads per core for communication (which is why SMP over 16 CPUs is such a headache) and overheads per core within the OS for housekeeping (knowing what core a specific thread is running on, whether it is bound to that core, etc., and trying to schedule all threads to make best use of the cores available).

    The more cores you have, the more state information is needed for a thread and the more possible permutations the scheduler must consider in order to be efficient. Which, in turn, means the scheduler is going to be bulkier.

    (Scheduling is a variant of the bin-packing problem, which is NP-complete, but it has the added catch that you only get a very short time to pack the threads in, and scheduling policies - such as realtime and core-binding - must also be satisfied in addition to packing all the threads in.)
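    To make the packing analogy concrete, here is a toy greedy heuristic in Python. It is only a sketch and bears no resemblance to how the Linux scheduler actually works: the function name `assign_threads`, the `(name, load, pinned_core)` tuples, and the example loads are all invented. Pinned threads are placed first to honour the core-binding policy, then the remaining threads go, largest first, onto whichever core is currently least loaded.

```python
def assign_threads(threads, num_cores):
    """Greedy placement sketch.

    threads: list of (name, load, pinned_core), pinned_core is None if free.
    Returns (placement dict, per-core load list).
    """
    loads = [0] * num_cores
    placement = {}

    # Policy constraint first: bound threads must land on their core.
    for name, load, pinned in threads:
        if pinned is not None:
            loads[pinned] += load
            placement[name] = pinned

    # Then pack the free threads, heaviest first, onto the lightest core.
    free = sorted((t for t in threads if t[2] is None),
                  key=lambda t: t[1], reverse=True)
    for name, load, _ in free:
        core = min(range(num_cores), key=lambda c: loads[c])
        loads[core] += load
        placement[name] = core

    return placement, loads


example = [("a", 5, None), ("b", 4, None), ("c", 3, None),
           ("d", 3, None), ("rt", 6, 0)]
placement, loads = assign_threads(example, num_cores=2)
```

    A greedy pass like this is fast but not optimal, which is the point of the comment above: the scheduler has to settle for "good enough, right now" rather than solve the packing problem exactly.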

    The more of this extra data you need, the slower task-switching becomes and the more of the cache you are hogging with stuff not actually tied to whatever the threads are actually doing. At some point, the degradation in performance will exactly equal the increase in performance for the extra cores. The claim is that this happens at 48 cores for modern OSes. This is plausible but it is unclear if it is an actual problem. Those same OSes are used on supercomputers of 64+ cores, by segregating the activities in each node. MOSIX, Kerrighed and other such mechanisms have allowed Linux kernels to migrate tasks from one node to another transparently. (i.e., you don't know or care where the code runs, the I/O doesn't change at all.) The only reason Linux doesn't have clustering as standard is that Linus is waiting for cluster developers to produce a standard mechanism for process migration that also fits within the architectural standards already in use.
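    A back-of-the-envelope model of that break-even point, as a Python sketch: take Amdahl's Law and add a cost that grows with every extra core. The linear overhead term and both parameter values are assumptions chosen purely for illustration (the overhead is deliberately tuned so the peak lands on 48); none of it is measured data from the MIT paper.

```python
def speedup(cores, serial_fraction, overhead_per_core):
    """Amdahl-style speedup with a hypothetical linear per-core overhead.

    Normalised run time = serial part + parallel part / cores
                          + overhead for each core beyond the first.
    """
    run_time = (serial_fraction
                + (1.0 - serial_fraction) / cores
                + overhead_per_core * (cores - 1))
    return 1.0 / run_time


def best_core_count(serial_fraction, overhead_per_core, max_cores):
    """Core count at which the modelled speedup peaks."""
    return max(range(1, max_cores + 1),
               key=lambda n: speedup(n, serial_fraction, overhead_per_core))


# Illustrative numbers only: 1% serial work, overhead picked so that
# sqrt((1 - serial) / overhead) = 48.
SERIAL = 0.01
OVERHEAD = 0.99 / 2304

peak = best_core_count(SERIAL, OVERHEAD, max_cores=128)
```

    With zero overhead the model is plain Amdahl and more cores never hurt; with any per-core cost the curve turns over near sqrt((1 - serial) / overhead), and past that point extra cores make things slower.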

    If you clustered a couple of hundred nodes, each with 48 cores, you're looking at having close to 10,000 cores on the system. It wouldn't take a "rewrite" per se, merely a few hooks and a standard protocol. To support a single physical node with more than 48 cores, you might need to split it into virtual nodes with 48 or fewer cores in each, but Linux already has support for virtualization so that's no big deal either.

    --
    It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
  4. K42: these problems were already tackled by compudj · · Score: 5, Informative

    The K42 project at IBM Research investigated the benefit of a complete OS rewrite with scalability to very large SMP systems in mind. This is an open source operating system supporting a Linux-compatible API and ABI.

    Their target systems, "next generation SMP systems", back in 2003 seem to have become the current generation of SMP/multi-core systems in the meantime.

  5. So what the fuck is he doing here then? by SmallFurryCreature · · Score: 5, Funny

    Let's drive the greenhorn OUT! No filthy high UIDs with their spelling and grammar and solid well-researched non-sensationalist writing. I want my editors to rape the language (bonus points if it is several languages at once) and send my heart racing by raising my bile and fear of the unknown and known.

    Headlines sell adverts. Truth, accuracy, honesty do not. Accept it, you are reading slashdot, it works.

    --

    MMO Quests are like orgasms:

    You may solo them, I prefer them in a group.