Slashdot Mirror


Linux May Need a Rewrite Beyond 48 Cores

An anonymous reader writes "There is interesting new research coming out of MIT which suggests current operating systems are struggling with the addition of more cores to the CPU. It appears that the problem, which affects the available memory in a chip when multiple cores are working on the same chunks of data, is getting worse and may be hitting a peak somewhere in the neighborhood of 48 cores, when entirely new operating systems will be needed, the report says. Luckily, we aren't anywhere near 48 cores and there is some time left to come up with a new Linux (Windows?)."

462 comments

  1. Original Source and Actual Paper by eldavojohn · · Score: 5, Informative

    It appears that the problem, which affects the available memory in a chip when multiple cores are working on the same chunks of data, is getting worse and may be hitting a peak somewhere in the neighborhood of 48 cores, when entirely new operating systems will be needed, the report says.

    Seriously? You picked that over my submission?

    I submitted this earlier this morning. I guess my submission was lacking. But if you're interested in the original MIT article and the actual paper (PDF):

    eldavojohn writes "Multicore (think tens or hundreds of cores) will come at a price for current operating systems. A team at MIT found that as they approached 48 cores their operating system slowed down. After activating more and more cores in their simulation, a sort of memory leak occurred whereby data had to remain in memory as long as a core might need it in its calculations. But the good news is that in their paper (PDF), they showed that for at least several years Linux should be able to keep up with chip enhancements in the multicore realm. To handle multiple cores, Linux keeps a counter of which cores are working on the data. As a core starts to work on a piece of data, Linux increments the number. When the core is done, Linux decrements the number. As the core count approached 48, the amount of actual work decreased and Linux spent more time managing counters. But the team found that 'Slightly rewriting the Linux code so that each core kept a local count, which was only occasionally synchronized with those of the other cores, greatly improved the system's overall performance.' The researchers caution that as the number of cores skyrockets, operating systems will have to be completely redesigned to handle managing these cores and SMP. After reviewing the paper, one researcher is confident Linux will remain viable for five to eight years without need for a major redesign."
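
    The per-core counting idea in that summary can be sketched roughly as follows. This is only an illustration in Python, not the kernel's code: threads stand in for cores, and the names PerCoreCounter, threshold, worker and run_demo are made up for the example. Each "core" updates its own local count without locking and folds it into the shared total only once the local drift reaches a threshold.

```python
import threading


class PerCoreCounter:
    """Toy reference counter with one local delta per simulated core."""

    def __init__(self, num_cores, threshold=64):
        self.threshold = threshold      # local drift allowed before a sync
        self.local = [0] * num_cores    # each slot is written by one thread only
        self.shared = 0                 # global total, guarded by the lock
        self.lock = threading.Lock()

    def add(self, core, delta):
        # Fast path: touch only this core's slot, no lock taken.
        self.local[core] += delta
        if abs(self.local[core]) >= self.threshold:
            self._flush(core)

    def _flush(self, core):
        # Slow path: fold the local delta into the shared total.
        with self.lock:
            self.shared += self.local[core]
        self.local[core] = 0

    def read_exact(self):
        # Only meaningful once every worker has stopped.
        for core in range(len(self.local)):
            self._flush(core)
        return self.shared


def worker(counter, core, iterations):
    for _ in range(iterations):
        counter.add(core, +1)       # take a reference
    for _ in range(iterations - 1):
        counter.add(core, -1)       # drop all but one


def run_demo(num_cores=4, iterations=1000):
    counter = PerCoreCounter(num_cores)
    threads = [
        threading.Thread(target=worker, args=(counter, core, iterations))
        for core in range(num_cores)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.read_exact()
```

    The trade-off is that the shared total is only approximate while workers are running; an exact value needs every local count folded in, which is why read_exact is called only after the threads have finished.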

    I don't know, guess I picked a bad title or something?

    Luckily we aren't anywhere near 48 cores and there is some time left to come up with a new Linux (Windows?).

    Again, seriously? What does "(Windows?)" even mean? As you pass a certain number of cores, modern operating systems will need to be redesigned to handle extreme SMP. It's going to differ from OS to OS but we won't know about Windows until somebody takes the time to test it.

    --
    My work here is dung.
    1. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0

      U mad?

    2. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0, Informative

      I’d just like to interject for a moment. What you’re referring to as Linux, is in fact, GNU/Linux, or as I’ve recently taken to calling it, GNU plus Linux. Linux is not an operating system unto itself, but rather another free component of a fully functioning GNU system made useful by the GNU corelibs, shell utilities and vital system components comprising a full OS as defined by POSIX.

      Many computer users run a modified version of the GNU system every day, without realizing it. Through a peculiar turn of events, the version of GNU which is widely used today is often called “Linux”, and many of its users are not aware that it is basically the GNU system, developed by the GNU Project.

      There really is a Linux, and these people are using it, but it is just a part of the system they use. Linux is the kernel: the program in the system that allocates the machine’s resources to the other programs that you run. The kernel is an essential part of an operating system, but useless by itself; it can only function in the context of a complete operating system. Linux is normally used in combination with the GNU operating system: the whole system is basically GNU with Linux added, or GNU/Linux. All the so-called “Linux” distributions are really distributions of GNU/Linux.

      So this blog needs to be renamed to the GNU/Linux Hater's Blog. Have a nice day.

    3. Re:Original Source and Actual Paper by VorpalRodent · · Score: 4, Funny

      What does "(Windows?)" even mean?

      I read that as saying "Windows is the new Linux!". Clearly the submitter is trying to incite violence in the Slashdot community.

      --
      Take it to the limit, everybody to the limit, come on, everybody fhqwhgads.
    4. Re:Original Source and Actual Paper by Dragoniz3r · · Score: 4, Interesting

      Oh look, CmdrTaco published yet another story with a poorly-written, hypersensationalist summary! Par for the course.

    5. Re:Original Source and Actual Paper by klingens · · Score: 5, Interesting

      Yes it is lacking: it's too long for a /. "story". Editors want small, easily digested soundbites, not articles with actual information.

    6. Re:Original Source and Actual Paper by eudaemon · · Score: 5, Informative

      I just laughed at the "we aren't anywhere near 48 cores" comment - there are already commercial products with more than 48 cores now. I mean even a crappy old T5220 pretends to have 64 CPUs due to the 8 CPU, 8 thread design.

    7. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 2, Informative

      I don't know, guess I picked a bad title or something?

      No. Your summary was too long.

      Seriously, the purpose of a summary is not to include every last fact and detail mentioned in the article; it's to give the reader enough information to decide whether reading the full article is worth it. Don't try to put everything in there.

    8. Re:Original Source and Actual Paper by bindoeve · · Score: 1

      Understandable. This is nothing but inferior compared to his submission.

    9. Re:Original Source and Actual Paper by characterZer0 · · Score: 1

      (Windows?)

      I thought he was implying that we will also need to come up with a new Windows.

      --
      Go green: turn off your refrigerator.
    10. Re:Original Source and Actual Paper by Skal+Tura · · Score: 3, Interesting

      Scare piece.

      Your submission wasn't scary enough. From your submission, it seems that it's not that big of a deal and has a rather easy solution. This submission makes it sound like the Linux kernel needs a complete rewrite from the ground up, as in starting from scratch.
      Plus yours was a bit long and had lots of details.

    11. Re:Original Source and Actual Paper by WinterSolstice · · Score: 3, Informative

      Got a pile of AIX servers here like that:
      http://www-03.ibm.com/systems/power/hardware/780/index.html

      I was kind of wondering about the "modern operating systems" comment... I think he meant "desktop operating systems".
      Many of the big OS vendors (IBM, DEC (now HP), CRAY, etc) are well beyond this point. Even OS/2 could scale to 1024 processors if I recall correctly.

      --
      An operating system should be like a light switch... simple, effective, easy to use, and designed for everyone.
    12. Re:Original Source and Actual Paper by poetmatt · · Score: 0, Flamebait

      toot your own horn much?

      way to link your own article, which I will avoid now.

    13. Re:Original Source and Actual Paper by Skal+Tura · · Score: 2, Informative

      Never mind that, take quite a standard server: a dual Xeon 6-core with HT... total reported CPUs is 24, and it's widely used and nothing special.

    14. Re:Original Source and Actual Paper by jpmorgan · · Score: 1

      Well, Windows historically did have problems scaling beyond a fairly small number of processors. So with Windows 7, Microsoft replaced the original NT system executive with a new system called MinWin. Microsoft claims MinWin efficiently scales to 256 cores: http://tech.slashdot.org/tech/08/11/02/0130253.shtml

    15. Re:Original Source and Actual Paper by Perl-Pusher · · Score: 4, Insightful

      Core !=CPU

    16. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 1, Informative

      The effect is otherwise known as "Amdahl's Law", well documented by Gene Amdahl in 1967. Is this news at all?
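
      For reference, Amdahl's Law bounds the speedup of a workload in which only a fraction p can run in parallel on n cores: speedup = 1 / ((1 - p) + p / n). The sketch below just evaluates that formula; the 95% parallel fraction is an arbitrary assumption for illustration, not a figure from the paper, and whether the law describes the behaviour the MIT team measured is a separate question.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Upper bound on speedup: 1 / ((1 - p) + p / n)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    p = 0.95  # assumed parallel fraction, chosen only for the example
    for n in (1, 12, 48, 1024):
        print(n, round(amdahl_speedup(p, n), 2))
```

      With any serial fraction above zero the curve flattens: for p = 0.95 the speedup can never exceed 1 / 0.05 = 20, however many cores are added.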

    17. Re:Original Source and Actual Paper by TheRaven64 · · Score: 3, Interesting

      And it's worth noting that the most common application for that kind of machine is to partition it and run several different operating systems on it. Solaris has already had some major redesign work for scaling that well. For example, the networking stack is partitioned both horizontally and vertically. Separate connections are independent except at the very bottom of the stack (and sometimes even then, if they go via different NICs), and each layer in the stack communicates with the ones above it via message passing and runs in a separate thread.

      However, it sounds like this paper is focussing on a very specific issue: process accounting. To fairly schedule processes, you need to work out how much time they have spent running already, relative to others. I'm a bit surprised that Linux actually works as they seem to be describing, since their 'change' was to make it work in the same way as pretty much every other SMP-aware scheduler that I've come across; schedule processes on cores independently and periodically migrate processes off overloaded cores and onto spare ones.
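
      The "schedule independently, migrate periodically" approach can be sketched like this. It is a toy model under stated assumptions, not how any real scheduler is written: each core has its own run queue (a deque of task names), and the hypothetical rebalance function is what a periodic balancer might do, moving tasks from the longest queue to the shortest until they are roughly even.

```python
from collections import deque


def rebalance(run_queues, max_imbalance=1):
    """Move tasks from the busiest queue to the idlest one.

    Stops when the longest and shortest queues differ by at most
    max_imbalance tasks. Returns the number of tasks migrated.
    """
    if max_imbalance < 1:
        raise ValueError("max_imbalance must be at least 1")
    moves = 0
    while True:
        busiest = max(run_queues, key=len)
        idlest = min(run_queues, key=len)
        if len(busiest) - len(idlest) <= max_imbalance:
            return moves
        idlest.append(busiest.pop())  # migrate one task
        moves += 1


def demo():
    # Four simulated cores; core 0 is overloaded, cores 1 and 3 are idle.
    queues = [deque("abcdef"), deque(), deque("g"), deque()]
    moved = rebalance(queues)
    return [len(q) for q in queues], moved
```

      Between balancing passes each core only touches its own queue, so the shared work is limited to the occasional migration rather than happening on every scheduling decision.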

      There are lots of potential bottlenecks. The one I was expecting to hear about was cache contention. In a monolithic kernel, there are some data structures that must be shared among each core and every time you do an update on one core you must flush the caches on all of them, which can start to hurt performance when you have lots of concurrent updates. A few important data structures in the Linux kernel were rewritten in the last year to ensure that unrelated portions of them ended up in different cache lines, to help reduce this.
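
      To make the cache-line point concrete, here is a small layout calculator. It is a sketch under assumptions: a 64-byte cache line (typical, but not universal), fields smaller than a line, and invented field names. It shows that small per-core fields packed back to back start in the same line, while padding each one to a line boundary separates them.

```python
CACHE_LINE = 64  # bytes; assumed line size


def layout(fields, pad_each_to_line=False):
    """Return {name: (offset, cache_line_index)} for fields placed in order."""
    placement = {}
    offset = 0
    for name, size in fields:
        if pad_each_to_line and offset % CACHE_LINE:
            # Skip ahead to the start of the next cache line.
            offset += CACHE_LINE - offset % CACHE_LINE
        placement[name] = (offset, offset // CACHE_LINE)
        offset += size
    return placement


def shared_lines(placement):
    """Return the cache lines in which more than one field starts."""
    by_line = {}
    for name, (_, line) in placement.items():
        by_line.setdefault(line, []).append(name)
    return {line: names for line, names in by_line.items() if len(names) > 1}


# Hypothetical 8-byte counters, one per core.
FIELDS = [("core0_count", 8), ("core1_count", 8), ("core2_count", 8)]

packed = layout(FIELDS)
padded = layout(FIELDS, pad_each_to_line=True)
```

      In the packed layout all three counters start in line 0, so a write by one core invalidates the line the others are using; in the padded layout shared_lines(padded) is empty, at the cost of 64 bytes per counter instead of 8.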

      Even then, it's not a problem that's easy to solve at the software level. Hardware transactional memory would go a long way towards helping us scale to 128+ processors, but the only chip I know of to implement it (Sun's Rock) was cancelled before it made it into production.

      --
      I am TheRaven on Soylent News
    18. Re:Original Source and Actual Paper by NevarMore · · Score: 4, Interesting

      The thing is eldavojohn practically *is* an editor for /. , just check out his submission page. Despite having such a high UID he's got a solid reputation, a good writing style, and offers good commentary on a wide variety of topics.

    19. Re:Original Source and Actual Paper by poet · · Score: 1

      Of course they picked it over yours. Yours is intelligently written and has an expectation that people will understand what you are talking about.

      Unfortunately, this is Slashdot.

      --
      Get your PostgreSQL here: http://www.commandprompt.com/
    20. Re:Original Source and Actual Paper by interkin3tic · · Score: 2, Insightful

      I don't know, guess I picked a bad title or something?

      Slashdot: dramatically overstated news for nerds... since that seems to be the evolution of news services for some reason?

      I'm working on a submission: Fox news just had a bit about the internet, I'm assuming that their headline is something like "WILL USING OBAMANET 'IPv6' KILL YOU AND MAKE YOUR CHILDREN TERRORISTS?"

    21. Re:Original Source and Actual Paper by Dahamma · · Score: 2, Informative

      the purpose of a summary is not to include every last fact and detail mentioned in the article; it's to give the reader enough information to decide whether reading the full article is worth it.

      If you think a summary can actually help get a /. reader to RTFA, you must be new here...

    22. Re:Original Source and Actual Paper by RCGodward · · Score: 3, Funny

      Don't bother checking the box, RMS, we know who it is.

    23. Re:Original Source and Actual Paper by BeardedChimp · · Score: 4, Informative

      The purpose of an editor is to edit any submissions to make them ready for print.

      If the summary was too long, the editor should have got off his arse rather than wait for the summary that fits the word count to come along.

    24. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 3, Informative

      OS/2's SMP support is a joke. I'm sure that somewhere in that tangle is a comment like "up to 1024 processors". But it's as relevant as a sticker on a Ford Cortina warning not to exceed the speed of sound.

      Officially the SMP version of OS/2 "Warp Server" supported 64 processors. In practice anything other than an embarrassingly parallel task would see rapidly diminishing returns after just a couple of CPUs. The stuff that this article is moaning about, that Linux doesn't do well enough on 48 CPUs? OS/2 doesn't even attempt it, the official docs just say to "avoid" such things. This test case on 48 CPUs on OS/2 would just leave the OS constantly thrashing trying to move pages from one CPU to another, and no work being done.

      Now maybe if OS/2 had been a huge success, and IBM were now the dominant OS vendor on the desktop, there'd be a 1024 CPU version of OS/2 today. But in our reality, where OS/2 support was gradually abandoned and handed over to an underfunded little independent outfit, it sucks on SMP.

    25. Re:Original Source and Actual Paper by wastedlife · · Score: 1, Redundant

      You've been misinformed, the NT executive is still alive and kicking:

      MinWin is not, in and of itself a kernel, but rather a set of components that includes both the Windows NT Executive and several other components that Russinovich has described as "Cutler's NT".[16]

      It's all still NT, Windows 7 is just NT version 6.1. I guess "6.1" doesn't have the same ring to it as a whole number. Will Windows 8 be NT 6.2, or will they move the version up to NT 7.0?

      --
      Said, "It's just like dice but it's got more sides And it tells me who lives and who dies"
    26. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 2, Insightful

      Wow, really just wow you sir are the cream of the crop! /sarcasm

      the OP has a very valid point, i come to read about technology news on slashdot not scare pieces with little or no information or value, his post was far superior in every respect and yet got passed over for this garbage post. And you devalue his point further by not even giving him the time of day, way to go asshole.

    27. Re:Original Source and Actual Paper by Wonko+the+Sane · · Score: 3, Insightful

      Your summary was too long.

      Yes, but the submission that got accepted has a bullshit headline.

      Of course "Linux May Need to Continue Making Incremental Changes Like It Has Been Doing For The Last Several Years To Scale Beyond 48 Cores" doesn't draw in as many clicks.

    28. Re:Original Source and Actual Paper by TheLink · · Score: 1

      Yes his summary was a bit long.

      But the purpose of many slashdot summaries seems to be to generate more comments about errors in the summary, or due to misunderstanding of the summary, or the summary just being crap. A bit like trolling for hits ;).

    29. Re:Original Source and Actual Paper by X0563511 · · Score: 2, Informative

      I've seen longer stories about lamer things get published...

      --
      For large sets, this will be our guide even unto death, for the LORD will work for each type of data it is applied to...
    30. Re:Original Source and Actual Paper by spazdor · · Score: 2, Funny

      Fuck, dude, we hurd you the first time. and "GNU Plus Linux" is terrible marketing.

      --
      DRM: Terminator crops for your mind!
    31. Re:Original Source and Actual Paper by davev2.0 · · Score: 1, Insightful

      Good summaries do not offer commentary. Save the commentary for the comments.

    32. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0

      Linux too claims scalability up to 64 cores (and Kolivas was bragging about being theoretically scalable up to 65535 cores).

      Then again, you always need to check the claimed feature against the real world...

    33. Re:Original Source and Actual Paper by Hijacked+Public · · Score: 1, Troll

      All of the things Taco is not. So he is the perfect target for trolling, which Taco has just masterfully done.

      Taco made major modifications to the entire Karma system mostly to frustrate a couple of users. Taco loves to troll folks.

      --
      "Sacrifice for the good of The State" - The State
    34. Re:Original Source and Actual Paper by Captain+Splendid · · Score: 4, Insightful

      Which is why he's treated like shit: Can't have any kind of excellence here, Taco wants to keep that old-school newsgroup feel. That's the only explanation that still fits.

      --
      Linux, you magnificent bastard, I read the fucking manual!
    35. Re:Original Source and Actual Paper by aywwts4 · · Score: 2, Informative

      If it is any consolation this straw is the one that broke the RSS feed's back.

      I have unsubscribed from Slashdot today due to the trend typified in your article vs. the one published. (No, this is not a new trend, but I'm fed up and finished with it.) See you on Reddit's Science/Linux/Everything else.

      --
      Web Developers: Celebrate to our roots! Animated Gifs and Tiled Backgrounds, dont let our history die!
    36. Re:Original Source and Actual Paper by Unequivocal · · Score: 2, Insightful

      Elaborate please. I'm ignorant and curious.

    37. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 2, Insightful

      Oh look, CmdrTaco published yet another story with a poorly-written, hypersensationalist summary! Par for the course.

      Remember back when the slashdot "editors" were part of the community and would actually respond to site concerns raised by users? I haven't seen ANY "editor" post a reply to any slashdot user post in friggin YEARS. Good luck with getting their attention these days if you aren't an advertiser.

    38. Re:Original Source and Actual Paper by aywwts4 · · Score: 1

      Some desktop PCs have 8! (Quad core with HT)

      A good server will be past 48 in no time. Especially with the kind of high end computing that Linux is often used for.

      --
      Web Developers: Celebrate to our roots! Animated Gifs and Tiled Backgrounds, dont let our history die!
    39. Re:Original Source and Actual Paper by LWATCDR · · Score: 1

      Or if density keeps doubling every 18 months. AMD has a 12 core chip now. In three years we will hit 48. Of course they may not rev that fast or increase cache instead of cores.
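
      The back-of-the-envelope projection here is cores(t) = current * 2^(t / 18) with t in months, so 12 cores become 12 * 2^(36 / 18) = 48 after three years. A tiny sketch of that arithmetic, assuming core count tracks density exactly (which, as noted, it may not); the function names are made up for the example:

```python
import math


def projected_cores(current_cores, months, doubling_period_months=18):
    """Core count after `months`, if it doubles every doubling period."""
    return current_cores * 2 ** (months / doubling_period_months)


def months_until(current_cores, target_cores, doubling_period_months=18):
    """Months needed to grow from current_cores to target_cores."""
    return math.log2(target_cores / current_cores) * doubling_period_months
```

      Under those assumptions, projected_cores(12, 36) gives 48.0 and months_until(12, 48) gives 36.0.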

      --
      See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
    40. Re:Original Source and Actual Paper by Unequivocal · · Score: 1

      Awesome fake title. Congrats - funniest post I've read today..

    41. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0

      Sheldon, is that you?

    42. Re:Original Source and Actual Paper by Lumpy · · Score: 2, Interesting

      You are not in the club of liked submitters. Honestly, the number of crap submissions that get picked over well thought out and very well cited ones is nuts, to the point that I simply stopped submitting stories here. It's a waste of time.

      --
      Do not look at laser with remaining good eye.
    43. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0

      A CPU may have many cores (like 4, 6, 8 or 12). Big iron mainframes may have many processors (1024, he mentioned), but that may be like 8k cores. Of course those do not run standard operating systems, and programming them effectively already requires parallel programming techniques, so it's a bullshit comparison.

      Still a 4 processor server is already very near 48 cores.

    44. Re:Original Source and Actual Paper by UnknowingFool · · Score: 2, Funny

      it's to give the reader enough information to decide whether reading the full article is worth it.

      We are supposed to read the articles? Why didn't anyone tell me about this before?!!

      --
      Well, there's spam egg sausage and spam, that's not got much spam in it.
    45. Re:Original Source and Actual Paper by fahrbot-bot · · Score: 1

      As you pass a certain number of cores, modern operating systems will need to be redesigned to handle extreme SMP. It's going to differ from OS to OS ...

      Exactly. As described, this is specifically a Linux issue, not an "every OS" issue. Christ, I used an Intel system running Unix with 1024 cores back in the late 1980s when I was at NASA LaRC / ICASE.

      --
      It must have been something you assimilated. . . .
    46. Re:Original Source and Actual Paper by spazdor · · Score: 5, Insightful

      The very act of summarization constitutes an act of commentary. You're saying "I think the pertinent parts of this story are these, and the most important questions raised are those."

      A good summary invites commentary and frames the questions in a way which makes for better discussion, but don't for a second imagine the OP ought to be value-neutral (if such a thing could even exist.)

      --
      DRM: Terminator crops for your mind!
    47. Re:Original Source and Actual Paper by graffix01 · · Score: 1

      Yes, but Solaris handles hundreds of CPUs without any problem today. Linux and Windows, not so much.

      --
      Women don't want to hear what you think. Women want to hear what they think, in a deeper voice.
    48. Re:Original Source and Actual Paper by electrosoccertux · · Score: 1

      All of the things Taco is not. So he is the perfect target for trolling, which Taco has just masterfully done.

      Taco made major modifications to the entire Karma system mostly to frustrate a couple of users. Taco loves to troll folks.

      Eh, John's summary is long. I think I can see why this shorter version was chosen. I don't know that Taco was trolling, but I'm not close to any of that political stuff anyway, so I don't really know; I'm just pointing out I can see why the shorter summary was chosen.

    49. Re:Original Source and Actual Paper by monkeySauce · · Score: 4, Informative

      The article is about cores per chip, not cores per system.

      You're trying to compare a 48-cylinder engine with a bunch of 4-cylinder engines working together.

    50. Re:Original Source and Actual Paper by poetmatt · · Score: 1

      uh? His post was legitimate and valid, and his did indeed have more value than the one that made it to the frontpage. I absolutely agree 100%. However, posting your own post in your own post is a bit excessive, and there could have been better ways to do this than just repost your entire freakin story as the first comment.

    51. Re:Original Source and Actual Paper by Gilmoure · · Score: 3, Funny

      Wait, Macs don't suck?

      --
      I drank what? -- Socrates
    52. Re:Original Source and Actual Paper by jgagnon · · Score: 3, Interesting

      You can also think of it as the difference between rooms and buildings. Multiple cores may exist in a single CPU just like multiple rooms may exist in a building. Getting around between rooms in the same building isn't such a big deal. But getting from Room A in Building 1 to Room B in Building 2 requires you to leave Building 1 and then enter Building 2, which takes more time. Some motherboards support multiple CPUs (buildings) but most do not. Those that do are usually more expensive than the ones that support only a single CPU.

      --
      Remember to maintain your supply of /facepalm oil to prevent chafing.
    53. Re:Original Source and Actual Paper by Profane+MuthaFucka · · Score: 0, Flamebait

      Despite having such a high UID he's got a solid reputation, a good writing style, and offers good commentary on a wide variety of topics.

      Who cares. We want ass jokes.

      --
      Fascism trolls keeping me up every night. When I starts a preachin', he HITS ME WITH HIS REICH!
    54. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 1, Informative

      What the article is referring to is the number of cores PER SOCKET. Yes, you have some big computers with 48 cores across multiple sockets right now, but you do not have 48 cores in a single socket. I think the article is referring to a counter that Linux maintains per socket.

      The other trend to keep in mind is non-uniform memory access (NUMA). There is memory associated with each socket of a machine. It is more expensive to access memory on a different socket. To help with this, you try to keep memory accesses local. This is most likely why Linux would maintain a counter PER SOCKET, because that would keep all the memory accesses to the counter local.

    55. Re:Original Source and Actual Paper by bberens · · Score: 2, Informative

      A CPU can contain multiple cores which share Level 2 cache. Conversely a multi-CPU system has multiple complete CPUs which do not share their L2 cache.

      --
      Check out my lame java blog at www.javachopshop.com
    56. Re:Original Source and Actual Paper by Kristopeit,+Mike+Da. · · Score: 1

      slashdot = stagnated

    57. Re:Original Source and Actual Paper by Surt · · Score: 1

      You were surprised? This is slashdot, where the LCD wins every time.

      --
      "Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
    58. Re:Original Source and Actual Paper by Surt · · Score: 3, Informative

      This is not Amdahl's law, this is the dispatcher being inefficient.

      --
      "Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
    59. Re:Original Source and Actual Paper by Anpheus · · Score: 1

      Quad Nehalem and Westmere EX servers can be purchased from just about any large vendor, and they have 4 processors, 8 cores, and 2 threads per core, so they appear as 64 logical processors.

      Next generation Sandy Bridge servers will feature up to 10 cores, I don't know if the threads per core is planned to be increased any time soon though.

      IBM also sells a really, really custom Nehalem box that expands to multiple servers, permitting I believe between 8 and 16 8-core processors to be spanned together and run a single operating system.

    60. Re:Original Source and Actual Paper by spottedkangaroo · · Score: 1, Offtopic

      Seems a little silly to make that distinction here, since he's clearly talking about the kernel and the way it handles SMP, which ... is not GNU.

      --
      Imagine if you weren't allowed to use roads because a bus company complained about your driving 3 times. --skunkpussy
    61. Re:Original Source and Actual Paper by DaHat · · Score: 1

      Really? I've personally seen Windows run quite well on a box with 256 cores.

    62. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0

      As an example, take the IBM 795. It comes as a 24" frame (or more), assembled at the factory. It's probably at, or near, the highest end machine that can run Linux, unless you start counting massively parallel machines like BlueGene.

      Using the 4.0 GHz Power7 chip, you can have 8 processor books, with 4 sockets per book. That's 32 sockets.

      Each socket holds an 8-core chip. Meaning 8 cores on a single bit of silicon. And each core is able to run four threads simultaneously.

      That means this machine is able to run 4 (threads per core) x 8 (cores per socket) x 4 (sockets per book) x 8 (books) = 1024 threads simultaneously.

      Linux would see this as 1024 CPUs since it assumes all these threads are independent execution units. They're not, not really, especially not if the threads try to use the same execution units within a core (there's only so many floating point units in a core, so if other threads are using them, you'll have to wait) but it makes things "easy" to schedule.

      And yes, all these 1024 threads can access the same memory. Simultaneously if necessary. Which screws up caching big time.
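
      The counting above, written out as a small sketch. The Topology class and its field names are invented for the example; the numbers are the ones given in this comment for the 4.0 GHz Power7 configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topology:
    books: int
    sockets_per_book: int
    cores_per_socket: int
    threads_per_core: int

    @property
    def sockets(self):
        return self.books * self.sockets_per_book

    @property
    def cores(self):
        return self.sockets * self.cores_per_socket

    @property
    def logical_cpus(self):
        # What the OS would enumerate as "CPUs": one per hardware thread.
        return self.cores * self.threads_per_core


# 8 books x 4 sockets x 8 cores x 4 threads, as described above.
power_795 = Topology(books=8, sockets_per_book=4,
                     cores_per_socket=8, threads_per_core=4)
```

      That works out to 32 sockets, 256 cores and 1024 logical CPUs, which is why "number of CPUs" can mean three quite different things depending on who is counting.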

    63. Re:Original Source and Actual Paper by Anpheus · · Score: 3, Interesting

      And multiple threads per core can be thought of as say, movable dividers in rooms. Yeah, it's really one room, but you can divide it into 2 "sort of", and it doesn't really mean you have twice as many rooms, but there are certain benefits you can get from doing so.

    64. Re:Original Source and Actual Paper by afidel · · Score: 1

      Heck, any 8 socket server with Nehalem-EX can support up to 128 logical processors (64 cores). I don't think I've heard anyone say their new 8 socket servers aren't capable of sustaining decent workloads.

      --
      There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
    65. Re:Original Source and Actual Paper by illumin8 · · Score: 1

      To make matters even worse, we are definitely hitting 48 cores right now.

      Luckily we aren't anywhere near 48 cores and there is some time left to come up with a new Linux (Windows?).

      Total BS. I just installed two HP Proliant DL785G6 servers which have a total of 8 AMD 6-core CPUs each, for a total of 48 cores. Coming very soon are the new HP DL970 servers, which will have a total of eight Intel 8-core CPUs, or 64 cores.

      Bad summary. Doesn't Slashdot do the most minimal fact checking first?

      --
      "When the president does it, that means it's not illegal." - Richard M. Nixon
    66. Re:Original Source and Actual Paper by drsmithy · · Score: 4, Insightful

      I was kind of wondering about the "modern operating systems" comment... I think he meant "desktop operating systems".

      What's a "desktop operating system" these days? The only mainstream OS that hasn't seen extensive use and development in SMP server environments for a decade plus is OS X. For all the others, "desktop" vs "server" is just a matter of the bundled software and kernel tuning.

      Even OS/2 could scale to 1024 processors if I recall correctly.

      Yeah. Just like those old PPC Macs were "up to twice as fast" as a PC.

    67. Re:Original Source and Actual Paper by jgagnon · · Score: 3, Informative

      To elaborate slightly further... If you had two CPUs on your motherboard with 8 cores each and four threads of execution per core, you'd have a total of: 2 CPUs, 16 cores, and 64 threads of execution.

      --
      Remember to maintain your supply of /facepalm oil to prevent chafing.
    68. Re:Original Source and Actual Paper by hardburn · · Score: 4, Insightful

      Trolling, I'm sure, but to people who take "GNU/Linux" seriously: how much of any given distro is really GNU code anymore? While GNOME may still be preferred by Ubuntu, there are also a lot of Kubuntu users, and many other distros seem to prefer KDE. Neither XFree86 nor X.Org were ever GNU. Smaller installations, like smartphones and home gateways (which often do run Linux, even if you can't install a custom version like DD-WRT), use busybox for their basic command line tools, and almost certainly do not use glibc. Debian even went for the eglibc fork, partially because Ulrich Drepper makes Theo de Raadt look like a nice guy. HURD has gone nowhere for 20 years now, even if it does have some neat ideas.

      Non-GNU GUI applications and libraries now make up a huge percentage of a desktop distro, Apache and custom web apps make up a big chunk of server code, and smartphones may or may not have any GNU code at all.

      So what's left of GNU code now? Well, gcc is likely to keep being the world's de facto C compiler (though even this was mainly because of the egcs fork way back when). I'm sure there will be legions of emacs users for years to come, and I guess a lot of people still prefer GNOME. GNU's basic command line tools and bash will no doubt still be used on servers and desktops. But is this really sufficient to warrant a "GNU/Linux" nomenclature, not to mention all the pedantry that surrounds it?

      To the AnonCow troll above: GNU code has nothing to do with how the kernel handles multicore processors, so your whole point is moot within this context.

      --
      Not a typewriter
    69. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 1, Interesting

      [expletives redacted] "A whole new operating system?" [expletives redacted]. It has nothing to do with the operating system. Are you guys too cool for school, or did you forget the von Neumann bottleneck (http://en.wikipedia.org/wiki/Von_Neumann_architecture#von_Neumann_bottleneck)?

    70. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0

      Seriously? You picked that over my submission?

      Ah man. If they'd known you had wanted to be special, they would have picked you for sure. Can you cope?

    71. Re:Original Source and Actual Paper by Rogerborg · · Score: 1

      Come on, you've been here long enough to know that you need to troll a little to get published. Next time try adding something about the Steve Jobs Reality Distortion Field making iOS scale to Infinity CPUs.

      --
      If you were blocking sigs, you wouldn't have to read this.
    72. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0

      Correct me if I'm wrong, but it seems like this is one situation where even RMS would agree with the use of the term Linux. Managing of CPU cores is something that happens entirely in the Linux kernel. The GNU component of the functional OS is unrelated to the problem being described. The changes necessary to ensure that the GNU/Linux OS will scale gracefully as the number of cores proliferates are entirely in the Linux component...the GNU component should need little to no changes.

      Your argument is like saying that an article on pancreatic cancer should instead call it human cancer because pancreases aren't alive without the rest of the parts of a human.

    73. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0

      I just laughed at the "we aren't anywhere near 48 cores" comment - there are already commercial products with more than 48 cores now. I mean even a crappy old T5220 pretends to have 64 CPUs due to the 8 CPU, 8 thread design.

      Sun had 64 CPUs back in the late '90s with their E10K product (which was a re-worked SGI design AFAIK). 48 CPUs/cores is now mid-range in the old-school Unix shops.

    74. Re:Original Source and Actual Paper by Old97 · · Score: 2, Funny

      Wow, you've convinced me. I'm canceling all my plans to migrate to OS/2. Thanks.

      --
      Very often, people confuse simple with simplistic. The nuance is lost on most. - Clement Mok
    75. Re:Original Source and Actual Paper by Jon_E · · Score: 1

      dude .. Stallman doesn't use the internet .. he might connect to send/receive email, but that's about it:
      http://richard.stallman.usesthis.com/

    76. Re:Original Source and Actual Paper by AaronLS · · Score: 1

      "we aren't anywhere near 48 "

      We are actually probably within 3-4 years of affordable mainstream 48 core servers. AMD is on track for delivering a 16 core CPU, and it is probably another couple of years at most for 32 cores. Considering a dual CPU system, that puts us at 64 cores. Now consider how much time it might take to develop the OS modifications. Depending on how accurate or inaccurate this article is, rewriting an entire OS will take at least that long if they started right now, but I have doubts about whether a complete rewrite is required. I think this is just another case of inaccurate, sensationalistic writing getting attention.

    77. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0

      On Solaris it is!

      Waaaiiittt, you're not on Solaris, are you? It sucks to be you!

    78. Re:Original Source and Actual Paper by tonique · · Score: 1

      It should be "DoubleGNULinux"!

    79. Re:Original Source and Actual Paper by Score+Whore · · Score: 1

      Actually the Niagara line from Sun/Oracle has 64+ "cpus" per socket. Depending on the exact model, a system like a T5240 will have 64 "cpus" per socket and two sockets for a total of 128 "cpus". The SPARC T3 processors have 16 cores with 8 threads per core, for a total of 128 threads of execution running from a single socket, each of which has its own registers, etc. From an OS scheduling point of view it is 128 cores.

    80. Re:Original Source and Actual Paper by Kymermosst · · Score: 1

      The problem is that your summary fell into the TL;DR category.

      Thanks for the links to the original papers, though.

      --
      "Alcohol, Tobacco, Firearms, and Explosives" should be a convenience store, not a government agency.
    81. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 1, Insightful

      Despite having such a high UID he's got a solid reputation

      Really? I've never thought so. To me, he's come off as a self-important blowhard who always thinks he has something interesting to add on every subject. He writes things that aren't really interesting but that he knows will be moderated up. There have been tons of instances where he's had the first post to +5 that adds little-to-no value to the discussion and I'm forced to collapse his thread to get to the interesting stuff posted by people with actual expertise in the subject (yes, those people do exist).

      To me, he's a karma whore of the first degree who I wish would refrain from posting on subjects where he can't add anything productive to the conversation. The moderation system is great for dealing with trolls, since they're -1'd into oblivion and ignored. It sucks for dealing with karma whores because the first +5 post will receive the bulk of the responses and it's way too easy to get a +5.

    82. Re:Original Source and Actual Paper by hierophanta · · Score: 1

      and i've seen ugly guys get hot chicks. i'm sorry but this logic applies to nothing

    83. Re:Original Source and Actual Paper by hsoftdev17 · · Score: 1

      I thought the same thing. There have been contests held by AMD for 48 core machines as the prize. (And that was about a year ago!) Intel has demonstrated 80 core prototype CPUs. Someday it's not unreasonable to expect everything to run on GPUs with 100s and 100s of effective cores. "Aren't anywhere near 48 cores" my ass.

    84. Re:Original Source and Actual Paper by bn557 · · Score: 2, Informative

      Cores often share cache. Separate CPUs rarely do. The problem in this case is, when you approach 48 Cores in 1 CPU, the accounting task for the cache users starts growing out of proportion to the performance gain from adding cores.

      --
      Humans are slow, inaccurate, and brilliant; computers are fast, accurate, and dumb; together they are unbeatable
    85. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0

      Dude, you didn't put enough Fear, Uncertainty, Despair in your summary. Plus, you didn't make an inflammatory reference to a perceived inferior OS.

    86. Re:Original Source and Actual Paper by KarmaMB84 · · Score: 1

      Depends on whether there are real compatibility breaks. Windows 7 runs pretty much all Vista software unmodified, exactly as it ran on Vista, so to avoid installers breaking for no good reason because they're just checking the major version number, they made Windows 7 "6.1". They claimed there were enough changes to warrant a major version number of 7 but made it 6.1 for compatibility. If there are actual compatibility breaks warranting a new version number, they'll probably use 8.0, or else label it 6.2 for compatibility.

    87. Re:Original Source and Actual Paper by Ultra64 · · Score: 1

      There's nothing wrong with tooting your own horn. At least, not when your horn is clearly superior to the horns others are tooting.

    88. Re:Original Source and Actual Paper by mitgib · · Score: 2, Informative

      You can have 48 cores today with a Quad G34 motherboard.

      --
      Being a spelling & grammar Nazi is a sign you do not poses the intelligence to contribute to the conversation
    89. Re:Original Source and Actual Paper by zizzo · · Score: 1

      This summary was anything but easily digestible. The second sentence is a nightmarish run-on and misuses "affect". I stared at it for 30 seconds trying to figure out what the hell was going on.

    90. Re:Original Source and Actual Paper by CRCulver · · Score: 3, Insightful

      Debian (and I suppose Ubuntu too) makes use of a lot of Bash scripts behind the scenes. Grub is still the boot loader of choice. A lot of installation CDs use parted to set up the hard drive. Just some examples off the top of my head.

    91. Re:Original Source and Actual Paper by dAzED1 · · Score: 5, Informative

      and YET...that's irrelevant, because as many people have pointed out, the problem is the cores that share L2 cache. There have been large systems with many, many processors for a long time, some of which run Linux. The problem that was described was 48 cores on a single die, sharing the same cache. Sun's die-to-die tech isn't relevant to this problem, nor is putting more than 6 8-core CPUs in a single system.

    92. Re:Original Source and Actual Paper by mcgrew · · Score: 1

      IIRC one of the ten fastest computers on the planet is running Linux, and in fact I was under the impression that Linux would run on anything from a wristwatch to a supercomputer. Is that fast top ten Linux computer multicore, or what?

      At any rate, it seems to me from your comment (which is much better than TFS IMO), that Linux as is will run fine on a computer with a hundred CPUs, it's just that a partial rewrite of the kernel will improve its performance.

      I don't know, guess I picked a bad title or something?

      No, your submission just wasn't inflammatory enough. The one that was posted made it look like Linux was flawed. Contrary to the popular meme, there are a LOT of windows fanbois here who have never even considered TRYING Linux and who are, in fact, scared shitless of it. Any supposed flaw in Linux cheers their hearts. Yes, this is supposed to be "news for nerds" but not everyone here is a nerd.

      Luckily we aren't anywhere near 48 cores and there is some time left to come up with a new Linux (Windows?).

      Unlike Linux, windows WON'T run on "anything from a wristwatch to a supercomputer". No Windows distro will fit a wristwatch, and iinm there are no supercomputers running windows. I don't think you could get windows to run on a supercomputer; IINM it's x86 only, while Linux can be compiled to run on almost anything.

      So while the 48 cores is a problem for Linux now (but only on the biggest machines), it's no problem for Windows because Windows just won't work AT ALL on a computer that big. Not yet.

    93. Re:Original Source and Actual Paper by mlts · · Score: 3, Informative

      I saw earlier today on another news site a post about something similar saying that no OS commercially made can support more than 32 cores.

      One of the followup postings was someone with an IBM 780 doing a prtconf|grep proc and showing 64 virtual processors on an LPAR. AIX supports up to 256 CPUs (physical or virtual.) I'm sure Solaris can do similar without breaking a sweat.

    94. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0

      Reading comprehension FTL. NevarMore was discussing eldavojohn's overall contribution to slashdot including posts containing good commentary.

    95. Re:Original Source and Actual Paper by apoc.famine · · Score: 1

      I've been feeling the same way for a long time now. However, I don't see that Reddit is any better. I skimmed the science section there and it's a mess of random cruft.

      Where do we go to get useful science and technology news, edited appropriately, with a comment system that doesn't completely suck?

      Slashdot would be fantastic if we could get some editors. However, the lack of any sort of useful work in that area is pretty close to driving me away even without an alternative. I blocked kdawson, which helped. But blocking absolute, mind-blowing suck doesn't help make the rest not suck.

      --
      Velociraptor = Distiraptor / Timeraptor
    96. Re:Original Source and Actual Paper by dgatwood · · Score: 4, Informative

      Well, gcc is likely to keep being the world's de facto C compiler (though even this was mainly because of the egcs fork way back when).

      Actually, I doubt that is true. At this point, the commercial UNIX vendors and the BSDs seem to be putting their weight behind Clang/LLVM/LLDB, in large part due to GCC going GPLv3. In addition to being a cleaner architecture that's easier to enhance than GCC, it is also faster, and it often produces much better code as well. The GNU toolchain's days as the de facto standard are numbered, IMHO.

      Back on topic, it occurs to me that large clusters with hundreds of cores start to inherently behave a lot more like NUMA and really need to be treated that way. Note that lots of modern OSes, including Linux, have supported NUMA in the past, so suggesting that it requires a completely rewritten OS is a preposterous assertion. That's not at all what this article is saying. What this article is saying is that tasks often are not easily divisible into pieces small enough to take advantage of multiple cores, and that managing processor affinity to ensure that threads working on the same data are run on the cores within the same physical die starts to become an unmanageable problem past a certain point.

      In effect, what it is saying is that barring interconnect improvements, for many classes of problems, the performance penalty caused by multiple cores needing to access the same data exceeds the performance gain from adding additional cores at or around 48 cores. No OS change will help this, and in many cases, no software changes can help this, either. Most computing tasks are simply not massively parallelizable. This conclusion should be entirely expected by anybody who has ever tried to parallelize software to any real degree, but it's always good to see studies that bear this out.

      Put another way, once you exceed about 48 cores, the cores start to act more like clusters than cores. You start to see more and more accesses in which one CPU has to force data out of another CPU's cache. The nonuniformity of memory accesses starts to dominate the access times. Thus, past about that point (and probably much lower for most problems), adding more cores no longer improves performance. Even for massively parallelizable problems like video compression, once you exceed a certain number of nodes doing the work, the time spent assembling the final data actually exceeds the performance win achieved by adding additional processing nodes. This is completely straightforward, completely understood by real-world computer programmers, and shouldn't really be a surprise to anyone.
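
      The tipping point described above can be illustrated with a toy cost model. Everything here is an assumption made for illustration: the function names, the linear coordination cost, and the constant COORD_COST, which is picked only so that the toy curve peaks at the 48 cores discussed in this thread. It is not a measurement and not the paper's model:

```python
def run_time(cores: int, coord_cost: float) -> float:
    """Toy model: perfectly divisible work (1/cores) plus a coordination
    cost that grows linearly with the number of other cores."""
    return 1.0 / cores + coord_cost * (cores - 1)


def best_core_count(max_cores: int, coord_cost: float) -> int:
    """Return the core count in 1..max_cores with the lowest modeled run time."""
    return min(range(1, max_cores + 1), key=lambda n: run_time(n, coord_cost))


# Hypothetical constant, tuned so the minimum lands at 48 cores.
COORD_COST = 1.0 / 48**2

print(best_core_count(256, COORD_COST))
print(run_time(48, COORD_COST) < run_time(96, COORD_COST))
```

      In this sketch the run time falls as cores are added until the coordination term starts to dominate, after which every extra core makes things slower. Where the real crossover lies depends on the workload and the interconnect.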

      I'm not convinced an OS change can fix this, nor even an architectural change, though both can help to some degree by making parallelization easier (e.g. by providing APIs for supporting work units arranged in a dependency graph like GCD as an alternative to raw thread-based APIs). At some point, though, you're bounded by the number of distinct pieces that a problem can be divided into that don't depend on the output of any other piece, and once you hit that limit, adding additional computational units can only hinder performance, not help it. Your only real choices, then, are to find new and interesting ways to refactor the problem so that this is no longer the case, to change the structure of the input data to remove dependencies, to increase the speed of the individual CPU cores, or to turn the machines loose processing more than one problem at any given time to keep the remaining cores occupied.

      Oh, yeah, and there's one other change that helps a lot: keep your read-only data in read-only pages, and write your code so that results go somewhere else. Read-only pages can be cached in every CPU without any real cache coherency overhead, at least in theory (I'm assuming that most modern CPUs do this), which means that input data sharing between CPUs doesn't matter. This design, combined with lockless work unit APIs, can make a huge difference in how many CPU

      --

      Check out my sci-fi/humor trilogy at PatriotsBooks.

    97. Re:Original Source and Actual Paper by wastedlife · · Score: 1

      While I understand why they kept the version number at 6.1 (the changes were definitely not as drastic as 5.2->6.0), I have trouble believing that's why it was named "7". Windows NT hasn't been referred to by its version number since 4.0. Why start now? And then why do something confusing like not naming it after the actual version number? Then again, they started NT off with 3.1, with no 1.0 or 2.0 preceding it... This is one of many reasons why Marketing should not dictate version numbers.

      --
      Said, "It's just like dice but it's got more sides And it tells me who lives and who dies"
    98. Re:Original Source and Actual Paper by lwriemen · · Score: 1

      Posting as "Anonymous Coward" and not citing any sources doesn't lend much credibility to your opinion. In fact, I've never heard this stated before outside of anti-OS/2 FUDsters in the newsgroups. It doesn't reconcile very well with this statement, "Ziff-Davis Labs observed a 90 percent improvement in throughput when adding one processor, and a 300 percent improvement when adding three processors." from http://www.databook.bz/?page_id=223

      For those interested in OS/2 SMP: http://www.edm2.com/index.php/OS/2's_Symmetrical_Multiprocessing_Demystified

    99. Re:Original Source and Actual Paper by aBaldrich · · Score: 1

      what about this: http://slashdot.org/~rms?

      --
      In soviet russia the government regulates the companies.
    100. Re:Original Source and Actual Paper by Angostura · · Score: 1

      Having edited a few IT mags in my time, I think your summary was just too wordy, with the main thrust getting buried. Here's my attempt at subbing your submission down, without murdering the meaning or jettisoning the salient points:

      It seems current operating systems inevitably slow down as the number of cores they are running on reaches around 48. An MIT team's simulation shows the cause to be increasing memory congestion as data is forced to remain in memory as long as a core might need it. In their paper (PDF), they show how Linux can be adapted in the medium term by implementing counters to track which cores are working on the data. Even this approach eventually runs into problems as the OS spends an increasing proportion of its time managing counters. The researchers caution that as the number of cores skyrockets, operating systems will have to be completely redesigned to handle managing these cores and SMP. Linux has got five to eight years until it needs a major redesign.

      My headline would probably have been along the lines of: "Multicore proliferation will force major Linux redesign, say MIT team"

    101. Re:Original Source and Actual Paper by DrgnDancer · · Score: 3, Informative

      SGI runs Single System Image Linux systems with over 1000 cores; that's not the problem. If you read the article, it seems that they aren't talking about the number of cores in the system, they're talking about the number of cores on a chip. Multicore chips use shared caches. The problem is that the algorithms used to handle CPU caching don't scale to really huge numbers of cores sharing the cache in a single chip. Having 4X16 core chips will work fine; having a single 64 core chip will present difficulties. At least that's how I understand the article.

      --
      I don't need a million points of light, just two points of multi-mode fiber and a 10 Gig-E router.
    102. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0

      You put a CPU on a socket and a CPU can have several Cores if I'm not mistaken.

    103. Re:Original Source and Actual Paper by tajribah · · Score: 1

      Amdahl's law is a gross oversimplification. It assumes that every problem consists of a part that is unavoidably sequential, while the rest is parallelizable in an unlimited way with no overhead. The reality is that almost every problem is parallelizable (with a few notable exceptions like the lexicographically minimal shortest path or constructing the DFS numbering of a graph, where we do not know whether an efficient parallel algorithm exists), but problems differ in the overhead imposed by their parallelization.
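
      For reference, the model being criticized is small enough to write down directly. This is a minimal sketch of the textbook formula, speedup = 1 / ((1 - p) + p / n); the function and parameter names are made up here, and it has exactly the limitation described above, namely no term for parallelization overhead:

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup predicted by Amdahl's law for a workload where
    `parallel_fraction` of the run time parallelizes perfectly."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# With 95% of the work parallel, the speedup can never exceed 1 / 0.05 = 20,
# no matter how many cores are added.
for n in (1, 12, 48, 1024):
    print(n, round(amdahl_speedup(0.95, n), 2))
```

      A model that accounts for overhead would add a cost that depends on the core count, which is where real problems differ from one another.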

    104. Re:Original Source and Actual Paper by Angostura · · Score: 1

      No, it wasn't a bullshit headline, at least if you consider eldavojohn's summary to be correct. The last line in his submission was: "After reviewing the paper, one researcher is confident Linux will remain viable for five to eight years without need for a major redesign."

      Now, just because he buried that as the last line means nothing, any decent editor is going to see that and realise that it means: "Linux is likely to need a major redesign within 5-8 years to cope with multicore proliferation". They will also realise that this is a pretty important deal and should be brought to people's attention.

    105. Re:Original Source and Actual Paper by hardburn · · Score: 1

      At this point, the commercial UNIX vendors and the BSDs seem to be putting their weight behind Clang/LLVM/LLDB, in large part due to GCC going GPLv3.

      Do you mean to say that Netcraft confirms that BSD is killing GCC?

      --
      Not a typewriter
    106. Re:Original Source and Actual Paper by short · · Score: 1

      So what's left of GNU code now?

      Choose any two GNU packages on your system. That is still more than that one Linux kernel.

    107. Re:Original Source and Actual Paper by shutdown+-p+now · · Score: 1

      We are supposed to read the articles? Why didn't anyone tell me about this before?!!

      But they did - there was an article about that not long ago.

    108. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0

      Nevertheless, we already are beyond that point...
      Power7 and SunSparc T2 are around 64 Threads per Chip and modern OS like Solaris have no trouble I'm aware of with massive amounts of cores/Chips...

    109. Re:Original Source and Actual Paper by careysub · · Score: 1

      To the AnonCow troll above: GNU code has nothing to do with how the kernel handles multicore processors, so your whole point is moot within this context.

      I think you should inform AnonCow that GNU is m-m-ooooot!

      --
      Starships were meant to fly, Hands up and touch the sky - Nicky Minaj
    110. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0

      How does a post that links our OT rant to the article get modded offtopic?

    111. Re:Original Source and Actual Paper by toofishes · · Score: 1

      So what's left of GNU code now?

      Choose any two GNU packages on your system. That is still more than that one Linux kernel.

      Yes, clearly all packages are made equal...

    112. Re:Original Source and Actual Paper by The+Wild+Norseman · · Score: 2, Funny

      SGI runs Single System Image Linux systems with over 1000 cores, that's not the problem.

      640 cores should be enough for anyone.

      --
      "A government is a body of people usually -- notably -- ungoverned." -Shepherd Book
    113. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0

      GNU's basic command line tools and bash will no doubt still be used on servers and desktops.

      Off and on for the last few years I've been porting over tools from FreeBSD, NetBSD, and OpenBSD. Thus far I have BSD versions of the following tools/suites:

      at, compress, coreutils, ed, findutils, gzip, init, nc (netcat), netkit, newsyslog, pax, rcs, syslogd, traceroute, whois, yacc, tar (from libarchive).

      Some poorly-written programs expect GNU tools (generally when building from source), but these tend to be easy to work around. I've been using these BSD tools for a while with few hitches. The huge advantage as far as I'm concerned is that I have useful man pages (GNU's man pages are, if they exist, generally pointers to info pages, which are extremely annoying when all you want is a concise summary of a program). It's only a bonus that my tools are under a more free license and contribute to a "GNU/Linux"-free environment. Plus I use zsh, not bash.

      There are still a number of GNU programs I have around (glibc, gcc {but I'm following/using clang a lot}, bc, make, and so on) that I probably will for the foreseeable future, but I'm slowly phasing them out as I can.

      Why not just use FreeBSD? In part because I've used Linux for over 15 years and am comfortable with its kernel/module build process, in part because I've created my own distribution (package system, init system, etc) that I'd hate to stop using (NIH and all that)... Although for a production system that other people would have to use, I'd take FreeBSD over any other Linux distribution. My system really does feel like BSD, by and large.

    114. Re:Original Source and Actual Paper by mini+me · · Score: 1

      Windows NT versions 1.0 and 2.0 exist. They went by the name OS/2. When OS/2 3.0 became Windows NT, dropping the version back down to 1.0 would have been marketing driven.

    115. Re:Original Source and Actual Paper by Guspaz · · Score: 1

      Sure, but we're not that far off. AMD has got 12 core Opteron x86 chips sitting around, and Intel's got 16 thread Xeon chips. Since Moore's law is at least currently alive and well, we might well hit 48 threads in 2-3 years, and 48 cores in 4-5.

    116. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0

      Great post, thank you for bringing it to a larger audience. I look forward to more cross-posts from LHB.

      You know, the more I think about it, the more it seems like it's not Linux I hate at all. It's GNU. Maybe LHB should be GNU Hater's Blog.

    117. Re:Original Source and Actual Paper by dmitriy · · Score: 1

      Intel announced Knights Corner, a 50-core x86 processor.

    118. Re:Original Source and Actual Paper by AaronLS · · Score: 1

      LOL. I remember my friends mocking me when I bought a computer with 24mb ram. They said I would never need that much...

    119. Re:Original Source and Actual Paper by Wonko+the+Sane · · Score: 1

      Based on the rate of change of the kernel it's safe to say that the versions that exist 5-8 years from now will be a complete redesign compared to the versions that exist today.

    120. Re:Original Source and Actual Paper by Shagg · · Score: 1

      it's to give the reader enough information to decide whether reading the full article is worth it

      You're joking, right? This is slashdot.

      --
      Unix is user friendly, it's just selective about who its friends are.
    121. Re:Original Source and Actual Paper by joib · · Score: 4, Informative

      Unfortunately, the summary as well as the short articles on the web were more or less completely missing the point. The actual paper ( http://pdos.csail.mit.edu/papers/linux:osdi10.pdf ) explains what was done.

      Essentially they benchmarked a number of applications, figured out where the bottlenecks were, and fixed them. Some of the fixes were done by introducing "sloppy counters" in order to avoid updating a global counter. Others were to switch to more fine-grained locking, switch to per-cpu data structures, and so forth. In other words, pretty standard kernel scalability work. As an aside, a lot of the VFS scalability work seems to clash with the VFS scalability patches by Nick Piggin that are in the process of being integrated into the mainline kernel.
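
      To give a feel for the idea, here is a rough single-threaded simulation of a per-core counter that only occasionally touches the shared total. It is a simplified batching variant written for illustration, not the paper's exact design and not kernel code; the class name, the `threshold` parameter, and the `global_updates` bookkeeping are all made up here:

```python
class SloppyCounter:
    """Per-core counts that are folded into a shared total only when a
    core's local count reaches a threshold (simulated, no real threads)."""

    def __init__(self, num_cores: int, threshold: int):
        self.global_count = 0            # shared, expensive to touch
        self.local = [0] * num_cores     # one cheap counter per core
        self.threshold = threshold
        self.global_updates = 0          # how often the shared total was written

    def add(self, core: int, delta: int = 1) -> None:
        self.local[core] += delta
        if abs(self.local[core]) >= self.threshold:
            self._flush(core)

    def _flush(self, core: int) -> None:
        self.global_count += self.local[core]
        self.local[core] = 0
        self.global_updates += 1

    def read(self) -> int:
        # An exact read has to visit every core's local count.
        return self.global_count + sum(self.local)


counter = SloppyCounter(num_cores=48, threshold=16)
for core in range(48):
    for _ in range(100):
        counter.add(core)

print(counter.read(), counter.global_updates)
```

      In this run the shared total is written a few hundred times instead of once per increment. The trade-off is that the shared value alone is only approximate, and an exact read becomes the expensive operation.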

      And yes, as the PDF article explains, the Linux cpu scheduler mostly works per-core, with only occasional communication with schedulers on other cores.

    122. Re:Original Source and Actual Paper by Guspaz · · Score: 1

      Even for massively parallelizable problems like video compression, once you exceed a certain number of nodes doing the work, the time spent assembling the final data actually exceeds the performance win achieved by adding additional processing nodes.

      Clearly you don't work on any of the h.265 proposals. When it takes almost an hour per frame to encode, the number of nodes where assembly time exceeds time savings is astronomical ;)

    123. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0

      Shut. The. Fuck. Up.

      Seriously, stop giving them ideas. Do not underestimate their willingness and capacity to turn obvious, noncontroversial things into fear-mongering jingoistic bullshit. If any Fox News researchers read Slashdot, they probably just copy/pasted that into their "Bullshit To Propagate" database.

      What I'm saying is, if Glenn Beck starts talking about IPv6 tomorrow, and how it's part of Obama's Orwellian strategy to destroy America, then I am blaming you.

    124. Re:Original Source and Actual Paper by amorsen · · Score: 2, Insightful

      I'm willing to bet that when mainstream 64-core general-purpose CPUs arrive, they will be NUMA and be partitioned in groups with shared cache. I will be surprised if all the cores have a shared cache other than possibly a large slow write-through level-4 cache. It would be very tricky to make an efficient modern cache with deferred writeback and access by 64 cores, and the gains over e.g. 4 smaller caches would be modest. The memory bandwidth requirements of a 64-core chip also make it very tempting to implement separate memory controllers for groups of cores instead of needing an extremely fast shared memory controller.

      So all in all, I think a very fast desktop tomorrow will look like a shrunk version of a modern NUMA server, at least when it comes to what the operating system can see.

      --
      Finally! A year of moderation! Ready for 2019?
    125. Re:Original Source and Actual Paper by tiptone · · Score: 1

      Aww fuck, and here I am with 0 mod points. +1 funny (close enough?)

      --
      Please don't read my sig.
    126. Re:Original Source and Actual Paper by joib · · Score: 1

      From the PDF article:

      We run experiments on a 48-core machine, with a Tyan Thunder S4985 board and an M4985 quad CPU daughterboard. The machine has a total of eight 2.4 GHz 6-core AMD Opteron 8431 chips.

    127. Re:Original Source and Actual Paper by DrgnDancer · · Score: 1

      You may well be right, and in that case I think the paper's analysis would need revision. Under the current methods for producing multicore CPUs, though, this will be a problem. Hence the paper.

      --
      I don't need a million points of light, just two points of multi-mode fiber and a 10 Gig-E router.
    128. Re:Original Source and Actual Paper by djdavetrouble · · Score: 1

      I acknowledge your desire for good summaries, but YMBNH. Apparently you never saw the now famous post announcing apple's ipod. Lame.

      --
      music lover since 1969
    129. Re:Original Source and Actual Paper by joib · · Score: 1

      From the PDF article (http://pdos.csail.mit.edu/papers/linux:osdi10.pdf ):

      We run experiments on a 48-core machine, with a Tyan
      Thunder S4985 board and an M4985 quad CPU daughterboard.
      The machine has a total of eight 2.4 GHz 6-core AMD Opteron 8431 chips.

    130. Re:Original Source and Actual Paper by dieth · · Score: 1

      he mad!

    131. Re:Original Source and Actual Paper by jopsen · · Score: 1

      As you pass a certain number of cores, modern operating systems will need to be redesigned to handle extreme SMP.

      Assuming you want one... operating system running all cores... When the number of cores starts skyrocketing, shared memory between all cores is going to be impossible anyway...
      In a distant future where Linux needs a rewrite, we will be doing distributed computing... not SMP... Face it, even if we could do SMP, no one would be able to program it...

    132. Re:Original Source and Actual Paper by Ben4jammin · · Score: 1

      Whoa there, Sparky...let's ease up with all this "crazy talk" about reading articles. If I wanted well reasoned informed posts I sure as hell wouldn't be here, now would I? I want hyperbole, insinuation, straight up lies, and for the icing on the cake talk about my momma.

    133. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0

      HP sells Proliant servers that can have up to 64 cores these days. Stuff an 8 socket DL980 with a bunch of X7560 processors and you've got 64 cores. I work in a shop where we run Windows Datacenter on it and the thing screams.

    134. Re:Original Source and Actual Paper by bonch · · Score: 1

      Well, hey, a boot loader and some Bash scripts sure sound like a valid reason to prefix everything with "GNU."

    135. Re:Original Source and Actual Paper by vinod4linux · · Score: 1

      Actually a lot of intensive tasks today are highly parallelizable.. for example network stacks. OS changes will definitely help when they are able to identify the distances between processing units and memory... I'm currently working on such projects and have seen the limitations due to lack of NUMA on systems with multiple CPUs, and the advantages of allowing CPUs to access "local" memory and reducing page access contention. The main problem seems to be that once you have more than one CPU, for any workload that has high memory access, your bottleneck is most likely going to be the memory controller.

    136. Re:Original Source and Actual Paper by GooberToo · · Score: 2, Insightful

      Completely agree.

      Of course, this all ignores the fact that Linux already scales well beyond 48 cores. Even more so, it appears the group is confusing bus contention for OS scalability. The problem is, using modern CPUs (cores), they are sharing caching, which is all too frequently the real problem. The shared cache leads to cache contention.

      Linux, right now, is capable of scaling well beyond 128 cores (err...cpus)...and more... It's just not standard code because the overhead is less optimal for 99.999% of the current user base. Basically this boils down to, Windows scales poorly. I've not met anyone who doesn't already know this.

      Long story short, News at 11, a story everyone already knows. No new news is now news. Basically they documented what everyone already knows for almost a decade now.

    137. Re:Original Source and Actual Paper by bonch · · Score: 1

      Good summaries do not offer commentary. Save the commentary for the comments.

      Not implying something in a submission increases the likelihood it will get rejected. When I learned to start implying things, my acceptance rate went up.

    138. Re:Original Source and Actual Paper by bonch · · Score: 1

      If you've heard of the inverted pyramid in journalism, you'd know that basic facts are stated in the first paragraph, and less relevant details follow in subsequent paragraphs. The first paragraphs of every news story are a summarization.

    139. Re:Original Source and Actual Paper by davester666 · · Score: 1

      In other news, lots of software will need to be updated and/or rewritten to work optimally with hardware THAT HAS YET TO BE CREATED.

      I'll bet MacOSX, Windows 7, BSD and every other x86 operating system would need significant changes to work optimally on a 48-core chip.

      --
      Sleep your way to a whiter smile...date a dentist!
    140. Re:Original Source and Actual Paper by The+Grim+Reefer2 · · Score: 1

      Good summaries do not offer commentary. Save the commentary for the comments.

      This is Slashdot dammit. A good summary is to cut and paste the first paragraph or two from the link.

    141. Re:Original Source and Actual Paper by im_thatoneguy · · Score: 3, Informative

      The original summary was lacking but the alternative proposed summary was WAY too long.

      It's just supposed to pique my interest enough to read the article, not run several pages.

    142. Re:Original Source and Actual Paper by Gilmoure · · Score: 1

      Hey, we're trying to incite violence here.

      How about: Nano is way betterz than eMacs and Vi

      or 9mm totally blows away .45's.

      Something that will get frothing people at the mouth.

      --
      I drank what? -- Socrates
    143. Re:Original Source and Actual Paper by eclectus · · Score: 1

      The Sun/Oracle Niagara II CPU is an 8 core, 8 thread per core CPU, and they put 4 physical CPUs in a box. That's 256 virtual CPUs (and up to 512GB RAM) in a 4U chassis. They've been shipping for 2 years.

      --
      This signature is a waste of 42 characters
    144. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0

      Yup, you beat me to it. AIX 7.1 scales to 256 cores and 1024 threads. The hardware guys have not gotten there yet, however. If you have a Power 795 and want to build an LPAR with more than 128 CPUs you have to get a special RPQ from IBM. So far this has not been a problem for any of my customers.
      - Anonymous IBM Solution Provider

    145. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0

      Yes, the summaries and headlines (i.e., the part that editors control) are sometimes crappy and even offensive. The linked articles frequently are blog entries that may not even link to the informative original. Still, it's the comments that keep me coming; amid the trolls, regurgitated prejudices and overdone memes there is almost guaranteed:

      • one gem of insight,
      • something I hadn't thought of before,
      • a silly remark to laugh heartily at and make my coworkers wonder about me again,
      • a worthy link,
      • a well-told personal story,
      • AND/OR a longer than usual piece of knowledge.

      To me that's worth the indignation

    146. Re:Original Source and Actual Paper by dgatwood · · Score: 1

      Ouch. :-)

      By "video compression", I was thinking more about actual, optimized, shipping codecs that are not computationally intractable. :-D

      --

      Check out my sci-fi/humor trilogy at PatriotsBooks.

    147. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0

      For gods (all of them) sake, Solaris has successfully been doing this since the late 90's (only partial success in the mid 90's). Why do we have to keep re-inventing the wheel!!!!

      Unfortunately Larry might not want you to use it any more, so best off stop wasting time on Linus's kernel and wheel in a real kernel with Linux libraries on top... oh wait Sun had done that with branded zones, but I think Larry has other plans for that too.... bugger.

      Sorry

    148. Re:Original Source and Actual Paper by gtall · · Score: 2, Funny

      Nah, jokes like "64xx should be enough for anybody" actually suck the humor out of the reader due to old age. The GP is probably an inmate in the Home for the Terminally Bland.

    149. Re:Original Source and Actual Paper by zopf · · Score: 1

      The fun part is virtualization. In virtualization, you buy four rooms and call them an office, but it's a magical office, where the building managers can choose to swap out the rooms, plumbing, and electricity at any time, but you can't tell because they paint it all to look exactly the same. Creepy, huh?

      --
      Did you see the pool? They flipped the bitch!
    150. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0

      What are you smoking? CmdrTaco posted a reply in a thread in September of '09! That's merely a single year ago (plus a few days).

    151. Re:Original Source and Actual Paper by Kumiorava · · Score: 2, Insightful

      I just read the original article that said they used 8 6-core processors to _simulate_ 48-core processor. It would be hard to experiment on a real 48-core processor as those are not readily available.

    152. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0

      read the article... it's not about "hyperthreading" as much as different cores.

      Whether it's 48 or not is up for debate... but there is a problem coming, and it's no great surprise. Generally software just doesn't run in parallel very well. Unless it's something well suited for it - perhaps a webserver with a high load (even then, according to this article, you will hit a wall where the OS spends more time coordinating the CPUs than running software).

      We split our apps into UI, and printing threads for example - which is fine for 2 or 3 cores. However, when you have 100 cores, 200, 300... then you have a MUCH greater need for software to split itself into parts that can run alongside each other - and most of our design techniques and tools just don't cut it.
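
      To put rough numbers on that: the usual back-of-the-envelope model here is Amdahl's law, which the article itself does not use, so treat this as an illustrative assumption rather than the paper's analysis. If a fraction p of the work can run in parallel, the ideal speedup on n cores is 1 / ((1 - p) + p / n). A minimal sketch (the function name is made up for illustration, and it ignores the coordination overhead the article is about, so real numbers would be worse):

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup on `cores` cores when only `parallel_fraction`
    of the work can be spread across them (Amdahl's law)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part takes the same time no matter how many cores exist;
    # only the parallel part is divided among them.
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Assumed workload: 90% of the program parallelizes cleanly.
    for cores in (2, 3, 48, 100, 300):
        print(f"{cores:>3} cores: {amdahl_speedup(0.90, cores):.2f}x")
```

      With 90% of the work parallel, the speedup can never exceed 10x however many cores you add, which is why a UI thread plus a printing thread does not get you far on 100 cores.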

    153. Re:Original Source and Actual Paper by dgatwood · · Score: 1

      Certainly. I/O in general is highly parallelizable because it involves working with individual chunks of data that have little or no dependency on any other data around them, at least up to the point where you have to have those packets assembled in order in an mbuf chain or whatever. At some point, ordering has to be enforced, and I'd expect this to become a bottleneck, though admittedly you could still parallelize throughput up to the point where you have one thread per open socket.
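
      A rough sketch of that shape, assuming packets are simple (sequence number, payload) pairs and using an uppercase conversion as a stand-in for the real per-packet work; none of these names come from an actual network stack. The per-packet step fans out across workers, and the ordering step at the end is the serial part:

```python
from concurrent.futures import ThreadPoolExecutor


def process_packet(packet):
    """Independent per-packet work; a hypothetical stand-in for
    checksumming or decoding. It touches no other packet's data."""
    seq, payload = packet
    return seq, payload.upper()


def process_stream(packets, workers=4):
    """Process packets in parallel, then enforce ordering in one serial step."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        processed = list(pool.map(process_packet, packets))
    # Serial bottleneck: packets may have arrived out of order, so they
    # must be put back in sequence before being handed to the reader.
    processed.sort(key=lambda item: item[0])
    return b"".join(payload for _, payload in processed)


if __name__ == "__main__":
    arrived = [(2, b"c"), (0, b"a"), (1, b"b")]
    print(process_stream(arrived))
```

      Adding workers only helps the first half; the reassembly still happens once per stream, which is where I'd expect the limit to show up.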

      --

      Check out my sci-fi/humor trilogy at PatriotsBooks.

    154. Re:Original Source and Actual Paper by NoOneInParticular · · Score: 1

      The article is about a single chip with 48 cores, not about 48 chips with one core. These have been there forever. There is a difference.

    155. Re:Original Source and Actual Paper by Bert64 · · Score: 1

      They were, up to twice as fast as a PC specially selected to be half the speed.

      --
      http://spamdecoy.net - free throwaway anonymous email - avoid spam!
    156. Re:Original Source and Actual Paper by aixylinux · · Score: 1

      To say nothing of the Power 7 795, which scales to 256 cores with 4-way SMT, for 1024 hardware threads.

      http://www-03.ibm.com/systems/power/hardware/795/index.html

      And AIX 7 will handle this in a single-system image:

      http://www-03.ibm.com/systems/power/software/aix/v71/preview.html

      Desktop operating systems, indeed.

    157. Re:Original Source and Actual Paper by Slime-dogg · · Score: 1

      Timothy posts often enough.

      --
      You need to restart your computer. Hold down the Power button for several seconds or press the Restart button.
    158. Re:Original Source and Actual Paper by Slime-dogg · · Score: 1

      The purpose of an editor is to edit any submissions to make them ready for print. If the summary was too long, the editor should have got off his arse rather than wait for the summary that fits the word count to come along.

      Or... They could choose the shorter submission out of the firehose. I highly doubt that there was a serious intention to wait around until a shorter submission showed up.

      --
      You need to restart your computer. Hold down the Power button for several seconds or press the Restart button.
    159. Re:Original Source and Actual Paper by Bob-taro · · Score: 2, Insightful

      Think of it in terms of cars. The processes are roads, the CPUs are cars and the cores are the seats in the cars, only the seats can each travel on different roads independently and share resources with the other seats in the same car. If you have a 2-seater and the seats are on different roads, they can obviously only go half as fast as if they are on the same road. Now if you have 48 seats in a car, then it isn't a car anymore, it's a bus, so obviously you'd have to make fundamental changes to the OS.

      When it comes to computers, you can never go wrong with a car analogy.

      --
      Prov 9:8 Do not rebuke mockers or they will hate you; rebuke the wise and they will love you.
    160. Re:Original Source and Actual Paper by tyrione · · Score: 1

      If you've heard of the inverted pyramid in journalism, you'd know that basic facts are stated in the first paragraph, and less relevant details follow in subsequent paragraphs. The first paragraphs of every news story are a summarization.

      One word, Abstract.

    161. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0

      I don't follow. Could you please rephrase into a car analogy?

    162. Re:Original Source and Actual Paper by Deathlizard · · Score: 1

      A 4-way 12-core AMD Magny-Cours system is 48 cores, is cheap, and is available now.

      AMD is already talking about eclipsing that on the server end, that is unless Oracle buys them and screws it up.

    163. Re:Original Source and Actual Paper by petermgreen · · Score: 1

      Debian (and I suppose Ubuntu too) makes use of a lot of Bash scripts behind the scenes
      Both ubuntu and debian now use dash as the default shell.

      --
      note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
    164. Re:Original Source and Actual Paper by Magic5Ball · · Score: 1

      Why do you assume the activity of /. is connected to journalism in the particular ways you imply?

      --
      There are 1.1... kinds of people.
    165. Re:Original Source and Actual Paper by Nethead · · Score: 1

      Like me, I'm sure you've found that a low UID doesn't matter for squat.

      --
      -- I have a private email server in my basement.
    166. Re:Original Source and Actual Paper by AVryhof · · Score: 1

      He uses an EMACS mode to post on Slashdot.

    167. Re:Original Source and Actual Paper by loufoque · · Score: 1

      I know for a fact that at least one of the h265 proposals, the one from Nokia, Tandberg and Ericsson, does run in real-time, since we have it working on prototype videoconferencing software.

    168. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0

      I would dig deeper into the subreddits. You can fully replace all the news Slashdot delivers with very good commentary. Generally Reddit gets much better the smaller you slice it. Consider it the Homeopathy of discussion groups. /science is a bit broad, but you break it down and you find some gems.

      http://www.reddit.com/r/technology
      http://www.reddit.com/r/space/
      http://www.reddit.com/r/physics/
      http://www.reddit.com/r/particlephysics
      http://www.reddit.com/r/chemistry/
      http://www.reddit.com/r/biology/
      http://www.reddit.com/r/PhilosophyofScience
      http://www.reddit.com/r/cyberlaws
      http://www.reddit.com/r/hardware
      http://www.reddit.com/r/netsec

    169. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0

      Even OS/2 could scale to 1024 processors if I recall correctly.

      Because #define MAX_CORES 1024 means it scales efficiently to 1024 cores. Right?

    170. Re:Original Source and Actual Paper by FooHentai · · Score: 1

      Or 2 buildings, 16 rooms and 64 dividers. Plus with Hyperthreading probably a water cooler and some pot plants.

    171. Re:Original Source and Actual Paper by kramulous · · Score: 1

      While one side of the divided room has more people in it, the divider can be moved to allow greater space. When that meeting is over and the other side of the room requires the space, the divider moves.

      --
      .
    172. Re:Original Source and Actual Paper by treeves · · Score: 1

      How do you then explain the Bennett Haselton (am I remembering the right guy?) summaries that /. uses? Maybe it's been a while but they could run *very* long.

      --
      ...the future crusty old bastards are already drinking the Kool-Aid.
    173. Re:Original Source and Actual Paper by Just+Some+Guy · · Score: 1

      I have unsubscribed from Slashdot today due to the trend typified in your article VS the one published. (No this is not a new trend, but I'm fed up and finished with it.) See you on Reddit's Science/Linux/Everything else

      You read Slashdot via RSS? No wonder you don't like it. See, the thing is that we all know the stories suck. Furthermore, they've always sucked. There isn't some Golden Age of Slashdot when the editors edited and the headlines matched the summary which matched the linked article. The point of Slashdot isn't in the news but in the discussions that follow. I promise you that I've learned far more by listening to subject matter experts who disagree to the death with each other and expound on why the other person's solution is clearly inferior to their own than I ever have by reading the article. Did you think all those "What?!? Read the article?!? Read the summary?!?" comments were jokes?

      If all you want to do is catch an interesting headline and a paragraph about something cool, look elsewhere. If you want to hear the behind-the-scenes stories and talk directly to computer scientists and particle physicists and lawyers and doctors and farmers and mechanics about their favorite fields, then you're in the right place.

      --
      Dewey, what part of this looks like authorities should be involved?
    174. Re:Original Source and Actual Paper by Thinboy00 · · Score: 1

      How about: Visual Studio is way betterz than eMacs and Vi

      FTFY. Remember, we're trying to incite violence, not irritation.

      --
      $ make available
    175. Re:Original Source and Actual Paper by nanospook · · Score: 1

      Ok everyone check their join number.. it should decrement by one..

      --
      Have you fscked your local propeller head today?
    176. Re:Original Source and Actual Paper by nanospook · · Score: 1

      My 32K apple beats that story all hollow :)

      --
      Have you fscked your local propeller head today?
    177. Re:Original Source and Actual Paper by Sxooter · · Score: 1

      The big problem with those chips (the 8xxx series AMDs) is that they have an 800MHz memory bus, and you just can't pump enough data into and out of them at that speed to keep them all busy. I've got a quad 12 core magny cours, and it can pump around 80 to 100Gigabytes per second into / out of memory. If they want to borrow some of my down time to benchmark I'll gladly cooperate.

      --

      --- It is not the things we do which we regret the most, but the things which we don't do.
    178. Re:Original Source and Actual Paper by mjwx · · Score: 2, Funny

      The thing is eldavojohn practically *is* an editor for /. , just check out his submission page. Despite having such a high UID he's got a solid reputation, a good writing style, and offers good commentary on a wide variety of topics.

      Which is exactly why he _can't_ be a /. editor.

      --
      Calling someone a "hater" only means you can not rationally rebut their argument.
    179. Re:Original Source and Actual Paper by daveime · · Score: 1

      How about: Visual Studio is way betterz than eMacs and Vi

      FTFY. Remember, we're trying to incite violence, not irritation.

      FTFTFY. Remember, we're trying to incite violence, not uncontrollable laughter.

    180. Re:Original Source and Actual Paper by daveime · · Score: 1

      So the L2 shared cache is like the coffee machine ?

    181. Re:Original Source and Actual Paper by walshy007 · · Score: 1

      The problem only exhibits itself when there are 48 cores on a single cpu socket and die.

      We are still a while away from that, and the changes to accommodate it aren't that severe, so likely the kernel will just deal with things as it has to as new hardware is made and exists.

    182. Re:Original Source and Actual Paper by FrootLoops · · Score: 1

      I disagree. A poor summary makes me go straight to the story's comments for a decent one. This story is a good example, and the comments included a particularly good summary (the above).

      I'd imagine the majority of readers would prefer longer, better summaries than shorter, less informative ones. Of course, without a detailed survey and such I can't say for sure.... My point, though, is that I doubt there's much consensus on what a summary is "supposed" to do, and randomly saying one attribute is required without backup is silly.

    183. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0

      I was thinking more like "each room has 2 monkeys on typewriters in it".
      But I guess "movable dividers" works.

    184. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0

      Linux already powers servers with thousands of processors in them. From companies such as Silicon Graphics for example.
      I worked at Silicon Graphics many years ago when we first sold a server with more than 1000 cpus, and it was running Linux.
      If the point here is processors with more than 48 cores, then maybe those chips should be tuned to be better managed and exploited in computers.

      Of course system design has to be adapted, the OS has to be adapted, compilers have to be adapted, and even programmers have to adapt to get the full benefit of new architectures.

      Bottom line is this post is not really serious, or at least it does not address the problem correctly.

    185. Re:Original Source and Actual Paper by g4b · · Score: 1

      true, your simple english article does read a little bit longer.

    186. Re:Original Source and Actual Paper by RocketRabbit · · Score: 1

      Your mistake is that you didn't link to an ad-infested garbage tech site. Nobody makes any money with your sub.

      It's rare to see a /. post that has a link directly to the meat of an issue these days, for just that reason. You can't do any search engine optimization for a third party your way.

      Quit submitting. I did years ago.

    187. Re:Original Source and Actual Paper by DrXym · · Score: 1
      GNU is just part of a regular Linux distribution, the sum of which people quite happily refer to by the simple moniker Linux, or if more context is required Red Hat Linux etc.

      It strikes me as immature and selfish that anyone insists any dist be called GNU/Linux. There are substantial non-FSF parts to all mainstream dists. Parts from Apache, X11, Mozilla, Sun, Novell etc. Parts that make the dist useful for something and all of which deserve credit. But unless we intend to say Mozilla/Apache/Perl/Aladdin/X11/MIT/BSD/RedHat/Novell/Sun/Trolltech/GNU/etc./Linux, just plain old Linux is perfectly simple to understand and is also the right thing.

      Insisting on calling it GNU/Linux is disingenuous, dishonest, nonsensical and probably motivated by sour grapes. If the FSF doesn't like people running Linux and using the many non-FSF contributions that make it useful, perhaps they should get cracking on Hurd.

    188. Re:Original Source and Actual Paper by spottedkangaroo · · Score: 1

      AC, that's a -1: didn't like.

      --
      Imagine if you weren't allowed to use roads because a bus company complained about your driving 3 times. --skunkpussy
    189. Re:Original Source and Actual Paper by r_a_trip · · Score: 1

      Well, hey, a boot loader and some Bash scripts sure sound like a valid reason to prefix everything with "GNU."

      Ah, the shortsightedness when it comes to looking at history.

      If RMS hadn't started GNU in the early eighties, then Linus Torvalds wouldn't have had the free and rich operating system tools with which he complemented his Linux kernel.

      If Linus wouldn't have made the combination of Linux and the GNU toolchain, then KDE and Gnome wouldn't have had Linux to start their ascent upon.

      If the whole Linux + GNU + DE thing hadn't materialized then most of the big corporations wouldn't have gotten interested and poured in so many resources. Which would have led to many of the current applications not being developed in the first place.

      So I'd say it is more than deserved that staid old GNU gets some kudos. If that needs to be done by calling a distro GNU/Linux, I don't know, but GNU is a very important part of what makes up a modern Linux distro.

      --
      # touch universe # chmod +rwx universe # ./universe
    190. Re:Original Source and Actual Paper by CAIMLAS · · Score: 1

      Actually, I doubt that is true. At this point, the commercial UNIX vendors and the BSDs seem to be putting their weight behind Clang/LLVM/LLDB, in large part due to GCC going GPLv3. In addition to being a cleaner architecture that's easier to enhance than GCC, it is also faster, and it often produces much better code as well. The GNU toolchain's days as the de facto standard are numbered, IMHO.

      As far as Linux is concerned, the kernel and most of the common userland will also compile "just fine" with Open64 these days, which offers significant performance improvements over gcc.

      gcc has been a bit dated for some time now - try almost a decade. The performance of the resulting binaries is sad, even when compared against older versions of Microsoft's compilers (which now stomp it thoroughly.) It got a 'rewrite' with 4, but a lot of projects still use 3. I'd say it's probably about time for a change.

      --
      ~/ssh slashdot.org ssh: connect to host slashdot.org port 22: too many beers
    191. Re:Original Source and Actual Paper by inKubus · · Score: 1

      You must be new here.

      --
      Cool! Amazing Toys.
    192. Re:Original Source and Actual Paper by inKubus · · Score: 1

      There's probably some type of mutex or something (probably called something else in hardware-land) that grants access to a given core to the L2. The more cores you have running, the greater the chance of lock contention (probably called something else in hardware land). So what you do is split the L2 up into chunks, preferably 1-2x the number of cores. Then each one has their own lock. And then you have a massive crossbar switch that connects each core to all the caches. Wait, this sounds like the Opteron.
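
      For what it's worth, the software version of "split it into chunks, each with its own lock" is usually called lock striping. Here's a minimal sketch of that, purely as an analogy; I'm not claiming this is how any real cache arbitrates access, and the class and method names are made up for illustration:

```python
import threading


class StripedCounterMap:
    """Counters guarded by several locks instead of one big lock.

    Keys that hash to different stripes can be updated at the same time;
    only keys sharing a stripe contend with each other.
    """

    def __init__(self, stripes=16):
        self._locks = [threading.Lock() for _ in range(stripes)]
        self._shards = [dict() for _ in range(stripes)]

    def _index(self, key):
        # Pick the stripe (and therefore the lock) from the key's hash.
        return hash(key) % len(self._locks)

    def increment(self, key, amount=1):
        i = self._index(key)
        with self._locks[i]:
            shard = self._shards[i]
            shard[key] = shard.get(key, 0) + amount

    def get(self, key):
        i = self._index(key)
        with self._locks[i]:
            return self._shards[i].get(key, 0)


if __name__ == "__main__":
    counters = StripedCounterMap(stripes=4)

    def worker():
        for key in range(8):
            for _ in range(1000):
                counters.increment(key)

    threads = [threading.Thread(target=worker) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print([counters.get(key) for key in range(8)])
```

      More stripes means less chance that two workers want the same lock at once, at the cost of more locks to carry around, which is the same trade-off as chopping up the cache.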

      --
      Cool! Amazing Toys.
    193. Re:Original Source and Actual Paper by PitaBred · · Score: 1

      That's Hyperthreading. AMD's upcoming single-core multithreading is different, and I think it has the potential to be much more efficiently divided

    194. Re:Original Source and Actual Paper by smeaggie · · Score: 1

      I would think they received one of these by now: http://www.engadget.com/2010/04/10/intels-48-core-processor-destined-for-science-ships-to-univers/
      But maybe the profiling is easier in a simulated environment?

    195. Re:Original Source and Actual Paper by Thuktun · · Score: 1

      Yeah, the summary was horrible.

      It appears that the problem [...] is getting worse and may be hitting a peak somewhere in the neighborhood of 48 cores.

      This suggests that the problem will get better beyond 48 cores, which is clearly twaddle.

    196. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0

      It's not uncommon to see someone lift the first paragraph of the original article without adding anything of their own. That's about as neutral as it gets.

    197. Re:Original Source and Actual Paper by the_womble · · Score: 1

      I believe Ubuntu actually runs its shell scripts on Dash, although it does use Bash as the default login shell (but that is easy enough to change). The boot loader is not part of the OS, neither is the partition editor.

    198. Re:Original Source and Actual Paper by the_womble · · Score: 1

      in large part due to GCC going GPLv3.

      Why on earth is that a problem? I cannot think of any aspect of the differences between GPLv2 and GPLv3 that will have much effect on the adoption of a compiler.

    199. Re:Original Source and Actual Paper by wastedlife · · Score: 2, Informative

      While NT was originally supposed to be called OS/2 3.0, it was a new OS developed by Cutler and some other devs from DEC, not continued development of the OS/2 code.

      --
      Said, "It's just like dice but it's got more sides And it tells me who lives and who dies"
    200. Re:Original Source and Actual Paper by graffix01 · · Score: 1

      Hmm, I stand, err, sit, corrected!

      --
      Women don't want to hear what you think. Women want to hear what they think, in a deeper voice.
    201. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0

      And here I'm wondering: why exactly are you such a fanatic anti-GNU zealot?

    202. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0

      GNU is usually the biggest single part of a distribution (take the most popular one: Ubuntu). Insisting on calling it Linux instead of just GNU is dishonest, nonsensical and all that other stuff you mentioned.
      Using the illogical choice of "Linux" is motivated by idealism, especially that of catering to commercial interests and distancing oneself from Free Software (see the OSI). In the early days GNU was seen as the "big system" and Linux as a complement. But strangely, to a large part because of IBM's and Red Hat's marketing campaigns (among others), the inverse has become the accepted norm today.

    203. Re:Original Source and Actual Paper by ps2os2 · · Score: 0

      I cannot speak for other OSes, but the two major IBM OSes have been able to handle 31 CPUs since the 1970s. With IBM's more current boxes the max has been upped to 64. If I remember correctly, a few studies done by IBM (on their hardware and software) found that somewhere around 15 CPUs the line goes down. In other words, there is more penalty due to overhead and you do not get any more real throughput.

      IBM's architecture is a lot more stringent than Intel's, and a lot of processing is done outside of the CPU, so the CPU time for that internal processing is not charged to anyone. I am not sure how Intel does its "thing", but IBM documents the entire process in publicly available (free) manuals (or PDFs). I have yet to see how Intel protects two (or more) different CPUs from updating the same location in memory; IBM does it very swiftly and it really works well.

      I once worked on an early multiprocessor from IBM, and IBM's code was solid from day one. The same can't be said for other vendors, as we had to shoot several bugs of that type. I believe it happens quite often (at least on the IBM software side). From day one of IBM's first publicly available MP box, the then-current OS (MVS) was written with multiple processors in mind, and you could delete and add processors (and storage and channels) dynamically. We needed the feature quite often and would take storage offline and also take CPUs offline. We never ran into issues doing so, but then IBM did write and test the OS better than anyone else has to date.

    204. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0

      Debian uses /bin/dash for its "scripts behind the scenes", because (given that a script doesn't include any bashisms) it does the same job and is much faster.

      I've had /bin/sh linked to /bin/dash for years now, and... Well, go have a look, on my system it's:

      $ grep -Rhc '#!/bin/bash' /etc/ | perl -e 'my $a;$a+=$_ foreach(<>); print "$a\n"'
      16

      For /bin/sh, it's 315.

    205. Re:Original Source and Actual Paper by DrXym · · Score: 1
      Don't be silly Mr AC. I call Windows Windows regardless of what other software I run on it. I call OS X OS X regardless of what other software I run on it. And I call Linux Linux regardless of what other software I run on it. A single word is a perfectly adequate way to refer to the OS.

      The arguments made by the FSF that it should be called GNU/Linux are incredibly weak and incredibly selfish. There is no idealism at work in laughing at those arguments. Linux is the OS and Linux is a name everyone understands.

      And in case you think the same should not apply to the Linux kernel in some situations - wrong. Android is Android even though it has a Linux kernel in it. WebOS is WebOS even though it has a Linux kernel in it. People who advance the theory that Android should be called BSD/Linux or Linux/BSD/Android or even Linux/BSD/GNU/Android should be laughed at with equal force to those insisting on GNU/Linux.

    206. Re:Original Source and Actual Paper by anaesthetica · · Score: 1

      I've seen Pudge post a number of times.

    207. Re:Original Source and Actual Paper by Anonymous Coward · · Score: 0

      Hmm. I wasn't trying to be silly, but your arguments certainly aren't stronger than those you call for laughing at.
      OSX is OSX regardless of what other software I run on it, and I don't call it XNU/Mach because of its kernel. Windows is Windows regardless of what software I run on it, and I don't call it WinNT because of its kernel. And GNU is GNU regardless of what software I run on it, yet people still insist on calling it by its kernel.

      You might claim that the kernel is the central part of the OS, and it clearly is one of the most important pieces. But from a technical perspective, my applications don't run on the kernel - they run on the GNU libc, a part of the system as important as the kernel. It's only non-devs or kernel-only devs who call the kernel the OS ;)

      Android is Android no matter what kernel, the same for WebOS, and Debian is Debian no matter whether you download the Linux, kFreeBSD or HURD variant - they're all GNU distributions, but only one of the three has anything to do with Linux.
      I don't see how mentioning Android/WebOS helps your case at all... it rather works against you, doesn't it?

      GNU is, along with the kernel, the foundation of the system (think of all the core tools, utilities and helpers). I can't see anything selfish in trying to be honest. Sure Linux is understood by everyone, because this brand has been hyped a lot. It just depends on whether you value proper attribution and honesty more than... well, fandom?

      Mindlessly repeating the mantra that it's called Linux only is something I see way too often nowadays. Sadly.

      Oh well, maybe I should finally bother to create a Slashdot account, just so as my postings don't start with a score of 0...

    208. Re:Original Source and Actual Paper by spazdor · · Score: 1

      If anything, it constitutes agreement or endorsement of the editorial decisions made by the author of TFA.

      --
      DRM: Terminator crops for your mind!
    209. Re:Original Source and Actual Paper by Kaldaien · · Score: 1

      The paper refers to none of these parts of an Operating System, however. You can safely assume that anytime the paper mentions "Operating System," it is talking about the kernel. User-land components have almost nothing to do with _how_ the kernel deals with resource allocation, aside from requesting a subset of the resources.

      And there are plenty of Linux-based Operating Systems that do not use GNU, for your information. Embedded devices being the king.

  2. Barrelfish by Anonymous Coward · · Score: 1, Informative

    This is exactly why people are doing research on Barrelfish (http://www.barrelfish.org/).

    1. Re:Barrelfish by ciderbrew · · Score: 1

      The more I see photos like that, the more I think computers make you go bald.

    2. Re:Barrelfish by Eladith · · Score: 1

      From what I remember (hopefully correctly), engineering cache-coherent shared memory architectures gets increasingly difficult as core counts go up. Intel built an experimental processor that uses low-latency message passing instead of a shared, coherent cache. Nvidia and ATI have quite capable vector architectures whose processors do not share a cache, and the IBM Cell SPUs are somewhat capable as well. Distributed memory and message passing seem to have a bright future. Don't SMP systems these days look and act internally like distributed systems?

      When we get to 48 cores, the same programming models might not be available at all. Barrelfish seems like quite a sensible and interesting preparation for the future. Or perhaps Plan 9 will finally get to rise. I think I'll learn Go in the meantime; it has this interesting concept of "sharing memory by communicating".
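
      As a rough illustration of the "sharing memory by communicating" style mentioned above, here is a minimal Python sketch. It is not Barrelfish or Go code; the names worker and message_passing_sum are made up for this example. Python threads still share one address space, so this only shows the programming model: each worker keeps private state and sends a single message instead of updating a shared variable.

```python
import queue
import threading


def worker(chunk, outbox):
    # Each worker only touches its own private data...
    local_total = sum(chunk)
    # ...and communicates by sending one message, not by updating shared state.
    outbox.put(local_total)


def message_passing_sum(chunks):
    outbox = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(chunk, outbox)) for chunk in chunks
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # The coordinator combines exactly one message per worker.
    return sum(outbox.get() for _ in chunks)


if __name__ == "__main__":
    print(message_passing_sum([[1, 2, 3], [4, 5], [6]]))
```

      The design point is that nothing outside a worker ever writes to that worker's running total, so there is no shared counter for the workers to fight over.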

  3. Linux already runs on thousands of cores by Chirs · · Score: 2, Insightful

    SGI has some awfully big single-system-image linux boxes.

    I saw a comment on the kernel mailing list about someone running into problems with 16 terabytes of RAM.

    1. Re:Linux already runs on thousands of cores by Gaygirlie · · Score: 4, Interesting

      It's not a question of whether it can be done, but of where the performance regressions are. Of course it's possible to run Linux on multiple hundreds of cores, but it seems that after 48 cores there is a performance regression and thus all those cores don't benefit as much as they could. That is the issue here.

    2. Re:Linux already runs on thousands of cores by DrgnDancer · · Score: 4, Informative

      I thought this as well, but after more carefully reading the article, I *think* I see what the problem is. It's not really a problem with large numbers of cores in a system, so much as a problem with large numbers of cores on a chip. Since the multicore chips share caches (level 2 cache is shared, level 1 cache isn't IIRC, but I could be wrong) it's actually cache memory where the issue lies. I've worked on single system image SGI systems with 512 cores, but those systems were actually 256 dual core chips. That works fine, and assuming well written SMP code performance scales as you'd expect with number of cores.

      --
      I don't need a million points of light, just two points of multi-mode fiber and a 10 Gig-E router.
    3. Re:Linux already runs on thousands of cores by TheRaven64 · · Score: 3, Interesting

      SGI has some awfully big single-system-image linux boxes.

      Not really. SGI has big NUMA machines, with a single Linux kernel per node (typically under 8 processors), some support for process / thread migration between nodes, and a very clever memory controller that automatically handles accessing and caching remote RAM. Each kernel instance is only responsible for a few processes. They also have a lot of middleware on top of the kernel that handles process distribution among nodes.

      It's an interesting design, and the SGI guys have given a lot of public talks about their systems so it's easy to find out more, but it is definitely not an example of Linux scaling to large multicore systems.

      --
      I am TheRaven on Soylent News
    4. Re:Linux already runs on thousands of cores by Gaygirlie · · Score: 2, Interesting

      Since the multicore chips share caches (level 2 cache is shared, level 1 cache isn't IIRC, but I could be wrong) it's actually cache memory where the issue lies.

      That's what I thought too, but after thinking about it a bit more I'd dare to claim it's both a hardware and a software issue. Too small a cache does of course cause issues like those the researchers noticed, but it's mostly the way memory accesses and the cache are handled in software that makes it such a big issue. Rethinking how the kernel handles this could very well minimize the impact even in cases where there is not all that much cache available.

      Of course, I'm not an expert in SMP or multi-core systems, so I could very well have misunderstood it.

    5. Re:Linux already runs on thousands of cores by Troy+Baer · · Score: 3, Interesting

      Um, no. The early Itanium-based Altixes (Altices?) could go up to 512 cores running a single copy of Linux. The new Nehalem-based Altixes can have up to 2048 cores in a single system image IIRC. We just finished acceptance testing on an SGI Altix UV 1000 with 1024 cores. It runs one copy of Linux on it.

      --
      "My life's work has been to prompt others... and be forgotten." --Cyrano de Bergerac
    6. Re:Linux already runs on thousands of cores by Anonymous Coward · · Score: 0

      We just finished acceptance testing on an SGI Altix UV 1000 with 1024 cores. It runs one copy of Linux on it.

      I bet that saves on licensing costs.

    7. Re:Linux already runs on thousands of cores by prograde · · Score: 1

      I beg to differ. I have access to the following system:

      $ grep -c processor /proc/cpuinfo
      127
      $ uname -a
      Linux xxxxx 2.6.16.54-0.2.12-default #1 SMP Fri Oct 24 02:16:38 UTC 2008 ia64 ia64 ia64 GNU/Linux
      $ free
      total used free shared buffers cached
      Mem: 255840720 81293200 174547520 0 16 0
      -/+ buffers/cache: 81293184 174547536
      Swap: 73243120 0 73243120

      ...that's a single instance with 127 processors and 256GB of ram. It rocks. If you have the means, I highly recommend picking one up.

    8. Re:Linux already runs on thousands of cores by Anonymous Coward · · Score: 0

      This is wrong! The followup by Troy Baer is correct. I could walk upstairs and touch the 2048 core SMP Altix that has been running for a couple years at NASA Ames. It runs a single copy of Linux. SGI had scaled IRIX to 1024 cores on a single OS instance, so they built off of that work to scale Linux to 2048.

    9. Re:Linux already runs on thousands of cores by markus_baertschi · · Score: 1

      I don't know about SGI, but I do know that the biggest AIX/Power7 box you can get today has 256 cores (32 chips with 8 cores each). I know there was kernel work involved, but the scaling problems I remember being mentioned involved the number of threads (1024), as the P7 core has 4-way multithreading.

      Linux certainly may need work for such machines, but a new operating system? Bullshit.

    10. Re:Linux already runs on thousands of cores by Troy+Baer · · Score: 1

      I bet that saves on licensing costs.

      You'd think that, but a lot of HPC software gets priced either by-core or by-socket...

      --
      "My life's work has been to prompt others... and be forgotten." --Cyrano de Bergerac
    11. Re:Linux already runs on thousands of cores by Anonymous Coward · · Score: 0

      Yes, it is a single copy running on a very large number of CPUs. I've seen a lot of scalability patches related to that on LKML. Linux works and scales up to at least 2048 CPUs (I think we can get to 4096 actually), and short of 16TB of memory (soon to be fixed, apparently someone hit that ceiling and the patch is already ACKed for the next release).

      But that is NUMA, with a nice, sane cache hierarchy and topology. Sharing L2 across many cores can get nasty very fast, as the MIT research *proves*.

      Memory management starts getting nasty with too much RAM and too many zones, though. There will be limits there as well. It does come to a point where it would make more sense to have two hardware partitions that are completely independent, but with an extremely fast interconnect (i.e. a cluster).

    12. Re:Linux already runs on thousands of cores by falcon_dark · · Score: 1

      You're right! And Windows seems to suffer from a performance regression after 1 core! We really need a complete OS rewrite ASAP!!!! just joking...

    13. Re:Linux already runs on thousands of cores by gilboad · · Score: 1

      The linked article doesn't really explain what was simulated and how. (At least not in depth, unless I missed something).

      However, I cannot disagree more.
      In one of my previous projects we developed kernel-based traffic monitoring software that was originally designed when top-of-the-line servers (such as the HP DL-585G1) had four sockets with single-core CPUs (a total of 4 cores).
      The same software scaled -linearly- (*) when 4-socket machines with 12-core (AMD Opteron) and 8-core / 16-thread (Xeon EX) CPUs were released. (A total of 48 cores on AMD Opteron machines and 32 cores / 64 threads on Xeon EX machines.)

      Now, if the Linux kernel (at least the short code paths that we used) had severe problems with scaling above, say, 48 cores, a 64-core Xeon EX would have scaled poorly compared to a 48-core AMD Opteron machine, let alone a 24-core Xeon EP (dual socket), but our experience seems to suggest otherwise.

      - Gilboa

      * Actually, we saw better-than-linear scaling, but this can be attributed to the huge L2/L3 caches in modern CPUs.

    14. Re:Linux already runs on thousands of cores by gilboad · · Score: 1

      Pressed send too early (3am here).
      My example goes to show that one must not treat the Linux kernel (or any other kernel for that matter) as a single entity that "needs redesign".
      The code paths that we used might have been huge-SMP safe, but it's very likely that other parts of the kernel might not be.
      When talking about redesign, one should be specific: Are they talking about the memory management? network stack? driver layer? workqueues? scheduler? IRQ handling? etc, etc, etc.

      - Gilboa

    15. Re:Linux already runs on thousands of cores by __aardcx5948 · · Score: 1

      Cat /proc/cpuinfo & paste here? ;)

  4. Error in their math by El_Muerte_TDS · · Score: 5, Funny

    They have a one-off error in their math, it's actually 9 times a 6 core CPU. So, at 42 cores a rewrite is needed.

    1. Re:Error in their math by Anonymous Coward · · Score: 1, Funny

      Dude, nobody makes jokes in base 13!

    2. Re:Error in their math by Anonymous Coward · · Score: 0

      What math are YOU using?

    3. Re:Error in their math by Anonymous Coward · · Score: 0

      9 * 6 = 54

  5. not anywhere near 48 cores? by Anonymous Coward · · Score: 0

    Not anywhere near 48 cores? Stick 4 AMD Magny Cours Opterons (12 cores each) in a quad socket motherboard and you will have 48 cores. Not that uncommon.

    1. Re:not anywhere near 48 cores? by NuclearRampage · · Score: 1

      You are not talking about 48 cores on a single CPU. Your comment does not apply.

  6. Not close yet? by BWJones · · Score: 1

    Dunno... I am typing this on a system with 12 cores and 24 virtual cores. And the GPU has somewhere around 1600 cores... Other systems I've worked with have hundreds to thousands of cores so I think we are pretty close...

    Seriously though, these issues have been known for a while but will have to trickle down to desktop OSs to deal with caching and shared memory.

    --
    Visit Jonesblog and say hello.
    1. Re:Not close yet? by blueg3 · · Score: 1

      And the GPU has somewhere around 1600 cores...

      It doesn't. Its resources also aren't managed by your operating system, so how Linux behaves in multicore environments is irrelevant to your GPU's operation.

    2. Re:Not close yet? by Fulcrum+of+Evil · · Score: 1

      and your GPU is managed much differently than the rest of the system. I can get 24 real cores with 48 virtual ones now, but apparently, it's going to be a pain to use efficiently.

      --
      "We returned the General to El Salvador, or maybe Guatemala, it's difficult to tell from 10,000 feet"
    3. Re:Not close yet? by BWJones · · Score: 1

      It (the GPU) does according to the specs.

      As to OS resources, many would argue that how the OS behaves in multi-core environments is indeed relevant to the GPU. Witness Grand Central Dispatch and Open CL.

      --
      Visit Jonesblog and say hello.
    4. Re:Not close yet? by BWJones · · Score: 1

      Efficient use of said cores is the issue, no doubt. See comment above on Grand Central Dispatch and Open CL.

      --
      Visit Jonesblog and say hello.
    5. Re:Not close yet? by hardburn · · Score: 1

      It (the GPU) does according to the specs market-speak.

      GPUs are a very different architecture. They don't have "cores" in the same sense that CPUs do. They're also managed by a vendor driver that may or may not be running at the kernel level. They're designed so that you can run the same operation on many different pieces of data simultaneously, rather than running many independent programs.

      --
      Not a typewriter
    6. Re:Not close yet? by blueg3 · · Score: 1

      It (the GPU) does according to the specs.

      Then you're misreading the specs, or reading poorly-written specs.

      On general-purpose CPUs, a 2-core processor has two computational devices that operate and are scheduled independently. (One core has multiple computational devices within it, but they're not independently-scheduled.)

      Take NVIDIA chips as a typical GPU design. You can get the high-level design information from the CUDA documentation. A single graphics card contains a small number of "multiprocessors", which are computational units that operate and are scheduled independently. At any one point in time, the entire multiprocessor is executing only one instruction. Multiprocessors are SIMD: they execute each instruction on large numbers of data elements at the same time. The sequence of instructions operating on one data element goes by the unfortunate name of "thread" (not to be confused with an OS thread), and CUDA calls a block of these a "warp". So a 4-multiprocessor graphics card with 32-thread warps may perform 128 "operations" in a clock cycle, but only 4 distinct, separately-scheduled instructions. To perform all this work, each multiprocessor will have many logic units of different types.

      The closest analogue to a CPU core for GPUs is the multiprocessor, as they're independently scheduled and multiple multiprocessors will be placed on the same die.

      (Note that nVidia uses the unfortunate term "core" to describe a component of a multiprocessor. Most instructions are executed by a set of 8 "cores" on the multiprocessor. So one instruction executed on a 32-thread warp really takes four steps, being done 8 "instructions" at a time. Sets of 8 thread-instructions aren't scheduled independently, though; only whole warp-instructions. Likewise, the cores on a multiprocessor aren't independently-scheduled and bear little resemblance to CPU cores.)
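
      The arithmetic in the paragraphs above can be spelled out in a few lines of Python. This is only a back-of-the-envelope sketch using the figures quoted in this comment (4 multiprocessors, 32-thread warps, 8 "cores" per multiprocessor); gpu_counts and its dictionary keys are names invented here, not anything from CUDA.

```python
def gpu_counts(multiprocessors, warp_size, cores_per_multiprocessor):
    """Separate what is independently scheduled from what is merely wide."""
    return {
        # Only whole multiprocessors are scheduled independently.
        "independent_instruction_streams": multiprocessors,
        # One warp-instruction per multiprocessor, each covering warp_size threads.
        "thread_operations_per_round": multiprocessors * warp_size,
        # A warp-instruction is fed through the "cores" a slice at a time.
        "steps_per_warp_instruction": warp_size // cores_per_multiprocessor,
        # The headline number you get by counting every "core".
        "advertised_cores": multiprocessors * cores_per_multiprocessor,
    }


if __name__ == "__main__":
    for name, value in gpu_counts(4, 32, 8).items():
        print(name, value)
```

      The number that is comparable to a CPU core count is the first one, not the last.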

      As to OS resources, many would argue that how the OS behaves in multi-core environments is indeed relevant to the GPU. Witness Grand Central Dispatch and Open CL

      When tasks are sent to a GPU via something like OpenCL, the operating system is still not involved in any of the memory or scheduling internals of the GPU.

      The relevant bit is that this is about how Linux handles N-core processors. How many processing units are on a graphics card has nothing to do with that problem.

    7. Re:Not close yet? by blueg3 · · Score: 1

      Note that current high-end nVidia processors have ~16 multiprocessors, each of which has 8 "cores". So even using the poor term of "core" (not like a CPU core), GPUs have in the hundreds, not thousands.

  7. Enough by wooferhound · · Score: 2, Funny

    640 cores ought to be enough for anybody . . .

    --
    We are Dead Stars looking back Up at the Sky
    1. Re:Enough by Ukab+the+Great · · Score: 1

      All you need is a single monolithic chip that's 640 times bigger than a regular core.

    2. Re:Enough by Anonymous Coward · · Score: 1, Funny
    3. Re:Enough by Anonymous Coward · · Score: 0

      lol i dont see why people cant be happy with a quad it gets the job done if you need more than 4 cores you should just shoot yourself

    4. Re:Enough by Abstrackt · · Score: 2, Funny

      lol i dont see why people cant be happy with a quad it gets the job done if you need more than 4 cores you should just shoot yourself

      I'll hand out guns to the scientists then. Maybe they'll be willing to donate their punctuation to you as well.

      --
      They say a little knowledge is a dangerous thing, but it's not one half so bad as a lot of ignorance. - Terry Pratchett
    5. Re:Enough by kevinmenzel · · Score: 1

      Do you do content creation? Especially the kind of content creation where massive parallelism is built in? (ie many tasks involved in recording an album?) No? Fine, don't complain. I reserve the right to want more cores in my poor compy :)

  8. What are they talking about by pclminion · · Score: 4, Insightful

    Can somebody please explain what the fuck they are actually talking about? They've dumbed down the terminology to the point I have no idea what they are saying. Is this some kind of cache-related issue? Inefficient bouncing of processes between cores? What?

    1. Re:What are they talking about by Anonymous Coward · · Score: 1, Funny

      ... from the original MIT article it sounds like there is lock contention on shared memory reference counts. ... but I'm making that up.

    2. Re:What are they talking about by Anonymous Coward · · Score: 1, Informative

      Memory pages are reference counted. Some of the pages are shared and the cores spend a lot of time reference counting. There is a point where the reference counting overhead dominates. It is hypothesized that this could be fixed by enhancing the OS to isolate pages to physical cores thereby removing the need to reference count; this is a fundamental change to the structure of traditional virtual memory management.
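
      To make the contention easier to picture, here is a toy Python sketch of the per-core counting idea from the MIT write-up quoted at the top of the thread (each core keeps a local count that is only combined occasionally). It is not the kernel's code and not the page-isolation idea hypothesized above; ShardedRefCount and its methods are hypothetical names, and Python cannot show the cache-line traffic that makes the single shared counter slow in the first place.

```python
class ShardedRefCount:
    """One counter slot per core instead of one counter that every core updates."""

    def __init__(self, num_cores):
        self.per_core = [0] * num_cores

    def get(self, core):
        # Taking a reference touches only the calling core's slot.
        self.per_core[core] += 1

    def put(self, core):
        # A reference may be dropped on a different core than the one that took
        # it, so an individual slot can go negative; only the sum is meaningful.
        self.per_core[core] -= 1

    def total(self):
        # The expensive part: reading every slot. Done rarely, for example
        # when deciding whether the object can be freed.
        return sum(self.per_core)


if __name__ == "__main__":
    rc = ShardedRefCount(num_cores=4)
    rc.get(0)
    rc.get(1)
    rc.put(3)
    print(rc.per_core, rc.total())
```

      The trade-off is that increments and decrements become cheap and local while an exact read becomes more expensive, which only pays off if reads are rare.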

    3. Re:What are they talking about by jd · · Score: 5, Informative

      What they are talking about really reduces to a variant of Amdahl's Law, but simply put, scaling is always non-linear. There will be overheads per core for communication (which is why SMP over 16 CPUs is such a headache) and overheads per core within the OS for housekeeping (knowing what core a specific thread is running on, whether it is bound to that core, etc, and trying to schedule all threads to make best use of the cores available).

      The more cores you have, the more state information is needed for a thread and the more possible permutations the scheduler must consider in order to be efficient. Which, in turn, means the scheduler is going to be bulkier.

      (Scheduling is a variant of the box-packing problem, which is an NP-Complete problem, but it has the added catch that you only get a very short time to pack the threads in and scheduling policies - such as realtime and core-binding - must also be satisfied in addition to packing all the threads in.)

      The more of this extra data you need, the slower task-switching becomes and the more of the cache you are hogging with stuff not actually tied to whatever the threads are actually doing. At some point, the degradation in performance will exactly equal the increase in performance for the extra cores. The claim is that this happens at 48 cores for modern OSes. This is plausible but it is unclear if it is an actual problem. Those same OSes are used on supercomputers of 64+ cores, by segregating the activities in each node. MOSIX, Kerrighed and other such mechanisms have allowed Linux kernels to migrate tasks from one node to another transparently. (i.e., you don't know or care where the code runs, the I/O doesn't change at all.) The only reason Linux doesn't have clustering as standard is that Linus is waiting for cluster developers to produce a standard mechanism for process migration that also fits within the architectural standards already in use.

      If you clustered a couple of hundred nodes, each with 48 cores, you're looking at having around 2000+ on the system. It wouldn't take a "rewrite" per se, merely a few hooks and a standard protocol. To support a single physical node with more than 48 cores, you might need to split it into virtual nodes with 48 or fewer cores in each, but Linux already has support for virtualization so that's no big deal either.
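
      The break-even point described above can be sketched with a toy model: Amdahl's Law plus a bookkeeping cost that grows with the number of cores. Everything numeric here is an assumption for illustration only. The linear overhead term is a simplification, and the constants (4% serial work, 1/2400 overhead per core) were picked by hand so that the peak lands at 48; they are not measurements from the paper.

```python
def speedup(cores, serial_fraction, per_core_overhead):
    # Run time relative to a single ideal core:
    #   serial part + parallel part split across cores + housekeeping per core.
    time = (
        serial_fraction
        + (1.0 - serial_fraction) / cores
        + per_core_overhead * cores
    )
    return 1.0 / time


def best_core_count(max_cores, serial_fraction, per_core_overhead):
    # Try every core count and keep the one with the highest modeled speedup.
    return max(
        range(1, max_cores + 1),
        key=lambda n: speedup(n, serial_fraction, per_core_overhead),
    )


if __name__ == "__main__":
    s, c = 0.04, 1 / 2400  # made-up constants, see the note above
    for n in (1, 12, 24, 48, 96, 192):
        print(n, round(speedup(n, s, c), 2))
    print("peak at", best_core_count(256, s, c), "cores")
```

      In this model, adding cores past the peak makes things slower rather than just flattening out, which is the difference between plain Amdahl's Law and the overhead-dominated case described here.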

      --
      It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
    4. Re:What are they talking about by Kristopeit,+Mike+Da. · · Score: 1

      they're talking about software not designed to benefit from many cores not benefiting from many cores.

    5. Re:What are they talking about by Darinbob · · Score: 1

      SMP (which multicore essentially is) has always been the simple, cheap, but dumb way of doing parallel processing. The apps don't have to know about the parallel architecture, even how many processors exist, but they'll transparently take advantage of more processors. Easy for the programmer, easy for the user, nice upgrade paths. But the drawback is that you have a lot more bottlenecks than you would in a classical MIMD parallel computer. Everyone shares the same memory bus, even if they're working on separate portions of memory; caching helps but the bottleneck is still there. The operating systems for SMP also tend to have bottlenecks, synchronization primitives necessary for OS calls, centralized task switching, centralized file buffers, etc.

      In short, the scalability isn't so great. But it's the age old problem of an application specific entity versus a general purpose entity. Multicore is great for general purpose computing, but it's not going to scale anything like a system designed for specific purposes. These operating systems mentioned are general purpose operating systems, originally designed for single processors that have incrementally been adapted to multiple processors. Of course there's going to be a problem.

    6. Re:What are they talking about by emt377 · · Score: 1

      Can somebody please explain what the fuck they are actually talking about?

      I was wondering the same. Some sort of "counter" is a rather useless description. Page ref counts? Shared TLB lockout? Cache aliasing? Mutex wait counts? Interrupt thread concurrency? Something else entirely???

    7. Re:What are they talking about by emt377 · · Score: 1

      The more cores you have, the more state information is needed for a thread and the more possible permutations the scheduler must consider in order to be efficient.

      You mean they're thrashing on scheduler state? THIS can be solved by getting rid of the HZ scheduler tick. Either a thread is blocked until time X, is blocked indefinitely, or is running. Either way, there is no need for a regular tick; it's known in advance when the next thread needs to wake up, when the next thread for a given core (execution unit) needs to wake up, and when the next thread on a given execution thread/CPU thread needs to wake up. ALL computers these days have programmable timers and there is absolutely no need to soft time anything. Simply set a timer to the next wake point, with whatever N-nanosecond precision the timer can handle, and keep rolling. Just like Solaris does.
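
      A small Python sketch of the idea above, comparing how many timer interrupts a fixed tick needs against programming a one-shot timer for the next known wake-up. This is a toy model, not how Linux or Solaris implement it; the function names and the example wake-up times are invented here.

```python
import heapq
import math


def periodic_tick_interrupts(wake_times_ms, tick_ms):
    # A fixed-rate tick fires every tick_ms until the last sleeper is due,
    # whether or not anything needs to wake up on a given tick.
    return math.ceil(max(wake_times_ms) / tick_ms)


def one_shot_interrupts(wake_times_ms):
    # Program the timer for the earliest pending wake-up, then the next one.
    pending = list(wake_times_ms)
    heapq.heapify(pending)
    fired = []
    while pending:
        due = heapq.heappop(pending)
        # Threads due at the same instant share a single timer interrupt.
        while pending and pending[0] == due:
            heapq.heappop(pending)
        fired.append(due)
    return fired


if __name__ == "__main__":
    sleepers = [1000, 250, 4000, 1000]  # wake-up times in ms, made up
    print(periodic_tick_interrupts(sleepers, tick_ms=4))
    print(one_shot_interrupts(sleepers))
```

      With these made-up sleepers, a 4 ms tick fires a thousand times over four seconds, while the one-shot approach fires once per distinct wake-up time.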

    8. Re:What are they talking about by pclminion · · Score: 1

      Memory pages are reference counted. Some of the pages are shared and the cores spend a lot of time reference counting. There is a point where the reference counting overhead dominates. It is hypothesized that this could be fixed by enhancing the OS to isolate pages to physical cores thereby removing the need to reference count; this is a fundamental change to the structure of traditional virtual memory management.

      I still don't understand. Why would the reference counts of the pages be changing all the time? For shared library pages, that would imply zillions of extremely short-lived processes being executed and then dying, all the time. And for threads within a single process that all share the same VM, why would any reference count change unless one of the threads modified the VM map by calling mmap() or some other such function?

      I'm getting this image in my head of reference counts constantly being incremented and decremented and I find myself wondering why the hell that should be happening in the first place. I'm no kernel wizard but I've written device drivers and that doesn't make any sense to me.

  9. Simple solution by Haxamanish · · Score: 0

    Just write a little AWK script to replace every occurrence of "48" in the source code by, say, 256 or 1024.

  10. Only Linux? by Ltap · · Score: 3, Interesting

    It looks like TFS was written by a Windows fanboy; why mention Linux specifically when it is a general problem? Why try to half-assedly imply that Windows is more advanced than Linux?

    --
    Yet Another Tech Blog
    (but so much more, including game and movie reviews)
    http://yanteb.peasantoid.org
    1. Re:Only Linux? by jpmorgan · · Score: 0, Redundant

      Probably because Microsoft rewrote the NT kernel for Windows 7, to eliminate the kinds of problems this study discovered:

      http://www.zdnet.com/blog/microsoft/windows-7-to-scale-to-256-processors/1687

    2. Re:Only Linux? by Attila+Dimedici · · Score: 3, Insightful

      Having read eldavojohn's post that summarizes the article, it appears that the reason to pick out Linux specifically is because that is the OS that the writers of the paper actually tested. Since Windows uses a different system for keeping track of what various cores are doing it is likely that Windows will run into this problem at a different number of cores. However, until someone conducts a similar test using Windows we will not know if that number is more or less than 48.

      --
      The truth is that all men having power ought to be mistrusted. James Madison
    3. Re:Only Linux? by KarmaMB84 · · Score: 1

      As a post already pointed out, Microsoft modified the NT kernel to scale to 256 cores with Windows 7 and Windows Server 2008 R2.

    4. Re:Only Linux? by Jorl17 · · Score: 1, Redundant

      No, their rewrite is also subject to this issue. Go publicize Windows somewhere else.

      --
      Have you heard about SoylentNews?
    5. Re:Only Linux? by Anonymous Coward · · Score: 0

      And yet the SGI Altix UV line goes up to a total of 2048 cores per system, and runs Linux.

      The only reasonable explanation I can find for this is that Linux treats multiple CPUs and multiple cores on the same CPU differently. Windows may be just as affected.

    6. Re:Only Linux? by wastedlife · · Score: 2, Informative

      They did not "rewrite the kernel" for 7. They updated the code, just like every other piece of software normally does when it moves from version to version. Rewriting the kernel implies that they tore it down and started over, which is most certainly not true. Vista/2008 is NT version 6.0, 7/2008 R2 is NT version 6.1, not a rewrite.

      --
      Said, "It's just like dice but it's got more sides And it tells me who lives and who dies"
    7. Re:Only Linux? by arkane1234 · · Score: 1

      That's great, since Windows 7 is a desktop OS...

      --
      -- This space for lease, low setup fee, inquire within!
    8. Re:Only Linux? by UnknowingFool · · Score: 1

      No they didn't. MS rewrote Windows 7 to support 256 logical processors. A logical processor doesn't always equate to a core; a single hyper-threading core may count as 2 logical processors. The 2.6 Linux kernel for x64 will support 256 logical processors. The x86 version supports only 32, and I think the big iron mainframe version supports 1024. The problem that is being brought to light in the article is that after 48 cores, there are problems (at least with Linux) with memory optimization, and a rewrite might be needed to avoid this problem. Nothing says that Windows 7 Server does or does not have this problem.

      --
      Well, there's spam egg sausage and spam, that's not got much spam in it.
    9. Re:Only Linux? by aardwolf64 · · Score: 2, Insightful

      No, their rewrite is also subject to this issue. Go publicize Windows somewhere else.

      No, it isn't subject to this issue. They removed the dispatcher lock. Go bash Windows somewhere else.

    10. Re:Only Linux? by tibman · · Score: 2, Informative

      The problem isn't scaling to that number of cores but the overhead in doing so. That's what I took from it.

      --
      http://soylentnews.org/~tibman
    11. Re:Only Linux? by UnknowingFool · · Score: 1

      We are not talking about scaling. Linux has supported 256 logical processors on the x64 branch since 2.6. The researchers found that there might be problems with memory optimization with more than 48 cores. The OP has a legitimate question as to whether Windows might suffer from these problems.

      --
      Well, there's spam egg sausage and spam, that's not got much spam in it.
    12. Re:Only Linux? by Jorl17 · · Score: 1

      Oh, we can go on for days.

      No, [irrelevant or sourceless argument X], go bash [fundamentalist software choice] somewhere else.

      [optional idiot smile]

      --
      Have you heard about SoylentNews?
    13. Re:Only Linux? by Anonymous Coward · · Score: 0

      Wow, a whole 256 processors? Impressive!

      Wake me up when Windows runs single-image on a 1024 CPU machine, like, say, Linux has been running for years now on SGI.

    14. Re:Only Linux? by Anonymous Coward · · Score: 0

      No, they have not. Win 7 has been made slightly less lame and is now able to run on up to 256 cpus, not cores. Previously, because of locking mechanisms and general kernel architecture the performance beyond 32 cpus was nothing short of appalling, so Microsoft decided to limit SMP to 32 cpus -- and might I add that running Windows on anything over a 16 way box in XP or previous was painful, to put it mildly, and 32 way it was borderline masochism.

      Meanwhile, Linux runs single image on 1024 cpu boxes since 2006. And apparently runs well enough to make NASA purchase a SGI Altix with 1024 cpus / 2048 cores running Linux back in 2007.

    15. Re:Only Linux? by MostAwesomeDude · · Score: 0, Offtopic

      I attended a talk by one of the technical MS VPs at Oregon State University, where he talked about the challenges of scaling up to massively multi-core machines. His talk basically covered the various SMP/NUMA optimizations that Linux has had for a while, and how Win7's kernel has been adapted to do the same things as Linux in these situations. Notably, a section of the talk was dedicated to NUMA and how massively SMP systems start to have the same kinds of memory access problems as NUMA systems.

      Very cool guy; got to chat with him after the talk about Wine and various Windows technologies, etc.

      tl;dr Win7 is roughly at the same spot as Linux WRT scheduling and scaling for NUMA/massive SMP systems.

      --
      ~ C.
    16. Re:Only Linux? by UnknowingFool · · Score: 1

      While I can't read the article right now, this problem may affect Windows as well; however, it is more of a pressing problem for Linux. Personally I've seen far more Linux installations that have more than 48 cores than Windows installations. Some big iron implementations have hundreds of processors.

      --
      Well, there's spam egg sausage and spam, that's not got much spam in it.
    17. Re:Only Linux? by falcon_dark · · Score: 1

      They can't test it with Windows because nobody can afford to pay so many licenses to Microsoft... it's just too expensive!!!

    18. Re:Only Linux? by Johnno74 · · Score: 1

      Windows 7 shares the same kernel as server 2008 R2.

    19. Re:Only Linux? by Rainefan · · Score: 1

      Yet, worse. From TFA:

      “The MIT researchers found that the separate cores were spending so much time ratcheting the [memory] counter up and down that they weren’t getting nearly enough work done,” the report states. However, the researchers also found that “slightly rewriting the Linux code so that each core kept a local count, which was only occasionally synchronized with those of the other cores, greatly improved the system’s overall performance.”

      Stop FUD'ing please.
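
      For anyone wondering what "each core kept a local count, which was only occasionally synchronized" might look like, here is a rough single-threaded simulation. It is only a sketch of the idea (I believe the paper calls these sloppy counters), not the actual kernel patch; the class name, the flush threshold of 8, and the round-robin workload are all made up for illustration.

```python
class SloppyCounter:
    """Per-core local counts, folded into a shared total only occasionally."""

    def __init__(self, n_cores, threshold):
        self.local = [0] * n_cores   # one private count per simulated core
        self.shared = 0              # the contended, globally visible count
        self.threshold = threshold   # hypothetical flush trigger
        self.shared_updates = 0      # how often the shared count was touched

    def add(self, core, delta):
        # The common case only touches this core's private count.
        self.local[core] += delta
        if abs(self.local[core]) >= self.threshold:
            self._flush(core)

    def _flush(self, core):
        # The rare case: fold the private count into the shared one.
        self.shared += self.local[core]
        self.local[core] = 0
        self.shared_updates += 1

    def exact(self):
        # An exact reading has to visit every core's private count.
        return self.shared + sum(self.local)


counter = SloppyCounter(n_cores=48, threshold=8)
for step in range(4800):
    counter.add(core=step % 48, delta=1)   # each core takes 100 references
```

      In this toy run the shared count is written far less often than once per reference, which is the whole point: the trade-off is that reading an exact value gets more expensive, since `exact()` has to walk all the per-core counts.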

    20. Re:Only Linux? by Anonymous Coward · · Score: 0

      http://channel9.msdn.com/shows/Going+Deep/Mark-Russinovich-Inside-Windows-7/

  11. based on a 1970s OS and language by peter303 · · Score: 1, Troll

    UNIX and C were great in their days. But perhaps not in the meg-core era.

    1. Re:based on a 1970s OS and language by Anonymous Coward · · Score: 2, Insightful

      UNIX and C were great in their days. But perhaps not in the meg-core era.

      So, what is better in your opinion? Java? Or maybe even ruby? Oh yes, that would be great. Run-time OS reflection through kernel drivers implemented as ruby modules.

      Too bad CPU's don't come with built-in ruby interpreters.

    2. Re:based on a 1970s OS and language by Anonymous Coward · · Score: 0

      Hah! that's a good one, thanks for making my day! Hey can you tell me where you've bought the low id?

    3. Re:based on a 1970s OS and language by geekoid · · Score: 4, Insightful

      Hahaha. Oh arrogance from ignorance, how I loathe you.

      --
      The Kruger Dunning explains most post on /. http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect
    4. Re:based on a 1970s OS and language by arkane1234 · · Score: 1

      UNIX is more a concept, and not an OS.

      As far as language, that's not even the question. A language can change nearly overnight to add mechanism for threading.

      --
      -- This space for lease, low setup fee, inquire within!
    5. Re:based on a 1970s OS and language by LWATCDR · · Score: 1

      While you will get a load of crap for your comment, it has some value.
      The good news, and where you are off, is that the UNIX of today ("Linux") isn't the same as the Unix of the 70s, 80s, or 90s.
      Unix is kind of like a B-52. Bits and pieces have been updated over time.

      The real issue has to do with stagnation. Most OSes today are based on Unix. Linux, BSD, and OS X are all based on Unix. Windows is probably, at least in spirit, based on VMS, but I don't see a lot of it in Windows.
      Real interest in new OSes seems to be limited to what we can stick into our current OSes. The reason is that none of us want to give up our software base.
      As far as C goes, it is fine in the hands of a skilled programmer. There is no really demanding technical reason to move away from C. C, like Unix, has also adapted and now has such nice features as STL and has been extended into Objective C and C++.

      --
      See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
    6. Re:based on a 1970s OS and language by tibman · · Score: 1

      eeek. Trying to build an entire OS in a managed language? you're crazy. Not even MS can do it: http://en.wikipedia.org/wiki/Singularity_(operating_system)

      C will probably never go away.. it's as likely as assembly going away.

      --
      http://soylentnews.org/~tibman
    7. Re:based on a 1970s OS and language by gatzby3jr · · Score: 1

      That's almost as intelligent as this post

    8. Re:based on a 1970s OS and language by Jorl17 · · Score: 1

      They didn't limit this issue to Linux -- or Unix, or Windows, for that matter. Stop flamebaiting. -Perhaps- is a very *peculiar* word for that.

      --
      Have you heard about SoylentNews?
    9. Re:based on a 1970s OS and language by rufty_tufty · · Score: 1

      You may be able to bodge on threads overnight, but a language that supports multiprocessing from the ground up (occam springs to mind) is not an overnight addition.
      Handel-C perhaps might be a good example of multi-processing built into the language that is beyond current popular programming models, but I don't see the ecosystem to go with it and that will take time...

      --
      "The weirdest thing about a mind, is that every answer that you find, is the basis of a brand new cliche" -
    10. Re:based on a 1970s OS and language by Anonymous Coward · · Score: 0

      Yeah, because a moron's language like Java will be more efficient than C as the number of cores approaches infinity.

    11. Re:based on a 1970s OS and language by Anonymous Coward · · Score: 0

      UNIX and C were great in their days. But perhaps not in the meg-core era.

      So, what is better in your opinion? Java? Or maybe even ruby? Oh yes, that would be great. Run-time OS reflection through kernel drivers implemented as ruby modules.

      Too bad CPU's don't come with built-in ruby interpreters.

      No, he probably meant HTML :D

    12. Re:based on a 1970s OS and language by hardburn · · Score: 1

      As far as language, that's not even the question. A language can change nearly overnight to add mechanism for threading.

      There's what's possible, and then there's what's easy. Programs written in a functional style (to some level of purity between Lisp and Haskell) are often much easier to multithread than C or most other curly-brace languages, simply because they have little to no state information.
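
      A tiny illustration of the "little to no state" point, as a sketch only: `checksum` is a made-up pure function, and because it touches nothing shared, handing it to a thread pool needs no locks and gives the same answer as the serial loop. (CPython's GIL means this particular example won't actually run faster; it only shows that correctness comes for free when there is no shared state.)

```python
from concurrent.futures import ThreadPoolExecutor


def checksum(block):
    # Pure: the result depends only on the argument, no shared state is touched.
    return sum(block) % 251


# 64 independent blocks of work (arbitrary illustrative data).
blocks = [list(range(i, i + 1000)) for i in range(64)]

# Serial baseline.
serial = [checksum(b) for b in blocks]

# Same work spread over a pool; map() returns results in input order.
with ThreadPoolExecutor(max_workers=8) as pool:
    parallel = list(pool.map(checksum, blocks))
```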

      --
      Not a typewriter
    13. Re:based on a 1970s OS and language by EvanED · · Score: 2, Interesting

      A language can change nearly overnight to add mechanism for threading.

      Is that why the C and C++ people have spent so long at trying to come up with a memory model that will actually work correctly under concurrent execution? Is that why Java got it wrong the first time?

    14. Re:based on a 1970s OS and language by Anonymous Coward · · Score: 0

      You are seriously confused. C DOES NOT have an STL.

  12. 64 cores by hansamurai · · Score: 2, Interesting

    At my last job we had a bunch of Sun T5120s which housed 64 cores. So yeah, we are "anywhere near 48".

    1. Re:64 cores by Anonymous Coward · · Score: 0

      No, it had 64 Virtual CPU's. The T5120's have 8 cores and 8 threads per core so Solaris reports 64 vCPUs. Only one thread can run at a time on a core (the rest are parked waiting for a page fault to be serviced...maybe other reasons too).

    2. Re:64 cores by Splab · · Score: 1

      They aren't talking about physical CPUs, they are talking about cores within any given CPU, which I doubt you were having 64 of. While this may seem like a moot point, when scheduling it is very important to keep track of what's going where since you are usually at least sharing L3 cache on a single die.

    3. Re:64 cores by dirtyhippie · · Score: 1

      That's 64 threads, not cores. T5120s have only 8 physical cores, but allow for up to 8 threads per core, hence why you see 64.

      There is a closer counter-example to the article's claims, though: SiCortex's 72+ core machines:
          http://sicortex.com/products

      Drool-worthy...

    4. Re:64 cores by Anonymous Coward · · Score: 0

      Really, where did MIT come up with a 48-core CPU?

    5. Re:64 cores by Big+Jason · · Score: 1

      You mean 64 threads, 8 cores total.

    6. Re:64 cores by Score+Whore · · Score: 1

      No. The Niagara CPUs do a hardware-level context switch on every clock cycle or on a cache stall or halted thread. That 8 core, 8 thread CPU is effectively 64 slow CPUs sharing a single L2 cache and having an 8:1 ratio of threads:L1 cache. It's not the same as 64 true cores, but it's very different than multiple sockets. For someone not familiar with the line and not interested in going out and reading up on the processors, you could think of it as a form of hyperthreading. In fact the T2 series does exactly that -- in each clock cycle two threads can execute as long as one of them can fit within a particular set of hardware needs, no FPU and some other things.

      The spiffy thing about the Niagara line is that there are many more opportunities to keep the CPU running, contrasted with a non-hyperthreaded x86 system which spends 30% or more of its time stalled waiting for a cache line.

    7. Re:64 cores by Anonymous Coward · · Score: 0

      No, your T5120s had 8 cores with 8 threads per core, which was represented in the OS as 64 CPUs.

      The new T3-4 servers that were just announced can have four 16-core processors, with each core able to run 8 threads, which will look like 512 CPUs in the OS.
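
      Since half this thread is people mixing up cores, threads, and the CPU count the OS reports, here is the arithmetic spelled out. The helper name is made up, and the single socket for the T5120 is my assumption; the core and thread counts are the ones quoted above.

```python
def logical_cpus(sockets, cores_per_socket, threads_per_core):
    # What the OS reports as "CPUs" is the product of all three.
    return sockets * cores_per_socket * threads_per_core


# T5120 as described above: 8 cores, 8 threads per core (one socket assumed).
t5120 = logical_cpus(sockets=1, cores_per_socket=8, threads_per_core=8)

# T3-4 as described above: four 16-core processors, 8 threads per core.
t3_4 = logical_cpus(sockets=4, cores_per_socket=16, threads_per_core=8)
```

      So the "64" on a T5120 and the "512" on a T3-4 are thread counts, while the physical core counts are 8 and 64 respectively.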

  13. Question is, what to do... by Anonymous Coward · · Score: 0

    ...with the other 46 cores we are not using. Most people still do one thing at a time and only need a couple of cores at best. Three if you throw in Windows anti-malware software. :)

    1. Re:Question is, what to do... by Anonymous Coward · · Score: 0

      Well let's see... The user's Facebook Firefox tab takes up one core. There's probably a keylogger running on another core. And various other forms of malware are probably slamming the other 40+ cores.

    2. Re:Question is, what to do... by MachineShedFred · · Score: 1

      Or, if you have an application that isn't written for a system from the 1990's, it will spawn multiple threads and use all the cores to do that one task much faster. You know, things like video compression which is nothing but massive amounts of math - nobody does that on a consumer level at all.

      --
      Slashdot still doesnâ(TM)t support Unicode after it was added to the HTML standard in 1997.
  14. Jaguar? by MrFurious5150 · · Score: 2, Insightful

    Cray seems to have addressed this problem, yes?

    1. Re:Jaguar? by Anonymous Coward · · Score: 0

      No, because the Jaguar isn't a "monolithic" computer like a personal PC. It's a series of nodes - "Each XT5 compute node contains dual hex-core AMD Opteron 2435 (Istanbul) processors".

    2. Re:Jaguar? by Anonymous Coward · · Score: 0

      Did you even read the wikipedia article you posted? Jaguar contains 26,520 nodes. Each of those only have up to twelve cores apiece.

    3. Re:Jaguar? by NuclearRampage · · Score: 1

      There aren't 64 cores on any single Opteron chip, so no, they haven't.

  15. 48 cores? by drunkennewfiemidget · · Score: 4, Funny

    I'm still waiting for Windows to work well on ONE.

    1. Re:48 cores? by Beer+Drunk · · Score: 2, Funny

      Actually I'm rather pleased with Windows 7. It's a great improvement over their last few attempts and other than a few spurious reboots right in the middle of several hours work and often requiring me to force the *&#%$ drives to re-mirror it's not toooo bad. OK, I just use it on this box because there are a couple of programs I like not available for native Linux yet but at least it's not Vista or ME bad.

    2. Re:48 cores? by Abstrackt · · Score: 2, Interesting

      OK, I just use it on this box because there are a couple of programs I like not available for native Linux yet but at least it's not Vista or ME bad.

      One trick in business and politics is to offer a bad choice next to a worse one so it doesn't seem as bad by comparison. Every time I see or hear that comment the conspiracy theorist in me wonders whether ME and Vista were deliberately bad to soften the shock of adjustment to XP and 7.

      --
      They say a little knowledge is a dangerous thing, but it's not one half so bad as a lot of ignorance. - Terry Pratchett
    3. Re:48 cores? by Anonymous Coward · · Score: 0

      I could believe that with ME since it was just 98SE (purposely?) messed up and was replaced quickly, but Vista was inflicted on us far too long to just be a marketing ploy because a lot of folks who might not have tried Linux otherwise found out operating systems didn't HAVE to be buggy. The fruit vendors picked up a lot of customers as well.

    4. Re:48 cores? by noidentity · · Score: 1

      I swear I once heard Linus Torvalds mutter under his breath, "48 cores should be enough for anyone." Probably just my imagination though.

    5. Re:48 cores? by jimicus · · Score: 1

      Way OT here, but I think (at least in the UI department, haven't really looked under the hood) Windows 7 is quite an improvement over XP for a number of reasons - and yes I know a lot of these debuted in Vista:

      - GUI is resolution-independent. No more squinting to see the screen on a high-res monitor. The only amazing thing about this was it took so long for it to happen.
      - Taskbar deals with many applications open much more efficiently. The only amazing thing about this was it took so long for it to happen.
      - As does the system tray. The only amazing thing about th...... you get the idea.
      - Much more effort made to ensure things JFW, and to fix it when things get broken.
      - "Yes I know you've got updates. Bug me later over them" - you can tell Windows to bug you again in several hours rather than having to tell it every 10 minutes.

      Of course, there are still some issues:

      - Things still don't always JFW - and when they don't, getting into the relevant configuration to sort them by hand is often more convoluted.
      - It's called the Event Log. It's there for a reason. Seriously, how in God's name does anyone ever fix anything in Windows when so few things - even internal Windows components - bother to write to the damn thing? Don't bother answering that one, I already know - it's a combination of trial and error and learning all the little glitches through experience.
      - Speaking of the Event Log, why does Windows still not have a mechanism to access it which lends itself to browsing? Double-clicking on every entry in the hope of turning up something interesting gets really old really fast.
      - Microsoft's own developers appear to have given up on expecting people to actually read error messages. As a rule, they've become even more meaningless. Think along the lines of "Something went wrong".

  16. too bad, by Major+Downtime · · Score: 1

    i was hoping to see Crysis 2 running on Linux

  17. seeing as Linux does 10240 cores already, WTF? by r00t · · Score: 4, Interesting

    No kidding. SGI's Altix is a huge box full of multi-core IA-64 processors. 512 to 2048 cores is more normal, but they were reaching 10240 last I checked. This is SMP (NUMA of course), not a cluster. I won't say things work just lovely at that level, but it does run.

    48 cores is nothing.

    1. Re:seeing as Linux does 10240 cores already, WTF? by Anonymous Coward · · Score: 0

      They mean cores per socket.

    2. Re:seeing as Linux does 10240 cores already, WTF? by varmittang · · Score: 1

      But that is over multiple processors in the whole machine. They are talking about a single processor having 48 cores, not the whole machine over multiple processors.

      --
      -----BEGIN PGP SIGNATURE-----
      12345
      -----END PGP SIGNATURE-----
    3. Re:seeing as Linux does 10240 cores already, WTF? by Anonymous Coward · · Score: 0

      They mean 48 cores per CPU. Try reading the article.

    4. Re:seeing as Linux does 10240 cores already, WTF? by Unequivocal · · Score: 3, Informative

      I think specifically they are talking about having 48 cores behind an L2 cache. Or 48 cores on a single die. Multi-CPU boxes generally communicate between CPU dies via the bus and from what little I can gather, that helps reduce or eliminate the issue they're describing..

    5. Re:seeing as Linux does 10240 cores already, WTF? by NuclearRampage · · Score: 1

      Those multi-core IA-64's don't have 48+ cores on a single chip do they? Your comment does not apply.

    6. Re:seeing as Linux does 10240 cores already, WTF? by Anonymous Coward · · Score: 1, Insightful

      Stop posting the same wrong shit as everyone else.

      Before you post:
      1) Read the fucking article.
      2) Read the fucking comments.

      Are there any CPUs with 10240 cores? No.

    7. Re:seeing as Linux does 10240 cores already, WTF? by wrongrook · · Score: 1

      I regularly use a Tilera chip which has 64 cores on a single die, each running Linux, with a common (distributed) L2 cache. For most tasks there are not any particular scalability issues, but that is because my tasks are doing a lot of user space code. I think the paper is concentrating on tasks which are dominated by kernel code, and in these cases they seem to have made a useful contribution.

    8. Re:seeing as Linux does 10240 cores already, WTF? by afidel · · Score: 1

      Nobody is going to put 48 cores behind a L2 cache, 4 cores is probably the max for a shared L2, beyond that you go to a L3. Heck the newer Intel designs split L3 cache into two sets to keep the number of cores sharing a cache reasonable.

      --
      There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
    9. Re:seeing as Linux does 10240 cores already, WTF? by Jeremy+Erwin · · Score: 1

      Hurray for wait states!

    10. Re:seeing as Linux does 10240 cores already, WTF? by guyminuslife · · Score: 1

      I saw what you did there!

      --
      I don't believe in time. It's a grand conspiracy designed to sell watches.
    11. Re:seeing as Linux does 10240 cores already, WTF? by Unequivocal · · Score: 1

      Thanks for the clarification - very helpful. (And also thanks for being thoughtful in your response - your civility is welcome & appreciated!)

    12. Re:seeing as Linux does 10240 cores already, WTF? by Unequivocal · · Score: 1

      Yeah really -- some bonus for messaging on the bus right? It's slower so it doesn't break! I suspect the problems they foresee on multi-core don't slow down as much as inter-bus comms anyway, so there may actually be less information in the article than I thought at first.

      Thanks for the tip!

    13. Re:seeing as Linux does 10240 cores already, WTF? by aegl · · Score: 1

      The 10240 core system is a 20 node cluster with 512 cores on each node (Columbia cluster at NASA).

      These big SGI systems prove little about the general scalability of Linux as they are all running HPC (high performance computing) workloads - read in some data, crunch floating point numbers for several hours then printf("42\n") at the end of the run. I.e. not stuff that will stress the scalability of the operating system.

      Take one of those systems and have it run a workload where all the cpus are trying to create/write/read/close on a bunch of small files, and you'll soon see how poorly they scale for general purpose workloads.
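
      To make that concrete, a small-file churn test could look roughly like the sketch below. Everything here is illustrative: the function names, the 64-byte payload, and the file counts are made up, and it uses Python threads for brevity where a serious test would use one process pinned to each core. All workers deliberately share one directory, since that is where I would expect the kernel-side contention to show up.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def churn(directory, worker_id, n_files):
    """Create, write, read back, close and delete n_files small files."""
    done = 0
    for i in range(n_files):
        path = os.path.join(directory, f"w{worker_id}_{i}.tmp")
        with open(path, "wb") as f:
            f.write(b"x" * 64)
        with open(path, "rb") as f:
            assert f.read() == b"x" * 64
        os.remove(path)
        done += 1
    return done


def run(n_workers, n_files=200):
    # All workers hammer the same directory on purpose.
    with tempfile.TemporaryDirectory() as d:
        start = time.perf_counter()
        with ThreadPoolExecutor(max_workers=n_workers) as pool:
            counts = list(pool.map(lambda w: churn(d, w, n_files),
                                   range(n_workers)))
        elapsed = time.perf_counter() - start
    return sum(counts), elapsed


total, elapsed = run(n_workers=4)
```

      Run it with increasing `n_workers` and compare files per second; if throughput stops rising well before you run out of cores, the time is going into the OS rather than your code.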

  18. Windows 7 scales to 256 cores by figleaf · · Score: 1
    1. Re:Windows 7 scales to 256 cores by Anonymous Coward · · Score: 1, Interesting

      It would be interesting to observe the transition over 64 cores in terms of scalability.

    2. Re:Windows 7 scales to 256 cores by h4rr4r · · Score: 3, Insightful

      Linux supposedly scales to 1024 or something like that. This is not what they supposedly scale to, but the performance impact of actually trying to use that many cores.

    3. Re:Windows 7 scales to 256 cores by djdanlib · · Score: 1

      Right, but what is the performance overhead of having that many cores on Win7, though?

      "It scales" doesn't necessarily imply a linear performance increase :)

    4. Re:Windows 7 scales to 256 cores by cpscotti · · Score: 1

      Oh yes! From the (real) summary, anyone can get that Linux "scales" beyond 48 cores but starts losing on performance due to the "counters" overhead. Working is different than "works great" or "pays off". Theoretically (as "works" go), Linux can work with way more than 48.

      Now windows...

    5. Re:Windows 7 scales to 256 cores by Troy+Baer · · Score: 1

      The current limit is 2048 cores in a single system image, IIRC. I currently run an SGI UV 1000 with 1024 cores (128 8-core Nehalem EXs), and that's not the biggest possible configuration of that hardware.

      --
      "My life's work has been to prompt others... and be forgotten." --Cyrano de Bergerac
    6. Re:Windows 7 scales to 256 cores by Anonymous Coward · · Score: 0

      SGI was running a 4096 CPU system little while ago, and they had to submit some patches (which got in) to make sure that the kernel didn't consume all available resources doing per-cpu accounting.

    7. Re:Windows 7 scales to 256 cores by TheNetAvenger · · Score: 3, Insightful

      The point isn't that NT Scales to 256 cores, the point is how efficient it is when scaling to this many processors. The NT Kernel in Win7 was adjusted so that systems with 64 or 256 CPUs have a very low overhead handling the extra processors.

      Linux in theory (just like NT in theory) can support several thousand processors, but there is a level that this becomes inefficient as the overhead of managing the additional processors saturates a single system. (Hence other multi-SMP models are often used instead of a single 'system')

      Just simply Google/Bing: windows7 256 Mark Russinovich

      You can find nice articles and even videos of Mark talking about this in everyday terms to make it easy to understand.

    8. Re:Windows 7 scales to 256 cores by walshy007 · · Score: 2, Informative

      The point is the article dealing with a simulated theoretical cpu with 48+ cores on a single die with shared l2 cache.

      The changes made are incremental and I imagine will be dealt with long before this actually becomes an issue when (or if) we get cpus with that many cores on a single die.

      multi socket systems are already immune to this the way it is setup, you could have an 8 socket system with each cpu having 8 cores and it would not show the problems shown in the article.

      In other words, business as usual, the kernel gets optimized for hardware that actually exists or will exist in the near future. 48 core single CPUs are a few years away, and the changes to accommodate them don't require anything significant so I'm sure it will be dealt with at the time.

  19. Priorities by jeff4747 · · Score: 0, Troll

    Perhaps they should worry about getting Flash to work without stuttering before they worry 'bout 48 cores.

    ...unless the plan is to use 48 cores to make Flash work.

    1. Re:Priorities by silas_moeckel · · Score: 1

      Somebody cares about Flash video working? The only people I can think of are Adobe. Flash is their closed, DRM-ridden POS, so it's in their interest to make it work. If they start using open video streaming protocols they would not have a problem.

      --
      No sir I dont like it.
    2. Re:Priorities by jedidiah · · Score: 1

      You get Flash to stop stuttering by having a really fast single core.

      This is true for Windows 7 as much as it is for Linux.

      It tries to multi-thread and it will happily eat up every CPU on your box.

      It will still stutter though.

      Flash is just crap. An example of that rule about any task being limited to its serial components...
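
      That rule is Amdahl's law: speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of cores. A quick sketch, with the 90% parallel fraction picked purely as an example (nobody here has measured Flash):

```python
def amdahl_speedup(parallel_fraction, n_cores):
    # The serial part never gets faster, however many cores you add.
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cores)


one = amdahl_speedup(0.9, 1)     # a single core: no speedup at all
many = amdahl_speedup(0.9, 48)   # 48 cores: roughly 8.4x, nowhere near 48x
limit = 1.0 / (1.0 - 0.9)        # ceiling as the core count goes to infinity
```

      With 10% of the work stuck being serial, 48 cores buy you a bit over 8x and no number of cores gets you past 10x.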

      --
      A Pirate and a Puritan look the same on a balance sheet.
    3. Re:Priorities by cynyr · · Score: 1

      tell Adobe.... let them do it, or provide docs and protection to do it...

      --
      All of the above was encrypted with a Quad ROT-13 method. Unauthorized decryption is in violation of the DMCA.
    4. Re:Priorities by jeff4747 · · Score: 1

      Well, everyone who bitches about the iPhone uses it as an example of why the iPhone is terrible. So it must be critical.

    5. Re:Priorities by h4rr4r · · Score: 1

      That problem has been solved for a while. I use a linux HTPC and see no stuttering watching hulu.

      Go find a new troll.

    6. Re:Priorities by Anonymous Coward · · Score: 0

      Who's "they"? Do you really think that it's the same people working on Linux that write Flash?

    7. Re:Priorities by armanox · · Score: 1

      As stated by others, this is not a kernel issue. This is a vendor issue.

      --
      I'm starting to think GNU is the problem with "GNU/Linux" these days.
  20. 48 isn't far off at all. by Anonymous Coward · · Score: 0

    My new HP DL-385 G7 has a 12 core AMD processor. A four fold increase is not far off.

  21. Sun E10Ks were at 72 cores over a decade ago by Anonymous Coward · · Score: 0

    And guess what? With near linear scaling.

    These have 512.

    These have 256.

    Appears to be a Linux problem.

    1. Re:Sun E10Ks were at 72 cores over a decade ago by jedidiah · · Score: 2, Informative

      An E10K is a glorified network computing cluster.

      It's not what's being discussed at all.

      --
      A Pirate and a Puritan look the same on a balance sheet.
  22. Windows is good by Anonymous Coward · · Score: 0

    http://channel9.msdn.com/shows/Going+Deep/Mark-Russinovich-Inside-Windows-7/

    Windows 7 can scale to 256 processors.

  23. Who uses that by MSDos-486 · · Score: 2, Funny

    http://xkcd.com/619/

  24. Re:Linux needs a rewrite anyway. by arkane1234 · · Score: 1

    What part?
    Be specific, since I've been using it since '95 and it's only gotten better.
    Granted, it's a little more bloated than back then, but hey you get that when you have sixteen billion subsystems.

    --
    -- This space for lease, low setup fee, inquire within!
  25. Obligatory xkcd reference by zill · · Score: 2, Interesting

    Do they have support for smooth full-screen flash video yet?

    My Ubuntu 10.04 system still can't play embedded youtube videos. At least Adobe provided a work-around by adding a "play on youtube" option in the right click context menu.

    1. Re:Obligatory xkcd reference by diegocg · · Score: 2, Informative

      Yes, you can play smooth full-screen video in Linux with the "Square" preview release (which includes 64 bit support). Full-screen 720p video only uses 30-40% of the CPU on my crappy Intel graphics chip, and it's completely smooth.

    2. Re:Obligatory xkcd reference by Anonymous Coward · · Score: 0

      I got Flash running 720p fullscreen with no lag on Ubuntu Studio (realtime kernel) and Swiftfox (that made a huge difference)

      AMD 5200+

    3. Re:Obligatory xkcd reference by Anonymous Coward · · Score: 0

      1) That button was more likely put there by YouTube than by Adobe
      2) Is your system 32 or 64 bits? If it's 64 bits, try installing the 64 bit version (from https://launchpad.net/~sevenmachines/+archive/flash).
      Firefox on 64 bit systems is compiled native, unlike in Windows (and OSX?) where firefox is a 32 bit binary

    4. Re:Obligatory xkcd reference by Anonymous Coward · · Score: 2, Insightful

      If your Ubuntu 10.04 system can't play embedded YouTube videos then you should get off your ass and fix it instead of wasting your time pasting xkcd links. Ubuntu has played Flash videos out of the box without a single hitch for years.

    5. Re:Obligatory xkcd reference by MostAwesomeDude · · Score: 1

      As we keep saying, we cannot do anything about the fact that Adobe's Flash Player does not accelerate many operations, and usually ends up going slower when acceleration is enabled and used.

      My recommendation is *still* to use youtube-dl or get-flash-videos, save the FLV video locally, and then watch it with a Real Movie Player, like mplayer, VLC, etc.

      --
      ~ C.
    6. Re:Obligatory xkcd reference by Anonymous Coward · · Score: 0

      I don't get this, my 8-year-old athlon xp 2600+ plays youtube videos in full screen fine. You must be doing something wrong?

    7. Re:Obligatory xkcd reference by RyuuzakiTetsuya · · Score: 2, Funny

      If you have 4,096 CPUs I don't think that "smooth flash playback" is a problem.

      4,095 CPUs however...

      --
      Non impediti ratione cogitationus.
    8. Re:Obligatory xkcd reference by Anonymous Coward · · Score: 0

      It's probably just anti-Linux trolling.

    9. Re:Obligatory xkcd reference by Trelane · · Score: 1

      My recommendation is to opt in to the HTML5 beta (uses WebM; http://www.youtube.com/html5 ; it seems to occasionally forget that I'm in the beta program) and then forget the rest. :)

      --
      Given enough personal experience, all stereotypes are shallow.
    10. Re:Obligatory xkcd reference by geschild · · Score: 1

      'At least Adobe provided a work-around by adding a "play on youtube" option in the right click context menu.'

      *Head assplodes.*

      These benevolent people at Adobe at least provide a bloody workaround?!

      Why don't they fix the Linux Flash player altogether? They're the ones making it this crappy on Linux in the first place, aren't they?

      --
      Karma? What's that again?
    11. Re:Obligatory xkcd reference by Anonymous Coward · · Score: 0

      That's Adobe's fault, not the operating system's, regardless of which one it is.

      Oh, and consider yourself lucky. Flash on OS X is even slower and more CPU intensive. There's no way to watch any video on a MacBook without the fans exploding.

    12. Re:Obligatory xkcd reference by RocketRabbit · · Score: 1

      Even on my Mac Pro, the fans go into afterburner mode with flash video playing. I should mount it to the back of a semi-truck and go racing.

  26. Why? by geekoid · · Score: 1

    Why aren't we even close to 48 cores? There have been chips with 16 or more cores. Why are we still mulling around 6 cores?

    I suspect it's a fab issue, in that so many chips on a die with such small lines means a lot more flawed chips and bad wafers.

    Of course, who will rewrite Linux? We could never recapture its unique origins.

    --
    The Kruger Dunning explains most post on /. http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect
  27. Re:Linux needs a rewrite anyway. by Anonymous Coward · · Score: 0

    The part that requires the user to grow a neckbeard and masturbate to lolicon.

  28. We passed that point by Anonymous Coward · · Score: 0

    I seem to recall seeing operating systems running on more than 48 cores. In fact, doesn't Linux power some of the giant super computers with 64+ cores?

  29. OpenIndiana?? by andersenep · · Score: 0, Troll

    Why write a new Linux when Solaris already does such a fine job scaling to large numbers of cores/threads? OpenIndiana is just getting off the ground, but it's open source, free, and works now.

    1. Re:OpenIndiana?? by h4rr4r · · Score: 2, Informative

      OpenSolaris is dead. Solaris sucks to use without GNU userland anyway, and being sued by Oracle is no fun. Besides, you troll, this would not need a new Linux, just some small changes to the current one.

  30. 48 Cores in 1U by kybur · · Score: 2, Informative

    I'm not affiliated with Supermicro in any way, but they have four 1U serverboards designed for the 12 core opterons, so that's 48 cores in a 1U server. I'm guessing that Supermicro is not the only vendor of quad opteron boards supporting the latest chips. There are most likely quite a few of these in use by real people. Anyone want to speak up?

    I know from personal experience that the socket F opterons performed very poorly in an 8 way configuration compared to the previous generation (socket 940 gen). I ran multiple tests on dual core chips (885s, I think), back in 2006 or 7 where I'd get nearly double the performance in going from a quad configuration to an 8 way configuration, but with the socket F breed of chips, there was no performance boost at all, it was like the clock speed was being cut in half and all the threads took twice as long to complete. I saw this behavior again and again, and the motherboard manufacturer that I was testing the chips with told me that it was an issue with the chips themselves. I think this is the reason why 8-way opteron systems are very rare now.

  31. that's crazy by Punto · · Score: 2, Funny

    Nobody's ever going to need more than 640 cores

    --
    Stay tuned for some shock and awe coming right up after this messages!

  32. how is this news? by dirtyhippie · · Score: 3, Insightful

    We've known about this problem for ... well, as long as we've had more than one core - actually as long as we've had SMP... You increase the number of cores/CPUs, you decrease available memory thruput per core, which was already the bottleneck anyway. Am I missing something here?

    1. Re:how is this news? by bingoUV · · Score: 1

      Did you know that 48 is the magic number where increase in number of cores is more than offset by the cache sharing effort? If so, you could have published kind of a research paper like TFA.

      --
      Bingo Dictionary - Pragmatist, n. A myopic idealist.
  33. Infinite by Chameleon+Man · · Score: 1

    It's amazing how we live in a world involving an infinite, non-discrete numeric system yet the computers we construct are always bound by some finite, discrete limitation.

    1. Re:Infinite by deapbluesea · · Score: 1

      Soooooo, you're not a fan of Turing then?

      Seriously, I can't tell if you're trolling or not, so I'll just elaborate anyway. You've just stated the essence of computability theory. Without infinite memory and infinite states, it's not possible to compute infinite sets of subsets in the general case. Put another way, there are a countably infinite number of possible computer/program configurations and an uncountably infinite number of problems to be solved, so there will always be more problems to solve than computers to solve them.

      --
      Government is not reason; it is not eloquent; it is force. Like fire, it is a dangerous servant and a fearful master.
    2. Re:Infinite by shutdown+-p+now · · Score: 1

      It's amazing how we live in a world involving an infinite, non-discrete numeric system

      We don't know that for sure. There is a distinct theoretical possibility that everything in our universe is quantized on some level, and therefore that infinity is a purely abstract concept.

  34. Yeah, on Windows, 47 for viruses by cjonslashdot · · Score: 0, Troll

    If a Windows machine had 48 cores, 47 of them would be running viruses, spyware, and anti-virus/anti-spyware software and one would be running the user's applications.

  35. Large linux systems today have 3072 processors by kroyd · · Score: 1
    (or more, probably)

    http://lkml.org/lkml/2010/7/22/252 is a fun post on the Linux-Kernel list about missing caching of ACPI tables leading to 20 minute boot times. I get that problem every day! (I wish :P)

    It is a pretty safe bet that you don't have to worry about Linux and more than 48 cores, as it is the OS of choice for a lot of the top supercomputers and OS research in general. Of course, applications which can take advantage of such systems are another problem, but that is hardly a Linux problem.

  36. Hundreds of cores on today's Linux by Florian+Weimer · · Score: 1

    One SGI Altix version comes with 2,048 cores running a single image.

    The benchmarks in the paper are a bit suspicious because they avoid disk I/O. tmpfs is used instead, which may skew results significantly. Surprisingly, they do not describe the architecture of the test machine, but perhaps I've missed that. They suggest that a workload which does not spend much time in the kernel cannot have scaling issues caused by the kernel, which seems rather dubious to me.

    1. Re:Hundreds of cores on today's Linux by NuclearRampage · · Score: 1

      As I'll explain again to people that can't grasp the concept of 48 cores on a single CPU (chip), the system you linked uses 8-core CPUs. According to the paper being discussed, there is no issue with the total number of cores, just the number of cores on a single chip.

    2. Re:Hundreds of cores on today's Linux by Florian+Weimer · · Score: 1

      You're wrong. I found the hardware description in the paper (finally):

      We run experiments on a 48-core machine, with a Tyan Thunder S4985 board and an M4985 quad CPU daughter-board. The machine has a total of eight 2.4 GHz 6-core AMD Opteron 8431 chips. Each core has private 64 Kbyte instruction and data caches, and a 512 Kbyte private L2 cache. The cores on each chip share a 6 Mbyte L3 cache, 1 Mbyte of which is used for the HT Assist probe filter [7]. Each chip has 8 Gbyte of local off-chip DRAM

      This is a NUMA machine, so their testing methodology involving tmpfs is totally bogus because it artificially increases inter-node memory traffic. Furthermore, it is unlikely that the results apply to the non-NUMA, single-chip 48-core architecture you have in mind.

  37. 48 cores a while off? by HolyCoitus · · Score: 1
    --
    That's scary.
    1. Re:48 cores a while off? by Surt · · Score: 1

      As is getting pointed out in many other threads, that's 4 cpus. Linux is having a problem running on more than 48 cores PER CPU. It has already been scaled to thousands of cores across large numbers of cpus.

      --
      "Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
  38. I don't understand... by Anonymous Coward · · Score: 2, Informative

    I'm trying to understand the point of this article. Do we really need a new paper to say that centralized memory bandwidth is at some point a limiting problem in an SMP environment? Isn't this why we have NUMA?

    If you want to go after linux internals like the BKL more power to you but that horse left the stable a long long time ago as well.

    You could talk about the software problem in dealing with decentralized memory access, synchronization, scalable algorithms...etc but this is all likely something needing to be addressed in application space rather than at the kernel where this paper seems to focus.

    There is no shortage of huge single system image Linux systems with thousands of processor cores, and not a single one of them uses an SMP architecture. They are all NUMA based (decentralized memory access).

  39. other kernels by TheSHAD0W · · Score: 1

    Are there other open-source OSes which are better suited to more parallelism? The Hurd, perhaps?

    1. Re:other kernels by Beelzebud · · Score: 2, Interesting

      Possibly, but they still have tons of work to do. I recently installed Arch Hurd http://www.archhurd.org/ just to get some hands on time with the state of the OS, and was kind of surprised at the status. Many things are in place and work correctly, but it's nowhere near something I could say I'd actually want to use on a daily basis.

  40. Patches available by diegocg · · Score: 3, Informative

    So, they found scalability problems in some microbenchmarks. Well, some of the scalability paths cited in the paper will be fixed when Nick Piggin's VFS scalability patchset gets merged. But it's not like you need to rewrite every operating system to scale beyond 48 cores, it's just the typical scalability stuff, and the kind of scalability issues found these days are mostly corner cases (Piggin's VFS being an exception).

  41. Just the cache problem by Todd+Knarr · · Score: 4, Informative

    What they're saying is basically two things:

    First, there's a bottleneck in the on-chip caches. When a core's working on data it needs to have it in its cache. And if two cores are working on the same block of memory (block size being determined by cache line size), they need to keep their cached copies synchronized. When you get a lot of cores working on the same block of memory, the overhead of keeping the caches in sync starts to exceed the performance gains from the additional cores. That's not new, we've known that in multi-threaded programming for decades: when you've got a lot of threads dependent on the same data items, the locking overhead's going to be the killer. And we've known the solution for just as long: code to avoid lock contention. The easiest is to make it so you don't have multiple threads (cores) working on the same (non-read-only) memory at the same time, that just requires some thinking on the part of the developers.

    Second, you only gain from additional cores if there's workload to spread to them usefully. If you've got 8 threads of execution actually running at any given time, you won't gain from having more than 8 cores. And on modern computers often we don't have more than a few threads actually using CPU time at any given moment. The rest are waiting on something and don't need the CPU and, as long as we aren't thrashing execution contexts too badly, they can be ignored from a performance standpoint. To take advantage of truly large numbers of cores, we need to change the applications themselves to parallelize things more. But often applications aren't inherently multi-threaded. Games, yes. Computation, yes. But your average word processor or spreadsheet? It's 99% waiting on the human at the keyboard. You can do a few things in the background, file auto-save and such, but not enough to take advantage of a large number of cores. The things that really take advantage of lots of cores are things like Web servers where you can assign each request to its own core. And no, browsers don't benefit the same way. On the client side there are so (relatively) few requests and network I/O's so slow relative to CPU speed that you can handle dozens of requests on a single core and still have cycles free assuming you use an efficient I/O model. But it all boils down to the developers actually thinking about parallel programming, and I've noticed a lot of courses of study these days don't go into the brain-bending skull-sweat details of juggling large numbers of threads in parallel.
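
    To make the "don't have multiple threads working on the same writable memory" advice concrete, here is a minimal sketch in Python. The function names (partial_sum, partitioned_sum) are made up for illustration. Note the assumption: CPython's global interpreter lock keeps these threads from running bytecode in parallel, so this shows the shape of the technique (private slices, one combine step at the end), not a speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(chunk):
    # Each worker reads only its own slice and returns a private result,
    # so no writable memory is shared while the workers run.
    total = 0
    for value in chunk:
        total += value
    return total


def partitioned_sum(data, workers):
    # Split the input into contiguous, non-overlapping slices,
    # at most `workers` of them (workers must be >= 1).
    size = max(1, -(-len(data) // workers))  # ceiling division
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # The only point where results meet is this final combine step.
        return sum(pool.map(partial_sum, chunks))
```

    Calling partitioned_sum(list(range(1000)), 8) gives the same answer as the built-in sum. The point of the layout is that nothing inside partial_sum ever needs a lock.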

    1. Re:Just the cache problem by 14erCleaner · · Score: 1

      Second, you only gain from additional cores if there's workload to spread to them usefully.

      Yes, but "cores" are the new "gigahertz". The MBAs now need 8-core processors in their laptops, whereas a few years ago they all needed 3 GHz processors. It doesn't matter if they're useful, it's just an ego thing.

      --
      Have you read my blog lately?
  42. Windows? by Dunbal · · Score: 0, Troll

    Luckily we aren't anywhere near 48 cores and there is some time left to come up with a new Linux (Windows?).

    Emphasis mine.

    Oh please. Don't list an OS that has trouble running on 1 core as a possible solution.

    --
    Seven puppies were harmed during the making of this post.
  43. K42: these problems were already tackled by compudj · · Score: 5, Informative

    The K42 project at IBM Research investigated the benefit of a complete OS rewrite with scalability to very large SMP systems in mind. This is an open source operating system supporting Linux-compatible API and ABI.

    Their target systems, "next generation SMP systems" back in 2003, seem to have become the current generation of SMP/multi-core systems in the meantime.

  44. article on physorg by Anonymous Coward · · Score: 1, Interesting

    explains it rather well, imho.

    http://www.physorg.com/news205050157.html

    "In a multicore system, multiple cores often perform calculations that involve the same chunk of data. As long as the data is still required by some core, it shouldn't be deleted from memory. So when a core begins to work on the data, it ratchets up a counter stored at a central location, and when it finishes its task, it ratchets the counter down. ...
    As the number of cores increases, however, tasks that depend on the same data get split up into smaller and smaller chunks. The MIT researchers found that the separate cores were spending so much time ratcheting the counter up and down that they weren't getting nearly enough work done."
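
    The fix described in the summary at the top of the thread (each core keeps a local count that is only occasionally synchronized with the others) can be sketched roughly like this. This is an illustration in Python, not the kernel's code: the class name, the threshold parameter, and the flush policy are all assumptions made for the example.

```python
import threading


class SloppyCounter:
    """Reference count split into a shared total plus per-worker local deltas."""

    def __init__(self, workers, threshold=8):
        self.central = 0                # shared value, guarded by the lock
        self.lock = threading.Lock()
        self.local = [0] * workers      # one private delta per worker
        self.threshold = threshold      # flush once |delta| reaches this

    def add(self, worker, amount):
        # Fast path: touch only this worker's own slot, no lock taken.
        self.local[worker] += amount
        if abs(self.local[worker]) >= self.threshold:
            self.flush(worker)

    def flush(self, worker):
        # Slow path: fold the local delta into the shared total.
        with self.lock:
            self.central += self.local[worker]
        self.local[worker] = 0

    def value(self):
        # Exact only when no worker is updating concurrently;
        # meant to be called once the workers are quiescent.
        for worker in range(len(self.local)):
            self.flush(worker)
        return self.central
```

    With a threshold of 8, a worker touches the shared total once per eight increments instead of every time. The price is that the shared total is stale between flushes, which is fine for "is anyone still using this?" checks that happen rarely.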

  45. Exactly by flashpro · · Score: 1

    http://xkcd.com/619//. 'Nuff Said.

  46. Aren't anywhere near 48 cores? by TheTrueScotsman · · Score: 1

    I'm using multiple servers right now (and have been for the past year) with 24 cores (4 x 6 cores running Debian Linux and Windows Server 2008). No performance problems at the moment but thanks for the heads-up.

  47. Well, they better start coding now... by MrWin2kMan · · Score: 1

    Intel has 12-core Xeons in the pipeline, and HP (and IBM, etc.) have quad-socket servers...with Hyper-Threading, that's 96 logical cores presented to the OS.

    --
    Nothing to see here but us trolls...move along...
    1. Re:Well, they better start coding now... by WhitePanther5000 · · Score: 1

      There's nothing stopping you from building a 4 socket x 12 core Opteron system today... except for maybe your budget.

    2. Re:Well, they better start coding now... by NuclearRampage · · Score: 1

      But that's still not 48 cores on a single chip. So it's not a big deal for most of us buying business type hardware that isn't even close to having 48 cores on a single CPU.

    3. Re:Well, they better start coding now... by Surt · · Score: 2, Insightful

      PER CPU. As was pointed out in many other comments. Linux has already scaled to thousands of cores across many cpus.

      --
      "Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
    4. Re:Well, they better start coding now... by MrWin2kMan · · Score: 1

      Well, yeah, but that's AMD stuff...

      --
      Nothing to see here but us trolls...move along...
  48. Tilera? by Anonymous Coward · · Score: 3, Informative

    Tilera Corp. already has CPU architecture with 16-100 cores per chip.
    TILE-Gx family

    Support for these is already being included in the mainline kernel.

  49. Slashdot by carrier+lost · · Score: 3, Funny

    ...there is some time left to come up with a new Linux (Windows?).

    Windows, the new Linux.

    You read it here first...

  50. Already Problematic with 4 cores by chocapix · · Score: 2, Interesting

    Using "cat /proc/cpuinfo" as a benchmark, I can see that my quad core is several times slower with an SMP kernel compared to a non-SMP kernel.

  51. I'm sry I thought I was on slashdot not wikipedia by Anonymous Coward · · Score: 0

    So, I wake up, check out a slashdot article, and lo and behold, I see a bunch of nerds having a nerd war about nerd knowledge. Did someone poison my DNS cache so that slashdot points to wikipedia.org and then reskinned wikipedia to look like slashdot?

    Sure as fuck feels that way.

    Have any of you participating in this conversation even read the conversation? You should be embarrassed and turn off your machines now if so. I don't know why the fuck we're concerned about how software can't keep up with more than 48 cores when it is clear that our own brains can't keep up with one another in a fluid conversation without devolving into an arrogant discussion about whose OPINION is right. For shame, ./, for shame.

  52. Re:Linux needs a rewrite anyway. by bberens · · Score: 1

    Hey watch it buddy, I resemble that remark!

    --
    Check out my lame java blog at www.javachopshop.com
  53. So what the fuck is he doing here then? by SmallFurryCreature · · Score: 5, Funny

    Lets drive the greenhorn OUT! No filthy high UID's with their spelling and gramar and solid well researched non-sensationlist writing. I want my editors to rape the language (bonus points if it is several languages at once) and sent my heart racing by raising my bile and fear of the unknown and known.

    Headlines sell adverts. Truth, accuracy, honesty do not. Accept it, you are reading slashdot, it works.

    --

    MMO Quests are like orgasms:

    You may solo them, I prefer them in a group.

    1. Re:So what the fuck is he doing here then? by icebraining · · Score: 3, Insightful

      Headlines sell adverts. Truth, accuracy, honesty do not. Accept it, you are reading slashdot, it works.

      No, I read /. because of comments like eldavojohn's. If they were to disable the comments I'd unsubscribe it from my feeds immediately.

    2. Re:So what the fuck is he doing here then? by StikyPad · · Score: 1

      Slashdot seems to be a lot like Playboy in that regard... nobody reads it for the articles.

      Although TBH, Slashdot's pictorials could use some work.

    3. Re:So what the fuck is he doing here then? by tyrione · · Score: 1

      Slashdot seems to be a lot like Playboy in that regard... nobody reads it for the articles.

      Although TBH, Slashdot's pictorials could use some work.

      Are you implying they jerk off to the headlines? Where are those damn centerfolds, Slashdot!

  54. Re:Linux needs a rewrite anyway. by Anonymous Coward · · Score: 0

    Hate to tell you this but someone's been pulling your leg. Really, you can stop doing that now.

  55. y2k by Anonymous Coward · · Score: 0

    y2k all over again? :))

  56. It hits a peak? by Anonymous Coward · · Score: 0

    > may be hitting a peak somewhere in the neighborhood of 48 cores

    There's the solution then - more cores. Since 48 is the peak, it must start getting quicker if you throw more at it. QED.

  57. Haiku by Anonymous Coward · · Score: 0

    Hopefully the Haiku Project will be in a good place to pick up the slack by then.

    http://www.haiku-os.org/

  58. BeOS! by JonnyO · · Score: 2

    If BeOS had survived this wouldn't be an issue. Cores and threads everywhere! But noooooooo...

  59. OT - your sig by zooblethorpe · · Score: 1

    Just a bit of linguistic trivia: yaban in Japanese means "barbarian". Made me chuckle.

    Cheers,

    --
    "What in the name of Fats Waller is that?"
    "A four-foot prune."
    1. Re:OT - your sig by Gilmoure · · Score: 1

      Cool! Fits in with the long hair, unruly beard and pale skin.

      --
      I drank what? -- Socrates
    2. Re:OT - your sig by Anonymous Coward · · Score: 1, Interesting

      Yabanjin is a better fit for "barbarian" and yaban is closer to "barbaric or uncivilized".
      For example, "Yabanjin desu. Yaban na kuni amerika kara yattekimasita." ("I am a barbarian. I came from America, a barbaric country.")

  60. Multi cores seem to be worth something after all by yelvington · · Score: 1

    I watch embedded and full-screen Flash videos all the time on a $400 Acer Aspire laptop. That's with a dual-core Celeron. Hulu, YouTube, Vimeo, on-site or embedded in somebody's blog, internal display or big external monitor, all of them work great under Ubuntu.

    My daughter's single-core Atom netbook, on the other hand, does get choppy.

  61. BS on not being near 48 cores... I have 34 already by Fallen+Kell · · Score: 2, Informative

    I have 34 systems which have 48 cores already in the server room. These are quad socket systems with 4 AMD 12-core CPUs. So I call BS on the guys who think we have plenty of time, because there are plenty of people deploying these things already.

    --
    We were all warned a long time ago that MS products sucked, remember the Magic 8 Ball said, "Outlook not so good"
  62. no - doesn't address cores per CPU issue by rubycodez · · Score: 1

    Solaris (and the defunct opensolaris) has the exact same issue when scaling up the cores per CPU. this badly written article was about cache constrained shared memory usage.

    besides, Solaris doesn't scale as Linux does, despite the hype. Solaris doesn't scale *down* to the PDA level nor *up* to the monster NUMA architectures Linux does.

    Oracle imagines they can make a unified IBM or Unisys type mainframe vertical stack with it now. But that won't work as commodity hardware in clusters can run Oracle's main applications faster and more cheaply than any ultrasparc box.

    Solaris dying, OpenSolaris is dead.

  63. eh? for geniuses by Anonymous Coward · · Score: 0

    There is interesting new research coming out of MIT[...]

    Who'd thunk of that now?

  64. You're Welcome! by eldavojohn · · Score: 2, Funny

    However, posting your own post in your own post is a bit excessive, and there could have been better ways to do this than just repost your entire freakin story as the first comment.

    Yo dawg, I heard you liked my post so I put a post inside my post so you could enjoy it while you're enjoying my post!

    --
    My work here is dung.
  65. I have more than 48 cores. by 140Mandak262Jamuna · · Score: 1

    I let all my friends whitewash the fence for a fee, and most of them paid with apple cores, apart from dead cat in a string, a blue bottle glass to look through and a kite in good repair. I have more than 48 cores and now this! Well, going to give the whole charade up and become a Pirate in the Spanish Main.

    --
    sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
    1. Re:I have more than 48 cores. by guyminuslife · · Score: 1

      I'm wondering if your mind is for rent. For instance, by a god, or a government.

      Also, I think your company is pretty fucked up.

      --
      I don't believe in time. It's a grand conspiracy designed to sell watches.
  66. Disaster by squidguy · · Score: 1

    Damn...this is going to seriously limit how many concurrent instances of goatse I can view.

  67. Terminology by eclectro · · Score: 1

    When Linux is run on the 49th cpu in a system, that can be called the hardcore.

    --
    Take the cheese to sickbay, the doctor should see it as soon as possible - B'Elanna Torres, "Learning Curve"
  68. Bullshit :) by RichiH · · Score: 1

    That would be why they have been displaying a "thanks for helping make Slashdot great, wanna disable ads?" notice for me, and a lot of others, for years.

  69. Solaris by turgid · · Score: 1

    That's why intel is keen on Solaris. It already scales. I'm sure Oracle will manage to put a spanner in the works somehow, though. Then it will be more economical to rewrite Linux.

  70. Dragonfly BSD already addresses this by Anonymous Coward · · Score: 0

    There's nothing BSD cannot do.

    From Wikipedia:

    In DragonFly, threads are locked to CPUs by design, and each processor has its own LWKT scheduler. Threads are never preemptively switched from one processor to another; they are only migrated by the passing of an "inter-processor interrupt" (IPI) message between the CPUs involved. Inter-processor thread scheduling is also accomplished by sending asynchronous IPI messages. One advantage to this clean compartmentalization of the threading subsystem is that the processors' on-board caches in SMP systems do not contain duplicated data, allowing for higher performance by giving each processor in the system the ability to use its own cache to store different things to work on.

    and from dragonfly bsd:

    DragonFly belongs to the same class of operating system as BSD and Linux and is based on the same UNIX ideals and APIs. DragonFly gives the BSD base an opportunity to grow in an entirely different direction from the one taken in the FreeBSD, NetBSD, and OpenBSD series.

    The DragonFly project's ultimate goal is to provide native clustering support in the kernel. This involves the creation of a sophisticated cache management framework for filesystem namespaces, file spaces, and VM spaces, which allows heavily interactive programs to run across multiple machines with cache coherency fully guaranteed in all respects. This also involves being able to chop up resources, including the cpu by way of a controlled VM context, for safe assignment to unsecured third-party clusters over the internet (though the security of such clusters itself might be in doubt, the first and most important thing is for systems donating resources to not be made vulnerable through their donation).

  71. Re:BS on not being near 48 cores... I have 34 alre by internettoughguy · · Score: 1

    These are quad socket systems with 4 AMD 12-core CPU's.

    That's not the problem; 48 cores on one chip is the problem.

  72. Linux is for wimps by jacobsm · · Score: 1

    z/OS on a z196 processor supports up to 80 CPUs per lpar with up to 1TB memory per lpar. Hiperdispatch technology alleviates most of the MP effects of dispatching tasks in large CPU configurations. Parallel sysplex technology provides for the intelligent dispatch of units of work across up to 32 loosely coupled systems. Do the math. 80*32=2560 processors, 32 TB memory. Full fault tolerance for the hardware and the OS. Rolling IPLs of each lpar allow the rest of the sysplex to keep on doing your critical business work.

  73. It's a good thing Solaris can handle... oh, crap by rockrat · · Score: 1

    Well, Solaris has handled many more than 48 cores for years. Too bad Solaris' future looks grim at best, following Oracle's acquisition of Sun.

  74. "Luckily we aren't anywhere near 48 cores" by toby · · Score: 1

    Heck, most of us don't even have 640KB RAM.

    --
    you had me at #!
  75. 2304 cores by Anonymous Coward · · Score: 0

    Maybe the Cray people (cray.com) can tell the people from kernel dev, slashdot.org or MIT (somebody is wrong!) what to do.
    Cray Linux supports 2304 cores!!!!!

  76. Comment removed by account_deleted · · Score: 1

    Comment removed based on user account deletion

  77. 48 is not a big deal by Anonymous Coward · · Score: 0

    I remember reading a statement from Intel affirming they had produced an 80 core processor which they didn't intend to put into market.

  78. Re:Linux needs a rewrite anyway. by Anonymous Coward · · Score: 0

    Aww.. Is someone upset that they can't figure out how to make WINE run their favorite eroge? Keep fapping on Windows, Linux doesn't want you anyway.

  79. Beowolf cluster! (core-ish) by rusl · · Score: 1

    First!

    --
    Stupidity is its own reward.
  80. Re:BS on not being near 48 cores... I have 34 alre by sys_mast · · Score: 1

    I'd second this. We're already at dozens of cores in a regular server...even a year ago we had those, so all those talking about big iron boxes from Sun and others, HP sells them. However this is all pointless since the summary seems to have meant to say cores in a single socket. We're a ways from that in normal day-to-day servers. (but maybe not too far ;)

    --
    Those who can, do.
  81. OS X by greatcaffeine · · Score: 1

    Does anyone know how OS X will do with 48+ cores? I know Snow Leopard was supposed to improve the scaling to an extent with Grand Central Dispatch, but I don't think Apple went so far as to test the performance with 48 core machines.

  82. Speaking as a layman.. by nanospook · · Score: 1

    Thank you for an enlightening response. Very interesting reading.. One question that popped up in my mind as I read it is does this mean that one or more cores need to be reserved to "manage" the other cores and determine the point at which adding more cores to a "problem" is slowing performance vs speeding up performance? Perhaps the measurement of speed and efficiency needs to become more AI and intuitive on an on-going basis. Obviously the researchers could determine that at some point adding more cores wasn't improving performance. Could this kind of observation be built into the operating system so the cores are managed better?

    --
    Have you fscked your local propeller head today?
    1. Re:Speaking as a layman.. by dgatwood · · Score: 1

      Ostensibly, if this problem were refactored into a work-unit-based API like GCD, then yes, there would be a thread to manage the queues, and it would hop from core to core as needed, and the details would depend a lot on the thread scheduling algorithm, and the optimal number of cores would be equal to the maximum number of concurrent work units that might be available to run at any given time plus one for queue management, in which case that management thread would sit on its own core.

      The real problem is that they didn't divide up the problem into dependency-free work units. At least in theory, if adding more cores slows down performance, then somebody wrote the code wrong. A slowdown means one of two things: either A. that somebody is depending on somebody else's computation but is sucking off cycles while it does so or that B. the problem is infinitely parallelizable but the time required to reassemble the final result ends up being a hard lower bound to the algorithm. Oh, or C. that they haven't properly separated input pages from output pages and the cache coherency overhead is killing them.

      If the problem is A., then if work is divided properly, it should simply stop speeding up (provided that the kernel isn't doing something silly like ignoring processor affinity for the work threads or running into some limitation of the VM system or scheduler design) because it should not be possible to have more threads than there are work units available to run, and it should not be possible for work units to share data in a way that hurts performance. Unfortunately, most software isn't written this way.
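
      As a rough sketch of what "divided properly" could look like, here is a toy Python version: each work unit reads only its own chunk and returns a private result, and the pool never has more workers than work units. The names split and work_unit are made up for illustration, and this is not GCD or anything from the paper. CPython threads won't actually speed up CPU-bound work, so this only shows the structure.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, pieces):
    """Cut data into at most `pieces` contiguous, non-overlapping chunks."""
    size = max(1, -(-len(data) // pieces))  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def work_unit(chunk):
    # Reads only its own chunk and returns a private result: no shared state.
    return sum(x * x for x in chunk)


data = list(range(1000))
chunks = split(data, pieces=8)

# Never start more workers than there are work units to run.
with ThreadPoolExecutor(max_workers=max(1, len(chunks))) as pool:
    partials = list(pool.map(work_unit, chunks))

total = sum(partials)  # the only serial step: combining the partial results
print(total)
```

      The final sum(partials) is the one place where results come back together, which is exactly the step that case B is about.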

      If the problem is B., then it still isn't something the OS can really guess until the computation is done, at which point it is too late to do anything about it. It would be algorithm-dependent, and would basically be bounded by performance of the final series of memcpy operations or whatever at the end.

      At that point, you're probably bounded by raw memory bandwidth more than anything else. Either way, the OS can't possibly guess whether adding more cores will help because it can't know that the threads are all going to have to perform one high-contention serialization of the results at the end or whatever. Maybe the OS could provide some services to help the application author determine the optimal number of pieces to split the work into, but at least in my mind, it seems likely that it's just too algorithm-dependent for the responsibility to fall anywhere but on the app writer at that point.

      To give an analogy, imagine that you are building a wall. You have a brick layer and ten servants to carry concrete blocks, each of whom can carry only one block at a time. You upgrade to thirty younger, smaller servants, but each can now carry only a block of half the size. So you've increased the rate at which the blocks get there by half, but you've now made twice as much work for the brick layer putting it together, and since the entire task is still serialized on the single brick layer, performance drops in that step. At some point, if you keep decreasing the size of the blocks and increasing the number of workers, the extra time that the brick layer takes at the end cancels out the time saved in moving the blocks, and beyond that point, the total construction time starts to get longer again. This is not at all an unusual situation in computer science. Many problems end with certain tasks that can't be parallelized very easily or that become progressively less parallelizable as the data trickles from the leaves of some sort of dependency tree to the root.

      Now imagine that the brick layer won't arrive until the servants have moved all the bricks or blocks. An outside observer looking only at the bricks would see that more blocks are getting there faster and will naively assume that the task will get done faster with more pieces. That's the extent to which the OS can realistically have insight into the process; it can only see what is happening, not what will eventually happen.
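
      The wall analogy can be put into a toy model, assuming the carrying phase splits evenly across workers and the brick layer then handles one piece per worker, serially. The function name and both cost constants are invented for illustration; they are not measurements of anything.

```python
def total_time(workers, carry_work=120.0, merge_cost_per_piece=0.5):
    """Toy model: a parallel carrying phase followed by a serial assembly phase.

    carry_work and merge_cost_per_piece are made-up numbers, not measurements.
    """
    carrying = carry_work / workers              # shrinks as workers are added
    assembling = merge_cost_per_piece * workers  # one piece per worker, laid serially
    return carrying + assembling


times = {n: total_time(n) for n in (1, 2, 4, 8, 16, 32, 48, 64)}
best = min(times, key=times.get)

for n, t in times.items():
    print(f"{n:3d} workers -> {t:6.1f} time units")
print("fastest with", best, "workers")
```

      With these particular constants the total time bottoms out at 16 workers and climbs again after that. An observer watching only the carrying term would see it keep improving all the way to 64.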

      --

      Check out my sci-fi/humor trilogy at PatriotsBooks.

  83. Let's explain this the touchy feelie way by nanospook · · Score: 1

    If you pull the CPU chip thingie out and lick it with your tongue, you will feel many many more bumps on a CPU that has many cores.. Those bumps are like nipples. The more nipples you can fit in at one time the more milk you can get in one suck!

    --
    Have you fscked your local propeller head today?
  84. There is no problem, move along by Technomancer · · Score: 1

    Just CmdrTaco trolling for ad impressions
    From TFA:
      “slightly rewriting the Linux code so that each core kept a local count, which was only occasionally synchronized with those of the other cores, greatly improved the system’s overall performance.”

    “The fact that that is the major scalability problem suggests that a lot of things already have been fixed. You could imagine much more important things to be problems, and they’re not. You’re down to simple reference counts,” Kaashoek said. “Our claim is not that our fixes are the ones that are going to make Linux more scalable,”
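
    For anyone wondering what "each core kept a local count, which was only occasionally synchronized" might look like, here is a minimal Python sketch. It is not the MIT patch or kernel code: threads stand in for cores, and SloppyCounter, threshold, and sync are names made up for the example.

```python
import threading


class SloppyCounter:
    """A count split into per-thread local deltas plus one shared total."""

    def __init__(self, threshold=64):
        self.threshold = threshold        # hypothetical batching limit
        self._shared = 0                  # the contended, globally visible count
        self._lock = threading.Lock()
        self._local = threading.local()   # each thread sees its own .delta

    def add(self, n=1):
        # Common case: touch only this thread's private delta, no lock taken.
        delta = getattr(self._local, "delta", 0) + n
        if abs(delta) >= self.threshold:
            self._flush(delta)
            delta = 0
        self._local.delta = delta

    def _flush(self, delta):
        with self._lock:
            self._shared += delta

    def sync(self):
        """Push this thread's pending delta into the shared total."""
        self._flush(getattr(self._local, "delta", 0))
        self._local.delta = 0

    def approximate(self):
        # May lag behind by up to threshold - 1 per thread until they sync.
        return self._shared


def worker(counter, increments):
    for _ in range(increments):
        counter.add(1)
    counter.sync()


counter = SloppyCounter(threshold=64)
threads = [threading.Thread(target=worker, args=(counter, 1000)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.approximate())  # 8000 once every thread has synced
```

    The trade-off is that the shared value is only approximate between syncs, which is fine for a reference count as long as you sync before deciding it has really hit zero.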

  85. Re:Multi cores seem to be worth something after al by Anonymous Coward · · Score: 0

    I bet the per-core performance of your dual-core Celeron exceeds that of the Atom. Any time I run a Flash-intensive site, it always seems single-threaded. A single-core PIV 3.6GHz would also exceed your daughter's Atom on performance.

  86. pont by Anonymous Coward · · Score: 0

    This is weird, because all supercomputers have crazy numbers of CPU cores, and they are mostly running Linux.

    e.g.
    System: HP Cluster Platform 3000 BL460c
    Processor cores: 13728 (Xeon 53xx, 2.66 GHz, Infiniband)
    Performance: 102.8 teraflops Rmax, 146 teraflops Rpeak
    Operating system: Linux
    Manufacturer: HP
    Owner: Swedish state, FRA

  87. 48 cores by whitroth · · Score: 1

    "Who's even got 48 cores?"

    Yo! Here. Honkin' powerful servers from Penguin (not so wild about them, but that's who we bought from), SuperMicro m/b H8QG6-F, AMD chips, Opteron 6172... and four of 'em. We're running the current CentOS, 5.5.

                      mark

  88. Re:BS on not being near 48 cores... I have 34 alre by mcfedr · · Score: 1

    That's precisely the problem of distributed caches, and of trying to put threads of the same job on the same physical chip so they share caches.

  89. Well it may need help but ... by Anonymous Coward · · Score: 0

    Well, it may need more work, but ... SGI ran Linux on a very large SMP/NUMA machine years ago. It may be true that this 100+ CPU system only ran Linux for a short time (with only modest work), but it proved internally that ccNUMA and Linux were a viable pair. This was done about 2002. The machine had an internal network name of "stinger". Stinger did make the Top500 list running IRIX...

  90. Re:Linux needs a rewrite anyway. by Tubal-Cain · · Score: 1

    I just compile that part out.

  91. We're already there... by Anonymous Coward · · Score: 0

    We're already at 48 cores with 64GB RAM, and for under $7200 Canadian.

    1. Re:We're already there... by Anonymous Coward · · Score: 0

      forgot my link - http://www.newegg.ca/Product/ComboDealDetails.aspx?ItemList=Combo.492276

  92. Windows?? haha by Anonymous Coward · · Score: 0

    I am willing to bet that Windows will NOT work with 48 cores. Windows is not an enterprise OS like Linux is. Windows doesn't even support PAE correctly in most versions.

  93. Oh crap! by Anonymous Coward · · Score: 0

    47 it is.