Running 100,000 Parallel Threads
An anonymous reader writes "This story explains how the latest Linux development kernel is now able to start and stop over 100,000 threads in parallel in only 2 seconds (about 14 minutes 58 seconds faster than with earlier Linux kernels)! Much of this impressive work is thanks to Ingo Molnar, author of the O(1) scheduler recently merged with the 2.5 Linux development kernel."
I frequently hear people bitching about pthread lib and how f*cked up it is... is this going to change the way we use threads too? :-)
The linux song
Arbitrary sig
this image springs to mind
It takes two seconds to start 100,000 threads???? Piff! With my ME computer, it doesn't matter how many parallel threads I am running... I can stop them all instantly by simply attempting to use my computer :P.
And this is great news, and, indeed, impressive. But my question is, what (if any) change is this going to make to my daily use of linux (for gcc, reading slashdot, and that's about it...) Am I going to notice any performance differences?
Launch 100,000 threads while I walk away. . .
OK I'll shut up now.
This is very cool; but does it scale to multiple CPU systems? More and more, SMP, split-bus and multi-core architectures are going to be taking over. If this holds up in those environments, Linux may actually have a leg up on some of the dedicated task heavyweights.
Says the RIAA: When you EQ, you're stealing bass!
Got a link for that?
So now I'm able to open up 100.000 pr0n pictures in just 2 sec. Ubercool ;-)
Thomas S. Iversen
Why so many threads? "Because we can :)"
We were all warned a long time ago that MS products sucked, remember the Magic 8 Ball said, "Outlook not so good"
"Hello, my name is Ingo Molnar. You killed -9 my process: prepare to die."
:P
Sorry, had to
At school (before I graduated so long ago) we would "fork bomb" the compute servers [ while (1) { fork(); } ] in an attempt to extend deadlines or simply be assholes :)
Religion is a gateway psychosis. -- Dave Foley
Just out of curiosity, how does the benchmark on Windows compare?
- Jeff Brubaker
I'm building a project where there will be one huge database with up to 200 different companies connected to it pretty much nonstop. 1-10 users from every company depending on the time of the year. 2 threads for every connection.
200*10*2=4000 threads.
Could you please refrain from using "boxen". It makes my head hurt
Im not here now... Im out KILLING pepperoni
I have no idea what the hell you're talking about but it certainly sounds impressive. :)
-
Now we finally have the power to run 99,999 pop up ads when we visit that pr0n site
Very interestingly enough, either windows has a quota, or some sort of memory leak or something...
Max I can create in a process is 2031 threads... That being done in 700ms.
It's odd cause I can create more if I run several processes. It doesn't look like the kernel is choking on thread creation...
will investigate more.
Normally I am of the "use only as many threads as CPUs" school of thought, but I can think of a reason to use 100,000 threads - imagine a large FTP server, or a multi-homed HTTP server, where you need to provide each connected user with his own set of access privileges or filesystem context. A one-thread-per-connection server may be the easiest way to build security into the system.
So this means Garry Kasparov can get beaten at chess that much faster now?
I hate sigs.
It's much more interesting to have that many processes, not threads, on a UNIX-like system...
Leave threads to the Windows folks...
There was a patch for an O(1) scheduler a while back. What this means is that it takes the same amount of time to select what runs next, no matter how much is running. But you won't notice an improvement unless you have about 200 processes running at the same time. This may be good for servers and the like, but it's a lot slower if you have few processes running. Keep this in mind...
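For anyone wondering what "O(1)" buys you, here is a toy sketch in Python. It is not kernel code, and the class and method names are made up for illustration. The idea shown is one FIFO queue per priority level plus a bitmap of non-empty levels, so picking the next task is a find-lowest-set-bit plus a pop, regardless of how many tasks are queued. The real scheduler has a lot more going on (per-CPU run queues, timeslices and so on), none of which is modeled here.

```python
from collections import deque

NUM_PRIOS = 140  # the 2.5 scheduler uses 140 levels; any fixed number works


class ToyRunQueue:
    """Hypothetical toy run queue: lower index means higher priority."""

    def __init__(self):
        self.queues = [deque() for _ in range(NUM_PRIOS)]
        self.bitmap = 0  # bit p is set while queues[p] is non-empty

    def enqueue(self, task, prio):
        self.queues[prio].append(task)
        self.bitmap |= 1 << prio

    def pick_next(self):
        if self.bitmap == 0:
            return None  # nothing runnable
        # Lowest set bit = highest-priority non-empty queue.
        prio = (self.bitmap & -self.bitmap).bit_length() - 1
        task = self.queues[prio].popleft()
        if not self.queues[prio]:
            self.bitmap &= ~(1 << prio)
        return task


rq = ToyRunQueue()
rq.enqueue("batch job", 120)
rq.enqueue("editor", 100)
rq.enqueue("compiler", 120)
order = [rq.pick_next() for _ in range(4)]
print(order)  # ['editor', 'batch job', 'compiler', None]
```

The cost of `pick_next` depends on the number of priority levels, which is fixed, and not on the number of queued tasks.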
I thought Linux didn't have real threads, and they were implemented as processes... Am I missing something?
Uh, why did that get moderated as a troll? Oh, right, Linux is absolutely perfect, and anyone who says otherwise must be a troll.
Come on, Linux's scheduler has long been known to have performance problems once you have a lot of processes/threads... for example, read this paper [text version] (appropriately subtitled "How I Learned to Love the Alpha and Hate the Scheduler"):
Moderators, don't be Slashbots, moderating according to the groupthink. Educate yourselves, and you'll be better moderators, and better people.
I suppose this means that sites will want to switch to Linux/Apache in order to avoid being incapacitated when linked by Slashdot?
Every thread uses a minimum of *1 PAGE* of reserve memory for its stack, which is 64K. However, you have to go out of your way to use less than 1 megabyte of reserve memory. Since only 2GB of reserve memory (addressable memory) is available to user applications, this would fit your 2000 thread figure like a glove.
C//
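The arithmetic behind the "fits like a glove" remark, as a quick Python sketch. The figures (2 GB of user address space, 1 MB default reserve, 64K minimum) are the ones quoted in the post; the function name is made up, and the calculation ignores everything else mapped into the process apart from what the 2031 figure implies.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def max_threads(address_space, stack_reserve, already_used=0):
    """Upper bound on threads if each one reserves stack_reserve bytes."""
    return (address_space - already_used) // stack_reserve


user_space = 2 * GIB
default_cap = max_threads(user_space, 1 * MIB)   # with the 1 MB default reserve
small_cap = max_threads(user_space, 64 * 1024)   # with the 64K minimum
# Address space that must be in use elsewhere if 2031 threads was the ceiling:
other_use_mib = (user_space - 2031 * MIB) // MIB
print(default_cap, small_cap, other_use_mib)  # 2048 32768 17
```

So a ceiling of 2031 threads is what you would expect if roughly 17 MB of the 2 GB was already taken by code, heap and the like.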
It's nice that the Linux kernel can handle that many threads. But user level threads generally are even more lightweight, and high performance implementations like those on Solaris provide both user level and kernel level threads and map the former onto the latter. Is Linux going to get something similar? Is Sun perhaps donating their implementation? Or are these new kernel threads so lightweight and quick that they are competitive with Solaris on their own, without the mess and complication of adding user level threads?
How will this change affect Mozilla, the Sun JVM and OpenOffice, for instance.
While it probably is generally true that it will take some time for most applications to start using the new threading model some larger applications could support it fairly soon.
Can we expect these applications to be adapted to the new threading model some time soon, and how will it affect performance?
The Internet is full. Go Away!!!
Be careful who you call a dumb fuck. Netscape had a functional browser long before IE3, arguably the first usable version of IE. And it would not surprise me if Netscape 1 predated IE 1, though I can't say I know that for sure.
Speeding The Net is an excellent book about Netscape vs Microsoft, in case anybody cares (it's been a long while since I read it, thus why my date memory is rusty).
Anarchy$ dd if=/dev/random of=~/.signature bs=120 count=1
...will start writing horrible monsters running hundreds and thousands of threads, and their creations will suffer from all other shortcomings of that decision.
Contrary to the popular belief, there indeed is no God.
(I'm sorry. I had to do it.)
I do security
I ran this in DOS:
prompt "Enter Password:"
No one could figure out that all i did was change the prompt from "$P$G" to that, and everyone was asking what the password was. haha, good old teacher was infinitely frustrated as well! IT WAS BEAUTIFUL.
I got kicked out for a year (not beautiful).
100.000 threads? What nonsense; everybody knows that no computer would ever use more than 640.
Wenn ist das Nunstueck git und Slotermeyer? Ja! Beiherhund das Oder die Flipperwaldt gersput.
" - - libpthread should now be much more resistant to linking problems: even if the application doesn't list libpthread as a direct dependency functions which are extended by libpthread should work correctly."
This ought to be a big help for those of us who write plug-in modules for servers like Apache 1.x and PHP. The existing thread library doesn't work properly unless the program executable explicitly links to it, which means that my shared libraries can't take advantage of standard thread management such as pthread_atfork().
Given that Apache 2.x can utilise threads as well as processes, does this mean that you can configure a large web server with, say "MaxSpareThreads 1000000" so that you can cope when you're slashdotted ;-)?
640 should be enough for anybody!
LEXX
"Gold still represents the ultimate form of payment in the world." - Alan Greenspan, 1999
In any large group of people you will find a few idiots, a few luminaries, and a great number of average thinkers. Sometimes the only thing that separates idiots from luminaries is their lack of social grace. Welcome to democracy.
"I have opinions of my own, strong opinions, but I don't always agree with them." -- George H. W. Bush
Or perhaps know which part of the banana peel is the good part to smoke? =) GOOD JOB OPEN SOURCE!!! KEEP AT IT!
Netscape is the direct descendant of NCSA Mosaic, the Ur browser. Frankly, I don't remember what the big deal about Netscape 1.0 was, relative to Mosaic, but there was much hype. Maybe something really hardcore, like introducing background colors?
Actually I think the two big features were the "stop" button and the ability to open more than one connection at a time. (So it would load the images much faster)
They misunderestimated me. -- George W. Bush
This ought to make RedHat, Dell, IBM, and Oracle very happy, given a few of the newer contracts with large retailers using Oracle's back-end... if you read the article closely you notice that RH takes the claim for sponsoring a bunch of the work involved in developing this.
C|N>K
Combine this with Apache2's Multi-threaded or Hybrid MPM and you'll have a heck of a web-server!
And does this mean the Java will start to really scale on linux?
Alternatively, you might want to consider that Linux's scheduler was very nicely tuned for far and away the most common case - where you have only a small number of running processes.
Likewise, threading support under Linux has been oriented towards what the developers considered sane: a fairly small number of threads. They had good reasons for considering that the right way to do it - for a start, it worked nicely for what they wanted, and it was sufficiently simple that they didn't have to put in lots of complex code. Further, it's almost never a good idea to have a program architecture that requires very large numbers of threads - it generally only shows up in naive code where people simply don't understand the problems it brings. So, as far as the kernel developers were concerned, stupid people hurting themselves wasn't something to put any effort into ameliorating. This has changed recently, as people have started using Linux in areas where this kind of thing /isn't/ insane, and hence these new developments have come along.
You need to understand the reasoning behind a lot of these decisions before you can start complaining about them. First and foremost, you simply /have/ to realise that the kernel developers care about how people actually use the system, rather than crappy benchmarketing numbers. These developments have come about because people needed them, and they didn't happen earlier because no one had needed them before. Go back and read the last few years of the lkml archives, and /then/ come back and talk about this kind of thing, when you understand /why/.
himi
My very own DeCSS mirror.
Scalability is a good thing, no doubt about that. However, there is another aspect that should be pointed out: the current thread API in linux is quite different from the POSIX specification and somewhat crufty. Just to mention the biggest problems:
- missing cancellation points: testing whether a thread has been cancelled should be done in lots of system calls, but linux pthreads do not support this. Instead, you have to call pthread_testcancel() before and after every such call. A real drag.
- signal handling: linux pthread signal handling is very different from the POSIX specification. However, proper signal handling is crucial for any real world application.
- fork() will not work as expected. This is a real nuisance if you want proper daemon behaviour for your application.
- documentation of linux-specific behaviour is poor. As a result, most of the existing literature on thread programming is pretty useless for linux.
All these points can be worked around, for sure. Nevertheless, it makes writing portable software a nightmare. Porting threaded software to linux, well ...
All in all, linux threads really need much better integration with the standard system API. A lot of applications could profit from multithreading. Just think of GUI responsiveness. Also, using threads makes some programming tasks much easier. No need for asynchronous hostname lookup, for example.
A solid, well documented, standard conforming threads implementation will make linux a much nicer environment for serious programming than it already is. I am really looking forward to this.
sig intentionally left blank
Okay, where did you come up with the 64K figure, and also the 1 mega (megi) byte figure?
All Intel processors have 4KiB pages. Each Linux thread has two things of its own: its own stack, which can be as small as 1 or 2 pages if the code to run is simple enough, and also its own task_struct, which is 1 page including kernel stack for the thread. So all in all, you need 12KiB for each thread. Multiplying with the 100000 figure you get 1200000KiB or 1.144GiB, which is quite affordable for a 2GiB system.
Then, with NGPT (Next-Generation Posix Threads), those 100,000 threads would be in user space and may be even cheaper.
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
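The per-thread estimate a couple of posts up, redone as a few lines of Python so the numbers can be checked. The page counts are the ones from that post (a 2-page user stack is assumed here, the post says "1 or 2"), and the variable names are only for illustration.

```python
KIB = 1024
page = 4 * KIB
user_stack = 2 * page    # "1 or 2 pages" in the post; 2 assumed here
kernel_side = 1 * page   # task_struct plus kernel stack, per the post
per_thread = user_stack + kernel_side
threads = 100_000
total = per_thread * threads

print(per_thread // KIB, "KiB per thread")     # 12 KiB per thread
print(total // KIB, "KiB total")               # 1200000 KiB total
print(round(total / 1024**3, 3), "GiB total")  # 1.144 GiB total
```

Change `user_stack` or `kernel_side` to see how sensitive the total is to the per-thread assumptions.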
I think we need to pull some old stats out of our ass. This paper is about the 2.2.x kernel. Correct me if I'm wrong, but hasn't there been massive overhauling of the 2.4.x and 2.5.x kernels in the scheduling area?
I think I'll just slam XP performance based off of NT benchmarks and articles. What the hell, they're both from MS, so the argument must be valid.
Get a grip!
-- Many men would appreciate a woman's mind more if they could fondle it
Wrong in every respect.
First Mosaic was not the 'Ur browser'. Tim's NextStep browser was. Mosaic was browser number 15 or so. The significant things about Mosaic were that 1) it actually compiled without having to hack the code yourself or mess with 6 different support packages like tkwww and 2) it was the first X-Windows browser that did not look really amateur.
Second, Netscape does not contain any code from Mosaic, although it was written by the same main author - Eric Bina. NCSA sold the commercial rights to Mosaic to Spyglass.
Third IE was originally based on the Spyglass code, so if any browser is 'the direct descendant' it would be IE. Go look at the 'about' box on IE, although the original Mosaic actually had more lines of CERN code than NCSA code which were never acknowledged.
Looking for an Information Security student project suggestion?
Try http://dotcrimeManifesto.com/
The 1 MB is the default stack size for every POSIX thread. It takes some effort to determine what the smallest valid stack size is, since PTHREAD_STACK_MIN doesn't specify enough space for the start_func stack frame. The parent post merely stated that the default is 1MB and that you have to work to lower that, and he is correct. If the grand-parent poster was just creating threads without specifying a stack size, he would run out of RAM pretty quickly.
I wouldn't want to guess where the 64k figure came from.
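To make "you have to work to lower that" concrete, here is a small Python sketch. In C you would set the size through a pthread attribute; Python exposes the same knob as `threading.stack_size()`. The 256 KiB value and the helper name `run_threads` are arbitrary choices for illustration, not recommendations.

```python
import threading


def run_threads(count, stack_bytes):
    """Start `count` threads with the requested stack size and join them."""
    previous = threading.stack_size(stack_bytes)  # applies to new threads only
    results = []
    lock = threading.Lock()

    def worker(i):
        with lock:
            results.append(i)

    try:
        threads = [threading.Thread(target=worker, args=(i,)) for i in range(count)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    finally:
        threading.stack_size(previous)  # restore the earlier setting
    return len(results)


finished = run_threads(200, 256 * 1024)
print(finished)  # 200
```

Threads created without such a call get the platform default, which is where the address space goes when you create thousands of them.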
Ingo:...Anton tested 1 million concurrent threads on one of his bigger PowerPC boxes, which started up in around 30 seconds. I think he saw a load average of around 200 thousand. [ie. the runqueue was probably a few hundred thousand entries long at times.]
Wow... this is pretty good. The ability to spawn & run 1 million concurrent threads should keep even the most demanding users happy for a few years...
OTOH, I hope this post doesn't become the butt of jokes a few months from now ("and you thought 1 million was a lot! Ha! My Palm 5000XL does more than that!")...
- Portable MP3 Players (Done before)
- Net-shooting guns (1960s James Bond movies)
- Build your own sub woofer (My friend built a 500mm (1'8") X 500mm X 1000mm(3'4") Sub Woofer in 1988 and then put it in his Ford Transit)
- Tiny Linux boxen (Seen 1000s of these and *BSD boxen as well)
But 100,000 threads on Linux, now that's impressive. Too bad it won't make one iota of a difference to most of us who use Linux for just reading
-Jasa -- Linux - The SOURCE will be with you, ALWAYS
If you want to do stupid things with your programs, that's fine by the kernel developers. Just don't expect /them/ to bend over backwards to make /your/ stupid design work as well as you want it to. That's your problem, and no one else's.
himi
My very own DeCSS mirror.
Since I absolutely suck at getting kernels from source to work correctly (I never get everything in there that I need, I guess), the question is: when does all this great stuff reach production? (To then be pre-packaged by RedHat, et al.)
Acts 17:28, "For in Him we live, and move, and have our being."
(about 14 minutes 58 seconds faster than with earlier Linux kernels)
OK, it is a genuine and serious question I have: were these 15 extra minutes responsible for the painfully long start times of apps like Mozilla and OpenOffice? If so, as soon as I upgrade my distro, I will boot it into 2.5.
-><- no
Your egregious use of the word "egregious".
No one ever had to evacuate a city because the solar panels broke!
I can only suppose you don't know what Ur is, maybe because you come from a very different culture...
Anyway, and I'm really not well qualified to answer this, Ur was an ancient city-state from which came a prominent ancestor of the Jewish-Christian-Islamic heritage (Abraham, if I'm not wrong).
This city, IIRC already found, was Sumerian (I'm not sure about this); the Sumerians are the folks who are said to be the inventors of the wheel, among other neat things.
So an Ur browser would be the primeval browser, in other words.
Upon writing a note, one must be sure it will be understood; nonetheless, the "Ur" mention boosted the note level way up. All in all, I think it was great and I'm all for it.
But explanations as these sometimes become necessary.
See subject. A useful 'heads up' post for folks like myself who tend to assume that Linux will follow the general Un*x-family behaviours we're familiar with from the commercially-sold variants.
And yes, I would of course ;) check this assumption if I were to do some significant implementation for the Linux platform.
yeah, had to say it, first time I do :)
My name is Ingo Molnar.
:)
You kill my father - prepare to die.
er... sorry about that, I won't do it again
-nwp
User-level threads cannot take advantage of multiple CPUs. True, they are somewhat faster on a single CPU system due to lower overhead, but that's all they are good for.
___
If you think big enough, you'll never have to do it.
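A toy illustration of the point about user-level threads, in Python. Generators stand in for user-level threads and a made-up round-robin loop stands in for the user-space scheduler; every "thread" reports the same kernel thread id, because the kernel only ever sees one thread and so only one CPU can be used. M:N schemes like the Solaris one multiplex onto several kernel threads, which this sketch does not attempt.

```python
import threading
from collections import deque


def worker(name, steps, seen):
    for i in range(steps):
        seen.append((name, i, threading.get_ident()))
        yield  # cooperative switch point


def run_round_robin(tasks):
    """Hypothetical user-space scheduler: run each task until its next yield."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
            ready.append(task)  # still has work, back of the queue
        except StopIteration:
            pass                # task finished


seen = []
run_round_robin([worker("a", 2, seen), worker("b", 2, seen), worker("c", 1, seen)])
names = [name for name, _, _ in seen]
kernel_threads = {ident for _, _, ident in seen}
print(names, len(kernel_threads))  # ['a', 'b', 'c', 'a', 'b'] 1
```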
ACE is nice for big systems.
But it's also way overkill for small stuff. It's a whole distributed framework, not a wrapper around pthreads.
May we never see th
It's a Windows limit, and it's in the documentation.
C//
The 64K page size is Windows' page size. I can only assume that the poster stating that the Intel hardware page size is 4K is correct. I would suppose this means that a Windows (2K, NT) page of 64K is assembled from 16 hardware pages, then. The Windows page size of 64K is in their documentation. I never paused to think about how this interfaces with hardware pages...
C//
Currently in Linux every thread is assigned a distinct process ID, and as such, a process has as many entries in `top' and `ps' as it has threads. This makes it difficult to monitor processes externally, or even see the other processes' information. Has this issue been addressed? (I realize this is a user-space program issue, not a kernel issue).
I can't seem to find any info on whether Linux core files still produce one core file per thread or just one core file per process (as does Solaris). Has `gdb' been enhanced to handle multithreaded programs (or multithreaded core file) on Linux? If I have a thousand threads - I sure don't want 1000 core files in the event of a crash. Is there a way around this?
Okay, you're wrong. This O(1) scheduler in 2.5.x is the "massive overhauling." (Yes, the patch has been around for a while... but as the article says, it's only recently been merged into 2.5)
Err, Windows NT does use the native 4KB page size on Intel, but is designed to be expandable to systems with up to a 64KB page size. As a result, certain operations (like the reserve mapping that goes on for the thread stack) aligns data in 64KB increments. IIRC, there is also 64KB of virtual slack between memory mapped objects as well.
A deep unwavering belief is a sure sign you're missing something...
Each Linux thread has two things of its own: its own stack, which can be as small as 1 or 2 pages if the code to run is simple enough, and also its own task_struct, which is 1 page including kernel stack for the thread.
This is not true; the kernel stack is two pages in size, i.e. 8KB on i386.
Also, in 2.5 (where these tests were done), the task_struct is no longer allocated on the stack. It is allocated off the slab cache, while the thread_info struct is on the stack. The task_struct slab object is another ~1.7KB per task.
Finally, I do not know what the pthreads default stack size is (user-space? what is that?) but it is certainly larger than one page.
Some guys I know copied a Windows error dialog box and set it as a background image for the desktop, centered.
Imagine the poor victim vainly clicking on the buttons, and getting more and more worried. Said victim actually rebooted the machine to see it reappear, and was not happy when he started to notice the sniggering bunch behind him...
For example pic:
http://www.adobe.com/support/techguides/operatingsystem/windows/winerrors.html ;).
Probably want to replace CCmail with Explorer or something more dear to heart
I also installed a bluescreen STOP screensaver on April Fool's day on a colleague's PC. Heh, he was shocked enough to actually call another colleague over and make the usual worried mumbles.
http://www.sysinternals.com/ntw2k/freeware/bluescreensaver.shtml :).
Since I had admin privs, I was also tempted to have ad.doubleclick.net and similar dns names to resolve to a private webserver which served out custom banner ads.
Wonder how users would take it if they see the "Staff Meeting at 2pm banner ad". Or "Company Slogan here". Or "Big boss is watching you!". Or for search result sensitive ads: "Stop downloading mp3s/movies/porn!"
I could actually justify that as a useful application. It's probably more useful than a doubleclick ad...
But I'd probably need the 100K parallel thread kernel to serve up all those ad banners
Bwahaha!
Link.
SCO and Solaris can both create threads 10,000 times faster than the current Linux kernels, according to Sun's and SCO's marketing departments. My guess is that this was exaggerated, but it is one of the benefits of the big Unixes. Heavily threaded Linux apps have been rumoured to fly on UnixWare where they would run slower on their own native platform! I guess Linux is maturing in this aspect. Does anyone who knows anything about Unix/Linux threading care to comment? I wonder if this will help Linux in server environments.
http://saveie6.com/
I've created over 200,000 processes on a PIII 550 laptop with 256 MB of RAM running Windows XP. Of course, it took a while (swapping).
The process is called nothing.exe. Source Code: int WinMain(...) {Sleep(INFINITE);}
I work at a lab, so I also ran it on a Compaq 8-way with 4-GB of ram. It worked but I don't remember how fast it went.
However, there is a big gnarly limit in Windows that will limit the # of processes: the amount of memory allocated to virtual desktops or something. We researched it -- look it up. This is why you get limited to a few thousand processes or threads if they all do GUI stuff. The bad thing is that basically any function you call in user32 will register the thread as a GUI thread. It's all explained in the book Inside Windows 2000.
Not meaning to troll, I'm just going to share a basic fact: it sucks that Windows threads are so expensive, and tens of thousands of threads (read: thread per client) *DOES* suck on Windows. However, this is not the same thing as saying Windows doesn't scale -- you just have to code it differently. (Check out how many threads SQL Server uses when it's processing thousands of clients.) Stuff like I/O completion ports, AWE memory, and scatter/gather I/O is the way that you have to go.
Just because you *can* create hundreds of thousands of threads, doesn't mean it's a good idea or that your app won't run like shit on a 32-CPU machine!
i've tried to bring 2.5.37 up on 5 different machines, and they all crash anywhere from "OK, booting the kernel..." (hard lock) to getting all the way down to loading SCSI drivers, and getting "Powering off device 0." and then locking up.
"Champagne for my real friends - and real pain for my sham friends!" http://ericblade.postalboard.com/
I think I know the limit you are talking about: it's a handle limit in the GDI subsystem.
As for the 200k processes taking time to launch, it is quite normal, as launching a process is much more heavy than just launching a thread.
The 2k threads I created were created in 700ms. which is very acceptable in my books.
And to confirm, yes, creating so many threads ain't the best idea.
Someone else mentioned thread pools as being a workaround, but only a workaround. I personally think thread pools are actually a way of doing things, and not a workaround for slow thread creation. In fact there are new WinNT APIs for thread pooling.
yada yada... I don't think anyone will actually ever read this post =)
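For the "thread pools are a way of doing things" argument, a minimal sketch in Python using the standard library's `ThreadPoolExecutor`. The `handle` function and the pool size of 8 are placeholders picked for illustration; the point is only that 1000 requests get served by a handful of long-lived threads instead of 1000 short-lived ones.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


def handle(request_id):
    # Stand-in for real per-request work.
    return request_id * 2, threading.get_ident()


with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(handle, range(1000)))

answers = [value for value, _ in results]
workers_used = {ident for _, ident in results}
print(len(answers), len(workers_used) <= 8)  # 1000 True
```

The design choice is that the number of threads follows the machine (CPUs, blocking I/O), not the number of clients.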
In the systems programming world, threading, thread scheduling, and signal processing in threads were always considered Linux's primary weakness, and were the main strength of Solaris, especially for applications running in the telecom space. But with this announcement, I can see Solaris' last tech superiority over Linux crumbling.
I think Sun will need to quicken its re-invention pace.
-j
Yes.
Minimum loadable Memory Section in windows is 64K. I guess a thread creation creates a new stack on a newly created section boundary.
Hidden in the article was a reference to a new locking primitive, futex. I don't see a manpage on line for it, though. Where is this documented?
See here ( http://lwn.net/Articles/9632/ )
and here ( http://lwn.net/Articles/10248/ )
--Linus is being pigheaded about this patch, wanting to "keep the code simple" instead of implementing Ingo's **fast** + Fixed solution.
To quote LWN:
[ So it's fast - though a few extra features have been requested. But this patch has stirred up a bit of a debate. Rather than put in a complicated new PID allocator, it is asked, why not just make the maximum PID be very large? Then, in theory, the quadratic part of get_pid() will never run so the performance problems go away, and the code stays simpler. Linus prefers this approach, as do a number of other developers; he has put a simple patch along these lines into his pre-2.5.37 BitKeeper tree.
Ingo disagrees, pointing out that any reasonable maximum PID size can be exceeded eventually. He would rather fix the problem than try to hide it behind a large process ID space. In the absence of real-world examples that show people being bitten by get_pid()'s behavior in a larger PID space, though, Linus appears unlikely to accept any more complicated fix.
]
.
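A toy model of where the quadratic cost comes from, in Python. This is not the kernel's get_pid(), which is smarter than this; the function name, the `first_pid` default and the numbers are all made up for illustration. Each candidate PID is checked by walking the whole task list, so running into a dense block of used PIDs costs roughly (number of tasks)^2 / 2 comparisons, while landing on a free PID costs a single walk.

```python
def alloc_pid_scan(in_use, last_pid, pid_max, first_pid=300):
    """Toy allocator: try last_pid+1, +2, ... and walk the whole task list
    for every candidate. Returns (pid, comparisons)."""
    comparisons = 0
    candidate = last_pid
    for _ in range(pid_max - first_pid):
        candidate += 1
        if candidate >= pid_max:
            candidate = first_pid      # wrap around, skipping low PIDs
        taken = False
        for pid in in_use:             # linear walk, like a task-list scan
            comparisons += 1
            if pid == candidate:
                taken = True
                break
        if not taken:
            return candidate, comparisons
    raise RuntimeError("no free PID")


dense = list(range(300, 2300))  # 2,000 tasks holding consecutive PIDs

# Counter has wrapped and points just below the dense block:
pid_a, cost_wrapped = alloc_pid_scan(dense, 299, 32768)
# Counter points just past the dense block:
pid_b, cost_fresh = alloc_pid_scan(dense, 2299, 32768)
print(pid_a, cost_wrapped)  # 2300 2003000
print(pid_b, cost_fresh)    # 2300 2000
```

In this toy, a larger `pid_max` only makes the first case rarer, since the counter wraps less often; it does not make it impossible, which is roughly the two positions quoted above.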
== WolfriderV6 == I'm willing to admit that *I just might* be wrong... Are you??
I remember that Linus made a remark that he thought that the O(1) scheduler wouldn't impact Linux much at all, and that its development would not be a biggie for Linux, downplaying the importance of what it can achieve. Go Ingo for keeping at it!
--- Hindsight is 20/20, but walking backwards is not the answer.
- Consider that the Linux scheduler hasn't changed significantly in those THREE years.
- Consider Ingo Molnar's post on the subject.
- Consider providing some evidence for your position, rather than just saying that I'm wrong.
- Bulleted lists are pretty. I can do that too.
If you guys have some evidence that the paper I referenced is no longer valid, please post it (or references to it). Don't just tell me "oh, that paper's ancient; things are different now." 'Cuz up until fairly recently, they weren't.
P.S. And if anyone wants to compare Windows XP's scheduling performance with NT's, be my guest... I don't think you'll see much of a change. Remember that XP is just NT 5.1, and I haven't heard about any significant performance improvements in NT's scheduler. (The only vaguely scheduler-related change I remember is the addition of "fibers" in NT 4.0 SPsomething (3?))
The threads issue needs to be solved, and soon. We are using Java with Linux and get regular hangs. Conversations with IBM's Java support indicate that this is a problem with the Linux kernel, Java thread design, and underlying thread libraries on Linux. And no, we are not running thousands of threads, just two Java programs on a 2 CPU SMP machine.
We eagerly await a fix.
So,in other words when it comes to comparing threads, size does matter.
That's the sound of M:N threading whizzing past your head.
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
Hardware page size is 4KB, as was noted elsewhere. The key element that I haven't seen mentioned is that Windows' virtual memory system has several ways to 'allocate' memory. There's reserving pages, and there's committing pages. In the case where you tell the OS you want memory, it reserves pages. That is to say, it does not actually take memory from the free physical memory, but instead creates a contiguous address space large enough for your request, but allocates no hardware RAM at those addresses.
When you commit a page, for instance by accessing (reading or writing) a page that is not allocated, it trips a hardware fault if the VM hasn't mapped a page to the address; the fault handler then searches for a free page and links them together.
The end result is, even if Windows does try to create 64k worth of memory segment space for a process, unless it is actually reading or writing to a byte in each 4k chunk, its internal VM will not allocate physical memory for the whole 64k. Furthermore, there's no such advantage or realistic way for the operating system to align anything in memory physically, except in AGP ram. The VM system handles physical pages of memory exclusively, but does not manage AGP-allocated memory (IIRC). In other words, though the OS can align the address space to anything it likes, the OS layer cannot request any physical allocation mapping or alignment. So that comment about aligning memory for processes is quite unlikely.
Now, the XBox (which runs a variant of the Win2k kernel) has a bit more control over VM, but it also does not support demand paging, so it cannot swap to the hard disk and give you RAM+HD effective memory. Shame, that. But, as a result, you have an API that allows hardware level allocation control. Still, the OS doesn't take advantage of it, AFAIK. It's for developers.
Any connection between your reality and mine is purely coincidental.
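A rough way to see the "address space first, physical pages later" behaviour from user space, sketched in Python. Python's `mmap` module has no notion of the Win32 reserve/commit distinction, so this is only an analogy: on a Linux-like system an anonymous mapping is demand-paged, and pages that are never touched need no physical memory. The region size, the helper name and the number of touched pages are arbitrary.

```python
import mmap

PAGE = mmap.PAGESIZE
REGION = 64 * 1024 * 1024  # 64 MiB of address space


def touch_pages(region_bytes, pages_to_touch):
    """Map an anonymous region, write to only a few pages, read values back."""
    buf = mmap.mmap(-1, region_bytes)  # anonymous mapping, zero-filled
    try:
        for n in range(pages_to_touch):
            buf[n * PAGE] = 0xAB       # first write to a page faults it in
        touched = [buf[n * PAGE] for n in range(pages_to_touch)]
        untouched = buf[region_bytes - 1]  # never written, reads back as zero
        return touched, untouched
    finally:
        buf.close()


touched, untouched = touch_pages(REGION, 4)
print(touched, untouched)  # [171, 171, 171, 171] 0
```

Watching the process's resident size while varying `pages_to_touch` is the easy way to check how much of the 64 MiB is actually backed by RAM on a given system.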
In other words, I've read tons of articles about all the fanciness being incorporated into the .5 kernel.
The more I read about this, the more I feel going with the .4 kernel (or linux at all, rather) is a mistake for a serious production server.
When is it expected to become stable? How long do I have to wait?
The end result is, even if Windows does try to create 64k worth of memory segment space for a process, unless it is actually reading or writing to a byte in each 4k chunk, its internal VM will not allocate physical memory for the whole 64k.
Yes. Quite true. I had a problem a while back on Windows which took me a bit of reading through the documentation (and verifying with some low level sys calls) to determine that what was happening is that I was running out of "reserve memory". Which is to say that, while I had plenty of physical memory left, all the address space had been used up. You can do this very easily by creating thousands of threads on your computer. To get a large number of these threads, you'll have to push the default stack size to its minimum, 64K. I was a bit dissatisfied with this minimum, but I suppose I'll live with it now (or port to linux) if I have to, or upgrade to a 64 bit OS if it becomes a practical limit in the future.
C//
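A rough sketch of the arithmetic behind running out of address space before running out of RAM. The 64K minimum is the figure from the comment above; the 2 GiB of user address space and the 1 MiB default stack reservation are assumptions added for illustration, and `max_threads` is a hypothetical helper name.

```python
# Back-of-the-envelope sketch: how many thread stacks fit in a 32-bit address space.
# Assumptions (not from the post): 2 GiB of user address space and a 1 MiB
# default stack reservation. The 64 KiB minimum is the figure quoted above.

KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def max_threads(address_space: int, stack_reserve: int) -> int:
    """Upper bound on threads if every stack reserves `stack_reserve` bytes."""
    return address_space // stack_reserve


if __name__ == "__main__":
    user_space = 2 * GIB
    print(max_threads(user_space, 1 * MIB))   # default-sized reservations
    print(max_threads(user_space, 64 * KIB))  # minimum-sized reservations
```

The real ceiling would be lower than either number, since code, heap, and shared libraries also occupy address space. The point is only that the reservation size, not physical memory, sets the bound.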
Err, Windows NT does use the native 4KB page size on Intel, but is designed to be expandable to systems with up to a 64KB page size. As a result, certain operations (like the reserve mapping that goes on for the thread stack) align data in 64KB increments.
That's boneheaded. Linux supports page sizes up to at least 4MB, but it doesn't align everything on 4MB boundaries on the off chance that you might be using 4MB pages. It uses the appropriate alignments for the page sizes actually in use.
An OS that has dropped all support for non-Intel hardware citing a portability concern which doesn't exist in portable OSes? As they say in Snatch, "It's spurious, mate. Not genuine."
Sumner
rage, rage against the dying of the light
One of the nice things about Linux. You don't have to live with any of these 32-bit limitations if your application is big enough to justify 64-bitness. While Microsoft had NT running on Alpha, I understand it was essentially still a 32-bit OS - it was only truly ported to 64 bits when Itanium support was added. Linux, on the other hand, has had true 64-bit implementations running since '94 or '95, so you can be fairly confident that the niggling little 32-bit-isms have mostly been caught by now.
"How can you claim that you are anti-crack, while still writing a window manager?" — Metacity README
My thread-creation benchmark can create 100,000 threads in 10 seconds on my 800MHz Linux machine at work. No idea what kernel it's running, but I'm sure it's not recent. Furthermore, my 450MHz home machine running FreeBSD 4.5 runs the same benchmark in only five seconds. WTF?
> Finally, I do not know what the
> pthreads default stack size is
> (user-space? what is that?) but it is
> certainly larger than one page.
Why does it need to be larger than one page? The kernel will trap page faults caused by stack overflow, and will allocate additional stack anyway.
Yes, I know you are right. Amongst other things, I won't be stuck with 64K per thread stack in Linux, and as you say, I could use 64-bit Alpha Linux. I'm looking forward to Hammer, actually.
C//
Why does it need to be larger than one page? The kernel will trap page faults caused by stack overflow, and will allocate additional stack anyway.
It does not need to be bigger than one page, it just is. You are right, the stack is expanded via implicit mmap as it grows... but for performance reasons the default stack is usually measured in megabytes, not pages.
Anything but the simplest of applications would use a page rather quickly. User-space applications are programmed to assume they have any size stack they want. Local variables are huge.
In short, I was just commenting on the default. It can surely be lowered...
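For what lowering the default might look like in practice, here is a minimal sketch. It uses Python's `threading.stack_size()` as a stand-in for the pthreads attribute call discussed in this thread; that substitution is mine, not something from the posts. The 256 KiB figure and the `run_with_small_stacks` name are hypothetical choices for illustration.

```python
import threading

# Hypothetical choice for illustration. Python requires 0 (platform default)
# or a value of at least 32 KiB here.
SMALL_STACK = 256 * 1024


def run_with_small_stacks(n_threads: int, stack_bytes: int = SMALL_STACK) -> int:
    """Start n_threads threads with a reduced stack size; return how many ran."""
    counter = 0
    lock = threading.Lock()

    def worker() -> None:
        nonlocal counter
        with lock:
            counter += 1

    # The new size applies only to threads created after this call.
    previous = threading.stack_size(stack_bytes)
    try:
        threads = [threading.Thread(target=worker) for _ in range(n_threads)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    finally:
        threading.stack_size(previous)  # restore the earlier setting
    return counter


if __name__ == "__main__":
    print(run_with_small_stacks(100))
```

A worker with deep recursion or large locals could overflow a stack this small, which is the reliability concern with picking the size by hand.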
I don't understand what the issue is here.
I was able to run 1,600,000 simultaneous connections with a modified FreeBSD kernel, in June of 2001. Couldn't get much work done, but at about 300 baud per connection, after dividing up a gigabit ethernet link... you shouldn't expect to do much work.
Without modifications, after a patch to the credential reference counting (since committed to FreeBSD 4.5), as long as a stock kernel is tuned correctly, it can still *easily* handle 100,000 simultaneous connections (16K of window space for each connection = 1.6G of mbufs).
-- Terry
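The window-space figure above can be checked with a couple of lines. This is only a sketch of the arithmetic, assuming "16K" means 16 KiB per connection; `window_memory_bytes` is a hypothetical helper name.

```python
def window_memory_bytes(connections: int, window_bytes: int) -> int:
    """Total buffer space if every connection holds a full window."""
    return connections * window_bytes


if __name__ == "__main__":
    # Assumption: "16K" is read as 16 KiB (16 * 1024 bytes).
    total = window_memory_bytes(100_000, 16 * 1024)
    print(f"{total / 1e9:.2f} GB")  # prints "1.64 GB"
```

That lands close to the 1.6G quoted, and it is a worst case: it assumes every connection has its whole window buffered at once.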
So? Use non-blocking I/O instead. Problem solved.
-- Terry
No, you will see a PID per thread, because that is how the scheduler knows to schedule things. The getpid() C library call from within the program. When they said it is a 1-to-1 mapping, that means there is a process per thread. Just look for all those processes with the same name, and see if they have exactly the same memory usage. If they do, it means they are sharing the same memory and are threads. No matter how you implement threads, there has to be more than one process; otherwise, when the program blocks for I/O, all threads would be blocked.
One day people will learn the folly of Winbloze, Linux Rules!
Aren't we all? (:
"How can you claim that you are anti-crack, while still writing a window manager?" — Metacity README
Hey, dork! We've never seen anyone use a redirect link to the goatse.cx site before. Wow, you must be, like, you know, like, rilly brite. Gosh, me wants be smirt lyke ewe.
Pain is merely failure leaving the body
> It does not need to be bigger than one
> page, it just is.
At least that isn't what is suggested by the documentation of LinuxThreads (in Debian testing). In E.5 it says the following, implying that the default stack size really is just 1 page.
E.5: Does LinuxThreads implement pthread_attr_setstacksize() and pthread_attr_setstackaddr()?
These optional functions are provided in recent versions of LinuxThreads (0.8 and up). Earlier releases did not provide these optional components of the POSIX standard.
Even if pthread_attr_setstacksize() and pthread_attr_setstackaddr() are now provided, we still recommend that you do not use them unless you really have strong reasons for doing so. The default stack allocation strategy for LinuxThreads is nearly optimal: stacks start small (4k) and automatically grow on demand to a fairly large limit (2M). Moreover, there is no portable way to estimate the stack requirements of a thread, so setting the stack size yourself makes your program less reliable and non-portable.
...run Ada 83 programs.
But while their threads will be slow, they will be able to handle the text the users are entering; vastly more useful than the most optimized eight-bit character horror you would turn out.
Trolling is supposed to be:
1. Fast! Writing random mild insults almost a week after the original posting isn't as great as making a real-time flamewar immediately after posting.
2. Accessible to a potential reader. Referring to an obscure recurring theme of my rants, made months away from this article (byte-value transparency of protocols vs. Unicode references in RFCs), would require a lot of googling from a potential troll spectator before he could appreciate your comment.
Contrary to the popular belief, there indeed is no God.