The 800 MHz Via CPU is roughly equivalent to a 400 MHz Celeron.
Which is still _way_ overkill for someone who just does a little word processing and web surfing. Somewhere between 100 and 200 MHz, computers got to the "fast enough" point for those tasks. Lots of people need more power, even on the desktop--gamers, some spreadsheets, people who want to watch videos, etc.--but for a surprisingly large number of people the upgrade cycle ended 4 years ago.
Now, do you really think that any ABX test result applies for all people, all systems, all cables, in the past, present and future?
No, but if there were a difference in sound quality between decent, relatively inexpensive shielded copper wire and $100/ft Monster cable, there should be at least one double-blind ABX test showing someone who is able to tell the difference. A number of people who claim that there's a difference have volunteered for ABX tests; I have yet to hear of one who could distinguish the cables with any degree of statistical significance.
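For a feel of what "statistical significance" means in an ABX test, here's a minimal Python sketch. It assumes every trial is an independent coin flip for a listener who can't actually hear a difference; the function name and the 12-of-16 session are made up for illustration and are not results from any real test.

```python
from math import comb

def abx_p_value(correct, trials):
    """Chance of getting at least `correct` right out of `trials` by pure guessing."""
    hits = sum(comb(trials, k) for k in range(correct, trials + 1))
    return hits / 2 ** trials

# Hypothetical session: 12 correct identifications out of 16 trials.
p = abx_p_value(12, 16)
print(f"p = {p:.4f}")
```

The smaller that number, the harder it is to write the score off as luck; a listener who really hears a difference should be able to push it down simply by doing more trials.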
You laugh at paying 100-500 on speaker cable... speaker cable is cheap at that rate. run it over 75 ft and you will hear a difference
No you won't. Plenty of double-blind listening tests have shown that, and plenty of premier recording studios use cable that's an order of magnitude or two cheaper than that with excellent results. I'm talking Telarc-level recordings here, too.
A true audiophile wouldn't pay $100 a foot for speaker cable. Plenty of double-blind ABX tests show that a lot of much cheaper copper wire performs equally well.
An audio wanker who thinks anything by Bose is high-end and circles his CDs with green magic marker to make the laser track better might pay $100 a foot for such cable, but an audiophile is interested in things that sound better and not things that empty his pocket.
Uncompressed audio does sound better than 128 kbit mp3 on good gear. I use mp3 in my car, but uncompressed audio on my home system.
Shorten (.shn) is not a free program (libre--it is gratis). It also doesn't compress as well as some of the other lossless codecs and as far as I know there aren't any hardware devices that support it.
I use FLAC to compress my music, which is free and lossless. It outperforms shorten on average (smaller compressed files), and is also supported by some hardware playback devices (Rio, Phatbox, some Kenwood stuff) unlike Shorten.
I play back through a Hoontech card with digital output and use an offboard MSB Link DAC III (the computer is acoustically isolated from the listening room) which feeds into a Creek 5350 integrated amp driving Vandersteen 2ce Signature loudspeakers.
I also use lossy compression for my car mp3 player--the stereo there isn't audiophile quality anyway.
Seriously, the reason for this is that Motorola has displayed amazing ineptitude in PowerPC development, so duals are the only way to stay competitive with 3GHz x86 processors.
I'd much rather have an expandable box even if the CPU were a single P3 500 MHz. Nothing I do comes close to pegging that, and I do plenty of video playback, mp3s in the background, etc. Sure, for video editing or high-end gaming a faster CPU may be nice, but I'd rather save the money on that and put it into more of Apple's cool industrial design, better screen, more hardware, and more expandability in general.
Mac would be a bit difficult, because it doesn't have a system() function
Uh, ANSI C requires a system() call. There are plenty of ANSI C compilers for the Mac.
Sumner
Re:Seems like a silly move... on Yahoo Moving to PHP · Score: 4, Insightful
Standard Java stereotype. Java was slow a long time ago, not today. That gross assumption alone should get you modded down.
Standard Java propaganda. The Java language is plenty fast (relative to the other solutions discussed), but most of the I/O libraries are still hideously slow. Ignoring that completely should get you modded down.
Even a fairly slow computational language like Python drops Java out of the running for typical high-volume web site usage, simply because of I/O problems. Java is quite suitable in low-volume settings with stiff transactional requirements or heavy computational requirements--any setting where high I/O costs are amortized by several-order-of-magnitude higher page generation costs. It's a bad choice for a very high-volume site which basically wants to paste several database sources together into a template and shove it out the pipe; Yahoo! falls pretty squarely into that camp for most of its pages.
They also have many components written in various domain-appropriate languages or that they don't want to rewrite for whatever reason; JNI is still pretty heavyweight, and if you have a lot of language interop requirements Java isn't a great choice (though if you're willing to sacrifice some JVM portability this can often be worked around, especially if the other benefits of Java outweigh the cost of implementation).
On top of that, using EJB/J2EE will kill performance even more, which means that actually getting the feature benefits of Java requires handing away even more performance.
All that's without even addressing the "requires tons of threads" problem; multiplexed I/O is pretty new to Java, and there's no good multiprocess API. Both of those are major problems, though hopefully multiplexed I/O will mature quickly. But until there's a good multiprocess API, Java's going to be unsuitable for a number of applications (and sticking to a platform-independent mentality instead of a platform-agnostic mentality makes implementing an efficient multiprocess API very difficult indeed).
Worst of all are the memory issues, but those are well-known enough not to be worth rehashing.
Note that while the list of drawbacks is only addressed briefly, increased schedule() overhead and increased system call overhead are potentially large drawbacks.
Also, after further investigation the Anzinger solution is _not_ in 2.5.x yet; Linus has looked at the patch, asked for clarification, and Anzinger recently replied with an updated patch. Search linux-kernel archives for "high-res-timer" or "POSIX timer" patches for more info.
There's a better link at LWN explaining the approach and drawbacks. It links to the high-resolution timers project (Anzinger's), which I believe is going into 2.5.
FWIW: BSD has the same select setup as you described.
Yeah, pretty much every Unix has interrupt-driven returns for the non-timeout case; anything else would be pretty bogus. Some systems (e.g. Linux 2.5.x) do interrupt mitigation under high load, but that's more of an "above and beyond" thing. The timeout case is handled differently on several Unices.
So here's my thought: how expensive is it to reprogram the timer chip? Would it be possible to adjust it dynamically to create perfect granularity in sleep/select?
There is a tickless Linux implementation.
I can't find the home page at the moment, but see e.g. http://www.uwsg.iu.edu/hypermail/linux/kernel/0104.1/0137.html
There are a lot of other ways of dealing with this, and tickless has some negative attributes I don't fully understand (among them is that it's not portable to older hardware, and there is some overhead to programming timer interrupts). I think the nanosecond kernel patches (which are starting to go into 2.5) address the select/sleep granularity issue in a different way but I'm really fuzzy on the details.
From the look of the article, under Linux, select actually does some sort of polling at or related to HZ. It may be on some sort of almost-run queue: a selecting process gets allocated timeslices; on its slice, it polls and either returns to userland or goes back onto the almost-run queue. I don't have time to verify that-- I don't know my way around the Linux kernel-- but it seems to be reasonable, based on the article. Can I get a Linux developer to confirm/deny my guess?
Deny. It's actually the idle timeout that's affected by HZ. select() itself doesn't poll at all, and e.g. a select() call with an infinite timeout will be completely unaffected by HZ (select will wake up when the network gets an interrupt resulting in readable data/writeable buffer space).
Example of the timeout effect: a game could have a select() loop that waits on user input, but also has a timeout argument so that it can go ahead and update the screen, do enemy AI, etc. The kernel, in the absence of interrupts, schedules on HZ boundaries. Suppose that you as a programmer put a 1/60 second timeout argument in the select loop (intending to update the screen with a 60 Hz refresh and figure out where everything's moving). If you call select() right after a HZ boundary, you could find yourself waiting until 1/50 second passes even on an idle machine with HZ=100; after 1/100 sec, your timeout hasn't expired yet. Next chance to schedule is at 2/100 (1/50) sec.
With HZ=1000, you'll schedule no more than 1/1000 sec after the 1/60 sec boundary (on an idle machine).
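The arithmetic above can be checked with a few lines of Python. This is only a model of the simplified case described here (idle machine, call made right after a tick, wakeups only on tick boundaries), not what any particular kernel does; first_wakeup is a made-up helper name.

```python
from fractions import Fraction
from math import ceil

def first_wakeup(timeout, hz):
    """Earliest tick at or after `timeout`, for a call made right after a tick.

    Assumes an idle machine that only reschedules on tick boundaries.
    """
    ticks = ceil(timeout * hz)          # whole ticks needed to cover the timeout
    return Fraction(ticks, hz)

timeout = Fraction(1, 60)
for hz in (100, 1000):
    woke = first_wakeup(timeout, hz)
    late_ms = float(woke - timeout) * 1000
    print(f"HZ={hz}: wake after {float(woke):.4f} s, {late_ms:.2f} ms late")
```

With HZ=100 the wakeup lands on the 2/100 sec tick, about 3.33 ms past the requested 1/60 sec; with HZ=1000 it lands on tick 17, about 0.33 ms late.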
This example is really simplified; a real-life app would adjust for scheduling creep by keeping track of wall-time. But the same concept, with more complicated apps, can cause faster HZ ticks to give you better CPU utilization (especially in e.g. video editing apps and such) because you get around to using the CPU closer to when you want it.
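Here's a rough sketch of what "keeping track of wall-time" can look like, in Python. Everything in it (frame_loop, on_input, the injectable clock and wait arguments) is invented for illustration; the one real idea is computing each timeout from a fixed schedule of absolute deadlines instead of always asking for "one period from now". Calling select() with no descriptors as a pure sleep works on Unix but not on Windows.

```python
import select
import time

def frame_loop(frames, period, fds=(), on_input=None,
               clock=time.monotonic, wait=select.select):
    """Wake once per `period`, aiming at absolute deadlines.

    Returns the clock reading at each frame so the drift can be inspected.
    """
    start = clock()
    woke_at = []
    for n in range(1, frames + 1):
        deadline = start + n * period      # fixed schedule, not "now + period"
        while True:
            remaining = deadline - clock()
            if remaining <= 0:
                break
            ready, _, _ = wait(list(fds), [], [], remaining)
            for fd in ready:
                if on_input is not None:
                    on_input(fd)           # caller must drain fd, or we spin
        woke_at.append(clock())            # update screen, enemy AI, etc. here
    return woke_at

if __name__ == "__main__":
    times = frame_loop(frames=5, period=1 / 60)
    print([round(t - times[0], 4) for t in times])
```

Each frame can still be late by up to a tick, but the lateness doesn't pile up from one frame to the next.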
The preempt kernel is an even better example of where decreasing latency can increase throughput, sometimes significantly. There you can really get around to dealing with I/O quickly, keeping CPU saturated (and saturated with cache-warm data) and benefiting things like heavily loaded web servers just as much as sound editing stations.
If you aren't a real engineer-- and by real engineer, I mean someone who learns new technologies in their spare time, someone who wouldn't be caught dead without a computer at home
That's BS. A lot of good programmers I know don't have computers at home, intentionally. It's a matter of getting balance in your life. Mine serves only as a centralized Vorbis repository and music visualization center. I write TCP servers and search indexing programs for a living and sit in front of a computer more than 8 hours a day. I'm more than happy to _not_ do the same at home. Once in a great while I get an off week at work where I'm stuck in pure maintenance hell or other non-learning positions and I'll dust off the keyboard at home and write some code, but for the most part I'm pretty happy to leave that to work and limit my home involvement to the odd tech book and the whiteboard.
Overrated: How abusive moderators avoid getting caught in metamoderation.
No, it's how moderators work around the fact that there's no "Incorrect" or "Misinformative" category.
man - occasionally called something else (QNX calls it use). The ability to derive meaning from man files is extremely important.
Repeat that about 1000 times.
sh/csh - the vast majority of *nix default to some variant of sh, of which ksh is the most common. If you know sh you can use any of the variants (ksh, bash, etc) and learn the particulars as you go. There are rumors of *nices that default to csh, but I've never encountered one. Regardless, you should be familiar with it just in case.
The only thing you need to be able to do in csh is type "/bin/sh". Knowing sh is key, at least know how to do "for i in...do...done", "if..[..fi", and "case". Know how to use "["/"test".
Also get and know netcat.
find - They say locate is easier, but I've never seen it on a non-Linux system.
locate is far more limited, though much faster (it uses an index instead of an fs search). Know find, and know that it's different on non-GNU systems. "find" alone works with GNU find, but you'll want at least "find . -print" on other systems.
Probably make a /usr/gnu/bin on every new machine and put it first in your path (with GNU fileutils, textutils, find, etc). df, find, echo, and a ton of others will work differently between Unix variants and you'll make your life more consistent if you can use the same set of tools on all platforms.
And _do_ learn to read and use man pages. You'll eventually get caught on a machine without the GNU tools available and need to figure out how the local "mount" works to get back to them.
And _don't_ alias rm to "rm -i" (or similar with cp, mv, etc). You'll wind up destroying files on a machine without the alias.
Also, learn ed. You may need it someday. On my system (FreeBSD 4.6), ed is installed as /bin/ed, and vi is installed as /usr/bin/vi
Definitely.
Use vi frequently for a while; that vi skill will pay off if you get dropped into ed. Learn enough ed to fix things up and get /usr/bin mounted. And keep a statically linked copy of your favorite small(ish) editor and of sash (and scp/ftp/wget/netcat if you can swing it) on the root partition--libc.so _will_ eventually die on one of your machines, and having sash and vi available for recovery will make you much, much happier.
Plus, it'll make the local MCSE's head explode if he sees you using it
I think you are writing as a person who has never had to use either.
I have written a dynamic content server that over the past 2 years has served over 6 billion requests, with 5 9's of uptime. I've written several realtime instrument control applications. I've written a distributed text mining application that does index-assisted regex searches of 1/2 terabyte of data in [...]
Threads can really be life savers when used correctly. Sure you have to implement locking but that's what pthread_mutex is for.
On low-mem devices making full copies of the process to spawn copies is just insane.
1) Look up COW and memory sharing. 2) I never said "use only processes". A combination of processes and event loops is the way to go 99% of the time. There are some corner cases where threads are useful, but they tend to be abused by people who think "threads are good" without considering the alternatives nor the ramifications of that choice.
And on windows the Thread implementation is *intentional* not accidental. The idea is that people using threads will take advantage of the speed increase.
It's not a speed increase. Thread switching and thread creation on Windows are slower than process creation and process switching on Linux. On a par, but slower. Process creation on Windows is laughably slow, though, and process switching is substantially slower than thread switching.
It's not that Windows figured out how to make their threads go fast, it's that their processes were dog-slow and they had to create an entirely separate execution primitive to get any sort of reasonable concurrency. Linux did things the right way by making them both fast, and now allows you to choose between the two for _design_ reasons (do I want to share memory?) rather than artificial implementation reasons.
You'll find a lot of knowledgeable people (Larry McVoy, former SGI kernel architect) who echo the same belief: use threads sparingly. Use as many threads as you have CPUs, and use processes instead if that makes more sense. Use more threads than that only if you're intimately familiar with the alternatives and know why they don't work, because while a state machine with non-blocking I/O may seem hard at first glance it'll almost certainly turn out to be easier to implement correctly, easier to debug, faster, and easier to maintain.
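To make "a state machine with non-blocking I/O" a bit more concrete, here's a small single-threaded sketch in Python using the standard selectors module. The LineEcho class, the run helper, and the upper-casing line protocol are all made up for the example; the point is only that each connection's state sits in an ordinary object that one event loop drives, so there is nothing to lock.

```python
import selectors
import socket

class LineEcho:
    """One connection's state: bytes read so far and bytes still to send."""

    def __init__(self, sel, conn):
        self.sel, self.conn = sel, conn
        self.inbuf, self.outbuf = bytearray(), bytearray()
        conn.setblocking(False)
        sel.register(conn, selectors.EVENT_READ, self)

    def handle(self, mask):
        if mask & selectors.EVENT_READ:
            try:
                chunk = self.conn.recv(4096)
            except BlockingIOError:
                return                          # spurious wakeup; try again later
            if not chunk:                       # peer closed the connection
                self.sel.unregister(self.conn)
                self.conn.close()
                return
            self.inbuf += chunk
            while b"\n" in self.inbuf:          # a complete line is available
                line, _, rest = bytes(self.inbuf).partition(b"\n")
                self.inbuf = bytearray(rest)
                self.outbuf += line.upper() + b"\n"
        if self.outbuf:
            try:
                sent = self.conn.send(self.outbuf)
                del self.outbuf[:sent]
            except BlockingIOError:
                pass                            # socket buffer full; retry on write event
        # Ask for write readiness only while there is something left to send.
        want = selectors.EVENT_READ
        if self.outbuf:
            want |= selectors.EVENT_WRITE
        self.sel.modify(self.conn, want, self)

def run(sel, timeout=0.1):
    """Dispatch events until nothing is registered or nothing happens for `timeout`."""
    while sel.get_map():
        events = sel.select(timeout)
        if not events:
            return
        for key, mask in events:
            key.data.handle(mask)

if __name__ == "__main__":
    sel = selectors.DefaultSelector()
    client, server_side = socket.socketpair()
    LineEcho(sel, server_side)
    client.sendall(b"hello\nwor")               # second line arrives in two pieces
    client.sendall(b"ld\n")
    run(sel)
    print(client.recv(4096))
```

Adding a second or a thousandth connection means registering another LineEcho with the same selector; the loop itself doesn't change.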
And yes, Linux's process context switches are on a par (possibly faster - can't be bothered to look up benchmarks) with NT's thread context switches.
Last time I benchmarked, which was a long time ago (NT 3.51 days), Linux process switch times were 5x faster than NT thread-switch times on the same hardware. Linux thread-switch times were on a par with process-switch times, NT thread-switch times were about 20x faster than NT process-switch times.
I'd expect all those numbers to have changed, though.
Err, Windows NT does use the native 4KB page size on Intel, but is designed to be expandable to systems with up to a 64KB page size. As a result, certain operations (like the reserve mapping that goes on for the thread stack) align data in 64KB increments
That's boneheaded. Linux supports page sizes up to at least 4MB, but it doesn't align everything on 4MB boundaries on the off chance that you might be using 4MB pages. It uses the appropriate alignments for the page sizes actually in use.
An OS that has dropped all support for non-Intel hardware citing a portability concern which doesn't exist in portable OSes? As they say in Snatch, "It's spurious, mate. Not genuine."
What I want to know is why use boxen rather than boxes? "boxes" refers to the physical objects (ie the cases and contents thereof)
"boxen" refers to the notional servers.
My Linux boxen could be retasked to be FreeBSD boxen, but they'd still be the same boxes.
AFAIK, "boxen" was derived from "VAXen". And it was never "VAXes"; that would brand you as computer-illiterate as quickly as saying "What's the http for that?" or using "PC" to mean "Windows box".
Threads are a way of saying "screw protected memory". They should be used only when you don't want memory protection within your application. Almost always, using threads is the wrong choice; multiple processes and/or a state machine with non-blocking I/O (depending on the problem) will accomplish the same ends as efficiently* and are much easier to implement.**
*Remember that processes (COEs which don't share memory) are nearly as fast as threads in Linux, and faster in some cases. On other OSes (Irix, Solaris, Windows), processes are inefficient and threads are implemented more efficiently. That's a horrible hack to make up for ridiculously heavyweight processes; it's true that a small number of things can be optimized in a thread implementation (setting up VM mappings), but the actual speed implications of that are negligible in real-life programs. I'd be extremely surprised if you could find even one program exhibiting a measurable speed difference in Linux attributable to the scheduling, creation, and destruction properties of threads.
**Threaded solutions often seem straightforward. The devil is in the details, though; locking, synchronization, and debugging issues tend to bite hard, and in the end I've never dealt with a problem where threading was a win over multiprocs and/or a state machine. The advantage of multiprocesses is not only in keeping memory protection; it also forces you to be explicit about what's shared and how that is communicated (and greatly simplifies debugging). Resulting designs tend to be much clearer and easier to make correct and maintain.
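A tiny Python sketch of the "explicit about what's shared" point, assuming a Unix system with fork(). The helper names are invented for the example. The child gets its own copy-on-write view of memory, so the only way its result reaches the parent is over the pipe that was set up on purpose.

```python
import os

def run_in_child(work):
    """Fork, run work() in the child, and return what it writes back over a pipe."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:                         # child: private (copy-on-write) memory
        os.close(read_fd)
        status = 1
        try:
            os.write(write_fd, work())   # fine for small results; big ones need a loop
            status = 0
        finally:
            os._exit(status)             # never fall back into the parent's code
    os.close(write_fd)
    chunks = []
    while chunk := os.read(read_fd, 4096):
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return b"".join(chunks)

counter = {"hits": 0}

def bump():
    counter["hits"] += 1                 # changes only the child's copy
    return str(counter["hits"]).encode()

if __name__ == "__main__":
    print(run_in_child(bump), counter)   # parent's counter is untouched
```

With threads the increment would land in the one shared dict and you'd be reaching for a mutex; here a stray write in the worker simply can't corrupt the parent.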
Of course, given recent history, the latest scam involving Florida is buying not swampland but elections. And it only costs you five justices' robes...
Do it if it raises your gpa. GPA is not everything, but it is definitely BIG.
Only if you're applying to grad school, and even then it's far less important than any research you've done.
In some fields, when you're job hunting straight out of college employers might ask your GPA. But especially if you're looking for a programming job, it'll be far more important to show your skills and prior work/research (math _skills_ will count big) than to list your GPA. That fun open-source project you wrote counts for more than an extra 0.5 on the GPA.
I've never even been asked for my GPA when job hunting; in fact, I finished at CMU in '97 and can't remember what it was (in contrast to the high school GPA and SATs which are burned into my brain as a result of the college application process).
Get the 2nd major if you want to take the classes, but if you have another field of interest go for it--you'll be learning in-field skills for the rest of your career, you won't often have such easy access to that art history or theoretical physics class. A math degree, like a 4.0, might help you get the first job but a photography class will give you a skill you'll use off and on for the rest of your life (and it might be _more_ valuable than that math degree depending on the employer and the job--it's amazing how often something seemingly worthless coincides with a really cool job).
I have used EiC for debugging some projects, but as I said I find the combination of Boehm-Weiser and valgrind to be the most effective for real-world projects. EiC is quite nice but as you note it can take a lot of effort to get it working right.
Managers tend to think that once a project is out the door or 'live,' it's done and over with, and assign developers new projects.
That is a problem. You need to get managers to understand that once a project launches, there's going to be a stabilization period that's just as intense as the development period. It can be short or long depending on the project, but it needs to happen.
And you need to get programmers to believe that they _can_, in fact, get the code to a state where it just runs. That means having watchdogs to monitor it rather than people (sending email when it dies, and avoiding false positives). That means taking the 2-3 (or 20) hours to fix things that require "just 5 minutes" of attention every day. It also means isolating the system into as many independent parts as possible so that it's manageable and easy to work with.
As an example, I worked on developing a massive dynamic content system for a past job. It required a GUI for content authors to do input with, a way to schedule and deploy changes, a server to select content based on user parameters, a way to track content to see what was served where, inventory analysis of what web space would be available, etc. It took over a year to develop, had more than 15 separate executables built from 75,000 lines of code, and was hell when it launched in terms of making sure it ran happily with all the other systems in place. One daemon needed to be restarted occasionally; sometimes unforeseen log data screwed up reports; etc. We could have just gone ahead and restarted that daemon when it needed it, fixed the logs by hand, and so forth--instead we put the up-front time into fixing the root causes of each of the tiny maintenance chores as we identified them.
Within 3 months it was pretty much hands off, and would send email alerts when something went wrong that the watchdogs couldn't handle. Moreover, because it was 15 separate components, feature enhancements were generally easy: change one small program instead of a huge monolithic one. We went through several iterations of feature requests over the following year, each one followed by a stabilization period, and eventually got it to the point where it did everything we needed done and did it without needing handholding (and serves 6 billion requests per year).
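For what it's worth, the watchdog idea fits in a few lines. This Python sketch is an illustration under my own assumptions, not the system described above: watchdog, max_restarts, backoff and the alert callback are all invented names, and a real alert would send mail instead of printing.

```python
import subprocess
import sys
import time

def watchdog(cmd, max_restarts=3, backoff=0.0, alert=print):
    """Keep `cmd` running; restart it on failure, alert only when giving up.

    Returns the number of restarts that were needed.
    """
    restarts = 0
    while True:
        code = subprocess.call(cmd)
        if code == 0:
            return restarts               # clean exit: nothing to do
        if restarts >= max_restarts:
            alert(f"{cmd!r} keeps dying (last exit code {code}); needs a human")
            return restarts
        restarts += 1
        time.sleep(backoff)               # don't hammer a service that is flapping

if __name__ == "__main__":
    # Hypothetical always-failing "daemon" to show the give-up path.
    failing = [sys.executable, "-c", "import sys; sys.exit(2)"]
    watchdog(failing, max_restarts=2)
```

A single crash gets a quiet restart and only a repeated failure pages somebody, which is the "avoiding false positives" part. If the restarts themselves start showing up regularly, that's the root cause to go fix.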
That's what you need to aim for as a programmer--ongoing maintenance tasks when you don't have feature requests coming in is a sign that the code needs to be more robust.
There comes a point, of course, where software is no longer paying for its maintenance. It's always legitimate to move an old version to a "mature" state, where it's no longer supported
Hopefully that doesn't mean deprecated. Most projects I've been involved with (I'm talking internal projects here) start from scratch, have a heavy design, development, and testing phase, and then launch live. The first month or two sees heavy maintenance work, and then the code stabilizes and requires basically no hands-on work.
Obviously when new feature requests come in they destabilize the codebase for some time, but you can and usually do get equilibrium: projects that have no development going on generally need no maintenance after they've had a bit of time to mellow.
And that's what I see as a mature project. The ultimate goal of every project I'm on is to reach maturity and just run.
The 800 Mhz Via CPU is roughly equivalent to a 400 Mhz Celeron.
Which is still _way_ overkill for someone who just does a little word processing and web surfing. Somewhere between 100-200 Mhz computers got to the "fast enough" point for those tasks. Lots of people need more power, even on the desktop--gamers, some spreadsheets, people who want to watch videos, etc--but for a surprisingly large number of people the upgrade cycle ended 4 years ago.
Sumner
Now, do you really think that any ABX test result applies for all people, all systems, all cables, in the past, present and future?
No, but if there was a difference in sound quality between decent, relatively inexpensive shielded copper wire and $100/ft Monster cable there should be at least 1 double-blind ABX test showing someone who is able to tell the difference. There have been a number of people who claim that there's a difference who volunteered for ABX tests, I have yet to hear of one who could distinguish the cables with any degree of statistical significance.
Sumner
You laugh at paying 100-500 on speaker cable... speaker cable is cheap at that rate. run it over 75 ft and you will hear a difference
No you won't. Plenty of double-blind listening tests have shown that, and plenty of premier recording studios use cable that's an order of magnitude or two cheaper than that with excellent results. I'm talking Telarc-level recordings here, too.
Sumner
A true audiophile wouldn't pay $100 a foot for speaker cable. Plenty of double-blind ABX tests show that a lot of much cheaper copper wire performs equally well.
An audio wanker who thinks anything by Bose is high-end and circles his CDs with green magic marker to make the laser track better might pay $100 a foot for such cable, but an audiophile is interested in things that sound better and not things that empty his pocket.
Uncompressed audio does sound better than 128 kbit mp3 on good gear. I use mp3 in my car, but uncompressed audio on my home system.
Sumner
Shorten (.shn) is not a free program (libre--it is gratis). It also doesn't compress as well as some of the other lossless codecs and as far as I know there aren't any hardware devices that support it.
I use FLAC to compress my music, which is free and lossless. It outperforms shorten on average (smaller compressed files), and is also supported by some hardware playback devices (Rio, Phatbox, some Kenwood stuff) unlike Shorten.
I play back through a Hoontech card with digital output and use an offboard MSB Link DAC III (the computer is acoustically isolated from the listening room) which feeds into a Creek 5350 integrated amp driving Vandersteen 2ce Signature loudspeakers.
I also use lossy compression for my car mp3 player--the stereo there isn't audiophile quality anyway.
Sumner
Seriously, the reason for this is that Motorola has displayed amazing ineptitude in PowerPC development, so duals are the only way to stay competitive with 3GHz x86 processors.
I'd much rather have an expandible box even if the CPU were a single P3 500 Mhz. Nothing I do comes close to pegging that, and I do plenty of video playback, mp3s in the background, etc. Sure, for video editing or high-end gaming a faster CPU may be nice, but I'd rather save the money on that and put it into more of Apple's cool industrial design, better screen, more hardware, and more expandibility in general.
Sumner
Mac would be a bit difficult, because it doesn't have a system() function
Uh, ANSI C requires a system() call. There are plenty of ANSI C compilers for the Mac.
Sumner
Standard Java stereotype. Java was slow a long time ago, not today. That gross asumption alone should get you modded down.
Standard Java propoganda. The Java language is plenty fast (relative to the other solutions discussed), but most of the I/O libraries are still hideously slow. Ignoring that completely should get you modded down.
Even a fairly slow computational language like Python drops Java out of the running for typical high-volume web site usage, simply because of I/O problems. Java is quite suitable in low-volume settings with stiff transactional requirements or heavy computational requirements--any setting where high I/O costs are amortized by several-order-of-magnitude higher page generation costs. It's a bad choice for a very high-volume site which basically wants to paste several database sources together into a template and shove it out the pipe; Yahoo! falls pretty squarely into that camp for most of its pages.
They also have many components written in various domain-appropriate languages or that they don't want to rewrite for whatever reason; JNI is still pretty heavyweight, and if you have a lot of language interop requirements Java isn't a great choice (though if you're willing to sacrifice some JVM portability this can often be worked around, especially if the other benefits of Java outweigh the cost of implementation).
On top of that, using EJB/J2EE will kill performance even more, which means that actually getting the feature benefits of Java requires handing away even more performance.
All that's without even addressing the "requires tons of threads" problem; multiplexed I/O is pretty new to Java, and there's no good multiprocess API. Both of those are major problems, though hopefully multiplexed I/O will mature quickly. But until there's a good multiprocess API, Java's going to be unsuitable for a number of applications (and sticking to a platform-independent mentality instead of a platform-agnostic mentality makes implementing an efficient multiprocess API very difficult indeed).
Worst of all are the memory issues, but those are well-known enough not to be worth rehashing.
Sumner
Note that while the list of drawbacks is only addressed briefly, increased schedule() overhead and increased system call overhead are potentially large drawbacks.
Also, after further investigation the Anzinger solution is _not_ in 2.5.x yet; Linus has looked at the patch, asked for clarification, and Anzinger recently replied with an updated patch. Search linux-kernel archives for "high-res-timer" or "POSIX timer" patches for more info.
Sumner
There's a better link at LWN explaining the approach and drawbacks. It links to the high-resolution timers project (Anzinger's), which I believe is going into 2.5.
Sumner
FWIW: BSD has the same select setup as you described.
1 04 .1/0137.html
Yeah, pretty much every Unix has interrupt-driven returns for the non-timeout case, anything else would be pretty bogus--though some systems (e.g. Linux 2.5.x) do interrupt mitigation under high load, but that's more of an "above and beyond" thing. The timeout case is handled differently on several Unices.
So here's my thought: how expensive is it to reprogram the timer chip? Would it be possible to adjust it dynamically to create perfect granularity in sleep/select?
There is a tickless Linux implementation.
I can't find the home page at the moment, but see e.g.
http://www.uwsg.iu.edu/hypermail/linux/kernel/0
There are a lot of other ways of dealing with this, and tickless has some negative attributes I don't fully understand (among them is that it's not portable to older hardware, and there is some overhead to programming timer interrupts). I think the nanosecond kernel patches (which are starting to go into 2.5) address the select/sleep granularity issue in a different way but I'm really fuzzy on the details.
Sumner
From the look of the article, under Linux, select actually does some sort of polling at or related to HZ. It may be on some sort of almost-run queue: a selecting process gets allocated timeslices; on its slice, it polls and either returns to userland or goes back to onto the almost-run queue. I don't have time to verify that-- I don't know my way around the Linux kernel-- but it seems to be reasonable, based on the article. Can I get a Linux developer to confirm/deny my guess?
Deny. It's actually the idle timeout that's affected by HZ. select() itself doesn't poll at all, and e.g. a select() call with an infinite timeout will be completely unaffected by HZ (select will wake up when the network gets an interrupt resulting in readable data/writeable buffer space).
Example of the timeout effect: a game could have a select() loop that waits on user input, but also has a timeout argument so that it can go ahead and update the screen, do enemy AI, etc. The kernel, in the absence of interrupts, schedules on HZ boundaries. Suppose that you as a programmer put a 1/60 second timeout argument in the select loop (intending to update the screen with a 60 Hz refresh and figure out where everything's moving). If you call select() right after a HZ boundary, you could find yourself waiting until 1/50 second passes even on an idle machine with HZ=100; after 1/100 sec, your timeout hasn't expired yet. Next chance to schedule is at 2/100 (1/50) sec.
With HZ=1000, you'll schedule no more than 1/1000 sec after the 1/60 sec boundary (on an idle machine).
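If you want to check that arithmetic, here's a few lines of Python. It's a toy model and not kernel code: it assumes an idle machine where an expired timeout only gets noticed on the next 1/HZ tick boundary, and the name wake_time is made up for the illustration.

```python
from fractions import Fraction
from math import ceil

def wake_time(call_time, timeout, hz):
    """Earliest tick boundary at or after call_time + timeout.

    Simplified model (an assumption, not kernel code): on an idle machine
    an expired timeout is only noticed on a 1/hz tick boundary.
    Times are in seconds; Fractions keep the arithmetic exact.
    """
    deadline = Fraction(call_time) + Fraction(timeout)
    return Fraction(ceil(deadline * hz), hz)

timeout = Fraction(1, 60)              # intended 60 Hz frame period
for hz in (100, 1000):
    woke = wake_time(0, timeout, hz)   # select() called on a tick boundary
    print(f"HZ={hz}: wake at {woke} s, {woke - timeout} s late")
# HZ=100: wake at 1/50 s, 1/300 s late
# HZ=1000: wake at 17/1000 s, 1/3000 s late
```

In this model the worst-case lateness is one tick, so going from HZ=100 to HZ=1000 cuts it by a factor of ten.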
This example is really simplified; a real-life app would adjust for scheduling creep by keeping track of wall-time. But the same concept, with more complicated apps, can cause faster HZ ticks to give you better CPU utilization (especially in e.g. video editing apps and such) because you get around to using the CPU closer to when you want it.
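One way to do that wall-time bookkeeping, sketched in Python rather than C. Each deadline is computed from the loop's start time, so a late wakeup shortens the next wait instead of pushing every later frame back. The names (frame_loop, on_input, on_frame) are hypothetical, and it assumes a Unix-like system and a peer that stays connected.

```python
import select
import socket
import time

def frame_loop(sock, frames, period, on_input, on_frame):
    """Run `frames` updates spaced `period` seconds apart, servicing
    input on `sock` in between. Deadlines are absolute, so scheduling
    lateness in one frame does not accumulate into the next."""
    start = time.monotonic()
    for n in range(1, frames + 1):
        deadline = start + n * period
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break                      # frame is due (or overdue)
            readable, _, _ = select.select([sock], [], [], remaining)
            if readable:
                on_input(sock.recv(4096))  # handle input, then keep waiting
        on_frame(n, time.monotonic() - deadline)   # lateness of this frame

if __name__ == "__main__":
    a, b = socket.socketpair()
    b.send(b"left")                        # pretend the user pressed a key
    frame_loop(a, 3, 1 / 60,
               on_input=lambda data: print("input:", data),
               on_frame=lambda n, late: print(f"frame {n}: {late * 1000:.2f} ms late"))
```

The lateness passed to on_frame is what a coarser HZ makes worse; the absolute deadlines only keep it from piling up frame after frame.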
The preempt kernel is an even better example of where decreasing latency can increase throughput, sometimes significantly. There you can really get around to dealing with I/O quickly, keeping the CPU saturated (and saturated with cache-warm data) and benefiting things like heavily loaded web servers just as much as sound editing stations.
Sumner
If you aren't a real engineer-- and by real engineer, I mean someone who learns new technologies in their spare time, someone who wouldn't be caught dead without a computer at home
That's BS. A lot of good programmers I know don't have computers at home, intentionally. It's a matter of getting balance in your life. Mine serves only as a centralized Vorbis repository and music visualization center. I write TCP servers and search indexing programs for a living and sit in front of a computer more than 8 hours a day. I'm more than happy to _not_ do the same at home. Once in a great while I get an off week at work where I'm stuck in pure maintenance hell or other non-learning positions and I'll dust off the keyboard at home and write some code, but for the most part I'm pretty happy to leave that to work and limit my home involvement to the odd tech book and the whiteboard.
Overrated: How abusive moderators avoid getting caught in metamoderation.
No, it's how moderators work around the fact that there's no "Incorrect" or "Misinformative" category.
Sumner
man - occasionally called something else (QNX calls it use). The ability to derive meaning from man files is extremely important.
Repeat that about 1000 times.
sh/csh - the vast majority of *nix default to some variant of sh, of which ksh is the most common. If you know sh you can use any of the variants (ksh, bash, etc) and learn the particulars as you go. There are rumors of *nices that default to csh, but I've never encountered one. Regardless, you should be familiar with it just in case.
The only thing you need to be able to do in csh is type "/bin/sh". Knowing sh is key, at least know how to do "for i in...do...done", "if..[..fi", and "case". Know how to use "["/"test".
Also get and know netcat.
find - They say locate is easier, but I've never seen it on a non-Linux system.
locate is far more limited, though much faster (it uses an index instead of an fs search). Know find, and know that it's different on non-GNU systems. "find" alone works with GNU find, but you'll want at least "find . -print" on other systems.
Probably make a /usr/gnu/bin on every new machine and put it first in your path (with GNU fileutils, textutils, find, etc). df, find, echo, and a ton of others will work differently between Unix variants and you'll make your life more consistent if you can use the same set of tools on all platforms.
And _do_ learn to read and use man pages. You'll eventually get caught on a machine without the GNU tools available and need to figure out how the local "mount" works to get back to them.
And _don't_ alias rm to "rm -i" (or similar with cp, mv, etc). You'll wind up destroying files on a machine without the alias.
Sumner
Also, learn ed. You may need it someday. On my system (FreeBSD 4.6), ed is installed as /bin/ed, and vi is installed as /usr/bin/vi
Definitely.
Use vi frequently for a while, that vi skill will pay off if you get dropped into ed. Learn enough ed to fix things up and get /usr/bin mounted. And keep a statically linked copy of your favorite small(ish) editor and of sash (and scp/ftp/wget/netcat if you can swing it) on the root partition--libc.so _will_ eventually die on one of your machines, and having sash and vi available for recovery will make you much, much happier.
Plus, it'll make the local MCSE's head explode if he sees you using it
Unless he remembers edlin from his DOS days
Sumner
I think you are writing as a person who has never had to use either.
I have written a dynamic content server that over the past 2 years has served over 6 billion requests, with 5 9's of uptime. I've written several realtime instrument control applications. I've written a distributed text mining application that does index-assisted regex searches of 1/2 terabyte of data in
Threads can really be life savers when used correctly. Sure you have to implement locking but that's what pthread_mutex is for.
On low-mem devices making full copies of the process to spawn copies is just insane.
1) Look up COW and memory sharing.
2) I never said "use only processes". A combination of processes and event loops is the way to go 99% of the time. There are some corner cases where threads are useful, but they tend to be abused by people who think "threads are good" without considering the alternatives or the ramifications of that choice.
And on Windows the Thread implementation is *intentional* not accidental. The idea is that people using threads will take advantage of the speed increase.
It's not a speed increase. Thread switching and thread creation on Windows are slower than process creation and process switching on Linux. On a par, but slower. Process creation on Windows is laughably slow, though, and process switching is substantially slower than thread switching.
It's not that Windows figured out how to make their threads go fast, it's that their processes were dog-slow and they had to create an entirely separate execution primitive to get any sort of reasonable concurrency. Linux did things the right way by making them both fast, and now allows you to choose between the two for _design_ reasons (do I want to share memory?) rather than artificial implementation reasons.
You'll find a lot of knowledgeable people (Larry McVoy, former SGI kernel architect) who echo the same belief: use threads sparingly. Use as many threads as you have CPUs, and use processes instead if that makes more sense. Use more threads than that only if you're intimately familiar with the alternatives and know why they don't work, because while a state machine with non-blocking I/O may seem hard at first glance it'll almost certainly turn out to be easier to implement correctly, easier to debug, faster, and easier to maintain.
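For anyone who hasn't seen one, here's roughly what a state machine with non-blocking I/O looks like, as a minimal Python sketch using the standard selectors module. The toy protocol (upper-case each line and send it back) and the names LineEcho, serve_once, and add_connection are invented for the example. The point is that each connection's state is an ordinary object touched by exactly one thread, so there's nothing to lock.

```python
import selectors
import socket

class LineEcho:
    """Per-connection state for a single-threaded server: buffer bytes
    until a newline arrives, then send the line back upper-cased."""

    def __init__(self, conn):
        self.conn = conn
        self.inbuf = b""

    def on_readable(self, sel):
        try:
            data = self.conn.recv(4096)
        except BlockingIOError:
            return                          # spurious wakeup; try again later
        if not data:                        # peer closed: tear down this state
            sel.unregister(self.conn)
            self.conn.close()
            return
        self.inbuf += data
        while b"\n" in self.inbuf:
            line, self.inbuf = self.inbuf.split(b"\n", 1)
            # Simplification: assumes the reply fits in the send buffer. A
            # fuller version would queue unsent bytes and watch EVENT_WRITE.
            self.conn.send(line.upper() + b"\n")

def add_connection(sel, conn):
    """Make `conn` non-blocking and attach a fresh state object to it."""
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, LineEcho(conn))

def serve_once(sel, timeout=None):
    """One pass of the event loop: dispatch every ready connection."""
    for key, _ in sel.select(timeout):
        key.data.on_readable(sel)

if __name__ == "__main__":
    sel = selectors.DefaultSelector()
    server_side, client = socket.socketpair()
    add_connection(sel, server_side)
    client.sendall(b"hel")                  # partial line: just gets buffered
    serve_once(sel, timeout=1)
    client.sendall(b"lo\n")
    serve_once(sel, timeout=1)
    print(client.recv(4096))                # b'HELLO\n'
```

A real server would loop on serve_once forever and register a listening socket too; a partial read just leaves the state object waiting for the next event.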
Sumner
And yes, Linux's process context switches are on a par (possibly faster - can't be bothered to look up benchmarks) with NT's thread context switches.
Last time I benchmarked, which was a long time ago (NT 3.51 days), Linux process switch times were 5x faster than NT thread-switch times on the same hardware. Linux thread-switch times were on a par with process-switch times, NT thread-switch times were about 20x faster than NT process-switch times.
I'd expect all those numbers to have changed, though.
Sumner
Err, Windows NT does use the native 4KB page size on Intel, but is designed to be expandable to systems with up to a 64KB page size. As a result, certain operations (like the reserve mapping that goes on for the thread stack) align data in 64KB increments.
That's boneheaded. Linux supports page sizes up to at least 4MB, but it doesn't align everything on 4MB boundaries on the off chance that you might be using 4MB pages. It uses the appropriate alignments for the page sizes actually in use.
An OS that has dropped all support for non-Intel hardware citing a portability concern which doesn't exist in portable OSes? As they say in Snatch, "It's spurious, mate. Not genuine."
Sumner
What I want to know is why use boxen rather than boxes?
"boxes" refers to the physical objects (ie the cases and contents thereof)
"boxen" refers to the notional servers.
My Linux boxen could be retasked to be FreeBSD boxen, but they'd still be the same boxes.
AFAIK, "boxen" was derived from "VAXen". And it was never "VAXes", that would brand you as computer-illiterate as quickly as saying "What's the http for that?" or using "PC" to mean "Windows box".
Sumner
Not as easy as you think
But far easier than multithreaded programming.
Threads are a way of saying "screw protected memory". They should be used only when you don't want memory protection within your application. Almost always, using threads is the wrong choice; multiple processes and/or a state machine with non-blocking I/O (depending on the problem) will accomplish the same ends as efficiently* and are much easier to implement**
*Remember that processes (COEs, i.e. contexts of execution, which don't share memory) are nearly as fast as threads in Linux, and faster in some cases. On other OSes (Irix, Solaris, Windows), processes are inefficient and threads are implemented more efficiently. That's a horrible hack to make up for ridiculously heavyweight processes; it's true that a small number of things can be optimized in a thread implementation (setting up VM mappings), but the actual speed implications of that are negligible in real-life programs. I'd be extremely surprised if you could find even one program exhibiting a measurable speed difference in Linux attributable to the scheduling, creation, and destruction properties of threads.
**Threaded solutions often seem straightforward. The devil is in the details, though; locking, synchronization, and debugging issues tend to bite hard, and in the end I've never dealt with a problem where threading was a win over multiprocs and/or a state machine. The advantage of multiprocesses is not only in keeping memory protection; it also forces you to be explicit about what's shared and how that is communicated (and greatly simplifies debugging). Resulting designs tend to be much clearer and easier to make correct and maintain.
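To make the "explicit about what's shared" point concrete, here's a Unix-only Python sketch using fork() and a pipe. The helper run_in_child and the counter example are made up for illustration, and error handling is left out. The child works on its own copy-on-write view of memory, so the only thing the parent ever sees is what comes back over the pipe.

```python
import os

def run_in_child(work, arg):
    """Fork, run work(arg) in the child, and return its result as bytes.

    The child gets a copy-on-write view of the parent's memory, so nothing
    it modifies is visible to the parent; the pipe is the only channel back.
    Unix-only sketch: `work` must return bytes, and error handling is omitted.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:                            # child
        os.close(read_fd)
        try:
            os.write(write_fd, work(arg))
        finally:
            os._exit(0)                     # never fall back into parent code
    os.close(write_fd)                      # parent
    chunks = []
    while True:
        chunk = os.read(read_fd, 65536)
        if not chunk:                       # EOF: child closed its end
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return b"".join(chunks)

counter = {"hits": 0}

def bump(n):
    counter["hits"] += n                    # changes only the child's copy
    return str(counter["hits"]).encode()

if __name__ == "__main__":
    print(run_in_child(bump, 5))            # b'5'
    print(counter["hits"])                  # 0: the parent's copy is untouched
```

With a thread, bump() would have scribbled on the parent's counter and you'd need a lock around it. Here the interface between the two is one pipe, which is also the only place you have to look when debugging.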
Sumner
Of course, given recent history, the latest scam involving Florida is buying not swampland but elections. And it only costs you five justices' robes...
Oh, come on, that's so old news. It's not even the most recent Florida election count mishap.
Sumner
Do it if it raises your gpa. GPA is not everything, but it is definitely BIG.
Only if you're applying to grad school, and even then it's far less important than any research you've done.
In some fields, when you're job hunting straight out of college employers might ask your GPA. But especially if you're looking for a programming job, it'll be far more important to show your skills and prior work/research (math _skills_ will count big) than to list your GPA. That fun open-source project you wrote counts for more than an extra .5 on the GPA.
I've never even been asked for my GPA when job hunting; in fact, I finished at CMU in 97 and can't remember what it was (in contrast to the high school GPA and SATs which are burned into my brain as a result of the college application process).
Get the 2nd major if you want to take the classes, but if you have another field of interest go for it--you'll be learning in-field skills for the rest of your career, you won't often have such easy access to that art history or theoretical physics class. A math degree, like a 4.0, might help you get the first job but a photography class will give you a skill you'll use off and on for the rest of your life (and it might be _more_ valuable than that math degree depending on the employer and the job--it's amazing how often something seemingly worthless coincides with a really cool job).
Sumner
I have used EiC for debugging some projects, but as I said I find the combination of Boehm-Weiser and valgrind to be the most effective for real-world projects. EiC is quite nice but as you note it can take a lot of effort to get it working right.
Sumner
Managers tend to think that once a project is out the door or 'live,' it's done and over with, and assign developers a new project.
That is a problem. You need to get managers to understand that once a project launches, there's going to be a stabilization period that's just as intense as the development period. It can be short or long depending on the project, but it needs to happen.
And you need to get programmers to believe that they _can_, in fact, get the code to a state where it just runs. That means having watchdogs to monitor it rather than people (sending email when it dies, and avoiding false positives). That means taking the 2-3 (or 20) hours to fix things that require "just 5 minutes" of attention every day. It also means isolating the system into as many independent parts as possible so that it's manageable and easy to work with.
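A bare-bones Python sketch of that kind of watchdog, to show the false-positive part: it only alerts after several consecutive failed checks, and only once per outage. Everything here (the Watchdog class, the threshold of 3, the check command, the alert callback) is a placeholder of mine and not the system described in this post.

```python
import subprocess
import time

class Watchdog:
    """Alert only after `threshold` consecutive failed checks, and only once
    per outage, so a single slow response does not page anyone."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.failures = 0
        self.alerted = False

    def observe(self, healthy):
        """Record one check result; return True if an alert should go out now."""
        if healthy:
            self.failures = 0
            self.alerted = False
            return False
        self.failures += 1
        if self.failures >= self.threshold and not self.alerted:
            self.alerted = True
            return True
        return False

def watch(check_cmd, alert, interval=60, threshold=3):
    """Run `check_cmd` forever; call `alert(message)` when the dog barks."""
    dog = Watchdog(threshold)
    while True:
        healthy = subprocess.run(check_cmd).returncode == 0
        if dog.observe(healthy):
            alert(f"{check_cmd!r} failed {threshold} checks in a row")
        time.sleep(interval)
```

The alert callback is where the email would go. Keeping the decision logic in observe(), separate from the polling loop, means you can test it without waiting on real checks.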
As an example, I worked on developing a massive dynamic content system for a past job. It required a GUI for content authors to do input with, a way to schedule and deploy changes, a server to select content based on user parameters, a way to track content to see what was served where, inventory analysis of what web space would be available, etc. It took over a year to develop, had more than 15 separate executables built from 75,000 lines of code, and was hell when it launched in terms of making sure it ran happily with all the other systems in place. One daemon needed to be restarted occasionally; sometimes unforeseen log data screwed up reports; etc. We could have just gone ahead and restarted that daemon when it needed it, fixed the logs by hand, and so forth--instead we put the up-front time into fixing the root causes of each of the tiny maintenance chores as we identified them.
Within 3 months it was pretty much hands off, and would send email alerts when something went wrong that the watchdogs couldn't handle. Moreover, because it was 15 separate components, feature enhancements were generally easy: change one small program instead of a huge monolithic one. We went through several iterations of feature requests over the following year, each one followed by a stabilization period, and eventually got it to the point where it did everything we needed done and did it without needing handholding (and serves 6 billion requests per year).
That's what you need to aim for as a programmer--ongoing maintenance tasks when you don't have feature requests coming in is a sign that the code needs to be more robust.
Sumner
There comes a point, of course, where software is no longer paying for its maintenance. It's always legitimate to move an old version to a "mature" state, where it's no longer supported
Hopefully that doesn't mean deprecated. Most projects I've been involved with (I'm talking internal projects here) start from scratch, have a heavy design, development, and testing phase, and then launch live. The first month or two sees heavy maintenance work, and then the code stabilizes and requires basically no hands-on work.
Obviously when new feature requests come in they destabilize the codebase for some time, but you can and usually do get equilibrium: projects that have no development going on generally need no maintenance after they've had a bit of time to mellow.
And that's what I see as a mature project. The ultimate goal of every project I'm on is to reach maturity and just run.
Sumner