the analyst industry is quite amazing - all you have to do is repackage common knowledge as something special and people will pay you for it!
seriously - AMD and Intel are normally out-of-phase in product intros. it's been this way for many years, so we have to assume it's deliberate. Intel made a major improvement by souping up the Pentium-M line into Core2, and has gained a nice lead in some, even most, benchmarks, mainly due to some fairly narrow improvements that AMD hasn't yet answered, like 1-cycle throughput SIMD operations. AMD's current offerings are largely unchanged since the original Opteron intro (2003?), except for smallish tweaks like bigger caches, faster memory, doubled cores. AMD still does well for applications which are sensitive to memory bandwidth, for instance - part of the original technological jump of the K8.
AMD is about to introduce their response to Core2, and it seems quite promising based on the hints AMD has provided. Intel's not in a position to respond immediately, since 45nm production is some way off, and it (Penryn) will apparently be just a shrink of the current Core2 design.
in short, it's only sensible, sound business practice for AMD to drop the prices of their mature, high-yielding, partly-outsourced half-gen-old products. performance is still competitive with Intel's products - at a time when Intel's yields are probably not yet mature. in a way, this sets the stage for AMD to introduce its next-gen parts at a more comfortable margin.
in a real sense, the rush to multicore is pure laziness on the part of chip designers. yes, there are some interesting issues that come up when you replicate the cpu 4 times on one chip or package. but fundamentally, the real win is in the microarchitecture of a single core. and I really, really hope AMD hasn't botched it there, since Intel has the ability to pull some pretty nice stuff out of their hats.
right now, even dual-core opterons are not a clear win because of factors inherent to multi-core, such as how all external resources, already limited, become more highly contended, and how multicore products lag in clock. and don't forget that Amdahl's law is still in force - you may be happy about the throughput in parallel sections of your code/workflow, but your actual happiness is often dominated by serial sections and therefore single-core performance. with many parallel processors, you _only_ care about the serial sections!
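to put rough numbers on the Amdahl's law point, here's a minimal python sketch. the 10% serial fraction is an assumption picked purely for illustration, not a measurement of any real code, and the function names are made up for this example.

```python
def amdahl_speedup(serial_fraction, cores):
    """Speedup over a single core when only the parallel part scales."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def serial_share_of_runtime(serial_fraction, cores):
    """Fraction of the wall-clock time that is spent in the serial section."""
    parallel_fraction = 1.0 - serial_fraction
    return serial_fraction / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    assumed_serial_fraction = 0.10  # assumption, for illustration only
    for cores in (1, 2, 4, 16, 1024):
        speedup = amdahl_speedup(assumed_serial_fraction, cores)
        share = serial_share_of_runtime(assumed_serial_fraction, cores)
        print(f"{cores:5d} cores: speedup {speedup:5.2f}, serial share {share:.2f}")
```

under that assumption the speedup can never reach 10x no matter how many cores you add, and at 1024 cores more than 99% of the wall-clock time is the serial part, which is the "you _only_ care about the serial sections" effect.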
it may be that AMD has done enough tweaks to the K8L core to stay competitive. we don't know yet, unfortunately. the worrisome thing is they're trying to transition to 65nm, 300mm, quad and a new core all in one step. I'd personally be a lot happier if they had test versions of a single-core K8L out now, for instance.
it's not about whether you like how Apple products look (in fact, I do). the main point was simply that Apple has become mostly an ipod company, with a sideline in computers. the revenue numbers are incontrovertible, even though I'm sure Jobs and others would resist this characterization.
remember: this topic is about how apple is screwing up in significant ways. the most media-covered ones have to do with hardware problems (heatsink compound, noise, scratches, etc). they do attempt good software design as well, but are hardly perfect there either. apple tries to be the company that sucks less, but I'm not sure even the ipod revenue stream is sufficient to manage. IMO, they need to figure out how to harness open-source to relieve some of their SW development burden - much more than simply using gcc, etc.
Apple has become a jewelry company specializing in audio appliances - it's certainly not a computer company, in the sense of Dell or HP. look at where Apple's revenue is! the computers they sell are primarily offshoots of the audio-jewelry line, so how important is it that they work perfectly? as part of the fashion industry, Apple's focus is and should be to manage their spin and buzz, mainly through appearance and drama, rather than reliability, price/performance, etc. it can't slow down the pace of product intros to iron out all the little flaws, since the sudden unveiling is a standard fashion-industry technique. who ever heard of Armani trumpeting the beta2 of rev 4.3 of the italian, 3-button pinstripe suit? Apple is the Manolo Blahnik of the computer-electronics industry.
quit judging Apple by hardware-vendor standards! it's a fashion company, and should be measured appropriately. I'm just waiting for Steve Jobs to literally walk the runway with some new do-dad (bluetooth earrings?).
nah. latency between cores is O(L2 latency), which is ~20-30 cycles. so if you know a thread is going to take a major bubble (cache miss), you might well do something about it by punting it off to an otherwise idle core. that assumes that the other core has some mechanism to retrieve the microarchitectural state of the thread, though (instructions and data in flight, etc). unless, of course, all that state is already available to the other "core", in which case it's really HT all over again. this is not a bad thing, since HT (SMT in general) got a bad rap because of how it was conflated with some other issues on the P4. the idea of switching to useful work rather than idling the pipe during a bubble is a sound one. the real question is how much the threads will interfere (contending for cache, etc), and how much overhead you pay for doing the fine-grain switching (extra tags on the internal non-architectural reg file, for instance, complicating retirement, etc.)
clearly this could also be done speculatively, as mentioned in the reference to mitosis. but it's pretty unclear how much it could be done on-the-fly, since there's a huge danger in drowning in speculative work which winds up useless when the initial fork is resolved...
that said, AMD still has lots of other things to clean up: wider superscalarity and single-cycle SSE to match Intel, for instance.
I have a lot of HP equipment, and have only good things to say about hardware or service.
sure, HP prices are not overwhelmingly great, but they will compete, even with a supply-chain company like Dell. and HP really does retain some of the good properties of the companies it used to be - my service guy is DEC/Compaq/HP, and knows his way around. in my field (scientific supercomputing) HP doesn't always have all the right answers, but they have really good guesses on most of them.
ironically, HP and Dell now use largely the same sorts of supply chains. parts are immediately drop-shipped to service staff via standard commercial couriers, for instance. any installation large enough to make sense will have pre-positioned spares. after all, it's not as if any one company has deeper insight into how to do this stuff - logistics is fairly common-sense.
HP has a lot of rock-solid, competitively fast products, but also has some real depth of experience and engineering.
what's wrong with option 1? there are other laws that handle the case where refusing to reveal a source makes the refuser an accomplice to a crime.
IMO, that's the main problem with lawmaking today - pols just applying patches to what they see as bugs in the legal code, rather than coming up with principles. yes, sometimes you really do need to refactor the law.
it's not that GC is slow, it's that the comparison is invalid. malloc libraries are designed to minimize or at least keep low the amount of wasted space. GC trades off extra space for the performance of contiguous arena allocation. malloc could do that too, and its fast path would be identical. the slow path matters, too. further, a GC language necessarily trades off contiguous allocation vs some sort of indirection or pointer-editing. that is, GC has to be able to move memory objects, so the simplest (historic, inefficient, naive) implementation is to always just indirect through an object table. that extra memory reference is much more important than shaving off a few cycles of the allocation fast-path, since the cost is taken on every reference.
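a toy python sketch of the two mechanisms in that argument: a bump-pointer arena (the allocation fast path) and an object table that lets objects move at the price of one extra lookup per access. this is a model of the naive design described above, not how any particular runtime or malloc library works; `Arena` and `HandleHeap` are hypothetical names.

```python
class Arena:
    """Bump-pointer arena: allocating is a bounds check plus an add."""

    def __init__(self, size):
        self.memory = bytearray(size)
        self.top = 0  # offset of the next free byte

    def alloc(self, nbytes):
        if self.top + nbytes > len(self.memory):
            # slow path: a real collector or malloc would do work here
            raise MemoryError("arena full")
        offset = self.top
        self.top += nbytes
        return offset


class HandleHeap:
    """Objects are reached through a table so that they can be moved."""

    def __init__(self, size):
        self.arena = Arena(size)
        self.table = []  # handle -> (offset, length), or None once freed

    def new(self, data):
        offset = self.arena.alloc(len(data))
        self.arena.memory[offset:offset + len(data)] = data
        self.table.append((offset, len(data)))
        return len(self.table) - 1

    def read(self, handle):
        offset, length = self.table[handle]  # the extra memory reference
        return bytes(self.arena.memory[offset:offset + length])

    def free(self, handle):
        self.table[handle] = None

    def compact(self):
        """Copy live objects into a fresh arena; only table entries change."""
        fresh = Arena(len(self.arena.memory))
        for handle, entry in enumerate(self.table):
            if entry is None:
                continue
            offset, length = entry
            new_offset = fresh.alloc(length)
            fresh.memory[new_offset:new_offset + length] = \
                self.arena.memory[offset:offset + length]
            self.table[handle] = (new_offset, length)
        self.arena = fresh
```

callers only ever hold handles, so `compact()` can relocate everything without touching them. the cost is that every `read()` pays the table lookup, which is the per-reference overhead the comparison tends to ignore.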
it's possible to take server consolidation too far. suppose it costs you $2e6 to buy an IBM mainframe that can support 200 LPARs (I mean real, active ones, not idle ones.) when is this better than putting each on a $2k server of its own? sure, sometimes it is: the LPAR can react more dynamically, and some aspects of TCO would be lower. but we have to be honest when making this comparison - let's assume the separate servers are auto-provisioned, for instance, and have IPMI and some sort of intelligent storage network. the operating costs of such an alternative are fairly low, perhaps lower than IBM's relatively exotic hardware.
don't get me wrong: resource pooling is a great thing (it's what I do, come to think of it), but the kind of partitioning that IBM is pushing is, IMO, more of an effort to sell high-end hardware. and doing a really honest/complete TCO analysis is quite complicated. virtualization is not a value in and of itself.
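the capital side of that comparison is simple arithmetic; a small python sketch using only the figures from the example ($2e6 mainframe, 200 active LPARs, $2k servers). it deliberately ignores operating costs, which is the part that makes an honest TCO analysis hard, and the function names are made up here.

```python
MAINFRAME_PRICE = 2_000_000  # the $2e6 figure from the example
ACTIVE_LPARS = 200           # real, active partitions
SERVER_PRICE = 2_000         # one small server per workload


def capital_cost_per_workload(total_price, workloads):
    """Purchase price spread evenly over the workloads it hosts."""
    return total_price / workloads


def breakeven_opex_saving(mainframe_price, lpars, server_price):
    """Lifetime operating-cost saving per workload that the mainframe
    has to deliver just to match the separate servers on total cost."""
    return capital_cost_per_workload(mainframe_price, lpars) - server_price


if __name__ == "__main__":
    per_lpar = capital_cost_per_workload(MAINFRAME_PRICE, ACTIVE_LPARS)
    gap = breakeven_opex_saving(MAINFRAME_PRICE, ACTIVE_LPARS, SERVER_PRICE)
    print(f"capital per LPAR:   ${per_lpar:,.0f}")
    print(f"capital per server: ${SERVER_PRICE:,.0f}")
    print(f"opex saving needed per workload to break even: ${gap:,.0f}")
```

that's $10k of capital per LPAR against $2k per server, so the mainframe has to save about $8k per workload in operating costs over its life before it's even.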
this is just fear-mongering, with no primary data to substantiate the claim that the task is difficult, only hearsay that there are people who spend a lot to do something that sounds, in a 2 sentence description, vaguely similar.
from the original description, it sounded to me like the central DB was purely for mining purposes, to monitor what was going on, perhaps to learn from patterns in account activity. as such, the decoupled, do-it-yourself approach IS the right thing, and spending big bucks will do nothing but waste money.
sure, do your homework: study whether the central DB has any data path back to the subsidiary ones. whether there's really no interaction between subsidiaries. whether there really is "slack" time when data can be streamed back to central, and whether central can handle it.
observe that the goal is actually just locality: to provide sub data to central without having to make remote access. as such, all you really have to do is snapshot/mirror sub DB's onto machines at central. doing it incrementally is a useful efficiency hack. putting all sub data into one central DB is also just a convenience hack.
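a minimal sketch of the incremental-mirror idea, in python with the standard `sqlite3` module. everything specific here is an assumption for illustration: the `activity` table, its columns, and above all that the subsidiary data is append-only with increasing ids. updates and deletes at the subsidiary would need a real change log.

```python
import sqlite3

# hypothetical schema, used for both the subsidiary DB and its mirror
SCHEMA = (
    "CREATE TABLE IF NOT EXISTS activity ("
    "id INTEGER PRIMARY KEY, account TEXT, amount REAL)"
)


def mirror_increment(sub, mirror):
    """Copy rows the mirror has not seen yet; return how many were copied.

    Assumes append-only data: the highest id already in the mirror marks
    where the previous run stopped.
    """
    mirror.execute(SCHEMA)
    last_id = mirror.execute(
        "SELECT COALESCE(MAX(id), 0) FROM activity"
    ).fetchone()[0]
    rows = sub.execute(
        "SELECT id, account, amount FROM activity WHERE id > ? ORDER BY id",
        (last_id,),
    ).fetchall()
    mirror.executemany(
        "INSERT INTO activity (id, account, amount) VALUES (?, ?, ?)", rows
    )
    mirror.commit()
    return len(rows)
```

run one of these per subsidiary during its slack time, each into its own mirror at central. merging the mirrors into one big DB is then an optional second step rather than part of the transfer.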
indeed, as a monopoly-building tactic, it was a great success. though you could easily argue that the other processors were in their final stages of death anyway, since ia32 and eventually amd64 would certainly have killed them off.
Alpha was a great chip (I still run hundreds!), but was flushed because HP was afraid of Dell.
PA-RISC, MIPS and SPARC had nothing going for them - no competitive advantage, certainly not over AMD64. they were each tied to old-fashioned name-brand-Unix machines, which are now thankfully extinct.
yeah, but when it comes to buying machines, who the heck cares about fancy design unless it gives a clear, measurable performance boost on real codes?
for instance, in SPEC FP, the It2 looks pretty impressive, until you realize that:
you lose most of the speed advantage if you ignore those codes whose working set is entirely in-cache on the It2 (and not so on other processors.)
Intel's compilers have been tuned extensively to make SPEC FP look good, so these numbers are unrealistic upper bounds for real user code.
good old os/2 - so few people remember that Microsoft wrote it, basically under contract, for IBM.
but really, who would care? it was an OS of its time (around 1990), and certainly does not add value to the OS landscape today. if you want layering to interfere with the design of an OS, you need look no further than NT and followons. the rest of the universe has gone on (to linux).
yes, I did work on OS/2 (in Redmond long ago). I even have the tshirts to prove it (including one that elucidates that NT=new technology, and was originally a derivative of OS/2 for RISC chips...)
one big problem is that RMS and GPL give everyone the creeps.
suppose the subtitle of GPL was "license for programmers who play nice with other programmers". after all, that's the whole point: if you want to use our software, we want to use whatever you build on it.
what's more, Intel and AMD mean different things by their power ratings. AMD's tend to be worstcase and Intel's are typically typical.
personally, I can't imagine why anyone cares. laptops are for portability and speed is nearly irrelevant (my PIII/733 is plenty). if you really want a desktop in a quiet, tidy formfactor, why would you care about battery life?
security people seem to have failed to notice a major transition in the industry: single-person desktops and no-user-shell servers have come to dominate. yes, there are still a few places where login accounts are given to a potentially hostile userbase. those sites do need to worry about local-root exploits, and even, to some extent, local-DOS exploits.
but people who focus their lives on security seem to have a clear tendency to lose track of the actual importance of these problems. just look at the whiny grsecurity message - what the FUCK is a "moxa" and why should I give a damn about it?
all security is not equally important, just as perfect security makes any system completely useless. people who live and breathe this stuff need to realize that they are extreme outliers, often so far out that they're not even part of the community they claim to be protecting.
VT demanded a price of $300/node (INCLUDING switches!) for interconnect, and was laughed at by several vendors. they wound up with the set of vendors who were willing to subsidize the cluster as a loss-leader. everyone in the community knows that the cluster was non-repeatable, and thus $5M is not the actual price.
the "review" was rather mystifying, since the tests prove mainly that if you compile some random code for the P5 (apparently all 32-bit, no use of SSE or AMD-64 registers!), well, then the P4/3.6 runs it faster than a K8/2.4. wow! Occam would attribute this result to the difference in clocks, not chips.
you've got to pin down exactly what you're trying to do. for instance, rendering is embarrassingly parallel, so if you have lots of frames, don't even think about parallelism within a frame. rendering is not particularly memory-intensive, so you might well get away without ECC, since the expected failure rate for non-ECC memory depends primarily on how much you have, and how hard you use it. if you're talking more than a handful of boxes, you definitely want to make them diskless/netboot. going SMP is probably a bad choice since it drives up the price (particularly if you're going for lean, diskless nodes.)
gigabit networking is almost certainly the right choice, since it's practically free. if you wind up using a lot of IO (complex models, simple frames, big images), then gigabit/diskless might end up being inadequate. but adding a disk per node will drastically hurt reliability for any large cluster! faster networks (myri/quadrics/IB) are a huge marginal cost unless your nodes are fairly high-end.
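why a disk per node hurts at scale, as a rough python sketch. the 3% annual failure rate is an assumed figure for illustration only (plug in whatever your drives actually do), and the model assumes failures are independent.

```python
def p_any_disk_fails(nodes, annual_failure_rate):
    """Chance that at least one of `nodes` disks fails within a year,
    assuming independent failures at the same annual rate."""
    return 1.0 - (1.0 - annual_failure_rate) ** nodes


def expected_failures_per_year(nodes, annual_failure_rate):
    """Mean number of disk failures per year across the cluster."""
    return nodes * annual_failure_rate


if __name__ == "__main__":
    assumed_afr = 0.03  # assumption, not a measured number
    for nodes in (1, 8, 32, 100, 500):
        p = p_any_disk_fails(nodes, assumed_afr)
        n = expected_failures_per_year(nodes, assumed_afr)
        print(f"{nodes:4d} nodes: P(any failure) {p:.2f}, expected {n:.1f}/yr")
```

with that assumed rate, a 100-node cluster is better than 95% likely to lose at least one disk in a year, and every such loss takes a node out until someone swaps hardware. a diskless node has one less part that can do that.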
don't worry about things like LSF or Mosix - it's trivial enough to set up a farm with static jobs for rendering. LSF/PBS/etc are vast overkill, and Mosix is pointless since you don't need migration or dynamic load-balancing.
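"static jobs" can be as simple as dealing frames out to nodes once, up front. a minimal python sketch; the node names are placeholders, and the round-robin choice is my assumption (neighbouring frames often cost about the same, so interleaving tends to even out the load without any scheduler).

```python
def static_frame_assignment(first_frame, last_frame, nodes):
    """Deal frames out round-robin so every node gets a fixed list up front."""
    if not nodes:
        raise ValueError("need at least one node")
    if last_frame < first_frame:
        raise ValueError("last_frame must not be before first_frame")
    assignment = {node: [] for node in nodes}
    for i, frame in enumerate(range(first_frame, last_frame + 1)):
        assignment[nodes[i % len(nodes)]].append(frame)
    return assignment


if __name__ == "__main__":
    # hypothetical node names
    plan = static_frame_assignment(1, 10, ["node01", "node02", "node03"])
    for node, frames in plan.items():
        print(node, frames)
```

each node then just walks its own list and runs whatever renderer you use on each frame. nothing migrates and nothing needs a queue manager.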
did someone pull a switcheroo on this article? when I look at the linked page, I see a very tongue-in-cheek article which seems designed to demonstrate quite possibly the worst possible comparison. no, wait, he could have used windows ME for one of the machines!
seriously, a 6-7x difference in performance is simply not credible. the only way you can achieve something like that is, for instance, to disable write caching on one disk and not the other. or perhaps not bother installing the chipset-specific driver for the ata interface. please, someone send me some disks, and I'll happily do a real, honest comparison!;)
WTF do you think "our" image *is*? Linux is about having the source; it's absolutely essential that vendors who release binary drivers should be publicly pissed upon. after all, that is precisely what they're doing to "us". releasing a binary driver is a blatant slap in the face of the whole open-source/linux/gnu/bsd movement.
science aims to understand some phenomenon; software science would try to figure out how to produce/test/maintain effective software. some of that happens in software engineering, but really, that's a bit of a misnomer, since software engineering is purely the application of what software science finds.
ultimately, software engineering is just technique to a software artisan (programmer). a decent painter will study vision, brush-handling, art history in order to gain technique. but technique does not make an artist/artisan.
none of this was news to Knuth. and thinking it through also explains why I always think of software engineering people as utter knobs;)
PCI-express is nothing other than Intel's favorite profit-making tool, version churn. there IS NO WIDESPREAD NEED FOR GREATER BANDWIDTH.
servers do not need PCI-express: they're already using PCI-X just fine, up to 1 GBps, which is plenty for any vaguely affordable IO devices. gosh, yes, it's tragic that 10G eth doesn't do so well on PCI-X, but when's the last time you saw a $40K 10GE nic at your local computer store?
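the arithmetic behind the "up to 1 GBps" and 10G figures, as a small python sketch. it assumes the 64-bit/133 MHz flavor of PCI-X and compares nominal peak rates only, ignoring protocol overhead on both sides.

```python
def parallel_bus_peak_mb_s(width_bits, clock_mhz):
    """Nominal peak of a parallel bus: bytes per transfer times MHz."""
    return (width_bits / 8) * clock_mhz


def serial_link_mb_s(gigabits_per_second):
    """Line rate of a network link converted to MB/s (one direction)."""
    return gigabits_per_second * 1000 / 8


if __name__ == "__main__":
    pci_x = parallel_bus_peak_mb_s(64, 133)  # assumed 64-bit, 133 MHz PCI-X
    gig_e = serial_link_mb_s(1)
    ten_gig_e = serial_link_mb_s(10)
    print(f"PCI-X 64/133 peak: {pci_x:.0f} MB/s")
    print(f"1G ethernet:       {gig_e:.0f} MB/s")
    print(f"10G ethernet:      {ten_gig_e:.0f} MB/s")
```

so a single 10GE port (1250 MB/s each way) is the one common case that outruns the bus (1064 MB/s peak), while gigabit at 125 MB/s fits many times over.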
AGP 8x is more than enough bandwidth for graphics - the obvious trend is towards greater intelligence and ram in the graphics card itself, which means you could probably put an ATI 9999-pro-xp on a fucking ISA bus and be happy.
the PC industry has talked itself into believing that there's an inevitability to the escalator that has allowed better hardware to trickle down to the desktop. introduce something at the high end and let the wonder of mass-production and human avarice bring it to even entry-level computers. why 150 MB/s SATA when disks are around 40, and there is by definition one disk per channel? why gigaflops on the graphics board of a business desktop that does everything in 2d?
switched fabrics are cool, though. the real figure of merit should be latency: PCI latency is the main concern for IO throughput today, and if it's not addressed in the next gen, it will have failed...
hah, I guess people really do need smilies.
Sam Greenblat, respected Linux kernel hacking authority figure, from the long-time advocate of Linux, CA...
GPL - if I show you mine, you show me yours.
what's more, Intel and AMD mean different things by their power ratings. AMD's tend to be worstcase and Intel's are typically typical.
personally, I can't imagine why anyone cares. laptops are for portability and speed is nearly irrelevant (my PIII/733 is plenty). if you really want a desktop in a quiet, tidy formfactor, why would you care about battery life?
security people seem to have failed to notice a major transition in the industry: single-person desktops and no-user-shell servers have come to dominate. yes, there are still a few places where login accounts are given to a potentially hostile userbase. those sites do need to worry about local-root exploits, and even, to some extent, local-DOS exploits.
but people who focus their lives on security seem to have a clear tendency to lose track of the actual importance of these problems. just look at the whiny grsecurity message - what the FUCK is a "moxa" and why should I give a damn about it?
all security is not equally important, just as perfect security makes any system completely useless. people who live and breathe this stuff need to realize that they are extreme outliers, often so far out that they're not even part of the community they claim to be protecting.
VT demanded a price of $300/node (INCLUDING switches!) for interconnect, and was laughed at by several vendors. they wound up with the set of vendors who were willing to subsidize as the cluster as a loss-leader. everyone in the community knows that the cluster was non-repeatable, and thus $5M is not the actual price.
the "review" was rather mystifying, since the tests prove mainly that if you compile some random code for the P5 (apparently all 32-bit, no use of SSE or
AMD-64 registers!), well, then the P4/3.6 runs it faster than a K8/2.4. wow! Occam would attribute this result to the difference in clocks, not chips.
you've got to pin down exactly what you're trying to do. for instance, rendering is embarassingly parallel, so if you have lots of frames, don't even think about parallelism within a frame. rendering is not particularly memory-intensive, so you might well get away without ECC, since the expected failure rate for non-ECC memory depends primarily on how much you have, and how hard you use it. if you're talking more than a handful of boxes, you definitely want to make them diskless/netboot. going SMP is probably a bad choice since it drives up the price (particularly if you're going for lean, diskless nodes.)
gigabit networking is almost certainly the right choice, since it's practically free. if you wind up using a lot of IO (complex models, simple frames, big images), then gigabit/diskless might end up being inadequate. but adding a disk per node will drastically hurt reliability for any large cluster! faster networks (myri/quadrics/IB) are a huge marginal cost unless your nodes are fairly high-end.
don't worry about things like LSF or Mosix - it's trivial enough to set up a farm with static jobs for rendering. LSF/PBS/etc are vast overkill, and Mosix buys you nothing, since you don't care about migration and don't need dynamic load-balancing.
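to make "static jobs" concrete, here's a minimal sketch of what I mean: deal the frames out to nodes up front and let each node grind through its own list. the function name, the hostnames and the round-robin choice are all my assumptions for illustration, not anything a particular renderer requires.

```python
def static_frame_split(first_frame, last_frame, nodes):
    """Deal frames out to nodes round-robin, once, before anything runs.

    No scheduler, no migration: each node just gets a fixed list of frames.
    """
    jobs = {node: [] for node in nodes}
    for i, frame in enumerate(range(first_frame, last_frame + 1)):
        jobs[nodes[i % len(nodes)]].append(frame)
    return jobs


if __name__ == "__main__":
    # hypothetical hostnames, purely for illustration
    nodes = ["node01", "node02", "node03"]
    for node, frames in static_frame_split(1, 10, nodes).items():
        print(node, frames)
```

round-robin rather than contiguous chunks is a judgment call: if a stretch of the sequence is expensive to render, it gets spread across nodes instead of landing on one of them.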
the 640K limit was a result of where IBM placed the video buffer. I somehow doubt they consulted Bill beforehand.
did someone pull a switcheroo on this article? when I look at the linked page, I see a very tongue-in-cheek article which seems designed to demonstrate quite possibly the worst possible comparison. no, wait, he could have used windows ME for one of the machines!
;)
seriously, a 6-7x difference in performance is simply not credible. the only way you can achieve something like that is, for instance, to disable write caching on one disk and not the other. or perhaps not bother installing the chipset-specific driver for the ata interface. please, someone send me some disks, and I'll happily do a real, honest comparison!
WTF do you think "our" image *is*? Linux is about having the source; it's absolutely essential that vendors who release binary drivers should be publicly pissed upon. after all, that is precisely what they're doing to "us". releasing a binary driver is a blatant slap in the face of the whole open-source/linux/gnu/bsd movement.
source or docs: vendors have exactly that choice.
science aims to understand some phenomenon; software science would try to figure out how to produce/test/maintain effective software. some of that happens in software engineering, but really, that's a bit of a misnomer, since software engineering is purely the application of what software science finds.
;)
ultimately, software engineering is just technique to a software artisan (programmer). a decent painter will study vision, brush-handling, art history in order to gain technique. but technique does not make an artist/artisan.
none of this was news to Knuth. and thinking it through also explains why I always think of software engineering people as utter knobs.
PCI-express is nothing other than Intel's favorite profit-making tool, version churn. there IS NO WIDESPREAD NEED FOR GREATER BANDWIDTH.
servers do not need PCI-express: they're already using PCI-X just fine, up to 1 GBps, which is plenty for any vaguely affordable IO devices. gosh, yes, it's tragic that 10G eth doesn't do so well on PCI-X, but when's the last time you saw a $40K 10GE nic at your local computer store?
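for the record, the arithmetic behind that, as a rough sketch. I'm assuming 64-bit PCI-X at 133 MHz and counting only theoretical peak rates with no protocol overhead; the function names are just mine.

```python
def pcix_peak_bytes_per_s(width_bits=64, clock_hz=133_333_333):
    """Theoretical peak of a parallel bus: one transfer of width_bits per clock."""
    return (width_bits // 8) * clock_hz


def ethernet_peak_bytes_per_s(bits_per_s=10_000_000_000):
    """Raw line rate of the link, converted from bits to bytes."""
    return bits_per_s // 8


if __name__ == "__main__":
    bus = pcix_peak_bytes_per_s()
    nic = ethernet_peak_bytes_per_s()
    print(f"PCI-X 64/133 peak: {bus / 1e6:.0f} MB/s")
    print(f"10G eth line rate: {nic / 1e6:.0f} MB/s")
    print("bus can feed the nic at line rate:", bus >= nic)
```

so the bus tops out a bit over 1 GB/s while the nic wants 1.25 GB/s in each direction, which is why 10G is the one case where PCI-X comes up short.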
AGP 8x is more than enough bandwidth for graphics - the obvious trend is towards greater intelligence and ram in the graphics card itself, which means you could probably put an ATI 9999-pro-xp on a fucking ISA bus and be happy.
the PC industry has talked itself into believing that there's an inevitability to the escalator that has allowed better hardware to trickle down to the desktop. introduce something at the high end and let the wonder of mass-production and human avarice bring it to even entry-level computers. why 150 MB/s SATA when disks are around 40, and there is by definition one disk per channel? why gigaflops on the graphics board of a business desktop that does everything in 2d?
switched fabrics are cool, though. the real figure of merit should be latency: PCI latency is the main concern for IO throughput today, and if it's not addressed in the next gen, it will have failed...