Slashdot Mirror


User: bertok

bertok's activity in the archive.

Stories: 0
Comments: 789

Comments · 789

  1. Re:Doesn't impress me... on First Look At Visual Studio 2010 Beta 1 · · Score: 1

    I think the technical term is 'mass storage device', which encompasses both hard drives and SSDs. Even the term "Solid State Drive" is a bit silly, as there is no "drive" (motor). It's really a Solid State Mass Storage Device, but I guess SSMSD just doesn't roll off the tongue. 8)

  2. Re:Yay! on First Look At Visual Studio 2010 Beta 1 · · Score: 1

    I tried ReSharper, but I think the VS add-in model wasn't really flexible enough to support the level of integration the IntelliJ guys were trying to achieve, so it was always flaky for me. It kept misbehaving, the graphics were sometimes glitchy, etc...

    I doubt it's IntelliJ's fault, all of their products are top-notch.

    I hope VS 2010 either improves massively in subsequent betas, or that it'll be flexible enough for a third-party plugin to fix all the gaps.

  3. Doesn't impress me... on First Look At Visual Studio 2010 Beta 1 · · Score: 1

    I gave VS 2010 a try on several machines.

    If you have an SSD, it's fine, if a little sluggish, especially the more complex designers like the Entity Framework stuff.

    On a hard drive, it's almost unusable: it just churns and churns and churns for what seems like hours. Previously, serious developers needed a big monitor and lots of RAM. Now it's a big monitor, lots of RAM, and an SSD.

    Still, the new WPF editor has promise, I like the subtle gradient shading and transparency effects. I think it's a beta issue, but I did notice that when you switch to a new editor window, the text is blurred for about a tenth of a second, then jumps into focus. This is probably a bug in the pixel-snapping they're using, but for a while I thought my eyes were going.

    I have seen demos that show that writing a plugin using C# and WPF is now incredibly trivial, so that's interesting, but won't be useful to 90% of the developers out there.

    However, most of the rest of the changes appear to be cosmetic.

    The new APIs in the 4.0 .NET framework don't impress me at all. The parallel library is a JOKE compared to the Oswego parallel programming library that Sun merged into Java. Microsoft has a bunch of webpages on how "multi core is the future", and "programmers have to start doing threads", but they won't do anything other than the absolute minimum to help the programmers with what is a VERY difficult task. Any decent parallel library is going to need a bag of tools like a "lock free queue", with various queuing styles such as FIFO, LIFO, priority, fixed-length, arbitrary length, and all of them have to have the same interface. Java has had this for YEARS now. Where's my abstract "Executor" interface with a bunch of standard implementations like "immediate, background-thread, thread-pool, asynchronous call, Dispatcher call, etc..."?
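
    As a rough illustration of the two ideas above (interchangeable queue disciplines behind one interface, and an abstract executor with swappable implementations), here is a minimal sketch in Python rather than C# or Java. The standard-library queues used here are lock-based, not lock-free, and `ImmediateExecutor`, `drain`, `fill_and_drain` and `run_squares` are hypothetical names made up for this example.

```python
import queue
from concurrent.futures import Executor, Future, ThreadPoolExecutor


class ImmediateExecutor(Executor):
    """Hypothetical executor that runs each task inline on the caller's thread."""

    def submit(self, fn, /, *args, **kwargs):
        future = Future()
        try:
            future.set_result(fn(*args, **kwargs))
        except Exception as exc:
            future.set_exception(exc)
        return future


def drain(q):
    """Pop everything from any queue.Queue flavour through the shared interface."""
    items = []
    while not q.empty():
        items.append(q.get())
    return items


def fill_and_drain(q):
    """Put the same items into whichever queue is passed in, then drain it."""
    for item in (3, 1, 2):
        q.put(item)
    return drain(q)


# Same calling code, three queuing styles.
fifo_order = fill_and_drain(queue.Queue())
lifo_order = fill_and_drain(queue.LifoQueue())
priority_order = fill_and_drain(queue.PriorityQueue())


def run_squares(executor):
    """Works with any Executor implementation; the caller picks the strategy."""
    with executor:
        return list(executor.map(lambda x: x * x, range(5)))


inline_result = run_squares(ImmediateExecutor())
pooled_result = run_squares(ThreadPoolExecutor(max_workers=4))
```

    The point of the sketch is that `fill_and_drain` and `run_squares` never need to know which queue or executor they were handed.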

    Microsoft talks a lot about 'functional' programming, but that's implemented in a half-assed way as well. For example, take the SortedDictionary<TKey, TValue> class. Looks useful, right? Now try to enumerate all of the values in a certain key range, without having to enumerate the entire container. You can't, because Microsoft forgot that they need to provide a 'range query' method that takes a comparator function, something like:

    IEnumerable<KeyValuePair<KeyT,ValueT>> RangeQuery(Func<KeyT, int> rangeTest) ...
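
    To make the request concrete, here is a minimal sketch of a range query over sorted keys, written in Python rather than C#. It takes explicit low and high bounds instead of a comparator function, which is a simplification, and `SortedMap` and `range_query` are hypothetical names, not part of any real library.

```python
from bisect import bisect_left, bisect_right


class SortedMap:
    """Hypothetical minimal sorted map: parallel lists of sorted keys and values."""

    def __init__(self, items):
        pairs = sorted(dict(items).items())
        self._keys = [k for k, _ in pairs]
        self._values = [v for _, v in pairs]

    def range_query(self, low, high):
        """Yield (key, value) pairs with low <= key <= high, in key order."""
        # Two binary searches find the slice; nothing outside it is visited.
        start = bisect_left(self._keys, low)
        stop = bisect_right(self._keys, high)
        for i in range(start, stop):
            yield self._keys[i], self._values[i]


ages = SortedMap({"alice": 31, "bob": 27, "carol": 45, "dave": 52})
in_range = list(ages.range_query("b", "d"))
```

    Here `in_range` contains only the entries for "bob" and "carol"; "dave" sorts after "d" and is never touched.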

    My biggest gripe is that the Entity Framework GUI designer is STILL a joke. It has a longer list of unsupported features than supported features. It's meant to be a time-saving feature, yet I have to write not one, but THREE mapping files, in XML, by hand, with no tab-complete. Can you FEEL the efficiency? I know I can! And it still doesn't support foreign keys with multiple columns. In general, it can't map to about 50% of the existing schemas out there because of some technical limitation or other.

  4. Re:Yay! on First Look At Visual Studio 2010 Beta 1 · · Score: 4, Interesting

    Screw it, I second this. Visual Studio has the best code completion implementation ever written. I can type lines like obj.GetSomething().Append(item) in about four keystrokes.

    Have you tried IntelliJ IDEA? It's Java-only, but I found its code completion to be many, MANY times better than VS. For one, instead of showing you every symbol that matches a prefix, it narrows down to the appropriate type.

    For example, in VS, if you create two methods with similar names ("int Test1()", "string Test2()"), and try to tab-complete something like "string foo = T", it'll show you Test1() first, even though Test2() is a far better match.

    Note that I use VS 2008 daily, and I've got 2010 installed as well. I just tested that in VS 2010, and it still shows you every single identifier available, including class names. I know that technically, the intention may have been to reference a static, but in practice, they could go to some lengths to select a "most likely" set and an "alphabetical" set, and show the most likely first, and only show the complete set if you try to complete twice, or something.
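
    The "most likely first, alphabetical second" ordering described above could look roughly like this. It is a toy sketch in Python, not how any real IDE works: `Symbol` and `rank_completions` are hypothetical, and "type-compatible" is reduced to an exact string match on the result type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    name: str
    result_type: str


def rank_completions(symbols, prefix, expected_type):
    """Return matching names: type-compatible ones first, then the rest, each alphabetical."""
    matches = [s for s in symbols if s.name.startswith(prefix)]
    likely = sorted(s.name for s in matches if s.result_type == expected_type)
    others = sorted(s.name for s in matches if s.result_type != expected_type)
    return likely + others


symbols = [
    Symbol("Test1", "int"),
    Symbol("Test2", "string"),
    Symbol("Other", "string"),
]
# Completing "string foo = T": Test2 returns a string, so it is listed before Test1.
suggestions = rank_completions(symbols, "T", "string")
```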

    It's a great IDE, but it could be a lot better. Microsoft really needs to get over their "not invented here" attitude, install a competing IDE at least ONCE, try it, and learn that other people sometimes do things better.

  5. Re:Sorry Cisco on Cisco Introduces Rackmount Servers · · Score: 1

    There's a term for it:

    Reassuringly expensive

  6. Re:Back to the Future? on When VMware Performance Fails, Try BSD Jails · · Score: 1

    Don't blame the technology for the idiots that implement it.

    If SAN cost is an issue, explain to management that 'reassuringly expensive' is not the only metric by which SAN hardware should be evaluated. ESX has native NFS and iSCSI support if you're really cheap.

    SAN failures will bring down physical hardware too, as will internal drive failures. A network switch failure - ditto. Sounds like you had a single point-of-failure where you shouldn't have had one, which was a design error. The failure of a single 'driver' shouldn't be able to take out an entire chassis, ever. The other stupid thing that I've seen is people making one enormous LUN for the ESX cluster. That's like one enormous basket with a metric ton of eggs in it.

    By the sound of it, you've forgotten to add in the hidden costs of managing hardware. Everything from "we don't want to migrate software 'x' off the Pentium III because we're afraid to touch it", to "oops, the server went pop, let's fuck around with tapes for two days while the business grinds to a halt". How long does it take you to build a server, from scratch, all the way up to production functionality? A day? Two? How much does that cost your company?

    What the 'management types' see are the very real advantages of virtualization, especially with ESX, which allows some amazing things.

    ESX allows you to upgrade the host hardware, replace the SAN, the SAN switches, the network switches, in fact, EVERY SINGLE DEVICE in your data center while every virtual machine keeps running, uninterrupted.

    I've seen a server with 470 days uptime running on a cluster with no hardware component older than 3 months. That's so right, it's almost wrong! 8)

    ESX doesn't stop there, you can do an upgrade of a shared-disk cluster live, with virtual machines powered on and running. You can go from, say, v3.0 to v3.5 without stopping anything. I've seen cluster upgrades where you could upgrade the cluster-shared filesystem itself (VMFS) while it was in use.

    That's just the start of it. You can build a cluster out of mixed hardware, running ancient virtualized NT4 boxes with horrendous custom applications and cluster everything with a "drag & drop" operation. Not just 'cluster' in the stupid Microsoft sense of "we reboot from a crash slightly faster", but real, shared-disk clustering, where VMs can balance between hosts on-the-fly. I've seen a screenshot showing a history of 160,000+ VM host-to-host migration operations on a production system! You can now get instant, zero-loss failover of a VM by ticking a checkbox. It'll automatically mirror the memory and replay all inputs on both ends in-sync, so if you lose a host, failover is instant, network connections are not interrupted, and no transactions are lost.

    So... how do you manage your data center?

  7. I've seen this before on When VMware Performance Fails, Try BSD Jails · · Score: 5, Interesting

    I've seen similar hideous slowdowns on ESX before for database workloads, and it's not VMware's fault.

    This kind of slowdown is almost always because of badly written chatty applications that use the database one-row-at-a-time, instead of simply executing a query.

    I once benchmarked a Microsoft reporting tool on bare metal compared to ESX, and it ran 3x slower on ESX. The fault was that it was reading a 10M row database one row at a time, and performing a table join in the client VB code instead of the server. I tried running the exact same query as a pure T-SQL join, and it was something like 1000x faster - except now the ESX box was only 5% slower instead of 3x slower.
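
    The difference between the two access patterns can be sketched with Python's built-in `sqlite3` module. The schema and data are made up for illustration, and SQLite runs in-process, so there is no network latency here; the sketch only shows how the number of statements issued differs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 7.5);
""")


def totals_row_at_a_time(conn):
    """Chatty version: the join is done in client code, one extra query per order row."""
    totals = {}
    statements = 1  # the initial SELECT over orders
    rows = conn.execute("SELECT customer_id, total FROM orders").fetchall()
    for customer_id, total in rows:
        name = conn.execute(
            "SELECT name FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()[0]
        statements += 1
        totals[name] = totals.get(name, 0.0) + total
    return totals, statements


def totals_single_query(conn):
    """Set-based version: the database does the join and aggregation in one statement."""
    rows = conn.execute("""
        SELECT c.name, SUM(o.total)
        FROM orders AS o JOIN customers AS c ON c.id = o.customer_id
        GROUP BY c.name
    """).fetchall()
    return dict(rows), 1


chatty_totals, chatty_statements = totals_row_at_a_time(conn)
joined_totals, joined_statements = totals_single_query(conn)
```

    Both functions return the same totals, but the first issues one statement per row plus one, while the second always issues exactly one.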

    The issue is that ESX has a small overhead to switching between VMs, and also a small overhead for establishing a TCP connection. The throughput is good, but it does add a few hundred microseconds of latency, all up. You get similar latency if your physical servers are in a datacenter environment and are separated by a couple of switches or a firewall. If you can't handle sub-millisecond latencies, it's time to revisit your application architecture!
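
    A back-of-the-envelope calculation shows why that small latency hurts chatty applications so much. The 300 microsecond figure is an assumed stand-in for "a few hundred microseconds", and one round trip per row is assumed for the row-at-a-time case.

```python
ROWS = 10_000_000          # the 10M row table from the benchmark described above
ADDED_LATENCY_S = 300e-6   # assumed value for "a few hundred microseconds"


def added_wall_time(round_trips, added_latency_s):
    """Extra wall-clock seconds caused purely by per-round-trip latency."""
    return round_trips * added_latency_s


row_at_a_time = added_wall_time(ROWS, ADDED_LATENCY_S)  # one round trip per row
single_query = added_wall_time(1, ADDED_LATENCY_S)      # one round trip in total
added_minutes = row_at_a_time / 60
```

    Under these assumptions the row-at-a-time approach pays roughly 50 minutes of pure latency, while the single query pays a fraction of a millisecond.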

  8. Re:Forgive my ignorance WAS:re: Garbage collector? on Java Gets New Garbage Collector, But Only If You Buy Support · · Score: 1

    That's a tad arrogant. Memory management is already non-trivial in small programs, but have you ever tried doing it in a program that is all of the following, all at once:

    - event driven
    - multithreaded
    - modular
    - developed by several people

    In environments like that, you end up spending an awful lot of time figuring out just precisely which thread gets to deallocate memory, except when there's an event handler, but not when.. you get the idea. I've seen C++ code that basically kept making copies of every string that was passed between modules, because there was no other truly safe way of passing strings around. In a string-heavy program like a web server, or a database-driven app, that can be hideously inefficient.

    This is why almost all (99%?) of web development is done with managed languages, because it hits all 4 of those problem points.

  9. Re:Theoretical != Real World speeds on SATA 3.0 Release Paves the Way To 6Gb/sec Devices · · Score: 1

    I got an OCZ VERTEX SSD drive, and it outperforms a SAN with a whole tray of disks for real-world apps, so you may want to check out the latest stuff. You have to be careful with benchmarks, they all measure throughput of heavily pipelined IOs, which almost never happens in real applications. A mechanical disk, even in a RAID, will struggle to outperform SSDs that can do reads in well under a millisecond.

  10. Re:"functional programming languages can beat C" on World's "Fastest" Small Web Server Released, Based On LISP · · Score: 1

    I would take you up on this challenge, but one of your original requirements is a problem:

    Caveat: It must be a small challenge involving a relatively simple task. I don't have a lot of time to waste on this.

    Of course it must be a relatively simple task! That's the problem with low-level but efficient programs. You trade off developer time for that efficiency. At some point, this can actually become counterproductive, especially for large, complex projects.

    For example, I used to code in C, C++, Java, Haskell, and these days I code in C#. I know full well that C/C++ are more efficient, give better control, and I know of dozens of techniques for writing ultra high performance real-time apps. I've even written a 3D game engine for a computer game, in C++, and it smoked.

    The thing is, I write all of my "high-performance" code in C# these days. I can't quite get the same performance as I could theoretically achieve in C++, but I can get excellent performance with a lot less developer time. It's trivial to write asynchronous code in C#, especially async IO, and farming things out to a thread-pool is one function call away. I don't have to worry about hideous problems like "when to free memory", which is hard enough normally, let alone in a multithreaded program.

    In a huge, complex project, if I can trivially make a segment of code multithreaded, it can get speedups of upwards of 10x on a modern server with a bunch of quad-core CPUs for a few minutes of my time. A modern JIT/VM language like Java or C# isn't 10x slower than C, so the overall result is a win.

  11. Define acronyms in the article! on Clean-Room RTMPE Spec Created From rtmpdump · · Score: 4, Informative

    Clearly, Slashdot editors are strategically shaved monkeys trained to click "accept" or "reject" in exchange for bananas.

    Define obscure acronyms in the articles!

    RTMP is the Real Time Messaging Protocol used by Adobe Flash

  12. A reduced level of blocking is needed on Adblock Plus Maker Proposes Change To Help Sites · · Score: 1

    I've been using Adblock for years, and I love it.

    Here's the thing though: I don't hate ads. I really don't. I don't even mind ads that much. I have even been known to click some in the past, especially if they are 'relevant to my interests', as it were.

    What I do hate is animating ads. (*) I hate them with a white-hot fury that is almost indescribable, and that is the reason I run Adblock, but that actually blocks everything, so can't really keep the non-obtrusive ads while blocking the annoying ones. For the love of all things holy, and good, why won't online publishers learn this basic rule:

    JUST BECAUSE YOU CAN, DOESN'T MEAN YOU SHOULD

    So what I propose is this: A "non-animating" ad whitelist system, whereby ads can be tagged as "static", and allowed through Adblock, if the user so chooses. That way, my favorite websites can continue to receive the financial support they need, and my eyes won't bleed trying to read the 3 lines of tiny text hidden somewhere inside a kaleidoscope of shifting and flashing colors.

    (*) large, multicolored, animating ads are actually an interesting case study of the tragedy of the commons. Everyone feels that they have to have the flashiest, most colorful ad around to compete with the other flashy colorful ads. This is only individually true. After a while, customers start ignoring all the ads, and actually become annoyed by them. It takes government intervention to stop the insanity, nobody will ever back down on their own. Just take a look at Tokyo at night. Some cities or municipalities DO enforce rules, and you end up with tasteful shopfronts that advertise their wares using mannequins and small, elegant signs with the company logo. The internet needs the same, desperately.

  13. Re:Ironic, really... on Pentagon Lost Billions, Pennies At a Time · · Score: 5, Interesting

    Also, don't forget that any major project is managed according to this chart. :-)

    Now the fun part... Try and find the boxes in the diagram where something functional actually gets built!

    Correct link: http://www.dau.mil/pubs/IDA/chart%20front.pdf ... and I have to say: wow.

    This is why military projects start at $billions and go up from there.

  14. Re:Movies are so last century on Cameron's Avatar a 3D Drug Trip? · · Score: 1

    "Nowadays you get that fast of an fps, but things just don't have that effect."

    It's called nostalgia, and has nothing to do with modern game engines or video cards, which can cheerfully output more frames per second than your monitor can display, and do it to much higher precision than the Quake engines did, which actually interpolated accurately only every 16 pixels for speed.

    I had fond memories of Ultima Underworld, which was an amazing 3D game for the time, with relatively complex gameplay. I recently tried it in an emulator, and it was a bit of a disappointment - movement wasn't smooth, the 3D part of the screen was tiny (for speed), and the boring 2D UI took up the rest.

  15. How about JIT in the Kernel? on Europe Funds Secure Operating System Research · · Score: 1

    I was just thinking recently about Microsoft's Singularity research operating system written in C#, which is cute, but somewhat useless in the real world. One big advantage of statically verifiable byte-code languages like C# in operating systems, though, is security, because you can ensure a block of code is secure once and then run it at full speed without further access checks. That's almost impossible with generic C or assembler, but tractable with bytecode-based languages like Java or C#.

    While a *pure* C# operating system is a bit nuts, why not allow a *hybrid* operating system? Simply create a variant of the Java or C# runtime that can execute inside the Kernel at Ring 0, and only allow verifiably safe code to run. You get the benefit of a high level garbage collected language with all the safety checks that are normally enforced by the user-space/kernel-space separation, but with none of the overheads.

    This would have been impossible some years ago, because most operating system kernels weren't properly preemptible, and Windows on 32-bit had all sorts of pre-allocated buffer size limits, but all of that has been solved or has gone away with 64-bit.

    I can't think of any reason this wouldn't work. Keep in mind that the typical device driver might be written by some minimum-wage code jockey in Taiwan or China who's got a "Kernel Programming for Dummies" book on his desk. I'd rather have him working in a language that can be verified safe, instead of a language that comes with a whole array of guns to shoot all of your feet off.

  16. Re:You should be looking at random I/O speeds on Intel Responds To X25-M Fragmentation Issue · · Score: 3, Informative

    I have an OCZ VERTEX 250GB SSD, and it blows mechanical drives out of the water for random IO.

    I noticed several reviews that indicated that the Samsung drives do have issues with random IO, but the OCZ drives appear to have no such problems. Yes, you lose performance with random IOs vs sequential IOs, but nowhere near as much as first-gen SSDs. I've seen 6000 random IOPS on a single drive, which is unattainable on anything short of a whole tray of disks in a SAN.

    I'm not pulling the SSD vs SAN comparison out of my ass, I tested my laptop with the SSD drive head-to-head with the same ~60GB database against two production servers, one with a 20-something spindle SAN volume (shared), and the other with a 3-drive 15K RPM SCSI RAID (dedicated). It won against both for all cases where IO was a significant bottleneck in the query. Obviously, my laptop lost out against the 8-CPU server with 32GB of memory for 'small' queries, but for un-cached data sets, it was usually faster.

  17. My father's small business had a similar problem on Building a Searchable Literature Archive With Keywords? · · Score: 1

    My father runs a small business and has to track a bunch of paperwork for each client, so I got him a cheap LED lit flatbed scanner, but like everyone else, he discovered that it was too slow to manually scan in each page, even if the scanner itself was quite fast.

    He eventually figured out that the fastest scanning technique is not to use a scanner at all, but a digital camera. He made a rig with a marked out area the size of an A4 sheet of paper, and then he attached a camera mount so that the camera would be facing down, pre-aligned to photograph the entire sheet. I've seen it in action, he can easily do a page per second: he just places the next page on the platform with one hand, and presses the shutter button with the other hand.

    The resolution is more than good enough for OCR, and most cameras have better depth-of-field than scanners, so more of the page is in focus, even near bindings and staples.

  18. Re:they already cost less per gig than some SAS dr on MS Researchers Call Moving Server Storage To SSDs a Bad Idea · · Score: 1

    I just got myself an OCZ VERTEX 250GB drive, and I've seen SQL Server do ~6000 random IOPS on it, sustained, in a practical case (complex UPDATE statement for a huge table). This is on a laptop, mind you -- that UPDATE command was CPU limited!

    I ran it head-to-head with the same ~60GB database against two production servers, one with a 20-something spindle SAN volume, and the other with a 3-drive 15K RPM SCSI RAID. It won against both for all cases where IO was a significant bottleneck in the query. Obviously, my laptop lost out against the 8-CPU server with 32GB of memory for 'small' queries, but for un-cached data sets, it was usually faster.

    I've seen a number of enterprise environments where SSDs make perfect sense:

    * System disks for Citrix XenApp servers - if they're hosting 100+ users on a modern 64-bit Nehalem platform, disk I/O will be the bottleneck.
    * L2ARC cache disks for ZFS volumes - Solaris now lets you use a "fast" disk as a kind of cache for "slow" disks. Whack in a few SSDs in a mirror or a stripe if you're brave, and you could see faster speed than even SAN storage.
    * Data disks for business intelligence or integration servers. Both tend to do horrible things to disks, like run multiple parallel read and write streams at the same time.

  19. Re:Either trivial or bullshit on Coders, Your Days Are Numbered · · Score: 3, Insightful

    And for the record: a few years ago there was a study published in Communications of the ACM that showed while pair programming is more efficient than a single solitary programmer, it is not as efficient as two programmers with two keyboards. FYI.

    I once tried pair programming for a few days, and I found the exact same thing. Yes, together, we were more efficient. I could catch bugs the other guy missed while he was typing, which saved him some debugging time. However, it wasn't a huge improvement, I'd say something like 50%, but it took 100% more manpower, so it was a net loss.

    What few people realize though is that pair programming is boring for the person without the keyboard. It's mind-numbingly boring. It's like watching someone do mathematics homework for a whole day. I enjoy programming, but watching other people code is a lot less fun.

  20. Re:Joking aside... on Reliability of Computer Memory? · · Score: 1

    The problem is that in some ways the issue is getting worse over time because of increased volumes of data, but error rates are largely the same per bit.

    To give you an idea, digital equipment manufacturers will quote "bit error rates". I don't just mean DRAM, but hard disks, flash storage, network equipment, CPUs, motherboard chipsets, everything has an error rate. Using ECC memory will reduce the rate, but certainly won't eliminate it, as typical ECC protection is rather weak. On some disk equipment, it's as bad as 1 per 10^14 bits or so. That may sound like a huge number of bits, but it's only 1 per 100 trillion, or 1 flipped bit per 12.5 terabytes. A modern hard disk is already over 1 terabyte, so that means there's a roughly 1 in 10 chance that at least one bit it's storing is wrong! I've seen database servers that perform petabytes of I/O in just a few weeks or months, which implies that systems like that would be processing at least a few bits of corrupt data a year.
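
    The arithmetic behind those figures can be checked with a few lines of Python. This is a sketch under simplifying assumptions: bit errors are treated as independent, decimal terabytes and petabytes are used, and the chance of at least one error is estimated with a Poisson approximation.

```python
import math

BIT_ERROR_RATE = 1e-14  # one flipped bit per 10^14 bits, the figure quoted above
TB = 10**12             # decimal terabyte, in bytes
PB = 10**15             # decimal petabyte, in bytes


def bytes_per_expected_error(ber):
    """How many bytes pass, on average, per flipped bit."""
    return 1 / ber / 8


def expected_errors(num_bytes, ber):
    """Expected number of flipped bits when num_bytes are read or stored."""
    return num_bytes * 8 * ber


def prob_at_least_one_error(num_bytes, ber):
    """Poisson approximation, assuming independent bit errors."""
    return 1 - math.exp(-expected_errors(num_bytes, ber))


tb_per_error = bytes_per_expected_error(BIT_ERROR_RATE) / TB  # about 12.5 TB
p_one_tb = prob_at_least_one_error(1 * TB, BIT_ERROR_RATE)    # about 0.077
errors_per_pb = expected_errors(1 * PB, BIT_ERROR_RATE)       # about 80
```

    So a single pass over a 1 TB disk has a chance of about 1 in 13 of hitting a bad bit, which is in line with the "roughly 1 in 10" estimate, and each petabyte of I/O would be expected to carry on the order of 80 flipped bits at that rate.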

    Of course, enterprise-grade equipment tends to be more robust than consumer grade gear but the issue of flipped bits should still be in the back of the mind of every system administrator and developer. Strong checksums and redundant storage are your friends - the two main features of the ZFS filesystem in Solaris. Linux is only just now getting there with BTRFS, while Windows is way behind - it generally assumes that the underlying block storage is reliable.

  21. Re:Joking aside... on Reliability of Computer Memory? · · Score: 5, Insightful

    As for ECC in memory... The problem is that ECC carries a heavy performance hit on write. If you only want to write 1 byte, you still have to read in the whole QWord, change the byte, and write it back to get the ECC to recalculate correctly. It is because of that performance hit that ECC was deprecated. The problem goes away to a large extent if your cache is write-back rather than write-through; though there will be still a significant number of cases where you have to write a set of bytes that has not yet been read into cache and does not comprise a whole ECC word.

    AFAIK, on modern computer systems all memory is always written in chunks larger than a byte. I seriously doubt there's any system out there that can perform single-bit writes either in the instruction set, or physically down the bus. ECC is most certainly not "deprecated" -- all standard server memory is always ECC, I've certainly never seen anything else in practice from any major vendor.

    The real issue is that ECC costs a little bit more than standard memory, including additional traces and logic in the motherboard and memory controller. The differential cost of the memory is some fixed percentage (it needs extra storage for the check bits), but the additional cost in the motherboard is some tiny fixed $ amount. Apparently for most desktop motherboard and memory controllers that few $ extra is far too much, so consumers don't really have a choice. Even if you want to pay the premium for ECC memory, you can't plug it into your desktop, because virtually none of them support it. This results in a situation where the "next step up" is a server class system, which is usually at least 2x the cost of the equivalent speed desktop part for reasons unrelated to the memory controller. Also, because no desktop manufacturers are buying ECC memory in bulk, it's a "rare" part, so instead of, say, 20% more expensive, it's 150% more expensive.

    I've asked around for ECC motherboards before, and the answer I got was: "ECC memory is too expensive for end-users, it's an 'enterprise' part, that's why we don't support it." - Of course, it's an expensive 'enterprise' part BECAUSE the desktop manufacturers don't support it. If they did, it'd be only 20% more expensive. This is the kind of circular marketing logic that makes my brain hurt.

  22. Re:Two common situations that defeat copy and past on Companies Waste $2.8 Billion Per Year Powering Unused PCs · · Score: 1

    I'm reasonably sure I can recognize a PDF. 8)

  23. Re:Vast underestimation on Companies Waste $2.8 Billion Per Year Powering Unused PCs · · Score: 1

    At least 50% of office workers, even in IT, don't use cut-and-paste to move bits of text from one place to another. The number of times I've seen someone oh-so-slowly type in a piece of data they have in an email right in front of their face just stuns me. And they make typos. Lots of them. Sometimes they correct the typos (slowly), sometimes they don't.

    Even if you're too lazy to remember the keyboard shortcuts, there are at least two different ways to copy with just the mouse in most Windows applications: select & drag OR select, right-click->copy, right-click->paste.

    In the vast majority of business office settings, every employee has been sitting in front of a computer for about a decade now and yet many have failed to grasp even that most basic skill.

    To say that one could "double" productivity is a massive understatement.

    This stuff should have been on equal footing with "Mathematics" and "English" in school curricula for two decades now. On par with learning to read & write. How can one claim to even have an education without serious computer literacy?

  24. Re:I've never understood the UNIX world's fascinat on "Slacker DBs" vs. Old-Guard DBs · · Score: 1

    That doesn't seem like such a good general purpose solution. For a trivial application, it might work, especially if you place an enormous amount of logic into the application code, but I can foresee problems even then.

    How do you deal with disk space wasted by fragmentation? If the "record ID" is essentially an offset, you can't defragment, especially if you want to do it live. That's not even mentioning internal fragmentation - most disk caches store large blocks (64KB or larger), so you're wasting, on average, 50% of your caching capacity because of the mismatch between block and record sizes.

    What happens when you've pre-allocated, say, 1000 small blocks and 1000 large blocks, and it turns out you actually need 1001 large blocks? You may have 30% free space left in the small block section, but you can't use it! Creating a new file sounds expensive (has to be filled with a pattern!), whereas creating new files of arbitrary size is essentially constant time in most modern databases (they don't even ask the OS to fill them with a 0 pattern).

    This also sounds like it can't handle out-of-order writes. This may be less of a problem now with battery-backed RAM caches on disk controllers, but it would have sucked a decade ago. Without an intent log, you have to perform every write in-order or risk corruption.

    Actually, what happens if the program accidentally loses a block key? Would it... leak storage space? How would you reverse that if all the blocks are identical looking binary blobs?

    Not to mention that you get the joy of re-inventing the wheel any time you want to do anything other than "retrieve by key". If you want to locate, say, a passenger by name across ALL flights in a day, you'd probably have to scan all records or write your own index or something.

    But if you were really keen on using such a trivial system, implementing it wouldn't be that hard in any modern programming language. A few thousand lines of Java or C# ought to do it.
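
    For a sense of scale, here is a much smaller sketch in Python of the kind of store being discussed: fixed-size blocks where the record ID is just the slot index. `FixedRecordStore`, the 64-byte block size and the sample records are all made up for illustration; there is no intent log, no crash safety and no index.

```python
import io
import struct

RECORD_SIZE = 64              # assumed fixed block size in bytes
HEADER = struct.Struct("<H")  # 2-byte payload length prefix


class FixedRecordStore:
    """Hypothetical store where a record ID is the slot index (offset / RECORD_SIZE)."""

    def __init__(self, backing=None):
        self._file = backing if backing is not None else io.BytesIO()
        self._free = []  # slot indexes available for reuse

    def _slot_count(self):
        return self._file.seek(0, io.SEEK_END) // RECORD_SIZE

    def put(self, payload: bytes) -> int:
        if len(payload) > RECORD_SIZE - HEADER.size:
            raise ValueError("payload does not fit in one fixed-size block")
        slot = self._free.pop() if self._free else self._slot_count()
        block = HEADER.pack(len(payload)) + payload
        self._file.seek(slot * RECORD_SIZE)
        # Short payloads still occupy a full block: internal fragmentation.
        self._file.write(block.ljust(RECORD_SIZE, b"\0"))
        return slot

    def get(self, slot: int) -> bytes:
        self._file.seek(slot * RECORD_SIZE)
        block = self._file.read(RECORD_SIZE)
        (length,) = HEADER.unpack_from(block)
        return block[HEADER.size:HEADER.size + length]

    def delete(self, slot: int) -> None:
        self._free.append(slot)

    def scan(self, predicate):
        """Anything other than retrieve-by-key means reading every live slot."""
        free = set(self._free)
        return [s for s in range(self._slot_count())
                if s not in free and predicate(self.get(s))]


store = FixedRecordStore()
first = store.put(b"flight=1;name=Smith")
second = store.put(b"flight=2;name=Jones")
store.delete(first)
third = store.put(b"flight=3;name=Smith")  # reuses the freed slot
smiths = store.scan(lambda record: b"name=Smith" in record)
```

    Note that `third` ends up with the same ID as the deleted `first`, so a caller still holding the old ID silently reads a different record, and the by-name lookup has to touch every slot.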

  25. Re:Adapt on Windows and Linux Not Well Prepared For Multicore Chips · · Score: 2, Insightful

    I think the consensus was that making compilers emit efficient VLIW for a typical procedural language such as C is very hard. Intel spent many millions on compiler research, and it took them years to get anywhere. I heard of 40% improvements in the first year or two, which implies that they were very far from ideal when they started.

    To achieve automatic parallelism, we need a different architecture to classic "x86 style" procedural assembly. Programming languages have to change too, the current crop are too close to the metal. I suspect that in the future, languages will rely on intermediate byte-code more, and become ever more functional as designers realize that functional code is easy to transform due to a lack of side-effects.

    I've heard of automatically parallelized versions of some pure functional languages that can execute almost any code on almost any number of CPUs without the programmer ever having to write a single synchronization instruction! For example, Microsoft is working on "parallel LINQ" in C# 4.0, which is essentially a small island of parallelizable functional code that can be embedded in a procedural language.
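
    The underlying idea, that side-effect-free code can be spread over any number of workers without the caller writing any synchronization, can be sketched in Python. This is not PLINQ, and `is_prime` and `count_primes` are hypothetical names; also note that CPython threads will not actually speed up CPU-bound work like this, so the sketch shows only the structure, not the performance.

```python
from concurrent.futures import ThreadPoolExecutor


def is_prime(n):
    """Pure function: the result depends only on n and nothing outside is modified."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))


def count_primes(numbers, workers=1):
    """Map a pure function over the inputs; only the worker count changes."""
    # No locks or shared mutable state appear in the calling code.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(is_prime, numbers))


sequential = count_primes(range(1000), workers=1)
parallel = count_primes(range(1000), workers=8)
```

    Because `is_prime` has no side effects, the result is the same regardless of how many workers run it or in what order the inputs are processed.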