When Mistakes Improve Performance

Impossible design by ThatMegathronDude · 2010-05-29 10:24 · Score: 4, Interesting

If the processor goofs up the instructions that it's supposed to execute, how does it recover gracefully?

Re:Impossible design by Anonymous Coward · 2010-05-29 10:36 · Score: 3, Funny

The Indian-developed software will itself fuck up in a way that negates whatever fuck up just happened with the CPU. In the end, it all balances out, and the computation is correct.
Re:Impossible design by TheThiefMaster · 2010-05-29 10:42 · Score: 4, Insightful

Especially a JMP (GOTO) or CALL. If the instruction is JMP 0x04203733 and a transmission error makes it do JMP 0x00203733 instead, causing it to attempt to execute data or an unallocated memory page, how the hell can it recover from that? It could be even worse if the JMP instruction is changed only subtly, jumping only a few bytes too far or too close could land you the wrong side of an important instruction that throws off the entire rest of the program. All you could do is to detect the error/crash and restart from the beginning and hope. What if the error was in your error detection code? Do you have to check the result of your error detection for errors too?
Re:Impossible design by Anonymous Coward · 2010-05-29 10:44 · Score: 0

I belive Windows is american.
Re:Impossible design by Anonymous Coward · 2010-05-29 10:44 · Score: 3, Interesting

That's a good point. You accept mistakes in the data, but you don't want the operation to change from add (where, when doing large averages, plus or minus a few hundred won't matter) to multiply or divide.
But once you have the opcode separated from the data, you can mess with the former. E.g. not care when something is a race condition because that happening every 1000th operation doesn't matter too much.
And as this is a source of noise, you just got free random data!
Still, this looks more like something for scientific computing, and when they build the next big one that can easily be factored in. For home computing, not so much, 99% of the time they wait for user input anyhow.
Re:Impossible design by mederbil · 2010-05-29 10:50 · Score: 1

It is similar to quantum computing. Quantum computing can be insanely fast, but it often makes inaccurate calculations.
It's mainly about quantity, not quality. A possible use for it is computational knowledge engines, like WolframAlpha. It would be inexpensive for computation servers, but only really useful if it were at least 98% accurate.
Re:Impossible design by koreaman · 2010-05-29 10:50 · Score: 1

And Windows is rock solid reliable compared to the software made abroad (I won't name specific countries) that I have to deal with at work. I suspect many people feel the same way.

--
Le français vous intéresse?
Re:Impossible design by Turzyx · 2010-05-29 10:54 · Score: 2, Interesting

Or worse, it could jump to itself repeatedly, thereby creating an HCF situation.
Re:Impossible design by Loupis · 2010-05-29 10:55 · Score: 1

So it's like double negatives. It just cancels out?
Re:Impossible design by WrongSizeGlass · 2010-05-29 11:07 · Score: 1

Basically he's saying we should trade power consumption for accuracy? Hmmm ... I vote 'No'.
Re:Impossible design by AmiMoJo · 2010-05-29 11:15 · Score: 3, Informative

The first thing to say is that we are not talking about general purpose CPU instructions but rather the highly repetitive arithmetic processing that is needed for things like video decoding or 3D geometry processing.
The CPU can detect when some types of error occur. It's a bit like ECC RAM where one or two bit errors can be noticed and corrected. It can also check for things like invalid op-codes, jumps to invalid or non-code memory and the like. If a CPU were to have two identical ALUs it could compare results.
Software can also look for errors in processed data. Things like checksums and estimation can be used.
In fact GPUs already do this to some extent. AMD and nVidia's workstation cards are the same as their gaming cards, the only difference being that the workstation ones are certified to produce 100% accurate output. If a gaming card colours a pixel wrong every now and then it's no big deal and the player probably won't even notice. For CAD and other high end applications the cards have to be correct all the time.

--
const int one = 65536; (Silvermoon, Texture.cs)
SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
Re:Impossible design by Chowderbags · 2010-05-29 11:19 · Score: 3, Insightful

Moreover, if the processor goofs on the check, how will the program know? Do we run every operation 3 times and take the majority vote (then we've cut down to 1/3rd of the effective power)? Even if we were to take the 1% error rate, given that each core of CPUs right now can run billions of instructions per second, this CPU will fail to check correctly every second (even checking, rechecking, and checking again every single operation). And what about memory operations? Can we accept errors in a load or store function? If so, we can't in practice trust our software to do what we tell it. (Change a bit on load and you could do damn near anything, from adding the wrong number, to saying an if statement is true when it should be false, to not even running the right fricken instruction.)

There's a damn good reason why we want our processors to be rock solid. If they don't work right, we can't trust anything they output.
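The run-it-three-times-and-vote scheme asked about above can be sketched in a few lines. This is only an illustration under made-up assumptions: `unreliable_add`, its single-bit-flip fault model, and the choice of flipping one of the low 8 bits are all hypothetical, and the voter itself is assumed to be perfectly reliable.

```python
import random
from collections import Counter


def unreliable_add(a, b, error_rate, rng):
    """Add two ints, but flip one random low bit with probability error_rate."""
    result = a + b
    if rng.random() < error_rate:
        result ^= 1 << rng.randrange(8)
    return result


def voted_add(a, b, error_rate, rng):
    """Run the addition three times and return the majority answer.

    Returns None when all three runs disagree (no majority exists).
    """
    results = [unreliable_add(a, b, error_rate, rng) for _ in range(3)]
    value, count = Counter(results).most_common(1)[0]
    return value if count >= 2 else None


def analytic_failure(p):
    """Probability that at least two of three independent runs are faulty."""
    return 3 * p * p * (1 - p) + p ** 3


def failure_rate(trials, error_rate, seed=0):
    """Measured fraction of voted additions that did not yield the right sum."""
    rng = random.Random(seed)
    wrong = sum(voted_add(2, 3, error_rate, rng) != 5 for _ in range(trials))
    return wrong / trials
```

Under this model a 1% per-operation error rate leaves about 3 voted operations in 10,000 without a correct majority (`analytic_failure(0.01)` is roughly 0.0003). At a billion voted operations per second that is still on the order of hundreds of thousands of bad votes every second, which is the point the comment is making.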
Re:Impossible design by Anonymous Coward · 2010-05-29 11:29 · Score: 1, Informative

Reduced power OR equal power at a faster clock rate. Many times speed is preferred to accuracy when perfection isn't necessary. Video and audio are good examples already doing this (e.g. dropped frames on slow connections).
Re:Impossible design by NotQuiteReal · 2010-05-29 11:55 · Score: 1

Never mind, that ain't no don't make no sense no how.

--
This issue is a bit more complicated than you think.
Re:Impossible design by cheese_wallet · 2010-05-29 12:36 · Score: 1

Especially a JMP (GOTO) or CALL. If the instruction is JMP 0x04203733 and a transmission error makes it do JMP 0x00203733 instead, causing it to attempt to execute data or an unallocated memory page, how the hell can it recover from that? It could be even worse if the JMP instruction is changed only subtly, jumping only a few bytes too far or too close could land you the wrong side of an important instruction that throws off the entire rest of the program. All you could do is to detect the error/crash and restart from the beginning and hope. What if the error was in your error detection code? Do you have to check the result of your error detection for errors too?
space the instructions further apart so that one or two bit flips won't map to another instruction.
Re:Impossible design by somersault · 2010-05-29 12:42 · Score: 1

Damn you, Canada!

--
which is totally what she said
Re:Impossible design by Interoperable · 2010-05-29 12:46 · Score: 5, Insightful

The research is targeted specifically at dedicated audio/video encoding/decoding blocks within the processors of mobile devices and similar error-tolerant applications. The journalist just didn't mention the fact that the idea isn't to expose the entire system to fault-prone components. When considered in the light that the most power-sensitive mainstream devices (cell-phones) spend most of their time doing these error-tolerant tasks, the research becomes quite interesting. They claim to have demonstrated the effectiveness of the technique to encode an h.264 video.

--
So if this is the future...where's my jet pack?
Re:Impossible design by pipatron · 2010-05-29 13:04 · Score: 4, Insightful

Not very insightful. You seem to say that a CPU today is error-free, and if this is true, the part of the new CPU that does the checks could also be made error-free so there's no problem.
Well, they aren't rock-solid today either, so you can not trust their output even today. It's just not very likely that there will be a mistake. This is why mainframes execute a lot of instructions at least twice and decide on the fly if something went wrong. This idea is just an extension of that.

--
c++; /* this makes c bigger but returns the old value */
Re:Impossible design by Anonymous Coward · 2010-05-29 13:23 · Score: 0

Hey, we only make/made hardware (AdLib, Gravis, ATI).
Re:Impossible design by Yvan256 · 2010-05-29 13:24 · Score: 1

You can make Hot Cold Fusion with a CPU?
Re:Impossible design by Yvan256 · 2010-05-29 13:25 · Score: 1

I can see decoding not being too critical depending on the errors, but mistakes in the encoding spell trouble.
Re:Impossible design by maxume · 2010-05-29 14:00 · Score: 2, Funny

Encomistakesding.
Or maybe truoble.

--
Nerd rage is the funniest rage.
Re:Impossible design by Anonymous Coward · 2010-05-29 14:07 · Score: 0

I wish we cowards had mod points ....
mod thyself funny
Re:Impossible design by Belial6 · 2010-05-29 14:28 · Score: 1

Correct, another method would be to take the double entry accounting approach. You run the command in two different ways that should provide the same answer if correct, but different answers if wrong. You would only need a very small part of the chip to be really reliable as an error checker. I do think that this would be better handled by hardware than software, but the premise is not unreasonable.

The real question is whether chips could be sped up enough to counteract the slow down introduced by error checking CPUs. I suppose that this could be a legitimate use of dual processing. Run two CPUs in parallel that produce their results via different methods, and then have a reliable watchdog that causes them to rerun the commands if they don't match.
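A minimal sketch of that double-entry idea, assuming a hypothetical fault model: the same sum is computed by two different methods, and a watchdog (assumed reliable) reruns both until they agree. The names `flaky`, `sum_forward`, `sum_pairwise` and `checked_run` are invented for this example.

```python
import random


def flaky(fn, error_rate, rng):
    """Wrap fn so its integer result is occasionally corrupted by one bit flip."""
    def wrapped(*args):
        result = fn(*args)
        if rng.random() < error_rate:
            result ^= 1 << rng.randrange(16)
        return result
    return wrapped


def sum_forward(values):
    """Method A: plain left-to-right accumulation."""
    total = 0
    for v in values:
        total += v
    return total


def sum_pairwise(values):
    """Method B: recursive pairwise summation, a different path to the same answer."""
    if len(values) == 0:
        return 0
    if len(values) == 1:
        return values[0]
    mid = len(values) // 2
    return sum_pairwise(values[:mid]) + sum_pairwise(values[mid:])


def checked_run(method_a, method_b, args, max_retries=10):
    """Watchdog: rerun both methods until their results match.

    Returns (result, attempts). Agreement is taken as success, so two
    identical faults in the same attempt would slip through.
    """
    for attempt in range(1, max_retries + 1):
        a = method_a(*args)
        b = method_b(*args)
        if a == b:
            return a, attempt
    raise RuntimeError("methods never agreed")
```

For example, `checked_run(flaky(sum_forward, 0.2, rng), flaky(sum_pairwise, 0.2, rng), ([1, 2, 3, 4],))` with `rng = random.Random()` returns the agreed sum plus the number of attempts it took, which is the "errors just make it slower" trade-off in miniature.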
Re:Impossible design by Chowderbags · 2010-05-29 14:32 · Score: 1

There's a huge gap between current chips correcting errors in general long before they propagate up to userland and what this article is talking about. This researcher is talking about creating more "robust" software "so an error simply causes the execution of instructions to take longer". If they're talking about the microcode on the CPU itself (I doubt it), then this is nothing new. If they're talking about code in every end developer's program, then they fall into exactly the problem I describe (along with trying to convince the software development community that it's a great idea to add more complexity to programs just to get around design flaws in CPUs, given that most commercial software companies can't be bothered to fix their own bugs).
Re:Impossible design by Anonymous Coward · 2010-05-29 14:34 · Score: 0

I belive Windows is american.
Ah, I see that English is not your first language. Let me help you out.
First of all, the verb is "believe".
Second, it's "American". Make sure you capitalize the 'A', so that you show proper respect for America and all Americans.
Third, Windows is an operating system. It is not a person who is an American.
What you probably meant to write was, "All software written in India is shit."
Re:Impossible design by demerzeleto · 2010-05-29 15:07 · Score: 3, Interesting

There's a damn good reason why we want our processors to be rock solid. If they don't work right, we can't trust anything they output.
Have you ever tried transferring large files over a 100 Mbps Ethernet link? That's right, billions of bytes over a noisy, unreliable wired link. And how often have you seen files corrupted? I never have. The link runs along extremely reliably (BER of 10^-9 I think) with as little as 12 Mbps out of the 100 Mbps spent on error checking and recovery.

Same case here. I'd expect the signal-to-noise ratio on the connects within CPUs (when the voltage is cut by say 25%) to be similar, if not better, than Ethernet links. So the CPU could probably get along with less error checking and recovery. Or, if you choose applications (like video decoding or graphics rendering) that have no problems with a few bad bits here and there, you could manage with almost no ECC at all.

If you were to plot Error Rates vs CPU power, I'd say most modern CPUs lie at the far end of the region of diminishing returns. There's a gold mine to be reaped by moving backwards on the curve.
Re:Impossible design by Anonymous Coward · 2010-05-29 15:16 · Score: 0

The cold part is impossible, but I'm sure you could "fuse" an old Athlon
Re:Impossible design by DavidRawling · 2010-05-29 15:32 · Score: 2, Informative

space the instructions further apart so that one or two bit flips won't map to another instruction.
Yeah - I think you left out the thinking bit before your comment.
Sure, a single bit flip in the least significant bit only moves you 1 byte forward or backward in RAM. But in the most significant bit in a 32 bit CPU it moves you 2GB away (let alone the roughly 8.6 billion GB in a 64 bit CPU, if my mental maths is correct).
Just how far apart do you want the instructions?
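The arithmetic is easy to check. The sketch below assumes byte-addressed memory and a plain binary address; `flip_bit` and `displacement` are names invented for the example. It also shows that the JMP example from earlier in the thread (0x04203733 becoming 0x00203733) is a flip of bit 26, a 64 MiB jump.

```python
def flip_bit(address, bit):
    """Return the address with one bit inverted."""
    return address ^ (1 << bit)


def displacement(address, bit):
    """How many bytes a single flip of the given bit moves the target."""
    return abs(flip_bit(address, bit) - address)


if __name__ == "__main__":
    target = 0x04203733
    for bit in (0, 26, 31, 63):
        moved = displacement(target, bit)
        print(f"bit {bit:2d}: moves {moved} bytes ({moved / 2**30:.3g} GiB)")
```

Flipping bit k always moves the target by exactly 2^k bytes, so the top bit of a 32-bit address is 2 GiB away and the top bit of a 64-bit address is 2^63 bytes, about 8.6 billion GiB.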
Re:Impossible design by 0100010001010011 · 2010-05-29 15:59 · Score: 1

When playing back a movie on my iPhone I don't care if pixel 200x200 is 0xFFFFFF or 0xFFFFFE. My brain can't tell the difference.
Re:Impossible design by Jeremi · 2010-05-29 17:03 · Score: 1

Well, they aren't rock-solid today either, so you can not trust their output even today. It's just not very likely that there will be a mistake.
For common definitions of "rock-solid" and "not very likely", the above statements cancel each other out. (Keep in mind that different markets have different requirements for reliability... 1 hardware error every year is probably acceptable for casual computing use, but not for nuclear reactor control)

--

I don't care if it's 90,000 hectares. That lake was not my doing.
Re:Impossible design by jd · 2010-05-29 17:56 · Score: 1, Offtopic

1 hardware error every year is probably acceptable for casual computing use, but not for nuclear reactor control
Someone should have told British Nuclear Fuels. I think Windscale/Sellafield was up to 20 accidental nuclear waste discharges a year at one point.

--
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
Re:Impossible design by iluvcapra · 2010-05-29 18:21 · Score: 1

Halt and Catch Fire

--
Don't blame me, I voted for Baltar.
Re:Impossible design by TheThiefMaster · 2010-05-29 20:11 · Score: 1

Perhaps addresses need to be error-corrected then. Oh wait, isn't that the kind of complexity he's trying to avoid?
Re:Impossible design by BikeHelmet · 2010-05-29 21:04 · Score: 0

In fact GPUs already do this to some extent. AMD and nVidia's workstation cards are the same as their gaming cards, the only difference being that the workstation ones are certified to produce 100% accurate output. If a gaming card colours a pixel wrong every now and then it's no big deal and the player probably won't even notice. For CAD and other high end applications the cards have to be correct all the time.
Yep. My videocard can be overclocked close to 10% on the core, and 30% on the memory. But to be stable for folding, it has to be underclocked 15% on the core. I've heard of people getting GTX480's that can't fold at stock speeds, so I suspect this is becoming more and more common.
My card never has issues with games. Even fully overclocked, it never locks up or anything. Furmark does have the occasional wrongly shaded pixel, but that test pushes the card to 100C. Folding is only 65-70C, and yet it'll make my computer BSOD unless I underclock it.
Re:Impossible design by Anonymous Coward · 2010-05-29 21:40 · Score: 0

This doesn't restrict the possibility to have a processor that "goofs up" in the microcode (or even lower) to get it right higher up, right?
Re:Impossible design by ultranova · 2010-05-29 23:03 · Score: 1

Folding is only 65-70C, and yet it'll make my computer BSOD unless I underclock it.

And that strongly suggests that it's not just the card that's faulty. Why would the OS crash just because a calculation unrelated to it gave the wrong result?

--
Forget magic. Any technology distinguishable from divine power is insufficiently advanced.
Re:Impossible design by molecular · 2010-05-29 23:16 · Score: 1

In fact GPUs already do this to some extent. AMD and nVidia's workstation cards are the same as their gaming cards, the only difference being that the workstation ones are certified to produce 100% accurate output. If a gaming card colours a pixel wrong every now and then it's no big deal and the player probably won't even notice. For CAD and other high end applications the cards have to be correct all the time.
Yeah, because if a workstation card colors a pixel wrongly, the bridge might fail and people will die!
Re:Impossible design by owlstead · 2010-05-30 02:23 · Score: 1

For a DVD yes, but I could foresee that if you just encode video to be used from cell-phone to cell-phone, errors in a single frame don't have to be fatal. As long as you don't screw up the stream beyond recovery you are OK. You could perform building the stream (in some kind of meta-format) using the CPU and encoding the frames by a fault tolerant design.
Re:Impossible design by wwfarch · 2010-05-30 04:29 · Score: 1

But you would care if a pixel was 0xFFFFFF instead of 0x7FFFFF. These are still just one bit different and if the noise happened frequently enough it would be extremely annoying.
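To put numbers on that, here is a small sketch assuming a 24-bit pixel packed as 0xRRGGBB (the packing is an assumption, and `channels` and `max_channel_error` are names made up for the example). It reports how far the worst-hit colour channel moves when a single bit is flipped.

```python
def channels(pixel):
    """Split a 24-bit 0xRRGGBB pixel into (red, green, blue)."""
    return (pixel >> 16) & 0xFF, (pixel >> 8) & 0xFF, pixel & 0xFF


def max_channel_error(pixel, bit):
    """Largest per-channel change caused by flipping one bit of the pixel."""
    before = channels(pixel)
    after = channels(pixel ^ (1 << bit))
    return max(abs(a - b) for a, b in zip(before, after))
```

Flipping bit 0 of 0xFFFFFF changes blue by 1 out of 255, while flipping bit 23 changes red by 128, which is half the range. Same number of flipped bits, very different visibility.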
Re:Impossible design by asvravi · 2010-05-30 04:53 · Score: 1

That is not a problem for any person who really does think or is skilled. One would be using Gray code in situations like that, not pure base-2.
Re:Impossible design by V!NCENT · 2010-05-30 05:15 · Score: 1

Except that the lowest error handling/correction is to be done by an OS kernel instead, which should never fail...
In other words: The error handling should be handled and corrected by instructions that are in themselves error prone...
Solution? Separate processor? Even higher power consumption.
Real solution: Back where we started: error correction and handling by the CPU itself.
So that's a catch-22, back-to-square-one, we-are-fscked situation.
The article was interesting, but as usual: zero insight, a lot of bla bla bla and no real world implementation purpose.
$0,02

--
Here be signatures
Re:Impossible design by Anonymous Coward · 2010-05-30 06:07 · Score: 0

AMD and nVidia's workstation cards are the same as their gaming cards, the only difference being that the workstation ones are certified to produce 100% accurate output. If a gaming card colours a pixel wrong every now and then it's no big deal and the player probably won't even notice. For CAD and other high end applications the cards have to be correct all the time.
Actually, someone at Stanford did some testing on this and found that it wasn't true. One of the sections claims that the professional boards had the same error rate as the consumer boards.
Re:Impossible design by BikeHelmet · 2010-05-30 07:44 · Score: 1

I suspect it's because WinXP doesn't have a stable CUDA interface - or maybe because the card is always trying to do multiple things at once. (like updating the screen)
When the card gets pushed hard, it BSOD's nv4_disp.dll. Underclocking prevents that from happening.
It'd probably be better in a newer OS, where the display drivers can be restarted without a reboot.
Re:Impossible design by turgid · 2010-05-30 09:04 · Score: 1

Someone should have told British Nuclear Fuels. I think Windscale/Sellafield was up to 20 accidental nuclear waste discharges a year at one point.
They're probably human operator and administrative errors, and probably to do with the storage of waste rather than the operation of nuclear reactors. The waste is typically held in tanks for weeks to decades depending on the type.
The old Windscale piles last operated in 1957. They didn't have computer control, and one of them was the one that went up in flames covering Britain in radioactive fallout (in 1957). The four Calder Hall reactors (early Magnox) were human-controlled too IIRC.
As for the WAGR (Windscale Advanced Gas-Cooled Reactor), that operated in the very early 1960s, so I doubt that was computer-controlled either.
So that's 7 reactors on the Sellafield site there. Are there any others? I can't remember, and there certainly weren't any "modern" ones, unless you count the WAGR which was a very advanced prototype for its day. It began operating in 1962, the same year that the first two commercial Magnoxes, Bradwell and Berkeley, came on line.
I used to work at Bradwell.
There have never been any nuclear (reactor) accidents resulting in offsite releases in Britain since Windscale in 1957. Even the dreaded Dounreay was pretty good in that regard. Occasionally the odd few gallons of radioactive effluent made its way out to sea, but you can look that up. Her Majesty's Nuclear Installations Inspectorate has it all documented and you can see who got found guilty of what.

--
Stick Men
Re:Impossible design by wooferhound · 2010-05-30 10:31 · Score: 1

not never was none no better

--
We are Dead Stars looking back Up at the Sky
Re:Impossible design by walshy007 · 2010-05-30 20:54 · Score: 1

If a TCP packet arrives with a bad checksum, it winds up being re-sent until one with the right checksum arrives. In the CPU example they are talking about not correcting the faults, which would be the equivalent of just accepting and displaying the messed-up packet, and depending on the connection that could really cause some serious damage.
The accept-the-noise-but-detect-and-redo model is what is being done now; not bothering to even check for errors is a Bad Idea. When errors are detected on a CPU, continuing with the error would be a disaster in most instances, and redoing the calculation is the only option.
Re:Impossible design by haxney · 2010-05-31 06:58 · Score: 1

There was a really interesting talk I saw a few years back about Failure-Oblivious Computing (original paper (PDF), Google PDF viewer) which would deal with certain kinds of memory errors, like reading or writing past the end of a buffer, by ignoring them and moving on. For reads, if the program tried to read from a bad address, the system would figure out something random to return, and if you tried to write out of bounds, rather than throwing an exception (or segfaulting), it would just ignore the extra writes. This sounds horrifying and seems like it could not possibly work, but it turned out that for (certain kinds of) mostly-correct programs, they could literally ignore errors and things would mostly work.
I can't find a link for it now, but towards the end of the talk, they stress-tested the failure-oblivious compiler by manually introducing off-by-one errors into the source code and seeing what happened. They tried this with a video codec, and found that certain loop bounds were essential for things like determining branch targets, but that a significant number could be fudged and you would still end up with a recognizable video. Obviously, it was degraded from full working order, but you could still make out what was happening in the picture.
The point being that there are certain kinds of faults which a program can tolerate (slightly inaccurate pixel colors, minor graphical garbling of text or images), and there are faults which it cannot (like figuring out a branch target).
I doubt the whole world will move to highly fault-tolerant or failure-oblivious computing any time soon, but it could be an interesting niche for a coprocessor, and/or in certain domains.
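The read and write behaviour described above can be imitated with a toy buffer. This is a rough sketch of the idea as described in this comment, not the actual compiler technique from the talk: the class name `FailureObliviousBuffer` is invented, and cycling through the values 0, 1, 2 for manufactured reads is an assumption of this example.

```python
class FailureObliviousBuffer:
    """Fixed-size buffer that ignores out-of-bounds accesses instead of failing."""

    def __init__(self, size, fill=0):
        self._data = [fill] * size
        self._next_fake = 0          # next made-up value for a bad read
        self.ignored_writes = 0      # count of discarded out-of-bounds writes
        self.manufactured_reads = 0  # count of reads answered with a made-up value

    def read(self, index):
        if 0 <= index < len(self._data):
            return self._data[index]
        # Out of bounds: hand back a manufactured value and carry on.
        self.manufactured_reads += 1
        value = self._next_fake
        self._next_fake = (self._next_fake + 1) % 3
        return value

    def write(self, index, value):
        if 0 <= index < len(self._data):
            self._data[index] = value
        else:
            # Out of bounds: silently drop the write.
            self.ignored_writes += 1
```

A loop with an off-by-one bound, such as `for i in range(9): buf.write(i, i * i)` on an 8-element buffer, finishes normally: the eight valid writes land, the ninth is dropped, and the counters record what was ignored.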

I thcnk it4s a gre&t idKa by jolyonr · 2010-05-29 10:27 · Score: 0

Bu) I l/ved in thE day oI the 2400 baUUUd modem.

--

Please read my Canon EOS tech blog at http://www.everyothershot.com

Wrong approach? by Yvan256 · 2010-05-29 10:31 · Score: 1

Wouldn't it be simpler to simply add redundancy and CRC or something similar to that effect?

Re:Wrong approach? by DigiShaman · 2010-05-29 10:41 · Score: 1

The goal is to reduce power consumption and improve performance. Adding redundancy and CRC goes against that.

--
Life is not for the lazy.
Re:Wrong approach? by Dunbal · 2010-05-29 11:56 · Score: 0

The goal is to reduce power consumption and improve performance.
Well that's fine if you are an academic who measures "performance" in "operations per second". Usually, however, computers are used to make CORRECT calculations. What use is a blazing fast computer that is no longer reliable? If you allow a fraction of errors, considering the speed of CPUs and the length of time they can be running, you can expect these errors to compound and magnify over time, eventually corrupting the whole program/data.

--
Seven puppies were harmed during the making of this post.
Re:Wrong approach? by jibjibjib · 2010-05-29 12:35 · Score: 2, Funny

Yeah because academics are idiots who measure performance in incorrect calculations per second, and they did this research without thinking of all these things that you've thought up in two minutes reading the Slashdot summary.
Seriously, people, get some common sense.
Re:Wrong approach? by somersault · 2010-05-29 12:54 · Score: 3, Interesting

What use is a blazing fast computer that is no longer reliable
Meh.. I'm pretty happy to have my brain, even if it makes some mistakes sometimes.

--
which is totally what she said
Re:Wrong approach? by maxume · 2010-05-29 14:05 · Score: 1

Yeah, but you have never compared it to the brain you used to have, the one that never made mistakes.

--
Nerd rage is the funniest rage.
Re:Wrong approach? by Belial6 · 2010-05-29 14:51 · Score: 1

Trust me, it's not all it's cracked up to be.
Re:Wrong approach? by Anonymous Coward · 2010-05-30 02:27 · Score: 0

Your brain is quite reliable for the things that count. How would you feel if that brain of yours made your heart or lungs stop working for a few minutes every day by mistake?
If you ask the brain to make guesses based on incomplete information you can expect "mistakes" to happen. However that doesn't mean the brain is unreliable.
Re:Wrong approach? by somersault · 2010-05-30 05:29 · Score: 1

I don't think that's a very good analogy.. a better one would be the brain making your heartbeat or breathing slightly irregular rather than stopping completely. And some people's brains do that. Cardiac dysrhythmia can be fatal, but for some people it's just an annoyance.

--
which is totally what she said

Hmmmm by naz404 · 2010-05-29 10:35 · Score: 0, Offtopic

I wonder what Harold would have to say about Prof. Kumar's theory.

--
http://www.object404.com

Moving, not fixing, the problem by Red+Jesus · 2010-05-29 10:35 · Score: 4, Interesting

The "robustification" of software, as he calls it, involves re-writing it so an error simply causes the execution of instructions to take longer.

Ooh, this is tricky. So we can reduce CPU power consumption by a certain amount if we rewrite software in such a way that it can slowly roll over errors when they take place. There are some crude numbers in the document: a 1% error rate, whatever that means, causes a 23% drop in power consumption. What if the `robustification' of software means that it has an extra "check" instruction for every three "real" instructions? Now you're back to where you started, but you had to rewrite your software to get here. I know, it's unfair to compare his proven reduction in power consumption with my imaginary ratio of "check" instructions to "real" instructions, but my point still stands. This system may very well move the burden of error correction from the hardware to the software in such a way that there is no net gain.
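The back-of-the-envelope comparison above can be worked through explicitly. The sketch assumes that every instruction, "check" or "real", takes the same time, that energy is simply power multiplied by run time, and that the 23% figure applies uniformly. The function names are invented for the example, and the one-check-per-three ratio is the comment's own imaginary number.

```python
def relative_energy(power_saving, real_per_check):
    """Energy of the 'robustified' program relative to the original.

    power_saving   -- fractional drop in power draw (0.23 for 23%)
    real_per_check -- number of real instructions per added check instruction
    """
    relative_power = 1.0 - power_saving
    relative_time = (real_per_check + 1) / real_per_check
    return relative_power * relative_time


def break_even_real_per_check(power_saving):
    """Real instructions per check at which the energy use is unchanged.

    Solves (1 - s) * (n + 1) / n = 1 for n, giving n = (1 - s) / s.
    """
    return (1.0 - power_saving) / power_saving
```

With a 23% power drop and one check per three real instructions, `relative_energy(0.23, 3)` comes to about 1.03, so slightly worse than where you started. The break-even point is roughly one check per 3.35 real instructions; any sparser checking than that would be a net gain under these assumptions.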

Re:Moving, not fixing, the problem by sourcerror · 2010-05-29 10:41 · Score: 2, Insightful

This system may very well move the burden of error correction from the hardware to the software in such a way that there is no net gain.
People said the same about RISC processors.
Re:Moving, not fixing, the problem by Turzyx · 2010-05-29 10:49 · Score: 2, Interesting

I'm making assumptions here, but if these errors are handled by software would it not be possible for a program to 'ignore' errors in certain circumstances? Perhaps this could result in improved performance/battery life for certain low priority tasks. Although an application where 1% error is acceptable doesn't spring immediately to mind, maybe supercomputing - where anomalous results are checked and verified against each other...?
Re:Moving, not fixing, the problem by DigiShaman · 2010-05-29 10:49 · Score: 2, Informative

From what I understand, all modern processors are now a hybrid of both RISK and CISC (Intel Core 2, AMD K8, etc). Except for embedded applications, the generic CPU doesn't have that kind of pure classification anymore. Right?

--
Life is not for the lazy.
Re:Moving, not fixing, the problem by noidentity · 2010-05-29 10:50 · Score: 1

I can see this possibly working, though the devil is in the details. First, consider a similar situation with a communications link. You could either send every byte twice (TI 99/4A cassette format, I'm looking at you!), or if the error rate isn't too high, checksum large blocks of data and retransmit if there's an error. The latter will usually yield a higher rate for the error-free channel you create the illusion of. So if you could break a computation into blocks and somehow detect a corrupt computation, you could just recompute the block. So bringing this back to computation, how the hell do you determine a corrupt computation without doing the computation to see what the correct result is? And if you knew this already, you wouldn't need to compute it in the first place. Maybe there are solutions to this, but for any piece of software? And you thought rewriting for multiple threads was hard...
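The checksum-and-retransmit half of that comparison can be sketched as below. The channel model is made up for illustration (each bit flips independently with a fixed probability), the CRC is assumed to arrive intact over some reliable side channel, and `noisy_send` and `send_with_retransmit` are hypothetical names. Only `zlib.crc32` and `random.Random` are real library calls.

```python
import random
import zlib


def noisy_send(block, bit_error_rate, rng):
    """Return a copy of block with each bit flipped with probability bit_error_rate."""
    out = bytearray(block)
    for i in range(len(out)):
        for bit in range(8):
            if rng.random() < bit_error_rate:
                out[i] ^= 1 << bit
    return bytes(out)


def send_with_retransmit(data, block_size, bit_error_rate, rng, max_tries=100):
    """Send data block by block, resending any block whose CRC-32 does not match.

    Returns (received_bytes, total_block_transmissions).
    """
    received = bytearray()
    transmissions = 0
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        checksum = zlib.crc32(block)  # assumed to reach the receiver uncorrupted
        for _ in range(max_tries):
            transmissions += 1
            candidate = noisy_send(block, bit_error_rate, rng)
            if zlib.crc32(candidate) == checksum:
                received += candidate
                break
        else:
            raise RuntimeError("block could not be delivered")
    return bytes(received), transmissions
```

Sending every byte twice costs a fixed 100% overhead whether or not anything goes wrong, whereas here the cost is a 4-byte checksum per block plus however many retransmissions the noise forces. The hard part raised in the comment remains: a computation has no checksum known in advance, so this sketch covers only the communications analogy.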
Re:Moving, not fixing, the problem by xZgf6xHx2uhoAj9D · 2010-05-29 11:05 · Score: 2, Informative

The classifications weren't totally meaningful to begin with, but CISC has essentially died. I don't mean there aren't CISC chips anymore--any x86 or x64 chip can essentially be called "CISC"--but certainly no one's designed a CISC architecture in the past decade at least.
RISC has essentially won and we've moved into the post-RISC world as far as new designs go. VLIW, for instance, can't really be considered classical RISC, but it builds on what RISC has accomplished.
The grandparent's point is a good one: people thought RISC would never succeed; they were wrong.
Re:Moving, not fixing, the problem by Anonymous Coward · 2010-05-29 11:05 · Score: 0

Video decoding maybe?
Re:Moving, not fixing, the problem by Anonymous Coward · 2010-05-29 11:18 · Score: 0

It's not that simple; there are errors all the time in hardware that are hidden from you. For instance, before hard disk companies got smart and started hiding and automatically reallocating bad sectors, you could get a replacement as soon as your drive began to go and showed bad sectors. Then drive companies started shipping extra space on the platters and remapping bad sectors to it, so disk scanners became redundant.
I think the idea MAY have promise if we ever hit a brick wall, but it will require a lot of experimentation to see if it is feasible. They really have to design the hardware so that you don't have to change the software.
Re:Moving, not fixing, the problem by xZgf6xHx2uhoAj9D · 2010-05-29 11:21 · Score: 2, Informative

Error Correction Codes (aka Forward Error Correction) are typically more efficient for high-error channels than error detection (aka checksum and retransmit), which is why 10Gbps Ethernet uses Reed-Solomon rather than CRC in previous Ethernet standards: it avoids the need to retransmit.
I had the same questions about how this is going to work, though. What is the machine code going to look like and how will it allow the programmer to check for errors? Possibly each register could have an extra "error" bit (similar to IA-64's NaT bit on its GP registers). E.g., if you do an "add" instruction, it checks the error bits on its source operands and propagates them. So long as you only allow false positives and never false negatives, it would work, and could be relatively efficient.
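A toy model of that per-register error bit, purely as a sketch of the speculation above: `Reg`, `add`, `compute_block` and `run_with_retry` are invented names, and the `fault_detected` flag stands in for whatever hardware detector would set the bit. Nothing here reflects a real instruction set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reg:
    """A register value with an extra 'suspect' bit attached."""
    value: int
    suspect: bool = False


def add(a, b, fault_detected=False):
    """Add two registers; the result is suspect if either input or this step is."""
    return Reg(a.value + b.value, a.suspect or b.suspect or fault_detected)


def compute_block(xs, faulty_step=None):
    """Sum xs, optionally marking one step as having hit a detected fault."""
    acc = Reg(0)
    for i, x in enumerate(xs):
        acc = add(acc, Reg(x), fault_detected=(i == faulty_step))
    return acc


def run_with_retry(xs, faulty_first_attempt_step=None):
    """Recompute the whole block until the result's suspect bit is clear.

    Only the first attempt is given a fault, to keep the example deterministic.
    Returns (value, attempts).
    """
    attempts = 0
    step = faulty_first_attempt_step
    while True:
        attempts += 1
        result = compute_block(xs, faulty_step=step)
        if not result.suspect:
            return result.value, attempts
        step = None
```

The software only has to test one bit at the end of a block rather than after every instruction. A false positive just costs an unnecessary recomputation, while a false negative would let a bad value through unflagged, which is why the scheme can tolerate the former but not the latter.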
Re:Moving, not fixing, the problem by thegarbz · 2010-05-29 11:23 · Score: 1

Although an application where 1% error is acceptable doesn't spring immediately to mind,

This struck me as a problem too. Where is a processing error acceptable? In a game which currently doesn't tax the CPU as much as it does the GPU anyway? In a word processor where the performance of the CPU really doesn't matter? A 1% error is definitely not tolerable when calculating PI, and again how do you check the result is correct? Do you execute each instruction twice and ensure that the same result came out the other end?

Also what kind of error are we talking about? Sure, when playing a game a flipped bit could cause the screen to display a fault, or it could cause the game to come crashing down. Is the error checking code somehow immune to errors too? This just reeks of the Pentium 3 problems, which had a significant impact.
Re:Moving, not fixing, the problem by JoeMerchant · 2010-05-29 11:29 · Score: 2, Interesting

Why rewrite the application software? Why not catch it in the firmware and still present a "perfect" face to the assembly level code? Net effect would be an unreliable rate of execution, but who cares about that if the net rate is faster?
Re:Moving, not fixing, the problem by Anonymous Coward · 2010-05-29 11:30 · Score: 1, Insightful

I've read something similar to this in the past and the example they used was video playback. If a few pixels in a video frame are rendered incorrectly the end user probably won't even notice. I think the likely applications of this are in video decoders and gaming graphics.
Re:Moving, not fixing, the problem by icebraining · 2010-05-29 12:05 · Score: 1

RISK
They try to invade Kamchatka?

--
Dilbert RSS feed
Re:Moving, not fixing, the problem by Low+Ranked+Craig · 2010-05-29 12:19 · Score: 1

Never get involved in a land war in Asia.

--
I still cannot find the droids I am looking for...
Re:Moving, not fixing, the problem by Anonymous Coward · 2010-05-29 12:32 · Score: 0

Moving things from hardware to software has always led to failure or challenges. Examples include
1. The current parallel programming challenge (using all cores is up to the programmer or compiler)
2. Software controlled caches (The Cell processor)
Anything that can be moved from software to hardware shields users from complexities in a transparent manner.
Re:Moving, not fixing, the problem by Interoperable · 2010-05-29 12:34 · Score: 4, Insightful

I did some digging and found some material by the researcher, unfiltered by journalists. I don't have any background in processor architecture but I'll present what I understood. The original publications can be found here.
The target of the research is not general computing, but rather low-power "client-side" computing, as the author puts it. I understand this to be decoding application, such as voice or video in mobile devices. Furthermore, the entire architecture would not be stochastic, but rather it would contain some functional blocks that are stochastic. I think the idea is that certain mobile hardware devices devote much of their time to specialized applications that do not require absolute accuracy.
A mobile phone may spend most of its time being used to encode/decode low resolution voice and video and would have significant blocks within the processor devoted to those tasks. Those tasks could be considered error tolerant. The operating system would not be exposed to error-prone hardware, only applications that use hardware acceleration for specialized, error-tolerant tasks. In fact, the researchers specifically mention encoding/decoding voice and video and have demonstrated the technique on encoding H.264 video.

--
So if this is the future...where's my jet pack?
Re:Moving, not fixing, the problem by oldhack · 2010-05-29 12:35 · Score: 1

Bit like Itanium VLI-whatever architecture - leave it to the compiler (i.e. software) to correctly pack instructions to big units so they can use up all the subunits simultaneously.
Possibly it might be suitable for some niche applications.

--
Fuck systemd. Fuck Redhat. Fuck Soylent, too. Wait, scratch the last one.
Re:Moving, not fixing, the problem by jd · 2010-05-29 12:57 · Score: 3, Insightful

For this, I'd point to the RISC vs. CISC debate. RISC programs took many more instructions to do the same things, but the gain in performance was so great that you ended up with greater performance. Extra steps = some amount of overhead, but so long as the net gain is greater than the net overhead, you will gain overall. The RISC chip demonstrated that such examples really do exist in the real world.
But just because there are such situations, it does not automatically follow that more steps always equals greater performance. It may be that better error-correction techniques in the hardware would handle the transmission errors just fine without having to alter any software at all. It depends on the nature of the errors (bursty vs randomly-distributed, percent of signal corrupted, etc) as to what error-correction would be needed and whether the additional circuitry can be afforded.
Alternatively, the problem may in fact turn out to be a solution. Once upon a time, electron leakage was a serious problem in CPU designs. Then, chip manufacturers learned that they could use electron tunneling deliberately. The cause of these errors may be further electron leakage or some other quantum effect, it really doesn't matter. If it leads to a better practical understanding of the quantum world to the point where the errors can be mitigated and the phenomenon turned to the advantage of the engineers, it could lead to all kinds of improvement.
There again, it might prove redundant. There are good reasons for believing that "Processor In Memory" architectures are good for certain types of problem - particularly for providing a very standard library of routines, but certain opcodes can be shifted there as well. There is also the Cell approach, which is to have a collection of tightly-coupled but specialized processors of different kinds. A heterogeneous cluster with shared memory, in effect. If you extend the idea to allow these cores to be physically distinct, you can offload from the original CPU that way. In both cases, you distribute the logic over a much wider area without increasing the distance signals have to travel (you may even shorten the distance). As such, you can eliminate a lot of the internal sources of errors.
It may prove redundant in other ways, too. There are plenty of cooling kits these days that will take a CPU to much lower temperatures. Less thermal noise may well result in fewer errors, since that is a likely source of some of them. Currently, processors often run at 40°C - and I've seen laptop CPUs reach 80°C. If you can bring the cores down to nearly 0°C and keep them there, that should have a sizable impact on whether the data is being transmitted accurately. The biggest change would be to modify the CPU casing so that heat is reliably and rapidly transferred from the silicon. I would imagine that you'd want the interior of the CPU casing to be flooded with something that conducts heat but not electricity - fluorinert does a good job there - and then have the top of the case also be an extra-good heat conductor (plastic only takes you so far).
However, if programs were designed with fault tolerance in mind, these extra layers might not be needed. You might well be able to get away with better software on a poorer processor. Indeed, one could argue that the human brain is an example of an extremely unreliable processor whose net processing power (even allowing for the very high error rates) is far far beyond any computer yet built. This fits perfectly with the professor's description of what he expects, so maybe this design actually will work as intended.

--
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
Re:Moving, not fixing, the problem by jd · 2010-05-29 13:24 · Score: 1

There are many types of error correction codes, and which one you use depends on the nature of the errors. For example, if the errors are totally random, then Reed-Solomon will likely be the error correction code used. CDs use two layers of Reed-Solomon concatenated in series. This is not especially fast, but the output for a CD is in tens of kilohertz and an ASIC could easily be operating in the gigahertz realms. However, when you're talking about the internals of a CPU, there's a limit to how slow you can afford things to be.
BCH is a general form of Reed-Solomon and you can therefore tailor the function to handle specific characteristics of the sorts of error observed rather than trying to code round anything that doesn't fit perfectly within Reed-Solomon. Although it's more general, it might actually be quicker if you can find parameters that better describe what it is supposed to be correcting than the default values used by Reed-Solomon.
Turbo codes (as used by NASA for modern deep-space communication) are great when you're dealing with block errors. In long-distance communication, your biggest problem will be bursts of interference rather than the random noise. Wikipedia states: "Turbo codes, as described first in 1993, implemented a parallel concatenation of two convolutional codes, with an interleaver between the two codes and an iterative decoder that would pass information forth and back between the codes." It goes on to state that this is faster than any previous concatenation scheme. Fast is good, here, but it is only meaningful if this is the sort of error experienced.
There are plenty of other sorts out there, and again there may well be error correction codes not listed above that are much better suited to this specific type of problem.

--
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
Re:Moving, not fixing, the problem by Anonymous Coward · 2010-05-29 14:40 · Score: 0

Yeah, and? The same argument made against two totally different architectures doesn't mean the same argument is invalid for both.
Re:Moving, not fixing, the problem by tomhudson · 2010-05-29 15:02 · Score: 1

I would imagine that you'd want the interior of the CPU casing to be flooded with something that conducts heat but not electricity - fluorinert does a good job there

Olive oil works fine and is cheaper. You can stick everything but the disk drives in it http://www.youtube.com/watch?v=6sP45uBj4-k&NR=1&feature=fvwp http://www.youtube.com/watch?v=t8shVDvMdo4
Re:Moving, not fixing, the problem by Jeremi · 2010-05-29 16:47 · Score: 1

You could either send every byte twice (TI 99/4A cassette format, I'm looking at you!)
Seems to me you'd have to send every byte three times, otherwise if two bytes don't match, how do you know which one is correct and which one is corrupted?
(on the other hand, this would help explain why TI 99/4A cassette loads were so bloody slow...)
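The three-copies idea amounts to a majority vote per byte. A rough sketch follows; it is not the TI 99/4A format, and the function names are invented for illustration.

```python
def encode_triple(data: bytes) -> bytes:
    """Repeat every byte three times in a row."""
    return bytes(b for b in data for _ in range(3))


def decode_triple(stream: bytes) -> bytes:
    """Recover the data by majority vote over each group of three bytes."""
    if len(stream) % 3:
        raise ValueError("stream length must be a multiple of 3")
    out = bytearray()
    for i in range(0, len(stream), 3):
        a, b, c = stream[i:i + 3]
        if a == b or a == c:
            out.append(a)          # a agrees with at least one other copy
        elif b == c:
            out.append(b)          # a is the odd one out
        else:
            raise ValueError("all three copies differ; cannot vote")
    return bytes(out)


sent = bytearray(encode_triple(b"TI"))
sent[1] ^= 0xFF                    # corrupt one of the three copies of 'T'
print(decode_triple(bytes(sent)))  # the vote still recovers the original
```

With only two copies the decoder can tell that something is wrong but not which copy to trust, which is the parent's point. The cost is a threefold increase in data sent.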

--

I don't care if it's 90,000 hectares. That lake was not my doing.
Re:Moving, not fixing, the problem by jd · 2010-05-29 17:17 · Score: 1

Wouldn't Bluto keep trying to hack in? And what does Popeye think of it?
Seriously, it's an organic oil, which might create problems in the longer-term in a totally sealed environment. I have no idea what olive oil would do over a 5-10 year period exposed to the extreme heat of the silicon surface and the extreme cold of the heat-sink. I'm also not sure what it would do to the silicon, since this would involve direct exposure of the chip itself. The chemistry of organic oils tends to be non-trivial.

--
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
Re:Moving, not fixing, the problem by noidentity · 2010-05-30 04:17 · Score: 1

I think it also includes a checksum with each block, so that it can choose the one that was received correctly. A long time ago I wanted to reduce the size of TI 99/4A cassette saves on my computer. I had simply digitized them, but they were big (this was around 1995, mind you). So I looked at the waveforms and found a compact way to encode them. By chance, my format matched the bytes and I could see the text of the data I was saving. I noticed that every block was repeated twice, as you can easily hear if you listen to the recordings.
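A sketch of how "send it twice plus a checksum" can work without a third copy. The one-byte sum-mod-256 checksum and the record layout are assumptions made for the example, not the real cassette format.

```python
def checksum(payload: bytes) -> int:
    """Hypothetical one-byte checksum: sum of the payload bytes mod 256."""
    return sum(payload) & 0xFF


def make_record(payload: bytes) -> bytes:
    """Write the block (payload + checksum) twice, back to back."""
    block = payload + bytes([checksum(payload)])
    return block + block


def read_record(record: bytes) -> bytes:
    """Return the payload from the first copy whose checksum verifies."""
    if len(record) < 2 or len(record) % 2:
        raise ValueError("malformed record")
    half = len(record) // 2
    for copy in (record[:half], record[half:]):
        payload, stored = copy[:-1], copy[-1]
        if checksum(payload) == stored:
            return payload
    raise ValueError("both copies failed their checksum")


tape = bytearray(make_record(b"HELLO"))
tape[1] ^= 0x10                    # damage the first copy only
print(read_record(bytes(tape)))    # falls back to the intact second copy
```

The checksum is what breaks the tie between two copies that disagree, so two copies are enough as long as at least one arrives intact.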
Re:Moving, not fixing, the problem by tomhudson · 2010-05-30 07:19 · Score: 1

That reminds me of the joke:
Q. What happened when the Pope went to Mount Olive?
A. Popeye kicked the s*** out of him.
For the olive oil thing, they've been running it for over a year with no problems. If you're still worried, you can use mineral oil.
The disadvantage of mineral oil is that it's not as environmentally friendly. You can't filter it and throw it in your diesel engine and burn it when you're done with it.
Re:Moving, not fixing, the problem by Anonymous Coward · 2010-05-30 07:25 · Score: 0

Ahem, as long as it's called Ethernet, the Ethernet frame will be identical, hence it will contain a 32 bit CRC. 10Gbps Ethernet does NOT use Reed-Solomon, nor are there retransmissions at the MAC level (but much higher, hint, UDP transport doesn't have retransmissions). As a matter of fact, why don't you go and write code that performs Reed-Solomon in software at, say 1Gbps, MTU size packets and see what's the actual goodput you get.
The difference between 10GbE and 1GbE is the medium (e.g. single mode fiber, multi-mode fiber, CX4 copper vs. cat5/cat6) and the line code, which, if you had done your homework, you would have recognized as being 64b/66b, which is different than the 8b/10b (incidentally, this latter line code is used not just for 1GbE but also for Serial ATA technology, PCIe, etc.)
Re:Moving, not fixing, the problem by yakovlev · 2010-06-01 00:20 · Score: 1

If the hardware can detect the error in order to put it in the register, the hardware should be able to flush and redo the instruction. The real problems either internally screw up the processor to the point that it can't process instructions, or are undetected by the hardware.

I could see how in certain applications (video decoding/encoding) the attitude could be taken to just ignore the errors, but it isn't obvious to me how to handle errors that screw up the processor. The only solution I see there is for an application where the processor is periodically reset. That could also be practical for something like video encoding, where you could simply reset the processor something like every frame, or if it's "taking too long." I think someone more familiar with this said that it is just these kinds of fault-tolerant applications like video being considered, and not general-purpose computing.

The real problem I see here is that if the user is in a noisy environment (say, outside in the sun) the error rate could go high enough that it would be unacceptable to the user. Even modern servers have problems with periods of increased solar flare activity, which would be killer to a device like this.
Re:Moving, not fixing, the problem by Anonymous Coward · 2010-06-01 03:43 · Score: 0

Perhaps because the firmware will have to treat every error as being equally important and correct every one of them, whereas the software can take into account the impact the error will have when choosing how to respond to it. Some errors just result in an insignificant glitch that the software knows can be ignored, where the firmware solution would correct it anyway.

Never going to work by Anonymous Coward · 2010-05-29 10:38 · Score: 0

Consider that we currently can't even write software that is reliable on relatively error-free hardware. Introducing faulty hardware will just make everything suck even more.

This is just like all the other "solutions" to problems that only require well designed, bug free software. That just ain't gonna happen. Programming is complex and our little fuzzy/faulty brains aren't very good at it.

Does his lab produce Daily WTFs? by Anonymous Coward · 2010-05-29 10:39 · Score: 0

The whole TFS and most of the TFA is a load of non-sequiturs. If they bothered linking the papers it might have been useful, but no, it's another shining example of tech journali^H^H^H^H^H^H^H total bullshit.

It might be something to do with clock variability and operations that are retry-able on error. Or they could be running a clock signal from a grandfather clock.

Sounds like... by Chineseyes · 2010-05-29 10:39 · Score: 1, Offtopic

Sounds like Kumar and his friend Harold have been spending too much time baking weed brownies and not silicon wafers.

--
I think the invisible hand of the market has its middle finger extended

--A wise old fart named SC0RN

Moore's law by koreaman · 2010-05-29 10:44 · Score: 1

I don't see how allowing a higher error rate will enable them to put more transistors on a chip.

--
Le français vous intéresse?

Re:Moore's law by takev · 2010-05-29 10:49 · Score: 3, Informative

What he is proposing is to reduce the number of transistors on a chip, to increase its speed and reduce power usage.
So in fact he is trying to reverse Moore's law.
Re:Moore's law by Anonymous Coward · 2010-05-29 12:16 · Score: 0

What he is doing is exactly what the people who made NoSQL are doing. Remove error correction/detection, and of course a system will go faster.
OB Car analogy: Removing airbags, brakes, and the body of the car leaving a chassis will surely make a car run faster. It won't be usable though for anything other than "I have moar speed than you!" type of crap.
Re:Moore's law by tomhudson · 2010-05-29 15:08 · Score: 1

Removing airbags, brakes, and the body of the car leaving a chassis will surely make a car run faster.
No it won't - removing the body creates more drag.
Re:Moore's law by Anonymous Coward · 2010-05-29 16:45 · Score: 0

My understanding was that while yes it reduces the number of transistors some, that wasn't the main goal. As transistors get smaller, they get less reliable. By planning for errors in the hardware anyways, the reliability of the transistors becomes less of an issue minimizing one of the hindrances to Moore's law.
Re:Moore's law by Jeremi · 2010-05-29 17:10 · Score: 1

I don't see how allowing a higher error rate will enable them to put more transistors on a chip.
What it does is increase the chances of actually being able to use the chip once it's made. Right now, many chips have to be discarded because they contain manufacturing flaws that (in current designs) make them unusable. If they can come up with a design that allows flawed chips to be useful anyway, they no longer have to discard all the chips that didn't come out perfectly.

--

I don't care if it's 90,000 hectares. That lake was not my doing.
Re:Moore's law by tg123 · 2010-05-31 01:20 · Score: 1

What it does is increase the chances of actually being able to use the chip once it's made. Right now, many chips have to be discarded because they contain manufacturing flaws that (in current designs) make them unusable. If they can come up with a design that allows flawed chips to be useful anyway, they no longer have to discard all the chips that didn't come out perfectly.
Please mod this post up
The 486 SX is a good example - the 486 SX was the same as the DX, except that the floating point unit part of the chip was disabled. It's said Intel did this so it would not have to throw away batches of defective chips.
http://en.wikipedia.org/wiki/Intel_80486SX

More likely: Impossible gains by Anonymous Coward · 2010-05-29 10:48 · Score: 2, Insightful

More importantly, if the software is more robust so as to detect and correct errors, then it will require more clock cycles of the CPU and negate the performance gain.

This CPU design sounds like the processing equivalent of a perpetual motion device. The additional software error correction is like the friction that makes the supposed gain impossible.

Re:More likely: Impossible gains by Anonymous Coward · 2010-05-29 12:04 · Score: 0

More importantly, if the software is more robust so as to detect and correct errors, then it will require more clock cycles of the CPU and negate the performance gain.
This CPU design sounds like the processing equivalent of a perpetual motion device. The additional software error correction is like the friction that makes the supposed gain impossible.
No one here is really getting this.
Think about it this way: just as you can talk about the time complexity of an algorithm you can also talk about its energy complexity.
So let's say there's some fixed energy cost associated with doing a particular operation with some probability of being correct, and that cost changes if you adjust that probability. Given that, suppose your goal is to minimize the amount of energy it takes to complete an algorithm with 99.9999% probability of a correct answer. You would go about this the same way that you optimize the time performance of an algorithm: you organize it so that redundant work is minimized.
So let's say I want to add n numbers together with 99.9999% probability of a correct answer. Do you suppose it would use more energy to do each addition with (99.9999%)^(1/n) probability of being correct, or to design an algorithm that doesn't do that much checking the whole time but generates a number which is highly likely correct and is accompanied by additional redundant coded data which can be used to true up the approximation on demand?
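To put a number on the (99.9999%)^(1/n) figure: if the n additions fail independently and the whole sum must be right with probability P, each addition has to be right with probability P^(1/n). A small sketch, where the function name and the choice n = 1000 are just for illustration and independence of failures is assumed:

```python
def per_op_reliability(target: float, n: int) -> float:
    """Per-operation success probability needed so that n independent
    operations all succeed with overall probability `target`."""
    return target ** (1.0 / n)


target = 0.999999            # 99.9999% for the whole sum
n = 1000                     # number of additions (illustrative)
p = per_op_reliability(target, n)

print(f"allowed failure rate per addition: {1.0 - p:.3e}")
print(f"check, p**n = {p ** n:.9f}")
```

For n = 1000 the allowed failure rate per addition comes out at roughly one in a billion, a thousand times stricter than the one-in-a-million budget for the answer as a whole. That gap is what the cheaper "approximate now, true up later" scheme is trying to avoid paying for on every single operation.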
Re:More likely: Impossible gains by shimage · 2010-05-29 12:21 · Score: 1

I think an important part of his argument is that current processors aren't perfect anyway, and they are only going to get worse as we move to smaller processes. The argument is that, eventually, everyone should be writing more robust code anyway, so why not get something out of it. I don't know whether that is true or not, but to argue against his ideas, you need to address his points ...

Current software can't recover from its own errors by Anonymous Coward · 2010-05-29 10:55 · Score: 0

Current software can't recover from its own errors most of the time, but we're supposed to trust it to handle hardware ones, too?

I'll trade some speed for reliability, thanks. Rebooting sucks and this sounds like a great way to do more of it.

why use a stochastic processor? by Anonymous Coward · 2010-05-29 10:57 · Score: 0

Why use a stochastic processor which makes mistakes when we can use our brains, which make mistakes?

Re:why use a stochastic processor? by Arker · 2010-05-29 11:37 · Score: 3, Funny

Why use a stochastic processor which makes mistakes when we can use our brains, which make mistakes?
Because the stochastic processor will be able to make mistakes much more quickly of course.
Don't you understand progress?

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Friends don't let friends enable ecmascript.
Re:why use a stochastic processor? by thegrassyknowl · 2010-05-29 18:43 · Score: 1

But can it predict the future with only a 5% margin for error simply by extrapolating the most likely outcome of all known variables?

--
I drink to make other people interesting!
Re:why use a stochastic processor? by Anonymous Coward · 2010-05-30 21:29 · Score: 0

Why use a stochastic processor which makes mistakes when we can use our brains, which make mistakes?
Because the stochastic processor will be able to make mistakes much more quickly of course.
Don't you understand progress?
Well, Microsoft understood that long ago... Except the "more quickly" part, or so it seems.

Sounds reasonable to me by bug1 · 2010-05-29 10:59 · Score: 4, Insightful

Ethernet is an improvement over token ring, yet Ethernet has collisions and token ring doesn't.

Token ring avoids collisions, Ethernet accepts collisions will take place but has a good error recovery system.

Re:Sounds reasonable to me by Anonymous Coward · 2010-05-29 11:09 · Score: 0

And the recovery algorithm runs on a defective processor?
Re:Sounds reasonable to me by TheGratefulNet · 2010-05-29 12:17 · Score: 0, Insightful

in fact, it's the randomness of ethernet (back off and retry at random non-matching intervals) that gets you order. if everyone used the same backoff timers, they'd keep colliding; but add in some randomness and things work better.
increase entropy to ensure order. ha! but it's true.
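A toy simulation of that point. It is a slotted model with no carrier sense, so it is a simplification and not real CSMA/CD; the function name and the parameters are made up for the example.

```python
import random


def slots_until_all_sent(n_stations, randomized, rng, max_slots=1000):
    """Every station wants to send one frame starting at slot 0.

    Two or more senders in the same slot collide and must retry.
    Returns the number of slots used until every station has sent,
    or None if they are still colliding after max_slots.
    """
    pending = set(range(n_stations))
    next_slot = {s: 0 for s in pending}
    collisions = {s: 0 for s in pending}
    for slot in range(max_slots):
        senders = [s for s in pending if next_slot[s] == slot]
        if len(senders) == 1:
            pending.discard(senders[0])      # a lone sender gets through
            if not pending:
                return slot + 1
        elif len(senders) > 1:
            for s in senders:
                collisions[s] += 1
                if randomized:
                    # binary exponential backoff: the window doubles per collision
                    delay = rng.randrange(2 ** min(collisions[s], 10))
                else:
                    delay = 1                # everyone uses the same fixed timer
                next_slot[s] = slot + 1 + delay
    return None


rng = random.Random(1)
print(slots_until_all_sent(2, randomized=False, rng=rng))  # None: they collide forever
print(slots_until_all_sent(2, randomized=True, rng=rng))   # a small number of slots
```

With identical timers the stations stay in lockstep and never separate; the random delay is what breaks the symmetry.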

--
"It is now safe to switch off your computer."
Re:Sounds reasonable to me by h4rr4r · 2010-05-29 12:18 · Score: 1

No, they are totally separate things. You can run token ring over Ethernet, been there done that. Ethernet does use a bus topology but these days we use switches to avoid collisions.
Re:Sounds reasonable to me by oldhack · 2010-05-29 12:40 · Score: 1

Our networking code explicitly assumes unreliable network transmission, but I doubt much of our code is designed to handle baked-in CPU faults.

--
Fuck systemd. Fuck Redhat. Fuck Soylent, too. Wait, scratch the last one.
Re:Sounds reasonable to me by thegarbz · 2010-05-29 14:04 · Score: 1

Yes but the network protocol is designed in such a way to handle it. CPU instructions are not. If I get a network msg with garbage data due to a collision and don't acknowledge to the transmitting party they simply resend. It's a dumb protocol. Yet if I specifically ask a processor to do a logical shift right on 0100 and it replied 0110, by what metric can I detect the error if the very piece of equipment that handles these calculations for me is also responsible for the error?

Also bear in mind that network topologies are inherently quite inefficient. There is lots of data stored in the packets for error checking and logical ordering of data, and I remember that a while ago researchers demonstrated a faster network by modifying TCP to remove much of this information. What is explained here is exactly the opposite. How do we speed things up if we're looking for errors?
Re:Sounds reasonable to me by Anonymous Coward · 2010-05-29 14:12 · Score: 0

you weren't around back in the day. 4mbit token ring was way better than 10mbit ethernet. collisions were a big problem.
the physical layer for gigabit ethernet is completely different. it's full duplex and hubs are essentially illegal.
Re:Sounds reasonable to me by Anonymous Coward · 2010-05-29 17:14 · Score: 0

What network topologies are inherently inefficient? There are no meaningful error detection codes above L2 that are not application specific. Well, there is a cheap checksum in the IPv4 header, subsequently removed in IPv6.
There is still a TCP-level 16-bit checksum (2 whole bytes), which is basically useless. I remember back in the frame-relay days certain clocking failure modes would lead to a high probability of corrupt data getting through, as by random chance the checksums matched up.
Checksums in today's IP packets are simply worthless decorations from a forgotten era. All of the real error detection and correction is done by hardware at the PHY layer using quite sophisticated algorithms.
In terms of out-of-order delivery of TCP packets in a window the 4-byte sequence number required for other reasons is used for free to provide local reordering of data from the peer.
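One concrete weakness of that 16-bit checksum, as a sketch: it is a ones'-complement sum of 16-bit words (the algorithm described in RFC 1071), so it does not depend on the order of the words. The function below follows that algorithm; the sample data is arbitrary.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement checksum as used by IPv4, TCP and UDP."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                       # fold the carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


original = b"\x12\x34\x56\x78"
reordered = b"\x56\x78\x12\x34"              # same 16-bit words, swapped
bit_flip = b"\x12\x35\x56\x78"               # a single flipped bit

print(internet_checksum(original) == internet_checksum(reordered))  # True: swap goes unnoticed
print(internet_checksum(original) == internet_checksum(bit_flip))   # False: flip is caught
```

So the checksum catches a single flipped bit but is blind to reordered words and to pairs of errors that cancel in the sum, which is why stronger checks such as the Ethernet frame CRC do the real work lower in the stack.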
Re:Sounds reasonable to me by jd · 2010-05-29 17:32 · Score: 1

Ethernet has high latency and the kernel adds even more. In consequence, protocols that work via kernel-bypass (which includes some ethernet drivers) will show some performance gain. That's why Van Jacobson channels are superior to normal kernel-processed packets. The greatest gains come, though, when you eliminate Ethernet. Infiniband has roughly a tenth of the latency and SCI has almost a fiftieth. Which is why those tend to be the interconnects used in high-end clusters. They might be good for gaming, too, but they're too damn expensive and you'd need a hybrid switch that could handle multiple connection types and transliterate between them. Not easy, as Infiniband supports concepts regular Ethernet does not. (There are "extended ethernet" devices that add things like RDMA, but can you name anyone who actually has such a device?)

--
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
Re:Sounds reasonable to me by Anonymous Coward · 2010-05-29 18:51 · Score: 0

a fantastic example
Re:Sounds reasonable to me by badkarmadayaccount · 2010-06-01 10:41 · Score: 1

Any proper programmable DMA engine can make copy latency, as in CPU time, zero. And any decent capability-based CPU won't have to make a complete fraking state change for a privileged function call (system trap). Don't blame OSes for crappy CPU design.

--
I know tobacco is bad for you, so I smoke weed with crack.

Actually you've got the right idea by Anonymous Coward · 2010-05-29 11:01 · Score: 2, Interesting

The summary talked about the communication links... I remember when we were running SLIP over two-conductor serial lines and "overclocking" the serial lines. Because the networking stack (software) was doing checksums and retries, it worked faster to run the line into its fast but noisy mode, rather than to clock it conservatively at a rate with low noise.

If the chip communications paths start following the trend of off-chip paths (packetized serial streams), then you could have higher level functions of the chip do checksums and retries, with a timeout that aborts back even higher to a software level. Your program could decide how much to wait around for correct results versus going on with partial results, depending on its real-time requirements. The memory controllers could do this, using the large, remote SRAM and RAM spaces as an assumed-noisy space and overlaying its own software checksum format on top.
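A rough sketch of the checksum-and-retry scheme described above, with a bounded number of tries so the caller can decide what to do when the link stays noisy. The noisy link is simulated, and all names (`noisy_send`, `transfer`, `max_tries`) are hypothetical; CRC-32 from Python's `zlib` stands in for whatever check the hardware would actually use.

```python
import random
import zlib


def noisy_send(frame: bytes, error_rate: float, rng: random.Random) -> bytes:
    """Simulated link: with probability error_rate, flip one random bit."""
    if frame and rng.random() < error_rate:
        damaged = bytearray(frame)
        damaged[rng.randrange(len(damaged))] ^= 1 << rng.randrange(8)
        return bytes(damaged)
    return frame


def transfer(payload: bytes, error_rate: float, rng: random.Random, max_tries: int = 5):
    """Send payload + CRC-32, retrying until the CRC verifies.

    Returns (payload, tries_used), or (None, max_tries) if every try was
    corrupted, leaving the give-up decision to the caller.
    """
    frame = payload + zlib.crc32(payload).to_bytes(4, "big")
    for attempt in range(1, max_tries + 1):
        received = noisy_send(frame, error_rate, rng)
        body, crc = received[:-4], int.from_bytes(received[-4:], "big")
        if zlib.crc32(body) == crc:
            return body, attempt
    return None, max_tries


rng = random.Random(42)
print(transfer(b"cache line", 0.3, rng))   # usually succeeds within a few tries
print(transfer(b"cache line", 1.0, rng))   # (None, 5): caller must cope
```

Running the link fast and noisy wins whenever the time lost to retries is smaller than the time saved per transfer, which is the trade the SLIP anecdote describes. The `None` return is the "abort back to a higher level" path.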

This is really not so different from modern filesystems which start to reduce their trust in the storage device, and overlay their own checksum, redundancy, and recovery methods. You can imagine bringing these reliability boundaries ever "closer" to the CPU. Of course, you are right that it doesn't make sense to allow noisy computed goto addresses, unless you can characterize the noise and link your code with "safe landing areas" around the target jump points. It makes even less sense to have noisy instruction pointers, e.g. where it could back up or skip steps by accident, unless you can design an entire ISA out of idempotent instructions which you can then emit with sufficient redundancy for your taste.

Moore Slaw by Anonymous Coward · 2010-05-29 11:04 · Score: 0

And maybe some potato salad?

I wish I could apply this to myself by gyrogeerloose · 2010-05-29 11:12 · Score: 1

With all the mistakes I've made, I could be a superman by now.

--
This ain't rocket surgery.

A brainy idea. by Ostracus · 2010-05-29 11:28 · Score: 4, Interesting

He favors a new architecture, that he calls the 'stochastic processor,' which is designed to handle data corruption and error recovery gracefully.

I dub thee neuron.

--
Shai Schticks:"You don't make peace with friends, you make peace with enemies"

Re:A brainy idea. by Anonymous Coward · 2010-05-29 12:36 · Score: 0

Ah, now I understand what a brain fart is.
Re:A brainy idea. by Colin+Smith · 2010-05-29 12:50 · Score: 1

Indeed. It couldn't be used with traditional programming methods, you'd only be able to use it with statistical methods.
Genetic programming maybe. Errors are mutations.

--
Deleted

The problem isn't hardware to begin with... by Angst+Badger · 2010-05-29 11:31 · Score: 4, Insightful

...the problem is software. In the last twenty years, we've gone from machines running at a few MHz to multicore, multi-CPU machines with clock speeds in the GHz, with corresponding increases in memory capacity and other resources. While the hardware has improved by several orders of magnitude, the same has largely not been true of software. With the exception of games and some media software, which actually require and can use all the hardware you can throw at them, end user software that does very little more than it did twenty years ago could not even run on a machine from 1990, much less run usably fast. I'm not talking enterprise database software here, I'm talking about spreadsheets and word processors.

All of the gains we make in hardware are eaten up as fast or faster than they are produced by two main consumers: useless eye-candy for end users, and higher and higher-level programming languages and tools that make it possible for developers to build increasingly inefficient and resource-hungry applications faster than before. And yes, I realize that there are irresistible market forces at work here, but that only applies to commercial software; for the FOSS world, it's a tremendous lost opportunity that appears to have been driven by little more than a desire to emulate corporate software development, which many FOSS developers admire for reasons known only to them and God.

It really doesn't matter how powerful the hardware becomes. For specialist applications, it's still a help. But for the average user, an increase in processor speed and memory simply means that their 25 meg printer drivers will become 100 meg printer drivers and their operating system will demand another gig of RAM and all their new clock cycles. Anything that's left will be spent on menus that fade in and out and buttons that look like quivering drops of water -- perhaps next year, they'll have animated fish living inside them.

--
Proud member of the Weirdo-American community.

Re:The problem isn't hardware to begin with... by Arker · 2010-05-29 11:45 · Score: 1

Spot on.

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Friends don't let friends enable ecmascript.
Re:The problem isn't hardware to begin with... by Anonymous Coward · 2010-05-29 11:54 · Score: 0

That's why real geeks live under the rock of [url=http://suckless.org/]Suckless[/url].
Re:The problem isn't hardware to begin with... by dbirk · 2010-05-29 12:09 · Score: 1

The thing you aren't realizing is that while it might not matter to the person using Microsoft Word on their computer, it matters because now Google has more CPU power to give them better search quality, and Youtube can show people videos on their mobile phone, and WOW can support millions of people logged on to their virtual worlds, and a small startup can buy a server that can scale to thousands of users for a few hundred bucks a month. So while it doesn't affect that individual user's machine, the hardware improvements absolutely affect what they do on their computer, and in a very beneficial way.
Re:The problem isn't hardware to begin with... by blindbat · 2010-05-29 12:33 · Score: 1

I started programming on Apple ][ computers and did assembly. The code was fast and efficient.
However, the benefits of developing with powerful APIs on beefy operating systems are well worth it. There is so much more going on under the hood of a program and system and that costs CPU cycles.
Even software I'm writing for the iPad moves so fast that you *must* include the movements, fades, etc. in order to let the user know something has changed or is changing.
Much is unnecessary and ugly (especially Windows driver interfaces by many manufacturers) but I'll take today over 30 years ago.
Re:The problem isn't hardware to begin with... by uglyduckling · 2010-05-29 12:37 · Score: 1

Higher level languages aren't just there to save developers time. Using higher level languages usually makes it harder to generate code that will walk on protected memory, cause race conditions etc., and higher level languages are usually more portable and make it easier to write modular re-usable code.
Re:The problem isn't hardware to begin with... by Draek · 2010-05-29 13:02 · Score: 4, Insightful

for the FOSS world, it's a tremendous lost opportunity that appears to have been driven by little more than a desire to emulate corporate software development, which many FOSS developers admire for reasons known only to them and God.
You yourself stated that high-level languages allow for a much faster rate of development, yet you dismiss the idea of using them in the F/OSS world as a mere "desire to emulate corporate software development"?
Hell, you also forgot another big reason: high-level code is almost always *far* more readable than its equivalent set of low-level instructions, the appeal of which for F/OSS ought to be similarly obvious.
Sorry but no, the reason practically the whole industry has been moving towards high-level languages isn't because we're all lazy, and if you worked in the field you'd probably know why.

--
No problem is insoluble in all conceivable circumstances.
Re:The problem isn't hardware to begin with... by Homburg · 2010-05-29 13:13 · Score: 3, Insightful

So you're posting this from Mosaic, I take it? I suspect not, because, despite your "get off my lawn" posturing, you recognize in practice that modern software actually does do more than twenty-year-old software. Firefox is much faster and easier to use than Mosaic, and it also does more, dealing with significantly more complicated web pages (like this one; and terrible though Slashdot's code surely is, the ability to expand comments and comment forms in-line is a genuine improvement, leaving aside the much more significant improvements of something like gmail). Try using an early 90s version of Word, and you'll see that, in the past 20 years word processors, too, have become significantly faster, easier to use, and capable of doing more (more complicated layouts, better typography).
Sure, the laptop I'm typing this on now is, what, 60 times faster than a computer in 1990, and the software I'm running now is neither 60 times faster nor 60 times better than the software I was running in 1990. But it is noticeably faster, at the same time that it does noticeably more and is much easier to develop for. The idea that hardware improvements haven't led to huge software improvements over the past 20 years can only be maintained if you don't remember what software was like 20 years ago.
Re:The problem isn't hardware to begin with... by Anonymous Coward · 2010-05-29 13:22 · Score: 0

You think high level languages are a bad idea? Wow. Just wow. I'm not sure you could get much more off the mark with that belief. Sounds like you missed your class on programming languages.
Re:The problem isn't hardware to begin with... by bertok · 2010-05-29 13:25 · Score: 1

All of the gains we make in hardware are eaten up as fast or faster than they are produced by two main consumers: useless eye-candy for end users, and higher and higher-level programming languages and tools that make it possible for developers to build increasingly inefficient and resource-hungry applications faster than before. And yes, I realize that there are irresistible market forces at work here, but that only applies to commercial software; for the FOSS world, it's a tremendous lost opportunity that appears to have been driven by little more than a desire to emulate corporate software development, which many FOSS developers admire for reasons known only to them and God.
This is a common misconception in the computing world: that somehow the additional computing power is 'wasted' on 'bloat'.
The basic principle that one has to understand is that in the meantime, human beings haven't changed. Our brains haven't improved in speed. There is no benefit to us if a program responds in 1 microsecond instead of 1 millisecond.
However, in terms of 'features', programs are still far behind where they should be. There's an awful lot that we as programmers could do that we aren't, either because it's too hard, or because the CPUs just couldn't handle it. Things like predictive text input, grammar checking, mixed languages in a single document, precise font-rendering, etc... make even simple applications like "word processors" very complex internally. All of this complexity makes it difficult to program in low-level languages, so programmers are moving towards easier to use high-level languages like Java and C#. This makes it easier to implement advanced features, at the cost of some performance.
In other words, programmers target a "constant" speed level determined by the properties of the human brain, while the computer hardware changes. This is a lot like the "time budget per frame" that game developers talk about. They target 60fps or 30fps for their games, giving a constant 16.6ms or 33.3ms to do all the required computations in for each frame. Year after year, in that amount of given wall-clock time, processors and video cards have been able to do more, so programmers take advantage of that added capability. There's no point in making a game run at 10,000fps, because humans can't perceive that!
Ordinary office and productivity applications are exactly the same. They target a certain level of responsiveness too, but the processors have been getting faster, allowing programmers to do more in that time.
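For anyone who wants the frame-budget arithmetic spelled out, a small Python sketch follows. The function names are made up for this illustration. The budget is just 1000 ms divided by the target frame rate, and the work for a frame either fits inside it or it doesn't.

```python
def frame_budget_ms(fps):
    """Wall-clock time available per frame, in milliseconds."""
    return 1000.0 / fps

def fits_budget(work_ms, fps):
    """True if the work for one frame fits inside the frame budget."""
    return work_ms <= frame_budget_ms(fps)

for fps in (30, 60):
    print(fps, "fps ->", round(frame_budget_ms(fps), 1), "ms per frame")
```

A faster processor doesn't change the budget. It only changes how much work fits inside it.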
Re:The problem isn't hardware to begin with... by DeadDecoy · 2010-05-29 14:08 · Score: 1

Another thing to mention is that a lot of inefficiencies in code are more likely to come from the design of the code rather than the choice of language. If you program a crappy for-loop, it'll be crappy in any language. High level languages allow us to optimize where needed because there's less complexity, or greater clarity in the source; documentation, unit testing and profiling go a long way too.

Also, the GP says that the increased speed in hardware is only good for eye candy or resource hungry applications. My NLP code and huge databases would beg to differ :P.
Re:The problem isn't hardware to begin with... by Anonymous Coward · 2010-05-29 14:58 · Score: 0

To you and everyone who modded you insightful: why don't you guys sell your laptops and replace them with used stationary 486s that you can get for free?
Re:The problem isn't hardware to begin with... by jasonwc · 2010-05-29 15:06 · Score: 2, Informative

The problem isn't really modern CPUs but the lack of improvement in conventional hard drive speeds. With a Core i7 processor and a 160 GB X-25M Gen 2 Intel SSD, pretty much everything I run loads within 1-2 seconds, and there is little or no difference between loading apps from RAM or my hard drive. Even with a 1 TB WD Caviar Black 7200 RPM drive, my Core i7 machine was constantly limited by the hard drive.
With an SSD, I boot to a usable desktop and can load Firefox with Adblock, Pidgin, Skype, Foobar2000 and Word in around 2 seconds. Many programs like Chrome load so quickly that they are effectively instant-on. Even though quad-core processors are often derided for desktop use, I see a tremendous improvement with a Core i7 + high-performance SSD vs. a Core 2 Duo + mediocre laptop drive. Modern CPUs can make your desktop experience much more responsive. You just need a hard drive that can keep up.
Oh, and in video playback, the difference is incredibly obvious. My roommate is still using a 7 year old laptop which can barely play back a DVD (MPEG-2). In contrast, my Core i7 can simultaneously decode 5 1080p H.264 videos with ease (after this point, the hard drives can't keep up). While this might be considered useless, it definitely makes a difference when running background tasks such as backups. With my Core 2 Duo without hardware decoding, I would have to pause when scheduled backups started or video would skip. With my quad-core system, I can run any task in the background without fear of slowdown, and also use high-quality upscale filters and renderers that would have slowed my dual-core system to a crawl.
Too many people claim that modern processors and hardware do not provide meaningful improvements to the desktop experience. I just don't find this to be true. Multi-core processors have allowed users to run background tasks, install software etc. with no noticeable speed degradation. When I am working with old single-core machines, I miss this benefit.
In addition, today's software is more powerful. You may not need all the features in Word 2007 or the latest Firefox build, but that doesn't mean they aren't useful.
Adblock, Flashblock, Session Management, the ability to have dozens of tabs loaded without memory issues, the ability to stream high definition video in my browser with no or minimal buffering, the "Awesomebar" etc. are all features that didn't exist 5+ years ago.
Real-time indexing of system files and applications is relatively recent, and yet I find that it has fundamentally transformed how I access data.
There are many more examples. It may be popular to say that things haven't changed much in two decades, because word processors are superficially similar for example, but a great deal has changed.
Re:The problem isn't hardware to begin with... by tomhudson · 2010-05-29 15:23 · Score: 1

So you're posting this from Mosaic, I take it? I suspect not, because, despite your "get off my lawn" posturing, you recognize in practice that modern software actually does do more than twenty-year-old software.

[X] I post using telnet, you inconsiderate clod! Now wget off my lawn!

the ability to expand comments and comment forms in-line is a genuine improvement

You're kidding, right? I just leave it in nested mode - far fewer clicks.
Re:The problem isn't hardware to begin with... by rdnetto · 2010-05-29 16:09 · Score: 2, Insightful

a desire to emulate corporate software development, which many FOSS developers admire for reasons known only to them and God.
Probably because the major FOSS developers are in corporate software development.

--
Most human behaviour can be explained in terms of identity.
Re:The problem isn't hardware to begin with... by Jeremi · 2010-05-29 16:53 · Score: 1

Anything that's left will be spent on menus that fade in and out and buttons that look like quivering drops of water -- perhaps next year, they'll have animated fish living inside them.
If animated fish menus are what the consumers want, then why not sell them animated menus? Or, if consumers prefer spending their CPU cycles on getting actual work done instead, they can always buy software with a less fancy GUI. It's not like there aren't options available to suit every taste.

--

I don't care if it's 90,000 hectares. That lake was not my doing.
Re:The problem isn't hardware to begin with... by jd · 2010-05-29 17:46 · Score: 1

I would dispute the "capable of doing more" part. TeX can do anything the latest version of Word can do. Ventura Publisher from 20 years back could probably do just about everything. Word IS faster and easier to use, yes, but unless you compare Word to WordStar or Wordcraft 80, it is generally a mistake to look at what something can do. If a program supports Turing-Complete macros, then it can do absolutely anything a Turing Machine can do (given sufficient memory) and a Turing Machine can do anything that is computationally possible. No computer, past, present or future, will ever do more. They can, however, make it practical. They can also make it simple. And that, really, is where all development in computing goes.

--
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
Re:The problem isn't hardware to begin with... by jd · 2010-05-29 17:50 · Score: 1

Geeks also don't use BBS markup on an HTML markup website.

--
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
Re:The problem isn't hardware to begin with... by 1+a+bee · 2010-05-29 18:16 · Score: 1

I'm not talking enterprise database software here, I'm talking about spreadsheets and word processors.. All of the gains we make in hardware are eaten up as fast or faster than they are produced by two main consumers: useless eye-candy for end users
Oh c'mon.. It's not like I know how to put my fast processor to any better use.
Re:The problem isn't hardware to begin with... by Pastis · 2010-05-29 21:13 · Score: 1

<BadCarAnalogy>
Cars do basically the same thing as they did 70 years ago. Take you from point A to B within a particular amount of time.
Have they improved ? Yes
</BadCarAnalogy>
Same with software. Browsing isn't just browsing anymore. It's assisted browsing. Spreadsheets aren't just spreadsheets, they're collaborative spreadsheets backed up in the cloud.

--
Sneak teach kids Algebra using a game
Re:The problem isn't hardware to begin with... by drinkypoo · 2010-05-29 23:26 · Score: 1

I'm not talking enterprise database software here, I'm talking about spreadsheets and word processors.
We don't all have to run OO.o, there's Gnumeric and Abiword too. They'll run on any old pile of crap.

All of the gains we make in hardware are eaten up as fast or faster than they are produced by two main consumers: useless eye-candy for end users, and higher and higher-level programming languages and tools that make it possible for developers to build increasingly inefficient and resource-hungry applications faster than before.
That's really not completely true. The heaviest applications aren't really impeded by OS eye candy, because while they're doing their heavy lifting, the only graphic activity they're engaging in is updating a progress bar. The actual drawing activities are engaged in by my GPU so my disgustingly powerful CPU (which of course is a budget processor by modern standards) can do other things.

But for the average user, an increase in processor speed and memory simply means that their 25 meg printer drivers will become 100 meg printer drivers and their operating system will demand another gig of RAM and all their new clock cycles.
That's funny, CUPS doesn't seem to have grown THAT much. I did double my RAM recently, but I already felt I needed more. Honestly, my single largest bottleneck now is disk; I feel the need for a RAID. Anybody have four 500GB SATA disks for sale cheap?

--
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
Re:The problem isn't hardware to begin with... by badkarmadayaccount · 2010-06-01 11:08 · Score: 1

The issue is that JIT technology hasn't advanced enough to burn through all the layers of abstraction, though we are getting there. Not to mention we simply need a couple of processor tricks to speed up all those dynamic look-ups. FFS, isn't that what the MMU is for? Someone please add some user mode instructions for MMU queries for virtual table look-ups. And typed registers, overloaded instructions, and typed load instructions. With software exceptions for abstract types. I want my B5000, damn it!

--
I know tobacco is bad for you, so I smoke weed with crack.
Re:The problem isn't hardware to begin with... by badkarmadayaccount · 2010-06-01 11:14 · Score: 1

Where do I get my Fluxbox inspired word processor, with non cryptic integrated overlay command line?

--
I know tobacco is bad for you, so I smoke weed with crack.
Re:The problem isn't hardware to begin with... by Jeremi · 2010-06-03 16:40 · Score: 1

Where do I get my Fluxbox inspired word processor, with non cryptic integrated overlay command line?
One of these might suit you... or there's always emacs or vi. :^)

--

I don't care if it's 90,000 hectares. That lake was not my doing.
Re:The problem isn't hardware to begin with... by badkarmadayaccount · 2010-06-03 20:01 · Score: 1

I was looking for a well designed and easily customizable menu system, small load times, and an absolutely clean interface, for linux and windows. Can't seem to find one there. Any takers for such a project? Seems a reasonable idea.

--
I know tobacco is bad for you, so I smoke weed with crack.

Human learning by Gruff1002 · 2010-05-29 11:38 · Score: 1

We all learn (or are supposed to) from our mistakes; how is a machine supposed to act differently? It's simple logic.

Goes against the trend by izomiac · 2010-05-29 11:42 · Score: 1

The trend lately seems to be to build hardware that runs existing software faster. Designing hardware without legacy support would make for faster, more power efficient hardware. Furthermore, hardware is expensive to modify, whereas software is relatively cheap to update.

OTOH, since the world relies on commercial software distributed in binary form, hardware makers have to support it. Today, the hardware is built so the software doesn't need to be changed, despite the fact that computers would perform at a much higher level if it were the other way around. I suppose one could point out that we have so much software today that porting all of it isn't practical. Of course, the current state of affairs is solely due to the Windows on x86 near-monoculture. People seem to love sticking with what works, rather than go through a bit of pain to achieve higher levels. I suppose people expect that computers aren't ever going to move past the general design standardized in the 1990s.

IMHO, what we need is a clean break, a complete redesign, every decade or so. At that point, most decade-old software should be emulatable, and we get the benefits of the ever-advancing state of computer science. Plus the periodic chaos should prevent complacency and increase competition, while the decade long stability would allow for optimization and provide a common build target. Fat chance that Microsoft or big business would ever go along with that idea though.

Re:Goes against the trend by tepples · 2010-05-29 12:14 · Score: 1

IMHO, what we need is a clean break, a complete redesign, every decade or so.
Game consoles have that. But the problem with those is the cryptographic lockout chips that enforce policies that shut out small businesses.
Re:Goes against the trend by Anonymous Coward · 2010-05-29 18:13 · Score: 0

Fat chance that Microsoft or big business would ever go along with that idea though.
Or that the idea is even practical.
What exactly is going to force vendors to just drop their existing designs and start again? Intel only did that with Itanium because they thought they could take the 64bit transition to push people onto a cleaner architecture, AMD ate their lunch instead. This alone should show you the power of the market forces here, companies have spent trillions writing software to run on x86 (by shitty dev teams who couldn't recognise portability with a magnifying glass), that crap ain't getting replaced in a hurry and there is no reason for those companies to do so when they can just demand faster chips to run the existing software and get them.
The only time such a shake-up will make it past the market forces is when chip technology reaches the bottom of the barrel (can't shrink any further or otherwise improve the propagation delay/clock speed). At that point, I still doubt those programs will go anywhere, x86 chips will still be made as commodity legacy parts whilst the new hawtness will be optical chips or something which will only have a different instruction architecture if they have to rather than by choice.
Re:Goes against the trend by badkarmadayaccount · 2010-06-01 11:15 · Score: 1

Transmeta, where are you when we need you?

--
I know tobacco is bad for you, so I smoke weed with crack.

While not move and fix it? by BartholomewBernsteyn · 2010-05-29 11:44 · Score: 2, Informative

This may be a far-fetched thought, but if stochastic CPUs allow for increased performance in a trade-off for correctness, maybe something like the following description may reap the benefits while keeping out the stochastics right away:
Suppose those CPUs really allow for faster instruction handling using less resources, maybe you could put more in a package, for the same price, which on a hardware level would give rise to more processing cores at the same cost. (Multi-Core stochastic CPUs)
Naturally, you have the ability to do parallel processing, with errors possible, but you are able to process instructions at a faster rate.
On the software side, support for concurrency is a major selling point; of course, there has to be something able to recover from those pesky stochastics gracefully. I came up with the functional language 'Erlang'.
This is taken from wikipedia

http://en.wikipedia.org/wiki/Erlang_(programming_language)#Concurrency_and_distribution_orientation

Concurrency supports the primary method of error-handling in Erlang. When a process crashes, it neatly exits and sends a message to the controlling process which can take action. This way of error handling increases maintainability and reduces complexity of code

From the official source:

http://www.erlang.org/doc/reference_manual/processes.html#errors

Erlang has a built-in feature for error handling between processes. Terminating processes will emit exit signals to all linked processes, which may terminate as well or handle the exit in some way. This feature can be used to build hierarchical program structures where some processes are supervising other processes, for example restarting them if they terminate abnormally.

Asked to 'refer to OTP Design Principles for more information about OTP supervision trees, which use[s] this feature' I read this:

http://www.erlang.org/doc/design_principles/des_princ.html

A basic concept in Erlang/OTP is the supervision tree. This is a process structuring model based on the idea of workers and supervisors. Workers are processes which perform computations, that is, they do the actual work. Supervisors are processes which monitor the behaviour of workers. A supervisor can restart a worker if something goes wrong. The supervision tree is a hierarchical arrangement of code into supervisors and workers, making it possible to design and program fault-tolerant software.

This seems well fit? Create a real, physical machine for a language both able to reap its benefits and cope with the trade-off.
Or maybe I'm too far off (I'm bored technologically, allow me some paradigmatic change at slashdot).
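To show the worker/supervisor idea in miniature, here is a rough Python analogue. It is not Erlang and not the OTP supervisor API: the names supervise and flaky_worker are invented for this sketch, everything runs in one thread, and a Python exception stands in for an Erlang exit signal.

```python
def supervise(worker, max_restarts=3):
    """Run worker(); if it raises, restart it, giving up after max_restarts.

    Returns (result, number_of_restarts_needed).
    """
    restarts = 0
    while True:
        try:
            return worker(), restarts
        except Exception:
            if restarts >= max_restarts:
                raise  # escalate, much as a supervisor would to its parent
            restarts += 1

attempts = {"count": 0}

def flaky_worker():
    """Hypothetical worker that hits a simulated fault on its first two runs."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("simulated hardware fault")
    return 42

print(supervise(flaky_worker))
```

Note this only helps when a fault shows up as a crash. A silently wrong result would still need some kind of checking on top.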

TamedStochastics - Hiring.

Yes, checksumming on dedicated hardware was my first thought as well.

Re:While not move and fix it? by jd · 2010-05-29 13:41 · Score: 1

Oh, I agree with what you've said, but I have a very hard time believing people will be porting Linux, OpenBSD or Windows to Erlang any time soon, let alone take advantage of all the capabilities of Erlang. It could be done, in principle, but in practice the codebase is a serious problem. As I mentioned in my submission, one of the previous attempts to change the way CPUs were designed was the Transputer. It died, not because of any flaw in the design (which was superb) but because training everyone in Occam and then converting all the software over was just too large of an obstacle to overcome. And the volume of code in the 1980s was nothing like the volume of code that exists today.

--
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)

Is this another Fuzzy Logic thing? by Bing+Tsher+E · 2010-05-29 11:48 · Score: 1

It sounds similar.

Nope by klaiber · 2010-05-29 11:51 · Score: 1

If it requires software changes that are not 100% automated, then this won't fly. Programmers have a hard enough time writing sequential programs, let alone multithreaded ones. Now they're supposed to also foresee and check hardware errors? I think not.
I note that the entire idea hinges on the s/w component, yet the article hides the complexity under the harmless-sounding term "robustification".
Another idea from the ivory towers that is good at generating papers, but not actual machines. IMHO.

I've gone even better by Dunbal · 2010-05-29 12:00 · Score: 1

I have designed a CPU that uses only one transistor, requires absolutely no power, and is infinitely fast! Of course at the moment the only instruction it can run is NOP, but I'm working on the problem...

Garbage in, garbage out, professor. A computer that isn't accurate is no longer useful. We might as well go back to using thousands of humans to double-check other thousands of humans. Oh wait no those require FAR more energy and time.

--
Seven puppies were harmed during the making of this post.

OpenCL by tepples · 2010-05-29 12:02 · Score: 2, Interesting

AMD and nVidia's workstation cards are the same as their gaming cards, the only difference being that the workstation ones are certified to produce 100% accurate output. If a gaming card colours a pixel wrong every now and then it's no big deal and the player probably won't even notice.

As OpenCL and other "abuses" of GPU power become more popular, "colors a pixel wrong" will eventually happen in the wrong place at the wrong time on someone using a "gaming" card.

Re:OpenCL by ultranova · 2010-05-29 23:10 · Score: 1

As OpenCL and other "abuses" of GPU power become more popular, "colors a pixel wrong" will eventually happen in the wrong place at the wrong time on someone using a "gaming" card.

Of course, the driver could simply tune each shader core to stress either reliability or speed, based on whether it's running a pixel shader or compute shader.
Coming to think of it, don't some dialects of Lisp let you tell the compiler the desired tradeoff between speed, reliability, generality etc. on a per-function basis?

--
Forget magic. Any technology distinguishable from divine power is insufficiently advanced.
Re:OpenCL by AmiMoJo · 2010-05-29 23:54 · Score: 1

Does it really matter though if one pixel in a HD video frame that is only displayed for 1/25th of a second is wrong?
The reason workstation cards cost so much more is certification. It's the old saying: Cheap, fast, reliable - pick two. As long as there is some mechanism for telling the GPU when it needs to be 100% accurate and when it can afford to make a few mistakes then it's a reasonable trade-off.

--
const int one = 65536; (Silvermoon, Texture.cs)
SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
Re:OpenCL by tepples · 2010-05-30 00:17 · Score: 1

Does it really matter though if one pixel in a HD video frame that is only displayed for 1/25th of a second is wrong?
It appears you are assuming that compute shaders would be used only to decode bidirectionally predicted frames (B-frames) of raster video. But not all codecs support B-frames due to patents, and a noticeable error in an I-frame or P-frame could leave a streak across the image after the decoder applies motion compensation. Worse yet, if the compute shaders are used to compute a game's physics, the errors could compound far more obviously.
Re:OpenCL by AmiMoJo · 2010-05-30 04:06 · Score: 1

You seem to have missed the point.
Decoding of frames would be done in a reliable way. Transforming the frames for display (resizing, colour space conversion etc.) could be done with a tolerance for errors. Physics would need to be 100% accurate but actually rendering the pixels of objects affected by physics need not be.
That is what TFA is getting at. If you can tolerate errors in some operations then the hardware to do them can be much cheaper and run much faster. As I said in my original post general functional operations and important calculations could not be accelerated this way.

--
const int one = 65536; (Silvermoon, Texture.cs)
SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
Re:OpenCL by tepples · 2010-05-30 12:19 · Score: 1

If you can tolerate errors in some operations then the hardware to do them can be much cheaper and run much faster.
But because so much of computing requires reliability, you're duplicating the hardware into more-reliable and less-reliable sections. This duplicate hardware itself would draw power.
Re:OpenCL by badkarmadayaccount · 2010-06-01 08:08 · Score: 1

Scheme, to be precise.

--
I know tobacco is bad for you, so I smoke weed with crack.
Re:OpenCL by badkarmadayaccount · 2010-06-01 08:13 · Score: 1

Make lots of less reliable hardware, and then add software error correction for critical calculations. It might turn out that it's cheaper in power and silicon to move the precision concern out of hardware and into software, and add more cheap and energy efficient horsepower to handle the extra processing. At least that's what the paper says.
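A minimal Python sketch of what "software error correction for critical calculations" could look like, assuming the simplest scheme I can think of: run the calculation several times and take a majority vote. The noisy_add function is a hypothetical stand-in for an unreliable ALU and voted is an invented helper. This is not the method from the paper.

```python
import random
from collections import Counter

def noisy_add(a, b, error_rate, rng):
    """Hypothetical unreliable adder: sometimes flips one of the low 8 bits."""
    total = a + b
    if rng.random() < error_rate:
        total ^= 1 << rng.randrange(8)
    return total

def voted(op, copies=3):
    """Run op() `copies` times and return the most common result."""
    results = [op() for _ in range(copies)]
    return Counter(results).most_common(1)[0][0]

rng = random.Random(2010)  # fixed seed so runs are repeatable
print(voted(lambda: noisy_add(2, 3, 0.05, rng), copies=5))
```

Voting only makes a wrong answer less likely, it does not rule one out, and every extra copy costs time and power. Whether that still beats reliable hardware is exactly the open question.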

--
I know tobacco is bad for you, so I smoke weed with crack.

I'd just like to point out... by DavidR1991 · 2010-05-29 12:04 · Score: 2, Insightful

...that the Transmeta Crusoe processor has sod-all to do with porting or different programming models. The whole point of the Crusoe was that it could distil down various types of instruction (e.g. x86, even Java bytecode) to native instructions it understood. It could run 'anything' so to speak, given the right abstraction layer in between

Its lack of success was nothing to do with programming - just that no one needed a processor that could do these things. The demand wasn't there

Re:I'd just like to point out... by 10101001+10101001 · 2010-05-29 15:34 · Score: 2, Interesting

The whole point of the Crusoe was that it could distil down various types of instruction (e.g. x86, even Java bytecode) to native instructions it understood. It could run 'anything' so to speak, given the right abstraction layer in between
Yea, uh, that's true for *any* general purpose processor. What Crusoe originally promised was that this dynamically recompiled code might be either faster (by reordering and optimizing many instructions to fit Crusoe's Very Long Instruction Word design--not unlike how the Pentium Pro and above do it in hardware with multiple ALU/FPU functional units) or more power efficient (by removing the hardware parts of the reorderer/optimizer and having the software equivalent run infrequently). Of course, the former just didn't hold because Intel/AMD could just pump out higher hertz processors and the latter didn't matter as much as simply underclocking the whole CPU when the system was idle (which is often enough). In short, Crusoe found two niches that both Intel and AMD cornered.

Its lack of success was nothing to do with programming - just that no one needed a processor that could do these things. The demand wasn't there
Well, that's the other part of the equation. If the Crusoe had actually provided multiple abstraction layers and not just the one (the x86 one), perhaps they could have survived. Crusoe would have been a great platform for emulating the PSX, for example. Or providing multiple, concurrent x86/Java/whatever systems to sandbox for servers--and for which the power efficiency would be important. But, then, providing well-optimized and many software solutions isn't an easy task, especially when balanced against the task of avoiding running the "Code Morphing" software as much as possible to avoid all the penalties associated with it.
In short, the problem wasn't the demand per se; it was a lack of supply. Pinning the hopes of the company on a few niches that the competitors managed to occupy relatively quickly certainly didn't help.

--
Eurohacker European paranoia, gun rights, and h

Mistakes make you learn more... by AmazinglySmooth · 2010-05-29 12:05 · Score: 1

I had a physics prof for freshman physics that said you learn more from mistakes. I told him that we must be physics experts by now.

Re:Mistakes make you learn more... by oldhack · 2010-05-29 13:47 · Score: 1

I keep "learning" like today, I'd lose all my clients and go bankrupt in a month.

--
Fuck systemd. Fuck Redhat. Fuck Soylent, too. Wait, scratch the last one.

How eye candy helps interoperability by tepples · 2010-05-29 12:11 · Score: 2, Interesting

And yes, I realize that there are irresistible market forces at work here, but that only applies to commercial software; for the FOSS world, it's a tremendous lost opportunity that appears to have been driven by little more than a desire to emulate corporate software development, which many FOSS developers admire for reasons known only to them and God.

I think I know why. If free software lacks eye candy, free software has trouble gaining more users. If free software lacks users, hardware makers won't cooperate, leading to the spread of "paperweight" status on hardware compatibility lists. And if free software lacks users, there won't be any way to get other software publishers to document data formats or to get publishers of works to use open data formats.

Just adds another layer... by VortexCortex · 2010-05-29 12:19 · Score: 1

We rarely write software that is even robust enough to be secure against unexpected input on our current "reliable" chips (see: Viruses and other malware).

The idea of having application programmers cope with the new unpredictable hardware errors is seriously flawed.

In the end an additional "software" layer (probably actually firmware) will have to deal with this new type of hardware error; Application level coding (and existing software) will continue working as usual.

If this turns out to be faster than current techniques: Meh. A new faster processor and a new buzzword will be born.

I'll be interested when I can buy the new hardware and run *nix on it; Until then the only buzzword that comes to mind is "vaporware".

Late, and innaccurate by gman003 · 2010-05-29 12:19 · Score: 3, Interesting

I've seen this before, except for an application that made more sense: GPUs. A GPU instruction is almost never critical. Errors writing pixel values will just result in minor static, and GPUs are actually far closer to needing this sort of thing. The highest-end GPUs draw far more power than the highest-end CPUs, and heating problems are far more common.

It may even improve results. If you lower the power by half for the least significant bit, and by a quarter for the next-least, you've cut power 3% for something invisible to humans. In fact, a slight variation in the rendering could make the end result look more like our flawed reality.
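
A rough check of that 3% figure, as a sketch only. It assumes a 24-bit pixel and that every bit costs the same power to write at full voltage; neither assumption comes from the post, and `power_saved_fraction` is a made-up helper name.

```python
# Assumptions (not from the post): 24 bits per pixel, equal power per bit.
BITS_PER_PIXEL = 24

def power_saved_fraction(bits: int, cuts: list[float]) -> float:
    """Fraction of total power saved when the lowest bits are run cheaper.

    cuts[i] is the share of power removed from the i-th least significant
    bit (0.5 means that bit runs at half power).
    """
    return sum(cuts) / bits

# Half power off the lowest bit, a quarter off the next one.
saving = power_saved_fraction(BITS_PER_PIXEL, [0.5, 0.25])
print(f"{saving:.1%}")
```

Under those assumptions the saving is 0.75 of one bit's power out of 24 bits, which lands close to the 3% quoted above. With 32 bits per pixel it would be lower.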

A GPU can mess up and not take down the whole computer. A CPU can. What happens when the error hits during a syscall? During bootup? While doing I/O?

Re:Late, and innaccurate by Anonymous Coward · 2010-05-29 12:34 · Score: 0

While encrypting your /home?
GPUs are used for crypto too.
Re:Late, and innaccurate by MostAwesomeDude · 2010-05-29 12:50 · Score: 1

Not *that* kind of crypto. Still...
GPUs become inaccurate intentionally. Most GPU instructions are as accurate as IEEE 754 requires, and some are *more* accurate because they are directly in silicon. For example, reciprocals and square roots usually have dedicated circuits. Everything is at least a single-precision float. The inaccuracy comes later, during output; GPUs can be configured to dither away quality or lower their color depth in order to work with software that expects lower quality.
However, general-purpose computing GPUs, from the Dx 9 era onwards, are all shaderful, and shader units are not designed to be inaccurate.
Finally, GPUs are more than just the shader unit. If an error occurs on the GPU that causes the DMA unit to lock up, then the OS will spin its wheels endlessly trying to get the GPU to talk to it again. We've mitigated this somewhat in Linux, but it's still possible for a misprogrammed or misbehaving GPU to lock up the PCI/AGP/PCIe bus entirely, something we can't possibly recover from.

--
~ C.
Re:Late, and innaccurate by Kjella · 2010-05-29 13:07 · Score: 1

But it's a long time since computers just drew something up on the screen. One little error in a video decoding will keep throwing off every frame until the next keyframe. One error in a shader computation can cause a huge difference in the output. What coordinates have an error tolerance after all is transformed and textured and tessellated and whatnot? An error in the Z-buffer will throw one object in front of another instead of behind it. The number of operations where you don't get massive screen corruption is not that high.

--
Live today, because you never know what tomorrow brings
Re:Late, and innaccurate by thegarbz · 2010-05-29 14:08 · Score: 2, Interesting

I can't see this working. The premise here was that the hardware allows faults, yet I don't see how you could design hardware like this to be accurate on demand. GPUs aren't only used for games these days. Would an error still be tolerated while running Folding@Home?
Re:Late, and innaccurate by ElForesto · 2010-05-29 14:25 · Score: 1

Another good application would be for PMPs and other mobile devices. Who cares if you have one pixel decoded improperly? Odds are you won't notice on that tiny screen and you'd happily trade that for doubling your battery life. Power consumption is, at best, a tertiary concern on a desktop or server.

--
There is a difference between "insightful" and "inciteful" other than spelling.
Re:Late, and innaccurate by Anonymous Coward · 2010-05-29 17:59 · Score: 0

Everything is at least a single-precision float.
Not quite, they added Half-precision floats at least a year ago, I think more. (16bit float numbers)

low power - for embedded and server farms not desk by mr_walrus · 2010-05-29 12:24 · Score: 1

it could take off, but in specialized areas like embedded designs (low power - long battery life consideration)
and in server farms (low power, low cooling and electric costs).

embedded and server applications do not have the bloated huge application suites that need porting.
ie: big bloated popular desktop apps like photoshop and excel are not an issue for this new
cpu design approach to be adopted.

server apps 'relatively' speaking are much less bloated, and often have been ported a zillion times already
adding error robustness is doable and worth doing for the potential savings.

embedded apps don't mind doing what it takes to achieve battery life noticeably longer than the competitor.
and often do such specialized (read: not bloated) functions that error robustness should also be doable -- even if currently
glossed over.

but, regardless how desirable this turns out to be, if a "big guy" (read Intel) couldn't be bothered to push
it, it'll die :(

Take a step back? by solid_liq · 2010-05-29 12:29 · Score: 1

He apparently wants us to take a step backwards to the days when crashes were frequent, such as with Windows 98. Software quality has a long way to go already. Does he not realize that making programmers deal with such an issue would bring software quality back into the Dark Ages?

As it is, programmers aren't given enough time to write software that works bug-free. Schedules are always rushed. This would dramatically increase: the burden on developers, the quantity of bugs, the number of developers being fired because they didn't get a project accomplished nearly as quickly as someone who pulled off a similar project 5 years earlier, the frustration of the users and developers (and transitively, the number of heart attacks around the world due to elevated blood pressure), the number of security vulnerabilities in software, and the migration rate to processor vendors who didn't make this mistake.

In short: this guy is on crack!

The perfect processor by Anonymous Coward · 2010-05-29 12:35 · Score: 1, Insightful

I think Mr Kumar is confusing the performance of the designer to develop a useful power efficient product today on a modern process with the performance of the end result. There is no law or provable proposition that a useful processor needs to be sloppy to outperform a neat competitor. This only holds true when you fail to include the cost of being sloppy and limit the intelligence and creativity of the designer. Any figures you produce to prove your point are by definition limited to a narrow set of defined tradeoffs. It does not represent what is possible when someone smarter than yourself is designing a solution.

The space is difficult and getting more and more so. Deal with it or find another job. For quite some time now innovations in the space have always come from techniques to mitigate complexity and error. When you're designing non-trivial ASICs it's what you do.

In certain areas analog computers make sense. Heck our brains are analog computers but asking a classic computing environment to check itself is a non-starter in terms of any product users will accept.

Circuit design is somewhat of an art. There is an infinite array of subtle tradeoffs and clever hacks one can use to improve performance, such as use of crosstalk to bootstrap charging of neighboring caps, clock gating, distributed clocking, intentional glitching, even the use of analog circuits in certain limited roles.

What pisses me off the most about articles like this is that designers suffer from tunnel vision and therefore act like morons. I mean look at a modern desktop PC. Intel et al tout their speedstep, bus power management, LPC..etc to save energy and they have epically failed. Why does a computer doing absolutely nothing need to use >100 watts to sit idle? If they can't get reasonable power scaling from clock gating then why not just design an idle processor that's slow and stupid (ie ATOM) and shut the other crap down when it's not needed. If people really cared about power there are a lot of relatively low tech solutions that would work and make huge dents in world demand for energy to power electronics.

But we still have a situation where GPU designers would rather let their processors idle at 70C to protect against temp gradients and not have to account for effects of temperature changes on their circuits.

Reminds me of Java2k by Anonymous Coward · 2010-05-29 12:53 · Score: 0

This reminds me of Java2K, an esoteric programming language inspired by physics. When you do measurements in physics, you have to be specific about the error. You have to deal with it, so you have to think about how to minimize it in your contraptions.

In Java2k, all instructions can misbehave. So x1/x2 will divide x1 by x2, but do so only with a probability of 90% correctly. And all variables start with random values, "like in real physics". A language impossible to work with?! Turns out, you can at least do simple things:

integer variable x
integer variable y
y = x/x

at the end of the computation, y=1 -- with a probability of 90%. Now, how to proceed...?
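
One possible way to proceed is to repeat the flaky operation and vote. The sketch below is plain Python, not Java2K syntax; `flaky_div` and `voted_div` are hypothetical names, and the 90% figure is taken from the description above rather than from the language spec.

```python
import random
from collections import Counter

def flaky_div(a: int, b: int, rng: random.Random, p_ok: float = 0.9) -> int:
    """Integer division that returns a wrong answer with probability 1 - p_ok."""
    correct = a // b
    if rng.random() < p_ok:
        return correct
    return correct + rng.choice([-2, -1, 1, 2])  # some nearby wrong value

def voted_div(a: int, b: int, rng: random.Random, runs: int = 9) -> int:
    """Repeat the flaky operation and keep the most common answer."""
    results = Counter(flaky_div(a, b, rng) for _ in range(runs))
    return results.most_common(1)[0][0]

rng = random.Random(2010)  # fixed seed so the run is repeatable
trials = 10_000
single = sum(flaky_div(7, 7, rng) == 1 for _ in range(trials)) / trials
voted = sum(voted_div(7, 7, rng) == 1 for _ in range(trials)) / trials
print(f"single op correct: {single:.3f}, 9-way vote correct: {voted:.3f}")
```

The vote never reaches 100%, it only pushes the failure rate down, and it costs nine times the work. It also quietly assumes the counting and comparing are themselves reliable.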

Re:Reminds me of Java2k by Anonymous Coward · 2010-05-29 14:11 · Score: 0

link: http://p-nand-q.com/humor/programming_languages/java2k.html

I don't know why by BronsCon · 2010-05-29 12:55 · Score: 0, Offtopic

but now I want some White Castle.

--
APK quotes people (including myself) without context and should not be trusted. Just thought you should know.

Re:I don't know why by BronsCon · 2010-05-31 11:35 · Score: 1

Driveby WTF-mod. I was going for funny; apparently you haven't seen many movies in the last 5 years?

--
APK quotes people (including myself) without context and should not be trusted. Just thought you should know.

Assuming You May Be Wrong . . . by NicknamesAreStupid · 2010-05-29 13:01 · Score: 1

. . . is a calculated move.

What, like the brain? by Baron_Yam · 2010-05-29 13:27 · Score: 1

Bring it on down to the actual transistor level and compare it to the brain - we use x more neurons for a given job than a human might use transistors for a similar function.

The brain expects neurons to misfire and goes on averages of clusters. This allows neurons to be kept on more of a hair trigger, which makes them less energetically expensive to change state. The same can theoretically be done with transistors - we use fairly high voltage (I'm not an EE, feel free to correct me here) differences to register as 1 or 0, but if we allowed for higher error rates, we could use closer values, or have a RANGE of values and get better than binary complexity, at a lower energy cost.

Not really by Sycraft-fu · 2010-05-29 13:46 · Score: 3, Informative

Ethernet has lower latency than token ring, and is overall easier to implement. However its bandwidth scaling is poor, when you are talking old school hub ethernet. The more devices you have, the less total throughput you get due to collisions. Eventually you can grind a segment to a complete halt because collisions are so frequent little data gets through.

Modern ethernet does not have collisions. It is full duplex, using separate transmit and receive wires. It scales well because the devices that control it, switches, handle sending data where it needs to go and can do bidirectional transmission. The performance you see with gig ethernet is only possible because you do full duplex communications with essentially no errors.

Bad Title by emblemparade · 2010-05-29 14:23 · Score: 1

The point is not that mistakes can improve performance, but that allowing for mistakes improves performance.

Its actually reasonable by ratboy666 · 2010-05-29 14:25 · Score: 1

Simply put, we already use this. Network transport may have errors, and these are dealt with at higher levels. As long as a corruption can be detected, we are ok. But, if a computation results in an error, and the checking of it may also result in an error, we have a problem. Some part must be guaranteed. But the transmission can be handled the same way that networks are handled.

If the store is not reliable, we can use RAID 5 or the like. This can even be done with main memory. But, we can't easily segregate the parts that have to be retained because they are expensive to recompute from those that are easy to recompute. Certainly RAID 5 storage doesn't make that distinction.
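
For anyone who hasn't seen it, the parity trick behind RAID 5 fits in a few lines. This is a toy sketch with three in-memory blocks standing in for devices (`xor_blocks` is a made-up helper); real arrays rotate the parity across disks and work on stripes.

```python
from functools import reduce

def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))

data = [b"ADD ", b"MUL ", b"JMP "]  # three equal-sized data blocks
parity = xor_blocks(data)           # would live on a fourth device

# Device 1 is lost; XOR of the survivors and the parity rebuilds it.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt)
```

Note that parity alone only rebuilds a block that is known to be missing. It does not say which block is wrong after a silent corruption, which is the detection problem again.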

But, between an auto-correcting storage and a correcting data transport, something like this should be implementable.

Now, I have to read the fine article to determine why he thinks that this will allow speeds to increase. Certainly I can see it in limited areas. For example, a network packet buffer need not be 100% reliable. Nor must a raw disk buffer (in both cases, the error correction will happen a layer up).

--
Just another "Cubible(sic) Joe" 2 17 3061

What if .. by sandiegoguides · 2010-05-29 14:29 · Score: 1

And .. what if there is an error or corruption during the recovery process .. eh?

I'd rather do the job right than do the job fast by junglebeast · 2010-05-29 15:01 · Score: 1

Faster performance is a luxury we don't really need. We only have applications demanding higher performance because it's available. There might possibly be a niche role for this in dedicated hardware for exploring certain computational problems, such as protein folding or monte carlo simulations or whatever, but not in our home computers.

stochastic screening by jsepeta · 2010-05-29 15:51 · Score: 1

stochastic screens help when scanning documents by allowing you to get pretty good visuals with lower resolution (fewer ppi).
not sure how the stochastic method could help with computing though. don't the values need to be either ON or OFF?
stochastic screens give the illusion of gray (both on AND off)

--
Remember kids, if you're not paying for the service, YOU ARE THE PRODUCT THAT IS BEING SOLD.

Even Intel and HP couldn't manage it by Ilgaz · 2010-05-29 16:46 · Score: 1

Ink-robbing business aside, HP is a very prestigious company with decades of experience in computing. Intel is the company that makes the actual chip. Both of them and their billions, along with support from Microsoft, couldn't manage to release the compiler that would do it, and AMD's traditional "let's plug 64-bit into x86" approach won.

I mean it would really take a lot to convince those guys to rely on "clever compilers" for anything. Billions lost for nothing, lots of code erased, lots of company image wasted. If Intel were something like ARM Holdings or AMD and made that mistake, there wouldn't be an Intel today.

Nobody (in end user software, games etc.) does reliable multi-core execution yet. Of course a lot of stuff works with multi-core, I have been watching since my quad G5 purchase, but at some point even a simple thing (compared to a game) like setting MAKEFLAGS='-j4' may do insane things from time to time. That is basic multi-core, perhaps it shouldn't even be called multi-core (just 4 tasks running in parallel).

Will this be 'interesting but dead-end' research by Anonymous Coward · 2010-05-29 16:48 · Score: 0

Will this be 'interesting but dead-end' research(?)

Yes.
Who wants a crap processor or at least one that has guaranteed errors?
I bet we'll get ratings like "80% accuracy" lol

Forgiving, like analogue record/35mm by Ilgaz · 2010-05-29 16:56 · Score: 1

I couldn't read the article or the journalistic mess resulting from it, but I guess it is something like dust particles on a lens. If someone takes a photo with a couple of dust particles on the lens (and can't notice them), why would a device/encoder (especially a lossy one) care about minor quirks and spend time on "absolutely pixel perfect" data?

So... It was harmful after all by Ilgaz · 2010-05-29 17:00 · Score: 1

I guess people smiling at this article will have to think again ;)

http://www.thocp.net/biographies/papers/goto_considered_harmful.htm

Re:So... It was harmful after all by jd · 2010-05-29 17:58 · Score: 2, Funny

That's why some languages use ComeFrom rather than GoTo.

--
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)

Neuromorphic designs by wanax · 2010-05-29 17:15 · Score: 1

There are plenty of other ideas to deal with noisy chips.. I'd point out DARPA's SyNAPSE program as an example. Due to quantum constraints, the future of deterministic computation must eventually deal with the noise in a robust manner. The above efforts are focusing on memristor technology.

I don't know whether stochastic architectures do better than noisy memristor ones, but either way we'll have to learn how to program in an environment where the least predictable element is not the one at the console.

This branch could bear some interesting fruit... by trims · 2010-05-29 17:26 · Score: 4, Interesting

I see lots of people down on the theory - even though the original proposal was for highly-error forgiving applications - because somehow it means we can't trust the computations from the CPU anymore.

People - realize that you can't trust them NOW.

As someone who's spent way too much time in the ZFS community talking about errors, their sources and how to compensate, let me enlighten you:

modern computers are full of uncorrected errors

By that, I mean that there is a decided tradeoff between hardware support for error correction (in all the myriad places in a computer, not just RAM) and cost, and the decision has come down on the side of screw them, they don't need to worry about errors, at least for desktops. Even for better quality servers and workstations, there are still a large number of places where the hardware simply doesn't check for errors. And, in many cases, the hardware alone is unable to check for errors and data corruption.

So, to think that your wonderful computer today is some sort of accurate calculating machine is completely wrong! Bit rot and bit flipping happens very frequently for a simple reason: error rates per billion operations (or transmissions, or whatever) have essentially stayed the same for the past 30 years, while every other component (and bus design, etc.) is pretty much following Moore's Law. The ZFS community is particularly worried about disks, where the hard error rates are now within two orders of magnitude of the disk's capacity (e.g. for a 1TB disk, you will have a hard error for every 100TB or so of data read/written). But, there's problems in on-die CPU caches, bus line transmissions, SAS and FC cable noise, DRAM failures, and a whole host of other places.

Bottom line here: the more research we can do into figuring out how to cope with the increasing frequency of errors in our hardware, the better. I'm not sure that we're going to be able to force a re-write of applications, but certainly, this kind of research and possible solutions can be taken care of by the OS itself.

Frankly, I liken the situation to that of using IEEE floating point to calculate bank balances: it looks nice and a naive person would think it's a good solution, but, let me tell you, you come up wrong by a penny more often than you would think. Much more often.
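
The penny problem is easy to reproduce. A minimal illustration using Python's built-in binary floats next to its `decimal` module; the amounts are arbitrary examples, not taken from any real ledger.

```python
from decimal import Decimal, ROUND_HALF_UP

# Binary floats cannot represent most decimal fractions exactly.
print(0.1 + 0.2 == 0.3)

# Rounding a half-cent amount: the stored float is slightly below 2.675.
float_rounded = round(2.675, 2)

# Decimal keeps the value as written and rounds the way a ledger would.
decimal_rounded = Decimal("2.675").quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
decimal_sum = Decimal("0.10") + Decimal("0.20")

print(float_rounded, decimal_rounded, decimal_sum == Decimal("0.30"))
```

The float version comes out a penny short on the half-cent case, while the decimal version rounds up as an accountant would expect.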

-Erik

--
There are always four sides to every story: your side, their side, the truth, and what really happened.

It took so long to state the obvious? by iamacat · 2010-05-29 18:33 · Score: 1

RAM, disk drives, CD-ROMs and modems are all designed to allow a significant possibility of errors and employ redundancy to minimize the impact on the end user. Why would anyone think CPUs would be exempt from similar design needs? Most demanding calculations take much less time to verify than to perform. If you could factor large numbers or compress files twice as fast but with a 5% error rate, wouldn't you spring for an error-free coprocessor or slower error-correcting verification code as a trade-off? No software will need to be rewritten except the error-correcting compiler, but specialized languages may be available to those who want to take advantage of raw unreliable mode for video encoding or such.
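
A sketch of the verify-then-retry idea, using factoring as the example. Everything here is illustrative: `unreliable_factor` fakes a sloppy unit by corrupting a correct answer about 5% of the time, and the check is assumed to run on reliable hardware.

```python
import math
import random

def trial_factor(n: int) -> tuple[int, int]:
    """Slow but exact: split n at its smallest nontrivial factor."""
    for p in range(2, math.isqrt(n) + 1):
        if n % p == 0:
            return p, n // p
    return 1, n

def unreliable_factor(n: int, rng: random.Random, error_rate: float = 0.05):
    """Stand-in for a fast, sloppy unit: wrong about 5% of the time."""
    p, q = trial_factor(n)
    if rng.random() < error_rate:
        q ^= 1 << rng.randrange(8)  # flip one low bit of the result
    return p, q

def checked_factor(n: int, rng: random.Random):
    """Retry until one cheap multiplication confirms the answer."""
    attempts = 0
    while True:
        attempts += 1
        p, q = unreliable_factor(n, rng)
        if p * q == n:  # verification costs a single multiply
            return p, q, attempts

rng = random.Random(5)
n = 10007 * 10009
print(checked_factor(n, rng))
```

The search does thousands of divisions while the check does one multiplication, so even with the occasional retry the checked version stays close to the speed of the sloppy one.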

GPU, yes; general-purpose CPU, no by sydbarrett74 · 2010-05-29 18:45 · Score: 1

I can see this working for a graphics chip -- after all, who cares if a tiny portion of an image is a pixel or two off? For execution of an actual application, however, I think this idea sucks. There are far better ways to reduce power consumption, like asynchronous- or reversible computing techniques.

--
'He who has to break a thing to find out what it is, has left the path of wisdom.' -- Gandalf to Saruman

Re:GPU, yes; general-purpose CPU, no by owlstead · 2010-05-30 02:42 · Score: 1

I agree, in short it is fine for rigidly defined areas of doubt and uncertainty :)

What would you rather have control nuclear ICBMs? by Anonymous Coward · 2010-05-29 18:55 · Score: 0

Computer hardware that is slow, but does what it is told, without error?

Or computer hardware that is Defective by Design, that might launch a first strike on Russia (without human approval) at any time, and that depends on a bug-free pattern to the bugginess PLUS bug-free error handling code to prevent a nuclear holocaust?

"And the King had the idiot who proposed the over-complex 'speedup' scheme thrown to the alligators in the moat, and the rest of the kingdom got to live another day."

Re:Impossible design - no its called Hamming code by tg123 · 2010-05-29 19:18 · Score: 1

Hamming code adds parity bits to data so that errors can be detected and corrected.

Hamming code detects errors and corrects them and is used in RAID arrays and Satellite transmissions.

The instruction set would have to be designed so that when a mistake is made it can be detected, i.e. with enough Hamming distance between valid instruction encodings, but it could be done.
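
To make that concrete, here is a small Hamming(7,4) sketch: 4 data bits plus 3 parity bits, able to correct any single flipped bit. The function names are made up for this example, and a real instruction encoding would need a much wider code.

```python
def hamming74_encode(d: list[int]) -> list[int]:
    """Encode 4 data bits as 7 bits; parity bits sit at positions 1, 2 and 4."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(c: list[int]) -> tuple[list[int], int]:
    """Return (data bits, position of the corrected bit, or 0 if none)."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3  # 1-based position of the bad bit
    if syndrome:
        c[syndrome - 1] ^= 1         # flip it back
    return [c[2], c[4], c[5], c[6]], syndrome

word = [1, 0, 1, 1]
code = hamming74_encode(word)
code[4] ^= 1                         # one bit flipped "in transit" (position 5)
data, fixed_at = hamming74_decode(code)
print(data == word, fixed_at)        # True 5
```

Two flipped bits in the same word would be "corrected" to the wrong value by this plain (7,4) code, which is why practical schemes add an extra overall parity bit to at least detect that case.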

http://en.wikipedia.org/wiki/Hamming_code

http://en.wikipedia.org/wiki/Hamming_distance

http://en.wikipedia.org/wiki/Richard_Hamming

Yeah... bullshit by synaptic · 2010-05-29 19:20 · Score: 1

What kind of name is that anyhow? Kumar? What is that five o's or two u's?

Consequence: House-fires by rawler · 2010-05-29 22:14 · Score: 1

One of the more drastic consequences of poorly-performing software, is that hardware companies keep having huge incentives for creating faster and faster machines. That in itself, would be a good thing.

The bad thing is that the arms race of hardware performance is what causes the HUGE power demand, leading to poor battery times and lots of heat production. Many desktop PSUs today are in the 600-700W range (and above), which is about the output power of a small microwave oven. If you're not careful about regularly opening up your machine and cleaning it out, it's going to get clogged with dust. That's a recipe for a fire hazard. (In a recent lecture on fire safety, I learnt that fires caused by computer overheating are on the rise, to no surprise.)

If instead Software developers maintained a focus for designing and coding efficient applications, the performance of ~6 years old machines would be _VERY_ snappy for all common tasks today. Then the incentive for creating faster machines for the average Joe would not be as great (since Joe does not care if starting the browser takes 50ms or 100ms, he does not notice anyways), so the hardware competition could happen in other qualities; power-consumption, reliability, price.

Especially, fewer houses would burn down because someone forgot the laptop on in the sofa.

Current CPU architecture is lame by Anonymous Coward · 2010-05-30 00:48 · Score: 0

It would be awesome if it worked and the idea gets adopted.
Current CPU architecture is antiquated and lame. We need something new.

Re:This branch could bear some interesting fruit.. by asvravi · 2010-05-30 04:48 · Score: 1

Nice to read a sane and open-minded comment after so many foolish rants from closed-minded individuals here on slashdot. If it were left to the typical slashdot crowd, no far-looking scientific breakthrough would have ever progressed beyond the proposal stage.

Amen to that. by Anonymous Coward · 2010-05-30 04:58 · Score: 1, Interesting

As a game developer, I used to not think about the possibility of hardware errors much. Until I had a very difficult-to-pin-down bug which turned out to be a hardware defect in a single L2 cache line on the console hardware. It worked fine for the first few years of its life, and then this one bit developed a "stickiness" so that sometimes it would return the wrong value and cause our software to crash.

We then ran an exhaustive RAM reading and writing test on all of the devkits our team was using, and it turned up *three more* kits that couldn't read and write correctly to all of their RAM. These are $10,000 devkits, but their reliability is about the same as the consumer hardware people have at home. Is it any surprise that consoles often need to be shipped back to MS or Sony or Nintendo and replaced? It no longer surprises me in the least.

Desktop computers may be a little ahead of consoles in reliability, but when you're doing BILLIONS of calculations every second, it's inevitable that random physical quirks (cosmic ray strike or whatever) will mess one of them up sooner or later. Anyone who overclocks their PC knows that if you OC too much you start to fail the reliability tests because errors are creeping in.

I think... by fyngyrz · 2010-05-30 05:25 · Score: 1

...Youtube has amply demonstrated that there is no problem at all encoding, and subsequently decoding, h.264 videos that are full of errors. Recoverable errors, fatal errors, even funny errors. All may be encoded, and then decoded.

Now, encoding/decoding tolerance? You can do it, but will anyone watch it?

This research will go nowhere.

Now, if he was encoding porn tolerance, we'd really have something society needs. But you know, that only leads to more needs. Like keyboard protectors.

--
I've fallen off your lawn, and I can't get up.

Stochastic computers of the 70's : Welcom'back hom by Anonymous Coward · 2010-05-30 10:11 · Score: 0

Strangely enough, there were such things as stochastic computers in the 70's. The Wikipedia article is in French, but you can use the Google translator (?) if you want to :

http://fr.wikipedia.org/wiki/Calculateur_stochastique

I/O: make it practical by tepples · 2010-05-30 12:18 · Score: 1

If a program supports Turing-Complete macros, then it can do absolutely anything a Turing Machine can do (given sufficient memory) and a Turing Machine can do anything that is computationally possible.

The output of a classical Turing machine is one boolean value: either "halted true" or "halted false". The theoretical Turing machine has no conception of interactive input and output. So being Turing complete isn't enough; I/O capability is also crucially important, and that's where your "make it practical" comes into play.

Re:I/O: make it practical by jd · 2010-05-30 14:44 · Score: 1

But modern I/O is merely setting the value(s) of a region of memory that is memory-mapped onto an output device in some manner. A Turing Machine can certainly set memory values, even if a classic TM doesn't directly have any concept of memory mapping.

--
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)

Oldnews by AlgorithMan · 2010-05-30 12:23 · Score: 1

1. fuzzy logic
2. http://tech.slashdot.org/article.pl?sid=09/02/08/1716235

--
The MAFIAA is a bunch of mindless jerks who will be the first up against the wall when the revolution comes

Pixel count has increased by tepples · 2010-05-30 13:02 · Score: 1

Then the incentive for creating faster machines for the average Joe would not be as great (since Joe does not care if starting the browser takes 50ms or 100ms, he does not notice anyways), so the hardware competition could happen in other qualities: power consumption, reliability, price.

I believe this has already happened; witness the rise of netbooks and nettops. One can finally buy an entry-level gaming PC (an ION nettop such as Acer Aspire Revo) for the price of a game console. But on full-size desktop PCs, nonlinear video editing has gone from LDTV to SDTV to HDTV in the past seven years, multiplying the pixel count by a factor of 27 (corresponding to seven years of Moore's law).
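For anyone who wants to check the factor of 27: it works out if you take LDTV as 320x240 and HDTV as 1920x1080. Those frame sizes are my assumptions for this sketch, as is reading Moore's law as one doubling every 18 months.

```python
# Frame sizes are assumptions for illustration; only the ratio is claimed above.
LDTV = (320, 240)
HDTV = (1920, 1080)


def pixels(size):
    """Number of pixels in a (width, height) frame."""
    width, height = size
    return width * height


def moore_factor(years, doubling_period_years=1.5):
    """Growth factor after `years`, assuming one doubling per period."""
    return 2 ** (years / doubling_period_years)


ratio = pixels(HDTV) / pixels(LDTV)
print(ratio)                      # 27.0
print(round(moore_factor(7), 1))  # 25.4
```

With a 24-month doubling period the seven-year factor is only about 11, so the match depends on which version of Moore's law you pick.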

Re:Pixel count has increased by Anonymous Coward · 2010-05-30 19:03 · Score: 0

Definitely. But in the mind of the consumer, these underpowered devices remain just that: underpowered.
A non-geek friend recently bought a Samsung NC10, for example, and runs the preinstalled (some Windows) OS. Her major complaint: it's slow. Too slow to bother with. Instead she uses her work laptop, which is also slow, on the order of 5 minutes just to boot, but according to her still better than the netbook.
My own non-geek girlfriend uses an Asus Eee PC (a 904, as I recall; equivalent hardware, I think?), on which I've spent quite a few hours tweaking, slimming and optimizing the software. One of our current relationship issues is that she won't leave the damn thing alone.

CPU, heal thyself by orgelspieler · 2010-05-30 16:03 · Score: 1

IEEE Spectrum had a similar article last year. Check out the images for a little better understanding of the tradeoff. It's pretty clever stuff.

Wrong error model by WindShadow · 2010-05-31 14:33 · Score: 1

Actually, what makes Ethernet work is not the error recovery but the error detection. The hard part of making a more robust CPU and memory model is not recovering from errors, but detecting them at all. This adds complexity to the whole stochastic design, and I think at some level the error detection needs to be error-free, so that the success of the error recovery can be evaluated. Start from something like a Hamming code with an extra parity bit, which can correct all one-bit errors and detect all two-bit errors; as you add robustness to the detection through more complex schemes like Fire codes, the resources spent on error detection become greater than those spent on error correction. And balancing that by making the entire system more error-prone is clearly not a solution.
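To make the Hamming example concrete, here is a minimal sketch in Python of a Hamming(7,4) code with one extra overall parity bit (often called SECDED). The function names and the bit layout are my own choices for illustration, not anything from the article.

```python
def encode(nibble):
    """Encode 4 data bits (list of 0/1) into an 8-bit extended Hamming codeword."""
    d1, d2, d3, d4 = nibble
    code = [0] * 8                 # index 0 = overall parity, 1..7 = Hamming(7,4)
    code[3], code[5], code[6], code[7] = d1, d2, d3, d4
    code[1] = code[3] ^ code[5] ^ code[7]   # covers positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]   # covers positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]   # covers positions 4, 5, 6, 7
    code[0] = sum(code[1:]) % 2             # makes the total number of 1s even
    return code


def decode(code):
    """Return (status, data); status is 'ok', 'corrected' or 'double'."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos        # XOR of set positions is 0 for a valid word
    overall = sum(code) % 2        # 1 means an odd number of bits flipped
    if overall == 1:
        code[syndrome] ^= 1        # treat as one flip; syndrome 0 is the parity bit
        status = "corrected"
    elif syndrome != 0:
        return "double", None      # even number of flips, inconsistent word
    else:
        status = "ok"
    return status, [code[3], code[5], code[6], code[7]]


word = encode([1, 0, 1, 1])
one_flip = list(word)
one_flip[5] ^= 1
two_flips = list(one_flip)
two_flips[6] ^= 1
print(decode(one_flip))    # ('corrected', [1, 0, 1, 1])
print(decode(two_flips))   # ('double', None)
```

Note that three flipped bits look exactly like a single error to this decoder, so it will silently "correct" to the wrong data. That is the point about the detection itself having to be trustworthy.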

Interpreted Languages by Anonymous Coward · 2010-06-08 06:38 · Score: 0

If application programmers used interpreted languages such as .NET or Java, then the error detection/recovery could be handled by the framework instead of the individual application, and no code changes or refactoring would be needed anywhere except in the framework itself.

222 comments