I mixed up per-core and per-chip performance in my original post, but assuming that the code can be scheduled to despatch 2 DP ops per cycle per core (reasonable if the code has a simple enough structure to run on a GPU), then you hit about 8 GFLOP/s per core. So yes, the new six-core Nehalems can hit about 50 GFLOP/s. Some benchmarks put it slightly higher at about 55. Nvidia are claiming an 8x increase in DP throughput which puts them at about 616 GFLOP/s for DP. It's still a ten-fold increase in performance.
When you say that Intel can double cores and vector word length in their next family are you confusing Westmere (which is a tick generation) with its successor (which will be a tock)?
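For anyone who wants to check the arithmetic, here is a rough sketch in Python. The 4 GHz clock is my assumption (it is simply what turns 2 ops per cycle into 8 GFLOP/s per core), and the function name is made up for illustration. These are peak numbers, not measurements.

```python
# Back-of-envelope peak double-precision throughput.
# ASSUMPTION: a 4 GHz clock, chosen only because it reproduces the
# "about 8 GFLOP/s per core" figure; real parts may clock lower.

def peak_gflops(flops_per_cycle, clock_ghz, cores=1):
    """Peak GFLOP/s = ops per cycle * clock (GHz) * number of cores."""
    return flops_per_cycle * clock_ghz * cores

per_core = peak_gflops(2, 4.0)            # one core at 2 DP ops per cycle
six_core = peak_gflops(2, 4.0, cores=6)   # whole six-core chip
gpu_claim = 616.0                         # Nvidia's claimed DP figure from the post
ratio = gpu_claim / six_core              # GPU claim relative to the CPU peak

print(f"per core: {per_core} GFLOP/s")
print(f"six cores: {six_core} GFLOP/s")
print(f"GPU / CPU: {ratio:.1f}x")
```

With those assumptions the six-core chip peaks just under 50 GFLOP/s and the claimed GPU figure is a bit over ten times that.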
That's right. After conquering the competitive but profitable mass-market for their products, where they can make a killing in low-margin high-volume sales, they are going after a tiny niche....
It would certainly have nothing to do with competition from Larrabee and the general realisation that as the GPU becomes more general purpose, games will seek to offload more calculations onto it. For graphics it's rare to hit a case that needs double-precision (it happens in HDR), but when you move your physics code on there you don't want nasty rounding errors to destroy the consistency of your virtual world.
What about: it's like having a regiment of 5000 soldiers vs 5 ninjas. If the task can be accomplished by rote then the regiment will win on sheer manpower, but if it requires adaptability then the ninjas will triumph.
Substitute pirates for ninjas for an instant paradox.
Maybe I am deluded about the times when redundant posts were really modded redundant.
Accept certain inalienable truths, redundancy will rise, mods will philander, you too will get old, and when you do you'll fantasize that when you were young redundancy was reasonable, mods were noble and children respected their elders.
What kind of performance were you expecting from multi-cores? I'm not too familiar with your application so I was wondering if the code could hit 1 FLOP/cycle for 3-4 GFLOP/s per core or even 2 FLOP/cycle for 6-8 GFLOP/s per core? If the programming model for DP on these boards supports what you need then you might be looking at 100x the performance on GPGPU. But then again, if lack of support for saturating error conditions was holding you back before when there was a potential 10x increase in performance, is 100x enough to overcome that hurdle? I'm not actually sure how you could provide saturating errors on a vector array anyway, as it implies a different code-path for each core in the vector.
What makes you think that he did? If you are provisioning 100 racks a day it means that your total number of racks is increasing by 100/day. But if you read the subtlety in the GP's post:
Lets give a 12 hour lifespan, and say 25K VMs at the same time.
then we find a subtle clue that these VMs don't live for ever. Admittedly we then have to understand his words to infer that they would only need 100 racks in total, rather than in addition every day. Go on, try it. Reading is fun. It's surprisingly informative when you get the hang of it.
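To spell the steady-state reading out, here is a small sketch in Python. The 25K VMs, the 12 hour lifespan and the 100 racks come from the posts above. The function name and the derived VMs-per-rack figure are my own illustration, and it assumes VMs are replaced as fast as they expire.

```python
# Steady state: VMs expire as fast as they are created, so the
# population (and the rack count) stays flat instead of growing daily.
# ASSUMPTION: constant load, every VM lives exactly lifespan_hours.

def steady_state(concurrent_vms, lifespan_hours, racks):
    """Return (VM launches per day, VMs hosted per rack)."""
    launches_per_day = concurrent_vms * 24 / lifespan_hours
    vms_per_rack = concurrent_vms / racks
    return launches_per_day, vms_per_rack

launches, density = steady_state(concurrent_vms=25_000, lifespan_hours=12, racks=100)
print(f"{launches:.0f} VM launches per day on the same 100 racks")
print(f"{density:.0f} VMs per rack")
```

The launch rate is large, but it is a rate of turnover on a fixed pool of hardware, not a rate of growth.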
Are you kidding me? The phone that I've had for four years is not unusual - it is a bog standard Sony Ericsson model. It's had working MMS since I got it. I wouldn't describe it as hard to use. After taking a picture with the phone one of the menu options that pops up is send via mms. It works to the phones of everyone that I know, regardless of their network: O2, Orange, Vodafone, Three...
Are you sure that MMS not taking off is not more to do with the US having appalling infrastructure for mobile phones? It seems to work well enough everywhere else in the world...
You should take a look at XMOS. Several of the design team worked on the Transputer at Inmos. Aimed at the embedded electronics market it is a micro-controller with multiple cores, and the same threading model and HW links from the original Transputer. They also have a range of simulators and other dev toys for free download.
So you're saying in the world of tomorrow the paradigm will be OQO? Thanks, that was really insightful. I feel that you've really added something of value to this thread. Perhaps going forward you could add your value elsewhere.
Oh I wouldn't be so sure. Most of it was covered here, although there is no fear of a lawsuit. Academic works are great for providing prior art to thwart patents as they are time stamped public disseminations. The new version looks a bit more high tech, and after eight years it just takes a team of two to massively improve on the hundreds of authors on that paper.
Very nicely put. Let me give an answer that contrasts the OP and provides an example of what you are describing. I currently work in a stressful high-pressure job that involves a lot of programming. My working hours swing between 50 hours and 70 hours a week, depending on the density of upcoming deadlines. So what would I do if I could retire with $1 billion?
Exactly what I do now. In theory I could get by putting in 30-40 hour weeks and living a pretty stress free life. I know plenty of people in academia that do just that. The pressure to work those extra hours and produce those other results is an internal one. It helps that I know to get tenure in a few years I will need those publications, but to be honest even without that carrot doing research, solving problems and finding ways to explain novel results is a buzz. If I worked in a different industry it is the kind of thing I would do in my spare time after a day at the office.
I fell into an academic career because it was what I wanted to do rather than what I wanted to be. I grew up programming, playing with computers and designing systems. I worked in industry before college and when I graduated doing a PhD was the logical choice - as I knew it would give me freedom for five years to do interesting things. Something that is very hard to find in a commercial setting. After I gained my doctorate it was too late to turn back. Sure I could earn two or three times my current salary in industry, but then I would spend my days doing what somebody else tells me.
Freedom is addictive, intellectual freedom just as much as physical. After sampling it, it is very hard to give up. Currently I get paid enough that I don't have to worry about money (as I get older starting a family will definitely change that). Watching the current economic crisis unfold, it seems distant and somewhat irrelevant as I have no debts and enough savings to live on for five years while it sorts itself out. In fact I am considering a mini-version of what the submitter asks at the moment: given the complete drought of academic positions I might fund my own research for a few years.
In short I know that given economic freedom I would do exactly what I do now: wake up every day and ask myself what is the most interesting result I can achieve today?
Actually it does. I assume that what you are referring to is running the program for an unbounded length of time and checking to see if it reaches a terminal configuration. The phrase "unbounded length of time" needs to be clarified somewhat. If you mean a finite but arbitrary length of time (any bound n chosen from the infinite set N) then for any choice of n there are still halting and non-halting programs that cannot be classified correctly. If you mean allowing an infinite amount of time for the decision then you've just described a Halting Oracle as it is normally defined within Hypercomputation.
I'm guessing that your intuition comes from the observation that it is "easier" to observe when a program halts because it sticks in a single configuration. However the set of non-halting programs can be split into two: those which loop through a set of configurations periodically, and those which are aperiodic. The periodic ones can be determined just as easily as the halting programs by observing the cycle.
The aperiodic loops are difficult and this is where the undecidability of the problem comes up.
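A minimal sketch of that three-way split, in Python. It assumes a deterministic machine whose whole configuration is a hashable value, and a hypothetical step function that returns the next configuration, or None once the program has halted. The names are mine, not from any library.

```python
# Bounded simulation with cycle detection.
# ASSUMPTIONS: the machine is deterministic, configurations are hashable,
# and step(config) returns None when the program halts.

def classify(step, start, max_steps):
    """Classify a run as 'halts', 'periodic' or 'unknown' within max_steps."""
    seen = set()
    config = start
    for _ in range(max_steps):
        if config in seen:
            # Deterministic + repeated configuration => loops forever.
            return "periodic"
        seen.add(config)
        config = step(config)
        if config is None:
            return "halts"
    # Budget exhausted with no repeat: could be a slow halter or aperiodic.
    return "unknown"

countdown = lambda n: None if n == 0 else n - 1   # halts
wheel = lambda n: (n + 1) % 5                      # cycles through 0..4
counter = lambda n: n + 1                          # never repeats a configuration

print(classify(countdown, 3, 100))
print(classify(wheel, 0, 100))
print(classify(counter, 0, 100))
```

The first two cases are settled by observation. The third comes back "unknown" however large the budget, which is exactly the gap left by any finite bound.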
If you are such an obvious racist then why are you trying to be subtle about it? Anyone who assumes that a choice of religion would make someone a "guest" in their country obviously has a borderline BNP mentality with a thinly veiled desire to do a spot of ethnic cleansing.
Very true. The point of the original reply to mewyn was to point out that there is a refresh speed difference. So browsing does have its limitations compared to books, whereas search is obviously much faster. One nice thing about the Asus design is that it improves browsing. Even if the screens refresh at the same speed they let you see twice as much in one go.
For my own reading habits (novels and technical papers) I can't really see refresh speed being that much of a problem. The only reason that I've held back so far from buying an ebook reader is that I want dual screens to make reading papers more bearable.... I'll have to wait and see reviews but I may buy one of the new Asus models if they do a decent job.
The original question wasn't about reading the pages, it was about how quickly the view can be updated. While I can't read that quickly it is not uncommon to flip through pages looking for one that you recognise. As memory works well with visual layout, often you can spot the page that you want by the shape of graphs, paragraphs, equations etc...
Are you aware that OCR is not perfect? You ask why Google did not do it right as if they had chosen a cheaper / faster option. The software that they used has an error rate of one in a million characters. This is the best that is currently available. The problem is that with the sheer amount of text Google has scanned they expect about a million errors.
You seem quite insistent that they've messed up somewhere. How would you have done it better?
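The scale is easy to check with a few lines of Python. The error rate and the expected error count are the figures quoted above. The total character count is only what those two figures imply, not a number I have seen published, and the function name is made up.

```python
# Expected OCR errors at a fixed per-character error rate.
# ASSUMPTION: the corpus size below is derived from the two quoted
# figures; it is not an official number.

CHARS_PER_ERROR = 1_000_000      # one error per million characters
EXPECTED_ERRORS = 1_000_000      # "about a million errors"

def expected_errors(total_chars, chars_per_error=CHARS_PER_ERROR):
    """Errors expected when one in every chars_per_error characters is wrong."""
    return total_chars // chars_per_error

implied_chars = EXPECTED_ERRORS * CHARS_PER_ERROR
print(f"implied corpus size: {implied_chars:.0e} characters")
print(f"errors at that size: {expected_errors(implied_chars)}")
```

Even a very good error rate leaves a lot of mistakes once the corpus is that big.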
They were not trying to write yet-another-browser that would behave in the same way as every other browser. This was explicitly an experiment in what would happen if a browser was written using a modal interface as in vi. So in fact the browser works in the least surprising way for the target audience.
Having said that - I use vi, and I barely touch a mouse for anything. But web browsing is the one thing that I don't use keys for (as much). The main use of the interface (for me) is to scroll through information and make selections. The mouse does both of those tasks better than a keyboard - mainly because it is a touchpad and so the scrolling is completely analogue, unlike the discrete clicks on a scroll-wheel.
Still, it is an interesting experiment. One way to improve the keyboard as a link selector would be to take vi's context sensitive movement commands (character, word, sentence, paragraph) and graft them onto the structure of the webpage. But that would take a lot of thinking and scripting. I think I'll stick to Firefox for now.
Go and have a look at GPGPU. There's tons of material on there about techniques, some tutorials and a busy forum.
That is an astoundingly bad analogy.
Accept certain inalienable truths, redundancy will rise, mods will
philander, you too will get old, and when you do you'll fantasize
that when you were young redundancy was reasonable, mods were
noble and children respected their elders.
And trust me on the sunscreen.
You may have to write a new false trichotomy page to really let his comment shine.
You are the silver lining to this discussion.
Not always. That's how Termination Analysis works as well.
I can't see you winning many mod points in this debate, so have something rarer, a reply: well said.
Strange. I can flip through a 1000 page book in a couple of seconds. Something doesn't quite add up properly...
You mean throughput on sequential reads. What makes you assume that is the type of throughput they are measuring?
And you will join him for the poor application of relativistic thought.