Lots of heavy number crunching applications don't really care how long they run.
I would say that is true for number crunching, but not for heavy number crunching, with any reasonable definition for heavy. 10 or 20 times faster would generally mean that you can go down from running it on a cluster overnight to running it on a local machine, which tends to be more convenient -- or run it over a coffee break on that cluster.
If you ever cared about multithreading to be efficient on modern systems, you should seriously consider what language you use as well.
It is useful if you otherwise need to navigate through several steps of a touch menu, or alternatively woodpeck over a large table where muscle precision will be needed every time to hit the very right button.
It's kind of sad when anyone states that graphs are a good example of what's not needed when writing games. I highly doubt that OpenAL keeps a core busy. It's not the number of threads that matters, it's whether you are actually saturating the CPU or not. And, well, smart pointers are never needed, but they are frequently nice, basically as soon as you might have otherwise wanted to write the cleanup code in multiple places (and that need includes exception handling, if you ever use that).
For me, Boost has been a very useful way to get shit done, including to get a nicer STL-compatible allocator when the freeing strategy could be quite loose, but allocation overhead was all important. I could have got that done by rewriting the code to exclude STL completely, or by writing my own allocator. Using an already existing library was far faster. I am probably looking to use Boost serialization as the starting point for a similar reason now -- there is a codebase with a data hierarchy, that was never intended for state persistence. Rather than writing rather tedious specific code for each class*, or coming up with my own framework, I can reuse something pre-existing.
* and, yes, it does get tedious if you really do error checking and handle the web of references that is really included here
In my experience, once a defect is "well understood and documented", fixing it is usually easy enough. Or sometimes you can find a workaround that doesn't trigger the problematic behavior. Worst case, remove the feature that relies on the buggy code.
That's not the case if the real root of the bug is as much related to design decisions and external systems as it is a function of your own code. Sure, you might be able to circumvent the behavior or reconcile the requirements, but even after gaining a good understanding, it's possible that there is no easy way out.
I wonder what level of power saving this would mean with somewhat affordable bearings. That's of course dependent on what level of sophistication would be needed for an active pumping solution to maintain the vacuum. Compared to an ultra-centrifuge, we have the big advantage of much smaller volume, and no foreign matter introduced... ever.
The more interesting aspect is that you can tweak those still photos, and then transfer them back. Photoshop some key frames, and you have suddenly created a video with the same manipulation. The video is just a cheap source for spatial data, which you can then texture with your photos.
Unfortunately life is not that simple, at 10gigabit you get 33bits per meter. That means that a 1500byte frame occupies about 360m, even if you could knock the speed down 90% you would still need 36m of whatever. And that's just so that you can get it all out before it starts coming back in again.
Your math is off. If we move a wave from one medium to another, the frequency will be preserved, not the wavelength. The "width" of a signal in a slow medium will be far lower as well.
Factorization is not NP-complete. On the other hand, a polynomial algorithm doesn't have to be low-order. Shor's happens to be n^3 for a quantum computer, but consider if it would be, say, n^12 in number of bits. That's 10^39 for 2048 bits. A single computer in one year might be able to go through 10^17 of those. Oh, only 10^22 computer years.
The only real problem would be finding an algorithm that's on par with the normal multiplication, since cracking would be comparable to the workload for normal authentication. Exponents anywhere above 5 or 6 would make it fully reasonable to start the arms race with far longer keys as a viable solution.
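The back-of-envelope numbers for the hypothetical n^12 case can be checked with a short Python sketch. The 10^17 operations per computer-year figure is the rough estimate used above, not a measurement; the function name and the extra exponents are only illustrative assumptions.

```python
# Rough cost of a hypothetical polynomial-time factoring algorithm.
# OPS_PER_COMPUTER_YEAR is an order-of-magnitude guess, not a benchmark.
OPS_PER_COMPUTER_YEAR = 10**17


def computer_years(key_bits: int, exponent: int) -> float:
    """Computer-years needed if the work is key_bits ** exponent operations."""
    return key_bits**exponent / OPS_PER_COMPUTER_YEAR


for exponent in (3, 6, 12):
    years = computer_years(2048, exponent)
    print(f"n^{exponent}: {years:.1e} computer-years for 2048 bits")
```

For n^12 this lands on the order of 10^22 computer-years, while n^3 is a matter of seconds on the same assumed machine.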
More of it is absorbed than we thought, but we still have those pretty graphs showing the increase that has happened so far. Those are, you know, made from actual measurements. Like, you know, the absorption spectrum of CO2. The effect of CO2 alone is easy, while figuring out the complete set of feedbacks is hard.
Yes, but then we are down to 5 % greenhouse effect. It is also a more interesting prospect to sequester methane from the atmosphere for burning; there is at least one step there which is highly exothermic.
There are things you can do with an MMU that you can't do without one, as long as you do not intend to emulate the whole userspace instruction set. Of course you can port some kind of Linux to an environment with no virtual memory and no memory protection, but it will have all the glory of Win32s.
Re:Anyone with more knowledge explain this to me, on AMD Fusion Details Leaked · Score: 2, Interesting
Can't this code be put in the driver?
Not really, as I see it. The driver should naturally be written to use the faster bus, but the availability of this communication channel could be used for doing some special effect stages on the CPU and then handing the data back (assuming that the effect for some reason cannot be implemented as a shader). Some kind of dynamic off-loading if the GPU turns out to be the bottleneck could be handled in the driver, and that would surely be interesting, but the traditional cores would be a very minor addition to the total performance. It's like having a broadband link, but everyone except for a few academics is just providing dial-up content.
I am familiar with GPGPU and so on, but the pure scientific market is not large enough to warrant the development of these chips, and it certainly doesn't serve as a plausible excuse for buying ATI. Also note that (almost) all stuff done today as GPGPU is high-latency: you send a large chunk of data and read the results back. You just keep feeding a stream to the computation kernels. The thing is also that they are now taking an existing GPU core, which is still tuned for this kind of workload. These days, the PCI-E bus is quite OK compared to the scheduling latency of the core. This can naturally improve in the future, though.
(Personally, I got a lot more excited by today's details on Larrabee, as there are many tasks involving really frequent small vector operations, combined with random memory accesses and branching. Most of it could probably be refactored to something suiting the GPUs of today, but being able to just spread it out on a Larrabee-style chip would be fantastic.)
Re:Anyone with more knowledge explain this to me, on AMD Fusion Details Leaked · Score: 5, Interesting
A higher level of integration makes sense for laptops. Putting the GPU with the CPU also makes a lot more sense when we consider that the CPU these days also means the place closest to the memory controllers.
In addition, you have an interconnect between the two which is far faster than anything else available today. However, there is no code today that will use it explicitly; the whole paradigm of a GPU is that you do not read data back to the CPU.
So, for now, the benefits are really physical size and cost. A CPU-integrated graphics core can be better than one placed on the motherboard when you have an integrated memory controller, but a separate card with dedicated RAM should beat both, as long as you do not expect a new "chatty" paradigm of GPU usage.
You realize that no human has been beyond LEO (more or less) for over 30 years? We DO have the technology to whip up something similar to Deep Impact, or Rosetta, or Deep Space 1, rather quickly. And, frankly, that's all it takes. A reprogrammed ICBM might also make sense, but a manned mission does not.
We have lots of ICBMs to expend if the first one fails, and the biggest concern with that approach (ignoring the "many small, but still huge" chunks one, which is quite important) is still the risk of failure during launch or soon after liftoff. A manned mission won't solve that aspect.
What are you expecting, really? We will know if the ship is in place, we will know if its thrusters are firing. We can trust Newtonian mechanics. In addition, with suitable equipment on the asteroid, it would actually be doable to get micron-level laser distance measurements. (A firing thruster would upset it quite a bit, eh, to say the least, but over a month or so it should be possible to see the trend.)
Well, the GP has kind of a point. If we would find carbon-based life with DNA and the same mapping between triplet codons and amino acids as is found on Earth, the sensible conclusion would be that we still have not seen two instances of life originating, but only a single one that was capable of spreading to another planet. That is still interesting, but the amount of material that would leave a life-inhabited planet with enough velocity to ever get to another star system would be minuscule.
It would still be totally possible that the solar system would be the only inhabited system in the galaxy, or even the observable universe. If we find life on Mars, that is recognizable as such, but still radically different, THEN we are really talking.
Well, as was noted in another thread, available features are not always fast. SSE2 is vastly faster on Core 2 than on Core (or Dothan), for example, while the supported instruction set is almost identical. The difference is large enough that for some real coding, the extra work to mangle data into a SSE2-esque format might be worth it on one architecture, but not on another. The recommended loops for copying large blocks are also different.
(AMD used to list assembler code samples in their optimization guide. These days they give more general advice, but it's still not identical to Intel's, and the differences are not related to specific instructions per se).
We can really question what a benchmark should test. One view is that it should try to gather as much as possible about the chip and do its very best to feed the chip a tailored path. After all, if you are serious enough about performance, that's the kind of work that you would do after choosing a chip based on the benchmark. On the other hand, this means that benchmarks should be updated really frequently.
Maybe with Apple machines. For some reason, I can't stand the trackpad on my MacBook for any prolonged time (maybe it's really the keyboard, as I generally use either external KB + mouse, or internal KB + trackpad). Therefore, it hasn't really been heavily used. My last two portable PCs were used for 4 years each. The trackpad surface never gave me any problems, and it was used heavily on both. (The last one was a Dell Inspiron 8600.)
Any Peltier element can give you power as well. The point is that even the theoretically optimal efficiency is totally lousy if your temperature difference is somewhere like the one between water freezing and water boiling. You need a colder cold sink, or a much hotter heat source, to get some serious efficiency. RTGs tend to be quite hot at the hot end.
This allows better RTGs, but they would only be marginally efficient for, say, reclaiming computer case waste heat. This is especially so as you can't put them on the CPU directly, where the differential is great, because they are insulating as well. You will need to put them at the radiating end, over a large surface.
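To put a number on how lousy the theoretical optimum is, here is a small Python sketch. It assumes the relevant bound is the Carnot limit, 1 - T_cold/T_hot with temperatures in kelvin; the function name and the case-exhaust temperatures are made up for illustration.

```python
def carnot_efficiency(t_hot_c: float, t_cold_c: float) -> float:
    """Upper bound on heat-to-work efficiency; inputs in degrees Celsius."""
    t_hot_k = t_hot_c + 273.15
    t_cold_k = t_cold_c + 273.15
    return 1.0 - t_cold_k / t_hot_k


# Water boiling against water freezing, as in the comparison above.
print(f"100 C vs 0 C: {carnot_efficiency(100, 0):.1%}")

# Hypothetical case-exhaust temperatures, assumed only for illustration.
print(f"60 C vs 25 C: {carnot_efficiency(60, 25):.1%}")
```

These are upper bounds only; real thermoelectric elements reach a fraction of them.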
PLEASE, free space does not matter. Wear levelling will and should move stuff around. The controller will frequently not even know what file system structures represent "empty" space. In practice, they implement another layer of mapping between logical and physical sectors. This is in fact quite consistent with the use of flash anyway, since the process of erasing is time-consuming, it's highly beneficial if a suitable set of blocks have been prepared beforehand. A new write will go to such a block. The previous physical block containing the data for the same logical block is subsequently tagged as free (not in the file system sense, because in the file system sense, this block does not exist at all!), erased at some point and placed in the queue for reuse. In such a scheme, wear levelling only means that the number of uses is recorded as well and that the queue is more or less a sorted priority queue with the least frequently used blocks being reused first.
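The scheme described above can be sketched as a toy model in Python. This is only an illustration under simplifying assumptions, not how any particular controller works: the class and method names are invented, erasing happens immediately instead of in the background, and there are assumed to be more physical blocks than logical blocks in use.

```python
import heapq


class WearLevelledFlash:
    """Toy model: logical-to-physical map plus a min-heap of erased blocks."""

    def __init__(self, physical_blocks: int):
        self.erase_count = [0] * physical_blocks
        self.data = [None] * physical_blocks
        self.mapping = {}  # logical block -> physical block
        # Pre-erased blocks, ordered by how often each has been erased.
        self.free = [(0, p) for p in range(physical_blocks)]
        heapq.heapify(self.free)

    def write(self, logical: int, payload: bytes) -> None:
        # A new write always goes to an already-erased block.
        _, target = heapq.heappop(self.free)
        self.data[target] = payload
        old = self.mapping.get(logical)
        self.mapping[logical] = target
        if old is not None:
            # The previous physical block is now free in the controller's sense.
            self._erase_and_requeue(old)

    def read(self, logical: int) -> bytes:
        return self.data[self.mapping[logical]]

    def _erase_and_requeue(self, physical: int) -> None:
        self.data[physical] = None
        self.erase_count[physical] += 1
        heapq.heappush(self.free, (self.erase_count[physical], physical))


flash = WearLevelledFlash(physical_blocks=4)
for i in range(9):
    flash.write(0, f"v{i}".encode())  # rewrite the same logical block
print(flash.read(0), flash.erase_count)
```

Even though only one logical block is ever written, the erases end up spread across all four physical blocks, and the file system's notion of free space never enters into it.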
A gallon of gas contains approx. 1.3 x 10^8 joules of energy, and there are 3.6 x 10^6 joules in a kilowatt hour. At $0.10 per kilowatt hour, that is equivalent to $3.61 worth of electricity to replace a gallon of gas. Which isn't a whole lot cheaper than current gas prices.
Of course, this leaves out the difference in conversion efficiency of gas vs. electricity.
Yep, and that is a difference of at the very least a factor of 2. Naturally, regenerative braking and other nice aspects of hybrids that would be quite unfeasible in a gas car are also still there.
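The arithmetic in this exchange can be written out as a short Python sketch. The energy content and electricity price are the approximate figures quoted above; the function name and the `efficiency_advantage` parameter are illustrative assumptions, with 2 standing in for the "at the very least a factor of 2".

```python
JOULES_PER_GALLON = 1.3e8  # approximate energy in a gallon of gas, as quoted
JOULES_PER_KWH = 3.6e6
PRICE_PER_KWH = 0.10       # dollars per kWh, as quoted


def electricity_cost_per_gallon(efficiency_advantage: float = 1.0) -> float:
    """Dollars of electricity doing the same useful work as one gallon of gas."""
    kwh = JOULES_PER_GALLON / JOULES_PER_KWH / efficiency_advantage
    return kwh * PRICE_PER_KWH


print(f"raw energy equivalent:    ${electricity_cost_per_gallon():.2f}")
print(f"with a factor-2 advantage: ${electricity_cost_per_gallon(2.0):.2f}")
```

With the factor of 2 included, the quoted $3.61 drops to roughly $1.81 per gallon-equivalent.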
Who said it was a he?
TFA? I leave it as an exercise to the reader to determine whether it was likely that the GP read it, though.
Not the USS Enterprise, but the OV-101 Enterprise.
You make electricity directly from heat. You can't make electricity directly from temperature (or stored heat) though.