AFAIK an intervention is not an event that would have been an accident, but rather a situation in which the control software of the vehicle decides that it cannot solve the current driving situation. Without human intervention such a car is expected to pull over in a safe manner.
Also, self-driving car companies are offering the combination of software + human operators for these interventions. Hence the measure should be the same as for human drivers: miles driven per accident caused.
Honestly, that device is almost exactly what you specified: 17", high DPI display, options for IIRC three 2.5" drives (or swap a 2.5" drive for two M.2 drives), loads of ports... Too bad OSX is so hard to get running on non-Apple hardware.
The USP of AMD's APUs used to be having the GPU and the CPU on the same die. This is true for Jetson as well, but it is compatible with the whole CUDA universe, too. So now NVIDIA is eating AMD's lunch.
Basically, the procurement process for supercomputers is like this: the buyer (e.g. a DOE lab) will ready a portfolio of apps (mostly simulation codes) with a specified target performance. Vendors then bid for how "little" money they'll be able to meet that target performance. And of course the vendors will use the most (cost/power) efficient hardware they can get.
The reason why we're no longer seeing custom built CPUs in the supercomputing arena, but rather COTS chips or just slightly modified versions, is that chip design has become exceedingly expensive and that the supercomputer market is tiny compared to today's mainstream market.
Also, the simulation codes running on these machines generally far outlive most supercomputers. The stereotypical supercomputer simulation code is a Fortran program written 20 years ago, which received constant maintenance in the past years, but no serious rewrite is viable (costs exceed price of hardware). So vendors will look for low-effort ways of tuning these codes for their proposed designs. Sticking with general purpose CPUs is in most cases the most cost efficient way.
So, what you describe is essentially the difference between capacity and capability machines. The national labs have both, as there are use cases for both. But the flagship machines, e.g. Titan at the Oak Ridge Leadership Computing Facility (OLCF), are always capability machines -- built to run full system jobs, i.e. jobs that scale to tens or hundreds of thousands of nodes.
These Peta/Exascale supercomputers are built for computer simulations (climate change, nuclear weapons stewardship, computational drug design, etc.), not for breaking encryption. That's also one reason no one is using them to mine Bitcoins: they're just not efficient at that job. To compute lots of hashes, dedicated hardware designs (read: ASICs) far outpace "general purpose" supercomputers.
...as cars in the US aren't allowed to go that fast. In Germany the ICE (Inter City Express, the German bullet train) needs to be really fast to beat the Autobahn.
Computational drug design is already a big topic in supercomputing, although it's much more focused on interactions of individual molecules. That's currently so complex that it's more efficient to build specialized machines (e.g. http://en.wikipedia.org/wiki/A... ).
From what I read the dongle is merely the interface from the camera (USB) to the smartphone (USB). That should be trivial. (For my setup a USB OTG cable + adapter to mini USB is sufficient, there are tons of apps to control cameras).
The article states that they had to use a beefier microcontroller etc., but I wonder: why not do all the processing on the smartphone? These days our phones have so much processing power AND sensors that there should be no need to do any kind of non-trivial logic outside, especially when you're just trying to launch your first product.
I wish people would stop treating modern C++ as if time had been standing still in the past decades. Yes, C++ is complex, but also expressive. Modern features (e.g. lambdas+auto+templates) often let you write code which is just as concise as its Ruby counterpart, but much more efficient.
Architectural improvements for general purpose CPUs yield diminishing returns: Even more registers? Even better branch prediction? Even larger caches? It'll all yield but a few percent, at least for current Intel designs. So, the way to go is currently more and more cores, but what good is it to have many cores that can't all fire simultaneously?
Of course general purpose CPUs exist, simply because we call them that. But it is also true that each design has its own strengths, and "dark silicon" is another driver for special purpose hardware. Efficiency is another. Andrew Chien has published some interesting research on this subject. In his 10x10 approach he suggests using 10 different types of domain-specific compute units (e.g. for n-body, graphics, tree-walking...), each of which is 10x more efficient than "general purpose CPUs" in its domain (YMMV). Those compute units, bundled together, make up one core of the 10x10 design. Multiple cores can be connected via a NoC.
Let's see how software will cope with this development...
ps: can special purpose hardware exist if general purpose hardware doesn't?
One reason might be that railways are more efficient in densely populated areas. There, express trains can even compete with airplanes. Yesterday we went from Tokyo to Osaka. Flight time would have been ~1h, plus 1h check-in and transfer to/from the airport (~45min. each). The Nozomi Shinkansen took us there in 2:30, and both stations were directly at the center of the cities.
Most of Japan's population is situated in coastal regions, so just a handful of routes can service all major cities. Imagine how many connections you'd need in the US...
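To make the Tokyo-Osaka comparison explicit, here is a small sketch that just adds up the rough figures quoted above. The helper name `door_to_door_minutes` is made up for illustration, and the train's own station transfers are assumed to be negligible because both stations are in the city centers.

```python
def door_to_door_minutes(in_vehicle, checkin=0, transfers=()):
    """Sum the in-vehicle time, check-in time and all transfers (minutes)."""
    return in_vehicle + checkin + sum(transfers)

# Figures from the comment above: ~1h flight, 1h check-in,
# ~45 min to the airport and ~45 min from it.
flight = door_to_door_minutes(60, checkin=60, transfers=(45, 45))

# Nozomi Shinkansen: 2:30, stations assumed to need no extra transfer.
train = door_to_door_minutes(150)

print(f"flight: {flight} min, train: {train} min")
```

With these numbers the plane needs about 3.5 hours door to door, a full hour more than the train.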
It's not that specialized. It's just plenty of DSPs strapped together on a torus.
Actually Anton uses ASICs whose cores are specifically geared towards MD codes. This goes way beyond just "strapping together DSPs". They have IIRC ~70 hardware engineers on site. (Source: I visited DE Shaw Research last year.)
Unlike what wikipedia claims, you could probably achieve comparable performance using a more classical and general-purpose supercomputer setup with GPU or Xeon Phi accelerators, provided the network topology is well tuned to address this sort of communication scheme
No, you can't, and here is why: Anton is built for strong scaling of smallish, long running simulations. If you ran the same simulations on a "x86 + accelerator" system (think ORNL's Titan) then you'd observe two effects:
The GPU itself might idle a lot as each timestep only involves few computations, leaving many shaders idle or waiting for the DRAM.
Anton's network is insanely efficient for this use case. IIRC it's got a mechanism equivalent to Active Messages, so when data arrives, the CPU can immediately forward it to the computation which is waiting for it. That leads to a very low latency compared to a mainstream "InfiniBand + GPU" setup.
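A back-of-the-envelope model of why latency dominates strong scaling of small problems. This is a deliberately crude sketch, not a description of Anton or Titan: it assumes the compute per timestep divides perfectly across nodes and that every timestep pays one fixed communication latency. All numbers (`WORK_US`, the two latencies) are hypothetical.

```python
def step_time_us(work_us, nodes, latency_us):
    """One timestep: perfectly divided compute plus one fixed latency."""
    return work_us / nodes + latency_us

def speedup(work_us, nodes, latency_us):
    """Speedup over a single node that needs no communication."""
    return work_us / step_time_us(work_us, nodes, latency_us)

WORK_US = 1000.0  # hypothetical compute per timestep on one node

# Hypothetical per-step latencies for two kinds of interconnect.
for name, latency_us in (("low-latency network", 1.0),
                         ("commodity network", 10.0)):
    for nodes in (8, 64, 512):
        s = speedup(WORK_US, nodes, latency_us)
        print(f"{name:20s} nodes={nodes:4d} speedup={s:7.1f}")
```

In this model the speedup can never exceed `work_us / latency_us`, no matter how many nodes you add. That is why shaving latency matters more than adding accelerators once each timestep is small.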
Granted, it sounds a tad like an episode from Thunderbirds, but it's real: Project Pluto was a nuclear powered Supersonic Low Altitude Missile (SLAM). The idea was to drive the reactor into critical state and superheat the inflowing air, effectively creating a nuclear powered ramjet. Downside: because the reactor was almost unshielded, all controls had to be designed to withstand extreme radiation and heat (they had to work in white heat conditions). The project was canceled in the 60s, but they actually built and powered up the engines.
Computational drug design and bitcoin miners have in common that both run best on custom hardware. The crux is that they require very different types of hardware. As an example, please refer to Anton, designed by DE Shaw Research specifically for molecular dynamics (MD) codes.
Bitcoin mining is a so-called embarrassingly parallel problem, while MD is a tightly coupled one. Hence MD codes are much harder to parallelize efficiently: communication gets in the way, and communication is ultimately bound by the speed of light.
ps: fun fact: bitcoin mining and MD can be carried out (at least somewhat) efficiently on GPUs.
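The contrast between embarrassingly parallel and tightly coupled work can be sketched in a few lines. This is a toy illustration only: `pow_hash` uses double SHA-256 over a made-up header rather than a real block header, and `relax` is a simple neighbour-averaging step standing in for a coupled simulation, not an actual MD kernel. The "workers" are simulated sequentially.

```python
import hashlib

def pow_hash(header: bytes, nonce: int) -> bytes:
    """Double SHA-256 of header + nonce (stand-in for a mining hash)."""
    data = header + nonce.to_bytes(8, "little")
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def best_nonce(header: bytes, nonces) -> int:
    """Each nonce is evaluated independently of all the others."""
    return min(nonces, key=lambda n: pow_hash(header, n))

def relax(values, left_halo, right_halo):
    """Coupled step: every cell needs both of its neighbours."""
    padded = [left_halo] + list(values) + [right_halo]
    return [(padded[i - 1] + padded[i] + padded[i + 1]) / 3
            for i in range(1, len(padded) - 1)]

header = b"hypothetical block header"

# Embarrassingly parallel: two workers scan disjoint nonce ranges and
# only talk once, at the very end, to pick the overall winner.
a = best_nonce(header, range(0, 1000))
b = best_nonce(header, range(1000, 2000))
merged = min((a, b), key=lambda n: pow_hash(header, n))

# Tightly coupled: two workers each own half of the cells, but before
# EVERY step they must exchange their boundary values.
cells = [float(i * i % 7) for i in range(10)]
left, right = cells[:5], cells[5:]
for _ in range(3):
    halo_from_right, halo_from_left = right[0], left[-1]  # communication
    left, right = (relax(left, left[0], halo_from_right),
                   relax(right, halo_from_left, right[-1]))
```

The hashing workers exchange one result in total, while the stencil workers exchange data once per step. With thousands of nodes and millions of steps, that per-step exchange is where the latency bill comes from.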
OT: "Geldreich" is a German compound of money (Geld) and rich/plentiful (reich). So if he's called Rich Geldreich, that could be written as Rich Rich... Yeah, I know: no one knows Richie Rich today.
Was just released: http://fabiensanglard.net/gebb...
See subject.
Sounds exactly like a GPU to me. :-P
JR will finance this through earnings from their Shinkansen lines.
...that 35k come to Point Lay to get laid?
On my machine, I have clang 3.4.2-r100, gcc 4.9.0, 4.8.3, 4.7.4, 4.6.4 and icc 14.0.3.174 installed. All simultaneously, no hassle.
(most recent supercomputers don't use tori)
Let's take a look at the current Top 500:
So, torus networks are the predominant topology for current supercomputers.
Emacs FTW!
Boost multi_array is pretty powerful and supports all sorts of slices.