IANAL (I'm an entrepreneur who has been a subcontractor and has subcontracted in the software business), but I know that you can pretty much set the terms you want in a contract. In particular, you can require that an incident you file with a subcontractor be investigated by said subcontractor within a given time frame.
In truth, it is up to the buyer and the seller to negotiate the terms of the contract; the lawyers are just there to advise both parties. The seller can agree to provide minimal service for a minimal price, but the buyer can ask for better service, which he will probably have to pay extra for.
How is Floating Point Operations Per Second not a real unit?
Because 1) not all operations are equal and 2) the clock rate isn't constant. It's not very useful information, unless what you really want to count is the amount of data processed per second, which is better expressed with a simple B/s bandwidth unit.
And, I would argue that cache is extremely important when considering vectorization, especially when considering loop nests. I might get much more impressive vectorization if I execute a loop nest in a particular order. But, if I get better cache locality by interchanging two loops, I may see much better performance in the second case. Matrix multiply is a poster child for this.
So if you're looking at the output of the compiler's optimizer and saying "compiler A is better than compiler B at vectorizing" looking only at the instruction sequence, and ignoring the actual memory access pattern and the effects of the cache, you might draw the wrong conclusion.
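To make the loop-nest point concrete, here is a minimal sketch in C of the same matrix multiply written in two loop orders. The function names and the row-major `n x n` layout are my own assumptions for illustration, not anything taken from a particular compiler or library. In the first version the inner loop reads `b` with a stride of `n`; in the second, interchanging the two inner loops makes every inner-loop access unit-stride.

```c
#include <stddef.h>

/* Row-major n x n matrices; names and layout are illustrative assumptions. */

/* i-j-k order: the inner loop walks b down a column (stride n). */
void matmul_ijk(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (size_t k = 0; k < n; ++k)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
}

/* i-k-j order: the inner loop walks b and c along a row (stride 1). */
void matmul_ikj(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n * n; ++i)
        c[i] = 0.0;
    for (size_t i = 0; i < n; ++i)
        for (size_t k = 0; k < n; ++k) {
            const double aik = a[i * n + k];
            for (size_t j = 0; j < n; ++j)
                c[i * n + j] += aik * b[k * n + j];
        }
}
```

Both functions compute the same product. Which one runs faster depends on the matrix size, the cache hierarchy and what the compiler makes of each inner loop, so measure rather than judge from the instruction sequence alone.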
Cache is of course of utmost importance for performance, but you fail to see that it's a different problem entirely. Vectorization happens at a different level.
I don't know what world you live in, but no optimizer of any C compiler changes the memory access pattern of your code. If your code is bad from that point of view, there is nothing the compiler can do; only the developer can optimize that. What the compiler can do is schedule your instructions better so as to fill the pipeline as much as possible while trying to minimize register usage, taking into account the fact that scalar and SIMD code use different registers and pipelines (though if you want that done well, you're better off doing it by hand while reading the processor specs).
AFAIK the only negative effect compiler optimizations can have on the cache is code bloat from unrolling or inlining too much. That's unlikely to prevent any vectorization, though. Conservative inlining or unrolling policies in the middle-end can, however, prevent vectorization, because they don't take into account the gains associated with switching to the SIMD ISA.
Depending on the support contract you negotiated with them, they shouldn't need to admit it. I recommend you tell management to improve their legal department.
Here is an example in GCC: the optimizer assumes the SSE minps instruction is commutative. It isn't. As a result, you can get unexpected results, depending on the optimizer's mood, when you call this instruction with a NaN and a non-NaN value.
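A rough way to see why operand order matters: per Intel's documentation, each lane of `minps` returns the second operand whenever the first is not strictly less than the second, which includes the case where either input is NaN. The sketch below is a scalar C model of a single lane under that reading. `minps_lane` and `minps_order_matters` are hypothetical helper names, and this is not the code GCC generates.

```c
#include <math.h>

/* Scalar model of one lane of SSE MINPS: the result is the first operand
   only if it compares strictly less than the second; otherwise (including
   when either operand is NaN) the result is the second operand. */
float minps_lane(float first, float second)
{
    return first < second ? first : second;
}

/* Returns 1 when swapping the operands changes the result, 0 otherwise. */
int minps_order_matters(float x, float y)
{
    float xy = minps_lane(x, y);
    float yx = minps_lane(y, x);
    /* NaN never compares equal to anything, so test for it explicitly. */
    if (isnan(xy) || isnan(yx))
        return !isnan(xy) != !isnan(yx);
    return xy != yx;
}
```

With a NaN in one operand, `minps_lane(NAN, 1.0f)` gives 1.0 while `minps_lane(1.0f, NAN)` gives NaN, so an optimizer that swaps the operands changes the result. The real intrinsic is `_mm_min_ps` from `<xmmintrin.h>`.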
Actually, no. Computers are not getting faster. Microprocessors stopped getting faster a few years ago; now we just get more of them. Supercomputers have mostly reached the limits of scalability, so there is a limit to that too.
Why are you counting in FLOPS in the first place? Use a real unit. Cache is independent of vectorization. While both affect the performance of the code, when evaluating vectorization on its own, the only thing considered is how many cycles the computation would take if all the data were in L1 cache.
I wouldn't say extremely rare. It depends entirely on what you are doing. Some parts of the compiler are more stable than others.
Advanced C++ and gcc-specific extensions are two things that can break from time to time. Combine the two, and running into bugs isn't so rare.
Most likely, you were just invoking undefined behaviour. GCC 4.8 has new optimizations tied to signed integer overflow, for example. a+b is the same in hardware regardless of whether the inputs are signed or not (assuming two's complement hardware), but to the compiler, that's not the case.
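As a hedged illustration of the signed-overflow point, here is a minimal C sketch. The function names are hypothetical. The first function is the kind of check that looks fine if you think in terms of the hardware but relies on undefined behaviour, so a compiler is allowed to assume it never overflows; the other two stay within defined behaviour.

```c
#include <limits.h>

/* Looks like an overflow check, but a + 1 overflowing is undefined
   behaviour for signed int, so a compiler may fold this to "return 0". */
int next_wraps_naive(int a)
{
    return a + 1 < a;
}

/* Well-defined: compares against the limits before adding anything. */
int add_overflows(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;
    return a < INT_MIN - b;
}

/* Well-defined: unsigned arithmetic wraps modulo UINT_MAX + 1. */
unsigned add_wrapping(unsigned a, unsigned b)
{
    return a + b;
}
```

If the code really depends on wrapping, do the arithmetic on unsigned types or test the limits first, as above, rather than counting on what the add instruction happens to do.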
Explicit vectorization is indeed much more reliable than automatic vectorization, and it will always deliver better performance.
Interestingly, there seem to be quite a few abstraction layer libraries for SIMD. There are also at least Boost.SIMD (part of NT2 [1]) and Vc [2]. Several array-handling libraries (NT2 [1], Eigen [3]) also leverage SIMD explicitly. Alternatively, there are plenty of languages based on C with explicit SIMD programming, like the one implemented by the Intel SPMD Program Compiler [4].
If you're interested in SIMD, there is also apparently a workshop being held soon on this subject in Orlando [5].
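For readers who have not seen explicit vectorization, here is a minimal sketch in C using the SSE intrinsics from `<xmmintrin.h>` directly, without any of the abstraction libraries above. `add_f32` is a hypothetical helper name, and the `__SSE__` guard assumes a GCC- or Clang-style compiler; on other targets the function falls back to the plain scalar loop.

```c
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* dst[i] = x[i] + y[i] for i in [0, n); add_f32 is a hypothetical name. */
void add_f32(float *dst, const float *x, const float *y, size_t n)
{
    size_t i = 0;
#if defined(__SSE__)
    /* Explicit SIMD: four floats per iteration, unaligned loads and stores. */
    for (; i + 4 <= n; i += 4) {
        __m128 vx = _mm_loadu_ps(x + i);
        __m128 vy = _mm_loadu_ps(y + i);
        _mm_storeu_ps(dst + i, _mm_add_ps(vx, vy));
    }
#endif
    /* Scalar tail, and the whole loop on targets without SSE. */
    for (; i < n; ++i)
        dst[i] = x[i] + y[i];
}
```

The abstraction libraries mentioned above mostly exist to hide the `#if` and the instruction-set-specific type and function names behind a portable interface.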
I run into codegen errors with various compilers fairly often (surprisingly, not so much with clang; they must have a much better software architecture). They're not that hard to find, and once found, it's not that hard to work out why they happen either. What is hard, however, is getting the developers to fix them and include the fix in the next release.
It's a depiction. It doesn't mean it is real. (In truth, it probably is real to some extent, in the sense that the women are truly being subjected to abuse even if they did agree to it for the purpose of filming.)
As a test, I took a look at each of your mainstream porn sites. Each of them has at least one video depicting elements of rape on its first page. When Cameron has his way, it will be a criminal act to visit any of those sites.
A manual optimization would easily yield a 2 times improvement on that.
What's so expensive?
Doesn't a real laptop cost at least $1,500? This is pretty cheap.
Because the slowest part of the computer is memory, and vector notation leads to more cache misses.
Is that some sort of joke? Surely you can tell this is not the optimal assembly code at all.
That's why you don't buy software without support.
With support, they'd be contractually obliged to debug it.
[1] https://github.com/MetaScale/nt2
[2] http://code.compeng.uni-frankfurt.de/projects/vc/
[3] http://eigen.tuxfamily.org/index.php?title=Main_Page
[4] http://ispc.github.io/
[5] https://sites.google.com/site/wpmvp2014/
In the situation in question, they fully control the load of the computer.
I'm not sure. I can barely see anything.
a camera where a human can tell what it is seeing
It cannot carry a real camera...
No, it will not. We already have all kinds of drones that are actually functional. This device cannot carry any payload, hence it is useless.
Assistant professor demonstrates useless device.
Where is the news here?
Yet it actually has good exclusive games, unlike the other consoles.
Video games are a serious entertainment medium, certainly more interesting than some others like movies. They're not for kids.