> Fortran code ignores the very possibility that pointer content can overlap. Modern compilers do not.
Fortran's language specification doesn't allow pointers to overlap. Inhibiting programmer freedom in this way ironically gives the compiler greater freedom to perform optimizations.
In contrast, C & co. give the programmer this freedom, so the compiler has to be more conservative.
Language semantics are the real difference. To get comparable semantics in a C program and a Fortran program solving the same problem, the C program has to use the "restrict" keyword everywhere (or at least wherever it counts). Fortran disallows aliasing by default; C only does so where restrict is used.
Location aliasing inhibits intermediate code[1] optimization as the optimizer cannot assume that two pointers point to different locations. If the optimizer can safely make that assumption it can do more with the code.
Aside: during my compiler design class, the lecturer spent time going over some optimizations (obviously). Towards the end of the lecture: "oh but if p aliases q, you cannot use that optimization".
[1] Any compiler worth its use translates input files into some form of intermediate representation for optimization purposes.
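To make the aliasing point concrete, here is a minimal C sketch. The function names are made up for illustration, and the "hoisted" version is only my guess at the kind of rewrite an optimizer is permitted to make once it knows the pointers don't overlap, not what any particular compiler emits.

```c
/* Plain pointers: a and b may alias, so the compiler has to reload *b
   after the first store through a. */
void add_twice(int *a, const int *b) {
    *a += *b;
    *a += *b;
}

/* restrict: the caller promises a and b do not overlap, so the compiler
   is free to load *b once. Calling this with overlapping pointers is
   undefined behaviour. */
void add_twice_restrict(int *restrict a, const int *restrict b) {
    *a += *b;
    *a += *b;
}

/* Hand-written version of the rewrite that is only valid without
   aliasing: one load of *b, one store to *a. */
void add_twice_hoisted(int *a, const int *b) {
    int t = *b;
    *a += 2 * t;
}
```

If you call add_twice(&x, &x) with x == 1 you get 4, while the hoisted form gives 3 for the same call. That difference is exactly why the optimizer can't apply the rewrite unless the language (Fortran) or the programmer (restrict) rules out the overlap.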
Afaik, the problem with VLIW processors in general is that they attempt to exploit instruction-level parallelism.
This is an entirely different beast to what's presented in the paper.
Instruction-level parallelism occurs when there are instructions within a (fixed) window of a code stream where there are no dependencies between two or more instructions.
The VLIW paradigm is to have bundles of instructions, where each bundle contains only instructions that can be executed simultaneously. This shifts complexity from the hardware[1] to the compiler.
Unfortunately, ILP can be very difficult to extract from arbitrary code, though cases exist where it's trivial.
[1] Later RISC chips and today's non-mobile CPUs take advantage of ILP through multi-issue, out-of-order execution. Out-of-order execution typically defers execution of any given instruction until all its dependencies have been fulfilled, i.e. memory/cache accesses have occurred, previous results are available, etc. By making these units multi-issue, the CPU dynamically exploits ILP to the extent the available hardware allows, no recompilation required (though it may help).
These hardware techniques are only slowly coming to the mobile arena as they are relatively expensive transistor-wise.
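A small C sketch of what "no dependencies within a window" means in practice. Both functions are hypothetical examples of mine; the second one just splits the work into four independent chains that a VLIW compiler could bundle or a multi-issue core could overlap. Whether it actually runs faster depends on the compiler and the hardware, and a decent optimizer may well do this transformation on its own.

```c
#include <stddef.h>

/* One accumulator: every add depends on the result of the previous add,
   so there is a single serial dependency chain. */
long sum_serial(const int *v, size_t n) {
    long acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += v[i];
    return acc;
}

/* Four accumulators: the four adds in the loop body do not depend on
   each other, so they can be issued in the same cycle or bundle. */
long sum_four_chains(const int *v, size_t n) {
    long a0 = 0, a1 = 0, a2 = 0, a3 = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        a0 += v[i];
        a1 += v[i + 1];
        a2 += v[i + 2];
        a3 += v[i + 3];
    }
    for (; i < n; i++)   /* leftover elements */
        a0 += v[i];
    return a0 + a1 + a2 + a3;
}
```

This is the trivial case. Arbitrary pointer-chasing code has nothing like this to split up, which is where extracting ILP gets hard.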
Actually, they were ordered to return the recovered salvage by a US federal court. The court decided the Spanish government had a sovereign claim over the shipwreck of the Nuestra Señora de las Mercedes.
> Maybe Oracle can actually expand Java. Oracle owns silicon, so why not make a processor that is designed from the ground up for Java bytecode? Perhaps even build it into the SPARC architecture .
ARM tried it with Jazelle in earlier cores, which they've since replaced with ThumbEE and its successor.
JIT compilers (and in ARM's case simpler, more compact instructions) seem to have been more economical than implementing a (partial) second instruction set in a processor, which would also have to be at least as fast as the JIT competition.
Re:This is what's wrong with private healthcare. on How Doctors Die · Score: 2
Mind you, this was a walk-in procedure, not an impacted tooth or anything. And it definitely wasn't subsidized by the Irish government (that's where you get a discount for paying PRSI), a subsidy which appears to have been cut.
Leeching indirectly off insurance companies? That'd be interesting given that the VHI tend to refund the costs of low-priced stuff to you directly afaik.
Re:This is what's wrong with private healthcare. on How Doctors Die · Score: 1
$1000 for a wisdom tooth extraction? I had one extracted in Ireland as a walk-in patient. No insurance mentioned, no PRSI slips shown.
60 Euro. And that included an X-Ray to say, "Yes, that tooth is pretty much irrecoverable".
And I probably could have gotten it performed cheaper outside the capital.
You can't utilize multiple processors with OCaml directly. There is some effort going towards building a multicore version but it's not being undertaken by INRIA.
I'd disagree on the use of OCaml for certain applications as there is no multi-core support. You might be able to hack something in using multiple processes, but it will probably be pretty ugly.
Sure, there are components written in assembly, C and C++. But it's quite possible to write the vast majority of an OS in a variant of C#. Also, have a look at C#'s "unsafe" extension, which permits the normal C hackery.
> It would mean that development cycles slow down, algorithmics finally win over brute force and that software quality would have a chance to improve (after going downhill for a long time).
Um, nope. Companies will simply sell bigger boxes to run their bloated code.
> GPUs as CPUs? Ridiculous! Practically nobody can program them
http://www.nvidia.com/object/cuda_apps_flash_new.html
> and very few problems benefit from them.
Media encoding/transcoding. Scientific code and minimum spanning trees can also be done on a GPU.
If by 'few problems' you mean that it doesn't run Word/Office/Java etc., then yes. Otherwise, if the algorithmics (sic) can be done in a data-parallel fashion, then the problem might be able to be done on a GPU.
Except any decent allocator should only need to cross the kernel threshold when expanding the heap, or when releasing "excess" free memory (for some value of excess) back to the OS.
Sun's Java still needs to perform its own internal memory management for the main heap. The minor heap is bump allocated, and live data is copied from that to the main heap.
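For anyone who hasn't seen bump allocation, a rough C sketch follows. Everything here is hypothetical (the names, the 4096-byte chunk, malloc standing in for the real sbrk/mmap call), and it is nothing like HotSpot's actual code. It only shows that the common path is a pointer increment and that the "OS" is only consulted when a chunk runs out.

```c
#include <stddef.h>
#include <stdlib.h>

#define CHUNK_SIZE 4096   /* assumed chunk size, purely illustrative */

typedef struct {
    char  *base;          /* start of the current chunk */
    size_t used;          /* bytes already handed out from it */
    int    os_requests;   /* how often we had to go back for more memory */
} bump_heap;

/* Returns n bytes (rounded up to 16) or NULL on failure. */
void *bump_alloc(bump_heap *h, size_t n) {
    n = (n + 15) & ~(size_t)15;
    if (n > CHUNK_SIZE)
        return NULL;      /* large objects are out of scope here */
    if (h->base == NULL || h->used + n > CHUNK_SIZE) {
        /* Slow path: malloc stands in for the kernel call. The old
           chunk is simply abandoned to keep the sketch short. */
        h->base = malloc(CHUNK_SIZE);
        if (h->base == NULL)
            return NULL;
        h->used = 0;
        h->os_requests++;
    }
    void *p = h->base + h->used;   /* fast path: bump the pointer */
    h->used += n;
    return p;
}
```

With 16-byte objects you get 256 allocations per trip to the slow path. A real minor heap would copy the survivors out and reuse the chunk rather than leak it.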
Only if your compiler/VM does escape analysis for stack allocation. Afaik, Sun's HotSpot and the JHC Haskell compiler are capable of it; I don't know of any others.
The FSB is grateful for your assistance, citizen!
Fortran doesn't restrict multiple threads accessing the same array (see OpenMP); instead it restricts how pointers may be used. A frequent problem in program optimization is that two or more pointers may address the same location, i.e. the location is aliased.
This wreaks havoc, as the compiler can no longer assume some optimizations are safe.
Otherwise your post is mostly correct.
The Weimar Republic and Zimbabwe suffered massive inflation, not deflation.
Vaccines do not guarantee immunity, but are very, very likely to.
Really?
http://www.irishdentist.ie/news/news_detail.php?id=3969
Thanks and good luck on new ventures.
http://www.microsoft.com/interop/cp/default.mspx
Shame really that someone couldn't even do the research to see if such wild claims about MS are in any way true.
To some degree yes.
http://research.microsoft.com/en-us/um/people/simonpj/papers/list-comp/index.htm
http://en.wikipedia.org/wiki/Singularity_(operating_system)
SSTO is a dumb idea for bell-shaped rocket engines due to their limitations. Aerospikes and linear aerospikes can potentially offer SSTO capability.
http://en.wikipedia.org/wiki/Aerospike_engine
Strange. Mine's "Positive" and I'm swimming in mod points. Even when I don't use them.
Google are currently implementing a JIT for Dalvik. I don't have any experience using it though, as I've been hacking on the garbage collector.
Nitpick, but reference counting isn't the only form of garbage collection out there. Reference counting is actually fairly attractive as you get very incremental collection, rather than your application freezing dead while the entire heap is examined (as Android currently does, along with Go's current collector).
Also, IBM's Recycler, which they are proposing, was designed by David Bacon. Some of his more recent work is on hard real-time collectors for Java.
http://domino.research.ibm.com/comm/research_projects.nsf/pages/metronome.index.html
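To show what "very incremental" means, here is a naive reference counting sketch in C. The node type and function names are invented for the example. It is plain counting with no cycle handling, so it is not the Recycler's algorithm; cyclic garbage would simply leak here.

```c
#include <stdlib.h>

typedef struct node {
    int refcount;
    struct node *child;   /* one owned reference, may be NULL */
} node;

static int live_nodes = 0;   /* bookkeeping for the example only */

/* Create a node holding a reference to child (if any). The caller
   gets the initial reference to the new node. */
node *node_new(node *child) {
    node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->refcount = 1;
    n->child = child;
    if (child != NULL)
        child->refcount++;
    live_nodes++;
    return n;
}

/* Drop one reference. Memory is reclaimed the moment a count reaches
   zero, and only the objects reachable from that one are touched. */
void node_release(node *n) {
    while (n != NULL && --n->refcount == 0) {
        node *child = n->child;
        free(n);
        live_nodes--;
        n = child;        /* the freed node's reference goes away too */
    }
}
```

The work done per release is proportional to what actually dies, not to the size of the heap, which is the attraction compared with a stop-the-world trace.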
Finally, using the MMU on a CPU to assist with garbage collection is generally a disaster. Either your program needs to be able to inspect and modify its own page tables, or you're using the memory fault mechanism to allow the collector to progress. The first is a security + OS nightmare, and the second tends to be very slow.