smallfries · Slashdot Mirror

Re:Cluster = Cloud on Wolfenstein Ray Traced and Anti-Aliased, At 1080p · 2011-09-19 18:42 · Score: 1

I meant the SDK for ray-tracing, rather than the ray-tracing demo in the SDK. I've tried that on a GTX-580 and it seemed to have two different rendering modes, low quality when you move the model for 2-3fps and then a refinement step that took a couple of seconds to get the highest quality.

Re:Cluster = Cloud on Wolfenstein Ray Traced and Anti-Aliased, At 1080p · 2011-09-16 05:35 · Score: 1

Have you used it? The IDF is for interactive frame-rates (haven't checked but last Intel demo I saw was about 20fps). That ray-tracer on the card takes several seconds per frame. They are not really comparable in performance.

Re:Usage predicts lifespan on Sixteen Years Later: GNU Still Needs An Extension Language · 2011-09-02 03:35 · Score: 1

I don't see any trap, although if you've studied more rhetoric than logic it does not surprise me that you believe that you do. Your argument was circular so I simply took it at face value. I've never disputed that I understood what you intended to say, merely that what you intended to say is wrong. Some of the things that seem to confuse you are standard terminology, such as circuit (as opposed to electronic circuitry).

So, if your argument is simply "a circuit is lesser because it is finite", and that's the only difference between it and a program, then really, it's a bad definition, because programs have to have a finite number of instructions in order to operate. "No they don't, they could be infinite", yeah, well so can circuits, and both are equally tractable matters.

Every computation realisable on a circuit is an instance of an FSM, whereas Turing Machines can execute a larger class of computation. It is the very point of whether or not a machine is universal. So although you don't see anything interesting or relevant about the distinction, it does tend to matter in the "compiler business", as input languages that are universal have undecidable analysis problems, while those of FSMs are merely intractable.

Being complete wrt a set of logical operators is only one small part of what is necessary to simulate a Turing Machine. Your claim that it was sufficient is incorrect. If you remove the finite/infinite distinction that seems to trouble you, then the logic for controlling the head does not require a complete operator: simple equality checking is enough to build a functioning controller. So the completeness of the operator is irrelevant to building the finite part of the machine - the connection that you were trying to make is simply wrong. Turing Completeness is only defined for languages; your attempts to claim that the bits other than a logic operator are irrelevant simply show that you do not understand the concepts as well as you think that you do.
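As a rough illustration of the point about the controller, here is a minimal sketch of a Turing Machine simulator in Python. The finite control is nothing more than a lookup on (state, symbol) pairs, i.e. equality checking, and the only unbounded part is the tape. The machine itself (a unary incrementer) and the names `run_tm` and `INCREMENT` are invented for this example.

```python
from collections import defaultdict

# Transition table: (state, symbol) -> (new_symbol, head_move, new_state).
# Finding the right entry only needs equality checks on the pair.
INCREMENT = {
    ("scan", "1"): ("1", +1, "scan"),   # skip over the existing 1s
    ("scan", "_"): ("1", 0, "halt"),    # append one more 1, then stop
}

def run_tm(table, tape_input, state="scan", blank="_", max_steps=10_000):
    # The tape is the unbounded part: a dict that grows on demand.
    tape = defaultdict(lambda: blank, enumerate(tape_input))
    head = 0
    for _ in range(max_steps):          # step limit is a safety net only
        if state == "halt":
            break
        symbol = tape[head]
        new_symbol, move, state = table[(state, symbol)]
        tape[head] = new_symbol
        head += move
    cells = [tape[i] for i in range(min(tape), max(tape) + 1)]
    return "".join(cells).strip(blank)

result = run_tm(INCREMENT, "111")
```

Nothing in the controller uses nand or any other complete operator; what makes this more than a circuit is the storage that can grow without a fixed bound.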

Re:Want. on Details About Raspberry Pi Foundation's $25 PC · 2011-09-02 00:38 · Score: 1

Because they're fun? I did development on a board with a similar spec about ten years ago - back then it was an expensive and unusual board in a research lab. The difference that comes from a "disposable" price-point is amazing. I'm sure there will be a huge number of fun projects with these boards outside of their target educational market. If you want to go for the luxury $100 market then gumstix are quite nice boards to play with.

Re:Usage predicts lifespan on Sixteen Years Later: GNU Still Needs An Extension Language · 2011-09-01 03:49 · Score: 1

You seem slightly confused, or ignorant of some fundamental definitions in your area. That is not meant to be an offensive remark; people can quite happily spend their lives working in compilers with ignorance of some theory beneath their level of interest. Circuits (either in the concrete electronic sense, or in more general theoretical terms) are defined as statically connected collections of operators over finite sets. That is, the wires carry elements of a finite set, and the gates are operators over this set. There is no equivalence between circuits and programs as programs can execute a strictly larger set of computations than circuits.

While operators are in this sense a fundamental definition of gates (I can provide you with some citations if you are interested), your counterargument breaks in several places. You are confusing fundamental, in the sense of deeper, with simpler, as in the most optimal description. As programs form a larger set of computations than circuits it is clear that every circuit can be implemented as a program. Converting an operator into a sequence of steps in a program does nothing to show that the operator is not a fundamental description. As circuits require a static finite set of connections it is far from "trivial" to extend circuits to simulate Turing Machines. So neither your multiplication nor your division example is a counter-argument. While every circuit can be simulated by a program, not every program can be simulated by a circuit.
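The easy direction, every circuit implemented as a program, can be sketched in a few lines. Below, a circuit is a static list of nand gates over the finite set {0, 1}, and a short program evaluates it. The netlist format and the names `eval_circuit` and `XOR_NETLIST` are made up for this example.

```python
def nand(a, b):
    """The gate as an operator over the finite set {0, 1}."""
    return 1 - (a & b)

# Each gate is (output_wire, input_wire_a, input_wire_b); the wiring is fixed.
# This particular netlist is XOR built from four nand gates.
XOR_NETLIST = [
    ("n1", "a", "b"),
    ("n2", "a", "n1"),
    ("n3", "b", "n1"),
    ("out", "n2", "n3"),
]

def eval_circuit(netlist, inputs):
    wires = dict(inputs)
    for out, x, y in netlist:           # gates are listed in topological order
        wires[out] = nand(wires[x], wires[y])
    return wires["out"]

truth_table = {(a, b): eval_circuit(XOR_NETLIST, {"a": a, "b": b})
               for a in (0, 1) for b in (0, 1)}
```

The evaluator always runs exactly `len(netlist)` steps regardless of the input values, which is the "static finite set of connections" in program form. The reverse direction has no such bound to work with.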

It doesn't offend me in the slightest when people make mistakes in my field, and I even take the time to correct them. You seem to think that your point was commonly understood, when sadly it is in fact a commonly misunderstood point, as your replies demonstrate.

Re:Usage predicts lifespan on Sixteen Years Later: GNU Still Needs An Extension Language · 2011-08-31 23:16 · Score: 1

I'm afraid that I wouldn't know much about the compiler business that you are in, I only design them and publish papers on them. The business of them would be far too complex for me. Nice response btw, very unexpected and entertaining for slashdot. But sadly almost everything has a concrete definition if you look deep enough, which is the most general reason for my nitpicking (yes, yes, you clearly know what you are talking about, I wouldn't suggest otherwise). Operators are the fundamental definition of gates, so many methods of composing operators will only produce circuits - in particular if the connection graph is static then what you produce will be within the set of circuits. You need something more than flow-control to define a TC language: you need state. The standard test of TC is the ability to write a TM simulation within the language. This requires sufficiently strong logic operations (such as nand), some form of flow control and some form of storage.

I would argue that just the operator without any of the machinery to operate it has little in common with the final product. But even confusing an operator that is complete over the set of all operators with a language that can simulate any other language is too large a nit for my tastes. Obviously I can accept that things are different in the compiler business...

Re:Usage predicts lifespan on Sixteen Years Later: GNU Still Needs An Extension Language · 2011-08-31 22:05 · Score: 1

It might seem like a petty distinction, but it is quite important: anything built out of nand gates is a circuit, never a program. Thus nand gates cannot describe a language, only a circuit that implements something that processes the language. The difference only becomes visible when you look at large (i.e. Turing Complete) languages, as they are too large (infinite) to be implemented on a finite-size circuit, although we use approximations of them all the time. Only on slashdot would you be nitpicked for conflating the two :)

Re:Usage predicts lifespan on Sixteen Years Later: GNU Still Needs An Extension Language · 2011-08-31 05:30 · Score: 1

Universal sets of gates and complete languages have little in common with one another.

Re:And The Rest Of What Makes Windows Garbage on Estimated Transfer Time Is No More In Windows 8 · 2011-08-24 18:47 · Score: 1

And app bundles are just folders anyway. /Applications/Safari.app is no different than C:\Program Files\Safari except one of the two OSes hides the implementation from the user. I don't like my computer hiding things from me, mmkay.

Nope. They are self-contained applications within a folder - not just a folder. If I install /Applications/Safari.app then I know that everything to do with that application is in that folder, not scattered across lib folders elsewhere on the disk. So uninstallation is just trashing a folder, moving the app to another machine is a simple copy and keeping separate versions of something is trivial. It's the unix philosophy of exposing semantics through the file-system, it's just a shame that none of the linux distros have tried it. Although sadly OSX still needs a full-blown package manager with all of its warts and issues to handle installation of non-OSX software.

Re:Questions from the original article... on Ask Slashdot: What Will IT Look Like In 10 Years? · 2011-08-22 19:22 · Score: 1

Japan didn't really stop when they caught up. They're currently 100 years ahead of the West after their recent set-backs, but once they reactivate their zero-point energy field they should be back to the good old 25th century.

Re:Questions from the original article... on Ask Slashdot: What Will IT Look Like In 10 Years? · 2011-08-20 21:31 · Score: 1

Also, learning Chinese will be essential in an engineering career.

Re:Quite agreeable on ARM Is a Promising Platform But Needs To Learn From the PC · 2011-08-19 19:30 · Score: 1

Sure, sure. Play the old cynic. The one under my desk is running particle simulations interactively using hundreds of GFlop/s. Looking back 15 years would have been the first 3dfx board and maybe a version of Quake. I think they are doing quite a few new things in that time. Improving vector processing performance by four orders of magnitude has resulted in new applications (for the PC, these things would have been available on mainframes previously) ...

Re:Go Pypy! on See the PyPy JIT In Action · 2011-08-16 06:07 · Score: 1

You've missed the point: yes it is easy to do this dynamically at the cost of performance. The comment was in response to the claim that doing it statically is easy. It's not easy to analyse this statically, but when it can be done the performance increase is large. One optimisation that it allows straight away is unboxing which can easily increase the performance of numerical code by an order of magnitude.
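To make "unboxing" concrete: in CPython every float in a list is a separate heap object reached through a pointer, whereas the standard-library `array` module stores raw C doubles contiguously. A small sketch comparing the two layouts follows; sizes vary by platform and Python version, so no exact figures are claimed.

```python
import sys
from array import array

n = 1000
boxed = [float(i) + 0.5 for i in range(n)]   # list of pointers to float objects
unboxed = array("d", boxed)                  # contiguous raw C doubles

# Boxed cost: the list itself plus one heap object per element.
boxed_bytes = sys.getsizeof(boxed) + sum(sys.getsizeof(v) for v in boxed)
# Unboxed cost: one buffer holding the raw values.
unboxed_bytes = sys.getsizeof(unboxed)
```

This only shows the memory-layout side. The speed-up comes when a compiler has proven the type statically and can keep such values in registers instead of allocating and dereferencing objects.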

Re:PyPy solves a very hard problem, but is still s on See the PyPy JIT In Action · 2011-08-15 01:32 · Score: 1

Actually he just completely misread my post. The 60x increase was Python -> C, not from C -> assembly.

Re:PyPy solves a very hard problem, but is still s on See the PyPy JIT In Action · 2011-08-13 20:51 · Score: 1

I noticed that their progress seemed to have stopped, is there any official announcement?

The Google team tried to fix that with Unladen Swallow, but gave up when their JIT system was barely faster than CPython.

Re:PyPy solves a very hard problem, but is still s on See the PyPy JIT In Action · 2011-08-13 20:48 · Score: 1

It matches what I've seen. I've written compilers, static analysers and visualisation code in Python. In most cases there is about a 100x difference in speed between the Python code and C code to implement the same algorithm. Python is still useful for prototyping in as most of that code could be written 5x to 10x more quickly than the C code.

Re:Go Pypy! on See the PyPy JIT In Action · 2011-08-13 20:41 · Score: 1

So you are claiming that duck typing is not used in practice other than in esoteric projects. That is a little bit naive: consider why something so complex would be left in the language design if it was not necessary. The entire language relies on duck typing being there to make some other design decisions easier to make; for example, all containers are polymorphic collections rather than the monomorphic collections in almost every other language. Your fragment of code is a single instance where forward inference would work with name splitting on use (aka SSA form). But it is not powerful enough to handle the whole language. Consider: x = Foo() if (...) else Bar(). What type is x after the statement? In the presence of conditional branches you need some kind of phi nodes to merge the symbols. It's the same problem that SSA suffers from in maintaining the precision of modelling which values exit a computation.

That's one of the easy ones, how about d[key1] = Foo() ... d[key2] = Bar() .... Lots of different object types under different keys. Somewhere later your analyser hits the value d[some expression]. What type comes back out?
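A rough sketch of what the merge does to precision, using sets of class names as the abstract type of a value. The helper names (`join`, `infer_conditional`, `infer_dict_lookup`) are hypothetical and not taken from any real analyser.

```python
def join(*type_sets):
    """Phi-style merge: the result may be any type from any incoming path."""
    merged = frozenset()
    for ts in type_sets:
        merged |= ts
    return merged

def infer_conditional(then_type, else_type):
    # x = Foo() if cond else Bar()   gives   x: {Foo, Bar}
    return join(then_type, else_type)

def infer_dict_lookup(stored_types_by_key, key=None):
    # With a statically known key the answer can stay precise. With an
    # arbitrary expression as the key, every stored value type is possible.
    if key is not None and key in stored_types_by_key:
        return stored_types_by_key[key]
    return join(*stored_types_by_key.values())

FOO, BAR = frozenset({"Foo"}), frozenset({"Bar"})
x_type = infer_conditional(FOO, BAR)
d_types = {"key1": FOO, "key2": BAR}
lookup_type = infer_dict_lookup(d_types)          # d[some expression]
```

Both `x_type` and `lookup_type` come out as the set containing Foo and Bar, so every later use of those values has to be compiled for either type. That is the loss of precision being described.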

Re:Ray Tracing != Ray Casting on Carmack On 'Infinite Detail,' Integrated GPUs, and Future Gaming Tech · 2011-08-12 22:32 · Score: 1

What do you think memory access coalescing is, and why do you think it needs to be on aligned boundaries? Exploiting spatial locality of reference is still a cache, even if it only has a single line. Given the huge impact of non-aligned access within a warp it seems silly to pretend that a GPU is not a cached architecture.
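The cost of a misaligned access can be sketched by counting how many aligned memory segments a warp touches in one load. The figures used here (32 threads, 4-byte words, 128-byte segments) are assumptions chosen for illustration, not a description of any particular GPU.

```python
def segments_touched(base_address, threads=32, word_bytes=4, segment_bytes=128):
    """Count the distinct aligned segments hit by one warp-wide load in which
    thread i reads the word at base_address + i * word_bytes."""
    segments = set()
    for i in range(threads):
        first = base_address + i * word_bytes
        last = first + word_bytes - 1
        segments.add(first // segment_bytes)
        segments.add(last // segment_bytes)
    return len(segments)

aligned = segments_touched(0)      # whole warp falls in one segment
shifted = segments_touched(4)      # same pattern, shifted by one word
```

Under these assumptions the aligned load needs one segment and the shifted load needs two, which is the behaviour you would expect from a cache with a single line.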

Re:Ray Tracing != Ray Casting on Carmack On 'Infinite Detail,' Integrated GPUs, and Future Gaming Tech · 2011-08-12 22:16 · Score: 2

Raytracing is extremely straight-forward and parallel.

If only.... In the real world that would be an exclusive or, so pick one of the two. On small uncomplicated scenes it is straight-forward to make the tracing of each ray happen in parallel. As you add more geometry though it begins to behave differently. You will need a giant cache of rays to handle all of the bounces otherwise all of the recomputation will kill your performance. This was well known in the graphics community prior to GPUs and the same scaling issues were seen on clusters used in rendering. Working out how to implement that cache so that it scales nicely with the number of nodes is a really difficult data-structures problem.

Say that you avoid this problem (for now) by only looking at a low number of bounces per ray, and you favour performance over fidelity by only casting a few (or just one) ray per pixel. You certainly don't want an O(n) intersection test with your geometry, and to hit some kind of O(log n) you will need a spatial subdivision tree of some kind. The problem that you will now face is that every ray traverses the tree in a slightly different order. This means that memory locality amongst your cores is an absolute bitch.
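The traversal-order problem shows up even in a toy. Below is a minimal sketch of front-to-back traversal of a tiny 2-D bounding-box tree, recording the order in which each ray reaches the leaves. The scene, the `Node` class and the function names are all invented for this example, and leaf "hits" here are box hits only, with no real primitive intersection.

```python
import math

class Node:
    def __init__(self, lo, hi, left=None, right=None, name=None):
        self.lo, self.hi = lo, hi            # box corners as (x, y)
        self.left, self.right = left, right
        self.name = name                     # set on leaves only

def entry_distance(node, origin, direction):
    """Slab test: distance at which the ray enters the box, None on a miss."""
    t_near, t_far = 0.0, math.inf
    for axis in range(2):
        o, d = origin[axis], direction[axis]
        if d == 0:
            if not (node.lo[axis] <= o <= node.hi[axis]):
                return None
            continue
        t0 = (node.lo[axis] - o) / d
        t1 = (node.hi[axis] - o) / d
        t_near = max(t_near, min(t0, t1))
        t_far = min(t_far, max(t0, t1))
    return t_near if t_near <= t_far else None

def traverse(node, origin, direction, visited):
    """Nearest child first; appends leaf names in the order they are reached."""
    if entry_distance(node, origin, direction) is None:
        return
    if node.name is not None:
        visited.append(node.name)
        return
    hits = []
    for i, child in enumerate((node.left, node.right)):
        t = entry_distance(child, origin, direction)
        if t is not None:
            hits.append((t, i, child))
    for _, _, child in sorted(hits, key=lambda h: h[:2]):
        traverse(child, origin, direction, visited)

A = Node((0, 0), (1, 1), name="A")
B = Node((2, 0), (3, 1), name="B")
C = Node((4, 0), (5, 1), name="C")
D = Node((6, 0), (7, 1), name="D")
ROOT = Node((0, 0), (7, 1), Node((0, 0), (3, 1), A, B), Node((4, 0), (7, 1), C, D))

def leaf_order(origin, direction):
    visited = []
    traverse(ROOT, origin, direction, visited)
    return visited

left_to_right = leaf_order((-1, 0.5), (1, 0))
right_to_left = leaf_order((8, 0.5), (-1, 0))
upward = leaf_order((2.5, -1), (0, 1))
```

Three rays, three different walks through the same four leaves: one visits them in order, one in reverse, and one prunes most of the tree. Scale that up to millions of rays over a deep tree and neighbouring threads end up fetching unrelated nodes.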

Either replicate *all* of your geometry into huge local memories for each thread (not feasible). Or try and cover the latency with throughput. Of course high-throughput on a GPU depends entirely on memory coalescing (not possible because the access patterns diverge in each thread) and a high arithmetic to access ratio (not really going to happen during tree traversal). Then you have the smaller issues like the SIMD restrictions that mean every thread in a workgroup must execute the same trace, so both sides of branches are run and recursive traversal functions are pig-slow.

There is some really, really nice work out there. But it is hideously inefficient because the problem does not map onto the architecture. The Sparse Efficient Volumetric work from (siggraph 2 years ago?) shoots about 140Mray/s on a GTX-580. It looks pretty gorgeous, but it expands out all of the geometry into 2.7GB on the card to make the approach a little bit closer to brute-force and spend less time on processing acceleration structures. But that card can hit 800Gflop/s so each ray is still costing 6000 arithmetic instructions (or more likely their equivalent as they are memory bound). They are not computing hundreds of bounces per ray on average, that huge cost is because their arithmetic to access ratio is so bad, and that is a direct result of it being so difficult to parallelise the ray processing.
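The 6000 figure is just the ratio of the two throughput numbers quoted above, rounded up. As a quick check:

```python
peak_flops = 800e9         # roughly 800 Gflop/s quoted for the card
rays_per_second = 140e6    # roughly 140 Mray/s quoted for the renderer

# Arithmetic budget the card could have spent in the time one ray takes.
flops_per_ray = peak_flops / rays_per_second
print(round(flops_per_ray))
```

That comes to a little over 5700 floating-point operations' worth of time per ray, which is far more than a handful of bounces through an acceleration structure should need.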

Of course dynamic scenes are much, much harder than these minor problems that you get for static scenes...

Re:Ray Tracing != Ray Casting on Carmack On 'Infinite Detail,' Integrated GPUs, and Future Gaming Tech · 2011-08-12 21:21 · Score: 1

At 13:30 he seems to be talking about path-tracing. That accumulation step that he describes with an arbitrary cut-off to make the frame deadline and the random jitter makes it quite certain. The random jitter in time reusing previous pixel results sounds like a very bizarre and trippy form of "motion" blur - it would look similar but blurring would be proportional to path length casting from that pixel. I might have to hack that up and try it...

Re:At least... on Apple's Unlikely Security Mentor: Microsoft · 2011-08-12 06:58 · Score: 0

It is a good summary of a confused article though.

Final conclusions in the article are that while a mac is more secure than a PC, mac users are at more risk than PC users. Hmmm, fanbois line up on my left, haters on my right, and THREE, TWO, ONE.....

Re:Hz != Power on A Quest For the Perfect SNES Emulator · 2011-08-10 17:52 · Score: 1

That works better. Defining bsnes as a specific benchmark does produce something to measure. One potential problem would be that there may be a non-linear relationship between the processor frequency and the benchmark performance, so your proportionality would be between the frequency and some function of the work. That would depend on the size of the working set inside bsnes (which I know nothing about) and exactly how it accesses memory.
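One way to picture the non-linearity is to split the run time into a part that scales with clock frequency and a memory-stall part that does not. This is a toy model with invented numbers; nothing here is measured from bsnes.

```python
def run_time(freq_hz, compute_cycles, memory_stall_seconds):
    """Toy model: compute time scales with 1/frequency, memory stalls do not."""
    return compute_cycles / freq_hz + memory_stall_seconds

def speedup(f_old, f_new, compute_cycles, memory_stall_seconds):
    """Ratio of old run time to new run time when only the clock changes."""
    old = run_time(f_old, compute_cycles, memory_stall_seconds)
    new = run_time(f_new, compute_cycles, memory_stall_seconds)
    return old / new

# Doubling the clock from 1 GHz to 2 GHz for 1e9 cycles of work.
no_stalls = speedup(1e9, 2e9, 1e9, 0.0)     # working set fits in cache
with_stalls = speedup(1e9, 2e9, 1e9, 1.0)   # one second spent waiting on memory
```

With no stalls the doubling gives the full 2x. With a second of memory waiting it gives only about 1.33x, so the same frequency change buys different amounts of benchmark performance depending on the working set.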

Re:Hz != Power on A Quest For the Perfect SNES Emulator · 2011-08-10 09:45 · Score: 1

So you were disagreeing with me saying:

His objection is almost certainly that work per cycle is not constant.

And the basis of your argument is:

... in a discrete piece of time, but the actual work done is not constant.

Well, your argument certainly wins points for subtlety.

Re:Really? Vigilantes? on The London Riots and Facial Recognition Technology · 2011-08-10 09:43 · Score: 1

Rene Descartes step forward from your anonymity and create a user account.

Re:Hz != Power on A Quest For the Perfect SNES Emulator · 2011-08-10 07:37 · Score: 1

What you have said makes no sense at all. Work in a processor is a discrete quantity, not a continuous level.

Comments · 2,506