Actually, it's even better: Wall Street uses tons and tons of the relatively obscure language K (and its related database product KDB or Q). This language looks like so:
Example XML parser in K (notice that the code runs up until the lone "\"; the rest is just documentation).
The code above, and a lot of code like it, is actually used in production work at Wall Street. K actually has a lot of great merits, being a very efficient APL descendant, but it is still kind of fun that Wall Street does use a language far less readable than Perl as a core computation language for a lot of these kinds of models:)
Battery life estimates are partly driven by statistics collected during previous battery drains though, so depending on the nature of the changes you make the results may not be reflected in those reports until quite a while later.
Both OOo and KWord insert their own tags in the ODF files they write. Notably of the form '' and such. Faulting the standard for oddities in high-profile implementations is not really useful; there are almost invariably extensions to any good standard, and they tend to remain useful anyway.
My reading of all the Intel graphics commentary from game industry people is not so much that the chips don't work great on pretty much all operating systems, but rather just that they aren't in line with the developers' expectations. Which is really the developers' problem; they can't go around demanding that people buy high-end GPUs just to make their lives simpler. I'm with you on the kudos to Intel for some solid, if not high-end, products there.
The ISO OOXML debacle has merit as a complaint against Microsoft, sure. What I don't think has merit is conflating that situation with some kind of wider war and using that as an argument for rather unrelated issues. I have comments on the level of discourse that often happens on that issue too, though. One should take care in how one argues even there, especially since there are plenty of real heavy-hitting arguments; it is just silly how so many opinion pieces get tangled up in the wrong things. For example, a lot of the time in the heat of the moment ODF fans make it sound like Microsoft shouldn't be allowed to get OOXML standardized by the ISO no matter what (sometimes by arguing that "we already have" ODF). That kind of argument is just dangerous thinking. Before standardization OOXML needs to have a solid specification and needs to be treated according to correct procedures, but with that given it is of course only right and appropriate to let Microsoft go through with the standardization if they so please. One can feel what one wants about how beneficial such a standard would be to the world at large, but of course no entity, neither OASIS nor the OSS community nor Microsoft, has any business blocking standards procedure because they prefer some other standard, especially since the scopes are a fair bit different.
Also, in much the same way, there is far too much sloppy arguing based on technical merit going on, like the debacle about the trigonometric functions not being specified to be based on radians (which is after all the normal case, and was added to the specification after it was pointed out), which is still so very often slipped in under the guise of a real argument against the standard itself. Or the tags for previous version formatting for that matter, a type of tag which the ODF standard lacks but which is still just output in a non-standardised way by the major ODF applications (this is an arguable point, but it seems like a very weak argument in the real world when the flagship ODF implementation steps outside the standard to output similar tags anyway). Arguments of that kind just confuse the main point: the standard needs to take its time and be put through the correct procedure.
Well, this is drifting off topic quite a bit. I just wanted to point out that one really needs to take care in how one argues one's case. There are far too many people falling into a habit of sloppy arguments that would make a creationist blush, and into arguing under the assumption of a full-blown Microsoft conspiracy in all situations. I don't believe such a conspiracy exists, but even if it does it is still detrimental to one's case to use it as an argument with people who don't believe in it.
Take a few steps back from this and consider when you, Bruce Perens, became an anti-Microsoft troll rather than an OSS evangelist. Really. Because this post just makes me sad, a big name in the community having descended to the worst kind of Slashdot anti-Microsoft knee-jerks.
I would at this point like to point out that Alex St. John was fired from Microsoft in 1997, and no other Microsoft employees are involved in this article at all. But even if he still worked at Microsoft your post would still just be random accusations thrown together from nothing other than a deluded idea that there is some kind of war on and that everything that happens in some way has to allude to it.
Really, think about that post, because it is just sad when the self-appointed leaders of the OSS community start to look like nothing more than old Usenet trolls raving on about a company conspiracy, never noticing that the truths they tell are ignored because the other half of their arguments are just delusions and conspiracy theories made up on the spot.
Hehe, but if you take n to be the size of the problem then the originally suggested algorithm collapses into being "look at all nodes in a path and reorder them in some way". I already let slide the fact that once we have more than a pair we need a way to pick how to reorder the nodes. First and foremost though, I think that if you intend to be on the side that suggests ways to approximate and/or solve a problem that has been proven NP-hard to both solve and approximate, you are the ones that should stop digging. Being on the critiquing end seems like a far more relaxing idea ;)
True enough, but if we let the algorithm look at any subsequence of at most n elements we can still trip it up in exactly the same way as long as we know what n is. It is in fact a simple generalization of the previous example; let's just rearrange it as:
A C E G
| | | |
B D F H
where we say that A is a cluster containing points 0 and 1 close together, B is the cluster of points 2 and 3 close together, and so on. This is still the same problem as before, but now we can generalize the key point in it.
If the algorithm can look at larger subsequences than a pair at once, it will be able to completely cover a cluster plus some other point at once, and rearrange the cluster in relation to the other points. What we can do to defeat the algorithm when it can look at subsequences of length n, however, is to simply increase the number of points in each cluster to n. So A contains points 0,...,(n-1), B contains n,...,(2*n-1), and so on. Then this new algorithm falls for exactly the same trap, since it can't see a whole cluster plus an outside point at once, and thus all rearrangements it performs will either be useless (rearranging inside one single cluster) or detrimental (rearranging a part of the points in two clusters, causing the path to go back and forth).
Interestingly this is not as artificial an idea as it sounds, since most natural graphs will tend to cluster. For example, in travels we will tend to have bunches of points in and around densely populated and differently zoned areas (but, as I said before, real world travels actually are fairly easily approximable).
Interestingly this is not really the case in the general problem. Notably the travelling salesman problem is not in APX unless the Hamiltonian path problem is in P. That is, unless P=NP, there exists no algorithm which can give an approximation of the solution to the travelling salesman problem which is guaranteed to be within a constant factor c of the correct answer.
However, in graphs upholding the triangle inequality (that is, if the distance A->C is d then the distance A->B plus B->C is at least d) there are algorithms that do give guaranteed good approximations. So in the real world you are correct.
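For the curious, here is a minimal sketch of one well-known algorithm of that kind, the "walk around a minimum spanning tree" heuristic, which stays within a factor of 2 of the optimal closed tour whenever the triangle inequality holds. This is my own illustration and not necessarily the algorithm anyone above had in mind; the names `mst_tour` and `tour_length` and the use of Euclidean points are assumptions made for the example.

```python
import math


def mst_tour(points):
    """Closed tour from a preorder walk of a minimum spanning tree.

    Assumes Euclidean points, so the triangle inequality holds and the
    resulting tour is at most twice as long as the optimal one.
    """
    n = len(points)
    if n == 0:
        return []

    # Prim's algorithm on the complete graph, O(n^2).
    in_tree = [False] * n
    best = [math.inf] * n          # cheapest known edge into the tree
    parent = [-1] * n
    children = [[] for _ in range(n)]
    best[0] = 0.0
    for _ in range(n):
        u = min((v for v in range(n) if not in_tree[v]), key=best.__getitem__)
        in_tree[u] = True
        if parent[u] >= 0:
            children[parent[u]].append(u)
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u

    # Preorder walk of the tree; visiting each node only once is the
    # "shortcut" step that relies on the triangle inequality.
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour


def tour_length(points, tour):
    """Length of the closed tour, including the edge back to the start."""
    return sum(math.dist(points[a], points[b])
               for a, b in zip(tour, tour[1:] + tour[:1]))
```

Calling `mst_tour(points)` returns the visiting order as a list of indices into `points`. The factor 2 is a worst-case bound; on the kind of clustered real-world inputs discussed here it will usually do a fair bit better.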
As an example of how approximations tend to fail, we can look at a case where the greedy inversion of pairs you suggest breaks down (the lameness filter screws up my fine ASCII art; the numbers signify nodes and the initial path runs in number order):
0 3 4 7 8
| | | | |
1 2 5 6 9
Now, this is not a good path through all the points on this map, but flipping the order of any pair will only make the path worse. The correct solution would be much shorter, just going straight through the top and bottom rows. But that path cannot be reached by greedily inverting the order of pairs.
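If you want to check this rather than take my word for it, here is a small sketch. It makes a few assumptions that are mine and not part of the example as drawn: the columns are 1 unit apart, the two rows are `H = 10` units apart, distances are Euclidean, the path is open (no return to the start), and "flipping a pair" means swapping two neighbouring nodes in the path.

```python
import math

H = 10.0  # assumed vertical gap between the rows; the horizontal gap is 1

# Top row holds nodes 0 3 4 7 8, bottom row holds nodes 1 2 5 6 9.
TOP = [0, 3, 4, 7, 8]
BOTTOM = [1, 2, 5, 6, 9]
COORDS = {}
for col, node in enumerate(TOP):
    COORDS[node] = (float(col), H)
for col, node in enumerate(BOTTOM):
    COORDS[node] = (float(col), 0.0)


def path_length(path):
    """Total Euclidean length of an open path visiting nodes in order."""
    return sum(math.dist(COORDS[a], COORDS[b]) for a, b in zip(path, path[1:]))


def adjacent_swaps(path):
    """Yield every path obtained by flipping one pair of neighbours."""
    for i in range(len(path) - 1):
        swapped = list(path)
        swapped[i], swapped[i + 1] = swapped[i + 1], swapped[i]
        yield swapped


zigzag = list(range(10))      # the initial path: 0, 1, 2, ..., 9
rows = TOP + BOTTOM[::-1]     # along the top row, then back along the bottom

zigzag_len = path_length(zigzag)
rows_len = path_length(rows)
# True when no single neighbour flip shortens the zigzag path.
stuck = all(path_length(p) > zigzag_len for p in adjacent_swaps(zigzag))

print(f"zigzag: {zigzag_len:.1f}  rows: {rows_len:.1f}  stuck: {stuck}")
```

With these assumptions the zigzag pays for five long vertical hops while the row-by-row path pays for only one, yet every single flip replaces a straight hop with a diagonal and so makes things worse.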
Yes, we should do everything we can to let people take their own direction. Except give them the option of running Microsoft software. Because we certainly can't have people going around making choices we don't like now can we?
There's a lot more to be excited about when it comes to DirectX than when it comes to Windows though. DirectX really is, in a sense, at the forefront of an exciting field of technology. In fact, if the Windows monopoly falters it would be nice to see Microsoft reinvent its primary business as a DirectX platform vendor, considering how they have already extended it to consoles (the 360 SDK and DirectX 10 are fairly closely related, and Microsoft appears to be working hard on unifying the technology for Windows and console gaming) and that it is for the most part platform-agnostic enough to be ported to any number of other OSes and devices. Which would, most importantly, be a good thing, since DirectX really is a very nice platform, probably the best one Microsoft has ever designed (and this counts .NET, which has made a fair splash even in the OSS community).
I guess I may be called a DirectX nut for that, despite the fact that I spend most of my time coding on SDL+OpenGL, but hey, a risk I am willing to take :)
Not only does Windows 2000 have DirectX 9.0c (the latest and greatest before DirectX 10), but even Windows 98 does. In fact, up to and including DirectX 8.0a Windows 95 was still dragged along. So this being some standard Microsoft tactic, having been used to try to sell XP, is just plain false.
The suggestion is most likely that AMD is doing a random payout simply to keep Transmeta's lawsuit against Intel going. Just like it has been argued that Microsoft did for SCO to support the lawsuit against various Linux companies (though I am personally not all that convinced about this theory either).
I am not sure what caused your specific problems, but IE for Solaris was not all that bad really. Sure, it didn't fit in all that well (it shipped with a sizable part of the WIN32 API, including the widgets), but then, what applications actually do on a UNIX desktop even today?
It did work pretty well though, and was in my opinion a superior alternative to the horrors of the really early Mozilla project.
I think you've just ignored everything the poster said - that the common case for desktop use doesn't need to share data and that shared cache has costs that are effectively the result of each core stepping on the other core's memory accesses - false cache-line sharing, bus contention and stuff he didn't mention like the fact that the bigger a single cache, the slower it is to access.
I didn't read the poster that way at all, and if that is what the poster meant to say I simply disagree. I read his post as an argument that actual inter-core communication (as in, one core really reading another core's writes to continue computation) is not a great bottleneck, but saying that data (pristine from memory) is seldom shared between cores is really way off. Running several completely independent applications does happen of course, but as the number of cores increases one really has to expect that either an application with several threads or possibly several instances of the same application is involved.
Also, of course the claim "shared cache is strictly a good thing" fails if the question is "do you want shared cache or 10X faster cache". That is not the question in practice, however: the Conroe L2 has a 14 cycle latency, compared to the K8's 12 cycle latency. A latency increase of 16.7% for a chip that, on its very first revision, clocks 12.7% higher is not at all a bad scenario.
The associativity argument is real of course, but it is not nearly as problematic as it is sometimes made out to be. With 16-way associativity on L2 it seems extremely unlikely that any real issues will arise from this. If anything, 16-way associativity still appears to be a bit overkill.
In the end, of course "shared cache is a strictly good thing" is a simplification, but in the same way that "more cache is a good thing" is a simplification. It is not nearly as unreasonable as you make it out to be to expect an implementation to have a near-zero performance impact for a single thread only using half the cache. As it happens, the Conroe does a really nice job of that as well.
AMD has a better solution for that at the moment, but it is not due to some kind of trade-off, they would be better off with shared cache and HyperTransport.
Oh, and one more thing: As has already been pointed out, this is indeed what will happen with the K8L.
That does not change the fact that shared cache is a strictly good thing though. There are many other advantages. Sharing cache means more cache overall (since no data needs to be duplicated when both processors need it, a huge saving for common workloads), and more cache means far fewer memory accesses. Shared cache also means that such common data only needs to be read from memory once, where the reads would need to be duplicated when cache is not shared.
On the other hand, the only thing I replied to was that Intel has a great approach to doing multi-core, no matter how "kludgy" some people claim it is. The inter-core interconnect is overall pretty damn good. The total system bandwidth is a very separate issue. AMD has a better solution for that at the moment, but it is not due to some kind of trade-off, they would be better off with shared cache and HyperTransport.
As it happens Intel is not doing all that badly there either. The FSB design is kind of old, but it always was a great piece of engineering. They have cranked up the speed nicely and have two separate buses in the current platform, so properly feeding a quad-core does not seem like it should be all that tricky an obstacle to overcome. It is still a problem that Intel needs to solve sooner or later, but it is not currently a disaster in any way.
The Core 2 Duo, clocked to 3GHz, has an L2 cache bandwidth of 96 gigabytes per second. Not to mention latency: HyperTransport has a latency on the level of 100-200 nanoseconds, compared to the 14 cycle latency of the Core 2 Duo L2 (about 4.7 nanoseconds at 3GHz). This works out to a roughly 21-43 times higher latency for HyperTransport. I would expect a more modern version of HyperTransport to bring this number down quite a bit, but still, it is probably safe to assume that HyperTransport will for the foreseeable future have a latency well over ten times higher than a shared cache solution.
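The arithmetic behind those numbers is easy to redo. The sketch below only plugs in the figures from the paragraph above; the variable names are mine, and it assumes decimal gigabytes and that the 14 cycle latency is measured at the same 3GHz clock. Under those assumptions the ratio comes out at roughly 21 to 43, in line with the figure quoted above.

```python
CLOCK_HZ = 3.0e9                 # 3 GHz core clock, as stated above
L2_LATENCY_CYCLES = 14           # Core 2 Duo L2 latency, as stated above
L2_BANDWIDTH_BPS = 96e9          # 96 GB/s, decimal gigabytes assumed
HT_LATENCY_NS = (100.0, 200.0)   # HyperTransport latency range, as stated above

# Convert the cycle count into nanoseconds at the given clock.
l2_latency_ns = L2_LATENCY_CYCLES / CLOCK_HZ * 1e9

# Bandwidth per clock tick, which shows how wide the L2 path must be.
bytes_per_cycle = L2_BANDWIDTH_BPS / CLOCK_HZ

# How many times slower a HyperTransport hop is than an L2 hit.
ratios = tuple(ht / l2_latency_ns for ht in HT_LATENCY_NS)

print(f"L2 latency: {l2_latency_ns:.2f} ns")
print(f"L2 bandwidth: {bytes_per_cycle:.0f} bytes per cycle")
print(f"HyperTransport / L2 latency: {ratios[0]:.1f}x to {ratios[1]:.1f}x")
```

Even at the low end of the range the gap is more than an order of magnitude, which is the point being made.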
Yeah, K8L will get interesting of course, I don't expect things to stay the same. I just wanted to point out that it is not quite correct to accuse Intel of not having "proper" quad-core when the interconnect is in some ways superior to the way AMD currently does multi-core.
Calling either solution a "kludge" is of course wrong. However, just running everything across HyperTransport is an obviously worse approach for core-to-core communication than shared L2. The trick about sharing cache though is that it stops making sense to talk about the cores "sharing access to main memory", since any memory fetches go into the shared cache. Plus, Intel isn't stupid: their current platform has two separate front-side buses, so there is quite a bit of bandwidth to work with.
On the other hand it is of course also true that the HyperTransport approach is perfectly symmetrical and scales to almost any number of processors, so AMD has a good hold on the 8 core and above market.
It can equally well be argued that AMD's solution is a "kludge". Intel has four processors arranged in two pairs; within each pair the processors are connected by shared L2 cache, but the pairs are connected by the FSB. AMD on the other hand has all four processors communicating over HyperTransport links. Shared L2 is clearly better than HyperTransport links, and the HyperTransport links are better than Intel's current FSB.
The physical packaging simply doesn't tell much about the quality of the interconnect. Sure it is harder to make a truly great interconnect with separate packages, but looking directly at the interconnect tells the much more accurate story.
Either way, it is not all that great a surprise that the dual-FSB design of modern Intel platforms manages four cores decently, but yes, AMD probably still has a clear edge on 8 core systems.
Millionth time this little fact gets brought up in this type of discussion but:
The second-place winner is Sweden, which has a population density of 52 people per square mile, as compared to the US' 80 people per square mile.
Yes, it would sure have been nice if the article could at least have had a small P.S. to note whether the iPod Touch was covered or not.
Unrelated riders are for the most part unconstitutional in Europe, and in the case of the EU as well, I believe.
Apple asserts patents on the canvas tag, interestingly directly mirroring the concerns Microsoft IE developers expressed about WHATWG all along. So don't expect the canvas tag in IE ever.
It is for posts like this that one wishes there were a negative insightful score.