Maybe that's the business model of the future, but I would want to make sure at least some of my money ended up in the hands of the creators of the media I'm enjoying via that service.
If Blockbuster or Amazon or iTunes did it, there would have to be an agreement with *someone* involved in the production of the works (hopefully artists directly; unfortunately it would probably be record companies and studios). If TPB does it, $0 goes to the artists, and I cannot agree with that. I know that studios and record companies in general suck because they don't pay the artists enough, but they still pay them more than $0.
There's obviously a trade-off. For some problems, buying a faster chip is MUCH lower-return than rewriting something to be multithreaded, and you must weigh that against the cost of rewriting.
Buying a faster chip gets you on average about 10% speedup. But if your application gets, say, a 200-300% speedup on a quad-core processor (from being multithreaded and having nearly-linear scaling behavior as you add cores), it might be worth the investment. Depends what you're doing - how much parallelism is inherent in the problem, how much it costs for someone to parallelize it, and how much you value having the results sooner.
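To put rough numbers on that trade-off, here is a minimal sketch using Amdahl's law. The parallel fractions are made-up assumptions, and I'm reading "200-300% speedup" loosely as somewhere around 3x to 4x; the function names are mine, not from any library.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup over a single core when only `parallel_fraction` of the work scales."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def percent_gain(speedup: float) -> float:
    """Convert a speedup factor (e.g. 1.10x) into a percentage gain (e.g. 10%)."""
    return (speedup - 1.0) * 100.0


# Assumed baseline from the post: a faster chip is worth about 10%.
faster_chip = 1.10

# Hypothetical workloads on a quad-core: mostly parallel vs. half serial.
for fraction in (0.95, 0.50):
    s = amdahl_speedup(fraction, cores=4)
    print(f"parallel fraction {fraction:.2f}: {s:.2f}x "
          f"({percent_gain(s):.0f}% gain) vs. faster chip {faster_chip:.2f}x")
```

With nearly-linear scaling the rewrite wins by a mile; with half the work stuck serial, the gain shrinks a lot and the rewrite cost starts to matter.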
Yes, decode adding latency to the pipe was discussed in these posts. The GP was talking about decoding x86 instructions in parallel, which to me indicates he's concerned about decode bandwidth, not latency.
Take a look at what it takes to decode x86 instructions in parallel and then we'll talk.
With decode hints that come out of the instruction cache along with the instruction bytes themselves, it's not too bad. Besides, I don't think x86 machines are decode-bandwidth-limited. I don't think any modern microprocessor is decode-bandwidth-limited; they're memory-bandwidth and cache-miss-latency limited.
They still use die area,
Yes, I said that. I also said that compared to die area spent on lots of other things in a modern microprocessor/SoC (cache, memory controllers, point-to-point coherent and IO links, buffers everywhere), it's probably minor.
they still can't be turned off (to save power),
You would be surprised what intelligent clock gating can do. Clock gating can't reduce leakage but it does reduce dynamic switching power.
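A back-of-the-envelope sketch of that distinction, using the textbook dynamic power relation (activity x capacitance x V^2 x f) plus a simple leakage term. Every number below is a made-up assumption for illustration, not a measurement of any real chip, and `block_power` is just a name I picked.

```python
def block_power(activity, capacitance_f, voltage_v, freq_hz,
                leakage_a, gated_fraction=0.0):
    """Return (dynamic_watts, leakage_watts) for one logic block.

    gated_fraction is the share of time the block's clock is gated off.
    Gating scales the switching (dynamic) power; leakage is unaffected.
    """
    dynamic = (activity * capacitance_f * voltage_v ** 2 * freq_hz
               * (1.0 - gated_fraction))
    leakage = voltage_v * leakage_a
    return dynamic, leakage


# Hypothetical block: 20% activity, 1 nF switched capacitance, 1.0 V, 3 GHz,
# 50 mA of leakage current.
always_on = block_power(0.2, 1e-9, 1.0, 3e9, 0.05)
mostly_gated = block_power(0.2, 1e-9, 1.0, 3e9, 0.05, gated_fraction=0.9)

print("always on:    dynamic %.3f W, leakage %.3f W" % always_on)
print("gated 90%%:    dynamic %.3f W, leakage %.3f W" % mostly_gated)
```

The dynamic term drops with the gated fraction while the leakage term doesn't move, which is the whole point.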
and they still introduce defects (what cache almost doesn't do).
That's another side effect of them taking up die area. You said it three times in your post; I've acknowledged it but asserted that the die area is small compared to other things. I bet defects in cache arrays are WAY more likely to cause a chip to be scrapped or sold as a lower-performing part.
It doesn't matter (except for the die area, number of defects and not being able to turn off), the compiler won't use it, unless you are compiling your own binaries to your own machine.
I don't believe you, and in fact for one case I can prove you wrong. I can't speak for every software distributor that ships binaries, but since every single 64-bit x86 processor has support for both SSE and SSE2, gcc, by default, uses them when it's compiling code for an x86-64 target (look under the fpmath section):
http://gcc.gnu.org/onlinedocs/gcc/i386-and-x86_002d64-Options.html
RISC machines used to give you 32 registers of each kind, by the time they were 32 bits.
RISC machines are a pure Load/Store architecture and don't (or didn't) have the notion of load-execute or load-execute-store instructions. That necessitates more temp registers for doing operations to avoid too many loads/stores to stack-local variables. Ld-ex and Ld-ex-st help to reduce your need for registers in some cases. I would be interested to see register utilization and stack spills for a RISC machine vs. an x86-64 machine, but I haven't seen any data about it.
One of the problems the compact instruction set gives you is that once you use all the instructions, you can't address more registers, that is why x86 is stuck at 16 of them.
The extension from 8 to 16 used an instruction prefix which added an extra bit onto the register address fields in the sources and dest of the instruction. There's nothing stopping AMD or Intel from adding another prefix to add yet another register bit to those fields if anyone ever wanted to go to 32.
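For the curious, a minimal sketch of how that prefix bit gets glued onto the 3-bit register fields. It only handles the register-to-register case (REX.R extending ModRM.reg, REX.B extending ModRM.rm) and ignores SIB and memory operands. The function name and the `hypothetical_fifth_bit` parameter are mine; the fifth bit is purely an illustration of the "another prefix" idea, not something in the ISA as described here.

```python
def decode_registers(rex: int, modrm: int, hypothetical_fifth_bit: int = 0):
    """Combine REX extension bits with the 3-bit ModRM register fields.

    Returns (reg, rm) register numbers. Register-direct operands only.
    """
    if rex & 0xF0 != 0x40:
        raise ValueError("not a REX prefix (expected 0100WRXB)")
    rex_r = (rex >> 2) & 1          # extends ModRM.reg
    rex_b = rex & 1                 # extends ModRM.rm
    reg_field = (modrm >> 3) & 0b111
    rm_field = modrm & 0b111
    # A made-up extra prefix bit would simply sit above the REX bit.
    high = hypothetical_fifth_bit << 4
    reg = high | (rex_r << 3) | reg_field
    rm = high | (rex_b << 3) | rm_field
    return reg, rm


# 4D 89 C8 encodes "mov r8, r9": REX = 0x4D, ModRM = 0xC8.
reg, rm = decode_registers(0x4D, 0xC8)
print(f"source register r{reg}, destination register r{rm}")
```

Three bits in the instruction plus one bit from the prefix gives 16 registers; one more bit from somewhere would give 32.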
The extra decode complexity of x86 chips adds more pipeline stages to the execution path, increasing latencies for executing uncached instructions and increasing branch mispredict penalties.
If your instruction stream is uncachable, that's pretty much a don't-care for performance; it's going to be super slow regardless of pipeline depth because you have to go out to DRAM or IO (hundreds of clocks) every time you want to fetch a new block of instructions. Saving the few clocks in decode isn't going to make a big difference.
Branch mispredict penalty I'll give you; x86 decode does increase that by a few clocks in some cases. But there are a number of arch techniques which help reduce or hide that front-end latency (fetch/decode) from branch mispredict penalties: checkpointing the back-end so the front-end can be steered to the branch target early, once the branch mispredict is discovered (MIPS R10000, I don't know if x86 processors do it), using a trace cache to remove decode stages from the front-end pipe (P4, yes we know it was a failure but it had some interesting uarch ideas), storing decode hints in the instruction cache and L2 cache to speed up the variable-length instruction decode (AMD processors from Athlon to present).
All of the numerous add-ons to the instruction set (SSE, x64) require more instruction prefixes and counter the small code size benefit of a well designed CISC architecture.
I'm no RISC ISA expert, but aren't the SIMD instructions extensions to their instruction sets as well? Maybe they had enough free opcodes to do it without expanding any code size, I don't know. If so, good for them. If not, that's the same as x86.
In any case, it sounds like a wash, then. An x86 machine spends its code size benefit on useful ISA extensions and ends up with about the same instruction cache utilization as a RISC machine. A redesigned CISC could beat x86 in that one metric but would necessarily break ISA compatibility. That's possibly the one area where x86 compatibility costs it in performance. I say, BFD.
I do know that multiple implementations of ARM's NEON instructions basically tack a SIMD FP unit on to the *back* end of its pipeline. So doing SIMD FP on an ARM is likely to be slower than x86.
You often get a performance increase by moving to x64.
The single biggest reason for the performance bump when moving to 64-bit x86 was the increase in the number of both integer and SSE registers from 8 to 16. That means more data in registers and fewer spills to/fills from the stack (fewer loads and stores). Yes, having a limited number of registers was a problem with x86 prior to 64-bit. But like I just said, they fixed it! What are you complaining about?
I wish Slashdotters would stop with the incessant "x86 sucks" mantra. You're all fools.
There's plenty of crufty old instructions in the x86 ISA; no modern compilers generate them though, so no one cares that they're there. They take up a couple pages in the ISA manual I guess. The die area it takes to implement them is totally, completely insignificant. They're either in microcode (along with a bunch of other really useful instructions) or the hardware already exists for some other reason.
There's plenty of crufty segmentation and weird ways of laying out memory and whatnot; no modern OS uses that though, so no one cares that it's there. And again with the ISA manuals and some transistors. And there's plenty of modern paging and flat memory models and whatnot too.
AMD and Intel both know how to make good, fast, and (relatively) small hardware to decode variable-length x86 instructions. Yes, of course an x86 decoder is bigger (i.e. more expensive, more difficult to implement, etc.) than a RISC fixed-length decoder, but again, no one cares because we already know how to do it fast enough and cheap enough. Check out an x86 die photo sometime; most of it is cache. Probably about 1/50th is decoder.
And CISC-style+variable-length instructions get you a smaller code footprint and thus better instruction cache utilization vs. what you'd get with a fixed-length instruction stream. Examples: common ops get shorter instructions, there are more flexible addressing modes, more flexible sources/dests within a single instruction, you get one x86 instruction (no more than 15 bytes) to do what would take multiple RISC-style instructions (probably more than 15 bytes).
Sure there's the crufty x87 floating point stack. But there's also the shiny new SSE/SSE2/SSE3/whatever instructions, and modern compilers can exclusively use SSE/SSE2 to do the exact same thing (-mfpmath=sse does it in gcc). And again, die area for x87 FP stuff isn't a big deal since a lot of the hardware is shared with SSE.
ISA extensions have been added to cover all the newfangled SIMD stuff and virtualization you can want. AMD64 covers 64-bit stuff. And 64-bit stuff gives you extra registers too (8 extra integer, 8 extra SSE for a total of 16 each), which is great and a nod to the large number of registers that RISC machines give you.
In short, what the hell is everyone bitching about?
The only reason there are businesses (not "organizations"...it's an important distinction) willing to spend millions of dollars paying people to find a cure for cancer is because they think they will be able to make more money than that selling the cure. If the cure must be "free", the businesses aren't going to invest millions researching it.
As for the doctors who want to help society...yes, I'll grant that that's probably one of their primary motivations, but that motivation MUST be secondary to their basic needs for food, clothing, shelter, transportation, their own health care, etc....i.e., money, since money is required to purchase all those things. If the businesses won't pay them because they can't make any money (see above), what is the likelihood that those doctors will continue to research cancer drugs instead of doing something else where they CAN make money? My argument is that it's close to 0.
Considering the general societal good involved in finding a cure for cancer, a taxpayer-funded not-for-profit organization is probably the way to go for cancer drug development. I.e., socialism. And we do have some of that going on now. Pure socialism would also benefit the above doctor; as he is researching cancer drugs, his basic needs would be met by the government at no cost to him. But that's a wildly different society than the one currently in place. And history shows us that implementing non-corrupt socialist states has proven difficult.
My point is, when people talk about abolishing copyright and patent law, LOTS more stuff would need to become taxpayer-funded (socialist), or it just wouldn't get done.
Chair designer designs and builds one (1) chair over the course of a few months
Everyone who wants that particular chair uses hctp to give themselves a copy of that chair for $0.
Obviously no one has destroyed the original chair which was created by the chair designer, but what everyone has done was declare the chair designer's months of time spent designing the original chair worth $0.
Now, maybe a chair designer's time really is worth $0 in today's world; chair designs in general are already in the "public domain" since they've been around for thousands of years. But let's replace "chair designer" with "cancer drug designer", and let's replace "few months" with "20 years" to account for the cancer drug designer's time spent in med school, as a resident, and finally as a cancer drug researcher. Is the cancer drug designer's 20 years worth of time spent on developing a drug that cures cancer worth $0, just because everyone can give themselves a copy of his cancer drug for $0?
Good for them. If there's enough volume for them to do this profitably, then I've probably already made enough money and can move on to something else.
There is always enough volume for anyone to do this. If I can get something for $0, and if that thing has any value at all - entertainment value, real-life utility, whatever - then I've just profited since I paid $0 for something of value. Economies-of-scale or volume discounts or anything do not apply; it can happen with a volume of 1. Example: downloading music for $0 from a P2P network.
When something costs $0 to replicate, the "free-market price determination" method of pricing it (which Slashdotters so often like to quote) just plain does not work. The market will *always* settle on $0 as its price, regardless of its value, simply because it's possible to get one for $0, not because it is worth $0.
The cost of producing that thing isn't in replicating it, it was in its initial creation. That is, the entire cost was in creating ONLY the first one...none of the rest of them cost anything. But, unfortunately for the creator of that thing, the marginal cost ($0) does not reflect the value. The creator is now denied the ability to amortize the creation cost of that thing over more than one sale of the thing.
I'm legitimately not sure if we're agreeing or disagreeing.
I was disagreeing with the OP who called Intellectual Property "propaganda"; I believe that no-marginal-cost-of-reproduction stuff HAS lots of value, including monetary value. Its value lies in the thought, knowledge, novelty, work, and creativity to conceive, design, and implement the design/music/art/software/etc. and the fact that it requires a special skill/talent/education to know enough about the area to create such a thing. The marginal cost of producing another copy of the IP is irrelevant - its entire cost, and also its entire value, lies in its initial creation.
What do you call the design of your Intel Core 2 processor before it gets fabbed into silicon and metal? Are you claiming that the RTL/schematics/"design" of that processor are worthless?
There are a number of problems with both OP's and your analysis.
You imply charting sales in dollars is problematic, and I agree but for a different reason. Sales in dollars does not account for growth of the industry as a whole. If the industry grows at a rate higher than the piracy rate, raw sales in dollars will go up. We're basically measuring the difference between the growth rate and the piracy rate, which unfortunately is not what anyone cares about. If he had instead charted sales in dollars PER karaoke establishment, that might be a crude way to factor industry growth in with his numbers.
You are assuming there is no time lag in your correlation of the two curves, which is a very strong assumption. If there were 1 year of lag between a trend shown in the "# of posts" curve and the "sales in dollars" curve, you might end up with a different analysis. Even if there IS a causal link between two trends, the best anyone can do is estimate what the time lag might be. No one would simply assume it is 0 or close to 0.
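Here's a minimal sketch of what checking for lag would even look like: correlate one series against the other shifted by some number of periods. The two series below are completely made up (I built the "sales" one to trail the "posts" one by exactly one period), so this shows the mechanics only and says nothing about the real data.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def lagged_correlation(cause, effect, lag):
    """Correlate cause[t] with effect[t + lag], for lag >= 0."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return pearson(cause, effect)
    return pearson(cause[:-lag], effect[lag:])


# Made-up data: sales respond to the PREVIOUS period's post count.
posts = [1, 3, 2, 5, 4, 6, 8, 7]
sales = [95] + [100 - 10 * p for p in posts[:-1]]

for lag in range(3):
    print(f"lag {lag}: r = {lagged_correlation(posts, sales, lag):+.2f}")
```

Assuming lag 0 on data like this would understate a relationship that is perfect at lag 1, which is why the zero-lag assumption is such a strong one.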
What are your great ways of showing correlation between sets of data like this? The best way I can think of is to get honest surveys of the degree to which people pirate songs/albums. Unfortunately, any survey that shows a correlative link between piracy and decreased sales is immediately going to be discounted as biased, so then what?
Given the above, I don't know what we can say from the data - probably nothing.
Well, thanks for offering that lovely response which starts off by you effectively saying "No, you're wrong!", and then argues that we should just roll over and accept a zero-value-in-IP society because everyone can get shit for free from P2P networks. Wake up and realize that there's more at stake here than your fucking pop music and episodes of Grey's Anatomy.
I don't know how you define real goods, but a lot of them are just IP goods which are subsequently transcribed or created in physical form. That nice Core 2 processor running your laptop? Until it gets fabbed, it's just a bunch of Verilog in a CVS repository somewhere. And then after that it's some pictures of some lines and rectangles. Both forms are just bits on a hard drive somewhere, easily replicated at no cost.
The value in that processor is as much in the *design* of it - which is an IP good - as it is in the silicon and metal and packaging. Take away the design (which everyone on Slashdot so proudly calls Imaginary Property), and all we have is a really, really expensive hunk of sand. Hope you enjoy it!
In your Real Economy, should IP creators have to sell shoes to pay the bills while also designing microprocessors?
Here's what everyone on Slashdot seems to miss. IP goods - those easily-reproducible but hard-to-think-up-or-produce-in-the-first-place goods - are the future of modern society. Capitalism requires that IP creators be rewarded *monetarily* for their efforts, so that they can buy the non-IP goods they need to survive, things like food and shelter and clothing and transportation.
We can either a) hope that somehow society will evolve to the point where the non-IP goods will become free or easily accessible to those of us involved in the production of IP (some sort of non-corrupt communal/communist state?), or b) we can try to make money from the things we're good at! I'm not betting on a non-corrupt a), which is why I have a hard time completely opposing these types of things which ostensibly reward IP creators for their work.
I understand the corruption involved in rewarding IP creators (middle men like the RIAA taking all the profits, leaving the IP creators with nothing), but assuming THAT can be fixed, we're back to the basic question: how do we protect the value of IP so that we can be rewarded *monetarily* in our capitalist society?
P.S. - On one hand, I understand that most people on Slashdot are left-oriented "free" folk, but a lot are also employed in some sort of tech or IP-related field. Your future depends on getting this right!
Intel slapped tons of cache on their P4s and Core 2s because it was an effective way of masking the poor memory access speeds (due to the lack of an integrated memory controller).
All cache can possibly do is mask memory access speeds. Intel might have had more of a need to mask the memory access speeds with P4 and Core 2, but that doesn't mean that they somehow gain less of an advantage by throwing tons of it on there.
I also have a feeling AMD saw that their performance with a small cache + IMC was great compared to P4, and so left it small because it obviously costs less to put less cache on the die (less area, fewer defects, more die per wafer, etc.).
For AMD, the extra cache made basically no difference in performance, so it was a waste to add it.
No. Even with the DRAM controller on-die, L2 cache is faster, by many, many clock cycles. How much that matters obviously depends on the dataset and code size you're working with, but to say that there was no performance difference is flat-out wrong.
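A rough sketch of why, using the usual average-access-time arithmetic for accesses that miss L1. All the cycle counts and miss rates are assumptions I picked for illustration, not measured numbers for any Athlon or Core 2, and the function name is mine.

```python
def average_access_cycles(l2_hit_cycles, dram_cycles, l2_miss_rate):
    """Average cost of an access that reaches L2: hit time plus the
    fraction of accesses that also have to pay for a trip to DRAM."""
    return l2_hit_cycles + l2_miss_rate * dram_cycles


# Hypothetical figures: 15-cycle L2, 150-cycle DRAM even WITH an on-die
# memory controller, and a bigger L2 halving the miss rate.
small_l2 = average_access_cycles(15, 150, l2_miss_rate=0.10)
large_l2 = average_access_cycles(15, 150, l2_miss_rate=0.05)

print(f"small L2: about {small_l2:.1f} cycles per access")
print(f"large L2: about {large_l2:.1f} cycles per access")
```

An integrated controller shrinks the DRAM number, but as long as it stays many times the L2 hit time, every miss the larger cache avoids still counts. Whether the miss rate actually drops depends on the dataset, like I said.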
the 1M cache models were promptly dropped.
You must be referring to AMD not making a 65nm Athlon64 (the "Brisbane" core) with 1M after they made plenty of 90nm parts with 1M. I don't know the reasons for that, but it certainly had nothing to do with a lack of performance gain from more cache. It might have been lack of demand for that extra performance on what amounts to a desktop-only part, since they never sold Brisbanes as Opterons, only as Athlons. Or something to that effect.
Cache size is an area where AMD has considerably lagged Intel for quite some time. I'd consider AMD's bumping up of cache size on the 45nm parts to be more about evening out a competitive disadvantage they had for 2 years than a desperate attempt to make a non-competitive product compete with an obviously superior one (a la P4 vs. K8).
The Core 2 Quads have large amounts of beyond-L1 cache: 2 x 4M of L2 (or sometimes 2 x 6M), so 8-12M total plus some inter-die communication latency between the two L2s if you needed to go across the MCM to get a cacheline.
The earlier 65nm Phenoms had 4 x 512k L2 + 2M L3, so a total of 4M of beyond-L1 cache plus the inter-core latency of accessing any of the other cores' L2 or the L3. That's half or less of what the Core 2 has. That alone can mostly account for the clock-for-clock advantage that the Core 2 had over the Phenom. (There were other tweaks Intel put in the Core 2 which are great too...Core 2 is a good part...but doubling the effective cache size is a big deal.)
Slapping the 6M L3 on the Phenom II helps even out that disadvantage.
1. The transcripts aren't free. I think this one cost around $2500.
You must have missed the irony and analogy here. An electronic court transcript is just a bunch of bits on a hard drive somewhere, whose creation price has been (arbitrarily) set at $2500. Since the electronic document is now in the ether of the Internet, no one will ever again pay $2500 for it; they will pay $0. Does this remind you of anything?
Of course this isn't actually a good example, because the creation cost (the court reporter's time and equipment) is probably already paid for by the time the transcript is sold.
The funny thing about graphics is that it's entirely about throughput, not latency. You could have a pretty dog-slow "stream processor", but as long as you build enough of them on a chip and/or stick enough chips on your board, and have enough memory bandwidth to feed all of them, the performance of a single shader core pretty much doesn't matter.
Like you said, DAAMIT built a pretty middle-of-the-road RV770 (compared to the raw performance of the NV chip) but was able to tie two of them together in a reasonable thermal/power/bandwidth/cost board, and took the graphics performance crown. I wouldn't be surprised to see a 4870 X3 or X4 sometime soon as long as those other factors (thermal/power/bandwidth/cost) remain in line.
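The throughput-over-latency argument can be sketched as a toy model: total rate is the number of cores times the per-core rate, capped by what the memory system can feed. Every figure here is invented for illustration and has nothing to do with real RV770 or NV specs; the function name is mine.

```python
def sustained_ops_per_sec(cores, ops_per_core_per_sec,
                          bytes_per_op, mem_bandwidth_bytes_per_sec):
    """Toy throughput model: compute-limited or bandwidth-limited,
    whichever is lower. Per-core latency does not appear at all."""
    compute_limit = cores * ops_per_core_per_sec
    bandwidth_limit = mem_bandwidth_bytes_per_sec / bytes_per_op
    return min(compute_limit, bandwidth_limit)


# Hypothetical parts: many slow cores vs. few fast ones, same memory system.
many_slow = sustained_ops_per_sec(800, 1e9, 4, 1e13)
few_fast = sustained_ops_per_sec(100, 4e9, 4, 1e13)

# Same "many slow cores" chip starved by a tenth of the bandwidth.
starved = sustained_ops_per_sec(800, 1e9, 4, 1e12)

print(f"many slow cores: {many_slow:.2e} ops/s")
print(f"few fast cores:  {few_fast:.2e} ops/s")
print(f"bandwidth-starved: {starved:.2e} ops/s")
```

In this model the slow-but-wide part wins as long as you can feed it, and adding more chips to the board only helps while bandwidth (and thermals, power, and cost) keep up.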
It's not my problem when creators of Imaginary Property can't find a working business model for themselves. They're perfectly welcome to go find something else to do with their lives.
That's the scary thing! What if all IP creators decided they'd be better off if they didn't try to survive on thinking up/designing/creating new non-physical goods because they can't get paid? What would happen if all the auxiliary software-as-a-service type business models fail? I sincerely hope some smart people keep finding business models to support the creation of IP, so that IP creators can continue doing their thing and not having to take jobs at McDonald's or selling shoes or something. Lots of innovation and ideas are at stake. Then it might become everyone's problem.
I realize this is almost certainly a non-plausible scenario, and that taken to that extreme the free market would find a way to reward IP creators monetarily. But I'm trying to make a point by taking the limit of your utopian "information should be free" idea, and applying that same point to today's situation.
No matter what DRM or legal measures they try to go to, it's just not going to happen.
I hate DRM too, I wish it would die. But that's orthogonal to my point: that IP has value, and the IP creator deserves to be compensated appropriately for that value, somehow. I obviously don't know how given the zero-replication-cost problem.
"We want users of information to be able to freely modify and share it without restriction."
But what you're missing is that this is not possible in a capitalist society where zero-cost-of-replication "Intellectual Property" (IP) is considered to have value. Let me break it down for you.
Businesses amortize the development cost of a product in with the manufacturing costs, and they come up with a sale price which covers both. That way, if they sell enough copies of the product, they cover the initial cost of developing the product, as well as the production costs. Fortunately for them, the value of the product is both in the idea/design/creation, as well as in the physical item itself. So they can get away with charging a price *higher* than just the manufacturing costs, and people have to pay it because they can't manufacture it themselves.
With IP, the production cost is 0. The *entire cost*, and also the *entire value* of the IP, is in the idea/design/creation. The cost of replication is zero, and there is no value in any physical item because there is none.
Say I produce some non-tangible IP (code, music, hardware designs in Verilog, pictures, film, whatever) which in its final form is just some bits on my hard drive. My IP, in and of itself, has worth because other people enjoy it, or because they want it but cannot produce it on their own. Producing that IP took many months or even years of my time. I'd like to be able to sell my IP for whatever the economy deems its *creation* value so I can cover the cost of creation, and maybe make some profit like any good capitalist should be able to.
So how do I go about getting paid? In an "information wants to be free" society, I can only get paid for the first instance of that IP. Every other copy nets me $0, because once the information is out there, it's free. So, I better set my price for that first copy really high, because that one lump sum is all I'm ever going to get, and I have to cover months or years of development costs in a single shot. Since the price is so high, no one can afford my IP. The only price anyone can afford won't cover my development costs. Learning from this, I then decide to never do IP development again, since I am now broke and penniless. (Remember, we're still in a capitalist society). IP, and all the good things that come with it, die a quick death.
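The amortization arithmetic behind that, as a minimal sketch. The dollar figures are invented for illustration and the function name is mine.

```python
def break_even_price(creation_cost, marginal_cost, copies_sold):
    """Price per copy needed to recover the one-time creation cost
    plus the per-copy cost of making each unit."""
    if copies_sold < 1:
        raise ValueError("need at least one sale to recover anything")
    return marginal_cost + creation_cost / copies_sold


# Hypothetical: a year of work valued at $100,000, zero replication cost.
creation_cost = 100_000.0

for copies in (1, 100, 10_000):
    price = break_even_price(creation_cost, 0.0, copies)
    print(f"{copies:>6} paid copies -> ${price:,.2f} each to break even")
```

Spread over ten thousand paid copies the price is pocket change; if only the first copy is ever paid for, that one buyer has to cover the whole thing.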
In a perfect communist society, I wouldn't mind putting a year into the creation of some IP that I would give away for free, because over the course of that year I'd have my needs met for "free" as well.
The old arguments are that you can't sell just the IP - you have to sell "service" or a "live show" or something else which isn't zero-cost-of-replication, and in doing that, recoup the initial creation cost. But whenever someone makes that argument, they imply that the IP itself has no dollar value because it can't be sold for anything more than $0. I strongly disagree with that implication.
Anyway, what we capitalists have come up with is this artificial restriction on IP, so that IP creators can amortize the cost of conception/design/creation over more than one sale. The only way to enforce this is to restrict what people do with the IP. It's up to the creator of the IP what those restrictions are. FSF might choose one way of restricting the IP (and ironically that restriction basically throws to the wind the whole point of the restriction), but that's not the only restriction possible.
Maybe that's the business model of the future, but I would want to make sure at least some of my money ended up in the hands of the creators of the media I'm enjoying via that service.
If Blockbuster or Amazon or iTunes did it, there would have to be an agreement with *someone* involved in the production of the works (hopefully artists directly; unfortunately it would probably be record companies and studios). If TPB does it, $0 goes to the artists, and I cannot agree with that. I know that studios and record companies in general suck because they don't pay the artists enough, but they still pay them more than $0.
There's obviously a trade-off. For some problems, buying a faster chip is MUCH lower-return than rewriting something to be multithreaded, and you must weight that with the cost of rewriting.
Buying a faster chip gets you on average about 10% speedup. But if your application gets, say, a 200-300% speedup on a quad-core processor (from being multithreaded and having nearly-linear scaling behavior as you add cores), it might be worth the investment. Depends what you're doing - how much parallelism is inherent in the problem, how much it costs for a someone to parallelize it, and how much you value having the results sooner.
Yes, decode adding latency to the pipe was discussed in these posts. The GP was talking about decoding x86 instructions in parallel, which to me indicates he's concerned about decode bandwidth, not latency.
Take a look at what it takes to decode x86 instruction in parallel and then we'll talk.
With decode hints that come out of the instruction cache along with the instruction bytes themselves, it's not too bad. Besides, I don't think x86 machines are decode-bandwidth-limited. I don't think any modern microprocessor is decode-bandwidth-limited, it's memory-bandwidth and cache-miss-latency limited.
They still use die area,
Yes, I said that. I also said that compared to die area spent on lots of other things in a modern microprocessor/SoC (cache, memory controllers, point-to-point coherent and IO links, buffers everywhere), it's probably minor.
they still can't be turned off (to save power),
You would be surprised what intelligent clock gating can do. Clock gating can't reduce leakage but it does reduce dynamic switching power.
and they still introduce defects (what cache almost doesn't do).
That's another side effect of them taking up die area. You said it three times in your post; I've acknowledged it but asserted that the die area is small compared to other things. I bet defects in cache arrays are WAY more likely to cause a chip to be scrapped or sold as a lower-performing part.
It doesn't matter (except for the die area, number of defects and not being able to turn offf), the compiler won't use it, unless you are compiling your own binaries to your own machine.
I don't believe you, and in fact for one case I can prove you wrong. I can't speak for every software distributor that ships binaries, but since every single 64-bit x86 processor has support for both SSE and SSE2, gcc, by default, uses them when it's compiling code for an x86-64 target (look under the fpmath section): http://gcc.gnu.org/onlinedocs/gcc/i386-and-x86_002d64-Options.html
RISC machines used to give you 32 register of each kind, by the time they were 32 bits.
RISC machines are a pure Load/Store architecture and don't (or didn't) have the notion of load-execute or load-execute-store instructions. That necessitates more temp registers for doing operations to avoid too many loads/stores to stack-local variables. Ld-ex and Ld-ex-st help to reduce your need for registers in some cases. I would be interested to see register utilization and stack spills for a RISC machine vs. an x86-64 machine, but I haven't seen any data about it.
One of the problems the compact instruction set gives you is that once you use all the instructions, you can't address more registers, that is why x86 is stuck at 16 of them.
The extension from 8 to 16 used an instruction prefix which added an extra bit onto the register address fields in the sources and dest of the instruction. There's nothing stopping AMD or Intel from adding another prefix to add yet another register bit to those fields if anyone ever wanted to go to 32.
The extra decode complexity of x86 chips adds more pipeline stages to the execution path, increasing latencies for executing uncached instructions and increasing branch mispredict penalties.
If your instruction stream is uncachable, that's pretty much a don't-care for performance; it's going to be super slow regardless of pipeline depth because you have to go out to DRAM or IO (hundreds of clocks) every time you want to fetch a new block of instructions. Saving the few clocks in decode isn't going to make a big difference.
Bracnh mispredict penalty I'll give you, x86 decode does increase that by a few clocks in some cases. But there are a number of arch techniques which help reduce or hide that front-end latency (fetch/decode) from branch mispredict penalties: checkpointing the back-end so the front-end can be steered to the branch target early, once the branch mispredict is discovered (MIPS R10000, I don't know if x86 processors do it), using a trace cache to remove decode stages from the front-end pipe (P4, yes we know it was a failure but it had some interesting uarch ideas), storing decode hints in the instruction cache and L2 cache to speed up the variable-length instruction decode (AMD processors from Athlon to present).
All of the numerous add-ons to the instruction set (SSE, x64) require more instruction prefixes and counter the small code size benefit of a well designed CISC architecture.
I'm no RISC ISA expert, but aren't the SIMD instructions also extensions to their instruction set as well? Maybe they had enough free opcodes to do it without expanding any code size, I don't know. If so, good for them. If not, that's the same as x86.
In any case, it sounds like a wash, then. An x86 machine spends its code size benefit on useful ISA extensions and ends up with about the same instruction cache utilization as a RISC machine. A redesigned CISC could beat x86 in that one metric but would necessarily break ISA compatibility. That's possibly the one area where x86 compatibility costs it in performance. I say, BFD.
I do know that multiple implementations of ARM's NEON instructions basically tack a SIMD FP unit on to the *back* end of its pipeline. So doing SIMD FP on an ARM is likely to be slower than x86.
You often get a performance increase by moving to x64.
The single biggest reason for the performance bump when moving to 64-bit x86 was the increase in the number of both integer and SSE registers from 8 to 16. That means more data in registers and fewer data spills to/fills from the stack (fewer loads and stores). Yes, having a limited number of registers was a problem with x86 prior to 64-bit. But like I just said, they fixed it! What are you complaining about?
I wish Slashdotters would stop with the incessant "x86 sucks" mantra. You're all fools.
There's plenty of crufty old instructions in the x86 ISA; no modern compilers generate them though, so no one cares that they're there. They take up a couple pages in the ISA manual I guess. The die area it takes to implement them is totally, completely insignificant. They're either in microcode (along with a bunch of other really useful instructions) or the hardware already exists for some other reason.
There's plenty of crufty segmentation and weird ways of laying out memory and whatnot; no modern OS uses that though, so no one cares that it's there. And again with the ISA manuals and some transistors. And there's plenty of modern paging and flat memory models and whatnot too.
AMD and Intel both know how to make good, fast, and (relatively) small hardware to decode variable-length x86 instructions. Yes, of course an x86 decoder is bigger (i.e. more expensive, more difficult to implement, etc.) than a RISC fixed-length decoder, but again, no one cares because we already know how to do it fast enough and cheap enough. Check out an x86 die photo sometime; most of it is cache. Probably about 1/50th is decoder.
And CISC-style+variable-length instructions get you a smaller code footprint and thus better instruction cache utilization vs. what you'd get with a fixed-length instruction stream. Examples: common ops get shorter instructions, there are more flexible addressing modes, more flexible sources/dests within a single instruction, you get one x86 instruction (no more than 15 bytes) to do what would take multiple RISC-style instructions (probably more than 15 bytes).
Sure there's the crufty x87 floating point stack. But there's also the shiny new SSE/SSE2/SSE3/whatever instructions, and modern compilers can exclusively use SSE/SSE2 to do the exact same thing (-mfpmath=sse does it in gcc). And again, die area for x87 FP stuff isn't a big deal since a lot of the hardware is shared with SSE.
ISA extensions have been added to cover all the newfangled SIMD stuff and virtualization you can want. AMD64 covers 64-bit stuff. And 64-bit stuff gives you extra registers too (8 extra integer, 8 extra SSE for a total of 16 each), which is great and a nod to the large number of registers that RISC machines give you.
In short, what the hell is everyone bitching about?
The only reason there are businesses (not "organizations"...it's an important distinction) willing to spend millions of dollars paying people to find a cure for cancer is because they think they will be able to make more money than that selling the cure. If the cure must be "free", the businesses aren't going to invest millions researching it.
As for the doctors who want to help society...yes, I'll grant that that's probably one of their primary motivations, but that motivation MUST be secondary to their basic needs for food, clothing, shelter, transportation, their own health care, etc....i.e., money, since money is required to purchase all those things. If the businesses won't pay them because it can't make any money (see above), what is the likelihood that those doctors will continue to research cancer drugs instead of doing something else where they CAN make money? My argument is that it's close to 0.
Considering the general societal good involved in finding a cure for cancer, a taxpayer-funded not-for-profit organization is probably the way to go for cancer drug development. I.e., socialism. And we do have some of that going on now. Pure socialism would also benefit the above doctor; as he is researching cancer drugs, his basic needs would be met by the government at no cost to him. But that's a wildly different society than the one currently in place. And history shows us that implementing non-corrupt socialist states has proven difficult.
My point is, when people talk about abolishing copyright and patent law, LOTS more stuff would need to become taxpayer-funded (socialist), or it just wouldn't get done.
Let's continue your analogy:
Obviously no one has destroyed the original chair which was created by the chair designer, but what everyone has done was declare the chair designer's months of time spent designing the original chair worth $0.
Now, maybe a chair designer's time really is worth $0 in today's world; chair designs in general are already in the "public domain" since they've been around for thousands of years. But let's replace "chair designer" with "cancer drug designer", and let's replace "few months" with "20 years" to account for the cancer drug designer's time spent in med school, as a resident, and finally as a cancer drug researcher. Is the cancer drug designer's 20 years worth of time spent on developing a drug that cures cancer worth $0, just because everyone can give themselves a copy of his cancer drug for $0?
Good for them. If there's enough volume for them to do this profitably, then I've probably already made enough money and can move on to something else.
There is always enough volume for anyone to do this. If I can get something for $0, and if that thing has any value at all - entertainment value, real-life utility, whatever - then I've just profited since I paid $0 for something of value. Economies-of-scale or volume discounts or anything do not apply; it can happen with a volume of 1. Example: downloading music for $0 from a P2P network.
When something costs $0 to replicate, the "free-market price determination" method of pricing it (which Slashdotters so often like to quote) just plain does not work. The market will *always* settle on $0 as its price, regardless of its value, simply because it's possible to get one for $0, not because it is worth $0.
The cost of producing that thing isn't in replicating it, it was in its initial creation. That is, the entire cost was in creating ONLY the first one...none of the rest of them cost anything. But, unfortunately for the creator of that thing, the marginal cost ($0) does not reflect the value. The creator is now prevented from amortizing the creation cost of that thing over more than one sale of the thing.
Probably "Trade Secret"
That's just another form of IP.
I'm legitimately not sure if we're agreeing or disagreeing.
I was disagreeing with the OP who called Intellectual Property "propaganda"; I believe that no-marginal-cost-of-reproduction stuff HAS lots of value, including monetary value. Its value lies in the thought, knowledge, novelty, work, and creativity to conceive, design, and implement the design/music/art/software/etc. and the fact that it requires a special skill/talent/education to know enough about the area to create such a thing. The marginal cost of producing another copy of the IP is irrelevant - its entire cost, and also its entire value, lies in its initial creation.
What do you call the design of your Intel Core 2 processor before it gets fabbed into silicon and metal? Are you claiming that the RTL/schematics/"design" of that processor are worthless?
Given the above, I don't know what we can say from the data - probably nothing.
Well, thanks for offering that lovely response which starts off by you effectively saying "No, you're wrong!", and then argues that we should just roll over and accept a zero-value-in-IP society because everyone can get shit for free from P2P networks. Wake up and realize that there's more at stake here than your fucking pop music and episodes of Grey's Anatomy.
I don't know how you define real goods, but a lot of them are just IP goods which are subsequently transcribed or created in physical form. That nice Core 2 processor running your laptop? Until it gets fabbed, it's just a bunch of Verilog in a CVS repository somewhere. And then after that it's some pictures of some lines and rectangles. Both forms are just bits on a hard drive somewhere, easily replicated at no cost.
The value in that processor is as much in the *design* of it - which is an IP good - as it is in the silicon and metal and packaging. Take away the design (which everyone on Slashdot so proudly calls Imaginary Property), and all we have is a really, really expensive hunk of sand. Hope you enjoy it!
In your Real Economy, should IP creators have to sell shoes to pay the bills while also designing microprocessors?
Here's what everyone on Slashdot seems to miss. IP goods - those easily-reproducible but hard-to-think-up-or-produce-in-the-first-place goods - are the future of modern society. Capitalism requires that IP creators be rewarded *monetarily* for their efforts, so that they can buy the non-IP goods they need to survive, things like food and shelter and clothing and transportation.
We can either a) hope that somehow society will evolve to the point where the non-IP goods will become free or easily accessible to those of us involved in the production of IP (some sort of non-corrupt communal/communist state?), or b) we can try to make money from the things we're good at! I'm not betting on a non-corrupt a), which is why I have a hard time completely opposing these types of things which ostensibly reward IP creators for their work.
I understand the corruption involved in rewarding IP creators (middle men like the RIAA taking all the profits, leaving the IP creators with nothing), but assuming THAT can be fixed, we're back to the basic question: how do we protect the value of IP so that we can be rewarded *monetarily* in our capitalist society?
P.S. - On one hand, I understand that most people on Slashdot are left-oriented "free" folk, but a lot are also employed in some sort of tech or IP-related field. Your future depends on getting this right!
Intel slapped tons of cache on their P4s and Core 2s because it was an effective way of masking the poor memory access speeds (due to the lack of an integrated memory controller).
All cache can possibly do is mask memory access speeds. Intel might have had more of a need to mask the memory access speeds with P4 and Core 2, but that doesn't mean that they somehow gain less of an advantage by throwing tons of it on there.
I also have a feeling AMD saw that their performance with a small cache + IMC was great compared to P4, and so left it small because it obviously costs less to put less cache on the die (less area, fewer defects, more die per wafer, etc.).
For AMD, the extra cache made basically no difference in performance, so it was a waste to add it.
No. Even with the DRAM controller on-die, L2 cache is faster, by many, many clock cycles. How much that matters obviously depends on the dataset and code size you're working with, but to say that there was no performance difference is flat-out wrong.
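To put rough numbers on it, here's the usual average-access-time arithmetic (hit time plus miss rate times miss penalty). Every number below is an assumption for illustration: 15 clocks for an L2 hit, 150 clocks to DRAM even with the controller on-die, and a bigger L2 that halves the miss rate for some particular workload.

```python
def average_access_clocks(l2_hit_clocks, l2_miss_rate, dram_clocks):
    """Average cost of an access that missed L1: L2 hit time plus expected DRAM penalty."""
    return l2_hit_clocks + l2_miss_rate * dram_clocks


# Hypothetical workload: the larger L2 cuts the miss rate from 10% to 5%.
small_l2 = average_access_clocks(l2_hit_clocks=15, l2_miss_rate=0.10, dram_clocks=150)
large_l2 = average_access_clocks(l2_hit_clocks=15, l2_miss_rate=0.05, dram_clocks=150)
print(f"small L2: {small_l2:.1f} clocks, large L2: {large_l2:.1f} clocks")
```

If the working set already fits in the smaller cache, the miss rate doesn't move and neither does the result, which is the "depends on the dataset" part.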
the 1M cache models were promptly dropped.
You must be referring to AMD not making a 65nm Athlon64 (the "Brisbane" core) with 1M after they made plenty of 90nm parts with 1M. I don't know the reasons for that, but it certainly had nothing to do with a lack of performance gain from more cache. It might have been lack of demand for that extra performance on what amounts to a desktop-only part, since they never sold Brisbanes as Opterons, only as Athlons. Or something to that effect.
Cache size is an area where AMD has considerably lagged Intel for quite some time. I'd consider AMD's bumping up of cache size on the 45nm parts to be more a matter of evening out a competitive disadvantage they had for 2 years than a desperate attempt to make a non-competitive product compete with an obviously superior one (a la P4 vs. K8).
The Core 2 Quads have large amounts of beyond-L1 cache: 2 x 4M of L2 (or sometimes 2 x 6M), so 8-12M total plus some inter-die communication latency between the two L2s if you needed to go across the MCM to get a cacheline.
The earlier 65nm Phenoms had 4 x 512k L2 + 2M L3, so a total of 4M of beyond-L1 cache plus the inter-core latency of accessing any of the other cores' L2 or the L3. That's half or less of what the Core 2 has. That alone can mostly account for the clock-for-clock advantage that the Core 2 had over the Phenom. (There were other tweaks Intel put in the Core 2 which are great too...Core 2 is a good part...but doubling the effective cache size is a big deal.)
Slapping the 6M L3 on the Phenom II helps even out that disadvantage.
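For anyone who wants to check the totals, here's the arithmetic as a small sketch. It just sums capacities and ignores latency and how the levels share data; the Phenom II line assumes it keeps the same 4 x 512k L2 as the 65nm part.

```python
MIB = 1024  # KiB per MiB


def beyond_l1_kib(l2_count, l2_kib_each, l3_kib=0):
    """Total L2 + L3 capacity in KiB (a naive sum of the cache sizes)."""
    return l2_count * l2_kib_each + l3_kib


parts = {
    "Core 2 Quad, 2 x 4M L2": beyond_l1_kib(2, 4 * MIB),
    "Core 2 Quad, 2 x 6M L2": beyond_l1_kib(2, 6 * MIB),
    "65nm Phenom, 4 x 512k L2 + 2M L3": beyond_l1_kib(4, 512, 2 * MIB),
    "Phenom II, 4 x 512k L2 + 6M L3": beyond_l1_kib(4, 512, 6 * MIB),
}

for name, kib in parts.items():
    print(f"{name}: {kib // MIB}M")
```

That puts the Phenom II at 8M, level with the smaller Core 2 Quad configuration.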
Are you suggesting that printing, binding, and shipping a single copy of 700 pages of text costs $2500?
1. The transcripts aren't free. I think this one cost around $2500.
You must have missed the irony and analogy here. An electronic court transcript is just a bunch of bits on a hard drive somewhere, whose creation price has been (arbitrarily) set at $2500. Since the electronic document is now in the ether of the Internet, no one will ever again pay $2500 for it, they will pay $0. Does this remind you of anything?
Of course this isn't actually a good example, because the creation cost (the court reporter's time and equipment) is probably already paid for by the time the transcript is sold.
See http://slashdot.org/comments.pl?sid=1059301&cid=26084387 for the rest of my thoughts on IP, creation cost, and the value of someone's time, creativity, and ideas.
Get ready for mandatory RFID license plates, and all the privacy and security problems that come with them.
The funny thing about graphics is that it's entirely about throughput, not latency. You could have a pretty dog-slow "stream processor", but as long as you build enough of them on a chip and/or stick enough chips on your board, and have enough memory bandwidth to feed all of them, the performance of a single shader core pretty much doesn't matter.
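A crude way to model that: aggregate throughput is the smaller of what the cores can compute and what memory can feed them. This is a toy sketch with made-up units and numbers, not a description of any real GPU.

```python
def chip_throughput(cores, ops_per_core, mem_bandwidth_bytes, bytes_per_op):
    """Aggregate ops per unit time, capped by whichever limit is hit first."""
    compute_bound = cores * ops_per_core
    bandwidth_bound = mem_bandwidth_bytes / bytes_per_op
    return min(compute_bound, bandwidth_bound)


# Hypothetical chips sharing the same memory system:
many_slow = chip_throughput(cores=800, ops_per_core=1.0,
                            mem_bandwidth_bytes=10_000, bytes_per_op=4)
few_fast = chip_throughput(cores=100, ops_per_core=4.0,
                           mem_bandwidth_bytes=10_000, bytes_per_op=4)
print(many_slow, few_fast)
```

In this toy setup the chip with 800 slow cores beats the one with 100 cores that are each four times faster, as long as the bandwidth is there to feed them.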
Like you said, DAAMIT built a pretty middle-of-the-road RV770 (compared to the raw performance of the NV chip) but was able to tie two of them together in a reasonable thermal/power/bandwidth/cost board, and took the graphics performance crown. I wouldn't be surprised to see a 4870 X3 or X4 sometime soon as long as those other factors (thermal/power/bandwidth/cost) remain in line.
It's not my problem when creators of Imaginary Property can't find a working business model for themselves. They're perfectly welcome to go find something else to do with their lives.
That's the scary thing! What if all IP creators decided they'd be better off if they didn't try to survive on thinking up/designing/creating new non-physical goods because they can't get paid? What would happen if all the auxiliary software-as-a-service type business models fail? I sincerely hope some smart people keep finding business models to support the creation of IP, so that IP creators can continue doing their thing and not have to take jobs at McDonald's or selling shoes or something. Lots of innovation and ideas are at stake. Then it might become everyone's problem.
I realize this is almost certainly a non-plausible scenario, that taken to that extreme the free market would find a way to reward IP creators monetarily. But I'm trying to make a point by taking the limit of your utopian "information should be free" idea, and applying that same point to today's situation.
No matter what DRM or legal measures they try to go to, it's just not going to happen.
I hate DRM too, I wish it would die. But that's orthogonal to my point: that IP has value, and the IP creator deserves to be compensated appropriately for that value, somehow. I obviously don't know how given the zero-replication-cost problem.
"We want users of information to be able to freely modify and share it without restriction."
But what you're missing is that this is not possible in a capitalist society where zero-cost-of-replication "Intellectual Property" (IP) is considered to have value. Let me break it down for you.
Businesses amortize the development cost of a product in with the manufacturing costs, and they come up with a sale price which covers both. That way, if they sell enough copies of the product, they cover the initial cost of developing the product, as well as the production costs. Fortunately for them, the value of the product is both in the idea/design/creation, as well as in the physical item itself. So they can get away with charging a price *higher* than just the manufacturing costs, and people have to pay it because they can't manufacture it themselves.
With IP, the production cost is 0. The *entire cost*, and also the *entire value* of the IP, is in the idea/design/creation. The cost of replication is zero, and there is no value in any physical item because there is none.
Say I produce some non-tangible IP (code, music, hardware designs in Verilog, pictures, film, whatever) which in its final form is just some bits on my hard drive. My IP, in and of itself, has worth because other people enjoy it, or because they want it but cannot produce it on their own. Producing that IP took many months or even years of my time. I'd like to be able to sell my IP for whatever the economy deems its *creation* value so I can cover the cost of creation, and maybe make some profit like any good capitalist should be able to.
So how do I go about getting paid? In an "information wants to be free" society, I can only get paid for the first instance of that IP. Every other copy nets me $0, because once the information is out there, it's free. So, I better set my price for that first copy really high, because that one lump sum is all I'm ever going to get, and I have to cover months or years of development costs in a single shot. Since the price is so high, no one can afford my IP. The only price anyone can afford won't cover my development costs. Learning from this, I then decide to never do IP development again, since I am now broke and penniless. (Remember, we're still in a capitalist society). IP, and all the good things that come with it, die a quick death.
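Here's the amortization argument as plain arithmetic. The dollar figures are invented for illustration; the only thing that matters is how the break-even price moves with the number of copies I get paid for.

```python
def break_even_price(development_cost, unit_cost, copies_sold):
    """Price per copy that just covers development plus per-copy production."""
    if copies_sold < 1:
        raise ValueError("need at least one sale to amortize anything")
    return unit_cost + development_cost / copies_sold


# Hypothetical $100,000 development effort.
physical = break_even_price(100_000, unit_cost=5, copies_sold=10_000)
ip_many_sales = break_even_price(100_000, unit_cost=0, copies_sold=10_000)
ip_one_sale = break_even_price(100_000, unit_cost=0, copies_sold=1)
print(physical, ip_many_sales, ip_one_sale)
```

With 10,000 paid copies the IP breaks even at $10 each; with only the first copy paid for, that one buyer has to cover the whole $100,000.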
In a perfect communist society, I wouldn't mind putting a year into the creation of some IP that I would give away for free, because over the course of that year I'd have my needs met for "free" as well.
The old arguments are that you can't sell just the IP - you have to sell "service" or a "live show" or something else which isn't zero-cost-of-replication, and in doing that, recoup the initial creation cost. But whenever someone makes that argument, they imply that the IP itself has no dollar value because it can't be sold for anything more than $0. I strongly disagree with that implication.
Anyway, what we capitalists have come up with is this artificial restriction on IP, so that IP creators can amortize the cost of conception/design/creation over more than one sale. The only way to enforce this is to restrict what people do with the IP. It's up to the creator of the IP what those restrictions are. FSF might choose one way of restricting the IP (and ironically that restriction basically throws to the wind the whole point of the restriction), but that's not the only restriction possible.