I personally use Slackware because most of it has to be configured and maintained by hand after the initial installation. There are scripts to simplify some of it, but I ignore many of them (netconfig will munch any tweaking I've done to the networking setup). I enjoy hand-tuning the configuration, and dislike distributions that have extensive autogeneration scripts that try to do all of the configuration for me (coughcoughCalderacough).
This is just personal taste. Slackware is actually very _unfriendly_ for casual users or users who don't want to have to be constantly tinkering with the guts of the system when installing or reconfiguring something. I just happen to like configuring things myself.
I'm told that Debian is also good for this, and doesn't _make_ you do this, but I haven't had a good reason to switch so far.
Here's a challenge - try doing that with Windows95/98/2000 !!!
You can do it, by much the same method (hand-picking packages to install). I've installed Windows 95 on a 486 DX-33 with a 120 meg HD and 8 megs of RAM. It _crawled_, but still ran.
I can get Windows 98 down to 200 megs without too much effort. Not sure what the minimum size is.
The biggest difference that I know of is that Windows performance suffers terribly with less RAM, less disk space, and a slower machine, while Linux's doesn't (unless you're doing compiles or using processor-intensive applications or have a huge desktop).
I haven't used BSD extensively, but I suspect that it behaves similarly.
Windows 2000 is a renamed NT 5.0. Different beast from the 9x series, and more resource-hungry. I have no idea what the minimum practical installation size is for it.
What exactly is the difference between GLIBC and the current standard (I forget its name)? How will this benefit all of us Slack-users?
If I understand correctly, the main functional difference is a different format for binaries. This means that you can't link object files produced with glibc with object files produced with libc5. This doesn't matter if you're compiling everything from source code (because your compiler will give you the same binary format that the rest of your system uses), but it makes it impossible to link in libraries that you receive just as binaries (object files) that are in the wrong format.
Disclaimer: I haven't messed with the compilers in detail, so I may have missed several very large points:).
NATs are the bane of net existence; they break the end-to-end model that the Internet is based on. IP security won't work with NATs, and neither will many apps these days (lots of games don't work through a NAT).
I happen to like NATs - they are a good way of making sure that the network inside my workplace or home isn't visible to the outside world. As far as the ISP is concerned, my house consists of the firewall machine, and my workplace consists of a firewall and a mail server, which IMO is as it should be.
I readily agree that using NATs as a means of packing more machines into the address space is a Bad Idea - I'd like to have the potential for more than a few billion world-visible boxes. They're also a bad idea on an internal network that has to be able to see all parts of itself from all parts of itself, and for cell phone networks. However, I don't see why they're intrinsically evil.
I haven't had a problem running games behind a masquerading firewall. Tribes 'net play works fine. Quake 'net play works fine.
Now my question is, are moderators on /. decided by posts alone? If so, the guy who's always "I posted first! Phhht" will be in quite a good position!
According to the moderator guidelines, moderators are chosen at random from a fairly wide cross-section of the Slashdot population. Anyone who posts a bit and reads a fair bit is eligible, if I understand correctly, which means at least a third of the people visiting the site (a guess - don't flame me for this).
The only people who can't become moderators, again if I understand correctly, are ones who either never post, never read, or constantly hit "reload". Check out the guidelines themselves for more detailed information.
The main problem with IPv4 that IPv6 is trying to solve is a lack of address space. By using IP masquerading, that problem can be alleviated indefinitely, at the cost of increasing the lag time.
Not quite. If we only want to have an arbitrarily large number of user machines that aren't serving anything to the world at large, that works, but if we want to have an arbitrarily large number of world-visible servers, it breaks down.
Also, you only have 65536 ports on your masquerading firewall. If you put that at, say, the top of a class A private subnet and more than 65536 machines try to access the world at a time, congestion becomes a problem.
Though I'll admit that congestion won't be *much* of a problem under real conditions, for the next little while (Fermi 100 trillion users maximum).
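To put a number on the port limit, here's a rough sketch of a masquerading table in Python. The names (`MasqTable`, `connections_until_exhausted`) and the usable port range are made up for illustration, and it assumes one external address with one external port per connection. Real masquerading code can tell connections apart by more than the port, so treat this as the pessimistic case.

```python
class MasqTable:
    """Toy port-translation table for one external IP address (hypothetical)."""

    def __init__(self, first_port=1024, last_port=65535):
        # Ports are handed out lowest-first; the range is an assumption.
        self.free_ports = list(range(last_port, first_port - 1, -1))
        self.mappings = {}  # (internal_ip, internal_port) -> external_port

    def open_connection(self, internal_ip, internal_port):
        """Return the external port for this internal endpoint, allocating if needed."""
        key = (internal_ip, internal_port)
        if key in self.mappings:
            return self.mappings[key]
        if not self.free_ports:
            raise RuntimeError("out of external ports")
        port = self.free_ports.pop()
        self.mappings[key] = port
        return port

    def close_connection(self, internal_ip, internal_port):
        """Release the external port so another connection can reuse it."""
        port = self.mappings.pop((internal_ip, internal_port))
        self.free_ports.append(port)


def connections_until_exhausted(table):
    """Open one connection per simulated internal host until the table is full."""
    count = 0
    try:
        while True:
            table.open_connection(f"host-{count}", 40000)
            count += 1
    except RuntimeError:
        return count
```

Calling `connections_until_exhausted(MasqTable())` counts how many simultaneous connections fit before the next one is refused. The ceiling is on simultaneous connections, not on machines behind the firewall, which is why it only bites when that many are active at once.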
A plan to squeeze a few more IPs out of IPv4 is simply a quick and dirty solution that, given the exponential growth of the Internet, would only last about a few years (I have no idea how he thinks this will last another 100 years - I assume his math is as bad as his grammar.)
He actually did propose extending the number of bits in IP addresses. The main point of the new subnetting scheme, AFAICT, is to make it easier to add these bits while keeping older addresses valid. However, his new scheme isn't necessary for that (click on "user info" to see my previous response).
If someone finds a kernel of truth or reason in this article, please speak up. But don't go in there without your brain firmly strapped in.
I've made it through most of the article. AFAICT, he postulates adding bits as a method of getting around the number-of-addresses problem, and proposes a different way of organizing subnets.
The *one* (1) saving grace that his article has is that his proposed organizational scheme makes it relatively painless to increase the number of bits down the road, without having to reassign addresses. OTOH, it's easy enough to do that with the present system too (treat your 4-byte IP as the _least_significant_ part of a larger address).
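As a sketch of what I mean by treating the 4-byte IP as the least significant part of a larger address, here's a bit of Python using the standard `ipaddress` module. The function names `widen` and `narrow`, and the 128-bit total width, are my own assumptions for illustration and aren't taken from the article.

```python
import ipaddress


def widen(ipv4_text, high_bits=0, total_bits=128):
    """Place a 32-bit IPv4 address in the low 32 bits of a wider address.

    high_bits holds whatever new prefix bits get added later; 0 means
    "an old-style address that hasn't been renumbered".
    """
    low = int(ipaddress.IPv4Address(ipv4_text))
    if high_bits < 0 or high_bits >= 1 << (total_bits - 32):
        raise ValueError("high_bits does not fit in the wider address")
    return (high_bits << 32) | low


def narrow(wide):
    """Recover the original IPv4 address from the low 32 bits."""
    return str(ipaddress.IPv4Address(wide & 0xFFFFFFFF))
```

With `high_bits=0` the wide address has the same numeric value as the old one, so existing addresses stay valid and nobody has to be reassigned. New address space is opened up just by using nonzero high bits.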
The article was poorly organized and incredibly obfuscated. I really do hope that this person isn't really a member of any decision-making organization. I could give a summary containing all of the useful information on it in a tenth the space, and more clearly.
In fact, I'm seriously considering doing this just so that nobody has to wade through this monstrosity in its original form.
This is "solid state" the way hard drives are. · on 2.3TB drives for $50 · Score: 2
Now, the solid state bit is an interesting spin, but think about it: 1. How much faster than 10K RPM can we spin drives?
Actually, according to the article they'll still need actuators to move the read/write head over the material... which is starting to sound suspiciously like an ordinary hard drive (actuators move on one axis and the disk medium spins on the other). Solid state starts looking like a bit of a misnomer here.
AFAICT from the article this is just a device working much like a hard drive with multiple layers per platter that uses a magneto-optical system to do layer selection (much as DVDs can focus on different layers). Where they get their size, cost, and capacity numbers from I'm not sure.
How comparable is the hardware in a Dreamcast to a new PC with a good 3D card?
Comparable. The Dreamcast uses the PowerVR 2 graphics chipset, which is also available as a PC card (go to Sharkey Extreme's archives for the article). The PVR2 card benchmarked at about two thirds the speed of a high-end consumer card, which suggests that the Dreamcast is slightly worse than a PC, but a friend who works in the console gaming industry insists that optimizations in the Dreamcast make up for that.
The same friend insists that the Dreamcast has more than enough processing power to handle all geometry for the card, and I'm inclined to agree. For general-purpose, the SH4 isn't that great, and for double-precision floating-point, it's pretty horrid, but it works amazingly well for single-precision floating-point and vector/matrix computations, due to a specialized instruction set and specialized floating-point hardware heavily optimized for that specific purpose. You can find more information in the spec sheets for it, which are linked from the SH4/BSD article referred to above.
So, I can believe that the Dreamcast would make as good a game machine as a present high-end PC. The main problem is that the PCs will be twice as powerful by Christmas (when 0.18 micron technology has matured), while the Dreamcast will be waiting a while for a successor.
As with the original Playstation, what will make or break it will be the quality of its games, though. The Playstation renders like a first-generation 3D card, but it's still fun.
I suspect that the H64's instruction set is similar to Motorola's 68000 only better.
Not quite. The 68k series was still a bit CISCian, while IIRC the SH4 had a smaller, more RISC-like instruction set with a few specialty FP instructions added. [Before this starts another Holy War, let me point out that both RISC and CISC can be used efficiently; CISC is just more difficult to optimize hardware for.] The page referenced in the article contains a link to the SH4 reference manuals; among other things, these contain the instruction set.
Call me crazy, but won't this actually be slower? I mean, putting that many chips on the same bus... even if you do sync in the manner they're talking about, that doesn't help much when you're working with large volumes of data
This is correct. Intel seems to have made a half-hearted switch to a crossbar bus system, with the result that their bus can still be easily saturated. Programs that aren't memory-intensive will still benefit from the SMP. Programs that are memory-intensive will saturate the bus and leave many of the processors idle.
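A back-of-the-envelope model of the saturation effect, in Python. This is my own simplification and not a description of Intel's actual bus: it assumes every processor needs the same fixed share of the bus to run at full speed, and that the bus is divided evenly once it's oversubscribed. The function name `busy_fraction` is hypothetical.

```python
def busy_fraction(num_cpus, bus_share_per_cpu):
    """Fraction of time each CPU spends working rather than waiting on a shared bus.

    bus_share_per_cpu is the fraction of total bus bandwidth one CPU
    needs in order to run at full speed (an assumed workload parameter).
    """
    demand = num_cpus * bus_share_per_cpu
    if demand <= 1.0:
        return 1.0           # bus keeps up; nobody waits
    return 1.0 / demand      # bus saturated; everyone gets an equal slice
```

Under these assumptions, four processors that each need a tenth of the bus all run flat out, while eight processors that each need half of it spend most of their time waiting.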
It'll be good for Quake III, but I suspect anyone in the know will probably stick with a RISC design.
Memory performance has nothing to do with whether the processor is RISC or CISC. Memory bus design is the only thing that matters here. Point-to-point implemented on a crossbar bus beats a shared bus no matter what chips are used.
Getting back to the Holy War, most modern implementations of CISC are almost as efficient as RISC (look at the guts of the K7 design for an example). You wind up with an extra stage in your pipeline for decoding, and that's about it. RISC is generally favoured because there's no real reason to use CISC any more (higher code density isn't as vital), and removing CISC support simplifies chip design and optimization.
The main problem with Intel chips is that they've been repeatedly extending a design that wasn't designed to be extensible, in a rather kludgy manner. They're still stuck with it, because if they switch to a completely new architecture they lose their installed base of customers. This is why the Merced still supports x86 modes.
Re:Variants would make nice gateway/firewall/route · on Linux on a SIMM · Score: 2
I have my suspicions about an 18MHz processor being able to handle a 100BaseT connection.
I realize that; that's one of the many things that would make this a variant:).
I'd like to see someone integrate an ARM core and a couple of ethernet controller cores on a die for use in a module like this.
Variants would make nice gateway/firewall/routers. · on Linux on a SIMM · Score: 2
I just finished setting up an old Pentium as a firewall. It would be nice to be able to use something that doesn't take up as much closet space:).
If they make a version of this that can handle two or more network interfaces at 100 Mbit, I'm sold.
Also, does anyone have any info on actual processor performance comparisons between a PPC and a Pentium/K7^H^HAthlon? I know there won't be Athlon data yet, but I figure the more exposure they get the better.
Check www.spec.org. It provides standard benchmarking code, and collects benchmarks for everything from PCs up to Big Iron. It will certainly have comparisons between PIIIs and the PPC-750, and should have Athlon data as soon as AMD gets around to compiling the benchmark software.
I want a Space Cadet keyboard. This would be a wonderful Neat Item to have lying around, and should I ever get around to building an adapter for it, a neat toy as well.
Does anyone know where keyboards of this type can be obtained, or even where I could find a picture of one?
'to beg the question' is not used in this way... begging the question means that your conclusion is stated in your premises...
That's true in logic, but not in colloquial speech. Treating both definitions as being valid leads to the least frustration. I doubt the colloquial use will disappear any time soon.
If it's really necessary to remove keys from the keyboard, ditch the main numeric keys or use them for something else; the numeric keypad is far more useful, especially if you do a lot of numeric data entry
While I agree about the numeric keypad being very useful, IMO the main numeric keys are too. They let me type occasional numbers without having to move my hands. Doing something like coding while having to move my hand to another area of the keyboard to enter numbers would slow me down a lot.
RISC86 instructions share some characteristics with Pentium microinstructions: They're quite long (the Nx586's eight-chip predecessor used 104-bit RISC86 instructions) and carry vital information of processor states that normally wouldn't be known to a true external RISC instruction. But there's still an important difference: Unlike microcode, NexGen's RISC86 can theoretically support its own assemblers, compilers, and application software. The Nx586 bus can bypass the 80x86 decoder and feed RISC86 instructions directly into the execution stages of the pipeline at full bus speeds.
Ok... Their marketing department calls these RISC instructions, and I call them microcode bit vectors. The article you cited points out itself the differences between this and conventional RISC instructions - RISC instructions don't contain as much internal control state information. They also are usually one machine word in size (32 bits on most PCs, 64 bits on most workstations). These bit vectors are much longer, which leads into the main problem with this approach, which I mentioned in other responses - very low code density.
By writing in microcode, you skip any decoding latency that might exist, but you end up having to transfer several times more information than you otherwise would. This means it takes several machine words to specify each instruction, and several bus cycles to load each instruction (full bus speed != one instruction per bus clock). You could put in a wider memory bus, but the same problem applies - your instruction stream bandwidth just went down a lot. As memory bandwidth is one of the main limiting factors for system performance, this will hurt, a lot. Additionally, the instruction cache can suddenly hold far fewer instructions - you either need a bigger cache to compensate, or suffer through many cache misses.
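To put rough numbers on the code density problem, here's a small Python sketch. The 104-bit figure is the one from the quoted article; the 32-bit bus width and 16 KB instruction cache are values I've assumed purely for illustration, and the function names are mine. It also assumes instructions aren't packed across bus transfers.

```python
import math


def bus_cycles_per_instruction(instr_bits, bus_bits):
    """Bus transfers needed to fetch one instruction (no packing across transfers)."""
    return math.ceil(instr_bits / bus_bits)


def instructions_in_cache(cache_bytes, instr_bits):
    """How many whole instructions fit in an instruction cache of this size."""
    return (cache_bytes * 8) // instr_bits


BUS_BITS = 32            # assumed bus width
CACHE_BYTES = 16 * 1024  # assumed instruction cache size

for name, bits in (("32-bit RISC instruction", 32), ("104-bit control vector", 104)):
    cycles = bus_cycles_per_instruction(bits, BUS_BITS)
    capacity = instructions_in_cache(CACHE_BYTES, bits)
    print(f"{name}: {cycles} bus cycle(s) per fetch, {capacity} instructions in cache")
```

Widening the bus changes the cycle count but not the ratio: each long instruction still costs several times the bandwidth and cache space of a one-word instruction.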
I am very surprised that NexGen actually built a chip using this approach. It's an interesting idea, but as I state in other messages, conventional RISC is almost as easy to decode and doesn't suffer from the code density problems described above.
The question is, does the K7 bus do what the Nx586 bus did and what the Socket 7 motherboard bus did not -- can it bypass the 80x86 decoder?
I strongly doubt it. No mention of this was made in any of AMD's releases, there is no pressing market demand for it, and it would probably _worsen_ performance, as described above. What they _could_ do is allow an additional RISC instruction set, which would accomplish much the same thing, but it is open to question whether the (very substantial) additional design effort would be worth it for their target market.
Intel is doing something similar to this with the Merced, with x86 and VLIW modes, but that design has its own problems.
In a simple RISC pipeline, there is really no microcode per se involved. The decode from instruction to control signals is through much more straightforward circuit-based transformations than a table lookup. I wouldn't really equate a mux based on bits of the instruction to a full-fledged ROM indexing as with microcode.
I am using "microcode" to refer to the final signal patterns on the control lines during each clock, which may not be the standard usage. Re. MUX vs. table lookup, both take a significant amount of time, which is what I am getting at. By making your instruction opcode a huge bit vector specifying the state of all control lines for the current clock and possibly the next few clocks, you could eliminate the decoding, but at the expense of a silly amount of code bloat, which IMO is impractical (among other things you'd need several memory reads to read each opcode).
I readily agree that complicated CISC translation takes longer than RISC translation, but IMO going any farther than RISC gets impractical very quickly.
There is a big difference between simply being able to run x86 programs and being able to run them well. Excuse my language, but if you buy a Merced to run x86 software YOU'RE A FUCKING MORON!
It's there to provide an Intel-brand x86-compatible chip after Intel switches to IA64. Current users who want to upgrade and still want to buy Intel will buy it, which gets them "hooked" on the new architecture. If Intel broke compatibility, they would lose their market. x86 clones would be chosen by the masses, and people who wanted a high-performing non-x86 system would choose other processors.
If, on the other hand, people can upgrade their x86 "workstations" to Merced "workstations", they can still run their old software, while software written with the Merced in mind gets a speed boost.
Think about it: the entire point of the Merced is to place the burden of optimizing a processor's operation on the compiler rather than the processor itself. Merced's performance is totally reliant on the quality of the compiler.
For the VLIW instructions, yes. However, it wouldn't cost them much space to put in x86 optimizations. Heck, most of them are already present in the Merced hardware - pipelining, superscalar execution, and branch prediction are still there. x86 emulation would just require a second instruction decoder and a bit of extra glue logic.
The reason that the compiler is a bugger is that it's difficult to use _VLIW_ properly outside of hand-coded assembly.
IMO, the Merced will either take a 1.5x speed hit over straight x86 clones, or will require 1.5x as much silicon. I'm betting that Intel will throw extra silicon at it and wind up with a chip that runs x86 code at reasonable speed but that costs twice as much as a Pentium-III to fabricate.
Did I mention that it will also cost a freakin fortune?
I strongly suspect that they'll release a cheaper low-end version with smaller cache, just as with the Celeron/Pentium/Xeon spectrum. They need something to offer in the x86 regime, and they need consumers to upgrade to something Merced-compatible if they want to keep their market share while moving on with the Merced. If they don't do this immediately, they'll do it reasonably soon afterwards.
The NexGen chip (which is what the K6+ is based on) had the ability to do this.
To do what, specifically?
To switch between two separate instruction sets, I'll believe. You need more decoding logic, but it can be done fairly easily.
To allow programs to be written in microcode, I doubt - microcode has a very low code density, which leads to many, many problems. RISC is just as easy to optimize, so you don't gain much for your pains.
But everything changes when they break compatibility. If Merced isn't x86, then both the locked-in and the scale advantages evaporate.
My understanding is that the Merced is x86-compatible. They added a new processing mode in which the Merced's new instruction set and new register structure are accessible, I think.
I do know that the K7, as with all the other K series chips from AMD, is a RISC based processor that uses one or more instruction translators. I'm not sure if it's possible to write code that bypasses the instruction translators, but if it *is* possible, then yes...
I'm going into my fourth year of Computer Engineering, focusing on chip design. The short answer is that it isn't possible to bypass the instruction translators. They aren't translators per se, but something closer to macro expanders.
A CISC instruction (or to a lesser extent, even a RISC instruction) is a concise way of saying that you want the chip to do something fairly complex. For both RISC and CISC processors, these instructions have to be expanded out into a series of truly elementary hardware operations for the chip to perform. RISC instructions tend to be a lot simpler, and are a lot closer to the final "microcode" that controls the various parts of the chip, but they still need _some_ decoding to be processed.
The statement that the K7 (or Pentium-whatever) has a "RISC core" is a bit of a misnomer. What they actually do is allow different tasks required by a CISC instruction to be executed independently. This could be thought of as breaking it into a series of equivalent RISC instructions, but no such instructions actually exist (though you could argue that "micro-ops" and "macro-ops" are close).
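Here's a toy illustration of the "macro expander" idea in Python. The instruction format, opcode names, and micro-op names are all invented for this sketch and don't correspond to any real chip's internals.

```python
def expand(instruction):
    """Expand one toy instruction into a list of elementary operations.

    instruction is an (opcode, dest, src) tuple; everything here is made up
    for illustration.
    """
    op, dest, src = instruction
    if op == "ADD_MEM":
        # CISC-style: memory[dest] += register src, done as three separate steps
        # that the chip is free to schedule independently.
        return [
            ("LOAD", "tmp", dest),    # tmp <- memory[dest]
            ("ADD", "tmp", src),      # tmp <- tmp + src
            ("STORE", dest, "tmp"),   # memory[dest] <- tmp
        ]
    if op == "ADD":
        # RISC-style: register dest += register src. The expansion is trivial,
        # but the instruction still has to pass through the decoder.
        return [("ADD", dest, src)]
    raise ValueError(f"unknown opcode: {op}")
```

The simple instruction maps to a single operation and the complex one to several, but in both cases what the programmer writes goes through `expand` first. There's no way to hand the chip the output list directly.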
Short answer, as above: You can't bypass translation and write in native RISC, because there isn't really a native RISC to write in and analogous translation would still be required no matter what sets of opcodes you were using.
Hopefully this was interesting for anyone that read this far:).
I agree that it gets redundant, but this time it might actually be appropriate. A Dreamcast makes a relatively cheap and relatively powerful node.
OTOH, it was correctly pointed out that most people don't have any _use_ for a cluster, but it would still be a neat toy if you have the budget.
Not quite. The 68k series was still a bit CISCian, while IIRC the SH4 had a smaller, more RISC-like instruction set with a few specialty FP instructions added. [Before this starts another Holy War, let me point out that both RISC and CISC can be used efficiently; CISC is just more difficult to optimize hardware for.] The page referenced in the article contains a link to the SH4 reference manuals; among other things, these contain the instruction set.
This is correct. Intel seems to have made a half-hearted switch to a crossbar bus system, with the result that their bus can still be easily saturated. Programs that aren't memory-intensive will still benefit from the SMP. Programs that are memory-intensive will saturate the bus and leave many of the processors idle.
It'll be good for Quake III, but I suspect anyone in the know will probably stick with a RISC design.
Memory performance has nothing to do with whether the processor is RISC or CISC. Memory bus design is the only thing that matters here. Point-to-point implemented on a crossbar bus beats a shared bus no matter what chips are used.
Getting back to the Holy War, most modern implementations of CISC are almost as efficient as RISC (look at the guts of the K7 design for an example). You wind up with an extra stage in your pipeline for decoding, and that's about it. RISC is generally favoured because there's no real reason to use CISC any more (higher code density isn't as vital), and removing CISC support simplifies chip design and optimization.
The main problem with Intel chips is that they've been repeatedly extending a design that wasn't designed to be extensible, in a rather kludgy manner. They're still stuck with it, because if they switch to a completely new architecture they lose their installed base of customers. This is why the Merced still supports x86 modes.
I realize that; that's one of the many things that would make this a variant
I'd like to see someone integrate an ARM core and a couple of ethernet controller cores on a die for use in a module like this.
If they make a version of this that can handle two or more network interfaces at 100 Mbit, I'm sold.
Check www.spec.org. It provides standard benchmarking code, and collects benchmarks for everything from PCs up to Big Iron. It will certainly have comparisons between PIIIs and the PPC-750, and should have Athlon data as soon as AMD gets around to compiling the benchmark software.
Does anyone know where keyboards of this type can be obtained, or even where I could find a picture of one?
That's true in logic, but not in colloquial speech. Treating both definitions as being valid leads to the least frustration. I doubt the colloquial use will disappear any time soon.
While I agree about the numeric keypad being very useful, IMO the main numeric keys are too. They let me type occasional numbers without having to move my hands. Doing something like coding while having to move my hand to another area of the keyboard to enter numbers would slow me down a lot.
Ok... Their marketing department calls these RISC instructions, and I call them microcode bit vectors. The article you cited itself points out the differences between this and conventional RISC instructions - RISC instructions don't contain as much internal control state information. They are also usually one machine word in size (32 bits on most PCs, 64 bits on most workstations). These bit vectors are much longer, which leads into the main problem with this approach, which I mentioned in other responses - very low code density.
By writing in microcode, you skip any decoding latency that might exist, but you end up having to transfer several times more information than you otherwise would. This means it takes several machine words to specify each instruction, and several bus cycles to load each instruction (full bus speed != one instruction per bus clock). You could put in a wider memory bus, but the same problem applies - your instruction stream bandwidth just went down a lot. As memory bandwidth is one of the main limiting factors for system performance, this will hurt, a lot. Additionally, the instruction cache can suddenly hold far fewer instructions - you either need a bigger cache to compensate, or suffer through many cache misses.
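To put rough numbers on the code density argument, here is a back-of-the-envelope sketch in Python. Every figure in it (a 64-bit memory bus, 32-bit RISC instructions, 256-bit control vectors, a 16 KB instruction cache) is an assumption picked for illustration, not a measurement of the Nx586 or any other real chip.

```python
# Back-of-the-envelope code density comparison.
# All sizes below are illustrative assumptions, not figures for any real chip.

BUS_WIDTH_BITS = 64          # assumed width of the memory bus
ICACHE_BYTES = 16 * 1024     # assumed instruction cache size


def fetch_stats(instr_bits):
    """Return (instructions fetched per bus cycle, instructions held in cache)."""
    instrs_per_cycle = BUS_WIDTH_BITS / instr_bits
    instrs_in_cache = (ICACHE_BYTES * 8) // instr_bits
    return instrs_per_cycle, instrs_in_cache


risc = fetch_stats(32)        # one 32-bit machine word per instruction
microcode = fetch_stats(256)  # hypothetical 256-bit control vector

print("RISC:      %.2f instructions per bus cycle, %d instructions cached" % risc)
print("Microcode: %.2f instructions per bus cycle, %d instructions cached" % microcode)
```

Under these assumed sizes the instruction stream bandwidth and the effective cache capacity both drop by a factor of eight. The exact ratio depends entirely on how long the control vectors really are.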
I am very surprised that NextGen actually built a chip using this approach. It's an interesting idea, but as I state in other messages, conventional RISC is almost as easy to decode and doesn't suffer from the code density problems described above.
The question is, does the K7 bus do what the Nx586 bus did and what the Socket 7 motherboard bus did not -- can it bypass the 80x86 decoder?
I strongly doubt it. No mention of this was made in any of AMD's releases, there is no pressing market demand for it, and it would probably _worsen_ performance, as described above. What they _could_ do is allow an additional RISC instruction set, which would accomplish much the same thing, but it is open to question whether the (very substantial) additional design effort would be worth it for their target market.
Intel is doing something similar to this with the Merced, with x86 and VLIW modes, but that design has its own problems.
I am using "microcode" to refer to the final signal patterns on the control lines during each clock, which may not be the standard usage. Re. MUX vs. table lookup, both take a significant amount of time, which is what I am getting at. By making your instruction opcode a huge bit vector specifying the state of all control lines for the current clock and possibly the next few clocks, you could eliminate the decoding, but at the expense of a silly amount of code bloat, which IMO is impractical (among other things you'd need several memory reads to read each opcode).
I readily agree that complicated CISC translation takes longer than RISC translation, but IMO going any farther than RISC gets impractical very quickly.
It's there to provide an Intel-brand x86-compatible chip after Intel switches to IA64. Current users who want to upgrade and still want to buy Intel will buy it, which gets them "hooked" on the new architecture. If Intel broke compatibility, they would lose their market. x86 clones would be chosen by the masses, and people who wanted a high-performing non-x86 system would choose other processors.
If, on the other hand, people can upgrade their x86 "workstations" to Merced "workstations", they can still run their old software, while software written with the Merced in mind gets a speed boost.
Think about it: the entire point of the Merced is to place the burden of optimizing a processor's operation on the compiler rather than the processor itself. Merced's performance is totally reliant on the quality of the compiler.
For the VLIW instructions, yes. However, it wouldn't cost them much space to put in x86 optimizations. Heck, most of them are already present in the Merced hardware - pipelining, superscalar execution, and branch prediction are still there. x86 emulation would just require a second instruction decoder and a bit of extra glue logic.
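As a toy model of the "second decoder plus glue logic" idea, the Python sketch below feeds one shared back end from two decoders selected by a mode flag. The instruction names and internal operations are invented for illustration. This is not how the Merced is actually built, just the general shape of the idea.

```python
# Toy model of one execution back end fed by two instruction decoders.
# The instruction names and internal operations are invented for illustration.

def decode_x86_like(instr):
    """Decoder for a hypothetical legacy (x86-style) instruction set."""
    table = {
        "ADD_MEM": ["load", "add", "store"],  # memory operand expands to three steps
        "INC": ["add"],
    }
    return table[instr]


def decode_native(instr):
    """Decoder for a hypothetical native instruction set."""
    table = {
        "ld": ["load"],
        "add": ["add"],
        "st": ["store"],
    }
    return table[instr]


def run(program, mode):
    """Glue logic: a mode flag picks the decoder; the back end is shared."""
    decoder = decode_x86_like if mode == "legacy" else decode_native
    internal_ops = []
    for instr in program:
        internal_ops.extend(decoder(instr))
    return internal_ops


legacy_ops = run(["ADD_MEM"], mode="legacy")
native_ops = run(["ld", "add", "st"], mode="native")
print(legacy_ops == native_ops)
```

Both programs end up as the same stream of internal operations, so everything downstream of the decoders (pipelining, branch prediction and so on) can be shared between the two modes.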
The reason that the compiler is a bugger is that it's difficult to use _VLIW_ properly outside of hand-coded assembly.
IMO, the Merced will either take a 1.5x speed hit over straight x86 clones, or will require 1.5x as much silicon. I'm betting that Intel will throw extra silicon at it and wind up with a chip that runs x86 code at reasonable speed but that costs twice as much as a Pentium-III to fabricate.
Did I mention that it will also cost a freakin fortune?
I strongly suspect that they'll release a cheaper low-end version with smaller cache, just as with the Celeron/Pentium/Xeon spectrum. They need something to offer in the x86 regime, and they need consumers to upgrade to something Merced-compatible if they want to keep their market share while moving on with the Merced. If they don't do this immediately, they'll do it reasonably soon afterwards.
To do what, specifically?
To switch between two separate instruction sets, I'll believe. You need more decoding logic, but it can be done fairly easily.
To allow programs to be written in microcode, I doubt - microcode has a very low code density, which leads to many, many problems. RISC is just as easy to optimize, so you don't gain much for your pains.
My understanding is that the Merced is x86-compatible. They added a new processing mode in which the Merced's new instruction set and new register structure are accessible, I think.
I'm going into my fourth year of Computer Engineering, focusing on chip design. The short answer is that it isn't possible to bypass the instruction translators. They aren't translators per se, but something closer to macro expanders.
A CISC instruction (or to a lesser extent, even a RISC instruction) is a concise way of saying that you want the chip to do something fairly complex. For both RISC and CISC processors, these instructions have to be expanded out into a series of truly elementary hardware operations for the chip to perform. RISC instructions tend to be a lot simpler, and are a lot closer to the final "microcode" that controls the various parts of the chip, but they still need _some_ decoding to be processed.
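To illustrate why even a RISC instruction needs _some_ decoding, here is a minimal Python sketch. The 32-bit layout, the opcodes and the control signal names are all made up for the example and do not correspond to any real instruction set.

```python
# Toy illustration: even a simple RISC-style word needs some decoding before
# it can drive the hardware. The encoding and control signals are made up.

# Hypothetical 32-bit format: [31:26] opcode, [25:21] rd, [20:16] rs1, [15:11] rs2
CONTROL_TABLE = {
    0x01: {"alu_op": "add", "reg_write": True, "mem_read": False},
    0x02: {"alu_op": "pass", "reg_write": True, "mem_read": True},
}


def decode(word):
    """Split an instruction word into fields and look up its control signals."""
    opcode = (word >> 26) & 0x3F
    fields = {
        "rd": (word >> 21) & 0x1F,
        "rs1": (word >> 16) & 0x1F,
        "rs2": (word >> 11) & 0x1F,
    }
    # The opcode is shorthand; the table expands it into per-unit control lines.
    signals = dict(CONTROL_TABLE[opcode])
    signals.update(fields)
    return signals


def encode(opcode, rd, rs1, rs2):
    """Pack fields into a word using the same hypothetical layout."""
    return (opcode << 26) | (rd << 21) | (rs1 << 16) | (rs2 << 11)


add_word = encode(0x01, rd=3, rs1=1, rs2=2)
print(decode(add_word))
```

The six-bit opcode stands in for a whole set of control lines. A CISC decoder does the same kind of expansion, just with more steps per instruction.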
The statement that the K7 (or Pentium-whatever) has a "RISC core" is a bit of a misnomer. What they actually do is allow different tasks required by a CISC instruction to be executed independently. This could be thought of as breaking it into a series of equivalent RISC instructions, but no such instructions actually exist (though you could argue that "micro-ops" and "macro-ops" are close).
Short answer, as above: You can't bypass translation and write in native RISC, because there isn't really a native RISC to write in and analogous translation would still be required no matter what sets of opcodes you were using.
Hopefully this was interesting for anyone who read this far.