I look at this from a rather different perspective - not as a quick hack to make up the difference in speeds of the processor and the external memory, but rather as an intermediate stage in the gradual integration of memory onto processors, those being the two things that have to communicate the most in your whole system.
It's possible that they're planning to do this with integrated DRAM (which can be much smaller than the usual SRAM) like that new graphics chip from Future Crew.
You might think that integrated DRAM wouldn't be fast enough for cache, but integration has many advantages that are not always obvious. While DRAM does have much worse latency than SRAM (which is normally used for cache), the external SRAM they are using now takes 8 external L2 cycles (that's 25 ns at 600 MHz) to transmit the data, excluding latency. Integration would cut the data transmission time to less than 5 ns. Combined with the additional savings from not having to drive the external address lines and not having to deal with the DRAM row registers so much, that might actually make internal DRAM faster than external SRAM.
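To put rough numbers on that argument, here is a small back-of-the-envelope sketch. The 8 cycles, 25 ns, and 5 ns figures come from the post; the 300 MHz bus clock (an L2 bus at half of a 600 MHz core) is my assumption, and the function names are made up for illustration.

```python
def transfer_ns(cycles: int, bus_mhz: float) -> float:
    """Time to move the data across the bus, in nanoseconds, excluding latency."""
    return cycles * 1000.0 / bus_mhz


def latency_budget_ns(external_transfer_ns: float, internal_transfer_ns: float) -> float:
    """Extra access latency integrated DRAM can afford and still tie external SRAM."""
    return external_transfer_ns - internal_transfer_ns


# Assumption: the external L2 bus runs at half of a 600 MHz core clock.
external = transfer_ns(8, 300.0)

# Figures quoted in the post: 25 ns external transfer, under 5 ns when integrated.
budget = latency_budget_ns(25.0, 5.0)

print(f"8 cycles at 300 MHz: {external:.1f} ns")
print(f"latency budget for integrated DRAM: {budget:.0f} ns")
```

Under those assumptions the transfer alone costs somewhere around 25 ns, so integrated DRAM could be roughly 20 ns slower to access than SRAM and still break even, before counting any savings on the address lines.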
(I've been wishing they'd integrate DRAM onto processors for a long time now, if you can't tell from my advocacy speech.)
They're still there, albeit in a different form.
In 32-bit code, near addresses are much more common than far addresses, but far addresses are possible. A 32-bit far address consists of a 16-bit segment and a 32-bit offset. The segment refers to a descriptor that gives a beginning (base) and an ending (limit); the offset is added to the base, and the page tables then map the result to physical RAM, which has 36 address bits these days.
Environments that have all segments more or less equivalent are called 32-bit flat mode (as opposed to 32-bit segmented mode).
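Here is a toy model of how a far address gets resolved, and why flat mode makes the segment part almost invisible. It is only a sketch: the `Segment` class, the selector values, and the table layout are hypothetical, and the paging step that maps the result to physical RAM is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    base: int   # beginning of the segment
    limit: int  # highest valid offset inside the segment


def linear_address(table: dict, selector: int, offset: int) -> int:
    """Resolve a (16-bit segment, 32-bit offset) far address to a 32-bit address."""
    seg = table[selector]
    if not 0 <= offset <= seg.limit:
        raise ValueError("offset outside segment")
    return (seg.base + offset) & 0xFFFFFFFF


# Flat mode: every segment starts at 0 and spans the whole 4 GB range.
FLAT = {0x08: Segment(0, 0xFFFFFFFF), 0x10: Segment(0, 0xFFFFFFFF)}

# Segmented mode: a segment covers only its own 64 KB window.
SEGMENTED = {0x08: Segment(0x00100000, 0xFFFF)}

flat_code = linear_address(FLAT, 0x08, 0x1234)
flat_data = linear_address(FLAT, 0x10, 0x1234)
seg_addr = linear_address(SEGMENTED, 0x08, 0x1234)

print(hex(flat_code), hex(flat_data), hex(seg_addr))
```

In the flat table both selectors give the same result for the same offset, which is why near addresses are all you normally need there. In the segmented table the segment actually moves and bounds the address.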