AMD QuadFX Platform and FX-70 Series Launched
MojoKid writes, "AMD officially launched their QuadFX platform and FX-70 series processors today, previously known as 4x4. FX-70 series processors will be sold in matched pairs at speeds of 2.6, 2.8, and 3GHz. These chips are currently supported by NVIDIA nForce 680a chipset-based, dual-socket motherboards, namely the Asus L1N64-SLI WS, which is currently the only model available. HotHardware took a fully configured AMD QuadFX system out for a spin and though performance was impressive, the fastest 3GHz quad-core FX-74 configuration couldn't catch Intel's Core 2 Extreme QX6700 quad-core chip in any of the benchmarks. The platform does show promise for the future, however, especially with AMD's Torenzza open socket initiative." And mikemuch writes that the QuadFX "not only fails to take the performance crown from Intel's quad-core Core 2 Extreme QX6700, but in the process burns almost twice as much electricity and runs significantly hotter in the process. ExtremeTech has a plethora of application and synthetic benchmarks on QuadFX, including gaming and media-encoding tests."
AMD unveils processors for 'power' computer users
Not 8 cores 80 cores
thank God the internet isn't a human right.
The QX6700 has the same TDP (125-130W) per socket as the FX-70 through FX-74, so I assume they run at about the same on-chip temperature. Overall system temperature might be higher for the FX-based quad-core system since it uses twice as many sockets, but that's a matter of case design. If the case can remove the heat from the heatsinks effectively, I would imagine both systems would run at the same temperature. This is of course ignoring the fact that AMD's TDP is worst case and Intel's is average case.
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
has another review that reaffirms the same findings. Performance is not beating Intel yet, and the AMD/ASUS solution is very expensive. I feel the only market here is those who cannot wait and have money to burn.
http://www.hothardware.com/printarticle.aspx?articleid=911
So... L2 cache speed. When I look at Memtest86+ numbers, I see:
~19700 MB/s for L1
~4700 MB/s for L2
~3000 MB/s for main memory
This is on an Athlon64 X2 4600+ w/ low-speed DDR2 RAM (4 sticks of 1GB).
I'm guessing that L2 gains come from it responding to a memory request faster (fewer clock cycles) rather than from the bandwidth? Because the L2 bandwidth of 4.7GB/s doesn't seem that exciting anymore once main RAM can feed the CPU at 3GB/s.
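To put rough numbers on that guess, here is a minimal Python sketch. The bandwidth figures are the Memtest86+ readings quoted above; the latency values in the example call (12 cycles for L2, 200 cycles for main memory) are illustrative placeholders I made up, not measurements from this machine, and the helper names are hypothetical.

```python
# Bandwidth readings quoted above (MB/s), from Memtest86+.
BANDWIDTH_MB_S = {"L1": 19700, "L2": 4700, "main": 3000}


def bandwidth_ratio(faster, slower):
    """How many times more bandwidth one level has than another."""
    return BANDWIDTH_MB_S[faster] / BANDWIDTH_MB_S[slower]


def average_access_cycles(l2_hit_cycles, memory_cycles, l2_hit_rate):
    """Average cycles per request that misses L1, given an L2 hit rate.

    A simple weighted average: hits cost l2_hit_cycles, misses cost
    memory_cycles. Real hardware overlaps requests, so this is only a
    first-order model.
    """
    return l2_hit_rate * l2_hit_cycles + (1.0 - l2_hit_rate) * memory_cycles


if __name__ == "__main__":
    print("L2 vs main bandwidth ratio:", round(bandwidth_ratio("L2", "main"), 2))
    # ASSUMPTION: 12 and 200 cycles are placeholder latencies, not measured.
    for hit_rate in (0.0, 0.5, 0.9):
        print(hit_rate, average_access_cycles(12, 200, hit_rate))
```

Under those placeholder latencies the average cost per request drops sharply as the L2 hit rate rises, even though the measured bandwidth ratio between L2 and main memory is modest. That would be consistent with the latency explanation, but it does not prove it.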
Wolde you bothe eate your cake, and have your cake?
Again... what is this mythical "true SMP operations" that people keep mentioning? Are you talking about MIMD code?
I don't understand the "places" you mention. L2 cache has been multiported for a long time. Additionally, the cache subsystem should be able to handle simultaneous requests from both cores. There should be no stalling due to simultaneous cache accesses from both cores in a shared cache system. As for cache spills, any situation that causes spills in a shared cache should also cause spills in a non-shared one (I'll mention this later). Basically, the shared 2M cache can mimic the degenerate case of two 1M caches exactly, but it also has the flexibility to act like one core having a 512K cache and the other having a 1.5M cache, for example, if the working sets dictate (I'll mention this later too).
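The flexibility point can be sketched with a toy simulation in Python. Everything here is an assumption for illustration: a fully associative LRU cache, two threads that walk their working sets cyclically and take turns, and sizes scaled down to 64 lines per core (so 32 and 96 lines stand in for the 512K and 1.5M working sets). Real caches are set-associative and real access patterns are messier.

```python
from collections import OrderedDict


class LRUCache:
    """Toy fully associative cache with LRU replacement (an assumption)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()
        self.misses = 0

    def access(self, tag):
        if tag in self.lines:
            self.lines.move_to_end(tag)  # mark as most recently used
            return
        self.misses += 1
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)  # evict least recently used
        self.lines[tag] = True


def run(cache_a, cache_b, set_a, set_b, accesses):
    """Threads A and B alternate, each cycling through its own working set."""
    for i in range(accesses):
        cache_a.access(("A", i % set_a))
        cache_b.access(("B", i % set_b))


def compare(lines_per_core=64, set_a=96, set_b=32, accesses=960):
    """Return (misses with two private caches, misses with one shared cache)."""
    private_a = LRUCache(lines_per_core)
    private_b = LRUCache(lines_per_core)
    run(private_a, private_b, set_a, set_b, accesses)

    shared = LRUCache(2 * lines_per_core)
    run(shared, shared, set_a, set_b, accesses)
    return private_a.misses + private_b.misses, shared.misses


if __name__ == "__main__":
    print(compare())
```

In this toy setup, thread A's working set does not fit in its private cache and thrashes, while thread B leaves half of its private cache unused. The shared cache of the same total size holds both working sets, so it only takes the initial cold misses.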
I don't get your discussion... I'm just not following your verbiage. I'm trying to understand it but can't get your metaphors or something.
Anyway I'll try to discuss what I think you are talking about. Shared L2 cache is considered the superior design compared to each core having unshared cache. There are numerous discussions on this around the 'net. However, I'll talk about several specific examples.
In a non-shared cache configuration with two cores on the same die running multithreaded code, you can easily get into situations where both threads want write access to the same piece of data. This is fairly common; mutexes and semaphores in fine-grained code are good examples. In a non-shared cache system, this generates a lot of MOESI traffic and the data gets passed back and forth between the two caches, which takes inter-cache bandwidth. In the shared cache system, that data is in the shared L2 cache exactly once and there is no passing it around: no MOESI traffic and no inter-cache bandwidth used, because no copy takes place. In such a situation (two threads competing for writes on the same data), the shared L2 cache can be much faster than the non-shared L2 cache. The absence of that MOESI traffic also lightens the load on the MOESI subsystem, leaving it free to handle other traffic and transfers. In some codes, MOESI traffic and data copying between the unshared L2 caches can become almost pathological, leading to heavy slowdown as the two cores fight for access to the data.
To summarize: shared L2 means much lower MOESI traffic in a competing-writes situation and little or no inter-cache bandwidth use, because no copies between caches occur. Non-shared L2 in the same situation means more MOESI traffic and more inter-cache bandwidth used (and cores waiting) to transfer the data back and forth. It's easy to write a simulation of this problem.
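As a rough idea of what such a simulation might look like, here is a deliberately tiny Python sketch. It models a single contended cache line and counts how often ownership has to move between caches. The function names are hypothetical, and the model is an assumption-heavy simplification: it ignores the private L1 caches (which would still need coherence in a real shared-L2 design), and it treats every change of writer as exactly one cache-to-cache transfer.

```python
def count_line_transfers(writer_sequence, shared_l2):
    """Count cache-to-cache transfers of one contended line.

    writer_sequence: iterable of core IDs, in the order they write the line.
    shared_l2: if True, the line lives once in the shared L2 and never moves
               (simplifying assumption: L1 coherence is ignored).
    """
    if shared_l2:
        return 0

    owner = None      # core whose private cache currently holds the line
    transfers = 0
    for core in writer_sequence:
        if owner is not None and owner != core:
            # The other cache must give up the line before this write.
            transfers += 1
        owner = core
    return transfers


def alternating_writers(n_writes, n_cores=2):
    """Worst case: the cores take strict turns writing the same line."""
    return [i % n_cores for i in range(n_writes)]


if __name__ == "__main__":
    writes = alternating_writers(1000)
    print("private L2 transfers:", count_line_transfers(writes, shared_l2=False))
    print("shared L2 transfers: ", count_line_transfers(writes, shared_l2=True))
```

With strictly alternating writers, nearly every write in the private-cache case forces a transfer, which is the pathological ping-pong described above. A more honest model would attach a cycle cost to each transfer and include the L1s.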
A second example is cache utilization. If you have two threads in a dual core system that are asymmetric in cache working set size, you can