NVIDIA Unveils (And Tom's Reviews) The GeForce4
EconolineCrush writes: "NVIDIA has finally revealed its GeForce4 Titanium and MX graphics processors. Tom's Hardware has some benchmarks comparing the new offerings to current products, and the results are pretty interesting. Meanwhile, The Tech Report does an excellent job cutting through the hype with an examination of each new chip's features. Both articles are well worth reading to get the full story on the latest from NVIDIA."
Actually, when they announced the speed-bumped towers a few weeks ago, they noted that the higher-end ones included the GeForce4. Of course, nVidia had not announced the existence of such a product yet, leading to some speculation here on Slashdot.
As far as Apple having a history of slow graphics cards, they have done pretty well in the towers for the last year or two. They were the first (by a couple of days) to have the GeForce 3 even.
You can't go off of the chipsets in the iMacs - the iMac was essentially a laptop with a CRT on it (now it's a laptop in a bigger package). But the G3s (after the beige boxes) and G4s (G4s especially) have always had strong card options, both at Apple and outside of it.
Anandtech has quite a good review here. They also have benchmarks from the latest build of the Unreal engine here. Enjoy :)
Well, there is the normal average-joe meaning of interesting and there is the understated-all-to-hell meaning of interesting.
An example of the latter: at the University of Texas at Austin, Hans Mark - former Director of NASA Ames, Deputy Administrator of NASA, etc., etc. - used to teach a class in which the Airborne Laser system would become a topic of conversation. When asked about its range (since he'd seen the classified testing documents), all he'd say was that it was effective at a 'militarily interesting distance'.
Now, that's a far cry from Tom's Hardware and the GeForce4, but maybe they're trying to get a little reflected glory rather than simply grossly underusing the language.
We can hope, right?
Don Negro
Squid
Wolfman (i guess this is the best)
Tidepool
Looks like they had some spelling errors on some of the videos (they spelled content as contnent).
The idea is that if you are maxing out at 30fps, then when a more complicated scene comes up (player turning, multiple weapon fire, smoke grenades, other players in field of view) your rate will drop well below 30... if your rate is 60fps, the times when it will drop to a visibly "chunky" speed are fewer and farther between (on average). So, if you have a base rate of 150fps, you should be able to handle just about any event without noticeable slowdown in drawing.
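To put rough numbers on the headroom argument, here is a minimal sketch. It assumes (my simplification, not a measurement) that a busy scene simply multiplies the time needed to draw a frame by some factor; the factor of 3 is made up for illustration.

```python
def fps_under_load(base_fps, load_factor):
    """Frame rate when a scene takes `load_factor` times as long to draw."""
    frame_time_ms = 1000.0 / base_fps          # time per frame in the easy case
    return 1000.0 / (frame_time_ms * load_factor)

# Hypothetical busy scene that costs 3x as much to draw as the base scene.
for base in (30, 60, 150):
    print(f"{base} fps base -> {fps_under_load(base, 3.0):.1f} fps under load")
```

Under that assumption the 30fps system falls to about 10fps, while the 150fps system is still at about 50fps.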
That being said, I just upgraded my 16MB TNT (original) to a 64MB GF2MX400 recently, and it is more than suitable for the 3D games I play (mostly sports sims and RTS types). The only difference is that I can now render Yankee Stadium at 1280x1024+ with better detail. Playability hasn't changed much.
"It's tough to be bilingual when you get hit in the head."
The same thing happened on a $900 19" Toshiba monitor at buy.com last year. It was listed for $100 something. Thousands of people ordered it. There was a class action lawsuit, and each person that ordered it (myself included) ended up getting a check for $45.
Bottom line: Go order it, even if you don't get it, you might get some cash out of the settlement.
If you're sick of all these senseless video card upgrades, just follow the $150 video card rule. No game is really going to take full advantage of a card less than $150. If you're paying more than that, you're wasting money.
Your money would be better spent on a better monitor, for instance. You'd be surprised at the number of people who spend $400 on a video card to play on a $150 monitor, then wonder why things are still jumpy. A nice subwoofer and new speakers would also enhance your gaming experience.
Prices and availability are subject to change without notice. Errors will be corrected where discovered, and Best Buy reserves the right to revoke any stated offer and to correct any errors, inaccuracies or omissions (including after an order has been submitted). Best Buy may, at its own discretion, limit or cancel quantities purchased per person, per household or per order. These restrictions may include orders placed by the same BestBuy.com account, credit card, and also orders which use the same billing and/or shipping address. Notification will be sent to the e-mail and/or billing address provided should such change occur.
-From the BestBuy website.
So this means that this probably won't be honored. Bummer.
-Julius X
price has been fixed. Now, will they cancel my order? ;)
Yep, it was a mistake... they just jacked the price up to $399.99.
No, Best Buy's cost is most likely somewhere in the neighborhood of $350-$375. They make a *huge* profit on things like cables, but computer parts are usually pretty reasonably priced. I bought a Radeon 8500 from them a few months ago using an employee discount, and the price dropped from about $290 to $260. I imagine the Ti4600 is quite similar in markup.
More shaders, More pixel pipelines, More memory bandwidth... whoopee...
When the hell are they going to ditch the antiquated scanline rendering method and go work on some tile based rendering methods?
Probably never, and for very good reason. Tile-based rendering is a very efficient architecture whose time has already come and gone.
For those who don't know, tile-based rendering divides an image up into a number of smaller squares ("tiles") and renders them independently, as opposed to the traditional method ("immediate-mode rendering") of rendering an image one polygon at a time. The major benefits claimed for tile-based renderers are that the process is more parallelizable (no risk of two chips rendering to the same area if they are working on different tiles) and that it is an easy modification to check each polygon's z-value (its distance from the camera) as you add it to the poly-list for its tile, and then to only texturize those polygons which are not occluded (i.e. actually visible). This is in contrast to the traditional immediate-mode rendering algorithm, where polygons are textured more or less in random order, leading to situations where a polygon will go through the entire process of being textured and rendered, only to later be completely covered up by a later poly--a situation which wastes a lot of (especially) memory bandwidth, fetching all those useless textures and such.
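For anyone who wants to see the overdraw difference concretely, here is a toy sketch of the two approaches. Everything in it is an assumption for illustration: "polygons" are axis-aligned rectangles `(x0, y0, x1, y1, z)` with a single depth, the tile size is arbitrary, and "texturing" is just a counter. It is not how any real chip is organized.

```python
TILE = 8  # hypothetical tile edge in pixels

def immediate_mode(rects, width, height):
    """Texture rects in submission order; count pixels that get textured."""
    depth = [[float("inf")] * width for _ in range(height)]
    textured = 0
    for x0, y0, x1, y1, z in rects:
        for y in range(y0, y1):
            for x in range(x0, x1):
                if z < depth[y][x]:      # passes the z-test right now...
                    depth[y][x] = z
                    textured += 1        # ...but may be covered up later
    return textured

def tile_deferred(rects, width, height, tile=TILE):
    """Bin rects per tile, resolve visibility first, texture visible pixels once."""
    textured = 0
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            # Poly-list for this tile: every rect that overlaps it.
            binned = [r for r in rects
                      if r[0] < tx + tile and r[2] > tx
                      and r[1] < ty + tile and r[3] > ty]
            for y in range(ty, min(ty + tile, height)):
                for x in range(tx, min(tx + tile, width)):
                    covering = [r for r in binned
                                if r[0] <= x < r[2] and r[1] <= y < r[3]]
                    if covering:
                        nearest = min(covering, key=lambda r: r[4])
                        textured += 1    # only `nearest` would be textured
    return textured

# Worst case for immediate mode: three full-screen layers drawn back to front.
scene = [(0, 0, 16, 16, 3.0), (0, 0, 16, 16, 2.0), (0, 0, 16, 16, 1.0)]
print(immediate_mode(scene, 16, 16), tile_deferred(scene, 16, 16))
```

In this contrived back-to-front scene the immediate-mode loop textures every pixel three times, while the deferred loop textures each pixel once.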
Cool! Sounds great! Let's hear it for tile-based rendering! Too bad ATI and NVIDIA have clearly never ever heard of this miracle technique! After all, it's not like they would ever make (gasp!) an informed choice not to use it!
Well...not so fast. Basically what we've seen is that tile-based rendering offers two potential benefits: it eliminates *some* of the complexity of enabling multi-GPU implementations, and it uses quite a bit less memory bandwidth in the base case. The problem is that both of these supposed benefits really buy you very little when designing a consumer-level graphics card today.
First, the problem of "dividing up the work" isn't really what's preventing multi-chip graphics cards these days. Indeed, it's really a rather easy problem. Here's a clue: have alternate chips render alternate frames. Gee...that wasn't so tough, now was it? Well, no. But the other problems of implementing a multi-chip card for the consumer market sure are. For example, we have our choice of implementing an (expensive, performance gating) point-to-point bus to handle memory traffic (and have memory bandwidth/chip cut in half anyways), or of completely mirroring the memory, using twice as much for the same capacity (expensive). Then there's the cost of a second chip (expensive), the cost of packaging the second chip and connecting it to memory (expensive), and the cost of the extra power and cooling, the cost of trying to squeeze it all onto one card (results in a bigger, more expensive card; may gate clockability). And this is without mentioning the extra development and debugging time that goes into getting a multi-chip solution to work correctly. (In general this is one of the most difficult issues design engineers face.) Golly, it's almost enough to make you remember how when 3dfx tried to make a multi-chip product it was 6 months late, the single-chip card was far too slow, the double-chip (and cancelled quad-chip) card too expensive, and, due to the release delay, no longer competitive. (OTOH John C has hinted that a scalable multi-chip architecture might be on the way from one of the major players. Tie that in with the fact that Anand reports the GF4 will be the last to use the GF name, and that NVIDIA owns the remnants of 3dfx, and I start scratching my head...)
Second, the problem of memory bandwidth. Or rather, the former problem of memory bandwidth. Yes, the traditional rendering pipeline is very inefficient with memory bandwidth. Thing is, the prices on high-speed DDR have been coming down so fast that it hardly matters. You can find a Radeon 7500 with 64MB of 128-bit-wide DDR running at 2x230 MHz (i.e. 7.4GB/s bandwidth) for as low as $85 on pricewatch.com. (Actually there's one for $79 but it may be mislabeled.) The memory is probably less than $30 of the cost. Or maybe even less--the 64MB and 32MB GF2Pros (6.4GB/s bandwidth) only differ by $6. And the new GF4 MX460 hits the street with 64MB of 2x275 MHz DDR (8.8GB/s) for $179, list, on a brand new card.
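The bandwidth figures above fall out of simple arithmetic: bus width in bytes, times memory clock, times two transfers per clock for DDR. A quick sketch, using only the bus widths and clocks quoted in this post (the function name is just mine):

```python
def ddr_bandwidth_gb_s(bus_width_bits, clock_mhz, transfers_per_clock=2):
    """Peak memory bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * clock_mhz * 1e6 * transfers_per_clock / 1e9

print(ddr_bandwidth_gb_s(128, 230))  # Radeon 7500: 128-bit DDR at 2x230 MHz
print(ddr_bandwidth_gb_s(128, 275))  # GF4 MX460:   128-bit DDR at 2x275 MHz
```

These are peak theoretical numbers (7.36 rounds to the 7.4GB/s quoted); real sustained throughput will be lower.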
As for the price premium of using relatively high-speed DDR instead of the same amount of SDRAM, it's pretty negligible. Even for the highest speed DDR it's not such a big deal. Sure, NVIDIA charges an extra $100 for another 25MHz on the GPU and an extra 1.6GB/s from the memory (GF4 Ti4600 vs. Ti4400), but that doesn't mean it costs them anywhere near that much (depending on GPU yields). It just means they like to bilk those in the $400-for-a-video-card crowd for the full $400. So how much does the stuff cost? Well...Hynix recently announced samples and volume production of 2x375 MHz x32 DDR selling at $10 for 128Mbit chips. That means $40 for 64MB of 128-bit-wide DDR with 12GB/s bandwidth. Not too shabby.
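Here is the Hynix arithmetic spelled out, as a rough sketch. The chip size, width, clock and price are the ones quoted above; the function itself is just an illustration and ignores board, routing and packaging costs.

```python
def memory_config(total_mb, chip_mbit, chip_width_bits, chip_price_usd, clock_mhz):
    """Return (chips, bus width in bits, memory cost in USD, peak GB/s) for DDR."""
    chips = (total_mb * 8) // chip_mbit            # megabits needed / megabits per chip
    bus_bits = chips * chip_width_bits             # chips sit side by side on the bus
    cost = chips * chip_price_usd
    bandwidth = bus_bits / 8 * clock_mhz * 2 / 1000  # 2 transfers per clock (DDR)
    return chips, bus_bits, cost, bandwidth

# 64MB built from 128Mbit x32 chips at 2x375 MHz, $10 each
print(memory_config(64, 128, 32, 10, 375))
```

That works out to four chips, a 128-bit bus, $40, and 12GB/s peak.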
Ok, ok...so maybe the benefits of tile-based rendering don't really mean all that much in today's consumer GPU market. But better is better: why wouldn't ATI and NVIDIA use tile-based architectures for the benefits they do provide? After all, it's not like there might be some (gasp!) downsides to tile-based rendering!
Well, actually, there are. For one thing, it's more difficult to design a tile-based GPU and get it running at high speeds. For another, both NVIDIA and ATI have years and years of research and experience with implementation techniques and algorithms for immediate-mode renderers, much of which wouldn't apply to tile-based designs.
For another, neither ATI nor NVIDIA really uses traditional immediate-mode rendering anymore. Instead they use modified immediate-mode rendering, with lots of algorithmic tricks and tweaks to lessen the memory bandwidth inefficiencies of traditional immediate-mode rendering. Things like lossless z-buffer compression and various early polygon-culling algorithms. No they aren't quite as effective in reducing overdraw as tile-based rendering, but they provide quite a significant benefit. Indeed, the GF4 Ti4600 has more or less caught up with the (tile-based) KyroII in Kyro's own villagemark benchmark, which is contrived entirely to test massive overdraw of the sort which is never encountered in a game. The KyroII is only 8 months old. Sure it's much much cheaper than a Ti4600, but if Kyro can barely keep the lead in the one benchmark specially designed to make the case for tile-based rendering then something is wrong here.
Meanwhile there are very serious issues with the ability of tile-based rendering to scale to meet future challenges. In particular, the tile-based rendering algorithm works very naturally so long as there are no polygons which find themselves spread into more than one tile, and so long as you don't use transparent or translucent textures. Of course it's not that tile-based chips can't handle these situations--the KyroII is here and works just fine, after all--but just that they require complicated workarounds which are more inefficient than for immediate-mode rendering, which handles these cases naturally.
The problem is that both cases are going to be more and more likely as graphics continue to improve. As tile-based rendering tries to scale with increasing scene polygon counts and resolutions, you get more tiles per scene and many more polygons crossing tile boundaries. And as graphical effects get more realistic, the alpha channel (i.e. transparency) starts coming into play more and more. Indeed much of the recent research in non-real-time computer graphics has focused on adding translucent "subsurface" reflections to the ray-tracing algorithm. This (and approximations of it) is the sort of thing that future pixel shaders are going to be called on to do, and tile-based rendering is a bad match for it.
Indeed, most of the recent advances in graphics are pointing towards a world in which the assumptions which tile-based rendering is based on no longer hold. How, for example, does tile-based rendering handle cubic environment mapping across tile boundaries, or cast dynamic shadows across tile boundaries? What happens if a dot3 bump map extends a texture from one tile into another? I'm sure clever solutions can be found to these and all the other dozens and dozens of issues that will arise when you try to mix DX8-style effects and tile boundaries, but the main point is that tile-based rendering was an algorithm developed under two assumptions which increasingly do not hold:
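The boundary-crossing point is easy to play with in a toy model. The sketch below is entirely hypothetical: it scatters random polygon bounding boxes over a 1024x1024 screen and counts how many straddle a tile edge as the tiles get smaller. The sizes, counts and seed are made up, and a bounding box is only a stand-in for a real polygon.

```python
import random

def spans_multiple_tiles(bbox, tile):
    """True if the bounding box (x0, y0, x1, y1) touches more than one tile."""
    x0, y0, x1, y1 = bbox
    return x0 // tile != x1 // tile or y0 // tile != y1 // tile

def crossing_count(bboxes, tile):
    return sum(spans_multiple_tiles(b, tile) for b in bboxes)

def random_bboxes(n, screen=1024, max_size=24, seed=0):
    """Hypothetical scene: n small boxes placed uniformly on the screen."""
    rng = random.Random(seed)
    boxes = []
    for _ in range(n):
        w, h = rng.randint(1, max_size), rng.randint(1, max_size)
        x0 = rng.randint(0, screen - w - 1)
        y0 = rng.randint(0, screen - h - 1)
        boxes.append((x0, y0, x0 + w, y0 + h))
    return boxes

boxes = random_bboxes(10_000)
counts = {tile: crossing_count(boxes, tile) for tile in (64, 32, 16)}
print(counts)  # crossings per tile size
```

Because the 32-pixel grid contains every edge of the 64-pixel grid (and likewise 16 within 32), the crossing count can only stay the same or go up as the tiles shrink.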
1) If one polygon occludes another, the other's texture will never be visible to the camera;
2) Objects in one section in the screen can be rendered without reference to any other parts of the screen.
Of course, we may never know the difficulties of trying to make a DX8-compliant tile-based renderer; after all, the KyroII hasn't even made it to DX7, since it is still missing integrated T&L. I have no idea whether this is because of any difficulties integrating T&L with a tile-based rendering pipeline (can't think of why it would be a problem, but it may be), or just because the Kyro doesn't have the money or manpower behind it to keep up with 3 year old technology, but this lack is already preventing the KyroII from competing effectively with the cheaper GF2MX on modern high-poly games. I am pretty sure that integrating a programmable pixel shader into a tile-based architecture would be pretty tough, if not pretty impossible.
Which brings me to the main point: you started out writing "More shaders, More pixel pipelines, More memory bandwidth... whoopee..." and in a sense, this is the right attitude. To which we should very quickly add "tile-based screen division...deferred rendering algorithm...whoopee..." All these technical details only mean something insofar as they give us the capability for more realistic graphics--this means high FPS, high color depth, higher resolutions, lack of aliasing problems, high-quality mip-mapping/anisotropic filtering, realistic--or even dynamic--lighting and shadows, realistic and/or impressive pixel effects, high polygon counts, useful and realistic vertex effects, etc.--for a reasonable price. It is pretty damn hard to argue that the last few years, under NVIDIA's leadership (and ATI's pursuit) have not resulted in huge improvements on these measures. Again, the new GF4 Ti4600 may be ridiculously expensive and may not change your experience with today's games very much (besides enabling 1600x1200x32 with 4xAA at playable framerates), but when the new Doom game comes out, a card with similar specs and selling for ~$100 will bring you decent performance on an engine which offers a totally new level of graphical realism. Same thing when Unreal Warfare, Unreal 2, Deus Ex 2, and all the other Unreal 2-engine games start coming out. Believe me, a GF4 caliber card will improve the experience of playing those and later games significantly over a GF3 and especially a non-DX8 compliant card like a GF2 (and, sadly, a GF4MX). And, believe me, those games are going to provide significantly more realistic graphical experiences than those of today.
Immediate-mode rendering is doing just fine, and the GF4 marks an evolutionary but very significant improvement to the state-of-the-art. A switch to tile-based would require significant retreading to reach the same level, and might form a poorer basis for future improvements. But, if I'm wrong, then ATI and NVIDIA will make the switch. Believe me, they know all about tile-based rendering, and NVIDIA even owns Gigapixel (via 3dfx) and their tile-based rendering engine. I think they'll stick to modifications of immediate-mode rendering, but no matter what they do it will be whatever they think offers the best graphics performance at the lowest cost to them.
And now to correct some minor misconceptions in your post:
Hell, the reason why the Geforce line has to keep doubling its fill rates every generation is because its architecture is so god damn inefficient. Look at the memory bandwidth requirements for the cards!
The reason the GeForce line increases its texel fill rates continually is because consumers want to run new games which have higher multi-texturing requirements (Carmack has said Doom3 will have something like ~8 textures/pixel), and to run existing games in higher resolutions and at higher FPS.
The memory bandwidth "requirements" for the cards don't matter, only the prices. If a recent card with 7.4GB/s only costs $85 (Radeon 7500) and a brand new card with 8.8GB/s lists for $179, then the costs of increasing memory bandwidth are obviously not so terrible. Today's $400 card is next year's $80 card. Similarly, immediate-mode rendering's inefficiencies need to be measured according to their dollar costs, not their bandwidth costs.
Instead of using the relatively limited bandwidth of AGP for streaming textures from main memory (where it should god damn be) to the texture cache, the card is busy wasting bandwidth on the damn Z-buffer (which would be eliminated if they implemented hidden surface removal like the PowerVR chipsets).
???
First off, textures most certainly should not "god damn be" in main memory! The AGP bus is there to stream vertex data from the CPU (pre- or post-transformation, it's the same amount of data). That's all it's there to do, and good thing, too, because today's high-poly games can already generate enough vertex data to make AGP 2x a bottleneck, and those of a couple of years from now will do the same to AGP 4x. (Which is why AGP 8x is on the horizon.) Increasing the bandwidth of a bus from the northbridge across the motherboard through a slot to an add-on card is a whole lot harder than increasing the bandwidth from soldered DDR to a soldered GPU a few centimeters away. AGP should only carry the data which it absolutely is forced to--namely initial vertex data from the game's engine running on the CPU.
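For a feel of the numbers, here is a back-of-the-envelope sketch of vertex traffic over AGP. The peak rates are the nominal AGP figures (roughly 533MB/s for 2x, 1066MB/s for 4x, 2133MB/s for 8x); the 32-byte vertex (position, normal, one texture coordinate) and the vertex counts are my own assumptions, and sustained AGP throughput in practice is well below peak.

```python
AGP_PEAK_MB_S = {"2x": 533, "4x": 1066, "8x": 2133}  # nominal peak rates

def vertex_traffic_mb_s(vertices_per_frame, bytes_per_vertex, fps):
    """MB/s of vertex data streamed to the card (1 MB = 1e6 bytes)."""
    return vertices_per_frame * bytes_per_vertex * fps / 1e6

def max_vertices_per_frame(agp_mode, bytes_per_vertex, fps):
    """Vertices per frame that would saturate the bus at its nominal peak."""
    return int(AGP_PEAK_MB_S[agp_mode] * 1e6 / (bytes_per_vertex * fps))

# Hypothetical scene: 100,000 vertices per frame, 32 bytes each, 60 fps
print(vertex_traffic_mb_s(100_000, 32, 60), "MB/s")
for mode in AGP_PEAK_MB_S:
    print(mode, max_vertices_per_frame(mode, 32, 60), "vertices/frame at peak")
```

Anything else sharing the bus (texture uploads, for instance) eats into that budget, which is the reason to keep textures in on-card memory.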
Z-buffer lookups only waste bandwidth between the GPU and the on-card memory. Technically, you don't eliminate z-buffer lookups with a tile-based architecture; you eliminate texture lookups (and texture application) on occluded polygons. However, by dealing with a small tile at a time, you can read all the z-buffer data for the tile in from memory all at once, and store it in an on-chip cache until you're done with that tile. (This is essentially why higher poly-count games mean smaller and smaller tiles.)
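To see why a per-tile z-buffer can live on-chip while a full-screen one cannot, compare the sizes. The 32x16 tile and 32-bit depth values below are assumptions for illustration only.

```python
def zbuffer_bytes(width, height, bytes_per_depth=4):
    """Storage needed for one depth value per pixel."""
    return width * height * bytes_per_depth

tile = zbuffer_bytes(32, 16)         # hypothetical 32x16 tile
screen = zbuffer_bytes(1600, 1200)   # full 1600x1200 frame
print(tile, "bytes per tile vs", screen, "bytes per frame")
```

A couple of kilobytes fits comfortably in an on-chip cache; several megabytes has to sit in card memory and be fetched over the memory bus.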
And last, they do implement hidden surface removal techniques, like I pointed out before, even though they are less effective than with a tile-based architecture.
Check out this article here:
http://www.anandtech.com/showdoc.html?i=1577
It's all about the board manufacturers putting crap low-pass filters on the boards. Solution: rip those suckers off!