NVIDIA Unveils (And Tom's Reviews) The GeForce4
EconolineCrush writes: "NVIDIA has finally revealed its GeForce4 Titanium and MX graphics processors. Tom's Hardware has some benchmarks comparing the new offerings to current products, and the results are pretty interesting. Meanwhile, The Tech Report does an excellent job cutting through the hype with an examination of each new chip's features. Both articles are well worth reading to get the full story on the latest from NVIDIA."
I almost can't stand it when I buy a new flashy graphics card that is praised by every magazine, and then a NEWER card comes out, that supports DX8 pixel shaders, etc., etc. (IE I bought a Radeon 64MB DDR card....two weeks later, hello GeForce3)
I hope if I buy a GeForce4, it'll last, in both speed and 3D technology.
And in an almost surprising move, Apple's offering it as a build-to-order option in their towers (announced yesterday). For a company that almost always has hideously slow graphics cards, it's kind of a nice change to see them ahead of the game for once in this department.
Mod point free since 2001
http://www.anandtech.com/video/showdoc.html?i=1583
And to everyone's surprise, GeForce4 is faster than
the previous chipsets. Has more pipelines and
bigger memory bandwidth. When will someone try
the new and fresh marketing trick and announce
hardware that is slower than the old hardware?
(I hope MS didn't hear this and start making hardware)
- To understand recursion, we must first understand recursion -
LeadTek has a Geforce3 Ti200 with 128M of memory
for under $200. I just got one of these a
couple of days ago. Heaviest video card I've ever
owned. Looks great in windows. (I did windows
first because I knew it would take longer). If anybody's curious, mail me; I should have it
working under linux tonite if nothing comes up
after work.
funny story: I upgraded my mobo as well to
a soyo dragon+... That thing does NOT turn off
power to the keyboard or ps/2 mouse port when it
powers down. I finally had to unsolder that idiot
taillight on my MS optical mouse so I could get
some sleep.
I can't find my car keys. (no a's in email)
After this article and yesterday's overly-glowing review of the Xbox, it seems to me that Tom's has fallen on hard times. Consider the following sentence:
"The test guys who aught [sic] to have caught this driver bug seem to be busy selling their stock our [sic] counting their money instead."
All their articles now seem to have been written in five minutes and sent out the door without the slightest bit of editing, or even spell checking!
I don't mean to nitpick, but Tom's used to be a very reliable source- and a great read. Not so much anymore.
Does anyone use these cards for anything other than games?
These cards cost as much as a decent CPU... or a console game system, yet are a fraction of the cost of a CAD card. Their shelf life seems pretty limited as well. In a year or two they will all have a half gig of Rambus or DDR and we'll have 16X AGP? Then we'll all need high definition monitors because today's pixels will all look "blocky" by comparison. Then we'll be right back to unusable framerates at higher resolutions... it all goes full circle.
I've never been able to justify the cost, but then again I don't game. The ironic thing is that "fun and games" arguably stress the hardware more than any other apps for most general home users.
Those that suggest you "dance like no one is watching" really want to see you make a complete fool of yourself.
The THG article indicates that for all intents and purposes, the average home-computer user still has enough power in his 700-1000MHz machine that upgrading to the ridiculously overpowered 2GHz P4s and Athlon XP 2000+ etc, just isn't worth it for them (unless of course their livelihood is dependent upon computing time). I believe the same is starting to happen in the GPU field as well. A brother of mine recently bought a GeForce 3 card, just after the introduction of the whole Ti 500/200 updates. To this day it's still more power than he needs and should be able to outlast the TNT2 Ultra card it replaced. The main point being that except for those people that crave "the fastest," and there's nothing wrong with that ;-) , these incremental increases in performance are going to mean less and less to the consumer, most of whom go to the biggest electronics store around and say "my kid needs a special 3d thingy to play this new game." Although I honestly believe people would be happier if they informed themselves a little, it's impossible to think that they will and in the end it doesn't matter. We've been years away from any new device that shows real promise, instead the best some people can come up with is an integrated cell-phone / PDA. Hmmm... who would have thought... until something does show up... I'll be playing Quake on an 8MB single-head graphics card. Humiliation!
Anandtech has quite a good review here. They also have benchmarks from the latest build of the unreal engine here. Enjoy :)
Unstable Apps: Our Android Apps Don't Suck
I know you're out there John. :)
Lemme ask you this: it seems that with the previous generation of 3D cards, the technology had reached the point where any game with a reasonable game engine could be run at 1024X768x32bit with all the detail goodies turned on at framerates that were completely playable.
(Perhaps this is a mistaken assumption?)
If so, then what does this card bring to the table from a game designer/coder's perspective?
If there's no point in driving a Quake3 style engine any faster (because it's already fast enough) then what will you be able to do with this new hardware that you couldn't do with older stuff?
Or to rephrase, what hardware feature do you most wish was available on the current generation of 3D cards, and does this new card have that feature?
DG
Want to learn about race cars? Read my Book
So, does any company make good graphics cards with open specs?
"The question of whether a computer can think is no more interesting than that of whether a submarine can swim" -EWD
Um... this has to be a mistake, but apparently Best Buy is letting you Pre-Order these little slices of heaven for $129.00
Check it out.
Well, there is the normal average-joe meaning of interesting and there is the understated-all-to-hell meaning of interesting.
An example of the latter: at the University of Texas at Austin Hans Mark - former Director of NASA Ames, Deputy Administrator of NASA, etc., etc. - used to teach a class in which the Airborne Laser system used to become a topic of conversation. When asked about its range (since he'd seen the classified testing documents), all he'd say was that it was effective at a 'militarily interesting distance'.
Now, that's a far cry from Tom's Hardware and the GeForce4, but maybe they're trying to get a little reflected glory rather than simply grossly underusing the language.
We can hope, right?
Don Negro
Perl 6 will give you the big knob. -- Larry Wall
Squid
Wolfman (i guess this is the best)
Tidepool
Looks like they had some spelling errors on some of the videos (they spelled content as contnent).
If you're sick of all these senseless video card upgrades, just follow the $150 video card rule. No game is really going to take full advantage of a card that costs more than $150. If you're paying more than that, you're wasting money.
Your money would be better spent on a better monitor, for instance. You'd be surprised at the number of people that spend $400 on a video card to play on a $150 monitor, then wonder why things are still jumpy. A nice subwoofer and new speakers would also enhance your gaming experience.
I've noticed that /. uses the word 'interesting' when an article/review/benchmark doesn't show the community's favoured product (linux/AMD/ATI) as a superior one.
Most slashdotters see nVidia as an evil corporation because they don't open source their drivers for linux. This leaves ATI as the favourite. The benchmarking shows that in almost every test (except aniso) the GF4 smokes the 8500, therefore the results are summarized as 'interesting'.
If the ATI card actually did outperform the nVidia one, then the post would contain something like "ATI crushes the evil nVidia, we are 1337".
I'm not the one to look up previous articles, but I do recall some benchmarks (biased or not) where NT/2000 did something better than linux. The poster stated that the results were "interesting".
I think this is slashdot's attempt to hide the truth that it is possible for the 'evil' corporation to do something good.
On another note, who else thinks that it is pointless to use Q3 as a benchmark? Start using RTCW or another game that actually makes modern cards break a sweat.
The average slashdotter thinks that any program could be reduced to the following if it were written by "skilled" programmers:
int main(int argc, char *argv[]) { return 0; }
Basically it's better to do nothing quickly than to actually accomplish something more slowly.
My only political goal is to see to it that no political party achieves its goals.
The exciting thing about the GeForce 4 is not that it's faster or cheaper, it's that finally the programmability is at an appropriate level.
Uh-huh. 15%. Yawn. Don' need that. I can play Deus Ex just fine. Well, guess what. Even if you think that games are the entire universe, some day you might just need an MRI and need someone to be able to look at it and find something that will keep you from dying. Medical imaging is one of the things that the GeForce 4 will be good enough to do. Scientific visualization, volumetric rendering, that sort of stuff.
Why is this? About a decade ago, everything was basically SGI. These were big, expensive machines, suitable for vertical markets. It was possible to get the engineers to work with the microcode for the sales of a small number of units.
Then various card companies came along (NVIDIA has a lot of ex-SGI engineers) and started making cards for the horizontal gaming market. They concentrated, of course, on satisfying the needs of their biggest customers/promoters, which were the gaming people. Many of these cards were customizable, but at a level of abstruseness that made it so that maybe three people in the world could really hack them up the wazoo.
In the meantime, SGI suffered, because even people who should know better make decisions on the basis of "gee whiz." No magazine is going to benchmark a card on how accurately it shows a tumor from real data. A perception rose that the graphics problem had been solved for cheap, when it really hadn't been.
The GeForce 4 finally brings little-card graphics up to the point where mere mortals can actually do customization for vertical markets.
;-)
First of all, nobody uses scanline rendering. Maybe NEC PowerVR if they're still around. 'Scanline' as most graphics guys use the term means you do hidden surface removal with something like Bresenham's algorithm rather than a Z-buffer. But everybody uses Z buffers and, as far as I can tell, a 'sort-middle' approach.
Second, tile-based rendering has been tried many many times, both by high-end graphics companies (HP's PixelFlow effort a few years back) and by low-end companies (PowerVR's scanline approach, Dynamic Pictures did tiles under the covers IIRC, MS Talisman, PixelFusion, Gigapixel, and others I'm no doubt forgetting of the 40+ PC 3D companies that were around 5 years ago...). Basically it's a loser. It doesn't fit well with DirectX and OpenGL APIs, it creates almost as many problems as it solves (e.g. load-balancing among tiles, bandwidth-sucking data overlap/duplication among tiles), and the marginal improvements it might generate in theory in speed are outweighed by the retraining time required for graphics developers worldwide to learn programming techniques oriented around tile-based hardware. I could describe these problems in more detail if you indicate interest in a follow-up posting, but I don't have the time now in the middle of the day.
Pixel and vertex shaders are at least relatively innovative. If they can figure out how to tie together not just 2 or 4, but 8 or 32 of them in a simple, yet flexible and comprehensible way (I saw Pat Hanrahan give a proposal on how to do this at Eurographics a couple years ago) that makes it easier for developers to use them, that'd be an innovation in parallelism that really pays off IMHO.
--LP
Disclaimer: Any 3D expertise I have is a bit rusty. Feel free to correct any technical misstatements.
Nonsense, who modded this to 5?? This guy doesn't have a clue. This card is the fastest; the policy of whatever works should apply, and will ultimately win in the market. People have tried deferred shading and tiled approaches, and while the NVIDIA system is not a scanline approach, it is not the scheme you probably envision, and that's WHY it's the fastest. The other approaches failed, and many of the people who worked on them now work for NVIDIA. There are hundreds of engineers at NVIDIA who make these design decisions based on what will work in terms of power requirements, implementation, programmability, speed and a host of other reasons. NVIDIA leads in performance because they get this right. Programmers DO know how to use the programmable shaders, but there are other more traditional ways to use this hardware, and the other pixel pipeline will help even simple multitexture applications too. Even scanline systems can scale very nicely, so the claimed scalability advantage of the tiled approach is just not true. You seem to have forgotten Voodoo SLI, but there are other ways to scale graphics systems too. Your post is a plea to support your pet favourite graphics scheme, but there are detailed technical issues to be considered beyond the glib appeal to emotion. The facts and NVIDIA's performance speak for themselves, and your post is the graphics equivalent of complaining that Ford doesn't make water powered cars.
More shaders, More pixel pipelines, More memory bandwidth... whoopee...
When the hell are they going to ditch the antiquated scanline rendering method and go work on some tile based rendering methods?
Probably never, and for very good reason. Tile-based rendering is a very efficient architecture whose time has already come and gone.
For those who don't know, tile-based rendering divides an image up into a number of smaller squares ("tiles") and renders them independently, as opposed to the traditional method ("immediate-mode rendering") of rendering an image one polygon at a time. The major benefits claimed for tile-based renderers are that the process is more parallelizable (no risk of two chips rendering to the same area if they are working on different tiles) and that it is an easy modification to check each polygon's z-value (its distance from the camera) as you add it to the poly-list for its tile, and then to only texturize those polygons which are not occluded (i.e. actually visible). This is in contrast to the traditional immediate-mode rendering algorithm, where polygons are textured more or less in random order, leading to situations where a polygon will go through the entire process of being textured and rendered, only to later be completely covered up by a later poly--a situation which wastes a lot of (especially) memory bandwidth, fetching all those useless textures and such.
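To make the difference concrete, here is a toy sketch (not how any real chip works). It assumes axis-aligned rectangles stand in for polygons, a 32-pixel tile, and one "texture fetch" per textured pixel; the names bin_rects, shade_tile and immediate_fetches are made up for illustration.

```python
from collections import defaultdict

TILE = 32  # tile edge in pixels (an assumed size, for illustration only)

def bin_rects(rects, tile=TILE):
    """Assign each screen-space rectangle to every tile it overlaps.

    rects: list of (x0, y0, x1, y1, depth, name); x1 and y1 are exclusive.
    Returns {(tile_x, tile_y): [rect, ...]}.
    """
    bins = defaultdict(list)
    for rect in rects:
        x0, y0, x1, y1, _depth, _name = rect
        for ty in range(y0 // tile, (y1 - 1) // tile + 1):
            for tx in range(x0 // tile, (x1 - 1) // tile + 1):
                bins[(tx, ty)].append(rect)
    return bins

def shade_tile(tx, ty, rects, tile=TILE):
    """Resolve visibility for one tile first, then texture only the winners.

    Returns (visible, texture_fetches), where visible maps pixel -> name.
    """
    nearest = {}  # pixel -> (depth, name); stands in for on-chip tile memory
    for x0, y0, x1, y1, depth, name in rects:
        for y in range(max(y0, ty * tile), min(y1, (ty + 1) * tile)):
            for x in range(max(x0, tx * tile), min(x1, (tx + 1) * tile)):
                if (x, y) not in nearest or depth < nearest[(x, y)][0]:
                    nearest[(x, y)] = (depth, name)
    visible = {p: name for p, (_d, name) in nearest.items()}
    return visible, len(visible)  # one fetch per covered pixel, no overdraw

def immediate_fetches(rects):
    """Immediate mode, submission order: texture every pixel that passes the z-test."""
    zbuf = {}
    fetches = 0
    for x0, y0, x1, y1, depth, _name in rects:
        for y in range(y0, y1):
            for x in range(x0, x1):
                if (x, y) not in zbuf or depth < zbuf[(x, y)]:
                    zbuf[(x, y)] = depth
                    fetches += 1
    return fetches

# Two full-screen walls on a 64x64 screen, submitted back to front (worst case).
scene = [
    (0, 0, 64, 64, 10.0, "far wall"),
    (0, 0, 64, 64, 1.0, "near wall"),
]
tiled = sum(shade_tile(tx, ty, rs)[1] for (tx, ty), rs in bin_rects(scene).items())
immediate = immediate_fetches(scene)
print(tiled, immediate)
```

With this scene the tiled path textures each pixel once while the immediate path textures the hidden far wall too. Submit the same walls front to back and the immediate path does no extra work, which is why submission order matters so much for it.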
Cool! Sounds great! Let's hear it for tile-based rendering! Too bad ATI and NVIDIA have clearly never ever heard of this miracle technique! After all, it's not like they would ever make (gasp!) an informed choice not to use it!
Well...not so fast. Basically what we've seen is that tile-based rendering offers two potential benefits: it eliminates *some* of the complexity of enabling multi-GPU implementations, and it uses quite a bit less memory bandwidth in the base case. The problem is that both of these supposed benefits really buy you very little when designing a consumer-level graphics card today.
First, the problem of "dividing up the work" isn't really what's preventing multi-chip graphics cards these days. Indeed, it's really a rather easy problem. Here's a clue: have alternate chips render alternate frames. Gee...that wasn't so tough, now was it? Well, no. But the other problems of implementing a multi-chip card for the consumer market sure are. For example, we have our choice of implementing an (expensive, performance gating) point-to-point bus to handle memory traffic (and have memory bandwidth/chip cut in half anyways), or of completely mirroring the memory, using twice as much for the same capacity (expensive). Then there's the cost of a second chip (expensive), the cost of packaging the second chip and connecting it to memory (expensive), and the cost of the extra power and cooling, the cost of trying to squeeze it all onto one card (results in a bigger, more expensive card; may gate clockability). And this is without mentioning the extra development and debugging time that goes into getting a multi-chip solution to work correctly. (In general this is one of the most difficult issues design engineers face.) Golly, it's almost enough to make you remember how when 3dfx tried to make a multi-chip product it was 6 months late, the single-chip card was far too slow, the double-chip (and cancelled quad-chip) card too expensive, and, due to the release delay, no longer competitive. (OTOH John C has hinted that a scalable multi-chip architecture might be on the way from one of the major players. Tie that in with the fact that Anand reports the GF4 will be the last to use the GF name, and that NVIDIA owns the remnants of 3dfx, and I start scratching my head...)
Second, the problem of memory bandwidth. Or rather, the former problem of memory bandwidth. Yes, the traditional rendering pipeline is very inefficient with memory bandwidth. Thing is, the prices on high-speed DDR have been coming down so fast that it hardly matters. You can find a Radeon 7500 with 64MB of 128-bit-wide DDR running at 2x230 MHz (i.e. 7.4GB/s bandwidth) for as low as $85 on pricewatch.com. (Actually there's one for $79 but it may be mislabeled.) The memory is probably less than $30 of the cost. Or maybe even less--the 64MB and 32MB GF2Pros (6.4GB/s bandwidth) only differ by $6. And the new GF4 MX460 hits the street with 64MB of 2x275 MHz DDR (8.8GB/s) for $179, list, on a brand new card.
As for the price premium of using relatively high-speed DDR instead of the same amount of SDRAM, it's pretty negligible. Even for the highest speed DDR it's not such a big deal. Sure NVIDIA charges an extra $100 for another 25MHz on the GPU and an extra 1.6GB/s from the memory (GF4 Ti4600 vs. Ti4400), but that doesn't mean it costs them anywhere near that much. (depending on GPU yields) It just means they like to bilk those in the $400-for-a-video-card crowd for the full $400. So how much does the stuff cost? Well...Hynix recently announced samples and volume production of 2x375 MHz x32 DDR selling at $10 for 128Mbit chips. That means $40 for 64MB of 128-bit-wide DDR with 12GB/s bandwidth. Not too shabby.
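For anyone who wants to check the bandwidth figures, the arithmetic is just bus width times clock times two transfers per clock. A quick sketch, using only the bus widths, clocks and the $10 chip price quoted above; the function name ddr_bandwidth_gb_s is made up, and 1 GB is taken as 10**9 bytes:

```python
def ddr_bandwidth_gb_s(bus_width_bits, clock_mhz, transfers_per_clock=2):
    """Peak memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * clock_mhz * 1e6 * transfers_per_clock / 1e9

cards = {
    "Radeon 7500 (128-bit, 2x230 MHz)": ddr_bandwidth_gb_s(128, 230),
    "GF4 MX460 (128-bit, 2x275 MHz)": ddr_bandwidth_gb_s(128, 275),
}

# Hynix x32 parts: four chips side by side make a 128-bit bus.
chips = 128 // 32
hynix_bandwidth = ddr_bandwidth_gb_s(chips * 32, 375)
hynix_capacity_mb = chips * 128 / 8   # 128 Mbit per chip -> 16 MB per chip
hynix_cost_usd = chips * 10           # $10 per chip, as quoted

for name, gb_s in cards.items():
    print(f"{name}: {gb_s:.2f} GB/s")
print(f"Hynix: {hynix_capacity_mb:.0f} MB, {hynix_bandwidth:.1f} GB/s, ${hynix_cost_usd}")
```

These are peak numbers; what a card actually sustains depends on access patterns and the memory controller.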
Ok, ok...so maybe the benefits of tile-based rendering don't really mean all that much in today's consumer GPU market. But better is better: why wouldn't ATI and NVIDIA use tile-based architectures for the benefits they do provide? After all, it's not like there might be some (gasp!) downsides to tile-based rendering!
Well, actually, there are. For one thing, it's more difficult to design a tile-based GPU and get it running at high speeds. For another both NVIDIA and ATI have years and years of research and experience with implementation techniques and algorithms for immediate-mode renderers, much of which wouldn't apply to tile-based designs.
For another, neither ATI nor NVIDIA really uses traditional immediate-mode rendering anymore. Instead they use modified immediate-mode rendering, with lots of algorithmic tricks and tweaks to lessen the memory bandwidth inefficiencies of traditional immediate-mode rendering. Things like lossless z-buffer compression and various early polygon-culling algorithms. No they aren't quite as effective in reducing overdraw as tile-based rendering, but they provide quite a significant benefit. Indeed, the GF4 Ti4600 has more or less caught up with the (tile-based) KyroII in Kyro's own villagemark benchmark, which is contrived entirely to test massive overdraw of the sort which is never encountered in a game. The KyroII is only 8 months old. Sure it's much much cheaper than a Ti4600, but if Kyro can barely keep the lead in the one benchmark specially designed to make the case for tile-based rendering then something is wrong here.
Meanwhile there are very serious issues with the ability of tile-based rendering to scale to meet future challenges. In particular, the tile-based rendering algorithm works very naturally so long as there are no polygons which find themselves spread into more than one tile, and so long as you don't use transparent or translucent textures. Of course it's not that tile-based chips can't handle these situations--the KyroII is here and works just fine, after all--but just that they require complicated workarounds which are more inefficient than for immediate-mode rendering, which handles these cases naturally.
The problem is that both cases are going to be more and more likely as graphics continue to improve. As tile-based rendering tries to scale with increasing scene polygon counts and resolutions, you get more tiles per scene and many more polygons crossing tile boundaries. And as graphical effects get more realistic, the alpha channel (i.e. transparency) starts coming into play more and more. Indeed much of the recent research in non-real-time computer graphics has focused on adding translucent "subsurface" reflections to the ray-tracing algorithm. This (and approximations of it) is the sort of thing that future pixel shaders are going to be called on to do, and tile-based rendering is a bad match for it.
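Here's a rough way to see the boundary-crossing effect. This sketch assumes polygons can be approximated by 20x20-pixel bounding boxes laid out on a regular grid over a 640x480 screen, and that only the tile size changes; all of those numbers, and the function names, are made up for illustration.

```python
def tiles_touched(bbox, tile):
    """Number of tiles overlapped by a bounding box with inclusive pixel bounds."""
    x0, y0, x1, y1 = bbox
    return (x1 // tile - x0 // tile + 1) * (y1 // tile - y0 // tile + 1)

def crossing_fraction(bboxes, tile):
    """Fraction of boxes that land in more than one tile."""
    crossing = sum(1 for b in bboxes if tiles_touched(b, tile) > 1)
    return crossing / len(bboxes)

# Hypothetical scene: 20x20-pixel boxes every 23 pixels across 640x480.
bboxes = [(x, y, x + 19, y + 19)
          for y in range(0, 460, 23)
          for x in range(0, 620, 23)]

for tile in (64, 32, 16):
    print(f"{tile}x{tile} tiles: {crossing_fraction(bboxes, tile):.0%} of boxes cross a boundary")
```

Every box that straddles a boundary has to be listed (and its setup repeated) in each tile it touches, so the fraction printed here is a crude proxy for the duplicated work. Real scenes have a mix of polygon sizes, so treat the trend, not the exact numbers, as the point.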
Indeed, most of the recent advances in graphics are pointing towards a world in which the assumptions which tile-based rendering is based on no longer hold. How, for example, does tile-based rendering handle cubic environment mapping across tile boundaries, or cast dynamic shadows across tile boundaries? What happens if a dot3 bump map extends a texture from one tile into another? I'm sure clever solutions can be found to these and all the other dozens and dozens of issues that will arise when you try to mix DX8-style effects and tile boundaries, but the main point is that tile-based rendering was an algorithm developed under two assumptions which increasingly do not hold:
1) If one polygon occludes another, the other's texture will never be visible to the camera;
2) Objects in one section in the screen can be rendered without reference to any other parts of the screen.
Of course, we may never know the difficulties of trying to make a DX8-compliant tile-based renderer; after all, the KyroII hasn't even made it to DX7, since it is still missing integrated T&L. I have no idea whether this is because of any difficulties integrating T&L with a tile-based rendering pipeline (can't think of why it would be a problem, but it may be), or just because the Kyro doesn't have the money or manpower behind it to keep up with 3 year old technology, but this lack is already preventing the KyroII from competing effectively with the cheaper GF2MX on modern high-poly games. I am pretty sure that integrating a programmable pixel shader into a tile-based architecture would be pretty tough, if not pretty impossible.
Which brings me to the main point: you started out writing "More shaders, More pixel pipelines, More memory bandwidth... whoopee..." and in a sense, this is the right attitude. To which we should very quickly add "tile-based screen division...deferred rendering algorithm...whoopee..." All these technical details only mean something insofar as they give us the capability for more realistic graphics--this means high FPS, high color depth, higher resolutions, lack of aliasing problems, high-quality mip-mapping/anisotropic filtering, realistic--or even dynamic--lighting and shadows, realistic and/or impressive pixel effects, high polygon counts, useful and realistic vertex effects, etc.--for a reasonable price. It is pretty damn hard to argue that the last few years, under NVIDIA's leadership (and ATI's pursuit) have not resulted in huge improvements on these measures. Again, the new GF4 Ti4600 may be ridiculously expensive and may not change your experience with today's games very much (besides enabling 1600x1200x32 with 4xAA at playable framerates), but when the new Doom game comes out, a card with similar specs and selling for ~$100 will bring you decent performance on an engine which offers a totally new level of graphical realism. Same thing when Unreal Warfare, Unreal 2, Deus Ex 2, and all the other Unreal 2-engine games start coming out. Believe me, a GF4 caliber card will improve the experience of playing those and later games significantly over a GF3 and especially a non-DX8 compliant card like a GF2 (and, sadly, a GF4MX). And, believe me, those games are going to provide significantly more realistic graphical experiences than those of today.
Immediate-mode rendering is doing just fine, and the GF4 marks an evolutionary but very significant improvement to the state-of-the-art. A switch to tile-based would require significant retreading to reach the same level, and might form a poorer basis for future improvements. But, if I'm wrong, then ATI and NVIDIA will make the switch. Believe me, they know all about tile-based rendering, and NVIDIA even owns Gigapixel (via 3dfx) and their tile-based rendering engine. I think they'll stick to modifications of immediate-mode rendering, but no matter what they do it will be whatever they think offers the best graphics performance at the lowest cost to them.
And now to correct some minor misconceptions in your post:
Hell, the reason why the Geforce line has to keep doubling its fill rates every generation is because its architecture is so god damn inefficient. Look at the memory bandwidth requirements for the cards!
The reason the GeForce line increases its texel fill rates continually is because consumers want to run new games which have higher multi-texturing requirements (Carmack has said Doom3 will have something like ~8 textures/pixel), and to run existing games in higher resolutions and at higher FPS.
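A back-of-the-envelope sketch of why texel fill rate has to keep climbing. The ~8 textures/pixel figure is the one attributed to Carmack above; the 60 FPS target and the overdraw factor of 1.0 are my assumptions, and texel_rate is a made-up helper name.

```python
def texel_rate(width, height, fps, textures_per_pixel, overdraw=1.0):
    """Texels per second needed to texture every pixel of every frame."""
    return width * height * fps * textures_per_pixel * overdraw

# Assumed target of 60 FPS with no overdraw at all (optimistic).
for width, height in ((1024, 768), (1600, 1200)):
    mtexels = texel_rate(width, height, 60, 8) / 1e6
    print(f"{width}x{height} @ 60 FPS, 8 textures/pixel: {mtexels:.1f} Mtexels/s")
```

Any real overdraw multiplies these numbers, so this is a floor rather than an estimate.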
The memory bandwidth "requirements" for the cards don't matter, only the prices. If a recent card with 7.4GB/s only costs $85 (Radeon 7500) and a brand new card with 8.8GB/s lists for $179, then the costs of increasing memory bandwidth are obviously not so terrible. Today's $400 card is next year's $80 card. Similarly, immediate-mode rendering's inefficiencies need to be measured according to their dollar costs, not their bandwidth costs.
Instead of using the relatively limited bandwidth of AGP for streaming textures from main memory (where it should god damn be) to the texture cache, the card is busy wasting bandwidth on the damn Z-buffer (which would be eliminated if they implemented hidden surface removal like the PowerVR chipsets).
???
First off, textures most certainly should not "god damn be" in main memory! The AGP bus is there to stream vertex data from the CPU (pre- or post-transformation, it's the same amount of data). That's all it's there to do, and good thing, too, because today's high-poly games can already generate enough vertex data to make AGP 2x a bottleneck, and those a couple of years from now will do the same to AGP 4x. (Which is why AGP 8x is on the horizon.) Increasing the bandwidth of a bus from the northbridge across the motherboard through a slot to an add-on card is a whole lot harder than increasing the bandwidth from soldered DDR to a soldered GPU a few centimeters away. AGP should only carry the data which it absolutely is forced to--namely initial vertex data from the game's engine running on the CPU.
Z-buffer lookups only waste bandwidth between the GPU and the on-card memory. Technically, you don't eliminate z-buffer lookups with a tile-based architecture; you eliminate texture lookups (and texture application) on occluded polygons. However, by dealing with a small tile at a time, you can read all the z-buffer data for the tile in from memory all at once, and store it in an on-chip cache until you're done with that tile. (This is essentially why higher poly-count games mean smaller and smaller tiles.)
And last, they do implement hidden surface removal techniques, like I pointed out before, even though they are less effective than with a tile-based architecture.