ATI Radeon X800 Pro and XT Platinum Edition: R420 Arrives
by Derek Wilson on May 4, 2004 10:28 AM EST - Posted in GPUs
Depth and Stencil with Hyper Z HD
In accordance with their "High Definition Gaming" theme, ATI is calling the R420's method of handling depth and stencil processing Hyper Z HD. Depth and stencil processing is handled at multiple points throughout the pipeline, but grouping all this hardware into one block can make sense as each step along the way will touch the z-buffer (an on die cache of z and stencil data). We have previously covered other incarnations of Hyper Z which have done basically the same job. Here we can see where the Hyper Z HD functionality interfaces with the rendering pipeline:
The R420 architecture implements both hierarchical z and early z occlusion culling in the rendering pipeline.
With early z, as data emerges from the geometry processing portion of the GPU, it is possible to skip further rendering of large portions of the scene that are occluded (or covered) by other geometry. In this way, pixels that won't be seen don't need to run through the pixel shader pipelines and waste precious resources.
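To make the savings concrete, here is a minimal Python sketch of the idea, not a model of ATI's hardware. It counts how many fragments would reach the pixel shader with and without an early depth check. The function name, the fragment format `(x, y, z)`, and the convention that smaller z means closer to the viewer are all assumptions made for illustration.

```python
def shaded_fragment_count(fragments, width, height, early_z):
    """Count fragments that reach the (hypothetical) pixel shader stage."""
    # Depth buffer cleared to the far plane; smaller z is closer (assumption).
    zbuffer = [[1.0] * width for _ in range(height)]
    shaded = 0
    for x, y, z in fragments:
        if early_z and z >= zbuffer[y][x]:
            continue  # occluded: rejected before any shading work is done
        shaded += 1  # the pixel shader cost is paid here
        if z < zbuffer[y][x]:  # late depth test before the framebuffer write
            zbuffer[y][x] = z
    return shaded


# A near wall (z = 0.2) drawn first, then a far object (z = 0.8) behind it.
fragments = [(0, 0, 0.2), (1, 0, 0.2), (0, 0, 0.8), (1, 0, 0.8)]
with_early_z = shaded_fragment_count(fragments, 2, 1, early_z=True)
without_early_z = shaded_fragment_count(fragments, 2, 1, early_z=False)
```

In this toy scene, the two hidden fragments are shaded only when the early check is turned off. Note that the benefit depends on draw order: if the far object were drawn first, it would already have been shaded by the time the wall arrived.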
Hierarchical z indicates that large blocks of pixels are checked and thrown out if the entire tile is occluded. In R420, these tiles are the very same ones output by the geometry and setup engine. If only part of a tile is occluded, smaller subsections are checked and thrown out if possible. This processing doesn't eliminate all the occluded pixels, so pixels coming out of the pixel pipelines also need to be tested for visibility before they are drawn to the framebuffer. The real difference between R3xx and R420 is in the number of pixels that can be gracefully handled.
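The tile-then-subtile rejection described above can be sketched roughly as follows. This is an illustrative Python model under stated assumptions, not R420's actual implementation: the function names are hypothetical, tiles are square with power-of-two sizes, incoming geometry is treated as a flat surface at a single depth, smaller z is closer, and the farthest depth in a tile is recomputed by scanning (real hardware would keep such a value cached per tile).

```python
def tile_max_depth(zbuffer, x0, y0, size):
    """Farthest depth currently stored anywhere in a square tile."""
    return max(zbuffer[y][x]
               for y in range(y0, y0 + size)
               for x in range(x0, x0 + size))


def hier_z_survivors(zbuffer, x0, y0, size, incoming_z, min_size=2):
    """Pixels left after coarse rejection of tiles and their subsections."""
    if incoming_z >= tile_max_depth(zbuffer, x0, y0, size):
        return []  # the whole tile is occluded: throw it out in one test
    if size <= min_size:
        # Smallest block we check as a unit; keep all of its pixels.
        return [(x, y)
                for y in range(y0, y0 + size)
                for x in range(x0, x0 + size)]
    half = size // 2
    survivors = []
    for dy in (0, half):
        for dx in (0, half):
            survivors += hier_z_survivors(zbuffer, x0 + dx, y0 + dy,
                                          half, incoming_z, min_size)
    return survivors


def final_depth_test(zbuffer, pixels, incoming_z):
    """Per-pixel visibility test applied after the pixel pipelines."""
    return [(x, y) for (x, y) in pixels if incoming_z < zbuffer[y][x]]


# 4x4 tile: a near occluder (z = 0.2) fills the top-left 2x2 block and
# also covers the single pixel at x = 3, y = 0.
zbuffer = [
    [0.2, 0.2, 1.0, 0.2],
    [0.2, 0.2, 1.0, 1.0],
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 1.0],
]
coarse = hier_z_survivors(zbuffer, 0, 0, 4, incoming_z=0.5)
visible = final_depth_test(zbuffer, coarse, incoming_z=0.5)
```

Here the top-left block is discarded in a single test, but the lone occluded pixel at (3, 0) slips through the coarse stage because the rest of its block is visible. It is only caught by the final per-pixel test, which is why that last check is still needed.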
As rasterization draws nearer, the ATI and NVIDIA architectures begin to differentiate themselves more. Both claim that they are able to calculate up to 32 z or stencil operations per clock, but the conditions under which this is true are different. NV40 is able to push two z/stencil operations per pixel pipeline during a z or stencil only pass or in other cases when no color data is being dealt with (the color unit in NV40 can work with z/stencil data when no color computation is needed). By contrast, R420 pushes 32 z/stencil operations per clock cycle when antialiasing is enabled (one z/stencil operation can be completed per clock at the end of each pixel pipeline, and one z/stencil operation can be completed inside the multisample AA unit).
The different approaches these architectures take mean that each will excel in different ways when dealing with z or stencil data. Under R420, z/stencil speed will be maximized when antialiasing is enabled and will only see 16 z/stencil operations per clock under non-antialiased rendering. NV40 will achieve maximum z/stencil performance when a z/stencil only pass is performed regardless of the state of antialiasing.
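The peak figures quoted above reduce to simple arithmetic, sketched below in Python. The count of 16 pixel pipelines is an assumption inferred from the 32 operations per clock figure at two operations per pipeline, and the function names are hypothetical. These are theoretical per-clock peaks, not measured performance.

```python
PIPELINES = 16  # assumption: inferred from 32 ops/clock at 2 ops per pipeline


def r420_z_ops_per_clock(antialiasing, pipelines=PIPELINES):
    """One z/stencil op at the end of each pipeline, plus one in the
    multisample AA unit when antialiasing is enabled."""
    per_pipeline = 2 if antialiasing else 1
    return pipelines * per_pipeline


def nv40_z_ops_per_clock(z_or_stencil_only_pass, pipelines=PIPELINES):
    """Two z/stencil ops per pipeline when no color data is being handled,
    one otherwise, regardless of antialiasing."""
    per_pipeline = 2 if z_or_stencil_only_pass else 1
    return pipelines * per_pipeline


# Peak z/stencil operations per clock in each of the four scenarios.
peaks = {
    "R420, AA on": r420_z_ops_per_clock(antialiasing=True),
    "R420, AA off": r420_z_ops_per_clock(antialiasing=False),
    "NV40, z/stencil-only pass": nv40_z_ops_per_clock(True),
    "NV40, color pass": nv40_z_ops_per_clock(False),
}
```

Both parts reach the same peak of 32 and the same floor of 16; what differs is which condition doubles the rate.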
The average case for NV40 will be closer to 16 z/stencil operations per clock, and if users don't run antialiasing on R420 they won't see more than 16 z/stencil operations per clock. Really, if everyone begins to enable antialiasing, R420 will begin to shine in real world situations, and if developers embrace z or stencil only passes (such as in Doom III), NV40 will do very well. The bottom line on which approach is better will be defined by the direction the users and developers take in the future. Will enabling antialiasing win out over running at ultra-high resolutions? Will developers mimic John Carmack and the intensive shadowing capabilities of Doom III? Both scenarios could play out simultaneously, but, really, only time will tell.
95 Comments
l3ored - Tuesday, May 4, 2004 - link
only the 800xt was winning, the pro usually came after the 6800's
Keeksy - Tuesday, May 4, 2004 - link
Yeah, it is funny how ATi excels in DirectX, yet loses in the OpenGL benchmarks. Looks like I'm going to have both an NVIDIA and an ATi card. The first to play Doom3, the other to play HL2.
peroni - Tuesday, May 4, 2004 - link
I wish there was some testing done with overclocking.
There are quite a few spelling errors in there Derek.
Did I miss something or I did not see any mention of prices for these 2 cards?
Glitchny - Tuesday, May 4, 2004 - link
#11 thats what everyone thought when Nvidia bought all the people from 3dFX and look what happened with that.
araczynski - Tuesday, May 4, 2004 - link
i agree with 5 and 10, still the same old stalemate as before, one is good at one thing, the other is good at another. i guess i'll let price dictate my next purchase.
but ati sure did take the wind out of nvidia's sails with these numbers.
i wish one of the two would buy the other one out and combine the technologies, one would think they would have a nice product in the end.
eBauer - Tuesday, May 4, 2004 - link
#8 - OpenGL still kicks butt on the nVidia boards. Think of all the Doom3 fans that will buy the 6800's....
As for myself, I will wait and see how the prices pan out. For now leaning on the X800.
ViRGE - Tuesday, May 4, 2004 - link
...On the virge of ATI's R420 GPU launch...
Derek, I'm so touched that you thought of me. ;)
Tallon - Tuesday, May 4, 2004 - link
Ok, so let's review. with the x800XT having better image quality, better framerates, only taking up one slot for cooling and STILL being cooler, and only needing one molex connecter (uses less power than the 9800 XT, actually), who in their right mind would choose a 6800u over this x800XT? I mean, seriously, NVIDIA is scrambling to release a 6850u now which is exactly identical to a 6800u, it's just overclocked (which means more power and higher temperatures). This is ridiculous. ATI is king.
noxipoo - Tuesday, May 4, 2004 - link
ATi wins again.
Akaz1976 - Tuesday, May 4, 2004 - link
Dang! On one hand, I am saddened by the review. My recently purchased (last month) Radeon9800PRO would be at the bottom of the chart in most of the tests carried out in this review :(
On the other hand this sure bode well for my next vid card upgrade. Even if it is a few months off! :)
Akaz