DirectCompute, OpenCL, and the Future of CAL

As a journalist, I find GPGPU one of the more frustrating topics to cover. The concept is great, but the execution makes it difficult to cover accurately, a problem exacerbated by the fact that until now AMD and NVIDIA each had separate APIs. OpenCL and DirectCompute will unify things, but software will be slow to arrive.

As it stands, neither AMD nor NVIDIA has a complete OpenCL implementation that's shipping to end-users for Windows or Linux. NVIDIA has OpenCL working on the 8-series and later on Mac OS X Snow Leopard, and AMD has it working under the same OS for the 4800 series, but for obvious reasons we can’t test a 5870 in a Mac. As such it won’t be until later this year that we see either side get OpenCL up and running under Windows. Both NVIDIA and AMD have development versions that they're letting developers play with, and both have submitted implementations to Khronos, so hopefully we’ll have something soon.

It’s also worth noting that OpenCL is based around DirectX 10 hardware, so even after someone finally ships an implementation we’re likely to see a new version in short order. AMD is already talking about OpenCL 1.1, which would add support for the hardware features that they have from DirectX 11, such as append/consume buffers and atomic operations.

DirectCompute is in comparatively better shape. NVIDIA already supports it on their DX10 hardware, and the beta drivers we’re using for the 5870 support it on the 5000 series. The missing link at this point is AMD’s DX10 hardware; even the beta drivers we’re using don’t support it on the 2000, 3000, or 4000 series. From what we hear the final Catalyst 9.10 drivers will deliver this feature.

Going forward, one specific issue for DirectCompute development will be that there are three levels of DirectCompute, derived from DX10 (4.0), DX10.1 (4.1), and DX11 (5.0) hardware. The higher the version the more advanced the features, with DirectCompute 5.0 in particular being a big jump as it’s the first hardware generation designed with DirectCompute in mind. Among other notable differences, it’s the first version to offer double precision floating point support and atomic operations.

AMD is convinced that developers should and will target DirectCompute 5.0 due to its feature set, but we’re not sold on the idea. To say that there’s a “lot” of DX10 hardware out there is a gross understatement, and all of that hardware is capable of supporting at a minimum DirectCompute 4.0. Certainly DirectCompute 5.0 is the better API to use, but the first developers testing the waters may end up starting with DirectCompute 4.0. Releasing something written in DirectCompute 5.0 right now won’t do developers much good, because so little of the hardware out there can support it.
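
To make that trade-off concrete, here is a minimal sketch in Python of how a developer might pick a target level for a given mix of hardware. The level names, hardware generations, and the double precision/atomics flag come from the discussion above; the HLSL compute shader profile names (cs_4_0, cs_4_1, cs_5_0) are the standard ones. The `ComputeLevel` type and `highest_common_level` function are hypothetical names invented for this illustration, and the sketch assumes newer hardware can always run code written for a lower level.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputeLevel:
    name: str                  # DirectCompute level as named in the article
    hardware: str              # Direct3D hardware generation it derives from
    shader_profile: str        # HLSL compute shader profile for that level
    doubles_and_atomics: bool  # per the article, first offered at 5.0


# Ordered from least to most capable.
LEVELS = (
    ComputeLevel("4.0", "DX10", "cs_4_0", False),
    ComputeLevel("4.1", "DX10.1", "cs_4_1", False),
    ComputeLevel("5.0", "DX11", "cs_5_0", True),
)


def highest_common_level(hardware_generations):
    """Return the most capable level that every listed GPU generation can run.

    Assumption: newer hardware also runs programs written for lower levels.
    """
    rank = {level.hardware: i for i, level in enumerate(LEVELS)}
    lowest = min(rank[hw] for hw in hardware_generations)
    return LEVELS[lowest]
```

For an audience that mixes DX10 and DX11 cards, `highest_common_level(["DX10", "DX11"])` returns the 4.0 entry, which is the position the first developers testing the waters are likely to find themselves in.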

With that in mind, there’s not much of a software situation to speak about when it comes to DirectCompute right now. Cyberlink demoed a version of PowerDirector using DirectCompute for rendering effects, but it’s the same story as most DX11 games: later this year. For AMD there isn’t as much of an incentive to push non-game software as fast or as hard as DX11 games, so we’re expecting any non-game software utilizing DirectCompute to be slow to materialize.

Given that DirectCompute is the only common GPGPU API that is currently working on both vendors’ cards, we wanted to try to use it as the basis of a proper GPGPU comparison. We did get something that would accomplish the task; unfortunately, it was an NVIDIA tech demo. We have decided to run it anyhow as it’s quite literally the only thing we have right now that uses DirectCompute, but please take an appropriately sized quantity of salt – it’s not really a fair test.

NVIDIA’s ocean demo is a fairly simple proof of concept program that uses DirectCompute to run Fast Fourier transforms directly on the GPU for better performance. The FFTs in turn are used to generate the wave data, forming the wave action seen on screen as part of the ocean. This is a DirectCompute 4.0 program, as it’s intended to run on NVIDIA’s DX10 hardware.
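
For readers unfamiliar with the technique, the idea of turning a wave spectrum into surface heights with an inverse FFT can be sketched on the CPU in a few lines of Python. This is not NVIDIA's code and says nothing about how the demo is actually written: the one-dimensional setup, the `1/k` amplitude falloff, and the function names are all assumptions made for illustration, and the real demo runs its transforms on the GPU through DirectCompute.

```python
import cmath
import math
import random


def fft(values, inverse=False):
    """Recursive radix-2 Cooley-Tukey FFT; len(values) must be a power of two.

    The inverse transform is left unnormalized (no division by n).
    """
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def wave_heights(n=64, seed=1):
    """Build a random wave spectrum and convert it into n surface heights."""
    rng = random.Random(seed)
    spectrum = [0j] * n  # bin 0 stays zero, so the surface averages to zero
    for k in range(1, n // 2):
        amplitude = 1.0 / k  # hypothetical falloff: longer waves are taller
        phase = rng.uniform(0.0, 2.0 * math.pi)
        spectrum[k] = amplitude * cmath.exp(1j * phase)
        spectrum[n - k] = spectrum[k].conjugate()  # mirror so heights are real
    return [h.real for h in fft(spectrum, inverse=True)]
```

Animating the surface would mean advancing each phase over time and rerunning the inverse transform every frame, which is the kind of repeated, highly parallel work that suits a GPU.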

The 5870 has no problem running the program, and in spite of whatever home field advantage may exist for NVIDIA, it easily outperforms the GTX 285. Things get a little crazier once we start using SLI/Crossfire; the 5870 picks up speed, but the GTX 295 ends up being slower than the GTX 285. As it’s only a tech demo this shouldn’t be dwelt on too much beyond the fact that it’s proof that DirectCompute is indeed working on the 5800 series.

Wrapping things up, one of the last GPGPU projects AMD presented at their press event was a GPU implementation of Bullet Physics, an open source physics simulation library. Although they’ll never admit it, AMD is probably getting tired of being beaten over the head by NVIDIA and PhysX; Bullet Physics is AMD’s proof that they can do physics too. However, we don’t expect it to go anywhere given its very low penetration in existing games and the amount of trouble NVIDIA has had in getting developers to use anything besides Havok. Our expectations for GPGPU physics remain the same: the unification will come from a middleware vendor selling a commercial physics package. If it’s not Havok, then it will be someone else.

Finally, while AMD is hitting the ground running for OpenCL and DirectCompute, their older APIs are being left behind as AMD has chosen to focus all future efforts on OpenCL and DirectCompute. Brook+, AMD’s high level language, has been put out to pasture as a SourceForge project. Compute Abstraction Layer (CAL) lives on since it’s what AMD’s OpenCL support is built upon; however, it’s not going to see any further public development, with the interface frozen at the current 1.4 standard. AMD is discouraging any CAL development in favor of OpenCL, although it’s likely the High Performance Computing (HPC) crowd will continue to use it in conjunction with AMD’s FireStream cards to squeeze every bit of performance out of AMD’s hardware.

327 Comments

  • SiliconDoc - Thursday, September 24, 2009

    Oh really ? Now wait a minute, spin master. When the site here whined about "paper launch" it was Derek who brought up a two or three year old nvidia card, and cried and whined about it. Then speculated the GTX275 was paper, and then "a phantom card".
    Well, that didn't happen.... no apologies about it ever either.
    ---
    The PAPER launches of late are ATI ATI ATI ! ! !
    We have the 4770, and now this one !
    ----
    Gee, when ATI BLOWS IT, we suddenly talk in vague terms about "the companies" having "papery launches" as " the general rule of thumb of how it's done.." - and that makes us "not a fan boy!??!"
    R0FLMAO !!!!
    Yes, of course, since the red ati is bleeding paper launches and the last one from nvidia one can actually cite is YEARS AND YEARS ago, yes, of course, you're correct, it's "unnamed companies in the multiple" that "do it"....
    ---
    I swear to god, I cannot even believe the massive brainwashing that is a gigantic pall all over the place.
    ---
    If I'm WRONG, please be kind, and tell me what nvidia paper launches I missed.... PLEASE LET ME KNOW.
  • Genx87 - Wednesday, September 23, 2009

    Is\was a good idea to shoot for as this is most certainly what Nvidia is going to attempt to achieve. But I am a bit disappointed this card rarely achieved it.

    I do like angle independent AF though. Should be interesting to see what Nvidia brings to the table. But kind of like the CPU situation(i5) I am kind of meh. But will say this has more potential compared to its predecessor than the i5 series does compared to Core 2 Duo.
  • SiliconDoc - Wednesday, September 23, 2009

    I thought that was just great, those pretty pictures, and then I get to reading. I see the 4890 and SQUARES. I see the GTX285, with CIRCLES and an outer rounded octagon.
    Then the 5870- and it's "perfectly round" angle independent algorithm, but I still see some distortions.
    --
    So I get to reading and am told "the 4890 and 285 are virtually the same". I guess the wheel was first made square, and rolled as well as when it became round. No chance the reviewer could tell the truth and remark that NVidia has the best, until today.. NOPE can't do that!
    ---
    Then, of course, the celebration for the "perfection" of the 5870 and ATI's superb success in the "round" category...
    EXCEPT:
    We get to the actual implementations and NO PERCEIVED DIFFERENCE IS VISUALLY THERE. It cannot be seen. The article even states they searched in vain for some game to show the difference. LOL
    All that extra effort to for pete sakes show that ATI superiority...all WASTED EFFORT, but for red roosters I'm certain it was a very exciting quest, titillating, gee a chance to take down big green...
    ---
    So bottom line is IT'S A BIG FAT ZERO, even the older, worse ati implementation is apparently "non distinguishable".
    It is remarked that NVidia doesn't "officially" support this method in game, and of course, after much red rooster effort, one finds out why.
    THERE IS NO DIFFERENCE in visual quality. Another phantom red "win".
    Another reason NVidia makes money (why waste it on worthless crap in development that makes no difference), while ATI does not.
    Yeah, that was so cool.
    So happy the "mental ideation of perfection in the card for ati fans" was furthered. ROFL
  • Dante80 - Wednesday, September 23, 2009

    A quick question. Why is there no 5850 review available atm?

    1> Was there a separate NDA for the 2 cards?
    2> Were there no sample cards given by AMD to reviewers?
    3> Did AMD ask reviewers to postpone said reviews due to market supply problems/glitches?
    4> Was this a strategy decision by AMD, for marketing or other reasons?
  • Ryan Smith - Wednesday, September 23, 2009

    AMD only provided us with 2 5870s, the 5850 was not sampled. 5800 series cards are in short supply, even for reviewers.
  • Dante80 - Thursday, September 24, 2009

    Thank you for the prompt answer, that was what I was guessing too. Cheers...^^
  • Spoelie - Wednesday, September 23, 2009

    To get enough 5870 cards in the channel for a hard launch, they used every possible die.

    There are probably not enough harvested dies to create the 5850 line just yet. And they're not gonna use fully functional ones that can go in a 5870 when supply for them is tight already.

    Once the 5850 is launched, demand for them is up and yields matured, they'll have to use fully functional dies to keep supply up, but now they're building up inventory for a hard launch during the coming weeks.
  • SiliconDoc - Wednesday, September 23, 2009

    Uhh, just a minute there feller. The SOFT or PAPER LAUNCH has already hit, the big LAUNCH DATE is today....
    Newegg is a big ZERO available... (one Powercolor was there 30 mins ago, the other 3 listed are NOT available, I watched them appear last night).
    ---
    So, when Ati has a "hard launch" they get "many weeks after the launch date" to "ramp up production" and "fill the need".
    ROFLMAO
    I was here when this site and the red roosters whined about Nvidia and paper launches, and I believe it was the GTX275 that was predicted to be PAPER (not very long ago in fact) here, and the article EVEN SPECULATED IT WAS A PHANTOM CARD.
    All the red roosters piled on, but.... the card was available on launch, it wasn't a PHANTOM, and all that bs was quickly forgotten and shoved into the memory hole like it never happened...
    ---
    Oh, but when it's ATI and not Nvidia, the 4770 can remain almost pure paper near forever, and this one, golly it can be 95% paper and it's just " getting ready for a hard launch" WEEKS BEYOND the launch date!
    ROFLMAO
    --
    No bias here?!? "Where's da' bias?!?!" said the red rooster (to the green goblin)...
    Give me a break.
  • chrnochime - Friday, September 25, 2009

    Just because you can't find it in the states doesn't mean it's a fake launch. And fake launch? What are you a 12 year old or something? You're like the nvidia version of snakeoil. Just go play with your nvidia part m'kay ?

  • SiliconDoc - Sunday, September 27, 2009

    Well, since you insulted, and mischaracterized, I came across the reminder about the 4870 paper launch.
    Yes, that's correct, this is how ATI rolls, a big fat lying launch date, a piddle of a few cards, then wait a couple weeks or a month.
    --
    " The cards are fast, but as many pointed out HD 5870 is not faster than Geforce GTX 295, which is something that many have expected. Radeon 5850 will also start selling in October time, but remember, last summer when ATI launched 4870, the card was almost impossible to buy and weeks if not months after, the availability finally improved. "
    http://www.fudzilla.com/content/view/15643/1/
    --
    Like I've kept saying, the bias is so bad... I keep discovering more big fat ati blunderous moves, that are instead ascribed imaginarily to Nvidia.
    Thanks for the incorrect whining, anger, and standard PC e-mindless chatroom repeated, non original, heard ten thousand times, brainless insult, it actually helped me.
    I learned ATI blew their 4870 launch with paper lies as well.
    You're a great help friend.
