Lucid's Multi-GPU Wonder: More Information on the Hydra 100
by Derek Wilson on August 22, 2008 4:00 PM EST - Posted in GPUs
What Does This Thing Actually Do?
From a high level, Lucid's technology intercepts DirectX or OpenGL API calls, analyzes them, and organizes them into distinct tasks. Based on that analysis, combined with how each card has historically handled previous frames' workloads, it distributes the tasks evenly across all the GPUs in the system. After the workload is distributed, the buffers are read back to the Hydra chip and composited before the final scene is sent to the appropriate graphics card for display. Looking a bit deeper, here is a block diagram of the process itself from Lucid's whitepaper.
The current implementation takes x16 PCIe in and can switch it to either two x16 PCIe channels or up to four x16 PCIe channels. This gives it support for one to four cards, depending on how the motherboard or graphics card handles things. They also have the flexibility to scale down to x8 in and two x8 out, making lower cost motherboards feasible as well. Future products may support more graphics cards and more PCIe lanes, but right now four is what makes sense. Lucid says the hardware can scale up to any number of cards with linear performance improvement.
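To put those options in code terms, the switch fabric boils down to a handful of fixed lane splits. Here's a minimal sketch; the enum and helper names are our own shorthand, not anything Lucid has published:

```cpp
#include <cstdint>

// The three link configurations Lucid describes. Names are ours, not Lucid's.
enum class HydraLinkConfig : std::uint8_t {
    X8In_2xX8Out,    // budget boards: x8 upstream, two x8 downstream links
    X16In_2xX16Out,  // x16 upstream, two x16 downstream links
    X16In_4xX16Out,  // x16 upstream, four x16 downstream links
};

// Maximum number of graphics cards a given configuration can feed.
constexpr int maxCards(HydraLinkConfig cfg) {
    return cfg == HydraLinkConfig::X16In_4xX16Out ? 4 : 2;
}
```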
One implication of this process is that if any graphics card in the system has other work being done on it (say physics or video decode), the load will be dynamically balanced and you'll still be able to squeeze as much juice out of all the hardware in your system as possible. Pretty cool, huh? If it works as advertised, that is.
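Lucid hasn't told us how its scheduler actually works, but the behavior described (weighting new work by each card's recent history and current load) sounds like a shortest-expected-completion-time balancer. Here's a hedged C++ sketch of that idea; every name in it is our invention, not Lucid's API:

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical per-GPU bookkeeping; the fields are ours, not Lucid's.
struct GpuStats {
    double avgTaskMs = 1.0; // moving average cost of a task on this card
    double queuedMs  = 0.0; // estimated work already queued on this card
};

// Send each new task to the card expected to finish it soonest. Faster
// cards, and cards not busy with physics or video, naturally get more work.
std::size_t pickGpu(const std::vector<GpuStats>& gpus) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < gpus.size(); ++i) {
        double etaBest = gpus[best].queuedMs + gpus[best].avgTaskMs;
        double etaI    = gpus[i].queuedMs + gpus[i].avgTaskMs;
        if (etaI < etaBest) best = i;
    }
    return best;
}

// Fold measured completion times back into the moving average so the
// balancer adapts as a card's effective speed changes frame to frame.
void recordCompletion(GpuStats& gpu, double measuredMs) {
    gpu.avgTaskMs = 0.9 * gpu.avgTaskMs + 0.1 * measuredMs;
    gpu.queuedMs  = gpu.queuedMs > measuredMs ? gpu.queuedMs - measuredMs : 0.0;
}

int main() {
    std::vector<GpuStats> gpus(2);
    gpus[1].queuedMs = 5.0; // pretend card 1 is busy decoding video
    for (int task = 0; task < 4; ++task) {
        std::size_t g = pickGpu(gpus);
        gpus[g].queuedMs += gpus[g].avgTaskMs;
        std::printf("task %d -> GPU %zu\n", task, g);
    }
}
```

In this toy version, the busy card only starts receiving work again once the idle card's queue grows past it, which is the dynamic behavior Lucid describes.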
The demo we saw behind closed doors with Lucid showed a video playing on one 9800 GT while that card and a second 9800 GT worked together to run Crysis in DX9 mode at the highest possible settings, at 40-60 fps (in game) at 1920x1200. Since I haven't tested Crysis DX9 mode on a 9800 GT I have no idea how good this is, but it at least sounds nice.
Since Lucid is analyzing the data, they can even do things like skip hidden "tasks" entirely: if an entire object is occluded, rather than send it to a graphics card, the Hydra engine just doesn't send it down. I asked about dependent texturing and shader modification of depth, and apparently they also build something like a dependency graph; if a modified resource affects something else, they are able to adjust for that on the fly as well.
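Again, Lucid hasn't shared internals, but the two tricks described, dropping fully occluded work and tracking producer/consumer relationships between tasks, might look something like the following sketch. All of the types and names below are hypothetical:

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

// A "task" in the sense Lucid uses the word: a chunk of draw work that
// reads some resources (textures, depth buffers) and writes others.
struct Task {
    std::uint64_t id;
    bool fullyOccluded = false;         // result of the analysis pass
    std::vector<std::uint64_t> writes;  // resource ids this task writes
    std::vector<std::uint64_t> reads;   // resource ids this task reads
};

// Build producer -> consumer edges: a task that reads a resource depends
// on whichever earlier task last wrote it (e.g. render-to-texture).
std::unordered_map<std::uint64_t, std::vector<std::uint64_t>>
buildDependencyGraph(const std::vector<Task>& tasks) {
    std::unordered_map<std::uint64_t, std::uint64_t> lastWriter; // resource -> task
    std::unordered_map<std::uint64_t, std::vector<std::uint64_t>> deps;
    for (const Task& t : tasks) {
        for (std::uint64_t r : t.reads) {
            auto it = lastWriter.find(r);
            if (it != lastWriter.end())
                deps[it->second].push_back(t.id); // producer -> this consumer
        }
        for (std::uint64_t r : t.writes)
            lastWriter[r] = t.id;
    }
    return deps;
}

// A fully occluded task with no downstream consumers never needs to be
// sent to any GPU at all.
bool canDrop(const Task& t,
             const std::unordered_map<std::uint64_t,
                                      std::vector<std::uint64_t>>& deps) {
    return t.fullyOccluded && deps.find(t.id) == deps.end();
}
```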
In theory, tracking and adjusting to dependencies on the fly will completely avoid the issues that keep NVIDIA and AMD from running AFR in all games. And they even claim that this can help give you higher than linear scaling when using their hardware with more than one card.
We asked what the latency of their implementation is, and they said it is negligible. Of course, that's not a real answer, especially for guys like us who want to know the details so we can better understand what's going on. We don't just want to see the end result; we want to know how we get there. Playing Crysis didn't feel laggy, but there is no way this solution doesn't introduce some processing time.
Their explanation is that the Hydra software can keep requesting and queuing up tasks beyond what the graphics cards could otherwise accept, so the CPU is able to keep going and issue more graphics API calls than it normally would. This seems like it would introduce more lag to us, but they assured us that the opposite is true. If the Hydra engine speeds things up overall, that's great. But it certainly takes some time to do its processing, and we'd love to know how much.
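To illustrate the trade-off, here's a generic bounded work queue of the kind that could sit between the CPU and the GPUs. It is not Lucid's implementation, just a sketch of why "run ahead" buys throughput at the cost of latency:

```cpp
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>

// The CPU thread pushes translated API work in; GPU feeder threads pop it
// out. maxDepth is the "run ahead" allowance: a deeper queue keeps the
// GPUs fed, but every queued item is work the player submitted in the
// past and hasn't seen on screen yet.
template <typename T>
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t maxDepth) : maxDepth_(maxDepth) {}

    void push(T item) {
        std::unique_lock<std::mutex> lock(m_);
        // Block the producer (the CPU) only once the run-ahead budget is spent.
        notFull_.wait(lock, [&] { return q_.size() < maxDepth_; });
        q_.push_back(std::move(item));
        notEmpty_.notify_one();
    }

    T pop() {
        std::unique_lock<std::mutex> lock(m_);
        notEmpty_.wait(lock, [&] { return !q_.empty(); });
        T item = std::move(q_.front());
        q_.pop_front();
        notFull_.notify_one();
        return item;
    }

private:
    std::size_t maxDepth_;
    std::deque<T> q_;
    std::mutex m_;
    std::condition_variable notEmpty_, notFull_;
};
```

The deeper the queue, the longer a frame can sit in it before a GPU picks it up, which is exactly why we'd like a hard number instead of "negligible."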
57 Comments
jeff4321 - Sunday, August 24, 2008
If you think that NVIDIA and AMD have been stagnant, you haven't seen the graphics industry change. The basic graphics pipeline hasn't changed; it simply got smaller. A current NVIDIA or ATI GPU probably has as much computational power as an SGI workstation from the 90's. GPGPU is a natural extension of graphics hardware. Once the graphics hardware becomes powerful enough, it starts to resemble a general purpose machine, so you build it that way. It's possible because the design space for the GPU can do more (Moore's Law).

Since it's early in the deployment of using a GPU as an application-defined co-processor, I would expect there to be competing APIs. Believe it or not, in the late eighties, x87 wasn't the only floating point processor available for x86s. Intel's 387 was slower than Weitek's floating point unit. Weitek lost because the next generation CPUs at the time started integrating floating point. Who will win? The team that has better development tools, or the team that exclusively runs the next killer app.
Dynamically changing between AFR and splitting the scene is hard to do. I'm sure that ATI and NVIDIA have experimented with this in-house, and they are either doing it now or have decided that it kills performance because of the overhead of changing it on the fly. How Lucid can do better than the designers of the device drivers and ASICs, I don't know.
Lucid Hydra is not competition for either NVIDIA or ATI. The Lucid Hydra chip is a mechanism for the principals of the company to get rich when Intel buys them to get access to Multi-GPU software for Larrabee. It'll be a good deal for the principals, but probably a bad deal for Intel.
Licensing Crossfire and SLI is a business decision. Both technologies cost a bundle to develop. Both companies want to maximize return.
AnnonymousCoward - Saturday, August 23, 2008
I'm afraid this solution will cause unacceptable lag. If the lag isn't inherent, maybe the solution will require a minimum "max frames to render ahead" / prerender limit. I don't buy their "negligible" BS answer.

Does SLI require a minimum? I got the impression it does, from what I've read in the past. I don't have SLI, and use RivaTuner to set mine to "1".
Aethelwolf - Saturday, August 23, 2008
Let's pretend, if only for a moment, that I was a GPU company interested in giving a certain other GPU company a black eye. And let's say I have this strategy where I design for the middle range and then scale up and down. I would be seriously haggling with Lucid right now to become a partner in supplying me, and pretty much only me, besides Intel, with their Hydra engine.

DerekWilson - Saturday, August 23, 2008

that'd be cool, but lucid will sell more parts if they work with everyone. they're interested in making lots of money ... maybe amd and intel could do that for them, but i think the long term solution is to support as much as possible.
Sublym3 - Saturday, August 23, 2008
Correct me if I am wrong, but isn't this technology still dependent on making the hardware specifically for each DirectX version? So when a new DirectX or OpenGL version comes out, not only will we have to update our video cards but also our motherboard at the same time?
Not to mention this will probably jack up the price on already expensive motherboards.
Seems like a step backwards to me...
DerekWilson - Saturday, August 23, 2008
you are both right and wrong -- yes, they need to update the technology for each new directx and opengl release.
BUT
they don't need to update the hardware at all. the hardware is just a smart switch with a compositor.
to support a new directx or opengl version, you would only need to update the driver / software for the hydra 100 ...
just like a regular video card.
magao - Saturday, August 23, 2008
There seems to be a strong correlation between Intel's claims about Larrabee and Lucid's claims about Hydra.

This is pure speculation, but I wouldn't be surprised if Hydra is the behind-the-scenes technology that makes Larrabee work.
Aethelwolf - Saturday, August 23, 2008
I think this is the case. Hydra and Larrabee appear to be made for each other. I won't be surprised if they end up mating.

From a programmer's view, Larrabee is very, very exciting tech. If it fails in the PC space, it might be resurrected when next-gen consoles come along, since it is fully programmable and claims linear performance (thanks to Hydra?).
DerekWilson - Saturday, August 23, 2008
i'm sure intel will love hydra for allowing their platforms to support linear scaling with multigpu solutions. but larrabee won't have anything near the same scaling issues that nvidia and amd have in scaling to multi-gpu -- larrabee may not even need this to get near linear scaling in multigpu situations.
essentially they just need to build an smp system and it will work -- shared mem and all ...
their driver would need to optimize differently, but that would be about it.
GmTrix - Saturday, August 23, 2008
If larrabee doesn't need hydra to get near linear scaling, isn't hydra just providing a way for amd and nvidia to compete with it?