The Road to Acquisition
The CPU and the GPU have been on this collision course for quite some time; although we often refer to the CPU as a general purpose processor and the GPU as a graphics processor, the reality is that they are both general purpose. The GPU is merely a highly parallel general purpose processor, one that happens to be especially well suited to certain applications such as 3D gaming. As the GPU became more programmable and thus more general purpose, its highly parallel nature became interesting to new classes of applications: things like scientific computing are now within the realm of possibility for execution on a GPU.
Today's GPUs are vastly superior to what we currently call desktop CPUs when it comes to things like 3D gaming, video decoding and a lot of HPC applications. The problem is that a GPU is fairly worthless at sequential tasks, meaning that it relies on a fast host CPU to handle everything other than what it's good at.
ATI realized that, long term, as the GPU grows in power it will eventually be bottlenecked by its ability to do high speed sequential processing. In the same vein, the CPU will eventually be bottlenecked by its ability to do highly parallel processing. In other words, GPUs need CPUs and CPUs need GPUs for the workloads going forward. Neither approach alone will solve every problem or run every program optimally, but the combination of the two is what's necessary.
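The article doesn't put numbers on this mutual dependence, but Amdahl's law is the usual way to quantify it. The snippet below is purely our own back-of-the-envelope illustration (the 10% sequential fraction is an assumed figure, not anything AMD or ATI quoted); it shows why even an arbitrarily wide parallel processor hits a wall set by the sequential portion of a workload.

```c
#include <stdio.h>

/* Hypothetical illustration of Amdahl's law (not from the article):
 * if a fraction 's' of a workload is inherently sequential, the best
 * overall speedup from N parallel units is 1 / (s + (1 - s) / N). */
int main(void)
{
    const double s = 0.10;                  /* assumed: 10% sequential work */
    const int widths[] = { 1, 8, 64, 512 }; /* number of parallel units     */

    for (int i = 0; i < 4; i++) {
        int n = widths[i];
        double speedup = 1.0 / (s + (1.0 - s) / n);
        printf("%4d parallel units -> %.2fx overall speedup\n", n, speedup);
    }
    /* The result approaches 1/s = 10x no matter how wide the parallel
     * processor gets, which is why a strong sequential (CPU) side still
     * matters -- and, symmetrically, why the CPU needs the GPU for the
     * parallel 90%. */
    return 0;
}
```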
ATI came to this realization originally when looking at the possibilities for using its GPU for general purpose computing (GPGPU), even before AMD began talking to ATI about a potential acquisition. ATI's Bob Drebin (formerly the CTO of ATI, now CTO of AMD's Graphics Products Group) told us that as he began looking at the potential for ATI's GPUs he realized that ATI needed a strong sequential processor.
We wanted to know how Bob's team solved the problem, because obviously they had to come up with a solution other than "get acquired by AMD". Bob didn't answer the question directly, but he did, with a smile, explain that ATI tried to pair its GPUs with the lowest power sequential processor it could, and always ran into the same problem: the sequential processor became the bottleneck. In the end, Bob believes that the AMD acquisition made the most sense because the new company is able to combine a strong sequential processor with a strong parallel processor, eventually integrating the two on a single die. We really wanted to know what ATI's "plan B" would have been had the acquisition not worked out, because we're guessing that ATI's backup plan is probably very similar to what NVIDIA has planned for its future.
To understand the point of combining a highly sequential processor like modern day desktop CPUs and a highly parallel GPU, you have to look above and beyond the gaming market, into what AMD is calling stream computing. AMD perceives a number of potential applications that will require a very GPU-like architecture to solve, things that we already see today. Simply watching an HD-DVD can eat up almost 100% of some of the fastest dual core processors today, while a GPU can perform the same decoding task with much better power efficiency. H.264 encoding and decoding are perfect examples of tasks that are better suited for highly parallel processor architectures than what desktop CPUs are currently built on. But just as video processing is important, so are general productivity tasks, which is where we need the strengths of present-day out-of-order superscalar CPUs. A combined architecture that can excel at both types of workloads is clearly the direction desktop CPUs need to target in order to remain relevant in future applications.
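To make the stream vs. sequential distinction concrete, here is a minimal sketch of our own (not code from AMD or ATI): the first loop's iterations are completely independent, the kind of per-element work a wide parallel processor chews through (think of the per-pixel or per-block stages of video decode), while the second loop's iterations each depend on the previous result and therefore want a fast sequential core.

```c
#include <stdio.h>
#include <stddef.h>

/* Stream-style work: every output element depends only on its own input,
 * so all iterations could in principle run at once on a wide processor. */
void scale_samples(float *dst, const float *src, size_t n, float gain)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * gain;         /* independent iterations */
}

/* Sequential work: each step needs the previous result, so extra parallel
 * hardware doesn't help -- this is where a strong sequential core wins. */
float running_total(const float *src, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc = acc * 0.99f + src[i];     /* each iteration depends on the last */
    return acc;
}

int main(void)
{
    float in[4] = { 1.0f, 2.0f, 3.0f, 4.0f }, out[4];
    scale_samples(out, in, 4, 0.5f);
    printf("out[3] = %.2f, total = %.2f\n", out[3], running_total(in, 4));
    return 0;
}
```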
Future applications will easily combine stream computing with more sequential tasks, and we already see some of that now with web browsers. Imagine browsing a site like YouTube, except that all of the content is much higher quality and requires far more CPU (or GPU) power to play. You need the strengths of a high powered sequential processor to deal with everything other than the video playback, but then you need the strengths of a GPU to actually handle the video. Examples like this one are overly simple, as it is very difficult to predict the direction software will take when given even more processing power; the point is that CPUs will inevitably have to merge with GPUs in order to handle these types of applications.
55 Comments
tygrus - Saturday, May 12, 2007 - link
See latest low-power Athlon64 <10w idle. Can further reduce max power consumption (from 30-60w) if you limit the clock speed to about 1GHz and drop the voltage (<15w).
TA152H - Sunday, May 13, 2007 - link
Tygrus,
Idle isn't so important to me; getting to less than 1 watt idle isn't particularly hard if you go into sleep mode. You can't build a fanless, noiseless system based on idle performance. I was looking at Intel's ULV stuff too, but it's just not there either. It's kind of disappointing, because most people would be perfectly happy with a 1 GHz K6-III using 8 watts or less as it would on modern processes, and nothing like it is available. VIA's stuff sucks and I don't think is very efficient, even though they are targeting this market. My main machine I just upgraded to a Coppermine 600 on a weird Intel VC820 board. It's perfectly capable of doing just about everything I do, except for compiles (even a Core 2 is too slow for that; Microsoft seriously needs to work on parallelizing their compilers, or if they have recently, I need to buy it :P).
It's an enormous waste of electricity to sell these processors when the vast majority of people don't need them. To Microsoft's credit, they are always up to the challenge of releasing bloated software that requires more memory and processing power but is functionally the same, but at some point even their talent for this might run out.
While I was writing the first reply, I was lamenting how lousy the current processors are in this respect, but then I read that at least AMD had a clue and said the Athlon design could not address this space and they had to go with something different. Maybe they'll bring the K6-III back, fix its decoding/memory problems, and have a real winner. In terms of power/performance there is just no beating it; these superpipelined processors are inherently poor at power use, and clearly have a performance bias. Why VIA went this way is a big mystery to me.
chucky2 - Friday, May 11, 2007 - link
If this article has accomplished one thing, it would be that we finally have confirmation that AM2+ CPUs will work in AM2 motherboards. Up to this point it's been people reporting on "sources" and stuff like that, nothing direct from AMD.
Anand's report is more than good enough for me; I can finally rest easy that the PC I just built my cousin will have an upgrade path for at least another year down the road (if not two).
Thanks Anand and AMD! (and screw you Intel for your ridiculously short upgrade paths!)
Chuck
AdamK47 - Friday, May 11, 2007 - link
Well played, Anand. Well played.
Kiijibari - Friday, May 11, 2007 - link
I would have looked at my watch while cinebench was running on the 4x4 system to get a rough estimate :)
Not a correct result, but better than nothing.
Or was the system so fast, that cinebench was done after a few ns ^^ ? :)
Apart from that, nice article, thanks :)
cheers
Kiijibari
Anand Lal Shimpi - Friday, May 11, 2007 - link
I counted seconds in my head; out of fairness to AMD I didn't report the number I calculated :)
Take care,
Anand
Sunrise089 - Friday, May 11, 2007 - link
Didn't you guys notice the huge disconnect between the excitement evident in Anand's text and the fairly small amount of new info? I think it should be obvious that AMD revealed a lot more, but they have put various NDA dates on when the info can be released. So I would say they did open up a lot, but that we will only see the new info become available as we get closer to Barcelona.
Anand Lal Shimpi - Friday, May 11, 2007 - link
I think you have to shift your expectations a bit; going into this thing I wanted to see Barcelona performance, I wanted the equivalent of what Intel did with Penryn and Nehalem. I didn't get that, but what I did get was a much clearer understanding of AMD's direction for the future. The section on Fusion is of particular importance to the future of the company; it just so happens that AMD's strategy is in line with Intel's, lending credibility to what it is doing.
Then there were a handful of Barcelona tidbits that I needed to stick in some sort of an article, so this one just seemed the best venue to do so. More information is coming though, stay tuned for next week. No benchmarks yet unfortunately :(
Take care,
Anand
Stablecannon - Friday, May 11, 2007 - link
Wonderful. So basically this article was an AMD morale booster.
"Hey this Phil Hester, just wanted to say don't lose faith in us, even though we don;t have anything to show you really...that's because it's a secret. Yeah, that's it. We actually have a 16 core chip running at 3.8 that'll cream Intel. What's that? You want to see it? LOL."
TA152H - Friday, May 11, 2007 - link
First of all, I read the part about AMD becoming much more forthcoming with information, and then saw essentially nothing new in the article. Pretty much all of this stuff is known, and the important stuff you still don't know. So, how are they so much more open again? I didn't see it.
Actually, I would have been disappointed if they were. I mean, you can scream about how they're not giving you what YOU want, but it's all about what they want. I don't buy them giving information out too early for Intel; you can be pretty sure there are plenty of companies designing products around AMD's new chip, and you can be pretty sure at least one person has "slipped" and told Intel something. It's more likely it's not to AMD's benefit to have people knowing it's so much better than what's out now. How do they move product they are making today when people are waiting for their next great product? It's just common sense; they don't care if people whine about lack of visibility, too much is worse than too little. They have given out some numbers, and they are very high, so I doubt they're too concerned about performance. I think they're more concerned about selling stuff they have out today, which they aren't doing a great job of. What would happen if they showed a great product right around the corner? Q1 would look like a success compared to what they'd endure.
When you talk about Phil Hester, you have to realize this guy referred to the 8088 as an eight-bit architecture (so he was not referring to the data bus). After that, I don't know what to think about what he says.
Next, the reason the 287 didn't sell was that it seriously sucked! It was worse than the 8087 because it didn't even run synchronously with the processor. Considering the 286 was way more powerful than the 8086/8088, there was a perfectly good reason why no one wanted a math coprocessor that was expensive, generally ran at 2/3 CPU speed (unless a separate crystal was put in for it, which was done with later 286 machines), and actually had less performance than the 8087. The 387 was much more powerful and totally redesigned.
Also keep in mind the 486 was later made in an incarnation called the 486SX, that had either a disabled or no math coprocessor on it.
Saying the Cell is before its time implies it's fundamentally a useful product and other things around it just have to catch up. That's wrong and misleading. It's a niche product, it's a bear to program, and it's terrible at most things besides what it was designed for. Time won't change it, unless they change the Cell. The way it is now, it'll never be anything more than a niche product, nor was it designed to be more than that.
For their < 1 watt processors, it might be interesting to see if they bother with a decoupled architecture. My guess is they'll just run x86 instructions natively, without wasting so much silicon on the decoders.
With regard to AMD's next processor taking so long, I think it's even worse when one considers the K8 isn't really a K8 at all; it's more like a K7+. It's very similar to the K7, and is far less of a jump than the Prescott was from the Northwood. It's more like what the Pentium MMX was to the Pentium (I'm not talking about the MMX instructions; there were a lot more changes than that).
The remarks about AMD coming back from this stronger than ever are absurd and ridiculous. They can come back, and they certainly have a good product in the wings, but it's got nothing to do with losing $611 million. It weakened the company, plain and simple, although not irrevocably. They had to slow down their investment and conversion, which isn't good. They had to sell $2 billion in debt on very disadvantageous terms. Both of these are injuries that will have longer term ramifications for the company. So, yes, they aren't dead, but saying this will make them stronger in the long run is plain wrong and equally weird.