Introduction

If you believed the more tabloid-oriented hardware news sites 16 months ago, you would have thought that ATI and NVIDIA were in an all-out war. Harsh phrases were flung, benchmarks were beaten to death, and both sides plotted for motherboards with a third x16 PCIe slot in order to have a GPU dedicated to physics. Yes, 2006 was certainly an exciting time for GPU-accelerated physics, and then the party came to a grinding halt.

Over in the Ageia camp, 2005 saw the company kick off the whole subject of hardware-accelerated physics with its announcement of plans to develop the PhysX hardware. 2006 saw the launch of that hardware, and while it showed initial promise, there was a failure to follow through with games that meaningfully used it. Much like the GPU camp, Ageia has been keeping a low profile so far this year.

To be fair, much of this is aligned with the traditional gaming seasons; titles are often loaded into the fourth quarter for the Christmas season, leaving few games, and by extension few new uses of physics, to talk about. But it's also indicative of a general dampening of spirit for hardware-accelerated physics: things have not gone as planned for anyone. Now in 2007, some 2 years after Ageia's announcement got the ball rolling, the number of released AAA titles using some sort of hardware physics acceleration can still be counted on one hand.

So what happened to the enthusiasm? There's no simple answer, as there's no single reason but rather a combination of reasons that have done a very good job of dampening things. Today we'll take a look at these reasons, the business behind all of this, and why, as the days tick by, hardware-accelerated physics keeps looking like a pipe dream.

Comments

  • Bladen - Thursday, July 26, 2007 - link

    When I say first order physics, I mean the most obvious type: fully destructible environments.

    In UT3, you could have a fully destructible environment as an on/off option without making the game unbalanced in single player. The game is mindless killing; who cares if you blow a hole through a wall to kill your enemy?

    I guess you could have fully destructible environments processed via hardware and software, but I'd assume that the software performance hit would be huge, maybe only playable on a quad core.
  • Bladen - Thursday, July 26, 2007 - link

    Whether or not the game has fully destructible environments, I don't know.
  • Verdant - Thursday, July 26, 2007 - link

    Dedicated hardware for this is pointless; with GPU speeds and the number of cores on a CPU die both increasing, I see no point in focusing on another pipe.

    Plus the article has a ton of typographical errors :(.
  • bigpow - Wednesday, July 25, 2007 - link

    Maybe it's just me, but I hate multiple-page reviews
  • Shark Tek - Wednesday, July 25, 2007 - link

    Just click on the "Print this article" link and you will have the whole article in one page.
  • Visual - Thursday, July 26, 2007 - link

    indeed that's what i do. almost.
    i hate that printarticle shows up in a popup, can't open it in a tab easily with a middle-click too... same as the comments page btw. really hate it.
    so i manually change the url - category/showdoc -> printarticle, it all stays in the same tab and is great. i'm planning on writing a ".user.js" (for opera/greasemonkey/trixie) for fixing the links some time
  • Egglick - Wednesday, July 25, 2007 - link

    Other than what was mentioned in the article, I think another big problem is that the PCI bus doesn't have enough bandwidth (bi-directional or otherwise) for a card doing heavy real-time processing. For whatever reason, manufacturers still seem apprehensive about using PCIe x1, so it will be rough for standalone cards to perform at any decent level.

    I've always felt the best application for physics processors would be to piggyback them on high-end video cards with lots of RAM. Not only would this solve the PCI bandwidth problem, but the physics processor would be able to share the GPU's fast memory, which is probably what constitutes the majority of the cost for standalone physics cards.

    This setup would benefit both NVIDIA/ATI and Ageia. On one hand, Ageia gets massive market penetration by having its chips sold with the latest video cards, while NVIDIA/ATI get to tout having a huge new feature. They could also use their heavy influence to get game developers to start using the Ageia chip.
  • cfineman - Wednesday, July 25, 2007 - link

    I thought one of the advantages of DX10 was that it would allow one to partition off some of the GPU subprocessors for physics work.

    I was *very* surprised that the author implied that the GPUs were not well suited to embarrassingly parallel applications.... um.... what's more embarrassingly parallel than rendering?
  • Ryan Smith - Wednesday, July 25, 2007 - link

    I think you're misinterpreting what I'm saying. GPUs are well suited to embarrassingly parallel applications. However, with the core war now under way, you can put these tasks on a CPU, which, while not as fast at FP as a GPU/PPU, is quickly catching up thanks to having multiple CPU cores and how easy it is to put embarrassingly parallel tasks on such a CPU. GPUs are still better suited, but CPUs are becoming well enough suited that the GPU advantage is being chipped away.

    As for DX10, there's nothing specifically in it for physics. Using SM4.0/geometry shaders you can do some second-order work, but first-order work looks like it will need to be done with CUDA/CTM which isn't a part of DX10. You may also be thinking of the long-rumored DirectPhysics API, which is just that: a rumor.
  • yyrkoon - Thursday, July 26, 2007 - link

    Actually, because of the limited bandwidth capabilities of any GPU interface, the CPU is far better suited. Sure, a 16x PCIe interface is limited to a huge 40Gbit/s bandwidth (asynchronous), and as I said, this may *seem* huge, but I personally know many game developers who have maxed this limit easily when experimenting with game technologies. When, and if, the PCIe bus expands to 32x, and *if* graphics OEMs / motherboard OEMs implement it, then we'll see something that resembles current CPU -> memory bandwidth capabilities (10GB/s). By then, however, who is to say how much the CPU -> memory bandwidth will be capable of. Granted, having said all that, this is why *we* load compressed textures into video memory, and do the math on the GPU . . .

    Anyhow, the whole time reading this article, I could not help but think that with current CPUs being at 4 cores, and Amdahl's law, that the two *other* cores could be used for this purpose, and it makes total sense. I think it would behoove Ageia and Havok both to forget about physics hardware, and start working on a licensable software solution.
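
The bandwidth and multi-core figures in the comment above can be sanity-checked with a little arithmetic. The Python sketch below is illustrative only. It assumes first-generation PCIe signaling (2.5 GT/s per lane with 8b/10b encoding), which the comment does not state, and the function names and the 50% parallel fraction in the usage lines are hypothetical. Under those assumptions a x16 link signals at 40 Gbit/s raw per direction, matching the figure quoted, and Amdahl's law gives speedup = 1 / ((1 - p) + p / n) for a workload whose parallel fraction is p running on n cores.

```python
def pcie_bandwidth_gbytes_per_s(lanes, gt_per_s=2.5, encoding_efficiency=0.8):
    """Usable bandwidth in GB/s, one direction, for a PCIe link.

    Defaults are an assumption: first-generation PCIe, 2.5 GT/s per lane,
    with 8b/10b encoding (8 payload bits per 10 bits on the wire).
    """
    raw_gbit_per_s = lanes * gt_per_s              # e.g. 16 * 2.5 = 40 Gbit/s
    usable_gbit_per_s = raw_gbit_per_s * encoding_efficiency
    return usable_gbit_per_s / 8.0                 # bits -> bytes


def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: overall speedup when only part of the work is parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    for lanes in (16, 32):
        print(f"x{lanes}: {pcie_bandwidth_gbytes_per_s(lanes):.1f} GB/s per direction")
    # Hypothetical workload: half of each frame's work is parallel physics.
    print(f"4 cores, 50% parallel: {amdahl_speedup(0.5, 4):.2f}x speedup")
```

The usable x16 figure comes out well below the raw signaling rate because of the encoding overhead, and the Amdahl term shows why extra cores help only as much as the parallel share of the work allows.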
