
Sony got pretty burned by the SPU architecture on the PS3 by being a late follower to the Xbox. This meant that developers were porting from Xbox or PC(!), which was the Wrong Thing(TM).

You could really make the SPUs scream if you knew what you were doing, and you got the added benefit of better cache locality on other platforms. I think they went to a similar architecture to avoid the pain that it brought to the developers on their platform who weren't ready for it.

It's kind of a shame the PS3 hadn't led. Almost every team that went PS3->360->PC saw huge gains (due to needing to pack into 256 KB blocks for the SPU), whereas everyone that went (PC)->360->PS3 was in a world of pain.
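
For anyone who hasn't done this style of work, here is a rough sketch in C of what "packing into blocks" can look like. It is not SPU code: memcpy stands in for a DMA transfer, and BLOCK_BYTES, local_store, and sum_in_blocks are made-up names for illustration. The 256 KB figure is the one cited above; a real budget would be smaller once code and double buffers are counted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical block budget, taken from the 256 KB figure in the thread. */
#define BLOCK_BYTES  (256u * 1024u)
#define BLOCK_FLOATS (BLOCK_BYTES / sizeof(float))

/* Stand-in for a small, fast local memory (assumption for illustration). */
static float local_store[BLOCK_FLOATS];

/* Sums n floats by streaming them through local_store one block at a time.
 * memcpy stands in for a DMA transfer from main memory. */
double sum_in_blocks(const float *data, size_t n)
{
    assert(data != NULL || n == 0);

    double total = 0.0;
    for (size_t off = 0; off < n; off += BLOCK_FLOATS) {
        /* The last block may be shorter than a full block. */
        size_t count = (n - off < BLOCK_FLOATS) ? (n - off) : BLOCK_FLOATS;

        memcpy(local_store, data + off, count * sizeof(float));

        /* All work happens on the small buffer, never on main memory. */
        for (size_t i = 0; i < count; ++i)
            total += local_store[i];
    }
    return total;
}
```

Code shaped like this touches one small working set at a time, which is also what keeps caches happy on a 360 or a PC.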




I saw this one coming. It's why I put accelerators in there. Yeah, it's best to keep the hardware mostly like whatever developers are used to using and compatible with existing code. Cell was so radically different from typical systems, even SIMD/MIMD CPUs, that it was a pain to work with even for cross-platform companies.

The Cavium model is more what I was thinking. They make the processors simple, fast, and on a good NoC. Then, add accelerators for whatever. I'd straight up ask game companies what code, algorithms, patterns, etc. keep popping up in their games and could be accelerated. I'd accelerate some of that. However, I'd mostly focus on low-level stuff like disks and networking that most developers would prefer to ignore. Solid, high-performance, real-time implementations of all that with hardware acceleration of critical paths. Octeon already does that. That plus a dedicated I/O processor & asynchronous interrupts. That will let the CPU focus on gaming stuff while getting massive utilization.

Beats the hell out of "throw more cores and cache at PC architecture." Intel and AMD certainly dominate in general performance. Yet, the number of people, tools, and dollars that go into that is mind-boggling. Of course, that my recommendation is the better model is clear from the fact that Intel and AMD are adopting it themselves. They're also both doing well with their "semi-custom" business that does that for clients.


Sure. You can do all of this. But you'd also be requiring Sony to put in the effort to write an optimized compiler, debugger, and toolchain effort to bring FreeBSD up on a MIPS platform. MIPS is dead.

You never talk about what an "accelerator" is in your case -- you say "stuff like disks and networking". Both of those already have dedicated hardware, and they're IO bound, not CPU bound, so an "accelerator" doesn't help much.

What you want for an accelerator is something to accelerate graphics and physics, perhaps a "Graphics Processing Unit", and something realtime to do audio processing, like a "Digital Signal Processor". Modern computers already have both of those accelerators built-in. Your funky MIPS architecture doesn't add anything but dev annoyance.


"...requiring Sony to put in the effort to write an optimized compiler, debugger, and toolchain effort to bring FreeBSD up on a MIPS platform. MIPS is dead."

You have to bring Linux and BSD, which already do MIPS, up on MIPS? Gotta be hard. If it is, they have a new ARM chip (ThunderX) with similar specs. Had to go with what was already proven and more negotiable, though, so that was MIPS.

"You never talk about what an "accelerator" is in your case..."

I referenced Octeon III as an example. Had you Googled it, you would know exactly what I was talking about. Let me help you out:

http://www.cavium.com/OCTEON-III_CN7XXX.html

The prior models, with 16 cores + accelerators, were used mostly where line-rate processing was required, in applications like stream processing, networking, etc.: CPU and I/O intensive. The low end did 24 GIPS peak and supported Interlaken (10+Gbps) w/ dedicated hardware for compression, crypto, etc.

The new models go from 24-48 cores (120-240 GIPS peak), run at 2.4GHz, do 500Gbps max I/O, offer easy integration with application-specific accelerators (500 so far) directly on the network-on-chip, and have mature toolchains for Linux & RTOSes. So, you could offload compression, search (e.g. pathfinding), graphics, crypto, physics... whatever... onto engines that handle it at hardware speed/efficiency while letting the CPU focus on everything else.

"like a "Digital Signal Processor". Modern computers already have both of those accelerators built-in. Your funky MIPS architecture doesn't add anything but dev annoyance."

You must have never programmed a Digital Signal Processor. "Funky" and "dev annoyance" are very good descriptions of it. It's like a different world compared to regular programming and with no standardization. There were whole companies founded to build tools to solve that problem. OpenCL made nice strides but it's still not regular programming. Much easier for devs to program in C/C++, use a good concurrency approach, call a library function (SW or HW accelerated) for specific hotspots, and hit compile.
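
To make "call a library function (SW or HW accelerated)" concrete, here is a minimal sketch in plain C. Every name in it (checksum, checksum_register_hw, hw_backend) is made up for illustration and is not from any Cavium SDK; the point is only that the caller's code looks the same whether or not an accelerator is present.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Signature shared by the software path and any hardware-backed path. */
typedef uint32_t (*checksum_fn)(const uint8_t *buf, size_t len);

/* Portable software fallback: a plain byte sum. */
static uint32_t checksum_sw(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; ++i)
        sum += buf[i];
    return sum;
}

/* Hypothetical hook: a platform driver would set this if an accelerator
 * exists. NULL means "use the software path". */
static checksum_fn hw_backend = NULL;

void checksum_register_hw(checksum_fn fn)
{
    hw_backend = fn;
}

/* What game code calls; it never needs to know which path runs. */
uint32_t checksum(const uint8_t *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    return hw_backend ? hw_backend(buf, len) : checksum_sw(buf, len);
}
```

The game just calls checksum(buf, len) and recompiles; swapping in an accelerated backend is the platform vendor's problem, not the dev's.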

Of course, if you find that very challenging, you can always handcode several models of DSP to save yourself time. ;)



