Modern shader design is more efficient. That's literally all we've got going. We are skimming the bottom of the well until we hit another huge breakthrough.
dragomix, on 21 Apr 2014 - 2:53 PM, said:
What about more modern shader design? All the other advantages that came with DX11 GPUs?
The Wii U (it looks almost certain right now) uses very long instruction word 5, or VLIW5. What this means is that each SIMD 'unit', or 'shader' as it is called, is made up of 5 ALUs.
Well, this turned out to be an efficient use of space, but an inefficient use of the shaders. Because of things like dependencies (problems that require a variable that hasn't been calculated yet), or cache misses, it would turn out that on average, only 3 or 4 of the 5 units would actually be able to work, while the others just sat there, not doing anything. Literally 1-2/5ths of your shaders not even doing anything!
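To make the dependency problem concrete, here is a rough sketch of how a compiler might pack instructions into VLIW bundles. It is a toy, not how AMD's shader compiler actually works: the `pack_vliw` and `utilization` functions, the greedy packing rule, and the 7-instruction `example` program are all made up for illustration.

```python
def pack_vliw(deps, width):
    """Greedily pack instructions into bundles of at most `width` slots.

    `deps` maps each instruction name to the set of instructions whose
    results it needs. An instruction can only go into a bundle if all of
    its inputs were produced by an EARLIER bundle.
    """
    done = set()
    remaining = list(deps)
    bundles = []
    while remaining:
        ready = [i for i in remaining if deps[i] <= done][:width]
        if not ready:
            raise ValueError("cyclic dependency, nothing can be issued")
        bundles.append(ready)
        done.update(ready)
        remaining = [i for i in remaining if i not in ready]
    return bundles


def utilization(bundles, width):
    """Fraction of ALU slots that actually did work."""
    used = sum(len(b) for b in bundles)
    return used / (len(bundles) * width)


# Hypothetical shader snippet: four independent multiplies (a-d),
# two adds that need them (e, f), and a final add (g).
example = {
    "a": set(), "b": set(), "c": set(), "d": set(),
    "e": {"a", "b"}, "f": {"c", "d"},
    "g": {"e", "f"},
}

for width in (5, 4):
    bundles = pack_vliw(example, width)
    print(width, bundles, round(utilization(bundles, width), 2))
```

In this made-up case the dependencies force three bundles no matter how wide the unit is, so the fifth slot never gets filled at all.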
So they switched to arranging their blocks in units of 4 instead. That was called VLIW4.
Now, on average, 3-4 of these will work (for the same reasons as stated above), and only the occasional odd man out is left not doing anything. So now you have 3/4 of your shaders working (with the occasional 2/4 or, ugh, 1/4 working). So, even with the exact same number of shaders and the exact same resources, you get roughly a 25% increase in performance, because the same ALUs are now split into 5/4 as many units, and each unit keeps about as many slots busy as before.
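The 25% figure falls out of simple arithmetic. A quick sketch, where the 320-ALU total and the 3.5 busy slots per unit are illustrative numbers I picked (the midpoint of "3 or 4"), not specs of any real chip:

```python
def ops_per_clock(total_alus, width, avg_busy_slots):
    """Useful operations per clock for a chip built from `width`-wide units.

    Assumes every unit keeps `avg_busy_slots` of its slots busy on average,
    regardless of how wide the unit is.
    """
    units = total_alus // width
    return units * avg_busy_slots


vliw5 = ops_per_clock(320, 5, 3.5)  # 64 units
vliw4 = ops_per_clock(320, 4, 3.5)  # 80 units
print(vliw5, vliw4, vliw4 / vliw5 - 1)
```

The gain is just 5/4 - 1, so it does not depend on the ALU count or on the exact average, only on the assumption that the average number of busy slots per unit stays the same.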
Here's a diagram showing how inefficient VLIW shaders can be if used sloppily. This is VLIW4; if it was 5, you'd have even more unused shaders. Actually, before that, I need to post a diagram of the wavefronts, and how they are dependent on each other's results.
Okay, now we feed the wavefronts into the VLIW4.
Now remember, you can add one more unused unit to each of those if this was VLIW5... (Which is why they got rid of it. It made no sense to organize that way when almost half the damn things never got used!)
While this is nice, it still has literally no effect whatsoever on what effects these GPUs can do. It's just that one has more performance than the other.
Now we have 'GCN', or Graphics Core Next, the incredibly non-descriptive, ridiculously marketing-named architecture that modern GPUs, as well as the PS4 and Xbone, are using.
Despite its horrible, horrible gimmick name, it does a pretty good job.
Using those same wavefronts, we feed them into a GCN architecture.
Well, GCN has a compute unit that can ORGANIZE wavefronts. It singles out the ones that would cause a stall because they are dependent on others, puts them on hold, and executes other, more viable wavefronts that don't need to wait. Then, once the needed data comes in, the wavefronts that were put on hold can go through. There is more to it, but this is the basic gist of the situation.
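Here is a toy model of that idea, heavily simplified and not a description of the real GCN scheduler. Each wavefront is just a list of instruction latencies (how many cycles until its result is ready), the unit issues at most one instruction per cycle, and the function names and the numbers in `waves` are made up for illustration.

```python
def run_in_order(wavefronts):
    """Run one wavefront to completion before touching the next.

    Every cycle spent waiting on a result is a cycle the unit sits idle.
    """
    return sum(sum(wf) for wf in wavefronts)


def run_interleaved(wavefronts):
    """Each cycle, issue from the first wavefront that is not waiting."""
    pc = [0] * len(wavefronts)        # next instruction per wavefront
    ready_at = [0] * len(wavefronts)  # cycle at which each can issue again
    cycle = 0
    while any(pc[i] < len(wf) for i, wf in enumerate(wavefronts)):
        for i, wf in enumerate(wavefronts):
            if pc[i] < len(wf) and ready_at[i] <= cycle:
                # Issue it; this wavefront goes on hold until the result lands.
                ready_at[i] = cycle + wf[pc[i]]
                pc[i] += 1
                break
        cycle += 1
    # Finished once the last issued result has come back.
    return max([cycle] + ready_at)


# Four identical hypothetical wavefronts: a quick op, a slow 4-cycle
# fetch, then another quick op that needs the fetched data.
waves = [[1, 4, 1]] * 4
print(run_in_order(waves), run_interleaved(waves))
```

With a single wavefront the two functions take the same time, since there is nothing else to run during the wait. The win only shows up when there are other wavefronts available to fill the gaps.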
So now, we have much more efficient use of the shader units, with most of them being used all the time.
Very cool. Fewer shader units go a lot farther, and more shader units, even farther still.
But again, this does NOTHING WHATSOEVER TO PROVIDE EFFECTS VLIW ARCHITECTURE CAN NOT DO. There is no revolution here. There is no 3D accelerated hardware vs... not. There is no programmable shader vs Matrox fixed function. It's just more efficient. They can do the same things, so it's not a matter of "can it do this effect?", it's a matter of "how many can it do at once?"
Now, I don't know what the heck is REALLY going on in the Wii U's GPU. All signs point to a VLIW5 architecture with 128 shader units... Except... the games' visuals. Mario Kart 8 is mind-blowing for these specs if they are true. I don't know how they are doing it, but if they are using a VLIW5 architecture, they must have found a way to increase efficiency. I don't know what it is. Maybe they have AMAZIBALLS prefetchers, maybe it's the CPU on the same die and the L3 eDRAM cache. I don't know. But it's not your run-of-the-mill VLIW5 performance. At least not from what Nintendo has been showing.
That being said, the 'DX11 range GPU' crap is just that. It is LITERALLY barely anything more than marketing to get people to think their year-old GPU is somehow horribly, irrefutably useless and outdated, and to buy the 'DX11 compatible' GPU like the good little lemmings they are. (Let's face it, thanks to the PS4/Xbone being WAAAAYYYY underpowered compared to where the PS3/360 stood at their reveal, several-year-old computers now have a free ride for another console generation. If you are packing a Core i-series CPU, you are good to go, with minimal upgrades of RAM and maybe a mid-range GPU in a few years.)