{"id":4180,"date":"2012-05-04T16:42:14","date_gmt":"2012-05-04T21:42:14","guid":{"rendered":"http:\/\/www.unfocus.com\/?p=4180"},"modified":"2012-05-07T15:53:40","modified_gmt":"2012-05-07T20:53:40","slug":"backstage2d-the-gpu-augmented-flash-display-list","status":"publish","type":"post","link":"http:\/\/www.unfocus.com\/2012\/05\/04\/backstage2d-the-gpu-augmented-flash-display-list\/","title":{"rendered":"Backstage2D – the GPU Augmented Flash Display List"},"content":{"rendered":"

I’ve been playing with some 2D API ideas built on top of Flash’s Stage3D and ActionScript 3.0. I call it Backstage2D, the GPU-augmented Flash display list.

Currently, Backstage2D’s code base is mostly a proof-of-concept playground for some API ideas. Some things in this post may not match the git repo (for example, I’m still using “layer” instead of “surface”). There’s a lot left to do, but it works well enough to run a modified version of MoleHill_BunnyMark, which some folks from Adobe put together (I actually lifted most of my GPU code from that example, heh). That BunnyMark example was adapted from Iain Lobb’s BunnyMark, with some additions from Philippe Elsass. You can view the Backstage2D version of BunnyMark here (and check out the original BunnyMark MoleHill here).

Fork Backstage2D at GitHub.

The rest of this post describes the thought process that went into Backstage2D.

The Flash AS3 display list API is not the best way to utilize the massively parallel capabilities of a GPU, or to deal with the other limitations of a CPU/GPU architecture. The display list’s deeply nestable DisplayObject metaphor, and all the fun filters and blend modes, just don’t translate well to a highly parallel, flat GPU hardware renderer. All of this is especially true on mobile devices such as iPhones, iPads and Android devices, and those are the primary target for Backstage2D.

With an API like the traditional Flash display list, it’s easy to create situations that can’t easily be batched, because branching operations and other things that change the GPU state break parallel processing and slow everything down. You see this in Adobe AIR’s GPU render mode, where seemingly random things can have a huge negative impact on performance. Behind the scenes, AIR attempts to break the content into batches to speed things up. Using certain features, or using ordinary features in certain ways, can drop you out of a batch. When performance degrades, it’s not always clear why. Because of that, to get great performance you must target just a subset of the normal features, and apply a lot of discipline to make sure everything keeps running smoothly.
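To make the batching point a bit more concrete, here is a minimal sketch in TypeScript (not ActionScript, and not how AIR or Backstage2D actually implement batching). It assumes a simplified model in which consecutive draws can share one GPU draw call only when they use the same texture and blend mode. The names `DrawItem` and `countBatches` are hypothetical, made up for this illustration.

```typescript
// Hypothetical model of one sprite draw request. This is an illustration
// only, not an API from Flash, AIR, or Backstage2D.
interface DrawItem {
  texture: string;   // texture (atlas) the sprite samples from
  blendMode: string; // GPU blend state needed to draw it
}

// Counts the draw calls needed when consecutive items that share the same
// GPU state (texture + blend mode) are merged into a single batch.
function countBatches(items: DrawItem[]): number {
  let batches = 0;
  let prev: DrawItem | null = null;
  for (const item of items) {
    const stateChanged =
      prev === null ||
      prev.texture !== item.texture ||
      prev.blendMode !== item.blendMode;
    if (stateChanged) {
      batches++; // a state change forces a new draw call
    }
    prev = item;
  }
  return batches;
}

// Same four sprites, two different orderings.
const interleaved: DrawItem[] = [
  { texture: "bunnies", blendMode: "normal" },
  { texture: "ui", blendMode: "normal" },
  { texture: "bunnies", blendMode: "normal" },
  { texture: "bunnies", blendMode: "add" },
];

const grouped: DrawItem[] = [
  { texture: "bunnies", blendMode: "normal" },
  { texture: "bunnies", blendMode: "normal" },
  { texture: "bunnies", blendMode: "add" },
  { texture: "ui", blendMode: "normal" },
];

const interleavedBatches = countBatches(interleaved);
const groupedBatches = countBatches(grouped);
console.log(interleavedBatches, groupedBatches);
```

In this toy model the interleaved ordering needs one draw call per sprite, while the grouped ordering needs fewer, even though the content is identical. Reordering like this is only safe when it doesn’t change which sprites end up drawn on top of which, and that is exactly the kind of constraint a deeply nested display list makes hard to see.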

I wanted to do something different. I wanted to play with an API that is intentionally unlike the Flash display list: one designed to help the implementor (Flash developer or designer) understand how to arrange their content so that it renders very quickly, even on mobile devices, while still getting the benefits of all the glorious Flash stuff we are all used to.

Here are some of the primary principles I came up with, which impact the API: