Well, it’s only a tech demo at the moment. I’ve been playing with this Breakout like game for a while, trying to learn the ins and outs of Flash mobile development – particularly as it relates to performance. I now have the unBrix demo running at close to 60FPS (59.1) – smooth as silk.
This won’t run at 60FPS on Android Flash Player plugin in the browser (or Firefox on Mac!) – this post is about the iPhone build – but here’s the web version to look at anyway.
Here is a blurry video of the thing running as a native iPhone app on a 3GS (I smoothed out the choppy splash transition in a later build by setting the BG element with cacheAsBitmapMatrix):
The most important thing was to make sure GPU acceleration was working, and to learn what things will impact performance in that area.
It turns out, there are some important differences with how GPU accelerated Flash woks compared with the traditional software renderer. In the software Flash renderer keeping your display list shallow, and sparse (using addChild/removeChild a lot) or avoiding the display list completely (by writing to BitmapData – as the Flixel game engine does) is a key optimization for performance. This is how the exploding bunny video demo is done, and why it’s so fast.
My current theory is that on GPU accelerated content (even on desktop) the reverse is true. You want to avoid CPU/system RAM to GPU/video RAM updates as much as possible – which means avoiding BitmapData updates which cause the player to upload a new texture to the GPU VRAM with every update. Because I don’t have access to the internals of the Flash Player architecture, I can’t be sure, but I think the bottle neck comes from clogging up the lanes between the CPU and GPU, and all stressing all three areas of the rendering pipeline (CPU, GPU and the bus) as they juggle around objects in memory. The key observation this conclusion is based on is the large performance impact addChild and removeChild has on the framerate. So I relentlessly avoid that in my iPhone Flash development – I precache everything, and don’t mess with the display list. This is also one reason why filters (which operate on a BitmapData representation of the DisplayObject you apply them to) are not recommended on mobile content.
Anyway, hopefully I can turn this into a full app for iPhone in a reasonable timeframe.
Steve Jobs is full of crap. I could actually understand and respect a straightforward admission that the Flash Platform is a threat to Apple’s iOS business model – which is the real reason Jobs won’t let Flash on the iPhone and iPad. That’s not even a very good reason – the App Store has many compelling features on it’s own, even if Flash is in the browser, not the least of which is the easy to understand path to monitization. Performance is another issue – Flash is fast enough, faster than JS/Canvas by quite a bit, but it’s still not as fast as a native app and all it’s OpenGL goodness (among all the other great Apple APIs). Keeping Flash off iPhone (and especially CS5 iPhone apps) has nothing to do with performance, or compatibility. That’s just bunk! And nobody likes a liar.
Here are some screenshots to show how well many sites I’ve been involved with work in Flash on the iPhone (3GS using jailbreak and Frash).
adcSTUDIO’s Home Page
adcSTUDIO’s About Us Screen
Terminal Design (before activating Frash)
Terminal Design’s Home Page (with Flash)
Make Beliefs Comix app.
A quit note on the technology: this is using Frash, which is a hack (a wrapper or compatibility layer) to get the Android version of Flash Player 10.1 running inside of iOS – it has not been optimized (or even completed at this point) to run well on iOS yet, and probably will never run as well as it can on Android. A recent test app I’ve been playing with runs 50% faster on a Droid Eris, vs. the iPhone 3GS – the Eris is slower hardware, running Android 2.1. It’s also crashy, and is missing features like streaming movie support (it does work with videos embedded in a swf) – and touch events are not quite as streamlined as they are on a real Android device (hover works better on Android for example, and hot spots are easier to hit). It’s also got all the quirks of the Wii and desktop Flash Player’s “noscale” feature (on Android there is a workaround that solves this, not implemented in Frash yet).
I just like to point that out, because some users are judging the viability of Flash on iOS/iPhone/iPad based on this hack (which is current at version 0.02), which is beyond silly.
For anyone interested, I followed these instructions to install it on my iPhone. Note: this uses SSH and dpkg to install. If you don’t know how to reverse that, you may want to find an apt repo and use Cydia to grab this, so you can uninstall Frash when you are done playing, as it can be quite unstable.
I wanted to see how far I could push that exploding Actionscript 3.0 code – see if Flash could handle updating each animating pixel every frame, while playing a video, then blurring it. Sure enough, it can! It did take further optimization from the version I posted the other day – including swapping copyPixels with getPixel/setPixel, and removing an anonymous function call (wow that was expensive!). Here it is:
Note: This WILL run like slush on the debug player. I don’t know why. If anyone knows why, please let me know!
Well I guess technically the pixels don’t explode as much as the DisplayObject explodes into pixels! I recently needed an effect that would make a bitmap image look sparkley, so I did some goggling, and game across a Firefly particle effect on a blog post belonging to Erik Hallander (at least I think so, the blog has been down for months, so I can’t double check). This pretty impressive effect looks like the following example (I hope reposting it here is not a problem).
Note: This is a modified version of the original adding the Stats.as box, and autolooping – and removes the actual firefly affect (I didn’t need that part for my purposes).
Very nice start! I don’t get an FPS problem – on my computer the example above rocks 62/60 fps (25% CPU on my Core 2 Duo)! So FPS was not the big problem. This example uses over ~19-23MB of RAM (with a lot of fluctuation)! And that is with 2×2 pixels, it goes up higher with 1×1 pixels. Additionally, this example already has an optimization in it to skip over empty (black) pixels in the DisplayObject it works on – which leads to a significant RAM savings.
Using the display list this way – and two filters per DisplayObject – it began causing the player to kick up a lot of invalid BitmapData/null reference errors (which I’m guessing is what happens when you run out of memory, since many many checks confirmed that the BitmapData was not invalid) – especially when I tried to make it work on 1×1 pixels to animate every pixel.
So the first thing I did was to clean up some of the obvious stuff, to bring down the memory usage – in the original blog post, Erik noted that this was unoptimized code, so I knew what I was getting into. I did things like remove the extra nested DisplayObjects (each pixel was a subclasses Sprite instance, with a BitmapData added to it), and cleaned up extra variables that were laying around, moved a lot of things inline, reused as many variables as I could, cutting down on object instantiation – and followed a lot of the other tips in a conveniently timed ByteArray post. Doing that really helped – I cut the memory use about in half – and on the initial animation (a black and white logo) the affect seemed worked quite well. But it didn’t scale well – larger images simply wouldn’t work.
I still wanted to use this affect, and I’ve seen many thousands of pixels being animated before – so I knew it was possible. So a radical departure. I’ve been reading about drawing directly to Bitmaps for quite a while, and that was going to be my path. So more optimizations – removed the display list code completely, changed the Pixel class to a simply property class (where are the enums?!), and used that to store information about where in the original source to look for the pixel data as well as other relevant animation data (most of which was already done for me – thanks!) for each pixel block. I also removed dependence on TweenMax – which is what the original uses for all the animations – and used the easing equations directly, within an ENTER_FRAME event.
The result is a RAM reduction by 25% and a steadier memory usage, coming in at ~5MB with 4x as many pixels (and roughly the same amount of CPU). The changes utilize copyPixels and a linked list, with an accumulation buffer like effect – for a total of 3 bitmaps (the original, which is rendered from the DisplayObject and stored, the scattered one the pixels get copied into, and the copy of that that gets blurred by the BlurFilter – a hidden memory cost illuminated by Thibault Imbert).
There are further optimizations that can be used as well (and should really be used for full image per pixel animations) – such as writing to an opaque BitmapData, rather than one with Alpha, and reducing the BlurFilter quality – getting a better handle on type marshaling, etc. It might also be faster to store the RGB value of each pixel, and draw those directly instead of using copyPixels, but I haven’t tried that yet.
I got so much help from the Flash community on this, that it would be irresponsible not to share this back, so feel free to check out the source.
The memory usage applies to both swfs on this page – so you can’t see the memory usage difference in these examples. I quoted the standalone Flash Player in this post.
Also, I’m getting some kind of performance problem in plugin browsers (everything except IE) on Windows, and on every browser on mac but Firefox which is limiting both of these to around 30FPS. I have no idea what’s causing it.
On the code quality – the code isn’t all that messy IMHO, but it is not well documented, and a lot of the configuration hooks I left in are not really being utilized in a decent API – I may refactor at some point to clean that up. There is also a limitation of the skip pixel check that will keep it from working well for greater than 1×1 pixel size (since it only checks the top left corner of the size rect).
While trying to come up with a way to get two different movies loaded at the same time, to play at different frame rates, I came up with a method to recursively stop all child movies of an as3 MovieClip. I didn’t end up using it, but I thought it might be useful for someone, so here it is:
The plan was to use that to stop playing movies, every other frame, and restart them on the alternative frames, but there is apparently no way to tell if a MovieClip is currently playing or not, to know which ones to restart.
I did end up hacking Tweener to add support for a Timer based update method. That way I can adjust the stage FPS to match the older timeline content (at 12fps) and have my Tweener based interactions work at a silky smooth 60 FPS.