Game Optimization Adventures, part 1

Starting on Wednesday (today is Friday) I began the process of optimizing the iPhone game app.

Using Instruments, I could see a total Live Bytes value averaging around 40 MB while the game was running.  Since then I’ve done a couple of tasks and also introduced a PVR texture (haven’t got it working yet).  Suddenly, the live bytes (after everything has loaded) has dropped not to 20 or 30 MB, but to 4 MB.  When the app starts, it loads up about 15-20 MB of memory and then dumps all of it except 4 MB.  I’ll say more about this later on.

Now, that’s a colossal drop in memory use: from peaking at 40 MB down to 15-20 MB, and now a mysterious (but very happy) drop after loading to around 4 MB.

Over the last 4 hours (and some hours yesterday) I’ve been making pages of notes on values and changes from different configurations of the app.  To compare them, I built 4 separate Instruments save files using the Xcode Run>Run with Performance Tool>Object Allocations automated setup.  The four files cover these configurations: JPG texture, recording stopped at peak live bytes; JPG texture, recording stopped after the drop to 4 MB; PVR texture, recording stopped at peak live bytes; and PVR texture, recording stopped after the drop to 4 MB.

These files led me to the strange, inexplicable fact of the moment (which I haven’t drawn any conclusion about yet):  the memory being allocated internally by glTexImage2D for my JPG textures is being released.

One reason I initially dropped from 40 MB to a moderately lower amount is that I removed a texture that was hogging 12 MB by itself and wasn’t even being used.  My bad.  Actually, it wasn’t my fault, haha, blame someone else.  It came from the 3rd texture channel in Blender.  The game engine I use claims to only use two texture channels from Blender, because OpenGL ES only supports two texture channels.  Fine.  So that’s why the documentation for the game engine recommends using texture channels outside the first two.  That I did.  But apparently it exports the rest anyway, even though they’re beyond the first and second channels.  So I removed that texture (2048 x 2048 pixels x 3 bytes/pixel = 12 MB).  I’ve also removed a handful of other textures that I used before, and that has brought the peak load size down from 40 MB to 15-20 MB.  The 5 MB spread there is the difference in size between the JPG and the PVR.
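For anyone who wants to check the texture math, here is a minimal Python sketch of it. The helper name `texture_bytes` is made up for illustration, and it assumes an uncompressed texture with no mipmaps, which may not match what the engine or the driver actually allocates.

```python
MIB = 1024 * 1024  # bytes per MB, as Instruments reports sizes


def texture_bytes(width, height, bytes_per_pixel):
    """Size in bytes of an uncompressed texture (assumes no mipmaps)."""
    return width * height * bytes_per_pixel


# The unused texture from the 3rd Blender channel: 2048 x 2048, 3 bytes/pixel.
unused = texture_bytes(2048, 2048, 3)
print(unused, "bytes =", unused / MIB, "MB")
```

Swapping in other dimensions or a 4 bytes/pixel format gives a quick estimate of what any other texture should cost before looking for it in Instruments.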

So, assuming I’m still using a JPG texture: the peak live bytes is 20.30 MB, and the live bytes after the drop is 4.67 MB.  On the ObjectAlloc graph in the upper pane of Instruments, the line sits at the peak for a second and then suddenly nose-dives to the low value.

Now to examine the useJPG-peakLiveBytes Instruments file: sort the Live Bytes column in Summary View from high to low, and you can see which Malloc categories are taking up the most space.  In my case, the biggest ones are Malloc 4.00 MB (# Living = 2) and Malloc 1.00 MB (# Living = 5).  Beyond that, there is Malloc 2.66 MB (this is not a texture, it is part of the Bullet physics engine, so I ignore it) and Malloc 256.00 KB (# Living = 4).  Adding up just 2 x 4 + 5 x 1 + 4 x 0.25 gives 14 MB out of 20 MB right there.  There are dozens or hundreds more memory allocations, but they get very small and don’t matter nearly as much as these large image textures.
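The same tally can be written as a few lines of Python. This is just a sketch of the arithmetic using the numbers read off the Summary View above; the variable names are made up, and the Bullet allocation is left out on purpose.

```python
# (category size in MB, # Living) for the texture-sized Malloc categories
big_mallocs = [
    (4.00, 2),   # Malloc 4.00 MB
    (1.00, 5),   # Malloc 1.00 MB
    (0.25, 4),   # Malloc 256.00 KB
]
peak_live_mb = 20.30  # peak live bytes with the JPG texture

total_mb = sum(size * living for size, living in big_mallocs)
share = total_mb / peak_live_mb
print(f"{total_mb:.2f} MB of {peak_live_mb:.2f} MB ({share:.0%})")
```

So roughly two thirds of the peak comes from just three Malloc categories.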

Examining the useJPG-dropLiveBytes Instruments file in the same manner, the biggest live bytes category is Malloc 2.66 MB (the previously mentioned physics stuff) … and there’s nothing left of Malloc 4.00 MB and Malloc 1.00 MB.  So I clicked the Category column header to sort by category, scrolled down, and saw Malloc 4.00 MB with # Living = 0, Malloc 1.00 MB with # Living = 0, and Malloc 256.00 KB with # Living = 0.

That was predictable.

But the chart still shows a peak and a nose-dive, and so I asked myself “why does this build up and then nose-dive, and when I started this on Wednesday, it was building up to 40 MB and just staying there?”

Now I know why the peak isn’t so high, but why does it fall?  Instruments to the rescue!

Using the dropLiveBytes file, still sorted by Category, I clicked the right-pointing arrow-in-circle beside Malloc 4.00 MB.  This shows two rows, and both rows have a blank cell under the Live column.  In the peakLiveBytes file, the same Malloc 4.00 MB rows would have a little black dot in that cell.

Then I clicked on the Extended Detail View button and looked at each of the two rows… in each case the 4 MB were being realloc’ed within glTexImage2D.

Now this is where random clicking and exploring in Instruments really helped me.  Clicking the right-pointing arrow-in-circle beside the Object Address value in a row shows the history for that Object Address.  The columns of interest in my case are Event Type, Timestamp, Size, and Responsible Caller.  I sorted the list of two rows by Timestamp, ascending.

The Event Type column shows Malloc, then Free.  Okay!  So what else is in the second row along with Free?  Well, obviously the same Address, so that’s not important.  Size shows -4194304 (that’s 4 x 1024 x 1024 bytes = 4.00 MB), and Responsible Caller shows glDrawElements.

So I examined the other 4 MB and the 1 MB alloc objects in the same manner, and discovered that they were all being free’d in glDrawElements.
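To make the peak-versus-drop difference concrete, here is a small Python sketch that replays an object history like the one above. It is only a model of what the history view shows: the `Event` class, the address and the timestamps are made up for illustration, and only the event types, the 4 MB size and the two callers come from my Instruments files.

```python
from dataclasses import dataclass


@dataclass
class Event:
    address: int      # hypothetical address; Instruments shows the real one
    event_type: str   # "Malloc" or "Free"
    timestamp: float  # hypothetical; only the ordering matters here
    size: int         # bytes
    caller: str       # Responsible Caller column


def live_bytes(events):
    """Replay events in timestamp order and return the bytes still living."""
    live = {}
    for e in sorted(events, key=lambda e: e.timestamp):
        if e.event_type == "Malloc":
            live[e.address] = e.size
        elif e.event_type == "Free":
            live.pop(e.address, None)
    return sum(live.values())


history = [
    Event(0x1000000, "Free", 2.0, 4 * 1024 * 1024, "glDrawElements"),
    Event(0x1000000, "Malloc", 1.0, 4 * 1024 * 1024, "glTexImage2D"),
]

# Stopping the recording before the Free corresponds to the peak file.
at_peak = [e for e in history if e.event_type == "Malloc"]
print(live_bytes(at_peak))   # 4194304
print(live_bytes(history))   # 0
```

The peak file is just the same history cut off before the Free event arrives, which is why the Live column has a dot there and a blank cell in the drop file.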

A quick jump over to khronos.org’s glDrawElements page and a search of the page for “free” turned up nothing useful.

And now I’m puzzled… why are these objects being realloc’ed in glTexImage2D, and then freed, but they weren’t being freed before?

I have no idea.  But I know I’ve got to get the PVR texture working, and it’s nice to know that the game should not get dumped by the iPhone OS anymore (I hope).

More adventures will follow.  Take care.

