Archive for the 'Actionscript' Category

How slow is static access in as3/avm2 (exactly)

A reader’s comment on my previous post on Singletons asked for some evidence that static access in as3 is indeed ‘10 times’ slower. I remembered having read the ‘10 times’ figure somewhere, but couldn’t find anything by quick googling. Uneasy, I decided to put up a quick benchmark, and went through more than one surprise. The code can be found here.

Four tests are performed, each at 1,000,000 iterations:

1] The first test compares access times for a property of the calling object and a static property of the class definition. Both are accessed without ‘.’ operators: they are simply referenced by their names.

2] The second test does the same, but for properties of a referenced object. The object’s property is accessed with a typedReference.propertyName syntax, and the static property through a ClassName.propertyName syntax.

3] The third test compares call times for a method of the calling object and a static method of the class definition. The access syntax is the same as in the first test.

4] The fourth and last test compares method call times on a referenced object, accessed as in the second test.
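The tests follow the familiar getTimer() stopwatch pattern; here is a minimal sketch of the second test (the class and property names are mine, not from the linked sources):

```actionscript
// Hypothetical sketch of the 'referenced object vs other class' test.
import flash.utils.getTimer;

var other : SomeClass = new SomeClass ();
var t : int = getTimer ();
for ( var i : int = 0; i < 1000000; i ++ )
	other.property = other.property + 1;                       // instance access
trace ( "instance:", getTimer () - t, "millisec" );

t = getTimer ();
for ( i = 0; i < 1000000; i ++ )
	SomeClass.staticProperty = SomeClass.staticProperty + 1;   // static access
trace ( "static:", getTimer () - t, "millisec" );
```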

Without thinking much about it, I compiled in debug mode and ran the swf in the fp10 debug player. The output was as follows (imagine my surprise):

Getting & setting a property of this object :          104 millisec
Getting & setting a static property of this class :    109 millisec
Static access is slower by :                           5%

Getting & setting a property of another object :       106 millisec
Getting & setting a static property of another class : 178 millisec
Static access is slower by :                           68%

Calling a method of this object :                      317 millisec
Calling a static method of this class :                318 millisec
Static access is slower by :                           0%

Calling a method of another object :                   311 millisec
Calling a static method of another class :             397 millisec
Static access is slower by :                           28%

Thus, no slowdown at all! I was already writing my apology to the reader when I realized my mistake. I recompiled the benchmark in release mode; while still running in the fp10 debug player, the numbers changed dramatically:

Getting & setting a property of this object :          7 millisec
Getting & setting a static property of this class :    10 millisec
Static access is slower by :                           43%

Getting & setting a property of another object :       8 millisec
Getting & setting a static property of another class : 94 millisec
Static access is slower by :                           1075%

Calling a method of this object :                      90 millisec
Calling a static method of this class :                93 millisec
Static access is slower by :                           3%

Calling a method of another object :                   92 millisec
Calling a static method of another class :             176 millisec
Static access is slower by :                           91%

Finally, I opened the swf with the fp10 release player. Things sped up even more, and the static access overhead grew in percentage terms. Oddly, there was one exception to the reduced timings: getting and setting a static property of another class actually proved slower in the release player than in the debug player. I would blame this on my selection of players, even though I am pretty confident I got the debug and release players from the same zip on the Adobe website.

Getting & setting a property of this object :          7 millisec
Getting & setting a static property of this class :    10 millisec
Static access is slower by :                           43%

Getting & setting a property of another object :       6 millisec
Getting & setting a static property of another class : 133 millisec
Static access is slower by :                           2117%

Calling a method of this object :                      10 millisec
Calling a static method of this class :                13 millisec
Static access is slower by :                           30%

Calling a method of another object :                   12 millisec
Calling a static method of another class :             142 millisec
Static access is slower by :                           1083%

The moral is twofold. On the one hand, accessing the static members of a class from within the scope of the class itself is not too expensive (which also means that Borg designs in as3 are not that bad an idea performance-wise [I was wrong]), but accessing the static members of other classes through their Class objects is indeed very slow and should clearly be avoided when performance is at stake. On the other, remember the ‘benchmarking gotchas’: always compile benchmarks in release mode and run them in the release player; the debug mode/player can produce very distorted timings.


True random numbers in flash/as3: measuring clock drift

This is one of those things that I will surely never come to use; it crossed my mind a couple of days ago though, and I thought it would be worth putting together and then up here.

There are many potential sources of true randomness in flash. An example would be listening for microphone or camera noise; such an approach is, however, contingent on access to a/v hardware. Entropy can also be pooled from simple user interactions, such as mouse movements. Still, any entropy pool can run dry after a series of requests if it is not given the chance to rebuild.

Measuring clock/cpu drift is very expensive, but provides pretty unpredictable output and, because random bits are generated rather than pooled, does not limit the number of random bits that one can obtain at any given time.

package com.controul.math.rng {
	import flash.utils.getTimer;
	public class ClockDrift {
		public function random ( bits : uint = 32 ) : uint {
			if ( bits > 32 )
				bits = 32;
			var	r : uint = 0,
				i : uint = 0,
				t : uint = getTimer ();
			for ( ;; ) {
				if ( t != ( t = getTimer () ) ) {
					if ( i & 1 )
						r |= 1;
					bits --;
					if ( bits == 0 )
						break;
					i = 0;
					r <<= 1;
				}
				i ++;
			}
			return r;
		}
	}
}

The algorithm counts the number of loop iterations that happen during a millisecond, and then sets the next bit to 1 if this count is odd, or to 0 if it is even. As it takes a millisecond to produce each random bit, a random uint (zero to 0xffffffff) takes 32 milliseconds to generate.
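Usage is straightforward; a quick sketch (the timing comments are approximations following from the one-bit-per-millisecond behaviour described above):

```actionscript
// Usage sketch for the ClockDrift class above.
import com.controul.math.rng.ClockDrift;

var rng : ClockDrift = new ClockDrift ();
var byte : uint = rng.random ( 8 );   // ~8 ms: one random byte, 0-255
var word : uint = rng.random ();      // ~32 ms: a full random uint
```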

Such an extremely slow solution may be most useful as a last resort for keeping an entropy pool from running dry; the pool can rely on a mix of other sources, like user mouse movements, download speed sampling, a/v hardware noise, enterframe timing, etc., to provide enough random bits for occasional requests.

Anyway, only hardcore security stuff, such as the as3crypto framework, needs unpredictable ‘random’ number generation. For non-cryptographic uses, one should go for a regular prng, be it Math.random or one with a specifiable seed value, such as the Park-Miller prng supplied by polygonal labs.



A better as3 compiler? The logical consequences that never happened

More than a year ago, Joa Ebert posted a brief summary of the “logical consequences” of Adobe’s opening of the Flex SDK sources: now that the sources of the actionscript compiler were open, the community was free to branch out a more advanced one. Function inlining, mixing as3 and opcodes, optimised integer math (I do NOT want to work with Numbers), compile-time evaluation of constant expressions; the list of possible enhancements is long. Why bother? Because the as3 compiler makes little use of the avm2’s true performance potential (things got worse with fp10 and the Alchemy bytecodes, which are unavailable to as3), performs no compile-time optimisation of your code and outputs redundant bytecode overall, especially in the case of integer arithmetic.

A year passed, and no one really raised the question again. Well, maybe kind of. Alchemy did create quite a buzz, due to the obvious absurdity of c++ being quicker than as3 when compiled down to abc, and some of the mighty few did start thinking about how (and why) on earth such a thing would be possible. Some explanations came along, and the dark magic of the LLVM kind of concealed the fact that the one and only as3 compiler is crap. Note that the Alchemy compiler overcomes the shortcomings of the as3 compiler by outputting assembler when necessary.

Well, if the dreamed-of as3 compiler is a far-fetched thing to strive for, Joa Ebert’s as3c gave the community a glimpse into the potential for extension of the existing asc. As far as I know, as3c is still the only way to inline assembler in the body of a program that will boil down to abc. Joa’s efforts have pretty much been discontinued however, and some opcodes are left unimplemented. Also, the as3c workflow is kind of heavy – one could hardly wire as3c into FlashDevelop, for instance.

On the other hand, there is a way to produce highly optimised abc, if one is ready to give up the language. Nicolas Cannasse‘s haXe has pretty much all the language features an as3 coder craves, such as enums, type generics and, recently, an api to the new avm2 Alchemy-related opcodes. What truly rocks the boat is that the haXe compiler is able to boost code performance in the avm2 to a much greater extent than what compiled as3 gets: indeed, the haXe compiler boasts features such as function and constant inlining, optimised integer math, and a strong expression optimiser that does some last-minute code refactoring for performance. Finally, the haXe format library even enables you to write in assembler, which you can assemble to bytecode, load and run on the go.

But seriously, if one guy can design an entirely new language, able to compile down to highly optimised avm2 bytecode, with all the language features missing in as3, why can’t an entire community (and an interested corporation, for that matter) produce a better compiler for the already existing as3? Obviously, it is no one’s responsibility to take on this task (community-wise; Adobe should take the matter seriously); but in my humble opinion it’s everyone’s responsibility to be talking about it.

I believe that we should put forth a community project which would at least define some of the parameters of the as3 compiler of tomorrow.

P.S. A question about haXe and Nicolas, is there no way to make bytecode insertion happen at compile time in haXe (as opposed to during runtime with the format library?) Generating and loading swf-s is extremely sweet, but ties one down to writing entire classes in opcodes; mixing haXe and opcodes, with an asm{} keyword for instance, would be much more flexible, and absolutely amazing.


Flash/as3 custom namespaces and performance: the name qualifier is slow

I was thinking about how to organise the classes of one of the frameworks I’m working on these days. Two options: either dump everything in the same package and make everything internal, or use namespaces to keep the API clean.

So, yeah, namespaces sounded like the right way to go. A lot of the functionality in the framework involves deep tree traversal, however, so I found myself thinking about the performance implications of using custom namespaces on ‘hot’ properties.

I put up a little benchmark to see whether namespaces slow down code execution. It runs four tests: a for loop that reads a value from a public property of the calling method’s object, one that accesses a protected property of the object’s superclass, one that accesses a property in a custom namespace after a ‘use namespace’ statement, and one that accesses the property with the :: name qualifier. You can get the sources here.
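For reference, the two custom-namespace access styles look roughly like this (a sketch; the namespace, class and property names are mine, not from the benchmark sources):

```actionscript
// Hypothetical names; the real benchmark sources are linked above.
public namespace hot;

public class Node {
	hot var value : int = 0;

	public function read () : int {
		use namespace hot;         // namespace opened: plain-name access is fast
		var a : int = value;
		var b : int = hot::value;  // explicit :: qualifier: slow in tight loops
		return a + b;
	}
}
```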

On my notebook, the test (running on release player 10) output the following:

Accessing a property in the public namespace.
>	Running test @ 1000000 iterations ...
	Execution took  0.003  seconds.
Accessing a property of the superclass in the protected namespace.
>	Running test @ 1000000 iterations ...
	Execution took  0.003  seconds.
Accessing a property in a custom namespace.
>	Running test @ 1000000 iterations ...
	Execution took  0.003  seconds.
Accessing a property in a custom namespace with the :: name qualifier.
>	Running test @ 1000000 iterations ...
	Execution took  0.263  seconds.

Properties in the custom namespace were just as quick to access as those in the ‘built-in’ namespaces when made available with a ‘use namespace’ statement, and ridiculously expensive when accessed with the name qualifier.

In brief: avoid using :: in loops when performance is at stake.


The much dreaded Singleton

I have come across so much contempt towards the Singleton that it’s almost scary. Singletons are crucified all over the place, and whilst there are very sound theoretical reasons to avoid abusing the pattern, the Singleton has a healthy place in ui programming.

More often than not, development for Flash confronts us with unique instances of things. Usually this happens in the context of UI programming: a single stage, a single mouse, etc. Some aspects of many applications are also better off centralised, for sometimes one simply wants to avoid two objects performing clashing updates on something (a good example is animation of gui components and the overwriting of property interpolation tasks in tweening engines). In such situations, one who blindly aims to avoid building a Singleton most probably ends up isolating the shared state needed to sync the instances in static vars and methods, achieving (in the best case) something that looks more or less like a Borg design. New catchy notions, same old enemy – global state.


This has nothing to do with loose coupling or design patterns, but a Singleton is usually a better alternative to static vars and methods for at least one big reason: static stuff in Actionscript is very slow to access. (What was it, 10 times slower?) A call to a method on a singleton will use one static var lookup – the reference to the singleton – and only if a reference to the object is not already locally available. The same method, if declared static, will probably have to access at least one more static variable in order to read and write global stuff. [EDIT: This is not entirely correct, see the follow-up benchmark on static access performance.]

It seems that in Python sharing state among instances is considered more elegant than using Singletons. One could easily extend the same logic to Actionscript: more transparent code, good-looking instantiation with new, etc. Still, if more than one static var is to be accessed during method execution on such objects, or if the objects are to be created and discarded on a regular basis, Singletons will most likely be the faster alternative performance-wise.

In this context, let me demonstrate my personal approach to implementing a Singleton:

public class Singleton {
	public static const instance : Singleton = new Singleton ();
	public function Singleton () {
		if ( instance ) throw new Error ( "Already instantiated." );
	}
}

My trick looks like an exploit and a generally terrible idea (now how could a constant change its value?), but it works perfectly fine AND avoids the overhead of a getInstance () method call and of accessing a private static var to return the instance.
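Usage then reduces to a single static lookup (a sketch; doSomething is a hypothetical instance method):

```actionscript
var s : Singleton = Singleton.instance;  // one static var lookup, done
s.doSomething ();                        // hypothetical instance method

new Singleton ();  // throws Error: "Already instantiated."
```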

At least some loose coupling please

Static vars and methods, shared state among instances, and Singletons are almost equally deadly to loose coupling. In all of these cases, typing and access issues hinder substitution of the class in question in code variations or upgrades.

The above is completely true, unless one decides to take the Singleton pattern a bit beyond what it usually looks like and to add a layer of abstraction:

public class AbstractSingleton {
	private static var instance : AbstractSingleton;
	public static function getInstance () : AbstractSingleton {
		if ( !instance ) throw new Error ( "Not instantiated." );
		return instance;
	}
	public function AbstractSingleton () {
		if ( instance ) throw new Error ( "Already instantiated." );
		instance = this;
	}
	public function abstractMethod ( ... args ) : * {
		//	to be overridden.
	}
}

If used as intended, this should do the trick. One could probably do a few things to strengthen the implementation: preventing instantiation of the abstract class, even adding some sort of auto-instantiation try-catch trick for the child classes, and so on and so forth. What’s important, however, is that this ‘abstract’ Singleton pseudo-pattern provides freedom in choosing which concrete object to instantiate at app start, and that’s already quite a gain in terms of loose coupling.
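For illustration, a hypothetical concrete subclass (the class and method names are mine, not part of the pattern):

```actionscript
// Hypothetical concrete implementation, chosen once at app start.
public class RemoteLogger extends AbstractSingleton {
	override public function abstractMethod ( ... args ) : * {
		trace ( "logging:", args );
	}
}

// the bootstrap code decides which concrete class backs the singleton:
new RemoteLogger ();

// callers remain coupled only to the abstract type:
AbstractSingleton.getInstance ().abstractMethod ( "hello" );
```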

Singletons can be rather healthy in a UI context

Again, Singletons are a good thing in the context of gui programming. All voluminous and immutable objects are healthy candidates for becoming part of the global state; a provider for a huge and static bitmap is a great example, and an audio mixer for button noises and ambient loops also illustrates the point. There is a real benefit to keeping these globally accessible and limited to a single instance: the underlying bulky resource is guaranteed to be loaded into memory only once.

Also, when an application relies heavily on client/server interactions, and when these put ui responsiveness at stake, a globally accessible remoting service, be it static, Singleton or Borg, will enable the program to centrally cache the results of repetitive interactions.
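A minimal sketch of such central caching, using the const-singleton trick from above (all names here are mine and purely illustrative):

```actionscript
// Hypothetical globally accessible service that caches repeated results.
public class RemotingCache {
	public static const instance : RemotingCache = new RemotingCache ();
	private var cache : Object = {};

	// 'request' stands in for whatever actually hits the server.
	public function call ( key : String, request : Function ) : * {
		if ( cache [ key ] === undefined )
			cache [ key ] = request ();  // server is hit only once per key
		return cache [ key ];
	}
}
```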

An edit to my boring conclusion

Instead of my previous empty conclusion, I’ll post someday an example of how one could actually get unmatchable loose coupling with the help of a Singleton registry for services.
