Monthly Archive for April, 2009

Random numbers: standard normal distribution in flash/as3

Posting about measuring clock drift got me in the mood of poking random numbers a bit more. Well, about an year ago I decided I needed normally distributed prng output, but never really used it for anything. I am sharing a solution in my best hope that someone will spare themselves the headbanging part.

The normal distribution is one of those things one should be acquainted with if about to deal with statistics, simulations or even procedural art. Anyhow, the main problem with normal pseudo-random numbers in flash is obtaining them, because the usual output of every prng follows the uniform distribution and one has to map that output to the distribution of interest. The reasonable solutions I have explored are three; one is using lookup tables, which is self-explanatory and very inexpensive, but unfortunately also painfully boring.

Another is making use of the central limit theorem, which in real life boils down to this. One can take, say, 10 numbers extracted from a uniformly distributed population (read Math.random) and take their average, whose sampling distribution will approximate to some extent the normal distribution.

I fancy using the Box-Muller transform or, more specifically, the Marsaglia polar method, which is a less computationally expensive variation of the former. Both take a pair of uniformly distributed inputs and map them to two (independent and uncorrelated) standard normal outputs. Effectively, this means that one could take any prng, be it Math.random, a Park-Miller lcg (again, check out the one on Polygonal labs), a Mersenne twister or whatever, and map its output to the standard normal distribution.

For better performance, the mapping algorithm could be built into the prng so that all operations are performed within one call. This is even more valid for the Marsaglia method, because it relies on rejective sampling, which means that it rejects a pair of random inputs every now and then and requests another. The code snippets below demonstrate the Marsaglia transform built into a Park-Miller prng, whose core logic consists in just a couple of lines:

public class ParkMiller
{
	/**
	 *	Seeds the prng.
	 */
	private var s : int;
	public function seed ( seed : uint ) : void
	{
		s = seed > 1 ? seed % 2147483647 : 1;
	}

	/**
	 *	Returns a Number ~ U(0,1)
	 */
	public function uniform () : Number
	{
		return ( ( s = ( s * 16807 ) % 2147483647 ) / 2147483647 );
	}

When you call uniform(), the seed value is multiplied by 16807 (a primitive root modulo) and set to the remainder of the product divided by 2147483647 (a Mersenne prime, 2^31-1, or the int.MAX_VALUE). This new value is returned as a Number in the range (0,1).

The histogram below illustrates the uniform distribution of the prng output:

The uniform pseudo-random values will be fed to the Marsaglia transform, whose simple algorithm is well described in its Wikipedia article. An as3 implementation with inlined uniform() getters could look like this:

	/**
	 *	Returns a Number ~ N(0,1);
	 */
	private var ready : Boolean;
	private var cache : Number;
	public function standardNormal () : Number
	{
		if ( ready )
		{				//  Return a cached result
			ready = false;		//  from a previous call
			return cache;		//  if available.
		}

		var	x : Number,		//  Repeat extracting uniform values
			y : Number,		//  in the range ( -1,1 ) until
			w : Number;		//  0 < w = x*x + y*y < 1
		do
		{
			x = ( s = ( s * 16807 ) % 2147483647 ) / 1073741823.5 - 1;
			y = ( s = ( s * 16807 ) % 2147483647 ) / 1073741823.5 - 1;
			w = x * x + y * y;
		}
		while ( w >= 1 || !w );

		w = Math.sqrt ( -2 * Math.log ( w ) / w );

		ready = true;
		cache = x * w;			//  Cache one of the outputs
		return y * w;			//  and return the other.
	}
}

The following histogram displays the distribution of this new getter:

Performance-wise, there is some ground for reducing the algorithm’s overhead. One could, for ranges of values, approximate the square root with, for example, an inlined Babylonian calculation. The natural logarithm can also be easily approximated for values not too close to zero. Still, every approximation comes at the cost of some precision, and the above implementation will be fast enough for most uses; on my notebook I get a million numbers for about 350 milliseconds.

Finally, the Ziggurat algorithm is an alternative to the Marsaglia transform, and has the promise of better perfomance if well optimised. Still, I personally haven’t managed to make it work all that great in as3.

Source: ParkMiller.as

Bookmark and Share

How slow is static access in as3/avm2 (exactly)

A reader’s comment to my previous post on Singletons asked for some evidence that static access in as3 was indeed ’10 times’ slower. I remembered having read the 10 times thing somewhere, but couldn’t find anything by quick googling. Uneasy, I decided to put up a quick benchmark. I went through more than one surprise. The code can be found here.

Four tests are being performed at 1m iterations:

1] The first test compares access times to a propety of the calling object and a static property of the class definition. Both are accessed without ‘.’ opertators: they are simply referenced by their names.

2] The second test does the same, but for propeties of a referenced object. The object’s property is accessed with a typedReference.propertyName syntax, and the static property through a ClassName.propertyName syntax.

3] The third test compares call times for a method of the calling object and a static method of the class definition. The access syntax is the same as in the first test.

4] The last, fourth test compares method call times on a referenced object. This is done like in the second test.

Without thinking too much about it, I compiled in debug mode, and ran the swf in fp10 debug. Output was as follows (imagine my surprise):

Getting & setting a property of this object :          104 millisec
Getting & setting a static property of this class :    109 millisec
Static access is slower by :                           5%

Getting & setting a property of another object :       106 millisec
Getting & setting a static property of another class : 178 millisec
Static access is slower by :                           68%

Calling a method of this object :                      317 millisec
Calling a static method of this class :                318 millisec
Static access is slower by :                           0%

Calling a method of another object :                   311 millisec
Calling a static method of another class :             397 millisec
Static access is slower by :                           28%

Thus no slowdown at all! I was already writing my apology to the reader when I realized my mistake. I recompiled the benchmark in release mode; while still running in fp10 debug, numbers changed dramatically:

Getting & setting a property of this object :          7 millisec
Getting & setting a static property of this class :    10 millisec
Static access is slower by :                           43%

Getting & setting a property of another object :       8 millisec
Getting & setting a static property of another class : 94 millisec
Static access is slower by :                           1075%

Calling a method of this object :                      90 millisec
Calling a static method of this class :                93 millisec
Static access is slower by :                           3%

Calling a method of another object :                   92 millisec
Calling a static method of another class :             176 millisec
Static access is slower by :                           91%

Finally, I opened the swf with fp10 release. Things sped up even more, and the static access overhead increased its significance in % terms. Funnily, there was one exception to the reduced timinings, in fact getting and setting a static property of another class proved to be slower in the release player than in the debug player. I would blame this on my selection of players, even though I am pretty confident I got the debug and release players in the same zip from the Adobe website.

Getting & setting a property of this object :          7 millisec
Getting & setting a static property of this class :    10 millisec
Static access is slower by :                           43%

Getting & setting a property of another object :       6 millisec
Getting & setting a static property of another class : 133 millisec
Static access is slower by :                           2117%

Calling a method of this object :                      10 millisec
Calling a static method of this class :                13 millisec
Static access is slower by :                           30%

Calling a method of another object :                   12 millisec
Calling a static method of another class :             142 millisec
Static access is slower by :                           1083%

The moral is twofold. On the one hand, accessing the static stuff of a class from within the scope of the class itself is not too expensive (which also means that Borg designs in as3 are not all that much of a bad idea performance-wise [I was wrong]), but accessing the static stuff on other classes through their Class objects is indeed very slow and should be, clearly, avoided when performance is at stake. The other is to remember the ‘Benchmarking gotchas’: or to always compile benchmarks in release mode and run them in the release player: debug mode/player can produce very distorted timings.

Bookmark and Share

True random numbers in flash/as3: measuring clock drift

This is one of those things that I will surely never come to use; it crossed my mind a couple of days ago though, and I though it would be worth putting together and then up here.

There are a many potential sources of true randomness in flash. An example would be listening for microphone or camera noise; such an approach is however contingent on access to a|v hardware. Entropy can also be pooled from simple user interactions, such as mouse movements. Still, any entropy pool could run dry after a series of requests if it is not given the chance to rebuild.

Measuring clock/cpu drift is very expensive, but provides pretty unpredictable output and, because random bits are generated and not pooled, does not limit the number of random bits that one could obtain at any given time.

package com.controul.math.rng
{
	import flash.utils.getTimer;
	public class ClockDrift
	{
		public function random ( bits : uint = 32 ) : uint
		{
			if ( bits > 32 )
				bits = 32;
			var	r : uint = 0,
				i : uint = 0,
				t : uint = getTimer ();
			for ( ;; )
			{
				if ( t != ( t = getTimer () ) )
				{
					if ( i & 1 )
						r |= 1;
					bits --;
					if ( bits > 0 )
					{
						i = 0;
						r <<= 1;
					}
					else
						break;
				}
				i ++;
			}
			return r;
		}
	}
}

What the algorithm does is to count the number of loop iterations that happen during a millisecond, and then to set the next bit to true if this count is odd, or to false if it’s even. As it takes a millisecond to produce every single random bit, a random uint (zero to 0xffffffff) takes 32 milliseconds to get generated.

Such an extremely slow solution may be most useful as a last resort for keeping an entropy pool from running dry; the pool can rely on a mix of other stuff, like user mouse movements, download speed sampling, a/v hardware noise, enterframe timing, etc to provide enough random bits for occasional requests.

Anyway, only hardcore security stuff, such as the as3crypto framework, needs unpredictable ‘random’ number generation. For non-cryptographic uses, one should go for a regular prng, be it Math.random, or one with a specifiable seed value, such as the Park-Miller prng supplied by polygonal labs.

Source: ClockDrift.as

Bookmark and Share