Posting about measuring clock drift got me in the mood of poking random numbers a bit more. Well, about an year ago I decided I needed normally distributed prng output, but never really used it for anything. I am sharing a solution in my best hope that someone will spare themselves the headbanging part.

The normal distribution is one of those things one should be acquainted with if about to deal with statistics, simulations or even procedural art. Anyhow, the main problem with normal pseudo-random numbers in flash is obtaining them, because the usual output of every prng follows the uniform distribution and one has to map that output to the distribution of interest. The reasonable solutions I have explored are three; one is using lookup tables, which is self-explanatory and very inexpensive, but unfortunately also painfully boring.

Another is making use of the central limit theorem, which in real life boils down to this. One can take, say, 10 numbers extracted from a uniformly distributed population (read Math.random) and take their average, whose sampling distribution will approximate to some extent the normal distribution.

I fancy using the Box-Muller transform or, more specifically, the Marsaglia polar method, which is a less computationally expensive variation of the former. Both take a pair of uniformly distributed inputs and map them to two (independent and uncorrelated) standard normal outputs. Effectively, this means that one could take any prng, be it Math.random, a Park-Miller lcg (again, check out the one on Polygonal labs), a Mersenne twister or whatever, and map its output to the standard normal distribution.

For better performance, the mapping algorithm could be built into the prng so that all operations are performed within one call. This is even more valid for the Marsaglia method, because it relies on rejective sampling, which means that it rejects a pair of random inputs every now and then and requests another. The code snippets below demonstrate the Marsaglia transform built into a Park-Miller prng, whose core logic consists in just a couple of lines:

public class ParkMiller { /** * Seeds the prng. */ private var s : int; public function seed ( seed : uint ) : void { s = seed > 1 ? seed % 2147483647 : 1; } /** * Returns a Number ~ U(0,1) */ public function uniform () : Number { return ( ( s = ( s * 16807 ) % 2147483647 ) / 2147483647 ); }

When you call uniform(), the seed value is multiplied by 16807 (a primitive root modulo) and set to the remainder of the product divided by 2147483647 (a Mersenne prime, 2^31-1, or the int.MAX_VALUE). This new value is returned as a Number in the range (0,1).

The histogram below illustrates the uniform distribution of the prng output:

The uniform pseudo-random values will be fed to the Marsaglia transform, whose simple algorithm is well described in its Wikipedia article. An as3 implementation with inlined uniform() getters could look like this:

/** * Returns a Number ~ N(0,1); */ private var ready : Boolean; private var cache : Number; public function standardNormal () : Number { if ( ready ) { // Return a cached result ready = false; // from a previous call return cache; // if available. } var x : Number, // Repeat extracting uniform values y : Number, // in the range ( -1,1 ) until w : Number; // 0 < w = x*x + y*y < 1 do { x = ( s = ( s * 16807 ) % 2147483647 ) / 1073741823.5 - 1; y = ( s = ( s * 16807 ) % 2147483647 ) / 1073741823.5 - 1; w = x * x + y * y; } while ( w >= 1 || !w ); w = Math.sqrt ( -2 * Math.log ( w ) / w ); ready = true; cache = x * w; // Cache one of the outputs return y * w; // and return the other. } }

The following histogram displays the distribution of this new getter:

Performance-wise, there is some ground for reducing the algorithm’s overhead. One could, for ranges of values, approximate the square root with, for example, an inlined Babylonian calculation. The natural logarithm can also be easily approximated for values not too close to zero. Still, every approximation comes at the cost of some precision, and the above implementation will be fast enough for most uses; on my notebook I get a million numbers for about 350 milliseconds.

Finally, the Ziggurat algorithm is an alternative to the Marsaglia transform, and has the promise of better perfomance if well optimised. Still, I personally haven’t managed to make it work all that great in as3.

Source: ParkMiller.as

Just downloaded the source provided and when I try to use with my 2d random visualization the “standardNormal” reaches 15s timelimit and timeout.

Anything I’m doing wrong? Any specific setting needs to be set? using CS4.

Thanks

Hey. It should be working fine, I doublechecked the source.

You should know that standardNormal() is 2 to 3 times slower than Math.random(), but I doubt that’s your problem.

Mail me your sources and I’ll tell you what I think.

Hi, hristro,

Really appreciate your work. Thank you for sharing your knowledge.

I abeginner in action script. I’m working on my university project and for that I need to visualize “normal distribution”. I found that the way you have shown is perfect.

What I want to know is can I use this in “action script 2 based flash movie? if cannot, then what modifications should I do? After downloading the .as file what should I do to create a flash movie, like the one you have put on this blog?

hope you will guide me.

thank you.

The code above is as3, but there is nothing in there that as2 can’t do, you just need to rewrite it a bit. The visualisation above is a histogram showing how the sample builds up (you see the n number is the number of observations in the sample).

If you have any trouble getting it to work, mail me.

Hi, is the histogram available? the fla file i mean. i think it’s an interesting tool, and a good way to test other functions

Good job btw

I’ll mail it to you in a bit – wouldn’t post it here, because it’s so not the most perfectly finished tool on earth you see.

Ran across your article and found it quite interesting. But shouldn’t the lines…

cache = x * w; // Cache one of the outputs

return y * w; // and return the other.

read…

cache = x * x * w; // Cache one of the outputs

return y * y * w; // and return the other.

? the result of the standardNormal result is way out of the (0,1) bounds without the modification.

Hey Chuck,

Standard normal numbers are not bound between 0 and 1. They are instead centered on 0 and have a standard deviation of 1.

Have a look at this:

http://en.wikipedia.org/wiki/Normal_distribution#Standardizing_normal_random_variables

Which means that a standard normal number can theoretically go as high and low as positive or negative infinity, although obviously this is infinitely unlikely (and of course impossible in this implementation).

If you want numbers in the (0,1) range, you’re probably talking about the uniform distribution.