Archive for the 'Performance' Category

Speedy Voronoi diagrams in as3/flash

Voronoi diagram is a decomposition of space, here a 2d plane, on a nearest neighbour basis. The diagram splits the plane in regions, each corresponding to a site (a point of interest), and containing all the points closer to this site than to any other.

Voronoi diagrams have many applications, in fields as different as botanics and astrology. A few examples of computer graphic uses for Voronoi diagrams include procedural generation of organic forms, crazy image compression, beautiful fractals, noise, etc.

What got me in Voronoi mood however is seeing this this algorithmic collage by Santiago Ortiz from Bestiario. The work’s sheer visual impact shows off the one quality of Voronoi diagrams I appreciate the most – they look awesome. Watching the masks move for a couple of minutes got me terribly into porting Steve Fortune’s algorithm for Voronoi decomposition: I had to see for myself how much stuff could happen on the screen before the player choked.

Fortune’s algorithm

Fortune’s algorithm, ideally, computes a Voronoi diagram in O(NlogN) time, where N is the number of sites, or the minimum possible time for the job, as O(NlogN) is the time-complexity of optimal sorting algorithms. It sweeps through the plane, maintaining a front of parabolas that progressively traces out the diagram, by spliting and adding curves for each site encountered, and removing arcs whenever a region vertex is determined with certainty. The Wikipedia article has it all explained pretty well, but if you need additional info, give this and this articles a try.

I began with this c++ implementation, which is a simplified take on the algorithm. The latter uses a binary tree for the composition of the beachline, whilst here we’ve got a double linked list. However, I do think that the double-link approach would do better for most visual applications in flash, where the cost of not having quick lookup of beachline parabolas might be offset by not having to reogranise the tree, and by the low overall number of particles. I ended up modifying and optimising most of the code however, and I bet this is the reason for the artifacts that show up every now and then in the last of the three demos.

Below you have a demo showing off 1000 regions, in line with nodename‘s mona-lisa tessalator:

Then 2000 points going around with their speed determined by the values of an image:

Finally, a simple audio visualiser: 

Anyway, this really is just a study of how quickly the regions of a diagram can shape up without cheating. There is no bounding box intersection, nor sophisticated output options. Clearly, output of an optimal and informative data structure will be worth the performance kick, and if I get to do my graph theory homework, I’ll be coming back to this.

Implementations

Fortune’s algorithm is already well implemented and made available by astatic, also nodename shows it off with his Voronoi toy, even though you don’t see the sources. You get a different tool for the same job by HIDIHO!, he goes through Delaunay triangulation, and then translates to a Voronoi decomposition, the two being dual graphs.

Sources

Arc.asFortune.asNumber2.as

Bookmark and Share

How slow is static access in as3/avm2 (exactly)

A reader’s comment to my previous post on Singletons asked for some evidence that static access in as3 was indeed ’10 times’ slower. I remembered having read the 10 times thing somewhere, but couldn’t find anything by quick googling. Uneasy, I decided to put up a quick benchmark. I went through more than one surprise. The code can be found here.

Four tests are being performed at 1m iterations:

1] The first test compares access times to a propety of the calling object and a static property of the class definition. Both are accessed without ‘.’ opertators: they are simply referenced by their names.

2] The second test does the same, but for propeties of a referenced object. The object’s property is accessed with a typedReference.propertyName syntax, and the static property through a ClassName.propertyName syntax.

3] The third test compares call times for a method of the calling object and a static method of the class definition. The access syntax is the same as in the first test.

4] The last, fourth test compares method call times on a referenced object. This is done like in the second test.

Without thinking too much about it, I compiled in debug mode, and ran the swf in fp10 debug. Output was as follows (imagine my surprise):

Getting & setting a property of this object :          104 millisec
Getting & setting a static property of this class :    109 millisec
Static access is slower by :                           5%

Getting & setting a property of another object :       106 millisec
Getting & setting a static property of another class : 178 millisec
Static access is slower by :                           68%

Calling a method of this object :                      317 millisec
Calling a static method of this class :                318 millisec
Static access is slower by :                           0%

Calling a method of another object :                   311 millisec
Calling a static method of another class :             397 millisec
Static access is slower by :                           28%

Thus no slowdown at all! I was already writing my apology to the reader when I realized my mistake. I recompiled the benchmark in release mode; while still running in fp10 debug, numbers changed dramatically:

Getting & setting a property of this object :          7 millisec
Getting & setting a static property of this class :    10 millisec
Static access is slower by :                           43%

Getting & setting a property of another object :       8 millisec
Getting & setting a static property of another class : 94 millisec
Static access is slower by :                           1075%

Calling a method of this object :                      90 millisec
Calling a static method of this class :                93 millisec
Static access is slower by :                           3%

Calling a method of another object :                   92 millisec
Calling a static method of another class :             176 millisec
Static access is slower by :                           91%

Finally, I opened the swf with fp10 release. Things sped up even more, and the static access overhead increased its significance in % terms. Funnily, there was one exception to the reduced timinings, in fact getting and setting a static property of another class proved to be slower in the release player than in the debug player. I would blame this on my selection of players, even though I am pretty confident I got the debug and release players in the same zip from the Adobe website.

Getting & setting a property of this object :          7 millisec
Getting & setting a static property of this class :    10 millisec
Static access is slower by :                           43%

Getting & setting a property of another object :       6 millisec
Getting & setting a static property of another class : 133 millisec
Static access is slower by :                           2117%

Calling a method of this object :                      10 millisec
Calling a static method of this class :                13 millisec
Static access is slower by :                           30%

Calling a method of another object :                   12 millisec
Calling a static method of another class :             142 millisec
Static access is slower by :                           1083%

The moral is twofold. On the one hand, accessing the static stuff of a class from within the scope of the class itself is not too expensive (which also means that Borg designs in as3 are not all that much of a bad idea performance-wise [I was wrong]), but accessing the static stuff on other classes through their Class objects is indeed very slow and should be, clearly, avoided when performance is at stake. The other is to remember the ‘Benchmarking gotchas’: or to always compile benchmarks in release mode and run them in the release player: debug mode/player can produce very distorted timings.

Bookmark and Share

Flash/as3 custom namespaces and performance: the name qualifier is slow

I was thinking about how to organise the classes of one of the frameworks I’m working on these days. Two options: either dump everything in the same package, and make everything internal, or use namespaces to keep the API clean.

So, yeah, namespaces sounded like the right way to go. A lot of the functionality in the framework involves deep tree tranversal however, so I found myself thinking about the performance implications of using custom namespaces on ‘hot’ properties.

I put up a little benchmark to see whether namespaces slow down code execution. It runs four tests: a for loop that extracts a value from a public property of the object of the calling method, one that accesses a protected property of the object’s superclass, one that accesses a property in a custom namespace after a ‘use namespace’ statement, and one that accesses the property with the :: name qualifier. You can get the sources here.

On my notebook, the test (running on release player 10) output the following:

Accessing a property in the public namespace.
>	Running test @ 1000000 iterations ...
	Execution took  0.003  seconds.
Accessing a property of the superclass in the protected namespace.
>	Running test @ 1000000 iterations ...
	Execution took  0.003  seconds.
Accessing a property in a custom namespace.
>	Running test @ 1000000 iterations ...
	Execution took  0.003  seconds.
Accessing a property in a custom namespace with the :: name qualifier.
>	Running test @ 1000000 iterations ...
	Execution took  0.263  seconds.

Properties in the custom namespace were just as quick to access as those in the ‘built-in’ namespaces when made available with a ‘use namespace’, and ridiculously expensive when accessed with the name qualifier.

Brief, avoid using :: in loops when performance is at stake.

Bookmark and Share