haxe: compile time macros

Haxe is a really great language for me. It does cross platform in a sensible way - by compiling and generating code to a target language. The best part is that it's not just converting, it's properly compiling the code - so all errors are caught by the compiler itself long before the generated code even gets there.

One of its most powerful features is the macro system which allows you to run haxe code at compile time, to augment and empower your existing haxe code. It sounds crazy - so let's dig in.

In this post, the example output code will be shown using javascript for simplicity - just one of the many language targets that it supports - c++, c#, php, java, python, js.

Haxe in Haxe at compile time

Haxe manages to get macros right, it uses it's own language at compile time to alter the compilation state. This means you can inject expressions, code, remove code, throw errors and generally make code do things not usually possible, that are specific to your code base or target. And even better, you have during this compilation phase the full power of the language behind you to do so.

In luxe for example - there is a concept available for using a Component/Entity system. Sometimes, a user would accidentally try and use the entity parent property in the contstructor of the Component, long before it was assigned. This wasn't their fault, it's just the nature of the way the system works and that was something that would have to be learnt. But not with macros around!

One of the first macros I decided to write was based on this problem - I made an @:autoBuild macro happen on every descendent of Component - which at compile time, has a look at the expressions within the constructor of the given component. If it finds you touching the field named entity - it throws a neat and clearly marked error message to warn you. This saves oodles of time on things being null and obscure crashes, and gives a massive boost to usability when you can design for that explicitly.

The exact code is actually not complete right now - but the ability to do this type of thing is far more helpful than it first seems.

complex code rejection

Because you can alter the expressions in the macros at compile time, you can reject code from ever existing in the output. This is possible through #if loglevel > 1 using Haxe already - but what if the condition was far more complex? What if the condition was based on where the code is being built - like in a continuous integration server? What about environment variables? Or git commit revisions? Basically - any condition you can program - a macro can do. Since a macro is just haxe code, it has the full capability of the Haxe language and compiler to do it's bidding at compile time.

log code rejection

One simple example is logging code, using log levels to define what level of logging is present in a build. I like really dense detailed logs because I can write a parser for them and visualize them in ways that aid debugging complex systems quickly. This can add a large toll on a code base if the log code ends up in the output, because every logged string has to be stored and allocated and adds to the final build size output and sometimes runtime cost.

The macro rejecting the expression means the final code does not include the logging at all. Haxe already has a concept like this built in as a build flag --no-traces, which removes trace() calls - the built in debugging print command - but the concept applies not only to logging but more expensive and intricate systems like profiling and instrumentation.

profiling and instrumentation

Haxe macros let me add instrumentation code to my hearts content without it ever affecting runtime release builds, something I have been wanting an elegant solution for for quite some time. The next section is an even better option - what about deep profiling all functions automatically? Or each block, or each expression of each block?

complex code injection

Since you can emit code expressions from a macro, you can inject code as well. You can construct entire classes and types dynamically - at compile time.

Let's take the profiling example one step further and devise a conceptual macro for automatically profiling every block expression within a given class. Notice below I have tagged my class for a "build" macro - I want this class to be handled by my Profiling macro apply function at compile time. Since I only care about the update function right now in this example - let's tag that code for profiling only using custom metadata @:profiling. Note that @:build is from haxe, the custom one is ours.

Also take note that I separated logic into blocks { } of expressions - because I can use this to my advantage in the macros at compile time.

@:build(luxe.macros.Profiling.apply())
class Player {

    @:profiling
    function update(dt:Float) {
        {
            //update_ai
            ...
        }
        {
            //update_stats
            ...
        }
    }
}

automatic injection

Now I have everything I need - my macro will run at compile time on the class I am interested in measuring, my macro will check all methods in the class for @:profiling - if it finds it, it will look for each root block { } expression and automatically insert a start and end measurement at runtime so the final code would in pseudo code look like

profiler.start('update_ai') {

    //update ai
    ...

} profiler.end('update_ai').

For now - I won't be posting the code (this system is not even finished being coded heh) but the important thing is to understand the potential from macros and their ability to empower the code base for the development process to be quicker, more friendly, more streamlined in the output and more expressive.

continuing

This of course has down sides, code is being executed at compile time. While haxe compiler is incredibly fast - you can slow it to a crawl by a single compile time macro. If your macro introduces network latency for pinging a server or something - you will be waiting for that too.

The other thing to consider is that macros are quite complex and are the most advanced feature in haxe - so it often appears unapproachably difficult. Often this is not the case, and patience and examples will get you using them in no time. Haxe 3 made massive strides in simplifying their usage - they still have some things that are fairly difficult to wrap your head around that WILL take time to get used to.

This is not something you can fast track - the easiest way I have found is to learn by doing. That's why I am making this post, to hopefully inspire you to think of really simple, really easy macros that help you get your feet wet.

Simple concrete example

Most times a unique build id is useful in determining which version or specific build is being executed on a test machine or users machine for debugging purposes. To that end, our simple example will generate a unique-ish static string value for a build id. Since this code happens at compile time, a far more complex algorithm can be used to ensure uniqueness if required, but for the most part this code will do fine.

// MIT License
//https://github.com/underscorediscovery/haxe-macro-examples | notes.underscorediscovery.com

import haxe.macro.Expr;  
import haxe.macro.Context;

import haxe.crypto.Md5;  
import haxe.Timer.stamp;  
import Math.random;

class BuildID {

        /** Generate a unique enough string */
    public static function unique_id() : String {
        return Md5.encode(Std.string( stamp()*random() ));
    }

        /** Generates a unique string id at compile time only */
    macro public static function get() {
        return macro $v{ unique_id() };
    }

} //BuildID

Take note of the functions here - one is for generating a string ID at runtime - a regular public static function. You can use this any time from your program. Then, there is a macro function, these are compile time functions and can use the macro context to spit out expressions. I won't dig too much into the specifics of the expressions themselves - but $v{ } generates an expression from a value if it's a primitive type. Our case is a string but this is covered in the Haxe manual if you wanted more insight.

Let's look at what the using code would look like, and the resulting output target javascript code. This class is stand alone and can be used with the Haxe compiler to have a look at the results yourself, using the older documentation here as the new manual is still working on these introductions.

Basically to use this example at home, run haxe build.hxml from the compile_time_buildid/ folder of the repo.

// MIT License
//https://github.com/underscorediscovery/haxe-macro-examples | notes.underscorediscovery.com

class TestID {

    public static var build_id : String = BuildID.get();

    public function new() {
        trace( build_id );
        trace( 'running build ${build_id}' );
    }

        //called automatically as the entry point
    static function main() {
        new TestID();
    }

}

resulting output

There - now we have a unique value for the ID. The ... represents some haxe specifics that aren't useful in this example but notice how the build id is hardcoded into the output source file. This value will change with every build you run.

(function () { "use strict";
var TestID = function() {  
    console.log(TestID.build_id);
    console.log("running build " + TestID.build_id);
};
TestID.main = function() {  
    new TestID();
};
...
TestID.build_id = "cf30a1a97db5628b91535dfd3a972ea6";  
TestID.main();  
})();

even more hardcoded

Notice console.log(TestID.buildid); and the line below it? This is printing the value of a variable called build_id. It's a fixed value because its hardcoded into the file, but then why do we even need the variable access when we could replace every mention of TestID.build_id with the exact id string? Haxe allows this too, using inline static access.

Let's change :
public static var build_id : String = BuildID.get();
to
public inline static var build_id : String = BuildID.get();

For strings this is not that great, since it will generate a lot more strings, but for numbers, constants and the like it can really cut out a lot of code and even optimize the output significantly by doing away with superflous values at compile time.

Now that we have changed it to inline, this is the output - every mention of the variable build_id is gone and is now hardcoded into the file directly.

(function () { "use strict";
var TestID = function() {  
    console.log("10e594bd858844cb16a1577c61309b49");
    console.log("running build " + "10e594bd858844cb16a1577c61309b49");
};
TestID.main = function() {  
    new TestID();
};
...
TestID.main();  
})();

code example

The complete code for the above can be found here for convenience :

Github Repository

Links and tutorials

The official Haxe manual
Getting better all the time, this is the definitive guide though quite meaty and requires a good couple of passes before things make sense. Learn from simple examples and practicing, use the manual as a reference.

Andi Li: Everything in Haxe is an expression
This guide is a really helpful understanding that everything is an expression in haxe. This makes macros make a lot more sense for me.

Mark Knol: custom autocompletion with macros
An example of using macros to populate code for the compiler so it can code complete things that it wouldn't be able to otherwise.

Mark Weber: lots of simple macro snippet examples
A simple and useful reference for getting some ideas and introduction to the concepts behind the macro code, the different macro contexts and their uses.

Dan Korostelev writes about macros for tolerant JSON code
A look at "Using haxe macros as syntax-tolerant, position-aware json parser" with example code.

Lot's of these blogs and links include many great posts about haxe, and there are many more online if you search.

Good luck - and I hope to post more about haxe macros specifically in the near future as well.