Tuesday, April 25, 2017

@tokens and XMacros

Crack annotations are a way to extend the compiler at the parser level. Many of them generate code. For example, the @struct annotation generates a class with a constructor:

@import crack.ann struct;

@struct Foo {
    String name;
    int val;
}

# Equivalent to:

class Foo {
    String name;
    int val;

    oper init(String name, int val) : name = name, val = val {}
}

Annotations are just crack functions that are executed at compile time. The only restriction is that they must reside in a different module from the code that uses them. An annotation is just a public function that accepts a CrackContext object:

import crack.compiler CrackContext;

void struct(CrackContext ctx) {
  ...
}

The CrackContext object is an interface to the compiler and tokenizer which references the context in which the annotation was invoked. For example, we can consume tokens from the point where the annotation is specified and generate errors at that point:

    tok := ctx.getToken();
    if (!tok.isIdent())
        ctx.error(tok, 'Identifier expected!'.buffer);

Code generation in annotations has always been done by injecting tokens and strings into the tokenizer. We make use of the fact that the tokenizer has an unlimited putback queue and just "put back" the tokens that we want the parser to get next in reverse order. There is also an "inject()" method on the crack context that lets you inject a string to be tokenized.
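To make the "reverse order" point concrete, here is a minimal sketch of an annotation that reads two tokens and hands them back unchanged. The annotation name passThrough is hypothetical, and the putBack() method name is an assumption based on the putback queue described above, so check it against the CrackContext interface in your version:

```crack
import crack.compiler CrackContext;

# Hypothetical annotation: consumes the next two tokens and returns them
# to the tokenizer so the parser sees them in their original order.
void passThrough(CrackContext ctx) {
    first := ctx.getToken();
    second := ctx.getToken();

    # The last token put back is the first one the parser reads, so the
    # tokens go back in reverse. putBack() is an assumed method name.
    ctx.putBack(second);
    ctx.putBack(first);
}
```

Generating new code this way means building every token by hand and pushing them all back last-to-first, which is why it gets tedious quickly.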

Neither approach has been entirely satisfactory. Obviously, generating code by injecting one token at a time is far too verbose and tedious to use for anything of any size. And while inject() fixes that part of the problem, it relies on writing code in a string, so:

  • The line numbers of the code have to be provided to the inject() function, a technique which doesn't compose well.

  • Editors don't recognize it as Crack code, breaking syntax highlighting and auto-indent.

A better solution relies on the recently introduced @tokens and @xmac annotations. @tokens is effectively a "token sequence literal." It consumes the delimited tokens following it and produces an expression that evaluates to a NodeList object containing those tokens.

This lets us generate Crack code defined in Crack code. For example, the following annotation emits code to print "hello world":

import crack.compiler CrackContext;
import crack.ann deserializeNodeList;
@import crack.ann tokens;

void hello(CrackContext ctx) {
    @tokens { cout `hello world!\n`; }.expand(ctx);
}
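As a sketch of the calling side, assume the hello() function above is saved in a module named hello_ann (a hypothetical name). Because the emitted tokens are presumably parsed in the context where the annotation is invoked, the calling module would need cout in scope itself:

```crack
# Calling module. "hello_ann" is a hypothetical module name for the file
# that defines hello(); annotations must live in a separate module.
import crack.io cout;
@import hello_ann hello;

# Expands to: cout `hello world!\n`;
@hello
```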

In the example above, we use @tokens with curly braces as delimiters. We could also have used square brackets or parentheses. The chosen delimiters may be nested, but the symbols that are not being used as delimiters need not be paired. So we can also use @tokens for asymmetric constructs:

void begin_block(CrackContext ctx) {
    # The unbalanced '{' is allowed here.
    @tokens [ if (true) { ].expand(ctx);
}
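A matching annotation for the closing side could look like the sketch below. The name end_block is hypothetical, and the usage shown in the trailing comment assumes both annotations are imported into the calling module:

```crack
import crack.compiler CrackContext;
import crack.ann deserializeNodeList;
@import crack.ann tokens;

# Hypothetical counterpart to begin_block: emits only the closing brace.
void end_block(CrackContext ctx) {
    # Square brackets delimit the sequence, so the lone '}' is allowed.
    @tokens [ } ].expand(ctx);
}

# Intended usage in another module:
#
#     @begin_block
#         cout `inside the block\n`;
#     @end_block
```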

While useful, @tokens still doesn't let us do the kind of composition we need in order to generate code. There is nothing like macro parameters for @tokens: the sequences it produces are essentially constants. For interpolation, we have @xmac.

@xmac is like @tokens, but with parameters that let you expand other NodeLists inside it. For example, here's an annotation to emit exception classes:

import crack.compiler CrackContext;
import crack.ann deserializeXMac;
@import crack.ann xmac;

void exception(CrackContext ctx) {
    tok := ctx.getToken();
    if (!tok.isIdent()) ctx.error(tok, 'Identifier expected!');
    @xmac {
        class $className : Exception {
            oper init() {}
            oper init(String message) : Exception(message) {}
        }
    }.set('className', tok).expand(ctx);
}

We have to explicitly set each of the parameters with the set() method. We'll get an error if any of them are undefined when we expand. Alternately, we can use @xmac* to do this automatically with variables of the same name:

void exception(CrackContext ctx) {
    className := ctx.getToken();
    if (!className.isIdent())
        ctx.error(className, 'Identifier expected!');
    @xmac* {
        class $className : Exception {
            oper init() {}
            oper init(String message) : Exception(message) {}
        }
    }.expand(ctx);
}
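From the caller's point of view, using this annotation might look like the following sketch. The module name exc_ann and the class name ConfigError are hypothetical, and the import of Exception assumes the generated class resolves its base class in the calling module's scope:

```crack
# Calling module. "exc_ann" is a hypothetical module name for the file
# that defines exception().
import crack.lang Exception;
@import exc_ann exception;

# Generates: class ConfigError : Exception { ... } with both constructors.
@exception ConfigError

void fail() {
    throw ConfigError('cannot read configuration');
}
```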

Since @tokens just produces a NodeList, we can use it to directly generate values to interpolate into an @xmac:

    method := @tokens {
        void foo() { }
    };

    @xmac* {
        class A {
            $method
        }
    }.expand(ctx);

We can also expand an @xmac into a NodeList using the expand() method with no arguments:

    accessors := @xmac* {
        int $name() {
            return __state.$name;
        }

        void $name(int val) {
            __state.$name = val;
        }
    }.expand();

    @xmac* {
        class A {
            $accessors
        }
    }.expand(ctx);

@tokens and @xmac are both useful tools for doing code generation in Crack annotations. They will be released in Crack 1.1.
