Are you dying to create a Groovy AST transformation, yet baffled about where to start? You're in the right place. If you follow this tutorial step by step, you will have your very own AST transformation by the time you reach the end. I promise. But first, here's what you need to know before getting started:

  1. You must already have Groovy programming experience. Don't let this be your first foray into Groovy.
  2. You must know what an abstract syntax tree (AST) is.
  3. You must know, conceptually, what a Groovy AST transformation is.

In this tutorial, I will show you how to create a local AST transformation. This is the type of AST transformation you're probably most familiar with: @ToString, @TupleConstructor, and the like. Local AST transformations are triggered by an annotation, so I'll show you how to create one of those as well. In addition, to keep things simple, you'll complete the entire tutorial using only the Groovy Console. Let's get started!

Step 1 - Choose the purpose of your AST transformation

We'll begin with an idea: a local AST transformation to add a method which calculates an object's MD5 hash. Let's name the method toMD5() and have it use toString() as the digest's message. I pilfered the MD5 code from the Grain framework and adapted it slightly as follows:

String toMD5() {
    def md5 = MessageDigest.getInstance('MD5')
    
    md5.update(toString().bytes)
    md5.digest().inject(new StringBuffer()) { sb, it ->
        sb.append(String.format('%02x', it))
    }.toString()
}
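
If you'd like to try the digest logic on its own before wrapping it in a transformation, here's a minimal sketch. The name md5Hex and the message parameter are made up for this example; the body is the same as toMD5(), just fed from a parameter instead of toString().

```groovy
import java.security.MessageDigest

// Same digest logic as toMD5(), but the message is passed in
// (md5Hex is a hypothetical name used only in this sketch).
String md5Hex(String message) {
    def md5 = MessageDigest.getInstance('MD5')

    md5.update(message.bytes)
    // Render every digest byte as two lowercase hex characters.
    md5.digest().inject(new StringBuffer()) { sb, b ->
        sb.append(String.format('%02x', b))
    }.toString()
}

println md5Hex('abc')   // prints 900150983cd24fb0d6963f7d28e17f72
```

The value printed for 'abc' is the well-known MD5 test vector, so it's a quick way to confirm the helper behaves.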

The idea is to have the AST transformation insert the method shown above into the annotated class, after which toMD5() can be called on any instance of the class. Here's an example of what it would be like to use the AST transformation:

@MD5
class Person {
    String firstName
    String lastName

    static void main(String[] args) {
        newInstance(firstName: 'John', lastName: 'Galt').run()
    }

    def run() {
        println "My MD5 checksum is ${toMD5()}"
        assert toMD5() == '1f7fc3ac2f7c9af4ec46635cb58c26d8'
    }

    String toString() {
        "$lastName, $firstName"
    }
}

Step 2 - Import the AST transformation classes

Now that you know the purpose of the AST transformation, it's time to code it. So open your Groovy Console and add the following imports:

import org.codehaus.groovy.transform.GroovyASTTransformation
import org.codehaus.groovy.transform.ASTTransformation
import org.codehaus.groovy.control.CompilePhase
import org.codehaus.groovy.control.SourceUnit
import org.codehaus.groovy.ast.builder.AstBuilder
import org.codehaus.groovy.ast.ASTNode

Here's a rundown of the classes you just imported:

  • GroovyASTTransformation - An annotation used to mark an annotated class as a Groovy AST transformation.
  • ASTTransformation - The interface all Groovy AST transformations must implement.
  • CompilePhase - An enum listing the 9 phases (steps) of the Groovy compiler. I'll explain this in more detail in a second.
  • SourceUnit - Provides information about what is being compiled. I know, that's a vague description. The truth is, aside from it being used in global AST transformations, I have no clue what this is for.
  • AstBuilder - Provides a few methods to create AST nodes.
  • ASTNode - The base class of all AST nodes.

Step 3 - Create the AST transformation class

With the imports in place, lay out the boilerplate code for the AST transformation:

@GroovyASTTransformation(phase = CompilePhase.CANONICALIZATION)
class MD5Transformation implements ASTTransformation {
   
    void visit(ASTNode[] astNodes, SourceUnit sourceUnit) {
        // Awesome stuff will go here.
    }
}

So now you have the class MD5Transformation which implements the ASTTransformation interface. You're marking the class as a Groovy AST transformation by annotating the class with @GroovyASTTransformation, which requires that you specify the compiler phase after which the transformation will run. WTF? Yeah, I know. Let's demystify this compiler phase thing. Here's a list of the compiler phases in sequential order:

  1. Initialization - The compiler opens any necessary files.
  2. Parsing - The source code is parsed and a concrete syntax tree (CST) is created.
  3. Conversion - The CST is converted into the AST.
  4. Semantic Analysis - Ensures the AST is valid. This phase is important to us because it's the first in which a local AST transformation can be used. Woooooo!
  5. Canonicalization - Does magical things with inner classes.
  6. Instruction selection - Adds the AST nodes which Groovy allows us to disregard, such as return statements.
  7. Class generation - This is where Groovy adds methods such as getMetaClass() and invokeMethod(), and then builds the final class.
  8. Output - During this phase the compiler vomits the byte-code.
  9. Finalization - The book-end of the phases. During this phase the compiler releases its resources.

Source: http://joesgroovyblog.blogspot.com/2011/09/ast-transformations-compiler-phases-and.html
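
If you want to see the phases from the compiler's point of view, the CompilePhase enum you imported earlier can list them for you. This is just a quick sketch for the Groovy Console; it prints each phase with its number and checks the ordering the list above describes.

```groovy
import org.codehaus.groovy.control.CompilePhase

// Print each phase with its number, in compiler order.
CompilePhase.values().each { phase ->
    println "${phase.phaseNumber}. ${phase.name()}"
}

// There are nine phases, and SEMANTIC_ANALYSIS comes before CANONICALIZATION.
assert CompilePhase.values().length == 9
assert CompilePhase.SEMANTIC_ANALYSIS.phaseNumber < CompilePhase.CANONICALIZATION.phaseNumber
```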

So, which phase should you use? Honestly, I cannot give you a definitive answer because it really depends on what your AST transformation does, and how it should interact with other AST annotations the user may apply. For example, if your AST transformation adds public fields, then you may want to go before CANONICALIZATION. Why? Because that's when Groovy's @ToString AST transformation triggers, so if you go ahead of it then it can pick up the fields you add when both AST transformations are used together.

What can you do? When creating your own AST transformation, try SEMANTIC_ANALYSIS through CLASS_GENERATION and see if any of them fail for you. Then you'll know which phases not to use! That's the process I used to discover that the CLASS_GENERATION phase is too late for the MD5Transformation: the method was not added to the annotated class. I'm going to go with CANONICALIZATION so that most things have been taken care of, yet Groovy can still patch up my work, such as adding return statements.

Now you're ready to implement the visit() method, the heart of the AST transformation:

void visit(ASTNode[] astNodes, SourceUnit sourceUnit) {
    def (annotation, klass) = astNodes        
    def toMD5 = new AstBuilder().buildFromString(
        """
        import java.security.MessageDigest
        
        class ${klass.name} {
            String toMD5() {
                def md5 = MessageDigest.getInstance('MD5')
                
                md5.update(toString().bytes)
                md5.digest().inject(new StringBuffer()) { sb, it ->
                    sb.append(String.format('%02x', it))
                }.toString()
            }
        }
        """
    )[1].methods.find { it.name == 'toMD5' }

    klass.addMethod(toMD5)
}

There's quite a bit going on, so let's take it from the top.

Get the AST nodes for the annotation and the annotated class

The line def (annotation, klass) = astNodes grabs the first two nodes in astNodes and assigns them to variables. Because this is a local AST transformation, the first node is an AnnotationNode; so it provides a way to get info about the annotation itself, which you'll create shortly. The second node is the annotated node; in this case a ClassNode. The astNodes array looks like this: [class org.codehaus.groovy.ast.AnnotationNode, class org.codehaus.groovy.ast.ClassNode].
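
To get a feel for that destructuring without running the compiler, you can build the two nodes by hand. The sketch below does that and wraps the shape check in a helper; looksLikeAnnotatedClass is a name I made up for illustration, and the hand-built nodes only stand in for what the compiler would really pass to visit().

```groovy
import org.codehaus.groovy.ast.ASTNode
import org.codehaus.groovy.ast.AnnotationNode
import org.codehaus.groovy.ast.ClassHelper
import org.codehaus.groovy.ast.ClassNode

// Hypothetical helper: true only when astNodes has the shape a local
// transformation on a class expects, i.e. [AnnotationNode, ClassNode].
boolean looksLikeAnnotatedClass(ASTNode[] astNodes) {
    if (astNodes == null || astNodes.length < 2) return false
    def (annotation, klass) = astNodes
    annotation instanceof AnnotationNode && klass instanceof ClassNode
}

// Hand-built stand-ins for the nodes the compiler would supply.
def md5Annotation = new AnnotationNode(ClassHelper.make('MD5'))
def personClass = ClassHelper.make('Person')

assert looksLikeAnnotatedClass([md5Annotation, personClass] as ASTNode[])
assert !looksLikeAnnotatedClass([personClass, md5Annotation] as ASTNode[])
```

A check like this at the top of visit() is one way to bail out early if the nodes aren't what you expect.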

Create the AST nodes representing the toMD5() method

Before you can add toMD5() to the annotated class, you must create the method itself. To do this, you used AstBuilder.buildFromString(). buildFromString() takes Groovy code in the form of a String and produces the AST nodes for it. Isn't that cool! Sure, it's neither the most efficient approach (since it basically kicks off another instance of the compiler) nor the most powerful way to create AST nodes, but it's super convenient.

Now, about methods... what you may or may not know is that all methods have a class. Groovy gives us some flexibility with scripts by creating one on our behalf, but the class is always there. In the Groovy source passed to buildFromString() there's the expression klass.name. This creates a ClassNode for a class whose name is the same as the annotated class. Don't worry, the classes will not conflict. The end result is a class with the toMD5() method, all represented by AST nodes. But... you want a method, not a class, right?

The AstBuilder returns a list of nodes, something like this: [org.codehaus.groovy.ast.stmt.BlockStatement@1e0f477f[], SomeClass]. The second item in the list is the ClassNode which contains the toMD5() method. The expression that follows the closing parenthesis of buildFromString() uses [1] to get that ClassNode, then find { it.name == 'toMD5' } to look through its methods for toMD5() and return it.
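
You can poke at what buildFromString() returns straight from the Groovy Console. Here's a small sketch, assuming AstBuilder is on the classpath (it is in the Groovy Console); the Sample class and its hello() method are placeholders invented for this example. Rather than relying on the ClassNode's position in the list, it looks it up by type.

```groovy
import org.codehaus.groovy.ast.ClassNode
import org.codehaus.groovy.ast.builder.AstBuilder

// Compile a throwaway class (Sample is a made-up name) into AST nodes.
def nodes = new AstBuilder().buildFromString('''
    class Sample {
        String hello() { 'hi' }
    }
''')

// Show which kinds of nodes came back.
println nodes.collect { it.getClass().simpleName }

// Pick the ClassNode by type, then the method by name.
def classNode = nodes.find { it instanceof ClassNode }
def hello = classNode.methods.find { it.name == 'hello' }

assert classNode.name == 'Sample'
assert hello.name == 'hello'
assert hello.parameters.length == 0
```

Searching by name matters because the ClassNode can contain more methods than the ones you wrote yourself.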

Add the toMD5() method to the annotated class

Finally, now that you have the AST nodes for the toMD5() method, you can add it to the annotated class: klass.addMethod(toMD5). Your AST transformation is done! Let's move on to the annotation.

Step 4 - Import the annotation classes

As of now, you have an AST transformation that's ready to go. But because it's a local AST transformation, you need to associate it with an annotation. That way, when the Groovy compiler encounters a class using the annotation, it will trigger the AST transformation once the specified compiler phase has completed. So now it's time to create the annotation. Begin with the following imports:

import java.lang.annotation.Retention
import java.lang.annotation.RetentionPolicy
import java.lang.annotation.Target
import java.lang.annotation.ElementType
import org.codehaus.groovy.transform.GroovyASTTransformationClass

Here's the breakdown:

  1. Retention - An annotation you put on another annotation to specify how long that annotation is retained. It's used with the next class, RetentionPolicy.
  2. RetentionPolicy - An enum to specify how long to retain an annotation. For AST transformations, SOURCE, which means the annotation is discarded quite early, is sufficient because we don't need the annotation at runtime.
  3. Target - Specifies what can be annotated (ex. class, method, field, etc). See ElementType.
  4. ElementType - An enum defining the various types which can be annotated.
  5. GroovyASTTransformationClass - An annotation used to link an AST transformation to its annotation.

Step 5 - Create the annotation

Next, create the annotation:

@Retention(RetentionPolicy.SOURCE)
@Target([ElementType.TYPE])
@GroovyASTTransformationClass(classes = [MD5Transformation])
@interface MD5 { }

That's all it takes to create the MD5 annotation. The retention policy is set to SOURCE. The target is set to TYPE, which means the annotation can technically be applied to a class, interface, enum, or another annotation; make a mental note of this fact. The classes attribute of the @GroovyASTTransformationClass annotation is set to a list of transformation classes; simply MD5Transformation in this case. Now you're ready to use the transformation and see it in action!
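
If the retention policy still feels abstract, here's a small sketch that shows what SOURCE means in practice. The annotations SourceOnly and KeptAtRuntime and the class Sample are invented for this example and have nothing to do with the MD5 transformation; the point is only that a SOURCE annotation is gone by the time the class is loaded.

```groovy
import java.lang.annotation.ElementType
import java.lang.annotation.Retention
import java.lang.annotation.RetentionPolicy
import java.lang.annotation.Target

// Hypothetical annotation discarded by the compiler.
@Retention(RetentionPolicy.SOURCE)
@Target([ElementType.TYPE])
@interface SourceOnly { }

// Hypothetical annotation that survives into the loaded class.
@Retention(RetentionPolicy.RUNTIME)
@Target([ElementType.TYPE])
@interface KeptAtRuntime { }

@SourceOnly
@KeptAtRuntime
class Sample { }

// Reflection only sees the RUNTIME annotation.
assert Sample.isAnnotationPresent(KeptAtRuntime)
assert !Sample.isAnnotationPresent(SourceOnly)
```

A local AST transformation does its work during compilation, which is why it doesn't need anything more than SOURCE.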

Step 6 - Using the transformation

Before you can use the MD5 AST transformation you created, we've got a problem to solve. AST transformations modify code. However, they are also code, which means that compilation order is important. If a class is annotated with an AST transformation, that transformation must be compiled first. For this article, we're throwing everything into a single *.groovy file, so to control the compilation order, we'll use a neat trick: put the annotated class into a separate Groovy shell:

new GroovyShell(getClass().classLoader).evaluate '''
@MD5
class Person {
    def firstName
    def lastName
    
    static void main(String[] args) {
        newInstance(firstName: 'John', lastName: 'Galt').run()
    }
    
    def run() {
        println "My MD5 checksum is ${toMD5()}"
        assert toMD5() == '1f7fc3ac2f7c9af4ec46635cb58c26d8'
    }
    
    String toString() { "$lastName, $firstName" }
}
'''

Go ahead and run it. If you see the message My MD5 checksum is 1f7fc3ac2f7c9af4ec46635cb58c26d8 congrats!!! You created your first Groovy AST transformation!!!
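
By the way, the separate-shell trick isn't specific to AST transformations. Here's a stripped-down sketch of the same idea with an ordinary class; Greeter is a made-up name for this example. The outer script is compiled first, and the inner script, compiled later, finds Greeter through the class loader handed to GroovyShell.

```groovy
// Compiled together with this script, so it exists before the shell below runs.
class Greeter {
    String greet(String name) { "Hello, $name" }
}

// The inner script is compiled afterwards and resolves Greeter through
// the parent class loader passed to GroovyShell.
def result = new GroovyShell(getClass().classLoader).evaluate('''
    new Greeter().greet('John')
''')

assert result == 'Hello, John'
```

Swap Greeter for a transformation and its annotation, and you have exactly the arrangement used in this step.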

What's next?

So, you've created a local AST transformation by following this tutorial. Now what? Well... there's plenty left to do, such as:

  1. Validate your AST nodes - Recall that the annotation MD5 can be used on more than just classes. ElementType.TYPE includes classes, interfaces, annotations, and enums. So your AST transformation needs to keep this in mind and take precautions.
  2. Generate useful errors - If users of your AST transformation attempt to use it improperly, instead of failing, or acting weird, let them know with a useful error. Groovy provides an API specifically for generating errors from AST transformations.
  3. Be efficient - An AST transformation is not the place to slack off. Performance is important because poor performance affects your builds. Consider AstBuilder.buildFromString() as a way to draft an AST transformation.
  4. Integrate with your build tool - In this guide you placed all of the code in a single file (because, well, I told you to). In most cases, you'll be using a build tool, such as Gradle, to build a multi-file project. The trick with AST transformations, as I mentioned before, is to compile AST transformations before you attempt to use them.

Got AST transformation questions? Sign up and send me a question :)