Monday, March 7, 2011

Understanding Hygiene (part 2) -- SRFI 72

My critically acclaimed Understanding Hygiene (part 1) contains the absolute baseline necessary for understanding hygienic macros.

This post is about my understanding of André van Tonder's SRFI 72 - a specific way of implementing hygiene, which differs from more widely used hygienic systems such as the venerable syntax-case.

It must be said that hygiene is still an area of active research. There be dragons. Nevertheless, van Tonder is quite confident in his design, going so far as to call it "improved hygiene". David Herman, who wrote the dissertation on the theory of hygienic macros, is skeptical. Anton van Straaten makes some mediatory points. Matthew Flatt raises even more mind-bending questions.

That said, for a hygiene noob like me, SRFI 72 seems to be an elegant and workable design. I wrote a half-assed Lisp->C compiler that contains a SRFI 72-inspired macro system (here's a test from SRFI 72 in my Lisp that works -- phew) and the experience was quite pleasurable. Of the couple thousand lines of C in the system, only a couple dozen or so deal with hygiene.

Basically, what SRFI 72 says is this: when you create a piece of code using the code constructors syntax or quasisyntax, e.g.
(syntax (let ((x 1)) (foo x)))
the identifiers in there (let, x, foo) get a fresh hygiene context (or "color"). Thus they won't conflict with identifiers of another color -- such as those created by somebody else (e.g. another macro, or the user's source). That was the easy part.
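
To make the easy part concrete, here is a minimal sketch of a macro whose template introduces a temporary. It assumes an expander that provides syntax-case (SRFI 72 does, as does R6RS); the names swap!, tmp and other are made up for illustration.

```scheme
;; The tmp inside the (syntax ...) template gets the template's own color,
;; so it is a different identifier from any tmp in the user's code.
(define-syntax swap!
  (lambda (form)
    (syntax-case form ()
      ((_ a b)
       (syntax (let ((tmp a))
                 (set! a b)
                 (set! b tmp)))))))

;; The user's tmp comes from the source and has the source's color.
(let ((tmp 1) (other 2))
  (swap! tmp other)
  (list tmp other))
```

Because the two tmp's have different colors, the macro's let does not capture the user's variable, and the last expression should evaluate to (2 1) rather than leaving the values unswapped.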

Additionally, SRFI 72 makes such pieces of code obey a discipline that is similar to lexical scoping. That is the meaning of 72's money quote that van Tonder repeats multiple times, in the hope of hammering his point home:
"the evaluation of any nested, unquoted syntax or quasisyntax forms counts as part of the evaluation of an enclosing quasisyntax."
Say we have some piece of silly code, like:
(quasisyntax (bla bla bla ,(some-unquoted-stuff (lambda () (quasisyntax (bla bla bla))))))
The outer quasisyntax constructs/quotes the piece of code. Then the "," unquotes and we call some contrived function that happens to take a lambda as argument. Then, inside the lambda there's another quasisyntax, so we're quoted again. The contrived unquoted stuff and the lambda don't matter. What matters is that there's an inner quasisyntax lexically embedded in the outer quasisyntax.

Now, SRFI 72's money quote tells us that the inner quasisyntax's bla's have the same color as the outer quasisyntax's bla's. Why? Because the inner quasisyntax is lexically nested in the outer quasisyntax, and like a lexical variable binding, the outer quasisyntax's color is "inherited" down to the inner quasisyntax.
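
A less contrived way to see the rule is to contrast a quasisyntax that is lexically nested with one that is merely called from inside an unquote. The following is a sketch along the lines of an example in SRFI 72, written from my reading of it; it assumes an SRFI 72 expander, where "," unquotes inside quasisyntax and a transformer is an ordinary procedure. The names main and make-swap are only for illustration.

```scheme
;; Sketch only: needs an SRFI 72 style expander to run.
(let-syntax
    ((main
      (lambda (form)
        ;; This quasisyntax is NOT lexically nested in the outer one below,
        ;; so every call to make-swap colors its let and t afresh.
        (define (make-swap x y)
          (quasisyntax
           (let ((t ,x))
             (set! ,x ,y)
             (set! ,y t))))
        ;; The s and t bound here, and the nested (syntax s) and (syntax t),
        ;; are lexically inside this quasisyntax and share its color.
        (quasisyntax
         (let ((s 1) (t 2))
           ,(make-swap (syntax s) (syntax t))
           (list s t))))))
  (main))
```

If I read the rule correctly, the (syntax s) and (syntax t) passed to make-swap refer to the variables bound by main's let, because they inherit the outer color, while the t introduced by make-swap stays distinct from main's t and so cannot capture it.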

This took me a looooooooooooooooooong while to figure out, and I hope that this post will help those trying to understand SRFI 72.

