Wednesday, May 29, 2013
I'm delighted somebody by the name of "shadowcat-mst" has taken my Wat interpreter and reimplemented it in Perl. That really seems fitting. Wat now covers JavaScript and Perl - think of the possibilities!
Update: Piotr Kuchta ported Wat to Python.
Update: Victor Hugo Borja ported Wat to Ruby.
Thursday, May 9, 2013
Green threads in the browser in 20 lines of Wat
This page shows 5 independent, cooperatively scheduled Wat green threads (view source for full Wat code).
Each thread has an ID and is defined as a function that loops forever, repeatedly printing its ID, and then sleeping for a (randomly long) while.
(define (run-thread (id String))
  (loop
    (@appendChild (.body $document)
      (@createTextNode $document (+ "Active thread: " id " ")))
    (sleep (* 1000 (@random $Math)))))
So, how can a Wat thread sleep inside a loop when JavaScript forbids sleeping? Why, with good ole delimited continuations:
To spawn a thread, I just wrap the call to RUN-THREAD in a prompt (which is just a unique object used as an identifier):
(define default-prompt 'default-prompt)

(define (spawn-thread (id String))
  (push-prompt default-prompt
    (run-thread id)))
Where it gets interesting is the SLEEP function which captures the continuation up to the prompt, and sets up a callback with JavaScript's setTimeout that will reinstall the continuation later:
(define (sleep (ms Number))
  (take-subcont default-prompt k
    (define (callback . #ignore)
      (push-prompt-subcont default-prompt k))
    ($setTimeout (js-callback callback) ms)))
So, first, SLEEP aborts up to and including the default prompt using TAKE-SUBCONT. It receives the continuation in the variable K. Once it has K, it defines a CALLBACK function that uses PUSH-PROMPT-SUBCONT to reinstall the default prompt and then continue with K. All that's left is to give this callback to setTimeout.
Then I can spawn independent threads:
(spawn-thread "thread-1")
(spawn-thread "thread-2")
(spawn-thread "thread-3")
(spawn-thread "thread-4")
(spawn-thread "thread-5")
Wat is very new, but it's already practical for adding concurrency and metaprogramming to JavaScript. Deployment is very easy. Include the single wat.js file, put some Lisp code in a <script> tag, and run it.
Wednesday, May 8, 2013
A new low in programming language design and implementation
The new Wat is the best, most lightweight way to implement a JavaScript-based programming language I have found so far.
Basically, I get away from JS as quickly and painlessly as possible, and start writing the language in itself.
So I define a very small set of primitives on the joint foundation of Kernel-like first-class lexical environments and fexprs and delimited continuations. Fexprs are a great tool for language-oriented programming, and delimited continuations allow me to escape from the browser's (and Node's) async hell and implement any concurrency and effect system I like.
To fexprs I also add macros. When a macro is used as the operator of a form, the form's code gets changed to the macro's output when the macro is first called, a technique I learned from here. I like macros because they make syntactic abstraction cost-free - with fexprs alone there is always an interpretative overhead. Still, Wat macros, like fexprs, do not work with quoted identifiers, but with first-class values, so many hygiene problems are avoided.
To delimited control I also add classic first-order control (sequential, conditional, loop, throw, catch, finally). This runs on the ordinary JS stack. Only when a continuation is captured does the stack get reified on the heap.
And last but not least, I use a JSON-based syntax for writing the language in itself. At first this was just intended as a quick way to not have to specify a parser for Wat, but I'm starting to like it. It allows Wat to truly be embedded in JavaScript.
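As a purely hypothetical illustration of the idea - assuming lists are encoded as JS arrays and symbols as plain strings, which may not be wat.js's exact convention - a Wat form embedded directly in JavaScript might look like this:

// Hypothetical array/JSON encoding of a Wat form (assumed, not wat.js's exact syntax):
// the program is just JavaScript data, so it needs no separate parser.
var program =
  ["define", ["double", "n"],
    ["*", "n", 2]];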
Wat does not have a type tagging or object system. It uses the raw JavaScript values.
The whole implementation is roughly 350 lines of JavaScript. After these 350 lines, I can already write Wat in Wat, which is just great.
Sunday, May 5, 2013
Some progress on the Wat VM
Wat is back! If you'll recall, Wat is my ultra-minimal (~500 lines of JS) interpreter for a Kernel-based language with delimited continuations as well as first-order control, and hygienic macros as well as fexprs.
I'm pretty excited by some recent and ongoing changes, which make Wat even smaller and turn it into more of a VM than a full language. Wat will provide (just) the following features:
- delimited continuations and delimited dynamic binding (higher-order control); these will be used to build cooperative multithreading with thread-local bindings
- try/catch/finally (first-order control) integrated with the JS stack, but suspendable by continuation capture
- fexprs as well as in-source self-modifying-code memoizing macros (which are hygienic, as they're built on Kernel)
- a native JS interface
And that's about it. This should give an extremely minimal yet powerful infrastructure for building JavaScript-based languages.
And I gave up on quasiquotation and Scheme-style hygienic macros again. I just cannot get them to work in a satisfying manner.
Exempli gratia, here's some initial Wat VM "microcode" for bootstrapping a vaporlanguage.
Sunday, April 28, 2013
A quasiquote I can understand
I've written two Lisps (1, 2) with quasiquotation, and in both, quasiquotation was the most difficult thing to implement, and gave me the most headaches. That shouldn't be, right? After all, it only creates new forms.
I think now I've found a formulation for quasiquote that has a really simple implementation, and yields more or less the same results as existing quasiquote implementations.
Some definitions:
- `foo stands for (quasiquote foo), `(foo) stands for (quasiquote (foo)).
- ,foo stands for (unquote foo) and is only allowed within a quasiquote.
- ,@foo stands for (unquote-splicing foo) and is only allowed within a quasiquote.
- Quasiquote, unquote, and unquote-splicing only ever take a single operand.
- `foo = 'foo, i.e. a quasiquoted symbol yields simply the symbol.
- `"foo" = "foo", `12 = 12, and likewise for other literals.
The main difficulty I previously had with quasiquote came from unquote-splicing, which inserts a list of multiple elements into the constructed list (whereas nested quoted or unquoted forms only insert a single element). The main idea in this new formulation is to make inserting multiple elements the default, and define nested quoted or unquoted elements as simply inserting a list containing a single element.
Every `(...) expression therefore stands for an APPEND of the list's further processed elements.
For example, given
(define foo 1)
(define bar 2)
(define quux '(3 4))
the quasiquote expression
`(foo ,bar ,@quux)
stands for
(append (list 'foo) (list bar) quux)
which produces the following when evaluated:
'(foo 2 3 4)
So, processing a quasiquoted list works by wrapping each element, except for unquote-splicing forms, in a call to LIST, and APPENDing the results. Quoted elements (foo) get processed recursively. Unquoted elements (bar) are passed to the call to LIST unprocessed. Unquote-splicing forms (quux) are inserted directly into the APPEND form.
I haven't implemented this yet, but I think defining a quasiquoted list `(...) as an APPEND really simplifies things.
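Since this is explicitly not implemented yet, here is one possible sketch of the APPEND-based expansion in JavaScript, assuming forms are encoded as arrays with strings for symbols (the function name qqExpand and the encoding are my assumptions, not part of Wat):

// Sketch of APPEND-based quasiquote expansion (assumed encoding, not Wat code):
// forms are JS arrays, symbols are strings, unquote/unquote-splicing are tagged arrays.
function qqExpand(form) {
  if (!Array.isArray(form)) {
    // `foo = 'foo for symbols; other literals (`12, `"foo") pass through unchanged.
    return typeof form === "string" ? ["quote", form] : form;
  }
  if (form[0] === "unquote") return form[1]; // ,foo just evaluates foo
  // `(...) stands for an APPEND of one sub-list per element.
  var segments = form.map(function (elt) {
    if (Array.isArray(elt) && elt[0] === "unquote-splicing") {
      return elt[1]; // ,@foo is inserted into the APPEND form directly
    }
    return ["list", qqExpand(elt)]; // everything else contributes a single element
  });
  return ["append"].concat(segments);
}

// qqExpand(["foo", ["unquote", "bar"], ["unquote-splicing", "quux"]])
// => ["append", ["list", ["quote", "foo"]], ["list", "bar"], "quux"]
// which is exactly the (append (list 'foo) (list bar) quux) expansion shown above.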
Friday, February 1, 2013
Taf's translation to O'Caml for type-checking
Taf is my new vapor-Lisp with row polymorphism, delimited continuations, and hygienic macros.
(Warning: incoherent rambling ahead!) Taf has a class-based object system with no inheritance. A class defines which slots an instance of this class has, and which methods are applicable to it. Every class also implicitly defines a class type. In addition to class types, there are also interface types or simply interfaces. An interface type defines a suite of methods applicable to objects of this type. Everything is an object of a single class, but may have many compatible class types and interface types. Every object is a member of a special top type. There is no implicit subtyping: objects need to be upcast to top.
All Taf objects are encoded as O'Caml objects. There is one O'Caml class for each Taf class. All classes inherit from a top class. Interfaces (method suites) are also defined as O'Caml classes. Any object can be statically upcast to top or any of the interfaces it implements. This is structural: an object can be upcast to an interface if it has all its methods. Objects have full RTTI, so they can also be dynamically downcast, resulting in an exception if the object is not of the given type. (A more convenient TYPECASE is provided as a wrapper.) Internally, downcasting is implemented via Obj.magic on the O'Caml side, and via a dynamic type-check in the VM. So Taf supports, for example, heterogeneous containers containing arbitrary instances of top or of any other type. Any object can be put into the container, and taken out again, and cast back to its original class type. Likewise, it's possible to write methods that operate on objects of arbitrary types, such as Java's EQUALS. Types are parametric.
Another aspect is the semantics of the global environment. O'Caml's is basically that of a LET* for values, and LETREC only for groups of functions. But Lisp requires LETREC* for all values. So every binding must be encoded as a reference cell containing a Maybe on the O'Caml side, to model bindings which may be undefined.
The runtime, and also the code that produces O'Caml code will run in the browser. Eventually, the type-checker will be implemented in the language itself, so O'Caml will no longer be needed.
Update: here's a sneak preview of the Taf Language Manual.
Monday, January 14, 2013
Current project
In my quest for a good Lisp, I could no longer ignore static types.
See Taf - A plan for a statically-typed Lisp.
There shouldn't be any difficult roadblocks, so I expect a release sometime in or before summer.
Saturday, September 8, 2012
Having both fexprs and macros
Lexically-scoped fexprs and first-class environments make it simple to do hygienic metaprogramming, using less machinery than hygienic macro systems, and they also require fewer concepts in the language: there is no need to specify a preprocessing phase.
The downside is that fexprs always incur an interpretative overhead with current technology. Many are convinced that those fexprs that do similar things to what macros do can be partially evaluated: "my research in static analysis leads me to believe that we will be able to erase that overhead for the common case--where first-class macros are used to do the job of compile-time macros" writes Matt Might, for example.
In my new language, Wat, I've implemented both fexprs and macros. Macros are expanded at runtime, when they are first encountered, and their result is memoized in the syntax tree, a technique described in two interesting articles (1, 2) related to the SCM Scheme implementation.
In Wat, MACRO is a special form that can be wrapped around a combiner, and causes calls to that combiner to be memoized in the source tree. An example, LET:
(def let
  (macro (vau (bindings . body) #ign
    (cons (list* lambda (map car bindings) body)
          (map cadr bindings)))))
With a bit of sugar, one can write macros that look almost like in Common Lisp:
(define-macro (until test . body) (list* while (list not test) body))
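To sketch what the in-source memoization could look like inside a small JS evaluator (illustrative assumptions only - Macro, the array-based form representation, and the applicative fallback are mine, not wat.js internals):

// Illustrative sketch of in-source macro memoization (assumed names, not wat.js internals).
function Macro(expander) { this.expander = expander; }

function evalForm(form, env) {
  if (!Array.isArray(form)) {
    // Strings act as symbols looked up in the environment; other atoms self-evaluate.
    return typeof form === "string" ? env[form] : form;
  }
  var op = evalForm(form[0], env);
  if (op instanceof Macro) {
    // First call: run the expander, then overwrite the source tree in place,
    // so subsequent evaluations see the already-expanded code.
    var expansion = op.expander(form.slice(1), env);
    form.length = 0;
    expansion.forEach(function (x) { form.push(x); });
    return evalForm(form, env);
  }
  // Fallback for this sketch: treat op as an applicative JS function.
  return op.apply(null, form.slice(1).map(function (arg) { return evalForm(arg, env); }));
}

Note that the expander can splice first-class values (rather than symbols) into the tree, which is where the hygiene benefit described below comes from.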
Macros complicate the language quite a bit; but used in moderation, especially for forms like LET that are basically never changed nor used in a higher-order fashion, they should be unproblematic, and offer a nice speed boost. Fexprs should be used for more complex tasks that require special attention to hygiene, or need to do things that macros can't do, whereas macros could be used for simple transformation and processing tasks, such as UNTIL, above.
Oh, and it should also be noted that these macros enjoy a nice level of hygiene already, by virtue of first-class environments and first-class combiners. For example, UNTIL above doesn't insert the symbols WHILE or NOT into the generated code - it inserts the actual values, therefore being protected from variable shadowing by calling code.
Thursday, September 6, 2012
Mixing first-order and higher-order control
It's desirable for a language to support exceptions (preferably restartable ones), unwind protection, dynamic binding, and delimited continuations. [Adding Delimited and Composable Control to a Production Programming Environment, Delimited Dynamic Binding]
I've found a tractable way to implement these features in the language I'm currently working on, Wat.
My approach is to totally separate first-order control from higher-order control.
There is a set of Common Lisp-like first-order forms:
- block and return-from that establish and invoke a lexically-scoped one-shot escape continuation, respectively.
- unwind-protect aka "finally". Notably, unwind-protect is only sensitive to return-from, not to aborts via higher-order control.
These forms are implemented natively using JS try/catch and finally. Restartable exceptions are implemented in terms of these first-order forms and dynamically-bound variables, which are also provided natively.
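As a rough illustration of that mapping (a minimal sketch in plain JavaScript, not wat.js's actual code), BLOCK/RETURN-FROM can be emulated with a unique token thrown as a JS exception, and UNWIND-PROTECT maps onto try/finally:

// Minimal sketch: BLOCK / RETURN-FROM on top of JS exceptions (illustrative, not wat.js code).
function block(body) {
  var token = {};                       // unique identity for this block
  function returnFrom(value) {          // escape function passed to the body
    throw { token: token, value: value };
  }
  try {
    return body(returnFrom);
  } catch (e) {
    if (e && e.token === token) return e.value; // our own non-local exit
    throw e;                                    // someone else's - keep unwinding
  }
}

// UNWIND-PROTECT as try/finally.
function unwindProtect(body, cleanup) {
  try { return body(); } finally { cleanup(); }
}

// Usage: returns 42, skipping the rest of the body.
var r = block(function (exit) { exit(42); return "unreached"; });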
In addition there's a completely separate set of higher-order control forms from A Monadic Framework for Delimited Continuations.
Delimited continuations are implemented using a technique similar to Exceptional Continuations: ordinary code paths run on the normal JS stack; when a continuation is captured, the stack is unwound frame by frame up to the prompt, and at each frame, a resumption is added to the continuation that is built up during the unwinding. This technique is ten times faster than a naive scheme with heap-allocated stack frames, but currently doesn't support TCO.
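For a flavor of that technique, here is a condensed, hypothetical sketch (names like Capture, withFrame, pushPrompt, and takeSubcont are mine; this is not the actual wat.js machinery, and it omits details such as reinstalling the prompt on resumption):

// Condensed sketch of exception-style continuation capture (assumed names, not wat.js code).
function Capture() { this.resumptions = []; }    // travels up the JS stack

function withFrame(thunk, rest) {
  // 'rest' is what this stack frame would do with thunk's value.
  try {
    return rest(thunk());
  } catch (e) {
    if (e instanceof Capture) {
      e.resumptions.push(rest);   // remember how to resume this frame later
    }
    throw e;                      // keep unwinding toward the prompt
  }
}

function pushPrompt(thunk) {
  try {
    return thunk();
  } catch (e) {
    if (e instanceof Capture) {
      // The captured continuation replays the saved frames, innermost first.
      var frames = e.resumptions;
      return e.handler(function resume(value) {
        return frames.reduce(function (v, rest) { return rest(v); }, value);
      });
    }
    throw e;
  }
}

function takeSubcont(handler) {
  var c = new Capture();
  c.handler = handler;
  throw c;
}

// pushPrompt(function () {
//   return withFrame(function () { return takeSubcont(function (k) { return k(1); }); },
//                    function (x) { return x + 10; });
// }) // => 11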
First-order control is used for quotidian control flow, whereas higher-order control is used for heavy control flow lifting, such as making a REPL written in direct style work in the browser's asynchronous environment.
This is a quite intuitive model: in the small, one has the usual Common Lisp control flow, including restartable exceptions, whereas in the large, behind the scenes, control flow may be arbitrarily abstracted and composed with the higher-order control forms.
Wednesday, September 5, 2012
Reactive Demand Programming
David Barbour has created a very promising and exciting paradigm for writing interactive, networked applications: Reactive Demand Programming (RDP).
RDP is very sophisticated and I can't really do it justice here, but its salient points are:
- An RDP application is a dynamically-changing set of semi-permanent, bidirectional data exchanges, called behaviors. What pipes are to Unix, behaviors are to RDP.
- A signal is a monodirectional channel carrying a value that varies discretely over time.
- A behavior is made up of one or more input signals, called demands, and a single output signal.
- RDP builds in the notion of anticipation - every signal update comes with the expected duration the new value is valid. This allows low latency through smart scheduling.
- [Update: See David's corrections in the comments.]
An example of a behavior would be a camera: it receives move and zoom commands (or rather demands) over discrete time intervals (e.g. for as long as a user moves the camera joystick) from one or more users on its input signals, interpolates these commands using some deterministic decision procedure (such as averaging all commands), and outputs camera frames on its output signal. The anticipation measure tells clients something about the rate at which the camera refreshes, allowing them to schedule display reprocessing smartly.
The best way to get started is the README of David's RDP implementation. David has also just posted a progress report on his blog, which contains many articles on RDP.
I think RDP is one of the most exciting developments in interactive application development and is worth checking out.