Saturday, November 9, 2024

An alternative idea for a typed language living alongside/inside JavaScript

What if instead of trying to type existing JS code, like TypeScript, we made objects of a new typed language completely separate from JS objects?

The TypeScript idea is to be able to type any old JavaScript that's out there. On the one hand this is admirable, but OTOH it leads to a very complicated type system.

Here's an alternative idea: instead of typing existing, legacy JavaScript, treat it as completely unsafe and outside of the system. The only safe part will be that written in the new language, let's call it Qwax for now.

Qwax will have a very simple object-functional type system like Ben Titzer's Virgil: classes, functions, type parameters, nothing else.

Qwax will also have full RTTI (so you can distinguish e.g. an array of strings from an array of numbers at runtime). This is a big difference from TS, which uses type erasure.

RTTI will also make it possible to distinguish Qwax objects from JS objects. (Essentially, every Qwax object will have a special symbolic property that points to its Qwax type information, or be an instance of a special QwaxObject class.)

JS builtins like booleans, numbers, and strings will be special cased, so they are also native Qwax booleans, numbers, and strings.

Other than RTTI for all its objects, Qwax will have exactly the same semantics as JS and transpile to JS in a straightforward way.

What are the benefits of this?

A very simple type safe language for the JS ecosystem. You could write type safe code with roughly the same performance and code size as JS, without needing the complex TS type system. (If I had to guess I'd say you can probably write a type checker for a language like this in under 2000 LOC, but I'll have to check the Virgil source, and hope Ben doesn't do a big eyeroll if reads this.)

How would you interop with JS? 

JS would essentially be an unsafe "foreign" language for Qwax code. There would absolutely be no integration between JS types and Qwax types. For Qwax all JS objects (other than builtins) are of type ForeignObject. Due to RTTI you could do a safe cast from a ForeignObject to a Qwax object (the cast may fail, but will never produce an invalid result).

So there would be much more friction when calling JS from Qwax than there is when calling JS from TypeScript. But many systems have been written against such foreign function interfaces, and maybe it's not such a big deal.

JS could call all Qwax functions normally, and work with Qwax objects like normal JS objects.

How would this be different from languages like Elm or ReScript?

Unlike those languages, and like TS, it would not be a completely new language, it would be "(a subset of) JS with types". It would reuse existing JS syntax. Some programs could be ported to it simply by adding type declarations.

Furthermore, via RTTI, it would allow safely casting back Qwax objects that have been passed to unsafe JS code. I don't think Elm or ReScript support that.

Downsides

Every new language is a lot of work and introduces a lot of friction. Maybe we could reuse existing TypeScript syntax (but obviously not semantics) to get some mileage out of LLMs for this new language.

It's unclear how much overhead the RTTI introduces. I think there would essentially be one more superclass in the chain for each object due to generics. E.g. a list of numbers would be of class List<Number> which would be represented by a JS class that's a subclass of the representation of List<T>. But I could be completely wrong and missing something.

Status

I am not working on this, but I keep thinking about it, so who knows? Today I had the idea of completely separating Qwax objects from JS objects so I wanted to blog about it and invite comments.

Thursday, May 2, 2024

Delimited Generators - A more natural API for JS generators

I have been studying ways to work around the horrors issues of JavaScript's async APIs for years. I have even built a series of increasingly elaborate continuation-based Lisp interpreters (here's the latest one, it's quite good, if I may say so).

But recently I finally came to the point where I understood JS generators well enough to realize that generators already solve the problem! With a small constant syntactic overhead (having to use function* to define generators, having to use yield* to call them, and using next() to call them from non-generator functions), one can program asynchronous code in a quasi-direct style.

But the plain generator interface is rather low-level, and not very intuitive to use. So I built delimgen, a thin layer on top of plain generators, that mimics delimited control. Delimited control is initially hard to understand, but once you grok it it's a very natural approach (previous post).

Here's how a simple delimited generator looks like. See it in action here.

You can easily spawn multiple independent generators. See it in action here.

You can also do blocking event loops easily. See it in action here.

I'm not claiming any novelty here. You also cannot do anything with this library that you couldn't do with plain generators, but for me, seeing that you can write quasi-blocking code in JS with some small overhead was a real eye opener.

Update: Thanks to mmastrac on HN for pointing out that this leaks stack - back to the drawing board!

Update 2: Turns out the supposed stack leak was just an artefact of the "async call stack" feature of browser devtools, which keeps stacks for promises around for debugging, even though there isn't any real stack growth when running normally (outside of devtools).

(See also: Lobsters, with an alternative formulation using async/await by easrng)

Sunday, January 7, 2024

Common Lisp's BLOCK / RETURN-FROM and UNWIND-PROTECT

I was just chatting with Ben Titzer on Twitter about control flow in his Virgil language (which is cool and you should definitely check out), when I felt the need to once more promote how Common Lisp does non-local control flow (stuff like returning early from a function or breaking from a loop), because I think it's a very nice solution.

So in Common Lisp we have BLOCK / RETURN-FROM (which work as a pair) and UNWIND-PROTECT.

BLOCK / RETURN-FROM

BLOCK and RETURN-FROM effectively offer the same functionality as C's setjmp/longjmp -- non-local exits -- but nicely wrapped as we expect it in a lexically-scoped, expression-oriented language.

BLOCK / RETURN-FROM lets you do:

  • Early returns from functions or arbitrary code blocks
  • Breaking from loops, including any number of nested loops
  • Continuing in loops, including any number of nested loops
  • Even arbitrary GOTOs in a code block (with some macrology & trampolining, see Baker's TAGBODY)

(block name forms*) lexically binds name within the forms as a non-local exit from which you can return a value with (return-from name value). Just (return-from name) without a value uses nil as the value.

A BLOCK without any RETURN-FROM just returns the last value: 

(block b 1 2 3) returns 3.

This prints 1 and returns 2:

(block b (print 1) (return-from b 2) (print 3))

You can have any number of nested blocks:

(block b1 ... (block b2 ... (return-from b1) ...) ...)

To do an early return from a function, place a block at its beginning:

(defun foo ()

    (block b 

        ...

        (return-from b)

        ...))

(Common Lisp automatically places an anonymous block around every function body, so you don't need to do this in practice, but my hobby Lisp doesn't, and I'm using this explicit approach, and I like it.)

To break from a loop, place a block around it:

(block break

    (loop

        ...

        (return-from break)

        ...))

To continue in a loop, place a block inside it:

(loop

    (block continue

        ...

        (return-from continue)

        ...))

You can have multiple nested loops, like in Java:

(block break-outer

    (loop

        (block break-inner

            (loop

                ...

                (return-from break-inner)

                ...))))

UNWIND-PROTECT

UNWIND-PROTECT is effectively a try/finally block (without a catch).

(unwind-protect protected-form cleanup-forms*) evaluates the protected form, and regardless of whether it returns normally or does a non-local exit with RETURN-FROM, the cleanup forms are evaluated.

(unwind-protect (foo)
   (bar)
   (quux))

is analogous to

try {
   return foo();
} finally {
   bar();
   quux();
}

Both of the following expressions print 2 and return 1:

(unwind-protect 1
   (print 2))

(block exit
   (unwind-protect (return-from exit 1)
      (print 2)))

Conclusion

Common Lisp's BLOCK / RETURN-FROM and UNWIND-PROTECT offer a minimalistic and expressive system for non-local control flow.