Tuesday, August 7, 2012

Scheme's missing ingredient: first-class type tags

I've always felt that Scheme does data and types wrong, but I couldn't really say what bothered me until yesterday. After all, R7RS now has define-record-type for creating disjoint types. But that ain't enuff!

In my new language, Wat, I've found a satisfying solution, inspired by Kernel's encapsulation types, but going beyond them slightly. There are two procedures:
  • (make-type) returns a list of three elements: 1) a first-class type 2) a tagger function for tagging a value with that type, creating a tagged object and 3) an untagger function for extracting the value of tagged objects of that type.
  • (type-of obj) returns the first-class type of an object. The crucial point is that type-of not only works for tagged objects of user-defined types created by make-type, but also for all built-in types. E.g. (type-of 12) will return the number first-class type.
This system has the benefits of Kernel's encapsulated types: only someone with access to the tagger function (capability) may create new instances of a type. Only someone with access to the untagger function may access the contents of tagged objects. So object contents are potentially fully encapsulated.

But at the same time, the fact that every object, including built-in ones, has a first-class type makes it possible to efficiently program generically. E.g. one may create Smalltalk-like dynamic dispatch by using the first-class types of objects as indexes into a virtual lookup table of a generic function. This is not possible in either Scheme or Kernel. In both languages, programming generically requires one to use a cascade of type predicates (e.g. number?).

Example in Wat:
; destructuringly bind the three elements
; returned by make-type
(def (person-type person-tagger person-untagger) (make-type))

(define (make-person name email)
  (person-tagger (list name email)))

; untagger also performs a type check that
; person is in fact of type person
(define (get-name person)
  (car (person-untagger person)))

(define (get-email person)
  (cadr (person-untagger person)))

(define p1 (make-person "Quux" "quux@example.com"))
(get-name p1) --> "Quux"
(get-email p1) --> "quux@example.com"

12 comments:

Kajetan Rzepecki said...

I can't see the benefit of such system over:

(define *user-type* 0)

(define (make-type)
(let ((t *user-type*))
(set! *user-type* (+ 1 *user-type*))
(list t
(lambda (o)
(list t o))
(lambda (o)
(if (and (pair? o)
(equal? (car o) t))
(cdr o)
(error "Type error"))))))

Could you give an example of a generic function dispatching an object using this system and explain how is it superior to using my somewhat simple code?

dmbarbour said...

Use of sealer/unsealer pairs is a powerful technique able to model many things, including first-class ADTs.

Anonymous said...

How does this compared to Typed Racket?

J.V. Toups said...

Typed Racket is statically typed, this system is fully dynamic.

fogus said...

I really like the idea of separate create/access functions. I've not run into this pattern before. Care to share your sources for inspiration here? (if any)

I was a little surprised that make-type takes no arguments, but maybe this is a first pass implementation. It's not immediately apparent how to extend make-type beyond accepting validation function(s), but I like the potential there and in the split create/access.

Looking forward to learning more.

fogus said...

One other question - how does one use the type tag itself? Do you plan on using it to define some sort of polymorphic dispatch? I suppose I should look at Wat to see if I can get a clue.

fogus said...

And by that I wonder if you plan to create dispatch as part of the language itself or rather simply provide the tools to define type-dispatch in various ad-hoc ways?

Manuel Simoni said...

Yes, just the tools.

I stole the idea of using two functions from Kernel, but as David pointed out, it's a quite common pattern.

patrickdlogan said...

I'm going to have to dig deep and maybe reach up high on a bookshelf, but I recall T's type system provides for independent create and access procedures.

Shriram Krishnamurthi said...

The example here doesn't actually explain the intended meaning. Is a tag an attribute of an existing value, or a new kind of value? That is, is the return value from make-person a list or a new kind of thing entirely -- if I run (car (make-person ...)) is that a valid projection or does it signal a non-list error?

This is critical for understanding whether this is more like a capability system's brands or just a generative structure mechanism.

[NB: As dmbarbour says, this pair-of-procedures idea is ancient. Indeed, this same mechanism has been around for 10-15 years in Racket: make-struct-type returns five values, a type, a constructor, a predicate, an accessor, and a mutator. See http://docs.racket-lang.org/reference/creatingmorestructs.html?q=struct ]

Manuel Simoni said...

(car (make-person ...)) is an error.

You have to (car (person-untagger (make-person ...))).

John Cowan said...

I invented a variant of this some years back in which (make-type) does not return a first-class object, but returns a tagger, an untagger, a predicate (which serves the purpose of the first-class object, and eliminates type-of), and a thunk which when called is the same as calling (make-type) except that the predicate it returns falls back to the third value returned by this call, thus being in effect a subtype.