Friday, March 30, 2012

Programming and boringness

"Computers are useless. They can only give you answers." -- Picasso  
"The use of a program to prove the 4-color theorem will not change mathematics - it merely demonstrates that the theorem, a challenge for a century, is probably not important to mathematics." -- Perlis
Once we can program a computer to solve a problem, that problem becomes boring.

It seems that when something is programmable, computable, it immediately loses its appeal as an endeavour for humans. Something that can be solved by an unthinking machine is no longer worthy of study by humans. (What remains of interest to humans are better ways to solve the same problem - writing the same program better, more efficiently, more elegantly - but these concerns seem second-order once we have solved the problem.)

Of course this question opens a whole can of worms about what it means to be computable - e.g. does it have to be economically computable, or merely computable before the end of the universe? But in general, I think programming is interesting because it touches on the deeper issue of what it means to be boring.

Current status

Wednesday, March 28, 2012

Current status

Tuesday, March 27, 2012

Kernel Google Group

There's now a klisp Google Group for discussion of the klisp implementation of Kernel, and of Kernel in general. Check it out, there are some interesting posts already.

Also, there are now at least 12 implementations of Kernel and related languages.

Friday, March 23, 2012

[Scheme-reports] Just a load of.. well..

J. A. "Biep" Durieux has posted a highly interesting message to Scheme-reports that should resonate with the "oh my, Scheme is much too big" crowd. (Which does exist, and does have a point.)

I have excerpted some highlights, and recommend reading it all:



Naming the Scheme languages
Therefore I propose "Schemer" for small Scheme, and "Schemest" for large Scheme. The one name clearly indicates its roots, and the other the fact that it is the maximal Scheme. The name "Scheme" would remain as the family name, and the report could be subtitled "Scheme: Schemer, Schemest".

The asymmetry between multiple inputs and multiple outputs for procedures
This is inherent in Scheme syntax: the receiver of the output of an expression is indicated by the place where this expression occurs - which is necessarily singular: (get-some g1 (provide p1 p2 p3) g2). Repeating the expression would lead to repeated execution: (get-some g1 (provide p1 p2 p3) (provide p1 p2 p3) g2) - apart from being cumbersome.
Prolog has a symmetry in its syntax here that languages with nested expressions simply don't (and I think can't) have. Note that this is independent of unification: it would be possible to have a unidirectional prolog-like notation, with a group of input variables and a group of output variables, say separated by a hyphen:
(lambda (a b - result rem quot)
  (quotient+remainder a b - quot rem)
  (* quot rem - result)
  (display result -))
Barely possible is "splicing in" multiple results - which would require them to be adjacent. A splicing construct (+ 1 ,@(quotient+remainder 25 4) 6) might do - see Lua for an even more constricting solution.
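JavaScript's spread syntax gives a rough feel for the splicing construct Biep describes: a combiner returns its multiple results as an adjacent group (an array), and the caller splices them into an argument list. (`quotientRemainder` below is a made-up helper for illustration.)

```javascript
// Approximating (+ 1 ,@(quotient+remainder 25 4) 6) with array spread.
function quotientRemainder(a, b) {
  return [Math.floor(a / b), a % b];   // two results, packed adjacently
}

// The two results are spliced in where they occur, as Biep requires.
const sum = [1, ...quotientRemainder(25, 4), 6].reduce((x, y) => x + y, 0);
// sum === 1 + 6 + 1 + 6 === 14
```

Note that, exactly as in the splicing proposal, this only works because the results are adjacent in the argument list.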

Tables and environments
This would lead to mutable first-class environments.
Let identifiers be hygienic - translated to locations (or location indicators, as the case may be) at the moment a procedure is defined. So (+ x 1) doesn't change when another x is inserted in a more local environment - or when it is deleted from the original environment. In the latter case the location is not garbage-collected, because this code still refers to it. It is also possible to write (+ (get-location-of 'a) 1), and *that* code is sensitive to environment manipulation.
(There might well be a flag indicating that each mention of an identifier is meant as a call to get-location - some kind of "don't inline" pragma.)
Oh, and environments shouldn't be different from other tables, and should have a way to indicate where to find a location that is not locally available. In that way various schemes of multiple inheritance would become trivially implementable.
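The location-based hygiene Biep sketches is essentially what closures already do for variables. A JavaScript sketch of the same behaviour:

```javascript
// An identifier is resolved to a location when the procedure is defined;
// shadowing it later changes nothing, but mutating the location is visible.
let x = 10;
const addOne = () => x + 1;     // captures the *location* of this x

{
  let x = 99;                   // a more local x; addOne is unaffected
}

x = 20;                         // mutate the original location
const result = addOne();       // 21: the closure sees the new value
```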

Multi-pass compilation
This is like solving a consistency maintenance system (Jon Doyle called that "truth maintenance", some others "reason maintenance"), which is more for constraint satisfaction systems than for Scheme. Single pass with maximal deferment is the Scheme way: variables inside lambdas don't need to be resolved yet; resolution can be deferred until they are called. Without very strong reasons, one should not request different behaviour from syntax transformers. Ease of compilation and speed of the resulting system are not strong reasons.

Sublanguages
Once there was Lisp/Scheme, which was about lambda. Then a completely different language for macro rewriting was added, and now a (hidden) constraint satisfaction system. I don't like that, it is feature on top of feature. A rewriter can be written in Scheme and then used in macro expansion - great, but let the rewriter then be generally available (because it has other possible uses), and be optional in writing syntax (because there are other ways that sometimes may be better). The same is true for a consistency maintenance or constraint satisfaction language - let them be first-class, explicit languages that one may use or not according to one's liking. That's what libraries are for.

Minimality
Something else is lost here. Scheme was about the right thing, but also about the minimal thing. Scheme provided a basis which was like a semantic assembly language: it provided primitives with which to build. Yes, certain complex systems ease programming - so let's make sure such systems can be written in Scheme, but let's not make them the basis of Scheme! It sounds like old-school anthropology: let's give primitives the right to remain primitive - while making sure they are NOBLE primitives.

Efficiency
Implementability is an issue for Scheme language design, but ease of implementation, or efficiency of implementation ought not to be. (I realise there are fringe issues, such as a theoretical lower limit of 2^2^n in space or time..) Compilability is nice, but never a basis for design decisions.

Macros
The principle of least surprise is already violated with macros. Macros must be bound before use; there is no equivalent to anonymous lambdas. Imagine ((if freevar ) arg ...) - this cannot be in Scheme, because Scheme wants to evaluate the arguments together with the head, whereas the head stipulates whether the arguments need to be evaluated in the first place. I think this evaluation rule reflects a thought error: a seeming symmetry was taken for a real one. Anonymous macros would be a boon, as would be runtime evaluation of expression heads, even if they are or may evaluate to macros. Right now a real symmetry (between procedures and macros) is lost for the sake of a false one.
This is about removing restrictions in the Clinger-sense, I think - having first-class syntactic transformers. Currently they have a status comparable to CL's functions - without the possibility of anonymous transformers or SYNCALL.
There is also no syntactic equivalent of "direct code" as opposed to code stored in a lambda form. Direct syntactic code would of course be immune to later redefinitions of transformers - as the transformation has taken place at the moment the single evaluation pass dealt with that bit of code.
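For contrast: where combiners are ordinary first-class values, a computed head is unremarkable. A tiny JavaScript sketch of the runtime head-evaluation Biep asks for (names invented for illustration):

```javascript
// The head of a call is just an expression; it is evaluated at runtime
// and may yield either of two combiners depending on a free variable.
const plus  = (a, b) => a + b;
const times = (a, b) => a * b;

const freevar = true;
const result = (freevar ? plus : times)(2, 3);   // head chosen at runtime
// result === 5
```

The catch, of course, is that this only works because JavaScript functions always evaluate their arguments; if the head could evaluate to a macro-like combiner, the evaluator could not know up front whether to evaluate the arguments - which is exactly the asymmetry Biep is pointing at.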

Strings
Wouldn't a Schemish string be a sequence of character objects (soco), where a character object can be complex (base + modifiers)? Just as s-expressions respect the grammatical complexity of expressions, a soco would respect the grammatical complexity of text. If the sequence is a vector, string indexing and string-set! would be O(1). Yes, socos take more space than classical strings, but the same is true for s-expressions as opposed to flat program code.
For Schemes with only ASCII, the set of character objects would be the ASCII set, and the pointers could be 8-bit, i.e. the strings could be the classical string that we all know and love/hate. There's backwards compatibility.
Those who want size-changing substitutions can use some kind of tree representation of the sequence instead of vectors (lists being an extreme kind of unbalanced trees).
The interesting thing would be that people can choose their sequencing level: if you want to reason about code points, use sequences of code points; if you want to reason on the syllable level, make sequences of syllables. It is possible to have several levels: a sequence of words, each of which is a sequence of characters, each of which is a sequence of code points.
In other words: Scheme should not prescribe one crippling string format, but rather a set of specifiers with which people can define the strings they need. A classical ASCII-like one is obligatory, the others are optional.
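A minimal sketch of such a "soco" in JavaScript (hypothetical representation: each character object is a base character plus a list of modifiers):

```javascript
// A string as a sequence (vector) of character objects.
const ch = (base, ...modifiers) => ({ base, modifiers });

const soco = [ch("e", "acute"), ch("t"), ch("e", "acute")];  // "été"

soco[1] = ch("s");                              // string-set! analogue, O(1)
const bases = soco.map(c => c.base).join("");   // flatten to base characters
// bases === "ese"
```

Since the sequence is an ordinary array, indexing and in-place update are O(1), at the space cost Biep concedes.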

Wednesday, March 21, 2012

Nuff said

Call for use cases for Dart package system


Everybody loves to debate package/namespace management systems. But remember that some of our biggest software successes (C, Emacs, JS) get by fine without them. As Casey McCann says, no management is better than mismanagement.

"should I really take the time to learn fexprs?"

You should take the time to learn fexprs if you answer yes to any of these questions:
  • You like Lisp, and want to learn even more about its rhyming scheme. Subjectively, a Lisp with fexprs feels even more Lisp than Lisp.
  • You are dissatisfied either with the need for GENSYM or with the complexity of a hygienic macro system. Fexprs require neither.
  • You think that everything in a PL should be first-class.
  • You'd like to see how a single concept can replace both functions and hygienic macros.
  • You want to implement a PL with as little code as possible (since fexprs give you the power of both functions and macros, you don't need to implement a macro system to bootstrap your PL.)
  • You like to get your mind blown.
  • You like hygienic macros, and want to understand better how they work. Even if you don't use fexprs, understanding them can show you how hygienic macros are really a special case formulation of fexprs.
  • You think that PLs should be designed not by piling feature on top of feature, but by removing the weaknesses and restrictions that make additional features appear necessary.

Current status

Tuesday, March 20, 2012

"I want to understand vau"

Apparently, youse people don't know the R-1RK by heart yet.

Vau is just like lambda, except that it doesn't evaluate its operands, and it receives the environment in which it is called.

Vau's syntax is (vau operands environment . body).

Lambda's syntax is (lambda arguments . body).

((vau x e x) foo) ==> (foo)
((vau x e e) foo) ==> #[environment]

You can also destructure the operands:

((vau (x y z) e z) foo bar baz) ==> baz

And if you want to get at the actual value of an operand, you use eval, which takes a form and an environment as parameters:

(def msg "Hello world!")
((vau (x) e (eval x e)) msg) ==> "Hello world!"

Note: you can try out these expressions at the Virtua REPL.
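If the REPL isn't handy, the behaviour above fits in a toy evaluator. This is a sketch, not Virtua's actual code; the encoding (arrays as forms, strings as symbols, plain objects as environments, vau and eval as built-in heads) is an assumption made for brevity:

```javascript
// Toy vau evaluator: just enough to run the examples in this post.
function evaluate(form, env) {
  if (typeof form === "string") {              // symbol: look it up
    if (form in env) return env[form];
    throw new Error("unbound: " + form);
  }
  if (!Array.isArray(form)) return form;       // literals self-evaluate
  const [head, ...operands] = form;
  if (head === "vau")                          // (vau operands environment . body)
    return { params: operands[0], envParam: operands[1],
             body: operands[2], staticEnv: env };
  if (head === "eval") {                       // (eval form env), both evaluated
    const [f, e] = operands.map(o => evaluate(o, env));
    return evaluate(f, e);
  }
  const op = evaluate(head, env);              // operands stay *unevaluated*
  const local = Object.create(op.staticEnv);   // child of the definition env
  bind(local, op.params, operands);
  local[op.envParam] = env;                    // caller's environment, reified
  return evaluate(op.body, local);
}

function bind(env, params, tree) {
  if (typeof params === "string") env[params] = tree;    // x binds whole tree
  else params.forEach((p, i) => bind(env, p, tree[i]));  // (x y z) destructures
}

// ((vau x e x) foo)                ==> ["foo"]
// ((vau (x y z) e z) foo bar baz)  ==> "baz"
// ((vau (x) e (eval x e)) msg)     ==> 42, in an env where msg is 42
```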

Monday, March 19, 2012

Why fexprs, part N, or lambda only the penultimate

I now have a bit of experience with Kernel-style fexprs (short for fun expressions, not frankenprogramming expressions), and I must say I am amazed.

While it's true that one can obviously express fexprs easily in plain Scheme/LC through a little bit of encoding - an objection immediately raised by a friend - the converse is also true, as John Shutt explains. In a language with vau (the WOW! combinator), lambda is simply expressed as a fexpr that evaluates the operand tree, yielding a list of arguments, before entering the body. So it seems that neither vau nor lambda is preferable from a theoretical standpoint.

But when it comes to practical matters, IMO vau clearly wins over lambda, making lambda only the penultimate. Why? Take the (usually smug) Haskell programmers. They like to wax lyrical about how laziness obviates the need for macros, but then they spend all day writing papers about parallelizing the Fibonacci Butt Sort using LaTeX, which is totally based on macros. (You can read more about the Fibonacci Butt Sort here on 4chan /prog/, the second best PLT site.) Not to speak of the use of cpp in the Haskell sources. CPP, FFS!!

So while macros are shunned for theoretical reasons of pureness, they pop up everywhere somebody writes a real program. That's simply a fact, and if you disagree please go back to writing your LaTeX paper about your cpp-processed Haskell program.

Now that we've established that macros are a necessity, we obviously want the best semantics for our macros. Common macros are preprocessors: they take in expressions and spit out expressions. Conceptually, and often also actually, macros run in a separate phase from runtime, at compile-time. (See my cleverly titled post What's phase separation and when do you want it? for more information on phases.) This has two drawbacks. First, because macros are compile-time entities, they don't exist as first-class entities at runtime. But wait, that's the smaller drawback. The second, REAL drawback is that because macros run outside the real program, processing its representation as an expression, they have no friggin' clue whatsoever about the structure of the program. This is the root of hygiene problems: what a hygienic macro system does is fake lexical scope at compile-time. This is necessary because at compile-time, the wonderful machinery of lexical scope (which it took us from 1959 to 1975 to discover, and which is still rediscovered by ignorami every day) doesn't exist. And this faking comes at a huge cost in complexity. One example is that it necessitates a partial expansion step (R6RS, Racket) to find all definitions in a piece of code, which greatly adds to the complexity of a language. A second example is that if you want clean compilability, you end up with things like negative metalevels. All signs of Ptolemy's epicycles, as Tom Lord will tell you.

Hm, I went off on a tangent there. Where was I? Yes, semantics. We want macros, and we want good semantics for them. And that's what fexprs give us. It's getting late, and I have some Stoli to drink. More next time, Nostrovia!

Sunday, March 18, 2012

S, M, L

Quick note: I think in a user interface, objects should have at least three different sizes, or semantic zoom levels: S(mall), M(edium), and L(arge).

In a files-and-folders GUI, S would be the icon, M would be the opened file or folder, and L would be the details/properties dialog of a folder or file, showing even more information.

In a REPL, M would be the normal size for evaluated objects. Sub-objects, e.g. member variables, would be displayed as S. Inspecting an object switches it to the L setting.
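As a sketch of what that could mean in code (a purely hypothetical API), each displayable object would supply one renderer per semantic zoom level:

```javascript
// Three semantic zoom levels for a file object: S(mall), M(edium), L(arge).
const fileView = {
  S: f => f.icon,                           // icon only
  M: f => `${f.name} (${f.size} bytes)`,    // opened view
  L: f => JSON.stringify(f, null, 2),       // full details/properties
};

const render = (view, obj, zoom) => view[zoom](obj);
const line = render(fileView, { icon: "[f]", name: "a.txt", size: 3 }, "M");
// line === "a.txt (3 bytes)"
```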

Monday, March 5, 2012

This is your brain on CBPV


Rob Simmons has written an amazing new post - What does focusing tell us about language design? - on Call-by-Push-Value, an exciting new paradigm that exposes previously hidden fine structure of computation (previously).

I'm a PL dilettante, so I can't follow the theory behind CBPV, but it just seems right, and the thinking and writing behind it, in articles like Rob's, are clear, highly insightful, and practical.

Tuesday, February 14, 2012

Yet another Kernel interpreter

I wrote another Kernel-ish interpreter in JavaScript: Virtua.

This time, no tail-call optimization or first-class continuations, just plain old pedestrian control flow: LOOP, CATCH, THROW, UNWIND-PROTECT (unexpectedly, this didn't make the interpreter much smaller than the one with TCO and continuations.)

Probably the only interesting thing in there is that everything is fully OOP: EVAL is a message, as is COMBINE, as is MATCH - which means every object can potentially control its own evaluation, act as a combiner, and be used as a left-hand-side pattern in matching.

Again: Kernel is nice and elegant, but the interpretive overhead of fexprs is killer [Edit: haven't measured it, and see this]. Maybe it doesn't matter for what I want to do: script web GUIs.
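The message-based design can be sketched like this (invented names, not Virtua's actual API): every expression object answers an eval message, and whatever the head evaluates to answers a combine message.

```javascript
// Each object decides for itself how it evaluates and how it combines.
const Sym = name => ({
  evalIn(env) { return env[name]; }        // symbols look themselves up
});
const Lit = value => ({
  evalIn(env) { return value; }            // literals self-evaluate
});
const App = (op, ...args) => ({
  evalIn(env) {                            // dispatch COMBINE to the head
    return op.evalIn(env).combine(args, env);
  }
});
const Fun = f => ({
  combine(args, env) { return f(...args.map(a => a.evalIn(env))); }
});

// (plus 1 2) with a user-supplied environment:
const env = { plus: Fun((a, b) => a + b) };
const three = App(Sym("plus"), Lit(1), Lit(2)).evalIn(env);
// three === 3
```

An operative would simply implement combine without evaluating its args first - which is how this style lets every object control its own evaluation.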

Thursday, February 2, 2012

Algebraic Data Types

Reading this incredibly insightful article The Algebra of Data, and the Calculus of Mutation finally drove home the point to me why FP folks are so in love with their Algebraic Data Types: they connect wonderfully with very old (think B.C.) mathematical ideas - that every one of us has learned in school. The question is still whether that does make them good for programming. (HT Daniel)
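The algebra in a nutshell: a sum type adds the inhabitants of its cases, a product type multiplies them. A quick hand-rolled check in JavaScript (tagged objects standing in for ADTs):

```javascript
// |Either Bool Bool| = 2 + 2 = 4, and |(Bool, Bool)| = 2 * 2 = 4.
const Left  = v => ({ tag: "left",  v });
const Right = v => ({ tag: "right", v });
const pair  = (a, b) => ({ a, b });

const bools = [false, true];
const eitherBools = [...bools.map(Left), ...bools.map(Right)];
const pairBools = bools.flatMap(a => bools.map(b => pair(a, b)));
// eitherBools.length === 4 && pairBools.length === 4
```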

Friday, January 20, 2012

Fun with Kernel

Smug Kernel Weenies - known to be even smugger than Smug Lisp Weenies - are advised to check out Dale Schumacher's series on Kernel.

Besides vau (known as the Wow Combinator by fans, and as Frankenexpressions by critics), which unifies functions and (hygienic) macros in a single concept (with some open questions), one of my favorite Kernel features is first-class environments. They're just cute.

First-class environments have been shown to exhibit nice properties (e.g. you can write a module system in userland), and solve some of Scheme's problems (such as the letrec black hole), by Queinnec in the somewhat difficult to read Sharing Code Through First-class Environments. I think all designers of new dynamic languages should seriously consider them.
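A rough JavaScript analogue (names invented): objects serve as environments, and the prototype chain plays the role of the parent-environment link, which is enough to sketch a userland module.

```javascript
// First-class environments as plain objects with a parent link.
const globalEnv = { pi: 3.14159 };
const moduleEnv = Object.create(globalEnv);   // child env, inherits pi
moduleEnv.area = r => r * r * moduleEnv.pi;   // pi resolved through the chain

// A userland "module system": export by handing out bindings from the env.
const geometry = { area: moduleEnv.area };
const a = geometry.area(2);                   // 4 * pi
```

Because the environment is an ordinary value, you can pass it around, extend it, or build fancier lookup schemes on top - the point Queinnec makes.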

Friday, January 6, 2012

Why I find Dart's production mode evil

Why Dart Types Are Optional and Unsound contains the following example:
class Mammal {}
class Cow extends Mammal { String moo() => 'moo!'; }
class Pig extends Mammal { String oink() => 'oink!'; }

Mammal mammal = new Mammal();
Cow cow = new Cow();
Pig pig = new Pig();

mammal = cow; // [1]
pig = mammal; // [2] Checks OK statically, but now pig holds a Cow.
print(pig.oink()); // [3] NoSuchMethodException if we get this far.
In production mode, the type declaration that pig holds instances of Pig is simply ignored at [2]! IMO, this should never happen. The Dart team's reasoning is that this is still "safe", because you'll get a NoSuchMethodException at [3].

But think about it: you might get the NoSuchMethodException at a completely different place in the program! What good are type declarations/assertions when you never know whether they are true? It actually feels worse than having no type declarations at all!

Thankfully, in Dart's checked mode, you will get the error at [2], making this behavior less troubling. But given that the page says that "type checks are not a major drain on performance", why have production mode at all?

Update: Relief! Bob Nystrom writes on HN:
My understanding was that the prime original motivation for unchecked mode was indeed performance. As our Dart implementations are getting more mature, we're starting to get a better picture of how it does actually impact performance.

I think what we're starting to see is that the performance implications aren't that bad. Given that, I think we're focusing less and less on production mode. It's still a valid option and it may make sense for some users but it may not be commonly used, sort of like -Ofast in gcc.

Small data and humble sequentialism

Yeah, big data and massive parallelism in the cloud are exciting!

But there are two problems with the cloud:
  1. Technical: doing stuff in centralized data centers, when all of your users have multiple, hugely overprovisioned computing devices is a bit of a joke.
  2. Political: do you actually want to have all your users' data, and take on the huge responsibilities and risks associated with that?
The cloud is really a reaction to browsers' highly limited request-response interaction style. But browsers have changed now, they're becoming little operating systems.

May I suggest to take a fresh look at small data and humble sequentialism in 2012?

Type systems vs interactivity

On Twitter last night, Paul Snively and I had one of our recurring type systems vs interactivity discussions.

Paul made his point clear:
My thesis is that programmers' ability to be logically inconsistent in a formal sense at any stage of development is overrated.
To which I replied, equally grandly:
Without the ability to be inconsistent, you can't support the Lisp Experience™
My point is this: current type systems treat type checking as a batch job. But:
Intermittently, during editing, programs are nonsense. Then we bring them back into consistency. Type systems should support this
I'd really love to treat type system outputs not as one-off errors that prevent running the program, but rather as a dynamic console of warnings that updates as I change the - running - program. Every warning corresponds to a point in the program where I might get a runtime error if I ran the program at this moment.

Update: James Iry blogged about the same conversation in Type Errors as Warnings.

Update 2: Paul Snively added another nice quote:
I observe that I'm given to extremes: give me Kernel or give me Coq!
Update 3: Great minds think alike. Brian McKenna reports:
Simon Peyton-Jones talked about this a couple of times with me. It's something he's going to try with GHC
Update 4: Around 27:00 in this talk, SPJ details his idea for type warnings.

Wednesday, January 4, 2012

Restartable exceptions

TL;DR: Restartable exceptions are a must for new languages & they're dead simple to implement.
A dormant LtU thread on Go's proposed panic/recover exception-like mechanism is waking up again, and I'd like to take the opportunity to evangelize restartable exceptions once again (see also my previous post What's a condition system and why do you want one?).

Especially because this is one of the few topics where David Barbour and I are of the same opinion. :) In the thread, David presents an interesting list of Error Handling Patterns, which includes the following points (my emphasis in bold):

Exceptions

Include a non-local exit so that error-handling may be isolated upon the stack. Unfortunately, in many languages this "unwinds" the "stack", and thus a lot of important, unfinished work is simply lost, and the error-handling policy cannot effectively say anything useful about error recovery. [...]

Resumable Exceptions

The unfinished work is maintained during error handling, and there are mechanisms available to return to that work - i.e. by return value, or via 'resumption points' annotated in the code (which look the same as exception handlers). This allows a great deal of flexibility in what those error-handling policies can express, similar to the pass-a-handler approach. The greater structure, however, can lead to better performance and static analysis.

Usefully, the difference between resumable exceptions and regular exceptions only requires a very minor tweak in implementation, even in languages such as C++ and Java: handle the exception as a new activation at the top of the stack, rather than unwinding first. (These activation records would need to include a pointer to the original frame in addition to the stack pointer.) This is exactly what Dylan does.
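That tweak is small enough to sketch directly. The following JavaScript models conditions as explicit signals instead of `throw` (API names invented), so the handler runs on top of the signaling frame and can resume via a restart:

```javascript
// Handlers are pushed on a stack; signal() calls the innermost handler
// *without* unwinding, passing it the restarts available at the error site.
const handlers = [];

function withHandler(handler, body) {
  handlers.push(handler);
  try { return body(); } finally { handlers.pop(); }
}

function signal(condition, restarts) {
  const handler = handlers[handlers.length - 1];
  return handler(condition, restarts);   // unfinished work is still live
}

// A parser that offers a "use-value" restart on bad input.
function parseNum(s) {
  const n = Number(s);
  if (Number.isNaN(n))
    return signal({ type: "parse-error", input: s },
                  { useValue: v => v });   // restart: continue with v
  return n;
}

// Policy lives far from the error site: substitute 0 and carry on.
const total = withHandler(
  (cond, restarts) => restarts.useValue(0),
  () => parseNum("oops") + parseNum("41"));
// total === 41
```

Note how the sum still completes: the work in progress was never thrown away, which is the whole point of resumable exceptions.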

For a practical example of restartable exceptions, check out Chris Double's A tale of restarts ('This is a "I'm glad I used Dylan" story...') and Rainer Joswig's post on LtU ("I would not want to live without that feature.").

Now, I still don't fully grok Kernel's continuation guarding ("Exception-handling 'done right'") and how it relates to restartable exceptions. This is definitely another topic John should put on his growing list of blog posts to write. :P

Tuesday, January 3, 2012

Ringing in the new programming year

Straight outta Rob Pike's Command Center: Esmerelda's Imagination:
I resolve to recognize that a complaint reveals more about the complainer than the complained-about. Authority is won not by rants but by experience and insight, which require practice and imagination. And maybe some programming.
Gah. :) And Matt Might's 12 resolutions for programmers.

And don't worry, today's the most depressing day of the year, say experts.

Monday, January 2, 2012

(winningp) => t

Thank you Paul Graham for the Lisp renaissance.

Say what you like about the tenets of Lisp, Dude, at least it's an ethos.

Some signs that Lisp has now gained significant mindshare again:
  • Designers explicitly state that they won't have macros or don't have them yet: Deca, Lua
  • Languages actually do add macros: Scala, Elixir
Now the big problem is that, as they say, in CS not only do we not learn from our mistakes, we also don't learn from our successes.

Let me tell you: people will spend the next 20+ years reinventing hygienic macros. (Or they could just use fexprs, and get hygiene for free.)

Sunday, January 1, 2012

Isn't goto evil?

Goto-considered-harmful-weenies (don't trust 'em) should ponder the following remark from the Novelties of Lua 5.2:
continuations are much worse

yet nobody complains; it is “cool” to support continuations
Ha!