Tuesday, June 29, 2010

Weekly notes

Fascinosum & Tremendum

OMG, first IDEs took programming, now they're trying to take programming languages. Really, ditch your IDE. Stomp on it. Burn it.

I'm afraid of what will happen when PHBs start to demand DSLs from their Maven-toting programming morons. And it's happening right now! However, since this can only further the Lisp cause in the longer run, I'm quite happy.

GADTs

GADTs are teh hotness. As you know, all Haskellers' favorite pastime is to implement an evaluator for the addition of numbers. There are literally a thousand papers on this topic. With GADTs, they can now exploit the host language type system to make sure that their petty additions are correctly typed. Of course, geeks of other languages are trembling in the face of a number addition gap, so they want to typecheck GADTs in their languages, too. And maybe even throw in a Visitor, or two.

One way of bringing GADT-like power to lesser languages is via type constructor polymorphism, as discussed on The Axis of Eval previously. However, the hordes from the Evil Northwest are proposing a different solution. Instead of higher-kinded type bounds, they are proposing a rather humble where clause. Be assured that your correspondent will report back with further information.

The mother lode of pointer tagging schemes

Nuff said.

Relatedly, from Stress-testing Control Structures for Dynamic Dispatch in Java ('02):
[O]ur results show that strength reduction of control structures is likely to be beneficial regardless of the hardware and JVM, when the number of possible receiver types can be determined to be small. For numbers of possible types up to 4, if sequences are most efficient. Between 4 and 10, binary tree dispatch is generally preferable. For more types, the best implementation is a classical table-based implementation such as currently provided by most JVMs for virtual calls. These are safe, conservative bets, that generally provide a significant improvement and, when not optimal, result only in a small decrease in performance.
And of course, Agner's The microarchitecture of Intel, AMD and VIA CPUs: An optimization guide for assembly programmers and compiler makers.
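
To make the quote's "strength reduction of control structures" concrete, here is a sketch in Scheme rather than Java (the shape types and names are made up; a JVM would test receiver classes, not symbols). The table-based version is morally a vtable; the test sequence is the strength-reduced form that wins for small numbers of types:

#lang racket
(require racket/math)  ; for pi and sqr

;; Table-based dispatch: one indirect lookup through a method table,
;; which is what a vtable amounts to.
(define area-table
  (hash 'circle (lambda (s) (* pi (sqr (second s))))
        'rect   (lambda (s) (* (second s) (third s)))))
(define (area/table s) ((hash-ref area-table (first s)) s))

;; Strength-reduced dispatch: a test sequence, preferable when the set
;; of possible receiver types is known to be small (up to 4 or so).
(define (area/if s)
  (case (first s)
    [(circle) (* pi (sqr (second s)))]
    [(rect)   (* (second s) (third s))]
    [else     (error "unknown shape:" s)]))

(area/if '(circle 1.0))  ; => 3.141592653589793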

Hack we must!

Saturday, June 26, 2010

Programmer Proverb

May you program in an interesting language.

Compile-Time Resolution

David Barbour has explored some highly interesting issues over on the C2 wiki, under the title of CompileTimeResolution. (The page is several years old, though.)

I just wrote him a mail with some of my thoughts, and maybe it's useful to some of you, so I'm replicating it here.



Hi David,

I just saw your C2 page on Compile-Time Resolution.

Highly interesting stuff, and the example of embedding JPEG data right into the code clarified a lot for me.

Even more so, your statement that compile-time code should be idempotent and side-effect-free rang a big bell with me.

In Scheme, there is current work regarding exactly this topic in the context of separate compilation and modularity, under the moniker of "phase separation".

Classic Lisps didn't know or care much about this, and EVAL-WHEN, their tool for handling it, is seriously broken [1].

Schemers have started a rather rigorous investigation of this issue, probably starting with Matthew Flatt's work on modularity [2] (although Queinnec's earlier work on reflective towers [3] tackles similar issues).

In work based on Flatt's, such as R6RS, you're required to explicitly demarcate the phase(s) in which an identifier is needed -- Schemes don't just have a single compile-time, they may have a whole nested tower of compile-times in the case of lexically nested macros (LET-SYNTAX).

One of the latest contributions to the issue is the Implicit Phasing framework [4]. It shows that the programmer can be relieved of the burden of manually putting code into a particular phase, letting the system infer the compile-times (phases) in which a given identifier is really used. Since that code should be idempotent and side-effect-free anyway, evaluating it multiple times, or not at all, isn't a problem.

Implicit phasing, however, is (rightfully, IMO) critiqued for applying this principle even to runtime code. In a Scheme with implicit phasing, REQUIRE'ing a package for runtime will only load that package if, in fact, one of its exported identifiers is accessed, thereby violating a traditional Lisp assumption, namely that simply requiring a module may have (important) side-effects.

However, for compile-time code, implicit phasing seems to be an advanced solution to the issue of compile-time resolution, so you might want to look at it.

In fact, in the new Lisp I am working on, a "strict" model is used for runtime dependencies (i.e. when you require a package for runtime, it will be loaded no matter what), and a "lazy" model is used for compile-time dependencies (i.e. when you require a module "for compile-time", or "for-syntax" as Schemers call it, it will get loaded automatically in whatever compile-time it is actually used, or not at all if it isn't used. After all, compile-time code should be idempotent and side-effect-free.)

Best regards,
Manuel

P.S. Even old Common Lisp is now getting pushed towards such a saner model by Faré Rideau's XCVB build system [5].
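
(A concrete illustration of the phase demarcation discussed in the letter, for blog readers. In Racket, which descends from Flatt's model, helpers used by a macro must be defined or imported for-syntax, i.e. at compile-time, phase 1. The names below are made up:)

#lang racket
(require (for-syntax racket/base racket/syntax))

(begin-for-syntax
  ;; A compile-time (phase 1) helper. Note that it is idempotent and
  ;; side-effect-free, as compile-time code should be.
  (define (getter-id ctx name)
    (format-id ctx "get-~a" name)))

(define-syntax (define-getter stx)
  (syntax-case stx ()
    [(_ name)
     (with-syntax ([getter (getter-id stx (syntax-e #'name))])
       #'(define (getter obj) (hash-ref obj 'name)))]))

(define-getter x)      ; defines GET-X
(get-x (hash 'x 42))   ; => 42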

Thursday, June 24, 2010

What Makes Lisp Great

Faré: What Makes Lisp Great:
[U]nlike most other languages, Lisp is not an arbitrary design based on the whims of some author, but it is a discovery produced by a long evolution and a deep tradition. It has seen several revolutions: transition from M-expression to S-expression, from EVALQUOTE to EVAL or from dynamic binding to static binding, introduction of data-structures, of object-oriented programming and of a meta-object protocol, the invention and latter standardization of many many individual features. Many dialects of Lisp also introduce their own revolutions, such as continuations and single name space in Scheme, the graphical user interface and dreaded DWIM features in InterLISP, etc. If anything remains among all these revolutions and variants, it is the ability to evolve. Lisp is a language fit for evolution -- not just by a small group of wizards who control the language implementation, but by every single programmer who uses the language. And this wasn't a conscious intended part of the original design: this was the great discovery of Lisp. (My emphasis.)
Now I understand better why I'm wary of Clojure: Clojure is big on revolution and small on evolution. And in a revolution, people get killed. Or, in Clojure's case, language features that have evolved over decades get killed. Maybe devolution is the correct term.

Wednesday, June 23, 2010

Quo Vadis, Cons?

Everybody seems to agree that we have to get rid of the cons, if for different reasons.

Guy Steele proposes an incremental extension, the conc, with an eye towards losing the accumulator idiom, and emphasizing divide-and-conquer. Andrei Alexandrescu proposes the radical range, for enhanced generic algorithm programming and general awesomeness. Dylan, Goo, and PLOT have stepper-based iteration protocols that were designed so that the compiler can erase them in some cases. Oleg proposes a left-fold operator with premature termination as the best collection API, and turns it inside-out if a stream-based interface is needed. Clojure uses the cons's first/rest interface as the generic interface for all sequences, which is kinda unbelievable.
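
For a taste of Oleg's proposal, a minimal sketch (my naming, specialized to lists): a left fold whose consumer returns a stop flag along with the new seed, giving premature termination without continuations:

#lang racket
;; FOLD-ENUM: a left fold with premature termination. F returns two
;; values: a stop flag and the new seed.
(define (fold-enum f seed lst)
  (if (null? lst)
      seed
      (let-values ([(stop? seed*) (f (car lst) seed)])
        (if stop? seed* (fold-enum f seed* (cdr lst))))))

;; Find the first even element; the rest of the list is never visited.
(fold-enum (lambda (x acc)
             (if (even? x) (values #t x) (values #f acc)))
           #f
           '(1 3 4 5 6))  ; => 4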

The jury is still out, but I think ranges are the most interesting solution for stateful languages, whereas conc lists seem to have their charms in a poorly applicable, I mean, purely applicative setting. Oleg's API probably only works for people with a brain the size of Oleg's.

Sunday, June 20, 2010

Refining Syntactic Sugar

Another great dissertation just came out, Ryan Culpepper's Refining Syntactic Sugar.
Over the past two decades, Scheme macros have evolved into a powerful API for the compiler front-end. Like Lisp macros, their predecessors, Scheme macros expand source programs into a small core language; unlike Lisp systems, Scheme macro expanders preserve lexical scoping, and advanced Scheme macro systems handle other important properties such as source location. Using such macros, Scheme programmers now routinely develop the ultimate abstraction: embedded domain-specific programming languages.

Unfortunately, a typical Scheme programming environment provides little support for macro development. The tools for understanding macro expansion are poor, which makes it difficult for experienced programmers to debug their macros and for novices to study the behavior of macros. At the same time, the language for specifying macros is limited in expressive power, and it fails to validate syntactic correctness of macro uses.

This dissertation presents tools for macro development that specifically address these two needs. The first is a stepping debugger specialized to the pragmatics of hygienic macros. The second is a system for writing macros and specifying syntax that automatically validates macro uses and reports syntax errors.
It describes two contributions: a stepping debugger that's tightly integrated with macro expansion, as well as a new macro-writing facility called syntax-parse.

syntax-parse seems like it will become a standard facility for implementing macros. It lets you describe the shapes of the s-expression inputs of macros in a parser-like language. You can define new "syntax classes" that capture both the shape of a form and additional, procedural checks.

One example of a syntax class would be the variables bound by a LET: they have to be identifier/expression pairs, and the identifiers must be unique. syntax-parse lets you describe these constraints succinctly, and facilitates useful error reporting (i.e. you don't get an error in terms of the expanded output of a macro, which would require understanding of the macro's implementation). This obviates the need for explicitly programmed checks.
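
Here is roughly what that looks like (MY-LET is my toy example, but syntax-parse and check-duplicate-identifier are the real facilities):

#lang racket
(require (for-syntax racket/base syntax/parse))

(define-syntax (my-let stx)
  (syntax-parse stx
    [(_ ([x:id e:expr] ...) body:expr ...+)
     ;; a procedural check layered on top of the shape check
     #:fail-when (check-duplicate-identifier (syntax->list #'(x ...)))
                 "duplicate variable name"
     #'((lambda (x ...) body ...) e ...)]))

(my-let ([a 1] [b 2]) (+ a b))  ; => 3
;; (my-let ([a 1] [a 2]) a)     ; error: my-let: duplicate variable name
;; (my-let ([1 2]) 3)           ; error: my-let: expected identifier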

syntax-parse seems like a big step forward for macro writing. The macro debugger is also impressive. Congratulations, Ryan!

Saturday, June 19, 2010

A Theory of Typed Hygienic Macros

Dave Herman's much-awaited dissertation is out!
Hygienic macro expansion is one of the crown jewels of Scheme, but to this day nobody understands just exactly what it is.
Put that in your pipe!
Unhygienic macro expansion identifies programs with their representation as trees. But because of variable scope, programs in fact have additional graph structure, with edges between variable bindings and their references. Since these edges are not explicitly represented in S-expressions, maintaining their integrity during macro expansion becomes the responsibility of programmers. Put differently, the tree representation of a program has unenforced representation invariants that are the responsibility of programmers to maintain.
A good explanation of why "code is more than data". (Though Pascal Costanza seems to have found a way to introduce hygiene in an unhygienic macro system.)
Though the motivations are clear enough, hygienic macro expansion has so far resisted a precise, formal specification. At the heart of the problem is identifying what is meant by “scope” in a language with extensible syntax. ...
The key insight of this dissertation is that by annotating all macro definitions with interfaces describing their grammar and binding structure, we can reason formally about the binding structure of Scheme programs, and without first macro-expanding. More technically, explicit annotations provide us with enough information to obtain a formal definition of α-equivalence of pre-expansion Scheme programs.
These binding specifications make use of tree addresses, a syntax for describing subtrees of a cons tree, analogous to Common Lisp's CADR, CADDR, and CADAR functions. The corresponding tree addresses would be AD, ADD, and ADA.
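
A sketch of how such addresses can be interpreted (my encoding of addresses as symbol lists; like the letters in CADR, they apply right-to-left):

#lang racket
;; TREE-REF: walk a cons tree along a tree address; A takes the car,
;; D takes the cdr.
(define (tree-ref addr tree)
  (for/fold ([t tree]) ([step (in-list (reverse addr))])
    (case step [(a) (car t)] [(d) (cdr t)])))

(tree-ref '(a d)   '(lambda (x) x))  ; address AD,  like CADR  => (x)
(tree-ref '(a d d) '(lambda (x) x))  ; address ADD, like CADDR => x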

Furthermore, the specifications make use of attribute grammars. These allow synthesized attributes (passed from children upwards to their parents) and inherited attributes (passed from parents downward to their children). For example, the specification of LAMBDA would be:
(lambda xs:formals e:expr)
↪ e.imports = xs.exports :: ε
This means that the imports attribute (the bound variables) in E corresponds to the exports attribute (the introduced formals) of XS. (There are additional rules, which I'm not showing here.)

I've only read the first two chapters so far, but this work seems like a clear winner. Very readable and insightful. Congratulations, Dave!

More later.

Friday, June 18, 2010

Letter to a Young PL Enthusiast

(I wish someone had written this letter to me years ago.—Ed.)

Dear young programming language enthusiast!

You are fascinated by programming languages. You can not not program. But all the existing languages give you woes.

The mainstream languages like Java are horribly verbose and inflexible. C is much too low-level. C++ is much too big. C# hails from the evil Northwest.

You've dabbled in O'Caml and Haskell, but their type systems, which no one – not even Oleg – fully understands yet, restrict you.

Smalltalk and Self are cool, but forever hampered by their closed-worldness. JavaScript is teh suck: like assembler with hashtables. Broken hashtables.

Scheme is cute, but its users are weird – before they start programming, they build a record system out of conses. Every goddamn time.

Common Lisp is cool, but its baggage is legendary. Whatever implementation you choose, be assured that it has been hacked on since you were a small child and has more comments saying "kludge" or "fixme" than the Plan 9 kernel has LOC.

Factor sure is nice, and Slava is rightfully annoying the hell out of other dynlang implementors, but its Forth legacy means you have to write your expressions in the wrong order.

Python and Ruby, what shall I say, they have the right idea, but their messed-up-ness is legendary. Perl is a sick joke. You don't want to stay with those.

Clojure and Scala have nice ideas, but they're unproven, and they run on the JVM, a messy enterprisey bore, that contains at least seven different implementations of hashtables.


Now that we've reviewed some of the popular languages out there, let me tell you: you must create a programming language, or be enslav'd by another man's.

But don't just go forthe and put whatever features you can think up into it. History shows again and again that the fools who do this cause more harm than good. Think: a programming language is much more powerful than a mere application in fucking people's brains up, and the programming world is already full of braindead zombies. You don't want that responsibility.

Go study all languages you can find. Try to understand where they come from, and where their good ideas end. (Don't try to understand Beta, nobody does.)

Ignore the siren calls of the virtual machines. You have to be the master of your domain. Use C or assembler to implemente your language. The free Unices are your friends. Learne about the internals of your operating system, how it compiles, links, loads, and runs executables, the fruits of your labor. Learn about calling conventions, system calls, and the god-given hierarchy of the memory.

Write a compiler, not an interpreter, for in an interpreter, you will always find an easy way to cheat yourself out of the labyrinth, and you will never have to face the cold, hard walls of reality and triumph over the Minotaur.

Don't be a slave to syntax. Syntax is the Maya of programming, forever blinding many of the weaker souls to the eternal light of symbolic expressions.

Understand the closure and the lexical address before you embark on your journey, lest you putte more bitterness into this world. Employ the power of the generic function – its ad-hoc polymorphism will forever bring you joy. Harness the flexibility of optional, keyword, and rest parameters.

Learne that all control flow is one, but weigh carefully whether to heed the seductive call of the current continuation. It will make your stack messy like spaghetti and thorny like a cactus. Learne to love the one-shot continuation, the unwind protection, and the tagbody, all willing and able to help you jump around as you please, safely and nimbly.

Don't obsess over raw speed like the hare. In this day and age, many a program spends most of its time in system calls, and you can always escape to a lower-level language if needed. But don't ignore performance either. If your runtime uses lots of associative data structures, or unneededly thrashes the precious caches, it will surely be slow as the tortoise, and you'll be the only one to blame.

Build a dynamic language, that flows like water around the rigid rocks of staticity, but don't ignore the limitless wisdom that comes from studying type systems. As St. Ehud said, types are predicates that hold true for the whole runne of your programme, and ye shall seek them everywhere.

Picke thy battles. Try to make your language the best one in a narrow valley, and then follow it to whichever wide open foreign shores it takes you. (But stay clear of the cape of cursor addressing.)

Learne how to documente and speak about your language and make it crystal clear in the minds of your fellow men – its syntax, its semantics, and its pragmatics. Take a bow to the great language manuals and texts written before your time, and seek to do even better than they did.

Scale the reflective tower of macroexpansion, even if it seems unsurmountable at first, and employ it in your language, to the neverending delight of your users. Make sure that thy macros are hygienic and referentially transparent, lest thy users dirty their hands and languish in the darkness of referential opacity. Learne about the art of the metaobject protocol, for you'll be a better programmer ever after.

Learne by heart the holy CLHS and the holy RnRS, and recite their sutras daily. They have answers to most of the questions you don't even know yet. Use SLIME daily to teste your knowledge. Follow the confident footsteps of Common Lisp and Scheme, most merciful, most compassionate, whenever the wind of uncertainty blows snow and ice in your path.

Finally, rejoice in the wisdom of the prophet Alanius Perlisius. His epigrams will lighten the load you have chosen to take on your shoulders. His words will forever be with you:
``What you know about computing other people will learn. Don't feel as if the key to successful computing is only in your hands. What's in your hands, I think and hope, is intelligence: the ability to see the machine as more than when you were first led up to it, that you can make it more.''

Dear friend!
I hope this shorte letter will help you on your longe path. May the language you will undoubtedly build delight the world with its absence of nonsense, the gravitas of its design, its joyful voice, and the pleasure and clarity it brings to the fabulous and mysterious act of programming.

Yourse,
X

P.S. Don't forget to put restartable conditions into your language. They are the best thing since sliced bread (and the anti-gravity space pen).

It's a VM world, or is it?

Over on Charles Nutter's blog, after Charles has listed a number of shortcomings of the JVM for dynlang implementation, a commenter writes "Either way, it's a VM world".

I used to think that too. Today I see it differently. Yes it's a VM world, but the OS is the VM.

OSes have failed to provide isolation. Thus, today we virtualize them.

In the future, many apps will be distributed as a preconfigured OS image, and some even as preconfigured hardware appliances. In this world, saying that it's a VM world and meaning the JVM doesn't make sense. Who cares what runs inside your virtualized OS?

And let's face it: many apps are written in multiple languages. Insisting that they must all run in the Gosling Tarpit is silly. You lose flexibility. Furthermore, as OSes get more interesting features (think Linux' splice(2)), being separated from the OS by another layer means that you'll increasingly have to punch through that layer.

Yes, it's a VM world, but the VM doesn't have to be a language VM.

Thursday, June 17, 2010

Some more notes on Newspeak

While Bracha does nothing but diss type systems in his talk on Newspeak, he's actually working on optional type systems (see Gilad is Right on LtU).

He says that type systems should be purely optional (can I getta yes please), and mustn't interfere with the runtime semantics of the language (he gives Java's evil overloading as an example). But he doesn't seem to know Cecil or Diesel, both of which already do this (and with interesting type systems). Anyway, I think that optional type systems are clearly the future (of Lisp-like languages).

A second, mildly interesting thing is mirror-based reflection. Instead of being able to call getClass() on any object, you need a special capability that can then be used to reflect upon the metaobjects. Which sounds agreeable, if somewhat software-engineerey (OMG these evil, unwashed other programmers will reflect on my private objects).

Finally, his demo again shows the failure of the antiquated image-based paradigm, or so it seems to me. Yes, he can produce a 12KB object file, but to actually run it, it needs the 7MB development environment. I can only chant, destroy the one true Lisp world, long live the multiverse!

Wednesday, June 16, 2010

Newspeak: A Principled Dynamic Language?

I just watched Gilad Bracha's talk Newspeak: A Principled Dynamic Language, and I find myself disagreeing with most of what he says. For the kind of programming I'm interested in (systems & networks scripting), Newspeak's principles seem to be all the wrong ones.

It starts with the declared goals of the language:
  • Modularity
  • Security
  • Reflectivity
  • Interoperability
Regarding modularity, I think that the new unit of modularity is a (REST) service. In this case, modularity isn't even a language issue anymore; you solve it out of band. Or, if you don't want to go that far, the unit is some big library like ImageMagick. And that kind of modularity really is a solved problem. (Of course, there are interesting research issues around modularity, say in-language dependency injection (Noop) or what Newspeak does. I just don't think it's practical right now, and maybe not even needed.)

Regarding security, I simply don't get the point of in-language, capability-oriented security. Until we can secure an OS, which we can't at the moment, the only workable solution is sandboxing the whole OS. Whether the language is safe or not doesn't matter if you can't trust the OS, or if you want to work with unsafe languages. Again, solve it out of band.

Reflectivity seems to have become another word for "things that shouldn't be done in the first place", like, say, ActiveRecord's method names like find_by_name_and_email() or whatever. (This is simply a sign that you're lacking macros, which would allow you to do the right thing and define a frickin' embedded DSL; see the sketch below.) Of course, you should be able to introspect every object, class, etc., but I'm taking that for granted.
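
For illustration, a sketch in Scheme rather than Ruby (all names made up): instead of reflectively parsing method names at runtime, a macro defines the finders you ask for, up front:

#lang racket
;; DEFINE-FINDER generates a finder over a list of hash-table "rows"
;; for the given fields -- no method_missing tricks needed.
(define-syntax-rule (define-finder name (field ...))
  (define (name db field ...)
    (filter (lambda (row)
              (and (equal? (hash-ref row 'field) field) ...))
            db)))

(define-finder find-by-name-and-email (name email))

(define db (list (hash 'name "ann" 'email "a@x")
                 (hash 'name "bob" 'email "b@x")))
(find-by-name-and-email db "ann" "a@x")  ; => just ann's row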

Regarding interoperability, there's a choice quote by Bracha: "the real world, the nasty stuff that's out there". Now, that's the attitude that killed the Lisp machines, killed Smalltalk, and will continue to kill whatever project takes that stance. If your reaction to the real world is "egad, nasty stuff", do you think you'll ever "interoperate" with it well?

After the introduction, Bracha has to explain Newspeak's precedence and parsing rules for minutes, because otherwise, as he says, nobody would be able to understand the following code. Syntax other than S-expressions is simply irrational.

Surreal Virtuality

Then there's this business about everything being a virtual method call. Now, if there's one language that combines virtual method calls and functions correctly, it's CLOS (and thus, Factor). And CLOS shows exactly that virtual lookup is not fundamental. Basically, DEFMETHOD works like this:
(defmacro defmethod (name params &rest body)
  ;; Sketch only: PUT-METHOD, CALL-METHOD, and GET-THE-CLASS-OF-PARAM
  ;; stand in for the method-table machinery.
  `(progn
     ;; register the method body under the class it specializes on
     (put-method ',(get-the-class-of-param (first params))
                 ',name
                 (lambda ,params ,@body))
     ;; the generic function: an ordinary function that performs the
     ;; virtual lookup on its first argument
     (defun ,name (&rest args)
       (call-method (first args) ',name args))))

So, the virtual lookup happens inside an ordinary function, the generic function. Function calling is the primitive onto which the virtual lookup mechanism is layered, not the other way around. That's the sane, rational way to combine the functional and generic object-oriented paradigms.

(OK, I like to do as little "virtuality"/polymorphism as possible, although I do find it indispensable. The thought of having virtual lookups all over the place, when all I want is to read a variable or call a function kinda freaks me out. So probably I'm not in the right target audience to begin with.)

Taken all of this together, I find Newspeak a fairly ill-principled language for my needs.

Get the Parallelism out of My Cloud

Via Hack the Planet, Get the Parallelism out of My Cloud.


The Intel Atom is described as an inflection point in the paper, since it and similar CPUs are powerful enough for devices accessing the cloud.

From the paper:
It no longer makes sense to think of hardware trends first when trying to understand where software is headed; at best, the opposite is needed. The bottom line: Software trends are now largely independent of hardware trends. If they do relate, it is in the opposite direction: the demands of software will drive hardware, and not vice-versa. ...

Multicore will (likely) dominate in the server cloud but not on the devices used to access the cloud. Thus, how we use multicore chips in server clouds is the main issue we face. ...

Concurrency of unrelated tasks will naturally utilize the many cores of cloud servers; there will be no great need to parallelize all software in order for the cloud to be successful.
Thanks.

The Republic of Perl


A while ago I read everything I could find about Perl 6. After all, they have a lot of experience in large-scale software development.

But, oh the humanity, is this language weird. Just as one example, there's an operator, <=== IIRC, that lets you build something like pipelines of function calls.

As in the previous versions of Perl, stuff like this leads to a language where no one, not even the "designer", can tell exactly what a given expression will actually do without evaluating it. And that's arguably even worse than a bolted-on object system.

To quote from Apocalypse Now,
WILLARD
"They told me that you had gone totally insane and
that your methods were unsound."

KURTZ
"Are my methods unsound?"

WILLARD
"I don't see any method at all, sir."
But maybe there's a method behind the madness... If you load your language with all sorts of weird, crazy features that other languages don't have, you're increasing the potential that the language will develop a cult of equally crazy followers.

People will be so brain-damaged by your "innovations", that they will come to depend on them, learn their ways around their uncountable sharp edges, and miss them in other languages. Your language has become a world unto itself, a republic.

Furthermore, I can see how such features create a sense of community. After all, once you've invested significant brainpower in memorizing irrelevant crap, you're exhausted and like to hang around with other people who've shot their feet off, and chat and joke, and proudly call your language a Swiss army chainsaw.

Take Scheme as an anti-example. Scheme was, until R5RS, designed by unanimous vote. Thus Scheme has only generally-approved features, but lacks character. Maybe that's why the "Scheme community" is often put into scare quotes.

So, if you'd like a cult following for your language, maybe you should go Perl's way of adding whatever crazy idea lurks in the dark recesses of your language-designer's reptile brain.

The horror, the horror.

Monday, June 14, 2010

Frank A. is back?!

He is definitely hanging around.

See Frank is back?

On this occasion, I'd like to link to one of the best pieces ever posted to LtU: Dynamism, Legacy of the Visigoths.

Saturday, June 12, 2010

REST & the next Emacs

Quick thought: the next Emacs should probably spawn a new process for each buffer (like Chrome). (This also means that Emacs' buffer-local variables are trivially implementable as dynamic variables.)

Buffers should only be accessed through a RESTish protocol, with a couple of verbs.

And of course, we gotta drop plain text and the file system, and switch to a 2D-visualizable hypermedia/hypercode format, stored in a hyperdatabase.

But, and that's important, the rest has to stay the same. The good stuff.

Thursday, June 10, 2010

The Unbearable Simplicity of C

Ponder that one of the most important, if not the most important, programming languages today, "C", doesn't have a module system.

Consider this in light of Linus' statement that "C is a largely context-free language. When you see a C expression, you know what it does. A function call does one thing, and one thing only".

And Dave Moon, in the Arc Suggestions, says on module systems: "A C-like convention with prefixes on symbols, all my symbols are named moon-xxx and so forth, would almost be workable" (my emphasis).

So if your goal is a worse-is-better language and deep integration with the most important platforms today, POSIX and C (which is my goal), maybe your language doesn't need a module system either.

The Quotable Linus Torvalds

Linus is outspoken, profound, and profoundly funny.

On message passing as an alternative to shared state (in the context of ISAs):
How do you think CPU's do cache coherency? Do you think there is a little man inside the machine that is the "cache coherency fairy"? Or do you think that it's the CPU's sending messages back and forth and keeping track of ownership? ... The message-passing proponents should be ridiculed for the fools that they are. At least until they can back up their ridiculous religion with something real. Which I personally wouldn't hold my breadth waiting for.
To keep in mind when designing new software:
Why do you glorify doing something new and stupid, when doing good things well is what people really should be admiring?
On licensing server software:
Projects that are specifically designed for software-as-a-service in the backend, and only the output of the project is what gets distributed, then use the Affero GPL.
On C++:

And I really do dislike C++. It's a really bad language, in my opinion. It tries to solve all the wrong problems, and does not tackle the right ones. The things C++ "solves" are trivial things, almost purely syntactic extensions to C rather than fixing some true deep problem.

On C's simplicity:
when you communicate in fragments (think "patches"), it's always better to see "sctp_connect()" than to see just "connect()" where some unseen context is what makes the compiler know that it is in the sctp module.

And you have to communicate in fragments in order to communicate efficiently. And I don't mean "efficiently" as in network bandwidth - I mean as in "in general". The reason we use patches instead of sending the whole project (or even just a couple of whole files) around is not because it's denser in email - it's because the only thing that matters is the change, not the end result.

Sunday, June 6, 2010

Term rewriting ftw!

I spent yesterday BBQing in the sun with a friend who's deep into term rewriting, and between too many sesame teriyaki skewers, beers, coffees, and spliffs of highest grade, he explained to me the basics of term rewriting systems (TRS's), and I must say I'm hooked.

IIUC there's a lot of variation between TRS's, and no standard syntax and/or semantics. But basically, many TRS's seem to be equivalent to first-order code, like basic Haskell without higher-order functions. TRS's use different strategies to rewrite terms (basically s-expressions). One strategy would be to always reduce the left-most terms first, until no more rules match.
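
To get a feel for it, here's a toy rewriter (my rules and strategy): Peano addition as rewrite rules, trying the root of the term first and descending into subterms when no rule applies there:

#lang racket
;; Two rewrite rules for Peano addition:
;;   (add zero n)     -> n
;;   (add (succ m) n) -> (succ (add m n))
(define (step t)
  (match t
    [`(add zero ,n)      n]
    [`(add (succ ,m) ,n) `(succ (add ,m ,n))]
    [(list ts ...)       (map step ts)]  ; no rule at the root: descend
    [_ t]))

(define (normalize t)  ; rewrite until a fixed point: the normal form
  (let ([t* (step t)])
    (if (equal? t* t) t (normalize t*))))

(normalize '(add (succ zero) (succ zero)))  ; => (succ (succ zero)), 1+1=2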

Today I found Oleg's applicative-order term rewriting system for code generation. It gives a good taste of what can be achieved by term rewriting in the context of compilation, and also intuition about how to implement a TRS.

Geeks crazed over news that LLVM's Clang will be a compiler

After Clang became able to compile itself, as well as the Boost libraries used by many C++ programmers, the programmer community was thrilled today by news that Clang will soon be a compiler that can compile standard software like the Linux kernel and GNU Emacs (an FSF black op).

"I'm so thrilled. I mean look at that: they are creating a compiler. You know, it takes source code and produces native target objects. And soon, I will be able to compile real programs with it. You know how cool that is?" one commenter on the venerable Hacker News newsboard wrote.

Producing a compiler is considered no small feat in tech circles – in fact, compilers are among the most central and difficult programs in all of IT. The technology underlying Clang, called LLVM, has garnered critical acclaim as a new breed of unusual and far-reaching software by twits and pundits.

A spokesman from Apple, now the home of the project, said that "LLVM must be one of the hottest pieces of open source software available today. Mac OS X, the best platform for running Java, already uses the exciting LLVM technology for rendering some of the lickable designs and glorious effects in our WWDC-award-winning graphical user interface, which provides us with a whole lot of differentiation from Microsoft Windows."

Apple's Chief Apology Officer John Gruber was quoted as saying "Face it, Gnutards, LLVM is god. I'm letting Clang compile itself right now (which is kinda mind-blowing in itself) in my transparent Aqua terminal. Isn't it gorgeous to watch the Profont lines scroll over the lush background image? The spinning beachball gives me an opportunity to watch my reflection in the 30" glossy display browse the Wired iPad app. As you sure know, that app loves the freedom of Cocoa and InDesign, and shows what can be done without Flash, which is threatening the stability of our iHomeland."

Wednesday, June 2, 2010

What's that ATS thing?

ATS rocks the shootout, and now grandmaster Chris Double shows how to achieve remarkable feats using its (dependent and linear) type system:

By defining ‘curl_easy_cleanup’ in this way we enforce a contract which states that the programmer must call this function if they have a live ‘CURLptr l’ object. You can see this in practice if you remove the cleanup call and try to compile. A compile error will result. A compile error will also occur if you try to use the curl handle after the cleanup call.

You get a compile error if you don't dispose of our objects correctly or use them after they've been disposed of! Now that sounds like progress.

Tuesday, June 1, 2010

Dynlang Wizards Series

It's an oldie, so many of you have probably seen it already: the Dynamic Languages Wizards Series, Spring 2001, hosted by Greg Sullivan and Jonathan Bachrach, of Dylan and Goo fame.

It's a series of three panels held at MIT, discussing dynamic languages in a joyful and challenging atmosphere. All three are great, but I like the first one best, on dynlang runtimes. Richard Kelsey (Scheme 48), David Moon (all around wizard), Tucker Withington (Harlequin), Kim Barrett (embedded guy), and Scott McKay (all around wizard) are a great and funny tag team.

Moon is up to great PL design work with the Programming Language for Old Timers (aka MUD – Moon's Ultimate Distraction) nowadays.

Java's hot new syntax for closures



Quick, what does #()(3).() do? OK, I think one can get used to it.

But "support references to ‘this’" sounds really doomed. A closure simply has nothing to do with a "current object". One day, all languages will have methods outside classes, and then the whole "implicit this" charade, both in methods and lambdas, will be clearly viewed as an embarrassing historical accident with huge costs in wasted brainpower. (It's also one of JavaScript's dumbest parts.)

(Also, that they have to make a difference between "statement lambdas" and "expression lambdas" again shows the havoc wreaked by the completely superfluous and annoying distinction between statements and expressions.)