Monday, September 27, 2010

Program Your Size

When programming, I often find myself striving for minimalism. Trying to make the program as succinct as possible, sweating over details like how to arrange the sections in a file, or the number of files to create.

But minimalism in programming may cause programs to appear cramped, may persuade you to choose forms of expression that are needlessly terse, and may also lead to programs that are hard to refactor or extend.

Point is, when you start programming, you don't know how big the program's gonna be, so starting with minimalism may be a bad choice. Looking at the sources of some huge programs, I noticed that some of them have found very relaxed ways to deal with their sprawling size. Gnus, for example, comprises dozens of large source files, containing hundreds or thousands of definitions, and yet remains very readable.

I don't want to get into the details of architecting large programs - I'd rather focus on a way to write programs, from the start, so that the source has a "density" that's adequate to its size. Gnus, for example, is meandering, and that's perfectly adequate - Gnus being a mail reader for the most customizable OS in existence. That's a job that's never done, so the source should reflect that - you can't write a cute blog post about a program like Gnus; there's no single overarching, minimalist design principle.

Maybe, as Perlis said, every program needs to be written at least twice. After the first run, we know the approximate size of the program, and can then find a code density/terseness that's adequate to that size.

I find that minimalism in source code is often limiting. I'll write my next program, from the start, as if it were really huge.

Linus on transactional memory

I want transactional memory like the next guy, but comments like this one by Linus indicate that we still have a long way to go:

So that's what it boils down to: transactions are "free" and a wonderful way to elide those horrible expensive locks.

But only if you never make a mistake.

They are expensive as hell even for very low rates of transaction failures. And you really cannot know statically (even if you don't end up reaching some transaction limit, you may easily end up just having heavy contention on the data structures in question).

So I claim that anybody who does transactional memory without having a very good dynamic fallback is basically totally incompetent. And so far I haven't seen anything that convinces me that competence even exists in this area.

Monday, September 20, 2010

The ghosts of blog posts past

Ah, blogging. On the one hand there's value to reading someone's opinion unfiltered and raw. On the other hand it creates a lot of friction, and pisses people off.

This is a blog. It is a special medium. I'm not writing articles that I expect to stand the test of time. I don't let my friends preview the posts I write and point out their problems. That's because I think there's a difference between writing articles (as, say, Paul Graham does) and blogging. And both have merits. This is a blog.

It presents my opinions and mood at a specific point in time. If you don't like it, argue with me, but don't take it too seriously. Take my posts like snarky or humorous remarks over a beer. That's how I write them. Because that's what I think is one of the things blogs are good for, and that's how I'm running this particular blog.

-- Manuel

Monday, September 13, 2010

The far side of the lambda cube

Of course, once you investigate higher-order types, you begin to think about unifying the type and term levels.

Reading a bit about this I found some interesting links:

Sage: Unified Hybrid Checking for First-Class Types, General Refinement Types, and Dynamic. A recent paper I found via Ahmed, Findler, Siek, and Wadler's new Blame for All. I find the use of a counter-example database in a compiler a bit frightening, though.

Henk: A Typed Intermediate Language, from 1997, by SPJ and Erik Meijer:
There is growing interest in the use of richly-typed intermediate languages in sophisticated compilers for higher-order, typed source languages. These intermediate languages are typically stratified, involving terms, types, and kinds. As the sophistication of the type system increases, these three levels begin to look more and more similar, so an attractive approach is to use a single syntax, and a single data type in the compiler, to represent all three.

The theory of so-called pure type systems makes precisely such an identification. This paper describes Henk, a new typed intermediate language based closely on a particular pure type system, the lambda cube. On the way we give a tutorial introduction to the lambda cube.
Languages of the Future, by Tim Sheard on Omega, on which I'll post something soon.

And, Your lambda-cube is puny.
A moment of respectful silence, please.

Sunday, September 12, 2010

Higher kinds are sexier

In the process of formalizing generics for my Lisp, I'm also studying FGJω again, the extension of Featherweight Generic Java with higher kinds. Instead of just parameterizing types with first-order types, such as T, you can parameterize types with type-level "functions", such as F<T>. One example of such a function is List itself - you give it a T, and it "returns" List<T>, the concrete type of lists containing that element type.

And it turns out, the theory behind that looks simpler, and thus sexier:
(types)             T ::= (K K*)
(type constructors) K ::= X | C | (P* => T)
(type param defs)   P ::= (X P*) | (<: (X P*) (C K*))
(class defs)        D ::= DEFCLASS (C P*) ((C K*)*)
(method defs)       M ::= DEFMETHOD (m P*) ((param T)* -> T)
I like the definition of types: a type is the application of a constructor to zero or more other constructors. Constructors may be variables (X), classes (C), or functions taking zero or more parameters and returning a type. (So you basically have a lambda calculus at the type-level.) A type parameter is a variable constructor with zero or more parameters, optionally bounded by a class type with zero or more type arguments. Class definitions are parameterized over zero or more type parameters, and extend zero or more existing classes. (I'm taking some liberties with adding multiple inheritance, hoping it doesn't mess up the theory. :P) Method definitions are, as usual, parameterized on the type- and term-level and have a result type.
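
To get a feel for that "lambda calculus at the type level", here's a throwaway Common Lisp sketch (my own illustration, not part of the formal system above): type constructors modeled as ordinary functions that take type descriptors and return type descriptors.

;; A type descriptor is just a constructor name applied to arguments, e.g. (LIST STRING).
(defun make-type (constructor &rest arguments)
  (cons constructor arguments))

(defvar *list-ctor*                        ; kind: * -> *
  (lambda (element-type) (make-type 'list element-type)))

(defvar *table-ctor*                       ; kind: (* -> *) -> * -> * -> *
  (lambda (value-ctor key-type value-type)
    (make-type 'table key-type (funcall value-ctor value-type))))

;; (funcall *list-ctor* 'string)                       => (LIST STRING)
;; (funcall *table-ctor* *list-ctor* 'symbol 'string)  => (TABLE SYMBOL (LIST STRING))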

It's kinda embarrassing to post this stuff here, since I'm so bad at type theory, but I think I can make it work. One insight giving me this faith is that none of this stuff survives till runtime: at runtime, only concrete instantiations, such as List<String>, remain. Or so I hope.

Friday, September 10, 2010

Formalizing generics

In the previous post, Minimal generics for an untyped language, I've discussed some basic requirements for the implementation of parametric polymorphism in the Lisp I'm working on.

I've now made some progress in formalizing this, based on Featherweight Generic Java. Here's a basic overview of the type system in the FGJ style:
(types)       T ::= C | X | Top
(classes)     C ::= (klass T*)
(class defs)  D ::= DEFCLASS (klass (<: X C)*) (C*) ((slot T)*)
(method defs) M ::= DEFMETHOD (method (<: X C)*) ((param T)* -> T)
Types are either class types (C), type parameters (X), or Top. Class types have a name (klass) and zero or more types as parameters. A class definition introduces a named class, parameterized over zero or more type parameters that have class types as bounds, zero or more superclass types, and zero or more named and typed slots. A method definition introduces a named method, parameterized over zero or more type parameters that have class types as bounds, with zero or more named and typed parameters and a result type.

Type parameters in a class definition or method definition (X) are scoped over the entire definition, including the bounds (in the style of F-bounded polymorphism).
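
To make that concrete, here's roughly what definitions look like in this syntax (sorted-list, comparable, collection, greatest, and maximum are names I made up for illustration; note that X appears in its own bound, which is exactly the F-bounded case):

DEFCLASS (sorted-list (<: X (comparable X))) ((collection X)) ((greatest X))
DEFMETHOD (maximum (<: X (comparable X))) ((l (sorted-list X)) -> X)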

All of this is pretty much equivalent to FGJ. The only difference is that classes do not contain methods.

[Updates: I've added type parameterization to method definitions. Added Top. Added some parentheses to make the syntax unambiguous.]

Wednesday, September 8, 2010

Minimal generics for an untyped language

So, I'm developing a Lisp, which means I have to support untyped, dynamically type-checked code in any case. But I still like typeful programming.

Generics, or parametric polymorphism, are jolly useful, even on a totally limited scale: all I want from the first round of this endeavor is to support something like the following:

(defclass (list T))
(defmethod add ((list (list T)) (element T))
  ...)

Here, I define the class (list T) - a list parameterized over the type of its elements, T. That's the same as Java's List<T>. The method ADD takes such a list and an element, and adds it to the list.

To create an instance of a parameterized class, you have to pass a concrete type argument to use for the element type, as in:

(defvar *my-list* (make (list string)))

This creates a list that accepts strings (and instances of subclasses of string) as elements.

The instance has to remember that it's a list of strings for type safety. When ADD is called on a list, the runtime needs to check that the element type matches:

(add *my-list* "foo") ;; OK
(add *my-list* 12) ;; runtime error
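
As a rough sketch of what the runtime has to do - written in today's Common Lisp, not in the new language; TYPED-LIST, MAKE-TYPED-LIST, and this ADD are names I made up for the illustration:

;; The instance remembers its element type; ADD checks each new element against it.
(defclass typed-list ()
  ((element-type :initarg :element-type :reader element-type)
   (elements :initform '() :accessor elements)))

(defun make-typed-list (element-type)
  (make-instance 'typed-list :element-type element-type))

(defmethod add ((list typed-list) element)
  (unless (typep element (element-type list))
    (error "~S is not of type ~S" element (element-type list)))
  (push element (elements list)))

;; (defvar *my-list* (make-typed-list 'string))
;; (add *my-list* "foo")   ; OK
;; (add *my-list* 12)      ; runtime error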

One more thing I'd also like is the following:

(defclass (foo (<: X bar)))

That's a class foo that has one parameter, which must be a subtype of bar, analogous to Java's Foo<X extends Bar>. Being able to give such a bound to a type parameter seems quite essential.

Furthermore, it has to be possible to pass on type parameters to superclasses, as in:

(defclass (super-1 T))
(defclass (super-2 T))
(defclass (klass X Y) ((super-1 X) (super-2 Y)))

(klass X Y) is parameterized over two types, X and Y, and passes them on to its two superclasses, each of which is parameterized over one type.

Of course, it also has to be possible to remove parameterization, as in:

(defclass string-list ((list string)))

This defines an unparameterized class string-list, which is a subclass of (list string).

Another requirement is that it has to be possible to create instances of type arguments, as in:

(defclass (klass T))
(defmethod make-a-new-t ((k (klass T)))
  (make T))

That's just cute, and may have some applications to dependency injection.

I'd also like to be able to write polymorphic functions:

(defun identity ((a T) -> T) a)

The arrow indicates the result type of the function. In this case it takes an instance of any type and returns it.

It should also be possible to give slots (member variables) of classes the types of type parameters:

(defclass (klass X Y) ()
  ((slot-1 X)
   (slot-2 Y)))

This defines a class with two type parameters X and Y, (no superclasses,) and two slots slot-1 and slot-2 with the types X and Y, respectively.

These are the basic requirements. Now I need a plan! ;)

Tuesday, September 7, 2010

Fast dynamic casting

Fast dynamic casting, a cute paper by Michael Gibbs and Bjarne Stroustrup:
We have demonstrated that it is possible for a linker to generate integer type IDs for classes such that it may be verified by a simple integer modulo computation that one class derives from another in an object-oriented language. When combined with a suitable way of adjusting offsets, this method provides for a fast, constant-time dynamic casting algorithm. A 64-bit type ID is capable of representing quite large class hierarchies containing thousands of classes and at least nine levels deep.
I knew that prime numbers could be used for this somehow, but these guys have already worked it out.
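
As I understand it, the trick goes roughly like this (a toy sketch of the prime-number idea only; the paper's actual scheme also deals with offset adjustment and how IDs are assigned in large hierarchies): give every class a distinct prime, let a class's type ID be the product of its own prime and its superclass's ID, and subtype checking becomes an integer modulo test.

;; Toy type IDs: OBJECT=2, ANIMAL=2*3, DOG=2*3*5, CAT=2*3*7.
(defvar *object-id* 2)
(defvar *animal-id* (* *object-id* 3))
(defvar *dog-id*    (* *animal-id* 5))
(defvar *cat-id*    (* *animal-id* 7))

(defun derives-p (sub-id super-id)
  "True if the class with SUB-ID derives from the class with SUPER-ID."
  (zerop (mod sub-id super-id)))

;; (derives-p *dog-id* *animal-id*) => T
;; (derives-p *dog-id* *cat-id*)    => NIL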

Relatedly, I've also been thinking about using perfect hashing to implement multiple dispatch. It should work just fine, and actually reduce the number of type tests, compared to single dispatch. Unforch, I'm still in the make-it-work phase, which is followed by the make-it-correct phase. Only then does the glorious make-it-fast phase start.
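
The naive version of that dispatch I have in mind looks like this (an ordinary EQUAL hash table keyed on both argument classes; the perfect-hashing part would replace the table with a precomputed, collision-free one, and a real implementation would also have to consider superclasses):

;; COLLIDE dispatches on the classes of both arguments (exact class match only).
(defvar *collide-methods* (make-hash-table :test #'equal))

(defun add-collide-method (class-a class-b fn)
  (setf (gethash (list class-a class-b) *collide-methods*) fn))

(defun collide (a b)
  (let ((fn (gethash (list (class-name (class-of a))
                           (class-name (class-of b)))
                     *collide-methods*)))
    (if fn
        (funcall fn a b)
        (error "No applicable COLLIDE method for ~S and ~S" a b))))

;; (defclass asteroid () ())  (defclass ship () ())
;; (add-collide-method 'asteroid 'ship (lambda (a s) (declare (ignore a s)) :boom))
;; (collide (make-instance 'asteroid) (make-instance 'ship))  => :BOOM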

Monday, September 6, 2010

Programmer feel-good quote

From a bit to a few hundred megabytes, from a microsecond to a half an hour of computing confronts us with completely baffling ratio of 10⁹! The programmer is in the unique position that his is the only discipline and profession in which such a gigantic ratio, which totally baffles our imagination, has to be bridged by a single technology. He has to be able to think in terms of conceptual hierarchies that are much deeper than a single mind ever needed to face before. – E.W. Dijkstra

Sunday, September 5, 2010

The three great virtues of programming language designers

Diligence, Patience, Humility.

Saturday, September 4, 2010

The many forms of polymorphism

It's important to realize that different kinds of polymorphism in type systems can be quite orthogonal.

Here are some forms of polymorphism, from Types and Polymorphism in Persistent Programming Systems by R.C.H. Connor:
ad hoc: An operation is defined over a number of different types, and its semantics may depend upon the type of its operands. An example is the operator "+", which may be defined over integers and reals, and has a different interpretation for each type.

parametric: Instances of the same type within a type description may be abstracted over by an implicit or explicit type parameter. An example is an identity function, where although the parameter and result may be of any type, they are statically known to be the same type.

inclusion: The type of a value may be partially abstracted over, so that unnecessary type information need not be stated. An example is a function which is defined over any record value which has an age field of type integer.
He further writes:
All of these language concepts have evolved independently from each other. Languages such as early assemblers contained no polymorphism whatsoever. Ad hoc polymorphism was the first to appear, in languages such as Fortran, which defined overloaded operators, such as "+", along with the ability to coerce values from integer to real according to the use of such operators. Parametric polymorphism appeared in ML, and inclusion polymorphism in Simula. Existentially quantified types as described by Mitchell and Plotkin is a model of abstract data types which introduces abstraction over a type description, and such types are included in the definition of parametric polymorphism given above.

The motivation for all of these diverse language models is the same: the need to abstract over type. As type systems include more static constraints, then flexibility is lost as well as safety gained. Some of this flexibility, however, may be regained without the loss of safety by the introduction of type abstraction. Polymorphism is viewed here not as some theoretical attribute of type systems but as a solution to a class of practical problems which require type abstraction.

I, for one, welcome our new, optically-interconnected blade overlords


The IBM hub module brings 48 10Gbit/s optical links to a Power7 board.

Friday, September 3, 2010

From Choosing VMs to Feng Shui for HCI

There's a nice discussion going on in the LtU thread Choosing a VM for a concurrent language. Some quotes, out of context:

Sean McDirmid:
I'm sick and tired of PL papers that present a novel language idea and then justify it with...a formal model...huh? This is a logical fallacy of the shaggy dog variety.
David Barbour:
developing a new language is also nigh completely pointless unless there is something useful and interesting you can say about it. ...

Usage doesn't justify a design. Valid technical design seems to be a relatively minor factor in market success. ...

most attempts to achieve 'ease-of-use' and 'ergonomics' involve guidance by reasoned principles... a sort of Feng Shui for HCI (which holistically includes PL).
Matt Hellige:
Programming languages are tools that require significant investment of time to learn to use well. To a large degree, their value is measured in terms of how valuable they are to the people who know them best.

Wednesday, August 25, 2010

No more "minimal early Lisps", pulleezz

If there's one thing that irritates me about the pg-led Lisp renaissance, it's the fascination with the Ur-Lisp. Rewriting that Ur-Lisp has become a pastime. Here are just some of the reasons I think that's dumb:

Lisp is a bad model if you want some kind of axioms of computing


We already have the lambda calculus for that, and you can build a real language with that (see Haskell). If you want to write some minimal thingy in C, consider writing a Haskell, not a Lisp.

That you can write a Lisp evaluator in Lisp was interesting in 1959 (and maybe 1960), and let's face it, that Ur-Lisp was one broken language.

Car, cdr, wtf?

Contents of the address part of register, contents of the decrement part of register, what the fuck? What are they doing in a language, even a toy, written in 2010? Please use data structures.

Where are the closures?

If you're looking for a challenge, as opposed to redoing something that's been done ad meam nauseam for half a century fachrissakes, try to find a minimal and explainable way to do closures. Bonus points for efficient flat environments. Languages without closures are so 1959.
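
For the record, by "efficient flat environments" I mean something in this direction (a toy Common Lisp sketch of closure conversion done by hand, not a prescription):

;; A closure is code plus a flat vector of captured values - no chain of parent frames.
(defstruct closure code env)

;; (lambda (y) (+ x y)), with X captured, converted by hand:
(defun make-adder (x)
  (make-closure :code (lambda (env y) (+ (svref env 0) y))
                :env (vector x)))

(defun closure-call (c &rest args)
  (apply (closure-code c) (closure-env c) args))

;; (closure-call (make-adder 3) 4) => 7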

Where are the macros?

A Lisp without macros is meaningless.

Where's the control flow?

A Lisp without some kind of continuations and condition system is useless.

Where are the semantics?

If you read the RnRS attentively, you'll see that Lisp has evolved a deep and subtle set of semantic concepts, none of which feature in the Ur-Lisp.

What's the purpose?

Surely you're not learning much by repeating JMC's flawed Ur-Lisp from 1959. If you want to learn something, implement a language with closures and macros. If you want to learn more, make it a compiler. If you want to blow your head, implement hygienic macros or a higher-order module system or a static type checker. That's today's standard.

Look, Lisp is such a great language, but if anything we have to push it harder, not continuously go back to 1959.

Update: More fully-featured, modern Lisps, pulleezz

Saturday, August 21, 2010

Dalvik DEX

Good binary formats rock. Reading the ELF spec was quite eye-opening for me.

Dalvik's DEX format is another nice one.

Interestingly, Dalvik is the brainchild of Dan Bornstein, who was at Kaleida Labs (RIP), where he worked on ScriptX, one of the many avant-garde codes that fell victim to the WWW-induced ice age of GUI innovation.

Friday, August 20, 2010

The 2010 Linux Storage and Filesystem Summit

As usual, Jonathan Corbet does an admirable job of informing us on happenings in kernel land in his summaries of The 2010 Linux Storage and Filesystem Summit, day 1 and day 2.

On testing:
It was suggested that the real test should be "put the new code on the Google cluster and see if the Internet breaks."

On Google:

There are two fundamental types of workload at Google. "Shared" workloads work like classic mainframe batch jobs, contending for resources while the system tries to isolate them from each other. "Dedicated workloads" are the ones which actually make money for Google - indexing, searching, and such - and are very sensitive to performance degradation. ...

The workloads exhibit a lot of big, sequential writes and smaller, random reads. Disk I/O latencies matter a lot for dedicated workloads; 15ms latencies can cause phone calls to the development group.

Thursday, August 19, 2010

Meta: Blogging is difficult

The Axis of Eval is my second blog, after the venerable plans within plans within plans. (Yeah, there were others.)

Both of these blogs have gone through the same stages:
  1. A few days of writing in solitude with a couple of friends.
  2. Chris Neukirchen mentions the blog on Anarchaia or Trivium.
  3. Al3x twitters about it.
  4. Attack of the unwashed HN masses (just kidding).
It's hard to go from 1 to N. The rush of thousands of readers is great, but it's confusing (being able to write to thousands of peers in real-time isn't really accounted for by our DNA and social structure). For me, it's kinda hard to write when I know that a post I write will show up in the inboxes of many people, most of whom I don't know.

I'd like to keep a certain entertaining and informative niveau, and in the best case, I'd also like to improve it. In the early days, aggressive diss-posts and funny flames flow freely, because the audience is small and trusted. And those posts are entertaining. But in a more public setting, I have a bit of a bad feeling writing them, because I feel they may harm people I write about, when all I'm intending is to vent about some ideas I think are bad or ridiculous, or would like to tell a stupid joke.

So, what I want to say is that there are some difficulties to blogging that are seldom written about, and I'm still trying to figure out the boundaries of this strange new thingy, and where to draw the line between fact and fiction in blogging.

--Manuel

Monday, August 16, 2010

No Paranoia Rule

Good rules are few and far between in the programming scene.

The first time I heard of what I now call the No Paranoia Rule was in the following comment by Luke Gorrie on LtU:
"Oh my, what if Luke installed an exception handler that ROT13 encoded every string on the heap, then how would Jane debug her programs?"
I hear that early on, people opposed subroutines for similar reasons. And of course, macros are frequently criticised for their potential of wreaking havoc.

But that's where the No Paranoia Rule comes into play. Stop being paranoid, and don't discount language features for their potentially devastating effect.

As Luke further states,
This [being paranoid] is not the way to illumination.
(Clemens A. Szyperski coined the term No Paranoia Rule.)

Saturday, August 7, 2010

You know you're reading LtU when...

I believe the preferable solution is typeclasses as record types, instances as records, scoped implicit parameters for propagating them around, associated types as record members, and a backtracking logic solver for instantiation. But that just solves the immediate problem; on top of that I'd like a dependent type system with staging to enable type dependency on values (e.g. arrays-with-length), and a higher-order logic solver so simple logic-programming idioms like let zip(as,bs)=f(x) in ... can be expressed and translated via instantiation into efficient runtime code. And a total (non-partial) function subset of the language for reasoning with proofs as programs. – Tim Sweeney

Monday, August 2, 2010

Three Principles of Lisp

I've been thinking about what it is that still sets Lisp apart from all other dynamic languages. I've come to three core principles that define what Lisp means for me.

Liberal use of syntactic abstraction

Lispers don't fret over whether to build a domain-specific language (DSL). When it makes sense, they build one; when it doesn't, they don't. Thanks to the trivial syntax, and the long experience with tools like pattern matching and quasiquotation, DSLs in Lisp are created rapidly.

In other languages, the creation of new language constructs or new languages requires countless hours of busy work. In Lisp, DSLs can be ready to use after a couple of lines entered at the REPL. In Lisp, creating DSLs is a no-brainer, and I think that's one of the cornerstones of Lisp.

Lisp isn't doctrinaire about creating DSLs - in fact, it is standard advice to only use macros when plain functions won't do anymore. But because creating DSLs in Lisp is so easy, they are created whenever they make sense. Which, as it turns out, is a lot of the time. I would go so far as to say that the liberal use of syntactic abstraction makes Lisp a boilerplate-free language, but that's another topic.
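
To show the scale I mean, here's the kind of construct that fits in a couple of REPL lines (WITH-RETRIES is a made-up example, not from any library):

;; Run BODY, retrying up to N times; re-signal the error on the last attempt.
(defmacro with-retries ((n) &body body)
  (let ((attempt (gensym)) (err (gensym)))
    `(loop for ,attempt from 1 to ,n
           do (handler-case (return (progn ,@body))
                (error (,err)
                  (when (= ,attempt ,n) (error ,err)))))))

;; (with-retries (3) (parse-integer (read-line)))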

There's no such thing as a constant mentality

Lisp thrives on interactivity. Newer languages like Haskell are encroaching on the natural habitat of Lisp (and surpass it in some areas), but none of them can match Lisp when it comes to no-holds-barred, nitty-gritty interactivity. Even defconstant isn't necessarily constant, and for interactive systems (Emacs!) that's exactly what you want. In an interactive system, you can't tolerate constancy - that would be like using Lego blocks that are glued together.

I think a fundamental insight for understanding Lisp's superiority in the interactive domain is that Lisp is an asynchronous language, in a very peculiar sense: any Lisp expression may be entered at any time. In many cases, you could cut up a Lisp source file at top-level expression boundaries, rearrange them, and the resulting source file would still have exactly the same effects as the original file when loaded into a Lisp. (Just as one example, in Common Lisp it's possible to create subclasses of classes that don't exist yet.)
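
To give just one concrete illustration of that last point:

;; Legal Common Lisp: FOO's superclass BAR doesn't exist yet when FOO is defined.
(defclass foo (bar) ())
(defclass bar () ())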

This is simply a natural consequence of making the read-eval-print loop (REPL) the cornerstone of Lisp semantics. In C, the dominant paradigm is main(), in Lisp it's (repl). And the top-level is tricky, some say hopeless. In Lisp, it is fundamental to always be able to define a function FOO that calls an as-yet undefined other top-level function BAR - a fundamental idiom of interactivity:
(defun foo ()
  (bar))
Toolchain approach

Lispers don't need to create a new language to try out new language design ideas. Many times, new languages can be defined as macros. If they can't, and an interpreter or compiler is needed, it can still stand on the shoulders of Lisp. In the Lisp world, new languages are built by combining large, battle-tested building blocks, and polishing or updating them when needed, instead of starting over from toothpicks and double-sided duct tape. A large Lisp like Common Lisp is like a toolchain of decades-old tools that have proven their worth, and have been codified in standards, folklore, and implementations.

Now of course, that's also true of other languages. But in other languages, starting a new language is a from-scratch affair. And along the long road of parsing, analyzing, and interpreting or compiling language constructs, the creator of the new language invariably introduces a lot of things that differ from other languages, so every new non-Lisp language ends up subtly different from existing ones in areas such as control flow or scoping, and those differences take decades to iron out.

In Lisp, new tools are tried out as separate functions, macros, DSLs, subsystems, or, in the extreme case, code walkers or complete new implementations. The rest of the language stays the same. Even a new Lisp implementation will often share a lot of the design space with other Lisps. That's why Lisp had proper lexical scoping for decades before it became a fixture in new languages, and that's also why Scheme has hygienic macros and phase separation now, while other languages will take decades to get them - because in Lisp, all of these new constructs can be developed within the Lisp fabric, without rebooting the process every time. Which makes for a nice metaphor:

Others reboot, Lisp keeps running.

Friday, July 30, 2010

So true

Great response to Ask HN: Do you use more than one programming language?
Not only do I use multiple languages professionally, I don't know some of them. – anamax

Monday, July 26, 2010

Concurrency's Shysters

Excellent article by Bryan Cantrill, Concurrency’s Shysters:
[C]oncurrency is still being used to instill panic in the uninformed. This time, it is chip-level multiprocessing (CMP) instead of SMP that promises to be the End of Days — and the shysters have taken a new guise in the form of transactional memory. The proponents of this new magic tonic are in some ways darker than their forebears: it is no longer enough to warn of Judgement Day — they must also conjure up notions of Original Sin to motivate their perverted salvation. “The heart of the problem is, perhaps, that no one really knows how to organize and maintain large systems that rely on locking” admonished Nir Shavit recently in CACM. (Which gives rise to the natural follow-up question: is the Solaris kernel not large, does it not rely on locking or do we not know how to organize and maintain it? Or is that we do not exist at all?)
I know much too little about the subject to have an informed opinion, but I think countering hype in all its forms is important.

Sunday, July 25, 2010

Happenings in GCC-land

Ian Lance Taylor is working on split stacks. This will permit the stacks of gccgo goroutines, which are mapped to native threads, to start small and grow as needed. This will of course be usable from C as well, and should be welcome news for the millions-of-threads!!!1! faction.

Tom Tromey is working on making GCC an incremental compiler: GCC will run as a server and maintain a model of the user's program. When a translation unit is recompiled, GCC will re-compile the minimum necessary. One of the goals is to make GCC a backend for IDEs that do stuff like autocompletion based on the program model (Interview, paper from GCC Summit).

People from Intel and others are working on transactional memory. Sections of code can be marked as atomic, and their temporary changes will be saved in thread-local storage. The semantics will make it appear as if transactions were protected by a single global lock.

It's a great time to be writing a Lisp->C compiler! :)