Friday, April 27, 2012

Physical code

Roly Perera, who is doing awesome work on continuously executing programs and interactive programming, tweeted that we should take inspiration from nature for the next generation of PLs:
  • Lesson #1: it's physical objects all the way down. #
  • Lesson #2: there is only one "from-scratch run", and we're living in it. #
  • Lesson #3: there are no black boxes. #
The first point is one I'm thinking about in the context of what I call hypercode.

Common Lisp is - as usual - an interesting example: with its Sharpsign Dot macro character, one can splice real objects into the source code as it is loaded from a file.

E.g. if you put the text (* #.(+ 1 2) 3) into a file, the actual code when the file is loaded is (* 3 3).
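
A minimal sketch of what the reader does here (standard Common Lisp; #. requires *read-eval* to be true, which is the default):

  ;; #. evaluates the following form at read time, so the object the
  ;; reader returns already contains the spliced-in result.
  (with-input-from-string (in "(* #.(+ 1 2) 3)")
    (read in))
  ;; => (* 3 3)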

And with Sharpsign Equal-Sign and Sharpsign Sharpsign, we can construct cyclic structures in files: '#1=(foo . #1#) reads as a pair that contains the symbol foo as its car and itself as its cdr, printed as #1=(FOO . #1#). (Don't forget to set *print-circle* to true if you don't want a stack overflow in the printer.)
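
A quick way to check this at the REPL (standard Common Lisp):

  ;; Read the circular pair back in; binding *print-circle* makes the
  ;; printer emit #1# references instead of recursing forever.
  (let* ((*print-circle* t)
         (cell (read-from-string "#1=(foo . #1#)")))
    (format t "~S~%" cell)   ; prints #1=(FOO . #1#)
    (eq cell (cdr cell)))    ; => T, the pair really does contain itself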

The question I'm thinking about is: what happens if source code consists of the syntax objects themselves, and not their representations in some format? Is it just more of the same, or a fundamental change?

For example: what if you have a language with first-class patterns and you can now manipulate these patterns directly - i.e. they present their own user interface? Is this a fundamental change compared to text or not?
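
To make the question concrete, here is a rough sketch of patterns as first-class objects: the "source code" is the structure itself, which an editor could present directly instead of parsing text. All the names are invented for illustration.

  ;; PAT-VAR, PAT-CONS and MATCH are hypothetical names, not an
  ;; existing library.
  (defstruct pat-var name)               ; matches anything, binds NAME
  (defstruct pat-cons car-pat cdr-pat)   ; matches a cons cell

  (defun match (pattern datum &optional bindings)
    "Return an alist of bindings on success, or :FAIL."
    (cond ((pat-var-p pattern)
           (acons (pat-var-name pattern) datum bindings))
          ((pat-cons-p pattern)
           (if (consp datum)
               (let ((b (match (pat-cons-car-pat pattern) (car datum) bindings)))
                 (if (eq b :fail)
                     :fail
                     (match (pat-cons-cdr-pat pattern) (cdr datum) b)))
               :fail))
          ((eql pattern datum) bindings)
          (t :fail)))

  ;; The pattern below is an object graph, not text:
  (match (make-pat-cons :car-pat (make-pat-var :name 'x)
                        :cdr-pat (make-pat-var :name 'y))
         '(1 2 3))
  ;; => ((Y 2 3) (X . 1))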

As Harrison Ainsworth tweeted:
  • Is parsing not an illusory problem caused by using the wrong data structure? #

4 comments:

Roly Perera said...

This is one of my favourite questions. The way I want to answer it is that there is no real distinction. But that doesn't mean there isn't something profound going on as well.

So, take your example with first-class patterns. Make this into an explicit syntax, as you suggest. It's a DSL for patterns. You write programs directly in this language. You might still want a texty interface to this language. Why not? I have a keyboard in front of me; typing is easy.

Then you might write a function that compiles this down to some other more familiar language, say JavaScript. The JavaScript is not a different "kind" of thing; it's just more syntax. The function interpreting one into the other is a compiler.

Or you might go the other way, and build some fancy GUI sitting on top of your pattern language. But isn't that just more syntax? You're going to want some way of compiling the GUI into the pattern-language. That's just to give a semantics for the GUI by interpreting it into another language.

It's syntax all the way down.
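
A minimal sketch of that compilation step, with an invented s-expression pattern syntax and a string of JavaScript as the target (nobody's actual implementation, just the shape of the idea):

  ;; "Compiling" a tiny pattern language into JavaScript source text.
  ;; The input syntax, (VAR X) and (CONS p q), is made up for illustration.
  (defun compile-pattern (pat subject)
    "Return a JavaScript expression (a string) testing SUBJECT against PAT."
    (cond ((and (consp pat) (eq (car pat) 'var))
           "true")                                  ; a variable matches anything
          ((and (consp pat) (eq (car pat) 'cons))
           (format nil "(Array.isArray(~A) && ~A && ~A)"
                   subject
                   (compile-pattern (second pat) (format nil "~A[0]" subject))
                   (compile-pattern (third pat)  (format nil "~A[1]" subject))))
          (t (format nil "(~A === ~A)" subject pat))))

  (compile-pattern '(cons (var x) 2) "v")
  ;; => "(Array.isArray(v) && true && (v[1] === 2))"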

dmbarbour said...

Separating representation from syntax is like separating the wet from water. Representation is fundamental to communication. Source code is a medium for communication. With code, we communicate between humans, between businesses, from human to machine, and between machines.

There are many things we can do, however.

We can structure syntax in multiple dimensions, to reduce the need for names. We can provide constraints for syntax, e.g. for concatenative programming, or monotonic structure for streaming programs.

And, importantly, we can provide modularity - membranes that may be more or less transparent - to constrain interactions between syntax.

By making the representation first-class, you can achieve tangible benefits - for metaprogramming, language extension, language oriented programming.

But any first-class representation should reflect the same structure and constraints as the syntactic form itself - the same modularity properties, the same concatenative or monotonic properties, etc. Otherwise you're just begging for structure violations. Like fexprs do.

Harrison Ainsworth said...

No, it cannot be a fundamental change, because *all* data structures are 'representations in some format' -- they are all bits arranged in some particular way.

Source code in text is a data structure, source code in binary ('the syntax objects themselves') is a data structure. They are both the same, essentially (fundamentally). (Text *is* a binary format too, really.)

The difference is the affordances of the data structures. The meaning of a data structure is the algorithms that use it -- in a broad sense: whatever computation or manipulation is made easiest by it is what that data structure is really about.

Normal text formats (Lisp aside, somewhat) make manipulation difficult, and so there are certain things we do with them rarely or not at all. If we modified/extended/changed the data structures (which includes the idea of recording more data), we could and would do more things than with very dumb text.

A change need not be fundamental to be useful!

jhuni said...

"Lesson #1: it's physical objects all the way down."

Before the introduction of languages like C++ and Java, the "object" in object-oriented programming referred to physical objects. The first OOP language, Simula, was designed to simulate physical objects.

Unlike Java-style OOP, the Simula style of "programs = physical objects with behavior" never caught on, and for good reason. Modern hardware is simply incapable of modeling physical properties like conservation and persistence.

Projects to develop alternative computer hardware like loper are a necessary step towards having physical objects all the way down.

"what happens if source code consists of the syntax objects themselves, and not their representations in some format?"

There is no one correct representation of an object; however, we can select a canonical representation. Most computer algebra systems include mechanisms for reducing objects to their canonical form; for example, the Standard Lisp algebra system REDUCE is apparently named after this reduction process.
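
As a toy illustration of that kind of reduction (nothing to do with REDUCE's actual algorithms): sort the arguments of a commutative operator so that equivalent expressions end up with one shared representation.

  ;; Illustrative only: canonicalize sums by sorting their arguments,
  ;; so (+ b a 1) and (+ a 1 b) reduce to the same form.
  (defun canonical-sum (expr)
    (if (and (consp expr) (eq (car expr) '+))
        (cons '+ (sort (mapcar #'canonical-sum (cdr expr))
                       #'string< :key #'princ-to-string))
        expr))

  (canonical-sum '(+ b a 1))   ; => (+ 1 A B)
  (canonical-sum '(+ a 1 b))   ; => (+ 1 A B)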