How much of the space of user interface programming models have we, as a society, explored? The story starts with pretty raw event loops and runs through explicitly associating event handlers with elements (like in GTK+ or the HTML DOM) and MVC (as in the Smalltalk tradition). Somewhere in there we got FRP-based approaches and their more-imperative Rx cousins. I think a more purely declarative approach might be possible, and I’m interested in what it would look like.
What do I mean by a fundamentally declarative client-side programming model? The view should be a pure function of the state, and the state should be a pure function of stuff that’s happened in the view. This rules out traditional event handlers that explicitly supply an effect when the event occurs, because putting effects in the view changes our system description from “the state is a pure function of stuff that’s happened in the view” to “the state is a pure function of effects that were triggered by the view”.
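As a rough illustration of that shape (in Python rather than the language used later in this post, and with a hypothetical Click event type that is purely an assumption for the example), the state can be computed from the history of view events, and the view from the state, with no effects anywhere:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Click:
    """Hypothetical record of something that happened in the view."""
    label: str


def state_from(events):
    """The state is a pure function of everything that has happened in the view."""
    return sum(1 for e in events if e.label == "+")


def view(state):
    """The view is a pure function of the state."""
    return ["button +", f"It's been clicked {state} times"]


history = (Click("+"), Click("+"), Click("-"))
rendered = view(state_from(history))
print(rendered)
```

Nothing in the view says what should happen on a click; the click only shows up as one more entry in the history.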
One option would be to take a leaf out of the FRP kids’ book and define a Html structure to be a thing that ultimately produces some stream of values, and each event handler within the Html returns a value that will go into the Html’s stream. There are a few issues around composing these things since if you have some widget generating a Html to include in another Html, you need to massage the values going out to fit in the containing type, and rearrange things coming back in to make sure the inner widget sees its values. All these things can be dealt with but it seems to require some pretty meticulous knot-tying. If we give ourselves permission to go with whatever wild stuff we can dream up, where might we end up?
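To make the composition problem concrete, here is a minimal Python sketch of the "massage the values going out" half. The Button and Group types and the map_html and emitted helpers are all hypothetical names invented for this example; they are not taken from any particular FRP library.

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass(frozen=True)
class Button:
    label: str
    on_click: Any  # the value this button puts on the stream when clicked


@dataclass(frozen=True)
class Group:
    children: Tuple[Any, ...]


def map_html(f, html):
    """Rewrap every value an Html can emit so it fits the containing type."""
    if isinstance(html, Button):
        return Button(html.label, f(html.on_click))
    return Group(tuple(map_html(f, child) for child in html.children))


def emitted(html):
    """All values this Html could put on its stream, in document order."""
    if isinstance(html, Button):
        return [html.on_click]
    return [v for child in html.children for v in emitted(child)]


# An inner widget that knows nothing about where it will be embedded.
counter = Group((Button("+", "inc"), Button("-", "dec")))

# The containing page tags each copy so it can route values back later.
page = Group((
    map_html(lambda v: ("left", v), counter),
    map_html(lambda v: ("right", v), counter),
))
print(emitted(page))
```

The other half, unwrapping the tags again so each inner widget only sees its own values, is where the knot-tying starts to get meticulous.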
Let’s define a page. Starting simply, maybe a page is just some text.
page = [text "Hello, World"]
But we might want some interactivity… maybe a button?
page = [text "Hello, World", button "Do Not Press"]
See, the button says “do not press” because nothing’s going to happen when you click on it. Maybe we want a button to be an increment button?
page = [button "+", text "It's been clicked 💥 times"]
But it can’t have been clicked 💥 times. 💥 is not a number. Nothing happens 💥 times. If something were to happen 💥 times, it would be a disaster. So what do we put in that space?
page = [button "+", text "It's been clicked ", text (numberToString (clickCount "+")), text " times"]
Okay, so that’s cute, but how can we define this clickCount thing? First, we need to look at our buttons, which are the things returned by button. We can imagine a function buttonClicks that returns the click events for a button. But there’s a problem with this, because I want to live in a purely functional world and in a purely functional world the return value of a function is determined entirely by its arguments. That means the button returns the same thing for any given argument. In a sense, there is only one button labeled “+”, even if that one button appears many times on the screen. We may want the button to do different things in different places, though, so we can’t just pass the button to buttonClicks. We could try to differentiate the buttons by defining a function that produces particular buttons:
incrementCounter = button "+"
But every button with the same label is going to be the same button. How can we make sense of where exactly a click event has come from?
counterPanel n = incrementCounter
page = [counterPanel 0, counterPanel 1, counterPanel 2]
I’d like to introduce an idea I’m calling a “suspension”. A suspension is a value that captures the idea of calling a function: it refers to a function, and it has some arguments to pass to that function. It also has a notion of parents, which are suspensions referring to stack frames above this one in the call stack. Once you have suspensions, you can say things like “give me the clicks on buttons returned from incrementCounters returned by the counterPanel with a number argument of 1”. In a gentle homage to Lisp’s quoted expressions, my syntax for the suspension describing those buttons right now would be '(incrementCounter ^(counterPanel n:1)).
counterPanel n = [incrementCounter, length (buttonClicks '(incrementCounter ^(counterPanel n:1)))]
page = [counterPanel 0, counterPanel 1, counterPanel 2]
Saying that is one thing, but having it work at runtime is quite another. Fortunately, it’s an entirely doable thing! My implementation right now is extremely rough and ready, but programs like the one above definitely work.
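One way to picture what the runtime has to do is the following Python sketch. It is not the implementation described in this post: the Frame and Suspension records and the matches function are hypothetical, the frames are built by hand rather than recorded from real calls, and it assumes the ^(...) part may match any frame higher up the stack rather than only the immediate caller.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Frame:
    """One remembered function call: name, named arguments, and its caller."""
    fn: str
    args: Tuple[Tuple[str, object], ...]
    parent: Optional["Frame"] = None


@dataclass(frozen=True)
class Suspension:
    """A description of calls: name, some arguments, and an optional ^(...) ancestor."""
    fn: str
    args: Tuple[Tuple[str, object], ...] = ()
    ancestor: Optional["Suspension"] = None


def matches(s, frame):
    """True if the frame is a call the suspension describes."""
    if frame.fn != s.fn or not set(s.args) <= set(frame.args):
        return False
    if s.ancestor is None:
        return True
    up = frame.parent
    while up is not None:  # assumption: any frame above may satisfy ^(...)
        if matches(s.ancestor, up):
            return True
        up = up.parent
    return False


# Frames for: page = [counterPanel 0, counterPanel 1, counterPanel 2]
page = Frame("page", ())
panels = [Frame("counterPanel", (("n", n),), page) for n in range(3)]
buttons = [Frame("incrementCounter", (), p) for p in panels]

# '(incrementCounter ^(counterPanel n:1))
target = Suspension("incrementCounter", ancestor=Suspension("counterPanel", (("n", 1),)))
hits = [b for b in buttons if matches(target, b)]
print(hits)
```

Dropping the ancestor from the suspension matches all three buttons, which is the "there is only one button labeled +" situation from earlier.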
A suspension is not a function call expression. The syntax for a suspension resembles that for a function call, but is different in a few key ways. One is that the start of the suspension, the reference to a function, is restricted to being a variable name and will not be evaluated as an expression. The suspension refers to frames produced by the lambda referred to by that variable name, which simplifies the semantics by avoiding questions about exactly which expressions are matched. What should happen if you have two identical function expressions produced by different expressions? Restricting the first field to be a name gets us out of this pickle.
Another reason for the suspension syntax is that the language I’m working with has a sort of haskelly flavour, so once a function expression has all its arguments applied to it, it executes. That means that if you had a suspension that specifies all of a function’s arguments, and you tried to represent it as a function call expression, the runtime will evaluate the expression and keep its result. The expression’s function-call-ness is lost! The runtime can no longer use it as a suspension. It would be possible to resolve this with a syntax that more explicitly indicates when a function invocation should occur, but only a very small proportion of expressions would be valid suspensions. It’d be easy to write something that is a suspension, then make some changes to your code, then accidentally not have a suspension anymore. I really like having a syntax for suspensions that makes it difficult to accidentally turn a suspension into not-a-suspension.
Beyond syntactic differences, I use suspensions to offer pretty alien language features. Function pointers and closures and continuations all offer some way to hold a handle to a computation, but the only thing you can do with it is commence the computation. Suspensions hook into the runtime to let you talk about function invocations that have been made or might be made in the future as part of a computation.
Why do I think this is interesting? Things in front-end programming today have a multitude of names. An element might have the name used to attach an event handler to it, the function or template returning the markup used to generate it, the names used to associate CSS styles with it, maybe some names used to manipulate it dynamically. On some level, this work is an experiment in not giving things any name at all. The increment button is the thing returned by incrementCounter and that’s how we talk about it. When we don’t give a thing any names at all, there is no way to talk about it except by what it is, and when we only talk about what it is there’s no way for names to drift or lag or become redundant or contradictory.
This is a new way of looking at functions and their arguments. As well as talking about a function as a computation to be performed and arguments as inputs to that computation, we can also use the function and its arguments as a way of naming the value (or values) returned by that function. Maybe we can keep using this language to refer to values long after they’ve been returned from the function that has become their name.
In the context of the web, this perspective on functions and arguments can be taken further and used to organise the information by which a website decides what to render. Today, data that informs what people see on a web page comes from all over the place, particularly the URL’s path and query string, the user’s session, and any other cookies or local storage. Each of those sorts of data takes its own route to inform what users ultimately see on the page. If they are all just function arguments, then the interplay between those arguments and the function suspension mechanism dictates where suspended data needs to be stored or sent. If a suspension describes stack frames across multiple different users, then accurately rendering pages that depend on that suspension will necessitate sharing the values from those stack frames by storing them on a server. If a suspension describes frames across multiple pages, then things will need to be saved in cookies or local storage or the server. I see glimmers here of a language describing what needs to be true of how data is stored, independently of how it is stored in any given version of an application.
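The two rules in that paragraph are simple enough to write down directly. The Python sketch below is only an illustration: it assumes each frame a suspension matches has somehow been labelled with a user and a page, and the "in-page memory" fallback is an extra assumption for the case the paragraph doesn't mention.

```python
def required_storage(frames):
    """frames: iterable of (user, page) pairs for the frames a suspension matches."""
    frames = list(frames)
    users = {user for user, _ in frames}
    pages = {page for _, page in frames}
    if len(users) > 1:
        # Values are shared between users, so they have to live on a server.
        return "server"
    if len(pages) > 1:
        # One user, several pages: anything that outlives a page load will do.
        return "cookie, local storage, or server"
    # Assumption: a single user on a single page needs nothing persistent.
    return "in-page memory"


print(required_storage([("alice", "/"), ("bob", "/")]))
print(required_storage([("alice", "/"), ("alice", "/about")]))
```

The interesting part would be deriving those labels from the function arguments themselves, which this sketch simply takes as given.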
There are a few caveats right now. The implementations I can offer you either remember every stack frame ever (in case they are ever suspended), or require functions to opt-in to being remembered (in which case only functions that have opted in can appear in suspensions). I’m working towards a type system that uses types to guide the opt-in flags, but that work is in very early stages. Either way, remembering stack frames (whether you remember all of them or some partial set) is likely to turn into a memory leak, and if this work is to go anywhere I will need to grapple with that somehow.
There are also some little logistical issues. My current implementation returns the values that match a suspension, and multiple buttons can potentially be described by a suspension, which means that you need to do a lot of getting lists and mapping something over the list and aggregating things. So when I say “programs like the one above” work, I mean programs that have a bunch of sadly-obfuscatory mapping and concatenating and summing in order to make them work. This is somewhat galling when the suspension you’re looking at can only really refer to one value, but that’s the nature of life with research-grade software.
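For a flavour of that bookkeeping, here is a small Python sketch. The click_log table and the button_clicks lookup are hypothetical stand-ins for the suspension machinery; the point is only that the lookup hands back a list per matching button, so even a simple count needs a map and a sum.

```python
# Hypothetical log: one list of click timestamps per counterPanel's button.
click_log = {
    ("counterPanel", 0): [0.5],
    ("counterPanel", 1): [1.2, 3.4, 5.0],
    ("counterPanel", 2): [],
}


def button_clicks(panel_numbers):
    """Stand-in for a suspension lookup: one list of clicks per matching button."""
    return [click_log[("counterPanel", n)] for n in sorted(panel_numbers)]


def click_count(panel_numbers):
    """Counting clicks means measuring every list and summing the results."""
    return sum(len(clicks) for clicks in button_clicks(panel_numbers))


print(click_count({1}))        # only one button can match, but we still get a list of lists
print(click_count({0, 1, 2}))
```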
Particularly eagle-eyed readers might have noted that I said above that my current implementation returns the values that match a suspension, and that this contradicts my earlier statement that every value returned by incrementCounter is identical. Right now, the value semantics are somewhere between quirky and completely broken. When you retrieve a value from the suspension infrastructure, it’s annotated by the whole call stack that produced that value. You can’t see that at the language level, but the runtime uses it internally to route events to the right place. This is the sort of thing that’s likely to get very confusing, since you can have two values that are equal but if you pass them both to htmlElementEvents you’ll get two different lists back. Are there implementations of ideas like this that allow exploring the potential advantages I describe above but without doing quite so much damage to the value semantics?
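The confusion is easy to reproduce in miniature. In this Python sketch, which uses a hypothetical Button record and EVENTS table, the provenance is stored on the value but excluded from equality, which is roughly the situation described above:

```python
from dataclasses import dataclass, field
from typing import Tuple


@dataclass(frozen=True)
class Button:
    label: str
    # Hidden annotation: the call stack that produced this value.
    # compare=False keeps it out of ==, so the program can't observe it directly.
    origin: Tuple[str, ...] = field(default=(), compare=False)


# Hypothetical event store, keyed by the call stack that produced the element.
EVENTS = {
    ("page", "counterPanel 0", "incrementCounter"): ["click"],
    ("page", "counterPanel 1", "incrementCounter"): ["click", "click"],
}


def html_element_events(button):
    """Routes on the hidden annotation, not on anything == can see."""
    return EVENTS.get(button.origin, [])


a = Button("+", ("page", "counterPanel 0", "incrementCounter"))
b = Button("+", ("page", "counterPanel 1", "incrementCounter"))

print(a == b)
print(html_element_events(a) == html_element_events(b))
```

Two arguments that compare equal give different results from the same function, so html_element_events is not a function of its arguments in any sense the language can see.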
There are some big questions around exactly which operations you should be allowed to perform with a suspension. Right now, you can suspend any function and get a list of stack frames back. This works in full generality; the runtime will keep on running your program and passing it the new lists of frames from suspended function calls until it settles down. I do not know if this is guaranteed to converge in general, nor do I have much of an idea of what might cause it to diverge. These suspensions are a big stupid hammer that it’s pretty fun to throw around. The main motivation for them, using them to route user interface events, probably doesn’t need such a very large hammer. It’s probably worth trying to build the smallest sufficient hammer.
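The "keep running until it settles down" loop is an ordinary fixed-point iteration. A minimal Python sketch follows, assuming a hypothetical step function that takes the frames seen so far and returns the frames produced by re-running the program; the max_rounds cap is an addition for the example, since nothing guarantees convergence.

```python
def run_until_settled(step, frames=frozenset(), max_rounds=100):
    """Re-run the program, feeding it the frames from the previous run, until nothing changes."""
    for _ in range(max_rounds):
        new_frames = step(frames)
        if new_frames == frames:
            return frames
        frames = new_frames
    raise RuntimeError(f"no fixed point after {max_rounds} rounds")


def step(frames):
    """Toy program: the page creates two panels, and each known panel creates a button."""
    produced = {"page"}
    if "page" in frames:
        produced |= {"counterPanel 0", "counterPanel 1"}
    for frame in frames:
        if frame.startswith("counterPanel"):
            produced.add("incrementCounter in " + frame)
    return frozenset(produced)


settled = run_until_settled(step)
print(sorted(settled))
```

A step that produces a new frame every time it sees one more frame never settles, and in this sketch it simply hits the cap and raises.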
Is any of this a good idea? I don’t know!
There are three rough directions I could go in from here:
- Keep working and playing and experimenting with suspensions and their possible semantics.
- Step back a bit. Look at the benefits suspensions might offer for routing events and describing where data needs to be stored and try to figure out what less-exotic language features might work in the same way. As an example of what that might look like, one way of thinking of my current prototype is that there’s sort of an implicit global variable containing the history. That global variable holds a number of events of varying types, and suspensions let you get the events of a particular type out. If that ability to get elements from a heterogeneous list by type is actually the useful thing, that could presumably be built without any of the other exotica.
- Step back a lot. One of my motivating interests in all this is building an app where the server-side state is defined as a pure function of things that have happened on the client, and I could try to do that with only the current set of language features in Haskell or OCaml.
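The second direction, a global history that you can query by event type, needs very little machinery. Here is a Python sketch with hypothetical Click and KeyPress event types; it shows the "get elements from a heterogeneous list by type" operation and a piece of state defined purely in terms of it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Click:
    button: str


@dataclass(frozen=True)
class KeyPress:
    key: str


def events_of(kind, history):
    """Pull the events of one type out of a heterogeneous history."""
    return [event for event in history if isinstance(event, kind)]


def counter(history):
    """State as a pure function of what has happened: count the '+' clicks."""
    return sum(1 for click in events_of(Click, history) if click.button == "+")


history = [Click("+"), KeyPress("a"), Click("+"), Click("-")]
print(events_of(KeyPress, history))
print(counter(history))
```

What this loses compared with suspensions is any notion of where in the call stack an event came from; the button name has crept back in as the only way to tell clicks apart.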
At this stage I’m planning on continuing with my current direction. I don’t know if any of this is a good idea, but I definitely think there are some interesting ideas in here, and I don’t know of anyone else pursuing them, and that’s enough for me.
Huge thanks to Julia, Vaibhav and Veit for commenting on a draft of this post!