2020-05-11

What if we'd had better html-in-js syntax all along?

I have a theory that a grave mistake was made in 1995 - the decision not to have a neat, succinct and declarative way of representing html elements in javascript. [On reading the Hacker News comments, this is pretty historically inaccurate. However, I think this article still stands as "look just how close javascript object notation is to a reasonable way of representing html".]

In this post, I'm going to describe the current state of affairs, show a couple of very small additions to javascript's syntax that might've made all the difference, then talk about the repercussions.

What options do we have now?

We have some APIs available to us, including all the Element properties/methods:

document.createElement("div")
document.createTextNode("Hi there and greetings!")
el.classList
el.innerHTML
el.removeAttribute()
...

We can use string templates and functions to try to be more declarative:

const myLi = name => `<li>My name is <em>${name}</em></li>`
const myUl = names => `<ul class="my-ul">${names.map(myLi).join('')}</ul>`

This is rubbish for obvious reasons - composing the strings is bug-prone (nothing escapes the interpolated values), there's no typing/linting etc etc. We then have to turn them into elements with the less than elegant:

function createElementFromHTML(htmlString) {
    var div = document.createElement('div');
    div.innerHTML = htmlString.trim();
    return div.firstChild;
}

Mmm...
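
To make the "bug-prone" bit concrete, here's a minimal sketch of what happens when an interpolated value contains markup. The escapeHtml helper, mySafeLi and nastyName are names made up for this illustration, they aren't from any library:

```javascript
// Hypothetical helper (an assumption for this sketch): escape a value
// before pasting it into an HTML string.
const escapeHtml = text =>
    String(text)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#39;')

// The template from above, unchanged: the name is pasted in verbatim.
const myLi = name => `<li>My name is <em>${name}</em></li>`

// Same template, but the interpolated value is escaped first.
const mySafeLi = name => `<li>My name is <em>${escapeHtml(name)}</em></li>`

const nastyName = '<img src=x onerror=alert(1)>'
const unsafe = myLi(nastyName)    // the name ends up as a real <img> tag
const safe = mySafeLi(nastyName)  // the name stays plain text
```

Every template author has to remember to do this by hand, everywhere, which is rather the point.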

We can use the new template elements, from the MDN docs:

<template id="productrow">
    <tr>
        <td class="record"></td>
        <td></td>
    </tr>
</template>

const tbody = document.querySelector("tbody")
const template = document.querySelector('#productrow')
const clone = template.content.cloneNode(true)
const td = clone.querySelectorAll("td")
td[0].textContent = "1235646565"
tbody.appendChild(clone)

I don't know about you, but that felt pretty yucky.

Of course, there are also various libraries, ranging from string templating ones (like handlebars) through to compile-away ones like .jsx.

Representing html nodes now

In contemporary typescript, a node of html can map to and from the type:

type HtmlNode = {
    tag: string,
    attributes: {[key: string]: string | Function},
    children: (HtmlNode | string)[]
}

A node like:

<button id="baz" class="foo bar" data="1" onclick=f>
    Hello
    <em>there</em>
</button>

Would map to and from:

{
    tag: "button",
    attributes: {
        "id": "baz",
        "class": "foo bar",
        "data": "1",
        "onclick": f,
    },
    children: [
        "Hello",
        {
            tag: "em",
            attributes: {},
            children: ["there"],
        }
    ]
}

Notice the js representation is a bit verbose.
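
For the "to html" direction of that mapping, here's a rough sketch of a renderer for the object above. toHtml and escapeHtml are hypothetical names for this example. It assumes function-valued attributes (like onclick) simply can't be serialised and skips them, and it ignores void elements like <img>:

```javascript
// Hypothetical helper: escape text and attribute values.
const escapeHtml = text =>
    String(text)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;')

// Render an HtmlNode-shaped object (or a plain string) to an HTML string.
const toHtml = node => {
    if (typeof node === 'string') return escapeHtml(node)
    const attributes = Object.entries(node.attributes)
        // Assumption: handlers can't live in a string, so drop functions.
        .filter(([, value]) => typeof value !== 'function')
        .map(([key, value]) => ` ${key}="${escapeHtml(value)}"`)
        .join('')
    const children = node.children.map(toHtml).join('')
    return `<${node.tag}${attributes}>${children}</${node.tag}>`
}

const f = () => {}
const button = {
    tag: 'button',
    attributes: { id: 'baz', class: 'foo bar', data: '1', onclick: f },
    children: ['Hello', { tag: 'em', attributes: {}, children: ['there'] }],
}

const html = toHtml(button)
// <button id="baz" class="foo bar" data="1">Hello<em>there</em></button>
```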

The most terse html description language I've worked with was jade, here's an example:

html(lang="en")
    head
        title= pageTitle
    body
        h1.some-class Jade - node template engine
        #container
        - if (youAreUsingJade)
            You are amazing

This seems nice, but the main problem is that we are a bit confined in our programming constructs. For example, how would we make the <h1> have the additional class hidden on some condition? Jade gets around this by having 3 different ways of specifying classes. Rather than come up with a special templating language, let's just slightly extend vanilla js syntax.
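
As a small illustration of why plain js helps here: if a node is just data, the conditional class is ordinary array code. This is a sketch using the object shape from earlier, where heading and isHidden are made-up names:

```javascript
// Hypothetical example: an <h1> whose class list depends on a condition.
const heading = isHidden => ({
    tag: 'h1',
    attributes: {
        // Build the class list with normal array code, then join it.
        class: ['some-class', isHidden && 'hidden'].filter(Boolean).join(' '),
    },
    children: ['Jade - node template engine'],
})

const visible = heading(false)  // class is "some-class"
const hidden = heading(true)    // class is "some-class hidden"
```

No special class syntax needed, just the language we already have.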

Extending the object syntax

Right, let's have a go with the example from above. I'm not going to put too much weight on the correctness of this as I'm not suggesting we change all our code, only a "what might've been".

<button id="baz" class="foo bar" data="1" onclick=f>
    Hello
    <em>there</em>
</button>

Would instead be:

button{id: 'baz' class: ['foo' 'bar'] data: '1' onclick: f
    'Hello '
    em{'there'}
}

Or formatted longhand:

button{
    id: 'baz'
    class: ['foo' 'bar']
    data: '1'
    onclick: f
    'Hello '
    em{'there'}
}

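So, a checklist of things to allow existing js object syntax to represent html nodes in a reasonably succinct way, going by the examples above: allow a tag name directly before the opening {, make the commas between entries optional, and treat entries without a key as children.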

That's it. Our javascript would use this as the main data type (everything else would remain the same). Probably, we'd start (re)writing all our XML-ish html to also use this syntax.

Toy examples

In the browser:

const someUl = document.getElementById('some-ul')

const myLi = name => li{'My name is ' em{name}}

someUl.~.push(myLi('Tommy'))

An express route:

const names = ["Barry", "Lynette"]

const myLi = name => li{'My name is ' em{name}}
const myUl = names => ul{class: ['my-ul'] ...names.map(myLi)}

app.get('/', (req, res) => res.send(myUl(names).asHTMLStr()))

Instead of this page's html:

body{class: "language-javascript"
    a{href: "index.html" img{style: "height:2em" src: "pic.png"} "⇦"}
    h1{"What if we'd had better " code{"html"} "-in-" code{"js"} " syntax all along?"}
    p{"I have a theory that a grave mistake was made in 1995 ..."}
    ...

What if we'd always had something like this in js?

What about .jsx though?

I think it's a fine enough solution, but the fact that it's a different syntax from your standard js objects encourages people to consider the VDOM objects as "not normal data", but they are. Notation as a tool of thought innit.

Other thoughts

There are some great links relating to XML <-> Scheme equivalence/syntax here.

Our alternative history having never happened, I prefer the boring hyperscript style, where everything is "just code".
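
For anyone who hasn't met that style, here's a minimal sketch of the idea. The h helper below is written for this post in the spirit of hyperscript, it is not the actual API of the hyperscript library, and it just builds the plain node objects from earlier:

```javascript
// Hypothetical h(tag, attributes, ...children) helper: returns plain data.
const h = (tag, attributes = {}, ...children) => ({ tag, attributes, children })

const f = () => {}

// The button example from above, as ordinary function calls.
const button = h('button', { id: 'baz', class: 'foo bar', data: '1', onclick: f },
    'Hello ',
    h('em', {}, 'there')
)

// The toy list example: components are just functions, composition is just map.
const myLi = name => h('li', {}, 'My name is ', h('em', {}, name))
const myUl = names => h('ul', { class: 'my-ul' }, ...names.map(myLi))

const list = myUl(['Barry', 'Lynette'])
```

It's noisier than the imagined syntax, but the nodes are still normal data you can map, filter and pass around.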