2021-02-21
Using immutable data confers all sorts of advantages in practice.
This post briefly surveys the territory, then proposes a small syntactic addition to Javascript that would help constrain mutability, while having a clear transition path from existing code (no funky FP stuff). The examples are in Javascript, but the same concept could be ported to Python and other high level languages with mutable data.
If you're already familiar with immutable libraries in Javascript, feel free to jump to the proposal.

The most well known immutable library, Immutable.js, gives us a smorgasbord of new data structures; using them in practice can be a bit clunky:
const { Map } = require('immutable')

const map1 = Map({a: 1, b: 2, c: 3})
const map2 = map1.set('b', 50)
Once you start updating large nested data structures, more gnarliness ensues. In introducing a non-native data type we lose a number of syntactic/typing niceties - it seems that for most people, using exotic new data types is not worth the hassle.
A more recent library that aims to solve these problems is immer. Immer is really neat as it sidesteps the need for new data types - I even wrote a half-baked Python port of it.
The example above becomes:
const map1 = { a: 1, b: 2, c: 3 }
const map2 = produce(map1, draft => {
draft.b = 50
})
Immer is a really cool idea and is designed well to work with contemporary frontend patterns, but it suffers from two flaws:

1. draft is a mega-magicy proxy object; immer's website lists a number of pitfalls resulting from this.

2. With a library that deals in built-in data types, it's only possible to structurally share data that's passed around by reference (basically objects and arrays, not strings or numbers). If we do:
const foo = { some massive nested object }
const array1 = [foo]
const array2 = produce(array1, draft => {
    draft.push(42)
})
Then array2[0] is foo - we didn't have to duplicate anything in memory. Under the hood, array2 is just a reference to foo plus the atomic value 42.
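This kind of sharing doesn't actually need immer at all - a shallow copy in plain Javascript behaves the same way (the object names below are made up for the illustration):

```javascript
const foo = { lots: { of: { nested: 'data' } } }
const array1 = [foo]

// A shallow copy: a new outer array, but the elements are the same references
const array2 = [...array1, 42]

// array2[0] === foo - nothing was duplicated in memory
```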
However, if we did:
const array1 = [1, 2, 3, etc, 9999999]
const array2 = produce(array1, draft => {
    draft.push(42)
})
Then under the hood array2 is not a reference to array1 with 42 tacked on the end - it is a whole new array [1, 2, 3, etc, 9999999, 42].
With the kind of data used in most frontend applications, this is not so much of a problem. If however we were writing a library that did involve mucking around with rather long arrays, this is a big problem vis-a-vis memory usage.
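To see why engine-level support would help, here's a toy sketch of a structurally shared append - the kind of representation an interpreter could use internally, where the "new array" is just a reference to the old one plus the appended value (pushShared and getAt are made-up names for this sketch):

```javascript
// Toy persistent append: shares the original array instead of copying it
const pushShared = (arr, value) => ({ base: arr, extra: value, length: arr.length + 1 })
const getAt = (shared, i) => (i < shared.base.length ? shared.base[i] : shared.extra)

const array1 = [1, 2, 3]
const array2 = pushShared(array1, 42)
// array2.base === array1: only one extra slot was allocated for the push
```

Real persistent vectors (as in Clojure) use trees of chunks for the same effect while keeping fast indexing, but the memory-sharing idea is the same.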
As a thought experiment, let's imagine what would happen if we baked the immer concept into the language...
The proposed syntax consists of one new keyword, alter:
const map1 = { a: 1, b: 2, c: 3 }
const map2 = alter (map1) {
map1.b = 50
}
Two rules apply:

1. Mutating map1 within the alter block applies only within the lexical scope of that block.

2. The alter expression evaluates to the altered version of the data; the original map1 is left untouched.
The first rule is important. Let's imagine we've added a --disallow-mutating flag to node and attempted to run the following:
function nastyMutator(m){
m.b += 1
}
const map1 = { a: 1, b: 2, c: 3 }
const map2 = alter (map1) {
nastyMutator(map1)
}
The interpreter would raise an error at parse time for the second line (m.b += 1): with the flag enabled, mutation is only permitted lexically within an alter block.
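We can't get that parse-time guarantee today, but Object.freeze gives a runtime approximation of the same protection - a sketch of the idea, not the proposal itself:

```javascript
'use strict'

function nastyMutator(m) {
  m.b += 1 // writing to a frozen object: TypeError in strict mode, silent no-op otherwise
}

const map1 = Object.freeze({ a: 1, b: 2, c: 3 })

let blocked = true
try {
  nastyMutator(map1)
  blocked = map1.b === 2 // sloppy mode: the write was silently ignored
} catch (e) {
  blocked = e instanceof TypeError // strict mode: the write threw
}
// either way, map1.b is still 2
```

The difference is that freezing catches the mutation only when it happens at runtime, whereas the --disallow-mutating flag would reject the code before it ever runs.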
Note that we're still allowing reassignment, so the following is permissible:

let map1 = { a: 1, b: 2, c: 3 }
map1 = alter (map1) {
map1.b = 50
}
The block can just be the next single statement, as with if and for:
map1 = alter (map1) map1.b = 50
Standard unpacking syntax would enable doing many things at once:
[c, d] = alter ([a, b]) {
    a.foo = 1
    b.bar = 2
}
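Without alter, the same multi-value update can be approximated today with spreads - a rough equivalent with placeholder objects a and b (it only works this neatly for shallow updates):

```javascript
const a = { foo: 0 }
const b = { bar: 0 }

// New versions with the changes applied; the originals are untouched
const [c, d] = [{ ...a, foo: 1 }, { ...b, bar: 2 }]
```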
We solve the problems we had with using specialist data types and with immer: everything is a plain native data structure with plain native syntax, and outside of the alter block the variable we specified at the beginning is just a plain ol' value - no pitfalls resulting from proxy magic.

And since the interpreter itself now tracks the alterations, it can structurally share data regardless of how that data is passed around:
const array1 = [1, 2, 3, etc, 9999999]
const array2 = alter (array1) {
array1.push(42)
}
Now let's think long term: imagine uptake is high - there's no need to install any libraries and converting existing code is easy, so why not? What happens then?
We can reach the stage where all the mutating in our codebase is contained within alter blocks - we can check this by running with our --disallow-mutating flag from earlier. (Maybe we add a mutate keyword for specific circumstances so we can do e.g. mutate this.linepos += 1.)
Now that all our data outside of alter blocks is immutable, we can:
As the syntax is lifted straight from immer, we can trivially update their reducer pattern example:
const byId = (state, action) => alter (state) {
switch (action.type) {
case RECEIVE_PRODUCTS:
action.products.forEach(product => {
state[product.id] = product
})
}
}
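For comparison, here's what that reducer has to look like today without alter or immer, using plain spreads (RECEIVE_PRODUCTS is stubbed as a string for the sketch):

```javascript
const RECEIVE_PRODUCTS = 'RECEIVE_PRODUCTS' // stubbed for the sketch

const byId = (state, action) => {
  switch (action.type) {
    case RECEIVE_PRODUCTS:
      // copy the old state, then overlay an id -> product entry for each product
      return {
        ...state,
        ...Object.fromEntries(action.products.map(p => [p.id, p])),
      }
    default:
      return state
  }
}
```

Note the extra ceremony: we have to remember the default case and build the overlay object by hand, where the alter version just assigns into state.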
The Python version would look the same, just with Python's block syntax:
array1 = [1, 2, 3, etc, 9999999]
array2 = alter array1:
    array1.append(42)
Clojure has the concept of transients that feel somewhat similar to alter - I should play around with these more.
Haskellers (often, it seems) use the lens library for making deep changes to objects. I've seen the Python equivalent used, and the resulting code has always been reverted to a "native" style at some point, as it can be tricksy to read and doesn't play well with modern typed Python. An open question for me is:
Is there a pure-FP-ish way of doing deep updates that's always as "natural" as the classic mutational way?
Here's a contrived example of something that to me "feels natural" in an imperative style, but it could just be my unfamiliarity with FP stuff.
const stateAfter = alter (state) {
const unflagged = []
for (const message of state.messages){
if (message.flagged){
state.flaggedMessages.push(message)
state.totalCount -= 1
delete state.visibleUserIds[message.userId]
}
else unflagged.push(message)
}
state.messages = unflagged
}
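For what it's worth, here's one attempt at the same transition in a pure style - a sketch, and whether it reads more "naturally" than the loop above is exactly the open question:

```javascript
const transition = state => {
  const flagged = state.messages.filter(m => m.flagged)
  const unflagged = state.messages.filter(m => !m.flagged)
  const flaggedUserIds = new Set(flagged.map(m => String(m.userId)))
  return {
    ...state,
    messages: unflagged,
    flaggedMessages: [...state.flaggedMessages, ...flagged],
    totalCount: state.totalCount - flagged.length,
    visibleUserIds: Object.fromEntries(
      Object.entries(state.visibleUserIds).filter(([id]) => !flaggedUserIds.has(id))
    ),
  }
}
```

It's point-by-point equivalent, but the single pass over the messages has become several, and the "what changed" is scattered across the returned object rather than expressed as a sequence of steps.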