A robust virtual DOM for Elm.
Important
This project has gone from heavy development to a testing stage. It is not ready for production yet.
To help test this project, you need to know:
This repo:
- Documents the changes I’ve made in forks of DOM related core Elm packages.
- Has a script for testing those forks in your own project.
- Is where you can open issues for things you find while testing my forks.
- Doesn’t break because of browser extensions or third party scripts. elm/html#44 elm/browser#121 elm/browser#66 elm/virtual-dom#147
- Supports Google Translate, and other web page translators (Firefox, Safari).
- Fixes the infamous `Html.map` bug. elm/virtual-dom#105 elm/virtual-dom#162 elm/virtual-dom#171 elm/virtual-dom#166 (PR) elm/html#160 elm/compiler#2069
- Fixes a lot of other issues.
- Makes hydrating of server-side rendered HTML usable (good for elm-pages).
See also Detailed descriptions of all the changes.
- Remove virtual DOM related hacks and workarounds in your code base (they shouldn’t be needed anymore).
- Read Are the forks drop-in replacements? below.
- Review the compatibility with tooling around Elm you use (if any).
- Make sure that you are using these exact versions (check your elm.json):
- elm/virtual-dom: 1.0.3
- elm/html: 1.0.0
- elm/browser: 1.0.2
- Download lydell.bash and follow the instructions inside.
- Verify that the forked code actually is used. Open a built Elm JS file and search for `_VirtualDom_wrap`. If it’s there, it worked.
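The last check can also be scripted. Here is a minimal sketch for Node.js. The function name `usesForkedVirtualDom` and the path `build/elm.js` are made up for illustration, so point it at your own build output. It assumes an unminified build, since a minifier may rename the function.

```javascript
const fs = require("node:fs");

// Hypothetical helper: true if the compiled Elm JS mentions `_VirtualDom_wrap`,
// the name this README says to search for.
function usesForkedVirtualDom(compiledJs) {
  return compiledJs.includes("_VirtualDom_wrap");
}

// Example usage (the path is an assumption, use your own build output):
// console.log(usesForkedVirtualDom(fs.readFileSync("build/elm.js", "utf8")));
```

This only tells you that the fork of elm/virtual-dom ended up in the build. It says nothing about the other two forks.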
Hacks and workarounds you might want to remove:
- This Google Translate workaround:
HTMLFontElement.prototype.replaceData = function replaceData(_0, _1, text) {
  this.parentNode?.replaceChild(document.createTextNode(text), this);
};
- This tag for disabling Google Translate altogether:
<meta name="google" content="notranslate">
- This attribute for disabling Grammarly:
data-gramm_editor="false"
- Other attributes you might use for disabling problematic browser extensions.
- The patch from jinjor/elm-break-dom.
- Other “render in a div instead of body” patches you might have. If you do any patching at all, review it.
As close to drop-in as they can be. The forks don’t change the Elm interface at all (no functions or types are added, changed or removed). All behavior should be equivalent, except less buggy. Performance should be unchanged.
You need to keep two things in mind:
- With my forks, Elm no longer “empties” the element you mount the Elm application on, unless you put `data-elm` attributes on each child element. This is an unfortunate breaking-change-territory thing that I simply could not find a better solution to. Luckily, it’s usually pretty easy to handle, you just need to know about it.
- Hyrum’s Law. The DOM is full of complex details. Even a small fix could accidentally break something for you.
To be more compatible with third-party scripts, my forks change how Elm “virtualizes” elements. My fork only virtualizes elements with the `data-elm` attribute (instead of all child elements), and lets any other elements be. It often felt like Elm would “empty” your element on first render, but that’s not actually the case. It “virtualizes” the element, and then updates it to match your `view` function. That often results in whatever was there already being removed, but if you happened to already have an element of the right type in the right place, it would just be mutated to match `view`.
This results in:
- If you have an element like `<p>Looks like JavaScript hasn’t run?</p>` and expect Elm to remove it, that won’t happen now. To fix, you can put `data-elm` on it: `<p data-elm>`.
- If you use CSS selectors like `body > :first-child`, `body > :last-child` or `h1 + p`, they might not apply, since your `<script>` tags in `<body>` might still be around, and might be mixed with the elements rendered by Elm.
- If you do server-side rendering and expect Elm to hydrate/virtualize/adopt/take charge over the server-side rendered HTML, you need to make sure that all elements (except the root element) have `data-elm`. If you create the HTML with Elm (such as with elm-pages), you’ll get this automatically when you use my forks. But if you create elements some other way, make sure they have `data-elm`.
Comprehensive `Browser.application` example
Given this HTML:
<!DOCTYPE html>
<html>
<head>
<title>My app</title>
</head>
<body>
<script src="/elm.js"></script>
<script>
Elm.Main.init();
</script>
</body>
</html>
You’d previously end up with this after Elm initializes:
<body>
<h1>Welcome to my app!</h1>
<p>It is a very cool app.</p>
</body>
With my forks you end up with:
<body>
<h1>Welcome to my app!</h1>
<script src="/elm.js"></script>
<p>It is a very cool app.</p>
<script>
Elm.Main.init();
</script>
</body>
Why are the script tags kept? And why are they mixed with the other elements like that?
The script tags are kept because preserving all elements inside the mount element is the new strategy. You probably don’t care about those script tags (they aren’t needed anymore after their scripts have loaded), but you probably care about elements inserted by third-party scripts. For example, your page might look like this before your Elm application initializes:
<!DOCTYPE html>
<html>
<head>
<title>My app</title>
<!-- Inserts the iframe at the bottom. -->
<script src="https://example.com/chat-widget.js"></script>
</head>
<body>
<script src="/elm.js"></script>
<script>
Elm.Main.init();
</script>
<iframe src="https://example.com/chat.html"></iframe>
</body>
</html>
Many times, it’s difficult to predict or ensure which order scripts run in. Maybe you’re lucky and 99 % of the time Elm initializes before the chat widget. The remaining 1 % of visitors don’t get any chat. With my forks, it doesn’t matter which order things load. The chat won’t be nuked.
But why do the elements mix with the script tags like that? It’s because of whitespace in HTML. We typically indent our HTML nicely, like we do in most programming languages. However, HTML doesn’t have a concept of whitespace and indentation, really. It just has text content inside elements. If you make a line break and a bunch of spaces for indentation in your HTML, you actually create a text DOM node with just whitespace in it. But typically you can’t see that, due to CSS. By default, whitespace is displayed collapsed as one space, and around block elements you don’t see it at all. It’s just between words and inline elements that it matters. You can control this with the CSS `white-space` property, and the `<pre>` element has `white-space` set to display all newlines and spaces by default.
Anyway, what does all that mean to the HTML in this example? Let’s look at it again:
<body>
<script src="/elm.js"></script>
<script>
Elm.Main.init();
</script>
</body>
That parses to a `body` DOM node with the following children:
- A text node containing a newline and a bunch of spaces.
- A script node (with `src`).
- A text node containing a newline and a bunch of spaces.
- A script node (with inline JavaScript).
- A text node containing a newline and a bunch of spaces.
With my fork, only text nodes and elements with `data-elm` are “virtualized”. Virtualization (sometimes called “hydration” in other frontend tech) means that Elm “adopts” the element – turns it into virtual DOM as if it was rendered by your `view` function, diffs it with the first run of `view`, and then updates it to match `view`, or removes it entirely if it shouldn’t be there at all according to `view`. So with my forks, Elm is going to virtualize all the (whitespace only) text nodes in `<body>`. The first one is going to be replaced by the `<h1>` element, the second one by the `<p>` element, and the third one is removed. But the `<script>` tags are left alone where they were. In effect we end up with this:
<body><h1>Welcome to my app!</h1><script src="/elm.js"></script><p>It is a very cool app.</p><script>
Elm.Main.init();
</script></body>
If you really don’t like those script tags being mixed in there, you could add `data-elm` to them:
<body>
<script data-elm src="/elm.js"></script>
<script data-elm>
Elm.Main.init();
</script>
</body>
Then you’d end up with:
<body><h1>Welcome to my app!</h1><p>It is a very cool app.</p></body>
Or, you could experiment with not having script tags in `<body>` at all, but it’s beyond the scope of this documentation to dive into the pros and cons and differences between different script tag setups.
You can control where your Elm elements end up by:
- Being careful with whitespace.
- Adding the `data-elm` attribute to elements that you would like to “adopt”. If you think about it, that often results in them being removed.
- Rendering just a single element in your `view`: `{ title = "My app", body = [ justOneItemHere ] }`.
All this `data-elm` business is needed because Elm decided to take control of the whole `<body>` node. In reality, most pages have more scripts that also want their piece of `<body>`. Because of that, Elm can’t know which elements it should care about when it initializes. That’s extra important if you server-side render, such as with elm-pages. The best solution I’ve found to this problem is:
- Assume all text nodes should be virtualized.
- Mark all elements that should be virtualized with `data-elm`.
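The two rules above can be sketched as a small predicate. This is an illustration of the rule as described here, not the code in the fork. The helper names (`shouldVirtualize`, `fakeText`, `fakeElement`) are made up, and the nodes are plain objects that mimic just enough of the DOM API to run outside a browser.

```javascript
const ELEMENT_NODE = 1; // Same value as Node.ELEMENT_NODE in browsers.
const TEXT_NODE = 3; // Same value as Node.TEXT_NODE in browsers.

// Hypothetical predicate: should Elm "adopt" this child of the mount element?
function shouldVirtualize(node) {
  if (node.nodeType === TEXT_NODE) {
    return true; // Rule 1: all text nodes are virtualized.
  }
  // Rule 2: elements are virtualized only if they are marked with data-elm.
  return node.nodeType === ELEMENT_NODE && node.hasAttribute("data-elm");
}

// Stand-ins for real DOM nodes.
function fakeText(text) {
  return { nodeType: TEXT_NODE, textContent: text };
}
function fakeElement(tag, attributeNames) {
  return {
    nodeType: ELEMENT_NODE,
    tagName: tag,
    hasAttribute: (name) => attributeNames.includes(name),
  };
}

// The children of <body> from the Browser.application example.
const bodyChildren = [
  fakeText("\n    "),
  fakeElement("script", ["src"]),
  fakeText("\n    "),
  fakeElement("script", []),
  fakeText("\n  "),
];
const adopted = bodyChildren.filter(shouldVirtualize);
```

In this model `adopted` holds only the three whitespace text nodes, which matches the description above: the text nodes are replaced or removed to match `view`, while the script tags stay where they are.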
Comprehensive `Browser.element` example
Given this HTML:
<!DOCTYPE html>
<html>
<head>
<title>My app</title>
</head>
<body>
<div id="root">
<p>Looks like JavaScript hasn’t run?</p>
</div>
<script src="/elm.js"></script>
<script>
Elm.Main.init({ node: document.getElementById("root") });
</script>
</body>
</html>
You’d previously end up with this after Elm initializes:
<body>
<div>
<h1>Welcome to my app!</h1>
<p>It is a very cool app.</p>
</div>
<script src="/elm.js"></script>
<script>
Elm.Main.init({ node: document.getElementById("root") });
</script>
</body>
With my forks you end up with:
<body>
<div>
<h1>Welcome to my app!</h1>
<p>Looks like JavaScript hasn’t run?</p>
<p>It is a very cool app.</p>
</div>
<script src="/elm.js"></script>
<script>
Elm.Main.init({ node: document.getElementById("root") });
</script>
</body>
Probably the easiest solution in this case is to slap `data-elm` on that original `<p>` tag:
<body>
<div id="root">
<p data-elm>Looks like JavaScript hasn’t run?</p>
</div>
<script src="/elm.js"></script>
<script>
Elm.Main.init({ node: document.getElementById("root") });
</script>
</body>
Then it’ll work like without my forks. See the “Comprehensive `Browser.application` example” above for why this happens, and why `data-elm` makes a difference.
Server-side rendering notes
When server-side rendering, make sure that your HTML doesn’t have any extra whitespace, and that all elements have `data-elm`:
<!DOCTYPE html>
<html>
<head>
<title>My app</title>
</head>
<body><h1 data-elm>Welcome to my app!</h1><p data-elm>It’s a <em data-elm>very</em> cool app.</p>
<script src="/elm.js"></script>
<script>
Elm.Main.init();
</script>
</body>
</html>
Note how there is no line break after `<body>` and how all the elements rendered by Elm are on just one ugly line, and how all of those elements have `data-elm`.
Having no extra whitespace directly after `<body>` is extra important, since such whitespace would make the first diff with `view` off by one and effectively re-render the whole page. This is because my forks assume that all text nodes should be virtualized. See the “Comprehensive `Browser.application` example” above for more details around this.
It’s OK to have extra elements before or after the Elm elements. It’s also OK to have extra whitespace after all the Elm elements.
`data-elm` should appear automatically on every element rendered by Elm just from using my forks, if your server-side setup works by turning the return value from `view` straight into an HTML string, like elm-pages does. But if you render the HTML via for example JSDOM, it won’t appear. The `data-elm` attribute is only present in the virtual DOM data (causing elm-pages to print it), but is skipped during rendering in my forks, to not clutter the browser devtools.
elm-watch
TL;DR: Any version should work.
For hot reloading purposes, elm-watch replaces some functions that I’ve also changed in my forks, losing the changes made in my forks. This affects two things:
- When virtualizing `<a>` elements, they won’t get their click listener, resulting in them causing full page reloads instead of being routed by Elm. I don’t think that many people use both server-side rendering and elm-watch though. And in elm-watch 1.1.4+, 1.2.2+ and 2.0.0-beta.6+, I’ve actually added in the missing pieces so that this will work. A caveat here is that if you install my fork of the virtual-dom package, but not my fork of the browser package, you’ll get my forked browser experience during development with elm-watch anyway, but not in production builds. Having something work during development but not in production sucks, but I don’t see any reason for someone not installing all three of my forks.
- When clicking on an `<a>` element without the `href` attribute, it’ll be routed by Elm, missing out on my fix where nothing should happen instead. I don’t have a solution to this problem yet. I could include this fix for everyone, but I think that would be misleading (even worse than the above caveat). Production-only bugs suck.
Elm Land
TL;DR: Any version should work.
Elm Land uses elm-watch code under the hood, so basically the same applies there. When the pull request for using elm-watch-lib is merged, Elm Land will get the elm-watch 1.1.4+/1.2.2+/2.0.0-beta.6+ behavior with virtualized `<a>` tags as mentioned in the elm-watch section above.
elm-pages
TL;DR: Any version should work, but to get the full experience you need pull request #512 and pull request #519, which should be released in whatever elm-pages version comes after 3.0.20.
Without the two pull requests mentioned above, the following caveats apply (read the No longer empties the mount element section for why):
- elm-pages 3.0.20 renders extra whitespace nodes in `<body>`, causing the first diff with `view` to be off, leading to basically the entire page being re-rendered. That’s not worse than without my forks though: Without my forks your elm-pages app re-renders the entire page anyway due to `Lazy` and `Keyed` nodes (one of the things fixed in my forks).
- You’ll end up with an extra `<div data-url>` element in `<body>`. I’m not sure what that affects.
- You’ll end up with an extra `<div aria-live>` element in `<body>`. That should be fine, since it will stay unchanged. `aria-live` only announces changes to the DOM.
Lamdera
TL;DR: You need to compile Lamdera yourself with pull request #40. Note: This only applies if you actually use Lamdera. Not if you just use Lamdera as an alternative Elm compiler, like elm-pages does.
Lamdera copies some functions from elm/virtual-dom, to make modifications to them. My fork of elm/virtual-dom also changes those functions. The pull request mentioned above copies those changes, and supports both the original version and my fork. It also adds `data-elm` to an element that `lamdera live` expects to disappear when Elm initializes.
elm-portal
TL;DR: You need to add one thing to the elm-portal code.
Chrome has shipped support for the new `Element.prototype.moveBefore` method, which allows moving an element on the page without “resetting” it (scroll position, animations, loaded state for iframes and video, etc.). My fork of the virtual-dom package uses this method if available. This means that `moveBefore` needs to be proxied by the elm-portal class. All you need to do is add this:
moveBefore(...args) {
return this._targetNode.moveBefore(...args);
}
Oh, and one more thing. Don’t make a keyed elm-portal (like `Html.Keyed.node "elm-portal"`). That doesn’t work. You can have a keyed element inside the portal, but you can’t make the elm-portal element itself keyed. This is because my fork of the virtual-dom package needs to check if `.parentNode` is the expected element in the keyed code, which it isn’t for the children inside the portal (since they have been sneakily moved to the portal).
elm-optimize-level-2
TL;DR: I don’t know.
I have never used elm-optimize-level-2. But I suspect that supportArraysForHtml might conflict with the changes in my fork.
Download lydell.bash from this repo and follow the instructions inside.
The bash script:
- Instructs you to clone my forks.
- Instructs you to type in the command you use to run your app.
- Runs that command with `ELM_HOME` set to `elm-stuff/elm-home/` (local to your app) and copies files from my forks into it.
The script should work on macOS and Linux (but you might need to tweak it if you use some funky setup). On Windows, you can do the steps manually, or write your own script. Maybe share it here if you know more Windows people who might want to test?
Here are some things to look out for while testing:
- Does the page crash?
- Do elements end up in the wrong place?
- Do elements end up with the wrong style?
- Do you notice the page feeling much slower?
- Can you make Elm crash by poking around in the inspector in the browser devtools?
Here are some specific things that I have tested a bit myself, but would like to see tested more:
- Does Google Translate work? Does it display usable text after updates to the DOM? Can you find a language that breaks down?
- Does Grammarly work?
- Do other extensions work?
- Do your third-party scripts work?
I’m also looking for testing in apps with heavy use of:
- elm-pages.
- Web components/Custom elements.
- Multiple Elm apps on the same page.
- Heavy or important use of `Html.Lazy`.
- Heavy or important use of `Html.Keyed`.
- Use of elm-explorations/webgl.
- Use of elm-explorations/markdown.
- Use of elm-program-test, or any HTML based testing in elm-test.
- Crazy, weird, edge-case:y things.
It’s good to test in multiple environments:
- Different operating systems.
- Different devices (computers, tablets, phones).
- Different browsers.
When you’re done testing, take a break and then test the same amount again, if you have the time. You’ll probably find a bug when doing the most unexpected little thing.
There are three main ways to provide feedback:
- Open an issue in this repo about a problem.
- Open an issue in this repo about a successful (problem free) test. Mention what you tested and how it went.
- Chat in the `#elm-virtual-dom` channel on the Incremental Elm Discord.
If you encounter a bug, it would be very helpful if you could:
- Save any stack traces and error messages you see, and take a screenshot.
- Try to reproduce the bug. If you can’t, you can still mention it on Discord for example, and I might have a clue since I know all the details of the code.
- Reduce your app down to the minimum that reproduces it. That is quite boring and tedious, but it helps a lot!
  - A small example is way easier to debug than a large page.
  - You save my time.
  - You might not be allowed to share your production code.
- The well-known js-framework-benchmark includes Elm. I ran that Elm benchmark on my computer, with and without my forks, and got the same numbers (no significant difference).
- When testing with some large Elm applications at work, I couldn’t tell any performance difference with the forks.
- Both the official elm/virtual-dom and my fork have `O(n)` complexity.
- The official elm/virtual-dom algorithm sometimes does more work, but other times my fork does more work. It seems to even out.
You can install just one of them if you want. To avoid my patches being breaking changes, I’ve taken care to make that work.
If you do, nothing should break, and nothing should be rendered wrongly or so. You just won’t get the full experience. Some features might only be half-fixed. You might run into Hyrum’s Law though, where your app happened to work due to a bug, which when fixed by the fork you installed, uncovered another bug which is not fixed since you didn’t install one of the other forks.
For example, if you server-side render HTML and only install my fork of the virtual-dom package, your page should still look correct, but Elm might unnecessarily re-apply lots of attributes during the first render, and your `<a>` elements might be lacking their click listener so that they cause full page reloads instead of being routed by Elm.
The main changes are in my elm/virtual-dom fork. The changes in elm/html and elm/browser are much smaller, and are more of “side details” than the main thing.
Changes:
- A new algorithm for pairing virtual DOM nodes with the corresponding real DOM node, that should be robust against browser extensions and third-party scripts. It is in the code for the old algorithm where most crashes happen. elm/html#44 elm/browser#121 elm/browser#66 elm/virtual-dom#147
- Support for the page being translated, for example Google Translate. This requires the above change (being robust), and then a little bit of extra code to make sure that translated text isn’t left behind on the page, and so that translated text that should change actually updates.
- The `Html.map` bug where messages of the wrong type can appear in `update` functions is fixed, by completely changing how `Html.map` works. The old code was incredibly difficult to understand, but could theoretically skip more work in some cases. The new code is instead very simple, leaving little room for errors. elm/virtual-dom#105 elm/virtual-dom#162 elm/virtual-dom#171 elm/virtual-dom#166 (PR) elm/html#160 elm/compiler#2069
- Improved `Html.Keyed`. The algorithm is slightly smarter without losing performance, and uses the new `Element.prototype.moveBefore` API, if available, which allows moving an element on the page without “resetting” it (scroll position, animations, loaded state for iframes and video, etc.). elm/virtual-dom#175 elm/virtual-dom#178 elm/virtual-dom#183
- “Virtualization” is now completed, making it usable in practice, for example for elm-pages. This means that server-side rendered pages no longer have to redraw the whole page when Elm initializes.
- CSS custom properties, like `--primary-color`, can now be set with `Html.Attributes.style "--primary-color" "salmon"`. elm/html#177
- `Svg.Attributes.xlinkHref` no longer mutates the DOM on every single render, which caused flickering in Safari sometimes. elm/virtual-dom#62
New DOM node pairing algorithm
The original elm/virtual-dom works like this:
1. Run `view`. This produces the new virtual DOM.
2. Diff the output of `view` with the output from `view` last time (the old virtual DOM). This produces a list of patches.
3. Walk the DOM using the old virtual DOM as a guide, since the DOM is supposed to look like the old virtual DOM, since that’s how we rendered it a while ago. Attach actual DOM nodes to each patch. This counts the elements in a clever way, so that large parts of the DOM can be skipped if there aren’t any changes there.
4. Apply all the patches.
It’s step 3 that is problematic. It assumes that the DOM is unchanged since our last render. Unfortunately, that isn’t always the case due to browser extensions, third-party scripts and page translators like Google Translate.
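To make the failure mode concrete, here is a deliberately simplified sketch. It is not the elm/virtual-dom code: the “DOM” is a plain array, and a patch finds its target by position, which stands in for “walking the DOM using the old virtual DOM as a guide”. All names are made up for illustration.

```javascript
// Hypothetical, simplified model of position-based patching.
function applyPatchByIndex(domChildren, patch) {
  const target = domChildren[patch.index];
  target.text = patch.newText;
  return target;
}

const heading = { tag: "h1", text: "Welcome to my app!" };
const paragraph = { tag: "p", text: "It is a very cool app." };
const dom = [heading, paragraph];

// The diff decided that child 1 (the <p>) should get new text.
const patch = { index: 1, newText: "It is an even cooler app." };

// Before the patch is applied, a browser extension inserts its own element first.
dom.unshift({ tag: "div", text: "Injected by an extension" });

// The position no longer matches: the patch lands on the <h1> instead of the <p>.
const mutated = applyPatchByIndex(dom, patch);
```

Here `mutated` is the heading object, and the paragraph still has its old text. In the real package a mismatch like this can be worse than a wrong update, since the code may try to use a node that is not of the expected kind.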
I’ll explain the new algorithm by starting from a simple conceptual model, building up to the full complexity. A bit like how atoms have multiple, simplified models that are true enough to understand different parts of chemistry.
First, what does the `Html msg` type look like in Elm? It’s actually implemented in JavaScript, but if it were Elm, it would look a bit like this:
type Html msg
    = Text String
    | Element String (List (Attribute msg)) (List (Html msg))
    | Keyed String (List (Attribute msg)) (List ( String, Html msg ))
    | Map (a -> msg) (Html a)
    | Lazy (b -> Html msg) b
(Hand-waving away that the `a` and `b` type variables aren’t declared.)
The first important thing to realize is that `Html.map` does not “do” anything straight away. It is represented as another node in the tree, and your mapping function is applied later. A `Map` node is not associated with any DOM node – it’s a wrapper around another node which in turn might be.
`Html.Lazy` also creates new nodes wrapping other nodes. It isn’t associated with a DOM node either.
But `Text` (`Html.text`), `Element` (for example `Html.div`) and `Keyed` (for example `Html.Keyed.ul`) all are associated with a DOM node. Each one of those is supposed to have (or get) a corresponding DOM node on the page. The idea is to store the DOM node on the virtual DOM node itself, so it always has a reference to it.
As I said before, the `Html msg` type isn’t implemented in Elm, but in JavaScript. So an `Element` looks conceptually like this:
var node = {
$: "Element",
tag: "div",
attributes: [],
children: [],
};
The idea is to extend that to:
var node = {
$: "Element",
tag: "div",
attributes: [],
children: [],
domNode: undefined,
};
Then, when we diff the old and new virtual DOM, we would diff these two:
var oldVirtualNode = {
$: "Element",
tag: "div",
attributes: [],
children: [],
domNode: div, // Reference to a div on the page.
};
var newVirtualNode = {
$: "Element",
tag: "div",
attributes: [{ name: "id", value: "my-id" }],
children: [],
domNode: undefined,
};
The diffing algorithm is basically unchanged in my fork. It will detect that in this case, we should set the `id` attribute to `my-id`. The original elm/virtual-dom then creates a patch object describing this. My fork instead immediately applies the change to the `domNode` reference of the old virtual DOM object. It then sets `newVirtualNode.domNode = oldVirtualNode.domNode`.
This means that my version never needs to walk the DOM to find which DOM node to apply the patch to. It just has a reference to the correct DOM node. Then it doesn’t matter what a browser extension does to the page, we can still perform the mutation.
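As a rough sketch of that idea (not the fork’s actual code), the function below diffs only attributes, applies each change straight to `oldVirtualNode.domNode`, and then hands the reference over to the new virtual node. The name `diffAndApplyAttributes` is made up, and `createFakeDiv` is a stand-in for a real `<div>` so that the sketch runs outside a browser.

```javascript
// Stand-in for a real DOM element, with just the two methods the sketch needs.
function createFakeDiv() {
  return {
    attrs: new Map(),
    setAttribute(name, value) {
      this.attrs.set(name, value);
    },
    removeAttribute(name) {
      this.attrs.delete(name);
    },
  };
}

// Hypothetical helper: apply attribute changes immediately, no patch objects.
function diffAndApplyAttributes(oldVirtualNode, newVirtualNode) {
  const domNode = oldVirtualNode.domNode;
  const remaining = new Map(
    oldVirtualNode.attributes.map((attr) => [attr.name, attr.value])
  );
  for (const { name, value } of newVirtualNode.attributes) {
    if (remaining.get(name) !== value) {
      domNode.setAttribute(name, value); // Added or changed attribute.
    }
    remaining.delete(name);
  }
  for (const name of remaining.keys()) {
    domNode.removeAttribute(name); // Attribute only present in the old node.
  }
  // The new virtual node now owns the reference to the same DOM node.
  newVirtualNode.domNode = domNode;
}

const div = createFakeDiv();
const oldVirtualNode = { $: "Element", tag: "div", attributes: [], children: [], domNode: div };
const newVirtualNode = {
  $: "Element",
  tag: "div",
  attributes: [{ name: "id", value: "my-id" }],
  children: [],
  domNode: undefined,
};
diffAndApplyAttributes(oldVirtualNode, newVirtualNode);
```

After the call, the fake div has `id` set to `my-id` and `newVirtualNode.domNode` points at that same div. Nothing in the sketch looks at the position of the element on the page.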
Now to the first complication of this beautifully simple model. Imagine making this constant in your Elm code and then rendering it in a bunch of different places:
editIcon : Html msg
editIcon =
    Html.img [ Html.Attributes.src "/edit.svg" ] []
view : Model -> Html msg
view model =
    Html.div []
        [ editIcon
        , editIcon
        ]
In a more realistic example, there’d be something next to the edit icons as well, and they’d be buttons that edit those things, but let’s keep it simple. What happens when you refer to `editIcon` multiple times? Well, the above will be represented roughly like this in JavaScript:
var editIcon = {
$: "Element",
tag: "img",
attributes: [{ name: "src", value: "/edit.svg" }],
children: [],
domNode: undefined,
};
var view = function (model) {
return {
$: "Element",
tag: "div",
attributes: [],
children: [editIcon, editIcon],
domNode: undefined,
};
};
`editIcon` is represented by a JavaScript object. In `view` we point to it twice. The `children` array in `view` contains the same `editIcon` object twice. And they are not just equal, they are the same reference.
But in the actual DOM, they need one unique DOM node each. The exact same DOM node cannot be inserted at multiple places in the DOM. If you try to insert the same DOM node twice, it moves the DOM node there.
That’s a problem because when we render the first `editIcon` for the first time, we create an `img` DOM node, and then store it on the virtual DOM node itself. So the `editIcon` object would mutate to:
var editIcon = {
$: "Element",
tag: "img",
attributes: [{ name: "src", value: "/edit.svg" }],
children: [],
domNode: img, // Reference to the `img` DOM node we just made.
};
But then we go and render `editIcon` the second time (remember, `view` refers to it twice). Then we create a new `img` DOM node again, and then overwrite `editIcon.domNode` with the new one, losing the other one. Oops!
The way I solved this is by having an array of DOM nodes on the virtual DOM node instead. So `editIcon` would start out like this:
var editIcon = {
$: "Element",
tag: "img",
attributes: [{ name: "src", value: "/edit.svg" }],
children: [],
domNodes: [],
};
And after the first render it would become:
var editIcon = {
$: "Element",
tag: "img",
attributes: [{ name: "src", value: "/edit.svg" }],
children: [],
domNodes: [img1, img2], // References to the two DOM nodes on the page.
};
This `editIcon` constant has one more quirk to it. While most virtual DOM nodes are created during each `view` and then garbage collected, the `editIcon` constant is … constant. It’s the same virtual DOM node reference, render after render. That means that when we diff, `oldVirtualDomNode === newVirtualDomNode`. The old and new virtual DOM node are … the exact same object. Which means that the `domNodes` array is the same array too – mutating one mutates the other. For this reason, I add more than just `domNodes` to the virtual nodes:
var editIcon = {
$: "Element",
tag: "img",
attributes: [{ name: "src", value: "/edit.svg" }],
children: [],
// We only read from `x.oldDomNodes`. Uses `i`. Is set to `newDomNodes` at each render.
oldDomNodes: [],
// This is set to a new, empty array on each render. We push to `y.newDomNodes`.
newDomNodes: [],
// We have a global render counter. By comparing it with this number, we know if we have encountered a virtual DOM node for the first time during a render.
renderedAt: 0,
// The index of the next DOM node in `oldDomNodes` to use.
i: 0,
};
So, we keep track of both the old DOM nodes and the new DOM nodes, and have an index `i` which points to how far into `oldDomNodes` we have gotten. `renderedAt` is used to “reset” between renders. At the end of a render, `i` will be at least `1` for all virtual DOM nodes, and `newDomNodes` will contain at least one DOM node. We could then go through the entire virtual DOM again, to reset `i` back to 0, move `newDomNodes` to `oldDomNodes`, and set `newDomNodes` to a new empty array. But that would require us to traverse the whole thing one additional time. Instead, we increment a global counter right before render. If `renderedAt !== globalRenderCount`, it means that we should reset and set `renderedAt = globalRenderCount` before incrementing `i` etc.
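Here is a minimal sketch of that lazy reset, covering only the case where the old and new virtual DOM node are the same object (like `editIcon`). It is an illustration of the bookkeeping described above, not the fork’s actual code: `renderOccurrence`, `render` and `createFakeDomNode` are made-up names, and the fake DOM node lets the sketch run outside a browser.

```javascript
let globalRenderCount = 0;

const editIcon = {
  $: "Element",
  tag: "img",
  oldDomNodes: [],
  newDomNodes: [],
  renderedAt: 0,
  i: 0,
};

// Stand-in for document.createElement.
function createFakeDomNode(tag) {
  return { tag };
}

// Called once per occurrence of `virtualNode` in the tree during a render.
function renderOccurrence(virtualNode) {
  if (virtualNode.renderedAt !== globalRenderCount) {
    // First encounter in this render: reset here instead of in an extra pass.
    virtualNode.oldDomNodes = virtualNode.newDomNodes;
    virtualNode.newDomNodes = [];
    virtualNode.i = 0;
    virtualNode.renderedAt = globalRenderCount;
  }
  // Reuse the DOM node from the previous render if there is one, else create it.
  const domNode =
    virtualNode.oldDomNodes[virtualNode.i] ?? createFakeDomNode(virtualNode.tag);
  virtualNode.i++;
  virtualNode.newDomNodes.push(domNode);
  return domNode;
}

// Simplified render of a flat list of children.
function render(children) {
  globalRenderCount++;
  return children.map(renderOccurrence);
}

const firstRender = render([editIcon, editIcon]);
const secondRender = render([editIcon, editIcon]);
```

The first render creates two distinct fake `img` nodes. The second render gets the same two nodes back in the same order, because `i` walks through `oldDomNodes` from the start again after the reset.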
This design imposes a rule: On each render, we always have to recurse through the entire old virtual DOM, to “discover” all uses of each virtual DOM node, and increase that `i` counter each time. This is a difference compared to the original elm/virtual-dom package that is worth mentioning from a performance perspective:
- A common case is that both the old and new virtual DOM are very similar. There might be just a single change in one place, maybe just a text change or an attribute change. In this case, the diffing algorithm will naturally visit every virtual DOM node to find this.
- When a virtual DOM node is only present in the old virtual DOM, it means that it was removed. The original elm/virtual-dom then has no need to recurse through all the children of that virtual DOM node. My fork still needs to do that though, to increment the `i` counter of every virtual DOM node inside the removed one, in case one of them is used again later. Note that this doesn’t use the regular diffing recursive function, it uses a special function that recurses the virtual DOM just for this use case as quickly as possible.
- When a virtual DOM node is only present in the new virtual DOM, it means that it was inserted. Both implementations then need to recurse through all the children of the new virtual DOM node, to render all of the elements, of course. No change there.
- When two virtual DOM nodes are for different elements (one is a `<div>`, the other is a `<p>`), both implementations bail out, by removing the old DOM node completely and then rendering the new one fresh. This is just like a removal followed by an insertion, so my fork needs to recurse through the old virtual DOM node here too, while the original elm/virtual-dom does not need that.
- Finally, lazy nodes. The first thing `lazy` does is that your Elm function won’t be called unless the arguments change. The second thing is the virtual DOM diffing. If the arguments haven’t changed, then we’ll use the same virtual DOM as last time, which then by definition is unchanged. The original elm/virtual-dom then doesn’t need to look through that virtual DOM node at all, it can just move on to the next. My fork still needs to recurse through it, for two reasons. The first one is similar to node removals: To increment the `i` counter in case a virtual DOM node used inside the lazy node is used again later. The second reason is due to `Html.map` – see the section about `Html.map`. Just like when recursing removed virtual DOM nodes, recursing lazy nodes also has a fast path function that only increments the `i` counter and makes sure event listeners are up-to-date. This means that `lazy` is slightly less lazy with my fork.
Page translation support (Google Translate)
Here’s how page translation tools work:
- Google Translate (built into Chrome): It removes all text nodes, and replaces them with <font> tags with new, translated text nodes inside.
- Firefox’s translator: It mutates the text of existing text nodes to translate them.
- Safari’s translator: It replaces all text nodes with new, translated text nodes.
All three of them can also introduce new text nodes, and even new elements, if the target language has a different word order. For example, <em>I went</em> to school. can be translated to <em>Ich</em> bin zur Schule <em>gegangen</em>. Due to different word order, another <em> element had to be introduced to preserve the text formatting.
All three of them also listen for changes to the page and translate new text as it arrives on the page.
In my fork, the general behavior is to leave unknown elements alone, and to just update the reference to the DOM node that we stored on the virtual DOM node. That works fine also for page translators – until translated text needs to update. There are two problems:
- The text node that we’re mutating the text of might not even be on the page anymore (Google Translate and Safari). But the intention of the page translator was not to remove anything – the intention was to translate everything on the page.
- We might leave a lot of stray text behind everywhere as the page changes, resulting in a very confusing page.
For these reasons, my fork detects page translators and tries to cooperate with them. The detection works like this:
- Diff two virtual DOM text nodes. Stop if the text hasn’t changed.
- Since the text has changed, the DOM text node needs to be updated. But before doing that, check if the DOM node text is equal to the text of the old virtual DOM node. If not, we have detected Firefox’s translator – then stop.
- If the DOM node text was unchanged, check if it has a parent node. If not, we have detected Google Translate or Safari’s translator – then stop. (When they remove the original text nodes, they no longer have a parent node.)
- If the DOM node had a parent node, it is probably still on the page. Update the DOM text node. No page translator.
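The detection steps above can be sketched as one small function. This is only an illustration under assumptions: diffTextNode, its return labels, and the plain objects standing in for DOM text nodes are hypothetical names, not the actual code of the fork.

```javascript
// Hypothetical sketch of the detection steps; not the fork's real code.
// `domNode` only needs `data` and `parentNode`, like a DOM Text node.
function diffTextNode(oldText, newText, domNode) {
  // Step 1: stop if the virtual DOM text hasn't changed.
  if (oldText === newText) {
    return "unchanged";
  }
  // Step 2: the DOM text is not what we rendered last time, so something
  // mutated it in place (this is how Firefox's translator behaves).
  if (domNode.data !== oldText) {
    return "translator-mutated-text";
  }
  // Step 3: the text node was detached from the page (this is how Google
  // Translate and Safari's translator behave).
  if (domNode.parentNode === null) {
    return "translator-replaced-node";
  }
  // Step 4: no page translator detected, so update the DOM text node.
  domNode.data = newText;
  return "updated";
}
```

In this sketch, the two "translator-…" results are the cases where the caller would tell the parent element to redo its text.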
If a page translator was detected, tell the parent. It will then go through its children again, both virtual DOM children and actual DOM children. It removes text node DOM children and replaces them with new ones. It also removes <font> tags not created by Elm (they are most likely created by Google Translate). While doing this, it makes sure that all the child elements are in the correct order.
The thing here is that if we detect that a text node has been removed, it most likely means that it has been replaced with a translated version. But we don’t know what DOM node or nodes on the page that replaced it, only that it or they should be somewhere in the parent element. So the only thing we can do is to tell the parent element to redo all of its text. That is also good for the word order thing: It’s better if the page translator detects a full sentence or paragraph being changed than just a word or two. There’s a chance that the parent element contains the full sentence or paragraph, but of course no guarantee. Once the page translator detects the changes, it will re-translate all of it.
This algorithm is somewhat simple and fast, but it’s not perfect due to the word order thing. There might be some leftover or misplaced text after an update. But page translators aren’t perfect in the first place, so I don’t think users of them will expect perfection. They just want a page that they can understand and that doesn’t crash.
Event listeners and Html.map
Consider the following code:
type Msg
    = ButtonClicked
    | GotSearchInput String
    | GotCommentInput Int String


view model =
    Html.div [ Html.Attributes.id "main" ]
        [ Html.button [ Html.Events.onClick ButtonClicked ]
            [ Html.text "Click me" ]
        , Html.input [ Html.Events.onInput GotSearchInput ] []
        , Html.div []
            (List.range 1 5
                |> List.map
                    (\i ->
                        Html.input [ Html.Events.onInput (GotCommentInput i) ] []
                    )
            )
        ]
Html.Attributes.id "main" is easy to diff. If both the old and new virtual DOM have the id attribute, check if they are both set to the same string. If not, update it.
Html.Events.onClick ButtonClicked is also easy to diff. ButtonClicked is just a value, so it can be compared, to know if the event listener needs to change.
What about Html.Events.onInput GotSearchInput then? GotSearchInput is a function. Functions cannot be compared in general. But this happens to be the same function reference every time. So we can check for === reference equality in JavaScript to know if the event listener needs to change.
Html.Events.onInput (GotCommentInput i) is problematic, though. It returns a new function every time due to the partial application. (A literal lambda function would also be a new function every time.) We simply can’t know when it changes, so the event listener needs to change every time.
(And, on the lowest level, an event handler is just a pair of an event name and a decoder (that results in a message). So when the original elm/virtual-dom compares your event handlers, it actually has to compare decoders. elm/json contains a hidden _Json_equality only for this reason. My fork does not need that function.)
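The reference equality point can be seen directly in JavaScript. The following is a minimal sketch with simplified stand-ins for the compiled message constructors; the object shapes are assumptions for illustration, not what the Elm compiler actually emits.

```javascript
// Simplified stand-in for a one-argument message constructor.
function GotSearchInput(text) {
  return { $: "GotSearchInput", a: text };
}

// Simplified stand-in for partial application: every call to
// GotCommentInput(i) creates a brand new function object.
function GotCommentInput(i) {
  return function (text) {
    return { $: "GotCommentInput", a: i, b: text };
  };
}

// The same function reference every render, so === can tell it is unchanged.
var sameReference = GotSearchInput === GotSearchInput; // true

// Two partial applications with the same argument are still different
// function objects, so === cannot tell that they behave the same.
var freshEachTime = GotCommentInput(1) === GotCommentInput(1); // false
```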
Then we need to introduce Html.map to the mix as well. The original elm/virtual-dom assigns domNode.elm_event_node_ref = eventNode, where eventNode is an object with a clever system of references, where different layers of Html.map can mutate chains of objects so that, when an event is triggered, all mapping functions can be applied. This system is pretty difficult to grasp, and can only be fully understood for a couple of seconds at a time. It also hides the infamous Html.map bug. All in all, this system avoids (at least theoretically) updating event listeners on every render. It also avoids having to recurse into lazy virtual nodes when they haven’t changed.
My fork takes a much simpler approach. As mentioned in the “New DOM node pairing algorithm” section, my fork needs to recurse into all virtual DOM nodes anyway, even into lazy nodes. And the diffing of event decoders often doesn’t work anyway, due to passing extra data in messages (that feels pretty common). So my fork simply updates all event listeners every render.
What does “update an event listener” mean? The naive implementation would be to do domNode.removeEventListener(eventName, oldListener); domNode.addEventListener(eventName, newListener) on every render. That would be a bit slow, though. So the original elm/virtual-dom has always had a trick up its sleeve: It keeps the same event listener as before, and just mutates a reference to the latest event decoder. That’s much cheaper, and also what my fork does.
Then finally, how does Html.map work in my fork? Internally, Elm has a sendToApp function, which is used to dispatch a message, which will call update and subscriptions and then render. Well, this is what an event listener looks like:
function callback(event) {
  var decoder = callback.decoder;
  var sendToApp = callback.sendToApp;
  var result = Json_runDecoder(decoder, event);
  if (!Result_isOk(result)) {
    return;
  }
  sendToApp(result.value);
}

domNode.addEventListener(eventName, callback);
You can see how it runs the event decoder, and if it succeeds calls sendToApp with the resulting message. Also notice how it reads decoder and sendToApp from properties on the function itself – these are the mutable references I mentioned before.
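To illustrate the mutable references, here is a minimal runnable sketch of updating a listener without removing and re-adding it. The helper names createListener and updateListener are hypothetical, and decoders are modeled as plain functions returning { ok, value } instead of real elm/json decoders.

```javascript
// Hypothetical helper: creates one listener function that reads its decoder
// and sendToApp from properties on itself every time it is called.
function createListener(decoder, sendToApp) {
  function callback(event) {
    var result = callback.decoder(event);
    if (!result.ok) {
      return;
    }
    callback.sendToApp(result.value);
  }
  callback.decoder = decoder;
  callback.sendToApp = sendToApp;
  return callback;
}

// Hypothetical helper: "updating" the listener on a later render is just two
// property assignments. The function registered on the DOM node stays the same.
function updateListener(callback, decoder, sendToApp) {
  callback.decoder = decoder;
  callback.sendToApp = sendToApp;
}

var received = [];
var send = function (msg) {
  received.push(msg);
};

var listener = createListener(function (event) {
  return { ok: true, value: "first:" + event.type };
}, send);
listener({ type: "click" });

updateListener(listener, function (event) {
  return { ok: true, value: "second:" + event.type };
}, send);
listener({ type: "click" });
// received is now ["first:click", "second:click"]
```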
In the whole diffing and rendering process, we pass that sendToApp function down, so that it can eventually be used by an element with an event listener. All Html.map then needs to do is to wrap that sendToApp function to also apply the mapping function. A bit like this:
function render(virtualDomNode, sendToApp) {
  if (virtualDomNode.$ === "Map") {
    return render(virtualDomNode.node, function (msg) {
      return sendToApp(virtualDomNode.f(msg));
    });
  }
  // Then handle all other virtual DOM node variants.
}
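To see how nested Html.map calls compose with this wrapping, here is a toy, runnable variant. It is a sketch under assumptions: the "Button" node variant, the string messages, and collectClickHandlers are made up for the example and do not exist in the fork.

```javascript
// Toy virtual DOM: { $: "Map", f, node } or { $: "Button", msg }.
// Instead of rendering, this collects one click handler per button.
function collectClickHandlers(virtualDomNode, sendToApp, handlers) {
  if (virtualDomNode.$ === "Map") {
    // Wrap sendToApp so that the mapping function is applied on the way up.
    return collectClickHandlers(
      virtualDomNode.node,
      function (msg) {
        return sendToApp(virtualDomNode.f(msg));
      },
      handlers
    );
  }
  handlers.push(function () {
    sendToApp(virtualDomNode.msg);
  });
  return handlers;
}

var dispatched = [];
var tree = {
  $: "Map",
  f: function (msg) { return "Outer(" + msg + ")"; },
  node: {
    $: "Map",
    f: function (msg) { return "Inner(" + msg + ")"; },
    node: { $: "Button", msg: "Clicked" },
  },
};

var handlers = collectClickHandlers(tree, function (msg) {
  dispatched.push(msg);
}, []);
handlers[0]();
// dispatched is now ["Outer(Inner(Clicked))"]
```

The innermost mapping function is applied first and the outermost last.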
Html.Keyed
Html.Keyed in the original elm/virtual-dom is pretty simplistic. It basically operates on a “one lookahead” principle. It goes through the children of the old and new virtual DOM node pairwise. If the keys match, diff them and move on. Otherwise, look ahead one child on both sides and compare all four virtual DOM nodes, to find insertions, removals and swaps. If nothing still matches, degrade to the naive method of moving every child in place (which can lead to moving 10 children up instead of 2 down, for example).
When you call Html.div [] [child1, child2], the Html.div function immediately iterates over the linked list of children, turning it into an array. This is true of all element creating virtual DOM functions. In my version, I not only build that array, but also build a key lookup map for keyed nodes. During the diffing, my version also goes through the children of the old and new virtual DOM nodes pairwise. If the keys don’t match, I use the lookup map to detect insertions and removals. If there’s neither an insertion nor a removal, it has to be a move. I then iterate from the end instead. If that also gets stuck, I compare the virtual DOM nodes from the forwards traversal with the ones from the backwards traversal, to find swaps. If things have moved so much that that doesn’t get us going again either, I switch to the naive method of moving every child in place. The difference in my fork is that it uses the new Element.prototype.moveBefore API, if available, which allows moving an element on the page without “resetting” it (scroll position, animations, loaded state for iframes and video, etc.), so it isn’t as bad when this happens. Also, when I benchmarked Element.prototype.moveBefore, it was pretty fast – fast enough to seriously compete with a git-style diffing algorithm that minimizes the amount of moves.
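Two parts of this can be sketched in isolation: classifying a key mismatch with lookup maps, and preferring moveBefore when the browser has it. This is a simplified illustration under assumptions. The names buildKeyLookup, classifyMismatch and moveChild are hypothetical, children are modeled as { key } objects, and the fork’s real traversal (forwards, backwards, swaps) is not reproduced here.

```javascript
// Build a key -> index lookup for a list of keyed children.
function buildKeyLookup(children) {
  var lookup = new Map();
  children.forEach(function (child, index) {
    lookup.set(child.key, index);
  });
  return lookup;
}

// Classify what happened when the old and new child at the same position
// have different keys.
function classifyMismatch(oldChild, newChild, oldLookup, newLookup) {
  if (!oldLookup.has(newChild.key)) {
    return "insertion"; // The new key did not exist before.
  }
  if (!newLookup.has(oldChild.key)) {
    return "removal"; // The old key does not exist anymore.
  }
  return "move"; // Both keys exist on both sides, just at other positions.
}

// Use moveBefore when available, otherwise fall back to insertBefore.
// A real implementation may also need to handle cases where moveBefore
// cannot be used for a particular node.
function moveChild(parent, child, reference) {
  if (typeof parent.moveBefore === "function") {
    parent.moveBefore(child, reference);
  } else {
    parent.insertBefore(child, reference);
  }
}
```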
Virtualization
Virtualization in elm/virtual-dom has a bunch of problems:
- It basically breaks down as soon as you use Html.map or Html.Lazy. It is very common to have Html.map near the top of your Elm application, so this effectively results in most server-side rendered applications having no use of the virtualization at all – it’s just going to re-render the entire page. This is because, as I mentioned in the “New DOM node pairing algorithm” section above, Html.map and Html.Lazy are nodes in the virtual DOM tree, but not in the actual DOM tree. When just seeing the HTML, the virtualize function can’t know where the map and lazy nodes should be. The original elm/virtual-dom bails out of diffing if it tries to diff two virtual DOM nodes of different types (such as an Element node vs a Map node). My version instead consumes all map and lazy nodes from each side until we get to text and element nodes – then the actual diffing starts.
- It does not support textarea. Textarea elements are weird. You are supposed to add default text as children of <textarea> in HTML, but in JavaScript (and therefore Elm), you are supposed to set textarea.value instead (updating the children does nothing). My fork handles this “children to value” conversion.
- Attributes vs properties. It virtualizes all attributes on the element as attributes in Elm (Html.Attributes.attribute), but most string attributes are actually implemented as properties in Elm (Html.Attributes.property). I’ve fixed this by changing elm/html to mostly use attributes instead (read more about it in the elm/html section below).
- Functions in Html.Attributes that take a Bool still need to be implemented as properties. However, some of them have different casing as an attribute vs as a property, such as readonly vs readOnly. I’ve gone through all boolean attributes and added a lookup object of the non-esoteric ones with different casing.
- The style attribute needs to be turned back into individual Html.Attributes.style calls, and not be a single style attribute. My fork solves this.
- It doesn’t virtualize namespaced elements and attributes correctly (needed for SVG).
- It does not add click listeners on <a> elements for Browser.application.
- My fork only virtualizes text nodes, and elements with the data-elm attribute. This allows third-party scripts to add elements to (for example) <body> before Elm initializes, without having Elm remove them during initialization.
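Two of the points above lend themselves to a small sketch: the casing lookup for boolean attributes, and the rule for which DOM nodes get virtualized. This is an illustration under assumptions. The function names are hypothetical, and the lookup object is a small illustrative subset of real attribute/property pairs, not the fork’s actual list.

```javascript
// Illustrative subset of boolean attributes whose DOM property name has
// different casing. The fork's actual lookup object may contain other entries.
var booleanAttributeToProperty = {
  readonly: "readOnly",
  novalidate: "noValidate",
  formnovalidate: "formNoValidate",
  ismap: "isMap",
};

// Hypothetical helper: map an attribute name found in the HTML to the
// property name that would be set from Elm.
function propertyNameForBooleanAttribute(attributeName) {
  return Object.prototype.hasOwnProperty.call(booleanAttributeToProperty, attributeName)
    ? booleanAttributeToProperty[attributeName]
    : attributeName;
}

var ELEMENT_NODE = 1;
var TEXT_NODE = 3;

// Hypothetical helper: only text nodes and elements marked with data-elm are
// virtualized. Everything else (for example elements added by third-party
// scripts) is left alone.
function shouldVirtualize(domNode) {
  if (domNode.nodeType === TEXT_NODE) {
    return true;
  }
  return domNode.nodeType === ELEMENT_NODE && domNode.hasAttribute("data-elm");
}
```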
The only change is that previously, most functions in Html.Attributes that take a String used this helper function:
stringProperty : String -> String -> Attribute msg
stringProperty key string =
    Elm.Kernel.VirtualDom.property key (Json.string string)
For example, href:
href : String -> Attribute msg
href url =
    stringProperty "href" (Elm.Kernel.VirtualDom.noJavaScriptUri url)
In other words, lots of the Html.Attributes functions were implemented by setting properties.
My fork removes stringProperty, and instead prefers attributes over properties. Here’s href in my fork:
href : String -> Attribute msg
href url =
    Elm.Kernel.VirtualDom.attribute "href" (Elm.Kernel.VirtualDom.noJavaScriptUri url)
In short, attributes are preferred because:
- Attributes can be removed, while properties often cannot. elm/html#228 elm/html#148 elm/virtual-dom#122 elm/virtual-dom#169
- “Virtualization” is way easier when most Html.Attributes functions are attributes. elm/virtual-dom#144
- Some properties are read-only and therefore throw errors if you try to set them.
- Attributes are easier to diff.
I explain those points more in the properties-vs-attributes.md file. (elm/html already had that file. I extended it with the numbered list at the end.)
My fork of elm/browser is a mixed bag of small changes. I intend to make separate pull requests for each thing.
- In my fork of elm/virtual-dom, I made the “virtualize” function complete, making it work in practice. One aspect of virtualization that could not be solved in elm/virtual-dom on its own, was the bug where <a> elements didn’t get their click listener attached during virtualization, making them do full page reloads instead of being routed by Elm. This had to be fixed in both the virtual-dom package, and the browser package. This is the change in my fork that is the most related to the virtual-dom changes.
- elm/browser is in charge of the Elm debugger. The debugger code has a bug where it passes document instead of document.body to the “virtualize” function, for the debugger window. Previously, that didn’t matter at all and happened to work anyway. With my fork of elm/virtual-dom, that caused the debugger window to be a blank page. I’ve fixed that (adding .body), but I also added compatibility for that bug in my virtual-dom fork, so it can be used without my elm/browser fork without breaking the debugger. Basically, I added if (node === document) { node = document.body; } as a workaround. So this change isn’t technically needed, but it’s nice fixing problems “in the right place” too.
- Speaking of the debugger, in Firefox the debugger window background is dark if your computer is in dark mode, making it hard to read things in the debugger. I’ve set the background color of the debugger window explicitly to white, which is the recommendation for web pages anyway.
- You might have noticed that when Elm’s virtual DOM crashes, you get an error in the browser console many times per second. This is because Elm generally draws on the next animation frame using requestAnimationFrame, and if it crashes during rendering it gets stuck in an infinite requestAnimationFrame loop. That’s really annoying. When changing the code to not get caught in a loop if there is an exception, I also noticed that the whole requestAnimationFrame handling was a bit off. Basically, if you also subscribe to Browser.Events.onAnimationFrame, you could end up with update and view being 1 frame out of sync, and some frames could be skipped. I made a demo showing these animation frame oddities. My fork fixes that, except the demo cases where the animation frames come via a port – I don’t think that is solvable.
- Finally, there’s a bug where clicks on <a> elements without href still end up producing a LinkClicked message, even though such elements shouldn’t be clickable. I’ve fixed that. (But note that this fix doesn’t work during development with elm-watch – see the Compatibility with tooling section.) elm/browser#34 elm/browser#55 elm/browser#64
This repo used to contain the code of a previous attempt at making a safer virtual DOM for Elm. You can find that code in the legacy branch.