
#+TITLE: High-Performance React
#+AUTHOR: Thomas Hintz

* Preface
* Introduction
It was the late '90s and I was just a kid visiting my aunt and uncle
and their family in Denver. The days were packed with endless playing
and goofing around. I didn't get to see my cousins much and we were
having a good time. But it was the late '90s and the Internet was
booming. And my cousin was in on it.

A "startup", that's what he called it. I didn't understand any of what
he was saying about it. Grown-up stuff. Then he showed us the webpage
for the startup and I thought that was impressive.

"How did you make that?" I asked him. I think he was a little confused
at first about what I was even talking about but he quickly brought me
over to the computer and showed me a screen full of text.

"You just type HTML, that's how you make the webpage." I thought this
was the coolest.

"What do you type that into? What program is it? Can I do that?" He
told me it was easy: just use Notepad. I wasn't going to let him go
without some hook I could grab into this alien world. He told me it's
really easy to learn: do an AOL search for "HTML tutorial".

So began my journey with web development. I AOL searched my way
through as many blinking text tutorials as I could find. It wasn't
long until I was building with AJAX. We had IE 5.5 and 6 and Mozilla
Phoenix. And Gmail came out. That changed things: now web apps were
"legitimate."

A lot of the technologies and libraries came and went over the years
but one thing remained constant in large web apps: poor
performance. From the very early days I was timing things with my
stopwatch. Sometimes things were slow and I had to understand why and
how to fix them. Over the years I learned all about the browser's DOM
and its APIs and how they work. I learned how jQuery worked and
Backbone.js and all the rest. I made apps that didn't lag or have
jank.

I was able to do this because I understood the performance
implications of the tools and libraries I was using and I learned how
to measure performance. I had discovered the recipe for
high-performance code.

And that is what this book is: a recipe for producing high-performance
React applications. First, we learn how React works. Then we learn how
to measure performance. And last we learn how to address the
bottlenecks we find. Parts of any technical book will go stale as
technology changes and that is no less true for this book. But what I
hope you learn is not just the technical details but more importantly
the method for writing high-performance code. The API might change but
the method will remain the same.

TODO note that the book references React-DOM but the algorithms should
generally apply to all React implementations.
* Mini React
Baking bread. When I first began to learn how to bake bread the recipe
told me what to do. It listed some ingredients and told me how to
combine them and prescribed times of rest. It gave me an oven
temperature and a period of wait. It gave me mediocre bread of wildly
varying quality. I tried different recipes but the result was always
the same.

Understanding: that's what I was missing. The bread I make is now
consistently good. The recipes I use are simpler and only give ratios
and general recommendations for rests and waits. So why does the bread
turn out better?

Before baking is finished, bread is a living organism. The way it grows
and develops its flavor depends on what you feed it, how you feed it,
and how you massage and care for it. If you let it grow and ferment at
a higher temperature or with more yeast, it overdevelops and produces
too much alcohol. If you give it too much time, acidity will take over
the flavor. The recipes I used initially were missing a critical
ingredient: the rising temperature.

But unlike a lot of ingredients, temperature is hard to control for
the home cook. So the recipe can't just tell you exactly what
temperature to grow the bread at. My initial recipes silently made
assumptions about the temperature, and those assumptions rarely turned
out to be true. This means that the only way to consistently make good
bread is to understand how bread develops so that you can adjust the
other ingredients to complement the temperature. Now the bread can
tell me what to do.

While React isn't technically a living organism that can tell us what
to do, it is, as a whole, a complex, abstract entity. We could learn
basic recipes for how to write high-performance React code, but they
wouldn't apply in all cases, and as React and the things under it
change our recipes would fall out of date. So like the bread, to
produce consistently good results we need to understand how React does
what it does.

** Basic React

Conceptually React is very simple. It starts by walking a tree of
components and building up a tree of their output. Then it compares
that tree to the tree it built on the previous render, which mirrors
what is currently in the browser's DOM, to find any differences
between them. When it finds differences it updates the browser's DOM
to match the new tree.

But what does that actually look like? If your app is janky does that
explanation point you towards what is wrong? No. It might make you
wonder if maybe it is too expensive to re-render the tree or if maybe
the diffing React does is slow but you won't really know. When I was
initially testing out different bread recipes I had guesses at why it
wasn't working but I didn't really figure it out until I had a deeper
understanding of how making bread worked. It's time we build up our
understanding of how React works so that we can start to answer our
questions with solid answers.

React is made up of a few pieces: ~createElement~, ~render~, and
reconciliation. The first building block is ~createElement~. While
~createElement~ is itself unlikely to be a bottleneck, it's good to
understand how it works so that we can have a complete picture of the
entire process. The more black boxes we have in our mental model the
harder it will be for us to diagnose performance problems.

** ~JSX~

But before we get to ~createElement~ we should talk about JSX. While
not strictly a part of React it is almost universally used with
it. And if we understand it then ~createElement~ will be less of a
mystery since we will be able to connect all the dots.

Before JSX the normal way of injecting HTML into the DOM was via
directly utilizing the browser's DOM APIs. This was very cumbersome.
The code's structure did not match the structure of the HTML that it
output, which made it hard to quickly understand what the output of
a piece of code would be. So naturally programmers have been endlessly
searching for better ways to mix HTML with JavaScript.

And this brings us to JSX. It is nothing new; nothing
complicated. Forms of it have been made and used long before React
adopted it. Now let's see if we can discover JSX for ourselves.

To start with we need to create a data structure that both represents
a DOM tree and can also be used to insert one into the browser's
DOM. And to do that we need to understand what a tree of DOM nodes is
constructed of. What parts do you see here?

TODO include text element (Hello)
TODO change to using objects instead of arrays?
probably do after what we've already done

{
  type: 'h1',
  props: { x: y },
  children: []
}

#+BEGIN_SRC html
<div class="header">
  <h1>Hello</h1>
  <input type="submit" disabled />
</div>
#+END_SRC

I see three parts: the name of the tag, the tag's properties, and its
children. Now how could we recreate that in JavaScript?

In JavaScript we store lists of things in arrays and key/value
properties in objects. Luckily for us JavaScript even gives us literal
syntax for both so we can easily make a compact DOM tree with our own
notation.

This is what I'm thinking:

#+BEGIN_SRC javascript
['div', { 'class': 'header' },
  [['h1', {}, ['Hello']],
   ['input', { 'type': 'submit', 'disabled': 'disabled' }, []]
  ]
]
#+END_SRC

As you can see we have a clear mapping from our notation to the
original HTML. Our tree is made up of three-element arrays. The first
item in the array is the tag, the second is an object containing the
tag's properties, and the third is an array of its children, which are
themselves either primitives like text or the same three-element
arrays.

The mapping is clear, but how much fun would it be to read and write
that notation on a consistent basis? I can assure you, it is not much
fun. It does have the advantage of being easy to insert into the DOM,
though. All you need to do is write a simple recursive function that
ingests our data structure and updates the DOM accordingly. We will
get back to this.
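
The shape of that recursive function can be sketched without touching
the DOM at all. The hypothetical ~toHTML~ helper below is not part of
React; it walks the array notation and produces an HTML string instead
of DOM nodes. It assumes trusted input (nothing is escaped) and does
not special-case void elements such as ~input~, so treat it as an
illustration of the recursion only.

#+BEGIN_SRC javascript
// Hypothetical helper: serialize our [tag, props, children] notation
// to an HTML string. No escaping is performed.
function toHTML(node) {
    // primitives (strings, numbers) become text
    if (!Array.isArray(node)) {
        return String(node);
    }
    const [tag, props, children] = node;
    const attrs = Object.keys(props)
        .map((key) => ` ${key}="${props[key]}"`)
        .join('');
    // recurse into the children, exactly as a DOM version would
    return `<${tag}${attrs}>${children.map(toHTML).join('')}</${tag}>`;
}

const tree =
    ['div', { 'class': 'header' },
      [['h1', {}, ['Hello']],
       ['input', { 'type': 'submit', 'disabled': 'disabled' }, []]]];

console.log(toHTML(tree));
#+END_SRC

A version that inserts into the DOM has the same structure: handle the
primitive case, create the node, apply the properties, and recurse
into the children.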

So now we have a way to represent a tree of nodes and we
(theoretically) have a way to get those nodes into the DOM. But if we
are being honest with ourselves, while functional, it isn't a pretty
notation nor easy to work with.

And this is where our object of study enters the scene. JSX is just a
notation that a compiler takes as input, outputting in its place a
tree of nodes nearly identical to the notation we came up with! And if
you look back at our notation you can see that you can easily embed
arbitrary JavaScript expressions wherever you want in a node. As you
may have realized, that's exactly what the JSX compiler does when it
sees curly braces!
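
Because the notation is plain JavaScript, any expression can appear
wherever a value is expected. Here is a small sketch of what that looks
like in our array notation; the ~title~ and ~items~ variables are
made up for the example.

#+BEGIN_SRC javascript
// Hypothetical data for the example
const title = 'Groceries';
const items = ['flour', 'water', 'yeast'];

const tree =
    ['div', { 'class': items.length > 2 ? 'long-list' : 'short-list' },
      [['h1', {}, [title.toUpperCase()]],
       // one 'li' node per item, computed with an ordinary map()
       ['ul', {}, items.map((item) => ['li', {}, [item]])]]];

console.log(JSON.stringify(tree));
#+END_SRC

In JSX the same expressions would sit inside curly braces, for example
~<h1>{title.toUpperCase()}</h1>~.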

There are three main differences between our data structure and the
real one that the JSX compiler outputs: it produces objects instead of
arrays, it wraps each node, children included, in a call to
~React.createElement~, and it spreads the children as extra arguments
instead of containing them in an array. Here is what "real" JSX
compiler output looks like:

#+BEGIN_SRC javascript
React.createElement(
  'div',
  { className: 'header' },
  React.createElement('h1', null, 'Hello'),
  React.createElement(
    'input',
    { type: 'submit', disabled: true })
);
#+END_SRC

As you can see it is very similar to our data structure, and for the
purposes of this book we will use our own simplified data structure as
it's a bit easier to work with. In practice they would behave the same
in the ways that matter to us now.

So now that we've worked through JSX we're ready to tackle
~createElement~, the first step on our way to building our own React.

TODO JSX also does validation and escapes input to prevent XSS

** ~createElement~

React expects nodes defined as JavaScript objects that look like this:

#+BEGIN_SRC javascript
{
    type: NODE_TYPE,
    props: {
         propA: VALUE,
         propB: VALUE,
         ...
         children: STRING | ARRAY
    }
}
#+END_SRC

That is an object with two properties: ~type~ and ~props~. The ~props~
property contains all the properties of the node. The node's
~children~ are also considered part of its properties. The elements
that React's own ~createElement~ produces carry more fields, but they
are unlikely to be relevant to your application's performance or to
our version of React here.

#+BEGIN_SRC javascript
// React's internal element factory, signature only
const ReactElement = function(type, key, ref, self, source, owner, props) { /* body omitted */ };
#+END_SRC

So all our ~createElement~ needs to do is transform our data structure
into the objects that our React expects.

#+BEGIN_SRC javascript
function createElement(node) {
    // an array: not text, number, or other primitive
    if (Array.isArray(node)) {
        const [ tag, props, children ] = node;
        return {
            type: tag,
            props: {
                ...props,
                children: children.map(createElement)
            }
        };
    }

    // primitives like text or number
    return {
        type: 'TEXT',
        props: {
            nodeValue: node,
            children: []
        }
    };
}
#+END_SRC

Our ~createElement~ has two main parts: complex elements and primitive
elements. The first part tests whether ~node~ is a complex node
(specified by an array) and then generates an ~element~ object based
on the input node. It recursively calls ~createElement~ to generate an
array of child elements. If the node is not complex then we generate
an element of type ~'TEXT'~, which we use for all primitives like
strings and numbers. We call the output of ~createElement~ a tree of
~elements~ (surprise).
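
To make the output concrete, here is a short usage sketch. It repeats
a compact copy of the ~createElement~ above (using ~Array.isArray~ for
the array test) so that the snippet runs on its own; the ~id~ property
is only there for illustration.

#+BEGIN_SRC javascript
// Compact copy of createElement so this snippet is self-contained
function createElement(node) {
    if (Array.isArray(node)) {
        const [ tag, props, children ] = node;
        return {
            type: tag,
            props: { ...props, children: children.map(createElement) }
        };
    }
    // primitives like text or number
    return { type: 'TEXT', props: { nodeValue: node, children: [] } };
}

const element = createElement(['h1', { id: 'title' }, ['Hello']]);
console.log(JSON.stringify(element, null, 2));
#+END_SRC

The resulting ~h1~ element keeps ~id~ in its ~props~, and its
~props.children~ holds a single ~'TEXT'~ element whose ~nodeValue~ is
~'Hello'~.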

That's it. Now we have everything we need to actually begin the
process of rendering our tree to the DOM!

** Render

There are now only two major puzzles remaining in our quest for our
own React. The next piece is ~render~: how do we go from our tree of
nodes to actually displaying something on screen?

The signature for our ~render~ function is very simple and will be
familiar to you:

#+BEGIN_SRC javascript
function render(element, container)
#+END_SRC

Doing the initial render on a tree of elements is quite simple. In
pseudocode it looks like this:

#+BEGIN_SRC javascript
function render(element, container) {
    const domElement = createDOMElement(element);
    setProps(element, domElement);
    renderChildren(element, domElement);
    container.appendChild(domElement);
}
#+END_SRC

Because the browser APIs for text elements are different than for generic
DOM elements and because text elements can't have children we will
split up the process into two functions: ~renderTextElement~ and
~renderDOMElement~.

#+BEGIN_SRC javascript
function render(element, container) {
    if (element.type === 'TEXT') {
         renderTextElement(element, container);
    } else {
         renderDOMElement(element, container);
    }
}
#+END_SRC

First, we'll look at ~renderTextElement~, which is the simpler of the
two.

#+BEGIN_SRC javascript
function renderTextElement(element, container) {
    return container.appendChild(
      document.createTextNode(element.props.nodeValue));
}
#+END_SRC

~renderTextElement~ just creates a DOM ~TextNode~ and appends it to
the container.

Next, we look at ~renderDOMElement~, which must also set properties on
the newly created DOM element and render any children.

#+BEGIN_SRC javascript
function renderDOMElement(element, container) {
    const { type, props } = element;

    // create the DOM element
    const domElement = document.createElement(type);

    // set its properties
    Object.keys(props)
      .filter((key) => key !== 'children')
      .forEach((key) => {
          domElement[key] = props[key];
    });

    // render its children
    props.children.forEach((child) => render(child, domElement));

    // add our tree to the DOM!
    container.appendChild(domElement);
}
#+END_SRC

To start with we create the DOM element. Then we need to set its
properties. To do this we first filter out the ~children~ property and
then simply loop over the keys, setting each property directly. Because
we assign DOM properties rather than HTML attributes, the keys need to
be DOM property names, for example ~className~ rather than ~class~.
Then we render each of the children by looping over them and
recursively calling ~render~ on each with the ~container~ set to the
current DOM element (which is each child's parent).
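
To watch the recursion run outside a browser, the sketch below swaps
the real ~document~ for a tiny stand-in. The ~fakeDocument~ object and
the extra ~doc~ parameter are assumptions made only for this example;
in a browser you would pass ~document~ itself. The render logic is a
condensed form of the two functions above.

#+BEGIN_SRC javascript
// Hypothetical stand-in for the browser's document: just enough
// surface (createElement, createTextNode, appendChild) for the sketch.
const fakeDocument = {
    createElement(tag) {
        return {
            tag,
            childNodes: [],
            appendChild(child) { this.childNodes.push(child); return child; }
        };
    },
    createTextNode(text) {
        return { nodeValue: text };
    }
};

function render(element, container, doc) {
    if (element.type === 'TEXT') {
        container.appendChild(doc.createTextNode(element.props.nodeValue));
        return;
    }
    const domElement = doc.createElement(element.type);
    // set every property except children
    Object.keys(element.props)
        .filter((key) => key !== 'children')
        .forEach((key) => { domElement[key] = element.props[key]; });
    // children render into the element we just created
    element.props.children.forEach((child) => render(child, domElement, doc));
    container.appendChild(domElement);
}

// An element tree in the shape our createElement produces
const element = {
    type: 'div',
    props: {
        className: 'header',
        children: [
            { type: 'h1', props: { children: [
                { type: 'TEXT', props: { nodeValue: 'Hello', children: [] } }
            ] } }
        ]
    }
};

const root = fakeDocument.createElement('root');
render(element, root, fakeDocument);
const div = root.childNodes[0];
console.log(div.className);                               // header
console.log(div.childNodes[0].childNodes[0].nodeValue);   // Hello
#+END_SRC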

Now we can go all the way from JSX to a rendered tree in the browser's
DOM! But so far we can only add things to our tree. To be able to
remove and modify the tree we need two more parts: reconciliation and
the commit phase.

** Reconciliation
A tale of two trees. These are the two trees that people most often
talk about when talking about React's "secret sauce": the VDOM, or
virtual DOM, and the current render tree. This idea is what originally
set React apart. React's reconciliation is what allows you to program
declaratively. Reconciliation is what makes it so we no longer have to
manually update and modify the DOM whenever our own internal state
changes, and in a lot of ways it is what makes React, React.

Conceptually the way this works is that React generates a new element
tree for every render and compares the newly generated tree to the
tree generated on the previous render. Where it finds differences
between the trees it knows to mutate the DOM state. This is the "tree
diffing" algorithm.

Unfortunately those researching tree diffing in Computer Science have
not yet produced a generic algorithm with sufficient performance for
use in something like React, as the current best still runs in
O(n^3), where n is the number of nodes in the tree. This leads to the
largest performance-related aspect in all of React.

Since an O(n^3) algorithm isn't going to cut it, the creators of React
instead use a set of heuristics to determine what parts of the tree
have changed. Understanding how the React tree diffing algorithm works
in general and the heuristics currently in use can help immensely in
detecting and fixing React performance bottlenecks. And beyond that it
can help one's understanding of some of React's quirks and usage. Even
though this algorithm is internal to React and can be changed anytime,
its details have leaked out in some ways and are overall unlikely to
change in major ways without larger changes to React.

TODO some kind of call-out for big deal

TODO https://grfia.dlsi.ua.es/ml/algorithms/references/editsurvey_bille.pdf

The approach we will take here is to integrate the heuristics that
React uses into our render method. This is similar to how React itself
does it and we will discuss that later when we talk about Fibers.

To do this we must make some modifications to our render
methods. First, we need to be able to store and retrieve the previous
render tree. Then we need to add code to compare parts of the tree to
decide if we need to re-render something or if we can re-use it from
the previous render tree.

Here we are adding a global ~WeakMap~ that will store our last render
tree keyed by the container. A plain object would coerce its keys to
strings, so a map keyed by identity is the safer choice.

#+BEGIN_SRC javascript
const renderTrees = new WeakMap();
function render(element, container) {
    const tree =
      render_internal(element, container, renderTrees.get(container));
    // render complete, store the updated tree
    renderTrees.set(container, tree);
}

function render_internal(element, container, prevElement) {
    if (element.type === 'TEXT') {
         return renderTextElement(element, container, prevElement);
    } else {
         return renderDOMElement(element, container, prevElement);
    }
}
#+END_SRC
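
One detail is worth checking when a lookup is keyed by container:
property keys on a plain JavaScript object are coerced to strings, so
two different objects, including two DOM nodes of the same kind, can
end up sharing one key. A ~Map~ or ~WeakMap~ keys by object identity
instead. The sketch below uses plain objects as stand-ins for DOM
containers.

#+BEGIN_SRC javascript
// Stand-ins for two different DOM containers
const containerA = { id: 'a' };
const containerB = { id: 'b' };

// Plain object: both keys are coerced to the string "[object Object]"
const byObject = {};
byObject[containerA] = 'tree A';
byObject[containerB] = 'tree B';
console.log(byObject[containerA]); // tree B (A's entry was overwritten)

// WeakMap: keyed by identity, and an entry can be garbage collected
// once its container is no longer referenced anywhere else
const byWeakMap = new WeakMap();
byWeakMap.set(containerA, 'tree A');
byWeakMap.set(containerB, 'tree B');
console.log(byWeakMap.get(containerA)); // tree A
#+END_SRC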

TODO pseudo-code for heuristics
#+BEGIN_SRC javascript
    if (!element && prevElement) {
      // delete dom element
    } else if (element && !prevElement) {
      // add new dom element
    } else if (element.type === prevElement.type) {
      // update dom element
    } else {
      // types differ: remove the old dom element and add a new one
    }
#+END_SRC
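
The heuristics above can also be written as a small pure function that
describes what would change rather than changing it. The ~diff~
function and its operation objects are hypothetical, children are
matched by position only, and this is a sketch of the heuristics
described in this section rather than React's actual implementation.

#+BEGIN_SRC javascript
// Hypothetical sketch: returns a list of operations, touches no DOM
function diff(element, prevElement, path = 'root') {
    if (!element && !prevElement) return [];
    if (!element) return [{ op: 'remove', path }];
    if (!prevElement) return [{ op: 'add', path, type: element.type }];
    if (element.type !== prevElement.type) {
        // different type: discard the old subtree, build a new one
        return [{ op: 'replace', path, type: element.type }];
    }
    // same type: keep the node and update only the props that changed
    const ops = [];
    const keys = new Set([
        ...Object.keys(element.props),
        ...Object.keys(prevElement.props)
    ]);
    keys.delete('children');
    const changed =
        [...keys].filter((k) => element.props[k] !== prevElement.props[k]);
    if (changed.length > 0) ops.push({ op: 'update', path, props: changed });
    // then compare the children position by position
    const next = element.props.children;
    const prev = prevElement.props.children;
    for (let i = 0; i < Math.max(next.length, prev.length); i++) {
        ops.push(...diff(next[i], prev[i], `${path}/${i}`));
    }
    return ops;
}

const text = (value) =>
    ({ type: 'TEXT', props: { nodeValue: value, children: [] } });
const prevTree = { type: 'div', props: { className: 'a', children: [
    { type: 'h1', props: { children: [text('Hello')] } },
    { type: 'p', props: { children: [text('old')] } } ] } };
const nextTree = { type: 'div', props: { className: 'b', children: [
    { type: 'h1', props: { children: [text('Hello')] } },
    { type: 'span', props: { children: [text('new')] } } ] } };

console.log(diff(nextTree, prevTree));
#+END_SRC

For these two trees the result is one ~update~ on the root (its
~className~ changed) and one ~replace~ at ~root/1~ (a ~p~ became a
~span~). The unchanged ~h1~ subtree produces no operations at all.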

Now that we have a way to see what we rendered last time we can go
ahead and update our render methods with the heuristics.

#+BEGIN_SRC javascript
function renderTextElement(element, container, prevElement) {
    const { nodeValue } = element.props;
    let domElement;
    if (prevElement && prevElement.type === element.type) {
        // types match: re-use the existing text node
        domElement = prevElement.domElement;
        if (prevElement.props.nodeValue !== nodeValue) {
            // values don't match; update in place
            domElement.nodeValue = nodeValue;
        }
    } else {
        if (prevElement) {
            // element types don't match so remove & append
            prevElement.parent.removeChild(prevElement.domElement);
        }
        // TODO delete node (no new element to render)
        // new type or new text node
        domElement =
          container.appendChild(document.createTextNode(nodeValue));
    }
    return {
        domElement: domElement,
        parent: container,
        ...element
    };
}
#+END_SRC

TODO don't forget event handlers are handled specially

** Commit Phase

** Fibers
* Rendering Model
  React calls ~shouldComponentUpdate~ to know if it should re-render the
  component. By default it returns true.

  Generally use ~PureComponent~ / ~React.memo~.
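
  ~PureComponent~ and ~React.memo~ skip a re-render when a shallow
  comparison finds the props (and, for ~PureComponent~, the state)
  unchanged. The sketch below is a simplified stand-in for that
  comparison; ~shallowEqual~ is a hypothetical helper written for this
  example, and React's own version differs in details.

  #+BEGIN_SRC javascript
  // Hypothetical helper: true when both objects have the same keys
  // and every value is identical (compared with Object.is, not deeply)
  function shallowEqual(a, b) {
      if (Object.is(a, b)) return true;
      if (typeof a !== 'object' || a === null ||
          typeof b !== 'object' || b === null) {
          return false;
      }
      const keysA = Object.keys(a);
      const keysB = Object.keys(b);
      if (keysA.length !== keysB.length) return false;
      return keysA.every((key) =>
          Object.prototype.hasOwnProperty.call(b, key) &&
          Object.is(a[key], b[key]));
  }

  const items = [1, 2, 3];
  // same array reference: props count as unchanged
  console.log(shallowEqual({ items, n: 1 }, { items, n: 1 }));            // true
  // equal contents but a new array: props count as changed
  console.log(shallowEqual({ items, n: 1 }, { items: [1, 2, 3], n: 1 })); // false
  #+END_SRC

  The second case is why arrays, objects, or functions created anew on
  every render defeat this kind of check even when their contents are
  the same.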
* Diagnosing Bottlenecks
* Reducing Renders
* Improving DOM Merge Performance
* Reducing Number of Components
  higher-order components
* Windowing
* Performance Tools
  trace from scheduler/tracing/profiler component
* JS Performance Tools
* Code Splitting
  React.lazy, suspense

  use on routes
* Server Side Rendering
* Concurrent Rendering
* UX
* JS Service Workers
* Keys
* Reconciliation
  - diffing algorithm based on heuristics. generic algorithm is O(n^3)
  - "Fiber" algorithm notes
    - lists reordering without key means full list output/update
    - type changes cause full re-render
    - keys should be stable, predictable, unique
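
  The note on reordering without keys can be illustrated with a small
  sketch. The hypothetical ~countUpdates~ function below matches new
  children against old ones either by position or by a ~key~ field and
  counts how many of them would need their content written. It models
  only the matching heuristic, not React's actual reconciler.

  #+BEGIN_SRC javascript
  // Count how many new children need their content created or updated
  function countUpdates(prev, next, useKeys) {
      const prevByKey = new Map(prev.map((child) => [child.key, child]));
      return next.filter((child, index) => {
          const match = useKeys ? prevByKey.get(child.key) : prev[index];
          // no match (a new child) or different content both need work
          return !match || match.text !== child.text;
      }).length;
  }

  const prev = [
      { key: 'a', text: 'Apple' },
      { key: 'b', text: 'Banana' },
      { key: 'c', text: 'Cherry' }
  ];
  // Insert one item at the front: every old item shifts by one position
  const next = [{ key: 'z', text: 'Zucchini' }, ...prev];

  console.log(countUpdates(prev, next, false)); // 4
  console.log(countUpdates(prev, next, true));  // 1
  #+END_SRC

  Matched by position, every child looks changed; matched by key, only
  the new item needs content, and the existing nodes only have to keep
  their place in the list.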