# Fundamentals: Building our own React

Baking bread. When I first began to learn how to bake bread, the recipe told me what to do. It listed some ingredients, told me how to combine them, and prescribed times of rest. It gave me an oven temperature and a period of wait. It gave me mediocre bread of wildly varying quality. I tried different recipes but the result was always the same.

Understanding: that's what I was missing. The bread I make is now consistently good. The recipes I use are simpler and only give ratios and general recommendations for rests and waits. So why does the bread turn out better?

Before baking is finished, bread is a living organism. The way it grows, develops, and flavors depends on what you feed it, how you feed it, and how you massage and care for it. If you let it grow and ferment at a higher temperature with more yeast, it overdevelops, producing too much alcohol. If you give it too much time, acidity will take over the flavor. The recipes I used initially were missing a critical ingredient: the rising temperature.

But unlike a lot of ingredients, temperature is hard for the home cook to control. So the recipe can't just tell you exactly what temperature to grow the bread at. My initial recipes silently made assumptions about the temperature, and those assumptions rarely turn out to be true. This means that the only way to consistently make good bread is to understand how bread develops so that you can adjust the other ingredients to complement the temperature. Now the bread can tell me what to do.

While React isn't technically a living organism that can tell us what to do, it is, as a whole, a complex, abstract entity. We could learn basic recipes for how to write high-performance React code, but they wouldn't apply in all cases, and as React and the things under it change our recipes would fall out of date. So, like the bread, to produce consistently good results we need to understand how React does what it does.

## React, made of

Conceptually React is very simple. It starts by walking a tree of components and building up a tree of their output. Then it compares that tree to the tree currently in the browser's DOM to find any differences between them. When it finds differences it updates the browser's DOM to match its internal tree.

But what does that actually look like? If your app is janky, does that explanation point you towards what is wrong? No. It might make you wonder whether it is too expensive to re-render the tree, or whether the diffing React does is slow, but you won't really know. When I was initially testing out different bread recipes I had guesses at why they weren't working, but I didn't really figure it out until I had a deeper understanding of how making bread worked. It's time we build up our understanding of how React works so that we can start to answer our questions with solid answers.

React is centered around the `render` method. The `render` method is what walks our trees, diffs them with the browser's DOM tree, and updates the DOM as needed. But before we can look at the `render` method we have to understand its input. The input comes from `createElement`. While `createElement` itself is unlikely to be a bottleneck, it's good to understand how it works so that we can have a complete picture of the entire process. The more black boxes we have in our mental model, the harder it will be for us to diagnose performance problems.

## Markup in Javascript: `JSX`

`createElement`, however, takes as input something that is probably not familiar to us, since we usually work in JSX. JSX is the last link in the chain of this puzzle and the first step in solving it. While not strictly a part of React, it is almost universally used with it. And if we understand it, `createElement` will be less of a mystery, since we will be able to connect all the dots.

JSX is neither valid HTML nor valid Javascript; it is its own language, compiled by a compiler such as Babel. The output of that compilation is valid Javascript that represents the original markup.

Before JSX, the normal way of injecting HTML into the DOM was to use the browser's DOM APIs directly. This was very cumbersome. The code's structure did not match the structure of the HTML that it output, which made it hard to quickly understand what the output of a piece of code would be. So naturally programmers have been endlessly searching for better ways to mix HTML with Javascript.

And this brings us to JSX. It is nothing new; nothing complicated. Forms of it were made and used long before React adopted it. Now let's see if we can discover JSX for ourselves.

To start with, we need to create a data structure that both represents a DOM tree and can also be used to insert one into the browser's DOM. And to do that we need to understand what a tree of DOM nodes is constructed of. What parts do you see here?

{format: "html"}
```
<div class="header">
  <h1>Hello</h1>
  <input type="submit" disabled />
</div>
```

I see three parts: the name of the tag, the tag's properties, and its children.

| name | 'div', 'h1', 'input' |
| props | 'class', 'type', 'disabled' |
| children | `<h1>`, `<input>`, 'Hello' |

```
tag name: 'div'
tag prop: 'class'
children: h1..., 'Hello', input...
```

Now how could we recreate that in Javascript?

In Javascript we store lists of things in arrays and key/value properties in objects. Luckily for us Javascript even gives us literal syntax for both so we can easily make a compact DOM tree with our own notation.

This is what I'm thinking:

{format: "javascript"}
```
['div', { 'className': 'header' },
  [['h1', {}, ['Hello']],
   ['input', { 'type': 'submit', 'disabled': 'disabled' }, []]
  ]
]
```

As you can see, we have a clear mapping from our notation to the original HTML. Our tree is made up of three-element arrays. The first item in the array is the tag, the second is an object containing the tag's properties, and the third is an array of its children, which are all made up of the same three-element arrays.

The mapping is clear, but stare at it long enough and ask yourself: how much fun would it be to read and write that on a consistent basis? I can assure you, it is rather not fun. It does have the advantage of being easy to insert into the DOM, though. All you need to do is write a simple recursive function that ingests our data structure and updates the DOM accordingly. We will get back to this.

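As a rough illustration of how small such a recursive function can be, here is a sketch that walks the array notation and produces an HTML string instead of touching the DOM, so it can run outside a browser. The `toHTML` name and the `className` to `class` mapping are assumptions made for this example, and no escaping is attempted.

```javascript
// Hypothetical helper: turn our array notation into an HTML string.
function toHTML(node) {
  // primitives (strings, numbers) become plain text
  if (!Array.isArray(node)) {
    return String(node);
  }
  const [tag, props, children] = node;
  const attrs = Object.keys(props)
    .map((key) => {
      // assumption: only `className` needs renaming in this example
      const name = key === 'className' ? 'class' : key;
      return ` ${name}="${props[key]}"`;
    })
    .join('');
  // recurse into the children, which use the same three-element arrays
  return `<${tag}${attrs}>${children.map(toHTML).join('')}</${tag}>`;
}

const tree = ['div', { className: 'header' },
  [['h1', {}, ['Hello']]]
];

console.log(toHTML(tree)); // <div class="header"><h1>Hello</h1></div>
```

The function has the same two cases we will meet again later: a primitive becomes text, and an array becomes a tag whose children are handled by recursion.
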
So now we have a way to represent a tree of nodes and we (theoretically) have a way to get those nodes into the DOM. But if we are being honest with ourselves, while it is functional, it is neither a pretty notation nor an easy one to work with.

And this is where our object of study enters the scene. JSX is just a notation that a compiler takes as input, outputting in its place a tree of nodes nearly identical to the notation we came up with! And if you look back at our notation you can see that you can easily embed arbitrary Javascript expressions wherever you want in a node. As you may have realized, that's exactly what the JSX compiler does when it sees curly braces!

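To make the point about embedded expressions concrete, here is a small sketch in our own notation. The `items` and `title` values are made up for the example; each expression sits where JSX would use curly braces.

```javascript
// Hypothetical data for the example
const items = ['flour', 'water', 'salt'];
const title = 'Ingredients';

// Any expression can appear where a prop value or child is expected,
// much like `{...}` inside JSX.
const list = ['div', { className: items.length > 2 ? 'long' : 'short' },
  [['h1', {}, [title.toUpperCase()]],
   ['ul', {}, items.map((item) => ['li', {}, [item]])]
  ]
];

console.log(list[1].className);   // long
console.log(list[2][1][2].length); // 3
```
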
There are three main differences between our data structure and the real one that the JSX compiler outputs: it uses objects instead of arrays, it inserts calls to `React.createElement` on children, and it spreads the children instead of containing them in an array. Here is what "real" JSX compiler output looks like:

{format: "javascript"}
```
React.createElement(
  'div',
  { className: 'header' },
  React.createElement('h1', {}, 'Hello'),
  React.createElement(
    'input',
    { type: 'submit', 'disabled': 'disabled' })
);
```

As you can see, it is very similar to our data structure, and for the purposes of this book we will use our own simplified data structure as it's a bit easier to work with. A JSX compiler also does some validation, and React escapes values embedded in JSX when it renders them, which helps prevent cross-site scripting attacks. In practice, though, the two notations behave the same in the ways that matter to us now.

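One way to see how close the two forms are is to write a tiny adapter between them. The sketch below is an illustration only: `h` is a hypothetical helper, not part of React, that accepts arguments in the shape a JSX compiler emits (type, props, then each child as its own argument) and returns our three-element array notation.

```javascript
// Hypothetical adapter from compiler-style arguments to our array notation.
function h(type, props, ...children) {
  // a compiler may pass null when a tag has no props
  return [type, props || {}, children];
}

const tree = h('div', { className: 'header' },
  h('h1', null, 'Hello'),
  h('input', { type: 'submit', disabled: 'disabled' })
);

console.log(JSON.stringify(tree));
// ["div",{"className":"header"},[["h1",{},["Hello"]],["input",{"type":"submit","disabled":"disabled"},[]]]]
```

The rest parameter gathers the spread children back into an array, which is the main structural difference between the two notations.
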
So now that we've worked through JSX we're ready to tackle `createElement`, the next item on our way to building our own React.

## Getting Ready to Render with `createElement`

React's `render` expects to consume a tree of element objects in a specific, uniform format. `createElement` is the method by which we achieve that objective. `createElement` will take as input our JSX-like notation and output a tree of objects compatible with `render`.

React expects nodes defined as Javascript objects that look like this:

{format: "javascript"}
```
{
  type: NODE_TYPE,
  props: {
    propA: VALUE,
    propB: VALUE,
    ...
    children: STRING | ARRAY
  }
}
```

That is, an object with two properties: `type` and `props`. The `props` property contains all the properties of the node. The node's `children` are also considered part of its properties. The full version of React's `createElement` includes more properties but they are unlikely to be relevant to your application's performance or our version of React here.

{format: "javascript"}
```
function createElement(node) {
  // an array: not text, number, or other primitive
  if (Array.isArray(node)) {
    const [ tag, props, children ] = node;
    return {
      type: tag,
      props: {
        ...props,
        children: children.map(createElement)
      }
    };
  }

  // primitives like text or number
  return {
    type: 'TEXT',
    props: {
      nodeValue: node,
      children: []
    }
  };
}
```

Our `createElement` has two main parts: complex elements and primitive elements. The first part tests whether `node` is a complex node (specified by an array) and then generates an `element` object based on the input node. It recursively calls `createElement` to generate an array of child elements. If the node is not complex then we generate an element of type 'TEXT', which we use for all primitives like strings and numbers. We call the output of `createElement` a tree of `elements` (surprise).

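To see the output shape for yourself, the sketch below repeats a compact copy of `createElement` (using `Array.isArray` for the array check) so that it runs on its own, then feeds it a single node. The `title` class name is just a value chosen for the example.

```javascript
// A compact copy of createElement so this snippet is self-contained
function createElement(node) {
  if (Array.isArray(node)) {
    const [tag, props, children] = node;
    return {
      type: tag,
      props: { ...props, children: children.map(createElement) }
    };
  }
  // primitives like text or number
  return { type: 'TEXT', props: { nodeValue: node, children: [] } };
}

const element = createElement(['h1', { className: 'title' }, ['Hello']]);

console.log(element.type);                            // h1
console.log(element.props.className);                 // title
console.log(element.props.children[0].type);          // TEXT
console.log(element.props.children[0].props.nodeValue); // Hello
```
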
That's it. Now we have everything we need to actually begin the process of rendering our tree to the DOM!

## Render

There are now only two major puzzles remaining in our quest for our own React. The next one we will solve is `render`: how do we go from our tree of elements to actually displaying something on screen?

The signature for our `render` method should be familiar to you:

{format: "javascript"}
```
function render(element, container)
```

This is the same basic signature as React's own `render`. We begin by just focusing on the initial render. In pseudocode it looks like this:

{format: "javascript"}
```
function render(element, container) {
  const domElement = createDOMElement(element);
  setProps(element, domElement);
  renderChildren(element, domElement);
  container.appendChild(domElement);
}
```

Our DOM element is created first. Then we set the properties, render children elements, and finally append the whole tree to the container.

Now that we have an idea of what to build we will work on expanding the pseudocode until we have our own fully functional `render` method using the same general algorithm React uses. In our first pass we will focus on the initial render and ignore reconciliation.

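Reconciliation, which we cover later in this chapter, is the process of comparing the newly generated element tree with the tree from the previous render and changing the DOM only where the two differ.
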
{format: "javascript"}
```
function render(element, container) {
  const { type, props } = element;

  // create the DOM element
  const domElement = type === 'TEXT' ?
    document.createTextNode(props.nodeValue) :
    document.createElement(type);

  // set its properties
  Object.keys(props)
    .filter((key) => key !== 'children')
    .forEach((key) => domElement[key] = props[key]);

  // render its children
  props.children.forEach((child) => render(child, domElement));

  // add our tree to the DOM!
  container.appendChild(domElement);
}
```

The `render` method starts by creating the DOM element. Then we need to set its properties. To do this we first filter out the `children` property and then simply loop over the remaining keys, setting each property directly. Following that, we render each of the children by looping over them and recursively calling `render` on each child with the `container` set to the current DOM element (which is each child's parent).

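If you want to trace this algorithm without opening a browser, one option is to run it against a stand-in for `document`. The sketch below is only that: `doc` is a hypothetical stub that records what gets created and appended, and `render` is the same algorithm as above with `doc` swapped in for `document`.

```javascript
// Minimal stand-in for the browser's `document`, only for this example.
// Real DOM nodes do far more; these stubs just record what render does.
const doc = {
  createElement: (tag) => ({
    tag,
    childNodes: [],
    appendChild(child) { this.childNodes.push(child); }
  }),
  createTextNode: (text) => ({ tag: '#text', nodeValue: text, childNodes: [] })
};

// Same algorithm as render above, using `doc` instead of `document`
function render(element, container) {
  const { type, props } = element;
  const domElement = type === 'TEXT'
    ? doc.createTextNode(props.nodeValue)
    : doc.createElement(type);
  Object.keys(props)
    .filter((key) => key !== 'children')
    .forEach((key) => { domElement[key] = props[key]; });
  props.children.forEach((child) => render(child, domElement));
  container.appendChild(domElement);
}

const root = doc.createElement('root');
render({
  type: 'h1',
  props: {
    className: 'title',
    children: [{ type: 'TEXT', props: { nodeValue: 'Hello', children: [] } }]
  }
}, root);

console.log(root.childNodes[0].tag);                     // h1
console.log(root.childNodes[0].className);               // title
console.log(root.childNodes[0].childNodes[0].nodeValue); // Hello
```

Notice that the children are appended to their parent before the parent is appended to the container, so the container receives the finished subtree in a single `appendChild` call.
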
Now we can go all the way from our JSX-like notation to a rendered tree in the browser's DOM! But so far we can only add things to our tree. To be able to remove and modify the tree we need one more part: reconciliation.

## Reconciliation

A tale of two trees. These are the two trees that people most often talk about when discussing React's "secret sauce": the virtual DOM and the browser's DOM tree. This idea is what originally set React apart. React's reconciliation is what allows you to program declaratively. Reconciliation is what makes it so we no longer have to manually update and modify the DOM whenever our own internal state changes. In a lot of ways, it is what makes React, React.

Conceptually, the way this works is that React generates a new element tree for every render and compares the newly generated tree to the tree generated on the previous render. Where it finds differences in the tree it knows to mutate the DOM state. This is the "tree diffing" algorithm.

Unfortunately, those researching tree diffing in Computer Science have not yet produced a generic algorithm with sufficient performance for use in something like React, as the current best still [runs in O(n^3^)](https://grfia.dlsi.ua.es/ml/algorithms/references/editsurvey_bille.pdf).

Since an O(n^3^) algorithm isn't going to cut it in the real world, the creators of React instead use a set of heuristics to determine what parts of the tree have changed. Understanding how the React tree diffing algorithm works in general, and the heuristics currently in use, can help immensely in detecting and fixing React performance bottlenecks. Beyond that, it can help explain some of React's quirks and usage. Even though this algorithm is internal to React and can be changed at any time, its details have leaked out in some ways and are unlikely to change in major ways without larger changes to React itself.

According to the [React documentation](https://reactjs.org/docs/reconciliation.html) their diffing algorithm is O(n) and based on two major assumptions:

* Elements of differing types will yield different trees.
* You can hint at tree changes with the `key` prop.

In this section we will focus on the first part: differing types. In a later chapter we will discuss and implement the `key` prop.

The approach we will take here is to integrate the heuristics that React uses into our render method. This is similar to how React itself does it and we will discuss React's actual implementation later when we talk about Fibers.

Before we get into the code changes that implement the heuristics it is important to remember that React *only* looks at an element's type, existence, and key. It does not do any other diffing. It does not diff props. It does not diff sub-trees of modified parents.

Here is an overview of the algorithm we will be implementing:

{format: "javascript"}
```
if (!element && prevElement)
  // delete dom element
else if (element && !prevElement)
  // add new dom element, render children
else if (element.type === prevElement.type)
  // update dom element, render children
else if (element.type !== prevElement.type)
  // replace dom element, render children
```

Notice that in every case, except deletion, we still call `render` on the element's children. While it's possible that the children will be able to reuse their associated DOM elements, their `render` methods will still be invoked.

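The four branches can be exercised in isolation. The sketch below uses a hypothetical `diffAction` helper that only names the branch that would be taken; it does no DOM work. Note that the element with changed props still counts as an update, because only the type is compared.

```javascript
// Hypothetical helper that names the branch our algorithm would take
function diffAction(element, prevElement) {
  if (!element && prevElement) return 'delete';
  if (element && !prevElement) return 'create';
  if (element.type === prevElement.type) return 'update';
  return 'replace';
}

const h1 = { type: 'h1', props: { children: [] } };
const h1Changed = { type: 'h1', props: { className: 'big', children: [] } };
const p = { type: 'p', props: { children: [] } };

console.log(diffAction(undefined, h1)); // delete
console.log(diffAction(h1, undefined)); // create
console.log(diffAction(h1Changed, h1)); // update
console.log(diffAction(p, h1));         // replace
```
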
Now, to get started, we must make some modifications to our previous render method. First, we need to be able to store and retrieve the previous render tree. Then we need to add code to compare parts of the tree to decide if we need to re-render something or if we can re-use DOM elements from the previous render tree. And last, we need to return a tree of elements that can be used in the next render as a comparison and to reference the DOM elements that we create. These new elements will have the same structure as our current elements but we will add two new properties: `domElement` and `parent`. `domElement` is the DOM element associated with our synthetic element and `parent` is a reference to the parent DOM element.

Here we begin by adding a global `Map` that will store our last render tree, keyed by the `container`. We use a `Map` rather than a plain object because the keys of a plain object are converted to strings, so two different containers could end up sharing one entry.

{format: "javascript"}
```
const renderTrees = new Map();
function render(element, container) {
  const tree =
    render_internal(element, container, renderTrees.get(container));
  // render complete, store the updated tree
  renderTrees.set(container, tree);
}
```

As you can see, the change we made is to move the core of our algorithm into a new function called `render_internal` and pass in the result of our last render to `render_internal`.

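Because the store is keyed by the container, it is worth checking how Javascript treats objects used as keys. The sketch below uses two plain objects as stand-ins for container elements (a real `div` would be coerced to a string such as `[object HTMLDivElement]` in the same way) and compares a plain object store with a `Map`.

```javascript
// Two distinct objects standing in for two container elements
const containerA = { id: 'a' };
const containerB = { id: 'b' };

// Plain object: both keys are coerced to the string "[object Object]"
const asObject = {};
asObject[containerA] = 'tree A';
asObject[containerB] = 'tree B';
console.log(Object.keys(asObject).length); // 1
console.log(asObject[containerA]);         // tree B

// Map: keys are compared by identity, so each container keeps its own tree
const asMap = new Map();
asMap.set(containerA, 'tree A');
asMap.set(containerB, 'tree B');
console.log(asMap.get(containerA)); // tree A
console.log(asMap.size);            // 2
```

With a single root container the difference never shows up, but as soon as two containers of the same kind are rendered into, a plain object store would hand one container the other's previous tree.
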
Now that we have stored our last render tree we can go ahead and update our render method with the heuristics for reusing the DOM elements. We name it `render_internal` because it is what controls the rendering, but it now takes an additional argument: the `prevElement`. `prevElement` is a reference to the corresponding `element` from the previous render and contains a reference to its associated DOM element and parent DOM element. If it's the first render, or if we are rendering a new node or branch of the tree, then `prevElement` will be `undefined`. If, however, `element` is `undefined` and `prevElement` is defined then we know we need to delete a node that previously existed.

{format: "javascript"}
```
function render_internal(element, container, prevElement) {
  let domElement, children;
  if (!element && prevElement) {
    removeDOMElement(prevElement);
    return;
  } else if (element && !prevElement) {
    domElement = createDOMElement(element);
  } else if (element.type === prevElement.type) {
    domElement = prevElement.domElement;
  } else { // types don't match
    removeDOMElement(prevElement);
    domElement = createDOMElement(element);
    // the old subtree left the DOM with the removed element, so
    // everything below this point is treated as a first render
    prevElement = undefined;
  }
  setDOMProps(element, domElement, prevElement);
  children = renderChildren(element, domElement, prevElement);

  if (!prevElement || domElement !== prevElement.domElement) {
    container.appendChild(domElement);
  }

  return {
    domElement: domElement,
    parent: container,
    type: element.type,
    props: {
      ...element.props,
      children: children
    }
  };
}
```

The only time we shouldn't set DOM properties on our element and render its children is when we are deleting an existing DOM element. We use this observation to group the calls for `setDOMProps` and `renderChildren`. Choosing when to append a new DOM element to the container is also part of the heuristics. If we can reuse an existing DOM element then we do, but if the element type has changed or if there was no corresponding existing DOM element, then and only then do we append a new DOM element. This ensures the actual DOM tree isn't being replaced every time we render; only the elements that change are replaced.

In React, when a new DOM element is appended to the DOM tree, React would invoke `componentDidMount` or `useEffect`.

Next up we'll go through all the auxiliary methods that complete the implementation.

Removing a DOM element is straightforward; we just call `removeChild` on the parent element. Before removing the element, React would invoke `componentWillUnmount` or the cleanup function returned from `useEffect`.

{format: "javascript"}
```
function removeDOMElement(prevElement) {
  prevElement.parent.removeChild(prevElement.domElement);
}
```

In creating a new DOM element we just need to branch if we are creating a text element, since the browser API differs slightly. We also populate the text element's value, as the API requires the first argument to be specified even though we will set it again later when we set props. This is where React would invoke the legacy `componentWillMount` lifecycle method.

{format: "javascript"}
```
function createDOMElement(element) {
  return element.type === 'TEXT' ?
    document.createTextNode(element.props.nodeValue) :
    document.createElement(element.type);
}
```

To set the props on an element, we first clear all the existing props and then loop through the current props, setting them accordingly. Of course we filter out the `children` prop since we use that elsewhere and it isn't intended to be set directly.

{format: "javascript"}
```
function setDOMProps(element, domElement, prevElement) {
  if (prevElement) {
    Object.keys(prevElement.props)
      .filter((key) => key !== 'children')
      .forEach((key) => {
        domElement[key] = '';
      });
  }
  Object.keys(element.props)
    .filter((key) => key !== 'children')
    .forEach((key) => {
      domElement[key] = element.props[key];
    });
}
```

I> React is more intelligent about only updating or removing props that need to be updated or removed.

W> This algorithm for setting props does not correctly handle events which must be treated specially. For this exercise that detail is not important though.

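To give a feel for what a more selective approach to props could look like, here is a sketch of a helper that works out the minimal set of changes between two renders. It is an illustration under our own assumptions, not React's implementation: `diffProps` is a hypothetical name, values are compared with `!==`, and events are still ignored.

```javascript
// Hypothetical helper: work out the minimal prop changes between renders
function diffProps(prevProps, nextProps) {
  const isProp = (key) => key !== 'children';
  // props that existed before but are gone now
  const remove = Object.keys(prevProps)
    .filter(isProp)
    .filter((key) => !(key in nextProps));
  // props that are new or whose value changed
  const set = {};
  Object.keys(nextProps)
    .filter(isProp)
    .filter((key) => prevProps[key] !== nextProps[key])
    .forEach((key) => { set[key] = nextProps[key]; });
  return { remove, set };
}

const changes = diffProps(
  { className: 'header', title: 'old', children: [] },
  { className: 'header', id: 'top', children: [] }
);

console.log(JSON.stringify(changes)); // {"remove":["title"],"set":{"id":"top"}}
```

The unchanged `className` appears in neither list, so a `setDOMProps` built on this helper would leave that DOM property alone.
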
For rendering children we use two loops. The first loop removes any elements that are no longer being used. This would happen when the number of children is decreased. The second loop starts at the first child and then iterates through all of the children of the parent element, calling `render_internal` on each child. When `render_internal` is called, the corresponding previous element in that position is passed to `render_internal`, or `undefined` if there is no corresponding element, like when the list of children has grown.

{format: "javascript"}
```
function renderChildren(element, domElement, prevElement = { props: { children: [] }}) {
  const elementLen = element.props.children.length;
  const prevElementLen = prevElement.props.children.length;
  // remove now unused elements: previous children past the end of the new list
  for (let i = elementLen; i < prevElementLen; i++) {
    removeDOMElement(prevElement.props.children[i]);
  }
  // render existing and new elements
  return element.props.children.map((child, i) => {
    const prevChild = i < prevElementLen ? prevElement.props.children[i] : undefined;
    return render_internal(child, domElement, prevChild);
  });
}
```

It's very important to understand the algorithm used here because this is essentially what happens in React when unsuitable keys, like a list index, are used. And this is why keys are so critical to high-performance (and correct) React code. For example, in our algorithm here, if you removed an item from the front of the list you may cause every element in the list to be created anew in the DOM if the types no longer match up. Later on, in the chapter on keys, we will update this algorithm to incorporate keys. It's actually only a minor difference in determining which `child` gets paired with which `prevChild`. Otherwise this is effectively the same algorithm React uses when rendering lists of children.

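The effect of pairing children by position is easy to demonstrate without any DOM. The sketch below uses a hypothetical `pairByIndex` helper that takes only the element types from the previous and next renders and reports what our algorithm would do at each position.

```javascript
// Hypothetical helper: pair children by position, as renderChildren does,
// and report what would happen to the DOM element at each position.
function pairByIndex(prevTypes, nextTypes) {
  const length = Math.max(prevTypes.length, nextTypes.length);
  const actions = [];
  for (let i = 0; i < length; i++) {
    const prev = prevTypes[i];
    const next = nextTypes[i];
    if (next === undefined) actions.push(`remove ${prev}`);
    else if (prev === undefined) actions.push(`create ${next}`);
    else if (prev === next) actions.push(`reuse ${next}`);
    else actions.push(`replace ${prev} with ${next}`);
  }
  return actions;
}

// Removing the first child shifts every later child up one position
console.log(pairByIndex(['h1', 'p', 'ul'], ['p', 'ul']));
// [ 'replace h1 with p', 'replace p with ul', 'remove ul' ]
```

Only the `h1` actually went away, yet every position is touched: two replacements and a removal instead of a single removal. Pairing the right `child` with the right `prevChild` is what keys make possible.
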
There are a few things to note here. First, it is important to pay attention to when React will be removing a DOM element from the tree and adding a new one, as this is when the related lifecycle events or hooks are invoked. Invoking those lifecycle methods or hooks, and the whole process of tearing down and building up a component, is expensive. So again, you can see how a bad key would lead to another performance bottleneck, since React would frequently be doing this on all or many of the elements in a list.

## Fibers

The actual React implementation used to look very similar to what we've gone through so far, but with React 16 this changed dramatically with the introduction of Fibers. Fiber is the name React gives to a discrete unit of work. The React reconciliation algorithm was changed to be based on small units of work instead of one large, potentially long-running call to `render`. This means that React is now able to process just part of the render phase, pause to let the browser take care of other things, and resume again. This is the underlying change that enables the experimental Concurrent Mode.

But even with such a large change, the underlying algorithms for deciding how and when to render components are the same. And when not running in Concurrent Mode the effect is still the same, as React still does the render phase in one block. So using a simplified interpretation that doesn't include all the complexities of breaking up the process into chunks enables us to see more clearly how the process as a whole works. At this point bottlenecks are much more likely to come from the underlying algorithms than from the Fiber-specific details. In the chapter on Concurrent Mode we will go into this more.

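To make the idea of small, resumable units of work a little more concrete, here is a toy work loop. It is a sketch of the general pattern only and says nothing about how React's Fiber implementation is structured internally; `createWorkLoop`, `runSlice`, and the `shouldYield` callback are all names invented for this example.

```javascript
// Hypothetical sketch: process a queue of small units of work, pausing
// whenever `shouldYield` says the browser needs the thread back.
function createWorkLoop(units) {
  let next = 0; // position to resume from on the following slice
  return function runSlice(shouldYield) {
    let done = 0;
    while (next < units.length && !shouldYield(done)) {
      units[next](); // perform one unit of work
      next++;
      done++;
    }
    return { done, remaining: units.length - next };
  };
}

const log = [];
const units = ['a', 'b', 'c', 'd', 'e'].map((name) => () => log.push(name));
const runSlice = createWorkLoop(units);

// pretend the time budget allows two units per slice
const afterTwo = (done) => done >= 2;

console.log(runSlice(afterTwo)); // { done: 2, remaining: 3 }
console.log(runSlice(afterTwo)); // { done: 2, remaining: 1 }
console.log(runSlice(afterTwo)); // { done: 1, remaining: 0 }
console.log(log.join(''));       // abcde
```

Calling `runSlice` with a callback that never yields processes everything in one block, which matches the non-concurrent behavior described above.
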
## Putting it all together

Throughout the rest of the book we will be building on and using our React implementation so it would be helpful to see it all put together and working. At this point the only thing left to do is to create some components and use them!

{format: "javascript"}
```
const SayNow = ({ dateTime }) => {
  return ['h1', {}, [`It is: ${dateTime}`]];
};

const App = () => {
  return ['div', { 'className': 'header' },
    [SayNow({ dateTime: new Date() }),
     ['input', { 'type': 'submit', 'disabled': 'disabled' }, []]
    ]
  ];
}

render(createElement(App()), document.getElementById('root'));
```

We are just creating two components, based on the same JSX-like notation we were using earlier. We create one `prop`: `dateTime`. It gets passed to the `SayNow` component, which just prints out the DateTime passed into it. To simplify our implementation we are just passing props as object literals.

The next step is to just call render multiple times.

{format: "javascript"}
```
setInterval(() =>
  render(createElement(App()), document.getElementById('root')),
  1000);
```

If you do that you will see the DateTime display being updated every second. And if you watch in your dev tools or if you profile the run you will see that the only part of the DOM that gets updated or replaced is the part that changes (aside from the DOM props). We now have a working version of our own React.

I> This implementation is designed for teaching purposes and has some known issues and bugs, like always updating the DOM props, along with other things. Fundamentally, it functions the same as React but if you wanted to use it in a more production setting it would take a lot more development.

## Conclusion

Of course our version of React glosses over many details that React must contend with, like event handlers and starting a re-render from the point where state changes. For understanding how to build high-performance React applications, however, the most important piece to understand is how and when React renders components, which is what we have learned in creating our own mini version of React.

At this point you should have an understanding of how React works. In the rest of the book we are going to be refining this model and looking at practical applications of it so that we are prepared to build high-performance React applications and diagnose any bottlenecks.

TODO maybe a graphic summarizing the heuristics?

TODO maybe show full example with our React