commit 8052d9f6ffb896e0dbbda3e39e7e529020da6dee Author: Thomas Hintz Date: Sun Aug 2 13:00:39 2020 -0700 Initial commit. diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..e4e5f6c --- /dev/null +++ b/.gitignore @@ -0,0 +1 @@ +*~ \ No newline at end of file diff --git a/foundations-high-performance-react.org b/foundations-high-performance-react.org new file mode 100644 index 0000000..551a6cc --- /dev/null +++ b/foundations-high-performance-react.org @@ -0,0 +1,727 @@ +#+BEGIN_SRC emacs-lisp :exports results :results silent + (require 'ox-latex) + (add-to-list 'org-latex-packages-alist '("" "minted")) + (setq org-latex-listings 'minted) + (setq org-latex-pdf-process + '("xelatex -shell-escape -interaction nonstopmode -output-directory %o %f")) +#+END_SRC + +#+latex_class: book +#+latex_class_options: [book,12pt,oneside] +#+latex_header: \usepackage[book,top=2.5cm,bottom=2.5cm,left=2.5cm,right=2.5cm]{geometry} + +#+TITLE: Foundations of High-Performance React Applications +#+AUTHOR: Thomas Hintz + +#+startup: indent +#+tags: noexport sample frontmatter mainmatter backmatter +#+options: toc:nil tags:nil + + +* Preface :frontmatter: + :PROPERTIES: + :EXPORT_FILE_NAME: manuscript/preface.markua + :END: + +Welcome to /Foundations of High-Performance React/ where we build our +own simplified version of React. We will use our React to gain an +understanding of the real React and how to build high-performance +applications with it. + +* Introduction :mainmatter: + :PROPERTIES: + :EXPORT_FILE_NAME: manuscript/introduction.markua + :END: + +* Foundations: Building our own React + :PROPERTIES: + :EXPORT_FILE_NAME: manuscript/fundamentals--building-our-own-react.markua + :END: +Baking bread. When I first began to learn how to bake bread the recipe +told me what to do. It listed some ingredients and told me how to +combine them and prescribed times of rest. It gave me an oven +temperature and a period of wait. It gave me mediocre bread of wildly +varying quality. I tried different recipes but the result was always +the same. + +Understanding: that's what I was missing. The bread I make is now +consistently good. The recipes I use are simpler and only give ratios +and general recommendations for rests and waits. So why does the bread +turn out better? + +Before baking is finished bread is a living organism. The way it grows +and develops and flavors depend on what you feed it and how you feed +it and massage it, care for it. If you have it grow and ferment at a +higher temperature and more yeast it overdevelops producing too much +alcohol. If you give it too much time, acidity will take over the +flavor. The recipes I used initially were missing a critical +ingredient: the rising temperature. + +But unlike a lot of ingredients: temperature is hard to control for +the home cook. So the recipe can't just tell you exactly what +temperature to grow the bread at. My initial recipes just silently +made assumptions for the temperature, which rarely turn out to be +true. This means that the only way to consistently make good bread is +to have an understanding of how bread develops so that you can adjust +the other ingredients to complement the temperature. Now the bread can +tell me what to do. + +While React isn't technically a living organism that can tell us what +to do, it is, in its whole, a complex, abstract entity. We could learn +basic recipes for how to write high-performance React code but they +wouldn't apply in all cases, and as React and things under it change +our recipes would fall out-of-date. So like the bread, to produce +consistently good results we need to understand how React does what it +does. + +** Components of React + +Conceptually React is very simple. It starts by walking a tree of +components and building up a tree of their output. Then it compares +that tree to the tree currently in the browser's DOM to find any +differences between them. When it finds differences it updates the +browser's DOM to match its internal tree. + +But what does that actually look like? If your app is janky does that +explanation point you towards what is wrong? No. It might make you +wonder if maybe it is too expensive to re-render the tree or if maybe +the diffing React does is slow but you won't really know. When I was +initially testing out different bread recipes I had guesses at why it +wasn't working but I didn't really figure it out until I had a deeper +understanding of how making bread worked. It's time we build up our +understanding of how React works so that we can start to answer our +questions with solid answers. + +React is centered around the ~render~ method. The ~render~ method is +what walks our trees, diffs them with the browser's DOM tree, and +updates the DOM as needed. But before we can look at the ~render~ +method we have to understand its input. The input comes from +~createElement~. While ~createElement~ itself is unlikely to be a +bottleneck it's good to understand how it works so that we can have a +complete picture of the entire process. The more black-boxes we have +in our mental model the harder it will be for us to diagnose +performance problems. + +** Markup in JavaScript: ~JSX~ + +~createElement~, however, takes as input something that is probably +not familiar to us since we usually work in JSX, which is the last +element of the chain in this puzzle and the first step in solving +it. While not strictly a part of React, it is almost universally used +with it. And if we understand it then ~createElement~ will be less of +a mystery since we will be able to connect all the dots. + +JSX is not valid HTML or JavaScript but its own language compiled by a +compiler, like Babel. The output of that compilation is valid +JavaScript that represents the original markup. + +Before JSX or similar compilers, the normal way of injecting HTML into +the DOM was via directly utilizing the browser's DOM APIs or by +setting ~innerHTML~. This was very cumbersome. The code's structure +did not match the structure of the HTML that it output which made it +hard to quickly understand what the output of a piece of code would +be. So naturally programmers have been endlessly searching for better +ways to mix HTML with JavaScript. + +And this brings us to JSX. It is nothing new; nothing +complicated. Forms of it have been made and used long before React +adopted it. Now let's see if we can discover JSX for ourselves. + +To start with, we need to create a data-structure -- let's call it +JavaScript Markup (JSM) -- that both represents a DOM tree and can +also be used to insert one into the browser's DOM. And to do that we +need to understand what a tree of DOM nodes is constructed of. What +parts do you see here? + +#+BEGIN_SRC html +
+

Hello

+ +
+#+END_SRC + +I see three parts: the name of the tag, the tag's properties, and its +children. + +|-----------+-----------------------------| +| Name: | 'div', 'h1', 'input' | +| Props: | 'class', 'type', 'disabled' | +| Children: |

, , Hello | + +Now how could we recreate that in JavaScript? + +In JavaScript, we store lists of things in arrays, and key/value +properties in objects. Luckily for us, JavaScript even gives us literal +syntax for both so we can easily make a compact DOM tree with our own +notation. + +This is what I'm thinking: + +#+CAPTION: JSM - JavaScript Markup +#+BEGIN_SRC javascript +['div', { 'className': 'header' }, + [['h1', {}, ['Hello']], + ['input', { 'type': 'submit', 'disabled': 'disabled' }, []] + ] +] +#+END_SRC + +As you can see, we have a clear mapping from our notation, JSM, to the +original HTML. Our tree is made up of three element arrays. The first +item in the array is the tag, the second is an object containing the +tag's properties, and the third is an array of its children; which are +all made up of the same three element arrays. + +The truth is though, if you stare at it long enough, although the +mapping is clear, how much fun would it be to read and write that on a +consistent basis? I can assure you, it is rather not fun. But it has +the advantage of being easy to insert into the DOM. All you need to do +is write a simple recursive function that ingests our data structure +and updates the DOM accordingly. We will get back to that. + +So now we have a way to represent a tree of nodes and we +(theoretically) have a way to get those nodes into the DOM. But if we +are being honest with ourselves, while functional, it isn't a pretty +notation nor easy to work with. + +And this is where our object of study enters the scene. JSX is just a +notation that a compiler takes as input and outputs in its place a +tree of nodes nearly identical to the notation we came up with! And if +you look back to our notation you can see that you can easily embed +arbitrary JavaScript expressions wherever you want in a node. As you +may have realized, that's exactly what the JSX compiler does when it +sees curly braces! + +There are three main differences between JSM and the real output of +the JSX compiler: it uses objects instead of arrays, it inserts calls +to React.createElement on children, and spreads the children instead +of containing them in an array. Here is what real JSX compiler output +looks like: + +# #+NAME: foo +# #+CAPTION: foo bar +# #+attr_leanpub: :line-numbers true +#+BEGIN_SRC javascript +React.createElement( + 'div', + { className: 'header' }, + React.createElement('h1', {}, 'Hello'), + React.createElement( + 'input', + { type: 'submit', 'disabled': 'disabled' }) +); +#+END_SRC + +As you can see, it is very similar to our JSM data-structure and for +the purposes of this book we will use JSM, as it's a bit easier to +work with. A JSX compiler also does some validation and escapes input +to prevent cross-site scripting attacks. In practice though, it would +behave the same in our areas of study and we will keep things simple +by leaving those aspects of the JSX compiler out. + +So now that we've worked through JSX we're ready to tackle +~createElement~, the next item on our way to building our own React. + +** Getting Ready to Render with ~createElement~ + +React's ~render~ expects to consume a tree of element objects in a +specific, uniform format. ~createElement~ is the method by which we +achieve that objective. ~createElement~ will take as input JSM and +output a tree of objects compatible with ~render~. + +React expects nodes defined as JavaScript objects that look like this: + +#+BEGIN_SRC javascript +{ + type: NODE_TYPE, + props: { + propA: VALUE, + propB: VALUE, + ... + children: STRING | ARRAY + } +} +#+END_SRC + +That is: an object with two properties: ~type~ and ~props~. The +~props~ property contains all the properties of the node. The node's +~children~ are also considered part of its properties. The full +version of React's ~createElement~ includes more properties but they +are not relevant to our study here. + +#+BEGIN_SRC javascript +function createElement(node) { + // if array (not text, number, or other primitive) + if (typeof node === 'object') { + const [ tag, props, children ] = node; + return { + type: tag, + props: { + ...props, + children: children.map(createElement) + } + }; + } + + // primitives like text or number + return { + type: 'TEXT', + props: { + nodeValue: node, + children: [] + } + }; +} +#+END_SRC + +Our ~createElement~ has two main parts: complex elements and primitive +elements. The first part tests whether ~node~ is a complex node +(specified by an array) and then generates an ~element~ object based +on the input node. It recursively calls ~createElement~ to generate an +array of children elements. If the node is not complex then we +generate an element of type 'TEXT' which we use for all primitives +like strings and numbers. We call the output of ~createElement~ a tree +of ~elements~ (surprise). + +That's it. Now we have everything we need to actually begin the +process of rendering our tree to the DOM! + +** Render + +There are now only two major puzzles remaining in our quest for our +own React. The next piece is: ~render~. How do we go from our JSM tree +of nodes, to actually displaying something on screen? To do that we +will explore the ~render~ method. + +The signature for our ~render~ method should be familiar to you: + +#+BEGIN_SRC javascript +function render(element, container) +#+END_SRC + +This is the same signature as that of React itself. We begin by just +focusing on the initial render. In pseudocode it looks like this: + +#+BEGIN_SRC javascript +function render(element, container) { + const domElement = createDOMElement(element); + setProps(element, domElement); + renderChildren(element, domElement); + container.appendChild(domElement); +#+END_SRC + +Our DOM element is created first. Then we set the properties, render +children elements, and finally append the whole tree to the +container. + +Now that we have an idea of what to build we will work on expanding +the pseudocode until we have our own fully functional ~render~ method +using the same general algorithm React uses. In our first pass we will +focus on the initial render and ignore reconciliation. + +#+BEGIN_NOTE +Reconciliation is basically React's "diffing" algorithm. We will be +exploring it after we work out the initial render. +#+END_NOTE + +#+BEGIN_SRC javascript +function render(element, container) { + const { type, props } = element; + + // create the DOM element + const domElement = type === 'TEXT' ? + document.createTextNode(props.nodeValue) : + document.createElement(type); + + // set its properties + Object.keys(props) + .filter((key) => key !== 'children') + .forEach((key) => domElement[key] = props[key]); + + // render its children + props.children.forEach((child) => render(child, domElement)); + + // add our tree to the DOM! + container.appendChild(domElement); +} +#+END_SRC + +The ~render~ method starts by creating the DOM element. Then we need +to set its properties. To do this we first need to filter out the +~children~ property and then we simply loop over the keys, setting +each property directly. Following that, we render each of the children +by looping over them and recursively calling ~render~ on each child +with the ~container~ set to the current DOM element (which is each +child's parent). + +Now we can go all the way from our JSX-like notation to a rendered +tree in the browser's DOM! But so far we can only add things to our +tree. To be able to remove and modify the tree we need one more part: +reconciliation. + +** Reconciliation +A tale of two trees. These are the two trees that people most often +talk about when talking about React's "secret sauce": the virtual DOM +and the browser's DOM tree. This idea is what originally set React +apart. React's reconciliation is what allows you to program +declaratively. Reconciliation is what makes it so we no longer have to +manually update and modify the DOM whenever our own internal state +changes. In a lot of ways, it is what makes React, React. + +Conceptually, the way this works is that React generates a new element +tree for every render and compares the newly generated tree to the +tree generated on the previous render. Where it finds differences +between the trees it knows to mutate the DOM state. This is the "tree +diffing" algorithm. + +Unfortunately, those researching tree diffing in Computer Science have +not yet produced a generic algorithm with sufficient performance for +use in something like React; as the current best algorithm still [[https://grfia.dlsi.ua.es/ml/algorithms/references/editsurvey_bille.pdf][runs +in O(n^3)]]. + +Since an O(n^3) algorithm isn't going to cut it in the real-world, the +creators of React instead use a set of heuristics to determine what +parts of the tree have changed. Understanding how the React tree +diffing algorithm works in general and the heuristics currently in use +can help immensely in detecting and fixing React performance +bottlenecks. And beyond that it can help one's understanding of some +of React's quirks and usage. Even though this algorithm is internal to +React and can be changed anytime its details have leaked out in some +ways and are overall unlikely to change in major ways without larger +changes to React itself. + +According to the [[https://reactjs.org/docs/reconciliation.html][React documentation]] their diffing algorithm is O(n) +and based on two major components: + + - Elements of differing types will yield different trees + - You can hint at tree changes with the ~key~ prop. + +In this section we will focus on the first part: differing types. In a +later chapter we will discuss and implement the ~key~ prop. + +The approach we will take here is to integrate the heuristics that +React uses into our ~render~ method. Our implementation will be very +similar to how React itself does it and we will discuss React's actual +implementation later when we talk about Fibers. + +Before we get into the code changes that implement the heuristics it +is important to remember that React /only/ looks at an element's type, +existence, and key. It does not do any other diffing. It does not diff +props. It does not diff sub-trees of modified parents. + +While keeping that in mind, here is an overview of the algorithm we +will be implementing in the ~render~ method. ~element~ is the element +from the current tree and ~prevElement~ is the corresponding element +in the tree from the previous render. + +#+BEGIN_SRC javascript + if (!element && prevElement) + // delete dom element + else if (element && !prevElement) + // add new dom element, render children + else if (element.type === prevElement.type) + // update dom element, render children + else if (element.type !== prevElement.type) + // replace dom element, render children +#+END_SRC + +Notice that in every case, except deletion, we still call ~render~ on +the element's children. And while it's possible that the children will +have their associated DOM elements reused, their ~render~ methods will +still be invoked. + +Now, to get started with our render method we must make some +modifications to our previous render method. First, we need to be able +to store and retrieve the previous render tree. Then we need to add +code to compare parts of the tree to decide if we can re-use DOM +elements from the previous render tree. And last, we need to return a +tree of elements that can be used in the next render as a comparison +and to reference the DOM elements that we create. These new element +objects will have the same structure as our current elements but we +will add two new properties: ~domElement~ and ~parent~. ~domElement~ +is the DOM element associated with our synthetic element and ~parent~ +is a reference to the parent DOM element. + +Here we begin by adding a global object that will store our last render +tree, keyed by the ~container~. + +#+BEGIN_SRC javascript +const renderTrees = {}; +function render(element, container) { + const tree = + render_internal(element, container, renderTrees[container]); + // render complete, store the updated tree + renderTrees[container] = tree; +} +#+END_SRC + +As you can see, the change we made is to move the core of our +algorithm into a new function called ~render_internal~ and pass in the +result of our last render to ~render_internal~. + +Now that we have stored our last render tree we can go ahead and +update our render method with the heuristics for reusing the DOM +elements. We name it ~render_internal~ because it is what controls the +rendering but takes an additional argument now: the +~prevElement~. ~prevElement~ is a reference to the corresponding +~element~ from the previous render and contains a reference to its +associated DOM element and parent DOM element. If it's the first +render or if we are rendering a new node or branch of the tree then +~prevElement~ will be ~undefined~. If, however, ~element~ is +~undefined~ and ~prevElement~ is defined then we know we need to +delete a node that previously existed. + +#+BEGIN_SRC javascript +function render_internal(element, container, prevElement) { + let domElement, children; + if (!element && prevElement) { + removeDOMElement(prevElement); + return; + } else if (element && !prevElement) { + domElement = createDOMElement(element); + } else if (element.type === prevElement.type) { + domElement = prevElement.domElement; + } else { // types don't match + removeDOMElement(prevElement); + domElement = createDOMElement(element); + } + setDOMProps(element, domElement, prevElement); + children = renderChildren(element, domElement, prevElement); + + if (!prevElement || domElement !== prevElement.domElement) { + container.appendChild(domElement); + } + + return { + domElement: domElement, + parent: container, + type: element.type, + props: { + ...element.props, + children: children + } + }; +} +#+END_SRC + +The only time we shouldn't set DOM properties on our element and +render its children is when we are deleting an existing DOM +element. We use this observation to group the calls for ~setDOMProps~ +and ~renderChildren~. Choosing when to append a new DOM element to the +container is also part of the heuristics. If we can reuse an existing +DOM element then we do, but if the element type has changed or if +there was no corresponding existing DOM element then and only then do +we append a new DOM element. This ensures the actual DOM tree isn't +being replaced every time we render, only the elements that change are +replaced. + +In the real React, when a new DOM element is appended to the DOM tree, +React would invoke ~componentDidMount~ or schedule ~useEffect~. + +Next up we'll go through all the auxiliary methods that complete the +implementation. + +Removing a DOM element is straightforward; we just ~removeChild~ on +the parent element. Before removing the element, React would invoke +~componentWillUnmount~ and schedule the cleanup function for +~useEffect~. + +#+BEGIN_SRC javascript +function removeDOMElement(prevElement) { + prevElement.parent.removeChild(prevElement.domElement); +} +#+END_SRC + +In creating a new DOM element we just need to branch if we are +creating a text element since the browser API differs slightly. We +also populate the text element's value as the API requires the first +argument to be specified even though later on when we set props we +will set it again. This is where React would invoke +~componentWillMount~ or schedule ~useEffect~. + +#+BEGIN_SRC javascript +function createDOMElement(element) { + return element.type === 'TEXT' ? + document.createTextNode(element.props.nodeValue) : + document.createElement(element.type); +} +#+END_SRC + +To set the props on an element, we first clear all the existing props +and then loop through the current props, setting them accordingly. Of +course, we filter out the ~children~ prop since we use that elsewhere +and it isn't intended to be set directly. + +#+BEGIN_SRC javascript +function setDOMProps(element, domElement, prevElement) { + if (prevElement) { + Object.keys(prevElement.props) + .filter((key) => key !== 'children') + .forEach((key) => { + domElement[key] = ''; // clear prop + }); + } + Object.keys(element.props) + .filter((key) => key !== 'children') + .forEach((key) => { + domElement[key] = element.props[key]; + }); +} +#+END_SRC + +#+begin_note +React is more intelligent about only updating or removing props that +need to be updated or removed. +#+end_note + +#+begin_warning +This algorithm for setting props does not correctly handle events, +which must be treated specially. For this exercise that detail is not +important and we leave it out for simplicity. +#+end_warning + +For rendering children we use two loops. The first loop removes any +elements that are no longer being used. This would happen when the +number of children is decreased. The second loop starts at the first +child and then iterates through all of the children of the parent +element, calling ~render_internal~ on each child. When +~render_internal~ is called the corresponding previous element in that +position is passed to ~render_internal~, or ~undefined~ if there is no +corresponding element, like when the list of children has grown. + +#+BEGIN_SRC javascript +function renderChildren(element, domElement, prevElement = { props: { children: [] }}) { + const elementLen = element.props.children.length; + const prevElementLen = prevElement.props.children.length; + // remove now unused elements + for (let i = elementLen; i < prevElementLen - elementLen; i++) { + removeDOMElement(element.props.children[i]); + } + // render existing and new elements + return element.props.children.map((child, i) => { + const prevChild = i < prevElementLen ? prevElement.props.children[i] : undefined; + return render_internal(child, domElement, prevChild); + }); +} +#+END_SRC + +It's very important to understand the algorithm used here because this +is essentially what happens in React when incorrect keys are used, +like using a list index for a key. And this is why keys are so +critical to high performance (and correct) React code. For example, in +our algorithm here, if you removed an item from the front of the list +you may cause every element in the list to be created anew in the DOM +if the types no longer match up. Later on, in the chapter on keys, we +will update this algorithm to incorporate keys. It's actually only a +minor difference in determining which ~child~ gets paired with which +~prevChild~. Otherwise this is effectively the same algorithm React +uses when rendering lists of children. + +#+CAPTION: Example of ~renderChildren~ 2nd loop when the 1st element has been removed. In this case the trees for all of the children will be torn down and rebuilt. +| i | child Type | prevChild Type | +|---+------------+----------------| +| 0 | span | div | +| 1 | input | span | +| 2 | - | input | + +There are a few things to note here. First, it is important to pay +attention to when React will be removing a DOM element from the tree +and adding a new one as this is when the related lifecycle events or +hooks are invoked. And invoking those lifecycle methods or hooks, and +the whole process of tearing down and building up a component is +expensive. So again, if you use a bad key, like the algorithm here +simulates, you'll be hitting a major performance bottleneck since +React will not only be replacing DOM elements in the browser but also +tearing down and rebuilding the trees of child components. + +** Fibers + +The actual React implementation used to look very similar to what +we've built so far, but with React 16 this has changed dramatically +with the introduction of Fibers. Fibers are a name that React gives to +discrete units of work during the render process. And the React +reconciliation algorithm was changed to be based on small units of +work instead of one large, potentially long-running call to +~render~. This means that React is now able to process just part of +the render phase, pause to let the browser take care of other things, +and resume again. This is the underlying change the enables the +experimental Concurrent Mode as well as running most hooks without +blocking the render. + +But even with such a large change, the underlying algorithms for +deciding how and when to render components is the same. And when not +running in Concurrent Mode the effect is still the same as React does +the render phase in one block still. So using a simplified +interpretation that doesn't include all the complexities of breaking +up the process in to chunks enables us to see more clearly how the +process as a whole works. At this point bottlenecks are much more +likely to occur from the underlying algorithms and not from the Fiber +specific details. In the chapter on Concurrent Mode we will learn more +about Fibers. + +** Putting it all together + +Throughout the rest of the book we will be building on and using our +React implementation so it would be helpful to see it all put together +and working. At this point the only thing left to do is to create some +components and use them! + +#+BEGIN_SRC javascript + const SayNow = ({ dateTime }) => { + return ['h1', {}, [`It is: ${dateTime}`]]; + }; + + const App = () => { + return ['div', { 'className': 'header' }, + [SayNow({ dateTime: new Date() }), + ['input', { 'type': 'submit', 'disabled': 'disabled' }, []] + ] + ]; + } + + render(createElement(App()), document.getElementById('root')); +#+END_SRC + +We are creating two components, that output JSM, as we defined it +earlier. We create one component prop for the ~SayNow~ component: +~dateTime~. It gets passed from the ~App~ component. The ~SayNow~ +component prints out the ~DateTime~ passed in to it. You might notice +that we are passing props the same way one does in the real React, and +it just works! + +The next step is to call render multiple times. + +#+BEGIN_SRC javascript +setInterval(() => + render(createElement(App()), document.getElementById('root')), + 1000); +#+END_SRC + +If you run the code above you will see the DateTime display being +updated every second. And if you watch in your dev tools or if you +profile the run you will see that the only part of the DOM that gets +updated or replaced is the part that changes (aside from the DOM +props). We now have a working version of our own React. + +#+begin_note +This implementation is designed for teaching purposes and has some +known issues and bugs, like always updating the DOM props, along with +other things. Fundamentally, it functions the same as React but if you +wanted to use it in a more production setting it would take a lot more +development. +#+end_note + +** Conclusion + +Of course our version of React elides over many details that React +must contend with, like starting a re-render from where state changes +and event handlers. For understanding how to build high-performance +React applications, however, the most important piece to understand is +how and when React renders components, which is what we have learned +in creating our own mini version of React. + +At this point you should have an understanding of how React works. In +the rest of the book we are going to be refining this model and +looking at practical applications of it so that we are prepared to +build high performance React applications and diagnose any +bottlenecks.