Automated Performance Regression Testing with Reassure

As developers we love to dive into performance metrics, benchmarks, compare one solution to another. Whether we enjoy it or not, we’re often required to fix performance issues in our React and React Native apps. But this process is not sustainable and prone to regressions, especially as the app and team grow. What’s worse, those issues are often discovered by your users, making their experience miserable. In my talk I’ll introduce you to Reassure—a performance regression testing library for React and React Native— which happens to be a missing piece in our automated testing and performance suites. Spotting problems before they hit production.

16 min
24 Oct, 2022

Video Summary and Transcription

Today's talk introduces Reassure, a performance monitoring tool for React and React Native codebases. It highlights the need for catching performance regressions early in the development process and identifies JavaScript misuse as a common source of performance issues. Reassure, developed by Callstack, is presented as a library that integrates with existing ecosystems and provides reliable render time measurements and helpful insights for code review. Considerations for operating in a JavaScript VM are discussed, including JIT, garbage collection, and module resolution caching. Statistical analysis using the z-score is mentioned as a method for determining the significance of measurement results.

1. Introduction to Performance Monitoring

Short description:

Today, I'm going to talk about performance monitoring in React and React Native codebases with Reassure. Entropy is the increase of disorder, which distinguishes the past from the future. As developers, we fight against entropy by following a development cycle and addressing bugs. However, even with a well-designed workflow, negative reviews can still appear.

Hi, today I'm going to talk about performance monitoring and how to make it happen in your React and React Native codebases with Reassure. My name is Michał Pierzchała, I'm Head of Technology at Callstack, responsible for our R&D and open source efforts. I'm also a core contributor to a bunch of libraries, currently maintaining the React Native CLI and React Native Testing Library.

Let's start with some inspiration, shall we? Anyone heard of entropy? Not really this one. The real world entropy, described by physics like this. Or how Stephen Hawking framed it. You may see a cup of tea fall off a table and break into pieces on the floor, but you will never see the cup gather itself back together and jump back on the table. The increase of disorder, or this entropy, is what distinguishes the past from the future, giving a direction to time. Or in other words, things will fall apart eventually when unattended.

But let's not get too depressed or comfortable with things just turning into chaos, because we can and do fight back against it. We can exert effort to create useful types of energy and order, resilient enough to withstand the unrelenting pull of entropy, by expending this energy. When developing software, we kind of feel entropy is a thing. That's why we usually put in some extra effort and follow some kind of development cycle. For example, we start with adding a new feature. During development we sprinkle it with a bunch of tests. When done, we send it to QA. QA approves it and promotes our code to a production release. And we're back to adding another feature.

But that's quite a simplified version of what we usually do. Let's complicate it a little bit. Among other things, we didn't take into account that bugs may suddenly appear. Now our circle becomes more of a graph, but that's okay because we know what to do. We need to identify the root cause, add a regression test so it never breaks again, send it to QA once again, ship it, and we're back to adding new features.

So we're happy with our workflow. It works pretty well. We're adding feature after feature, our app release is so well designed that even adding 10 new developers doesn't slow us down. And then we take a look at our app reviews to check what folks think. And a wild one-star review appears. And then another one comes in. And they just...

2. Challenges with Performance Monitoring

Short description:

Our perfect workflow is not resilient to performance regressions. We need a way to spot them before they impact our users. Treating performance issues as bugs allows us to catch regressions early in the development process. To find the best tool for performance testing, we need to consider the impact and target the most likely regressions. Most performance issues originate from the JavaScript side, particularly from React misuse. We estimate that around 80% of the time spent fixing performance issues is in the JavaScript realm. We found a React performance testing library that looked promising and was worth exploring.

they just keep on coming. And we start to realize that our perfect workflow based on science, our experience, and best practices, which was supposed to prevent our app from falling apart, is not resilient to a particular kind of bug: performance regressions. Our codebase doesn't have the tools to fight these. We know how to fix the issues once spotted, but we have no way to spot them before they hit our users.

So how was it, once again? Or... Performance will fall apart eventually when unattended. So if I don't do anything to optimize my app while adding new code and letting time go by, it will get slower. And we don't know when it will happen. Maybe tomorrow, maybe in a week, or in a year. If only there were an established way of catching at least some of the regressions early in the development process, before our users notice. Wait a minute, there is! If we start treating performance issues as bugs, we don't even need to break out of our development workflow. Regression tests run in a remote environment, on every code change, so we just need to find a way to fit performance tests there, right?

But before we go on a hunt for the best tool, let's take a step back and think about impact and what's worth testing. As with any test coverage, there is a healthy ratio that we strive for, to give us the best value for the lowest amount of effort. We want to make sure to target the regressions that are most likely to hit our users. And apparently, we are developing a React Native app. By the way, did you know there's a font named Impact? You've probably seen it in memes.

Anyway, take a look at the typical performance issues Callstack developers are dealing with daily. Slow lists and images, SVGs, React context misuse, re-renders, slow TTI, just to name a few. If we look at this list from the origin-of-issue point of view, we'll notice that the vast majority of these come from the JavaScript side. Now, let's check the relative frequency. And what emerges is pretty telling. We estimate that most of the time our developers spend fixing performance issues, around 80%, goes to issues that originate in the JavaScript realm, especially from React misuse. Only the rest is bridge communication overhead and native code, like image rendering or database operations working inefficiently.

But I'm not a fan of reinventing the wheel, so I've done my googling for a React performance testing library, and I found this package. It looks promising. Let's see what's inside. It's not very popular, but that's okay. Last release was 9 months ago.

3. Introduction to Reassure

Short description:

We need a new library that integrates with our existing ecosystem, measures render times reliably, provides a CI runner, generates readable and parsable reports, and offers helpful insights for code review. Introducing Reassure, a performance regression testing companion for React and React Native apps. Developed by Callstack in partnership with Entain, Reassure enhances the code review process by integrating with GitHub. It runs Jest through Node with special flags to increase stability and uses the React Profiler to handle measurements reliably. Reassure compares test results between branches and provides a summary of statistically categorized results. Embracing stability and avoiding flakiness is key when running benchmarks, especially in Node.js.

That's okayish. What else? It monkey patches React. That's not okay. It uses React internals as well. Well, that's a bummer. It's not a good fit for our use case and doesn't really look like a solid foundation to build on.

But what do we actually need from such a library? Well, ideally, it should integrate with the existing ecosystem of libraries we're using. It should measure render times and counts reliably, have a CI runner, generate readable and parsable reports, provide helpful insights for code review, and, looking at the library we just googled, have a stable design. And since there's nothing like this out there, we need a new library.

And I'd like to introduce you to Reassure, a performance regression testing companion for React and React Native apps. It's developed at Callstack in partnership with Entain, one of the world's largest sports betting and gaming groups. Reassure builds on top of your existing setup and sprinkles it with an unobtrusive performance measurement API. It's designed to be run on a remote server environment as a part of your continuous integration suite. To increase the stability of results and decrease flakiness, Reassure will run your tests once for the current branch and once more for the base branch. Delightful developer experience is at the core of our engineering design. That's why Reassure integrates with GitHub to enhance the code review process. Currently, we leverage Danger.js as our bot backend, but in the future we'd like to prepare a plug-and-play GitHub action.

Now, let's see what it does. Reassure runs Jest through Node with special flags to increase stability. The measureRender function we provide uses the React Profiler to handle measurements reliably, allowing us to avoid monkey-patching React. After the first run is completed, we switch to the base branch and run the tests again. Once both test runs are completed, the tool compares the results and presents a summary, showing statistically categorized results that you can act upon.

Let's go back to our example. Notice how we created a new file with the .perf-test.tsx extension that reuses our regular React Testing Library component test in a scenario function. The scenario is then used by the measurePerformance method from Reassure, which renders our counter component, in this case, 20 times. Under the hood, the React Profiler measures render counts and durations for us, which we then write down to the file system. And that's usually all you have to write. Copy-paste your existing tests, adjust, and enjoy.

Running benchmarks is not a piece of cake, even in non-JS environments. But it's particularly tricky with Node.js. The key is embracing stability and avoiding flakiness.
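To make the compare step above more concrete, here is a minimal sketch of how per-run measurements from the base branch and the current branch could be averaged and diffed. It is an illustration only: the RunEntry, Summary, and Comparison shapes and the summarize and compare helpers are hypothetical names made up for this example, not Reassure's actual types, file format, or implementation.

```typescript
// Hypothetical shapes for illustration; not Reassure's real data model.
interface RunEntry {
  durationMs: number;  // total render duration recorded for one run
  renderCount: number; // number of renders observed during that run
}

interface Summary {
  meanDuration: number;
  meanCount: number;
}

interface Comparison {
  name: string;
  baseline: Summary;
  current: Summary;
  durationDiffMs: number;
  // Relative change versus the baseline; null when the baseline mean is 0.
  durationDiffRatio: number | null;
  countDiff: number;
}

// Averages durations and render counts over all runs of one scenario.
function summarize(runs: RunEntry[]): Summary {
  if (runs.length === 0) {
    throw new Error("at least one run is required");
  }
  let durationTotal = 0;
  let countTotal = 0;
  for (const run of runs) {
    durationTotal += run.durationMs;
    countTotal += run.renderCount;
  }
  return {
    meanDuration: durationTotal / runs.length,
    meanCount: countTotal / runs.length,
  };
}

// Compares the current branch against the base branch for one scenario.
function compare(
  name: string,
  baselineRuns: RunEntry[],
  currentRuns: RunEntry[],
): Comparison {
  const baseline = summarize(baselineRuns);
  const current = summarize(currentRuns);
  const durationDiffMs = current.meanDuration - baseline.meanDuration;
  return {
    name,
    baseline,
    current,
    durationDiffMs,
    durationDiffRatio:
      baseline.meanDuration === 0 ? null : durationDiffMs / baseline.meanDuration,
    countDiff: current.meanCount - baseline.meanCount,
  };
}
```

A positive durationDiffMs or countDiff would point at a possible regression in that scenario. Whether the difference is meaningful or just noise is a separate question that needs the statistical treatment discussed in the next chapter.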

4. Considerations for JavaScript VM

Short description:

Operating in a JavaScript VM, we need to consider JIT, garbage collection, and module resolution caching. Statistical analysis requires running measurements multiple times. The z-score is used to determine the statistical significance of results.

Operating in a JavaScript VM, we need to take JIT, garbage collection, and module resolution caching into account. There's a cost of concurrency, which our test runner embraces for execution speed. We need to pick what to average and what to percentile. And a lot more. Take statistical analysis, for example. To make sure our measurement results make sense mathematically, running them once or twice is not enough. Taking other things into account, we've figured ten times is a good baseline. Then, to determine the probability of the result being statistically significant, we need to calculate the z-score, which needs the mean value (or average), the divergence, and the standard deviation. This gave me flashbacks from college, so I'm not gonna dive any deeper here.
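For readers who want to see the z-score idea in code, here is a minimal sketch under stated assumptions. It treats the baseline runs as the reference distribution and asks how many standard errors the current mean is away from the baseline mean. The formula choice, the helper names (mean, stdev, zScore, isSignificant), and the 1.96 cutoff (the conventional two-sided 5% level for a normal distribution) are assumptions made for this illustration; the talk does not spell out the exact formula or thresholds Reassure uses.

```typescript
// Arithmetic mean of a non-empty sample.
function mean(xs: number[]): number {
  if (xs.length === 0) throw new Error("need at least one sample");
  return xs.reduce((a, b) => a + b, 0) / xs.length;
}

// Sample standard deviation (n - 1 in the denominator).
function stdev(xs: number[]): number {
  if (xs.length < 2) throw new Error("need at least two samples");
  const m = mean(xs);
  const variance =
    xs.reduce((acc, x) => acc + (x - m) ** 2, 0) / (xs.length - 1);
  return Math.sqrt(variance);
}

// z = (currentMean - baselineMean) / (baselineStdev / sqrt(n)),
// where n is the number of current runs. This is an assumed formulation.
function zScore(baseline: number[], current: number[]): number {
  const diff = mean(current) - mean(baseline);
  const sigma = stdev(baseline);
  if (sigma === 0) {
    // No variation in the baseline: any difference is infinitely many
    // standard errors away, and no difference is exactly zero.
    return diff === 0 ? 0 : Math.sign(diff) * Infinity;
  }
  return diff / (sigma / Math.sqrt(current.length));
}

// Assumed cutoff: |z| > 1.96 corresponds to a two-sided 5% level.
function isSignificant(z: number, threshold = 1.96): boolean {
  return Math.abs(z) > threshold;
}
```

With durations collected from, say, ten runs per branch, zScore(baselineDurations, currentDurations) returns a positive value when the current branch is slower on average, and isSignificant tells you whether that gap is large relative to the noise in the baseline. With only one or two runs the standard deviation is either undefined or very unreliable, which is one way to see why a single measurement is not enough.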
