We've all asked ourselves this while waiting an eternity for our CI job to finish. Slow CI not only wrecks developer productivity by breaking our focus, it also costs money in cloud computing fees and wastes enormous amounts of electricity. Let's take a dive into why this is the case and how we can solve it with better, faster tools.
Why is CI so Damn Slow?
AI Generated Video Summary
Slow CI has a negative impact on productivity and finances. Debugging CI workflows and tool slowness is even worse. Dependencies impact CI, and waiting for npm or Yarn is frustrating. The ideal CI job uses native programs for static jobs and lightweight environments for dynamic jobs. Improving formatter performance and linting is a priority. Performance optimization and fast tools are essential for CI and for developers on slower hardware.
1. The Impact of Slow CI on Developer Productivity
Hi everybody. My name is Nicholas. I'd like to talk to you about why your CI is so damn slow. Waiting for CI has a direct, dire effect on productivity and finances. When encountering a time sink like a slow CI, slacking off or starting another task are not effective solutions. Task switching in software development projects leads to less task completion and disrupts flow. Slow CI interrupts the workflow and decreases efficacy and efficiency. Forgetting about a CI job can cause significant delays in merging PRs.
Hi everybody. My name is Nicholas. I'm a software developer at Rome Tools, and I'd like to talk to you all about why your CI is so damn slow.
We've all been there. You've pushed your latest code to GitHub, your CI service is spinning up, and it's taking forever. You wait and wait and wait, only to get your result back 5, 10, even 20 minutes later. It's annoying, it's disruptive, it's a waste of your damn time. Normally, we just accept this as an eternal truth. CI is slow. I don't think that's okay.
Having to wait for CI has a direct, dire effect, both on your productivity and on your finances. Let's start with a resource that matters to all of us: developer time. I don't know about you all, but when I encounter a time sink such as a slow CI, I do one of two things. I slack off, or I start another task. Slacking off is clearly a net negative. Sure, we all deserve breaks, but not every single time we push up some code. Our CI shouldn't determine when we work or not. But Nick, you could just start another task. And yeah, starting another task seems tempting. We're all capable multitaskers. Turns out no, we are not.
In the paper, Task Interruption in Software Development Projects, the authors measured the effect of task switching on productivity. It wasn't good. In particular, they noted that self-interruptions, basically when you purposely switch between tasks, they tend to be more disruptive than external interruptions and lead to less task completion. With slow CI, this flow gets constantly interrupted. You end up switching between the CI jobs and your new task, losing your working context and therefore your efficacy and efficiency. Or worse, you get focused on your new task, forget about your CI job, and only remember a few hours later that your job is done. You ping your reviewer just to realize that they've gone home. Suddenly, a quick PR takes two days or more to merge.
2. Debugging CI Workflows and Tool Slowness
Even worse is when you have to debug a CI workflow. A 15-minute development cycle is not acceptable. Slow CI means more compute time, which means more money. Let's figure out why CI is slow. There are two types of CI jobs, static and dynamic jobs. If your CI job is slow, it's likely because your tools are slow. These tools are also slow in their installation time. Dependencies are a necessary part of modern software development.
Even worse is when you have to debug a CI workflow. Whenever I have to debug one, it's so awful. I end up tweaking a setting, waiting for the job to complete, getting distracted, and only seeing the results 15 minutes later. A 15-minute development cycle is not acceptable in this day and age. But it's not just developer time. Slow CI means more compute time, which, as anyone who has stared in shock at their AWS bill knows, means more money.
One such example is the freedesktop.org GitLab instance, which hosts a bunch of free software projects, such as Mesa, the Linux kernel drivers, and many others. They experienced a massive period of growth from late 2019 to early 2020. However, their expenses ballooned accordingly: $75,000 in 2019, projected to hit $90,000 in 2020. They managed to cut costs before they ran out of money, but still, for an open source project, that's a massive amount to be spending on CI. Let's do better. Let's figure out why CI is slow.
3. Impact of Dependencies on CI
4. The Ideal CI Job and Making Formatting Better
Use three tools and you pay three times the parsing time and three times the memory. Rome handles linting, formatting, bundling, and so on in a single tool. That means one parsing stage and one syntax tree: no wasted compute and no wasted memory.
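To make the idea concrete, here's a toy Python sketch of "parse once, reuse the tree." It uses Python's own `ast` module rather than Rome's actual architecture, and the lint rule is made up for illustration: the source is parsed a single time, and both a lint pass and a format pass operate on that one tree.

```python
import ast

def lint(tree: ast.AST) -> list[str]:
    """Toy lint pass: flag bare print() calls in the shared tree."""
    return [
        f"line {node.lineno}: avoid print in library code"
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == "print"
    ]

def format_code(tree: ast.AST) -> str:
    """Toy format pass: re-emit canonical source from the same tree."""
    return ast.unparse(tree)

source = "print( 1+2 )"
tree = ast.parse(source)   # parse ONCE...
warnings = lint(tree)      # ...every pass reuses the same tree
formatted = format_code(tree)
```

With three separate tools, each would have run its own `ast.parse` and held its own copy of the tree in memory; here the parsing cost and the memory cost are paid exactly once.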
5. Improving Formatter Performance and Linting
One particular feature of developer tools that we love is called error tolerance. A tool is error tolerant if it works even when your code has errors, whether they're syntactic or semantic. We put a lot of work into making the formatter error tolerant. That way, it can still make your code look good no matter what.
We're super-excited to announce that our formatter will be released by March 28th. We'd love to give a sneak peek of its performance. First up, we have Dojo, a front-end framework. Running the formatter on a bundled library, Prettier takes 75 milliseconds, which is not bad, but Rome is all the way down at 10 milliseconds. For Svelte, Prettier is up to 750 milliseconds, almost a second, while Rome is all the way down at 200 milliseconds. And finally, there's the behemoth that is TypeScript's checker.ts, clocking in at over 44,000 lines, or 2.64 megabytes. I don't know if you've ever opened this file in an editor, but I'll tell you now, it slows mine down to a crawl. Prettier takes over two seconds to format this file. We do it in 335 milliseconds.
CST-based Linting and Q&A
That means converting a CST back into code with only a minimal diff is extremely easy, which in turn opens up a whole host of powerful code modification features. Our linter will come out by the end of Q2, and we're really excited to demonstrate the possibilities of cross-language CST-based linting.
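Here's a minimal Python sketch of the core CST property (a toy tokenizer for illustration, not Rome's implementation): because the CST keeps trivia such as whitespace and comments, the token stream round-trips to the original source byte-for-byte, and an edit touches only the tokens you mean to change, leaving everything else untouched.

```python
import re

def tokenize_with_trivia(src: str) -> list[str]:
    # Split into word tokens vs. everything else (punctuation AND trivia
    # like spaces and comments). Nothing is discarded.
    return re.findall(r"\w+|\W+", src)

def rename(tokens: list[str], old: str, new: str) -> list[str]:
    # Edit only whole matching tokens; trivia tokens pass through as-is.
    return [new if t == old else t for t in tokens]

src = "let  x = 1;  // important comment\nprint(x);"
tokens = tokenize_with_trivia(src)
assert "".join(tokens) == src   # lossless round-trip: CST preserves the source
patched = "".join(rename(tokens, "x", "count"))
```

Note how the double space and the comment survive the edit: the resulting diff is exactly the renamed identifiers and nothing else, which is what makes CST-based code modification so attractive.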
If these goals sound interesting, we're going to be hiring soon. Feel free to reach out at Nicholas at RomeTools.com or via Twitter. Thank you.
Yeah, a really great talk, and Nicholas, thank you for sharing that with us. It actually went pretty quick for me, for a talk about CI being so slow, right? Yeah, I didn't want to waste your time. But it was a good one, and you've unpacked a lot there, so great talk. Let's check out our Slido results. I'm really intrigued to see what people are thinking: what is their slowest and most annoying tool? Let's check that out together. So Bundler, 50% say Bundler. OK, that's interesting. But in second place, we have IDE. So if you want to share in the chat which IDEs you're using, we'd love to hear which ones you find pretty slow. And I'd love to unpack that "et cetera" option too, to know what's in third place. So tell us what other tools you think didn't make the list. We'd love to hear about those. And yeah, let's hear what people had to ask, Nicholas. Let's start with some of the Q&A.
All right, cool. So let's start with: what about testing? Isn't there an inherent performance issue with testing? Yes, I would agree with that. I would say that testing is arguably the thing that makes CI the slowest, and there's an inherent bound, since tests are just code that you have to run. But I think there are potential solutions in this area.
Optimizing Runtime and Testing in CI
Yeah, on the testing front, I agree. I think there's plenty that can be done there to optimize. So, beyond testing, what are the other heavy lifts in that process? Just wondering what else, you know?
Yeah, so I would say testing. Benchmarking is another classic one because, again, with benchmarking, you've got to run the code, and oftentimes you have to run it multiple times. A lot of modern benchmarking frameworks do statistical analysis, and so they have to run the code a statistically significant number of times. So there isn't as great a solution there, because, again, it's benchmarking. But I think there are some techniques that we can steal from large companies. For instance, I know that at Facebook, they only run the benchmarks and tests relevant to the commit. They don't run the entire testing suite, because obviously Facebook is running at Facebook's scale and you can't run every test there is. So we're really excited to see how other people tackle this problem, and also really excited to show people our solutions to it.
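As a rough illustration of running only the relevant tests for a commit, here's a hedged Python sketch. The file-to-test mapping and all the names are invented for the example (this is not Facebook's actual system): select only the tests mapped to the changed files, and fall back to the full suite when a file isn't in the map.

```python
# Hypothetical mapping from source files to the tests that cover them.
TEST_MAP = {
    "src/parser.py":    ["tests/test_parser.py"],
    "src/formatter.py": ["tests/test_formatter.py", "tests/test_cli.py"],
    "src/linter.py":    ["tests/test_linter.py", "tests/test_cli.py"],
}

def affected_tests(changed_files: list[str]) -> set[str]:
    """Return only the tests relevant to the changed files."""
    selected: set[str] = set()
    for f in changed_files:
        if f not in TEST_MAP:
            # Unknown file: play it safe and run the whole suite.
            return {t for tests in TEST_MAP.values() for t in tests}
        selected.update(TEST_MAP[f])
    return selected

# In CI you would feed this the output of `git diff --name-only`.
tests = affected_tests(["src/formatter.py"])
```

Real systems derive the mapping automatically (from build graphs or coverage data) rather than maintaining it by hand, but the selection logic is the same idea.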
Awesome. So Sissy Miller asks, coming from the PHP world, Pest PHP is rocking the testing world over there and making tests massively faster. And they'd like to ask: would you agree that some of the slowness of CI can be bypassed by cleaner code? I would, certainly, of course. I think there's both cleaner code in the tools and also cleaner code in your normal code. Obviously, if you're running the code, you want your code to be fast and not slow. And I would say, from my perspective as someone who writes tools, that performance is something that has been neglected in the tooling world.
Performance Optimization and Shout-outs
I would like to give a shout-out to the authors of ESBuild and SWC for thinking about performance. Our benchmarks are significantly better now, around 20% to 30%. There's always more performance to squeeze out.
And, you know, I would like to give a shout-out to a lot of people who are writing tools and thinking about performance really carefully, such as the authors of ESBuild and the authors of SWC. Ourselves at Rome, we put a lot of effort into that. And in fact, the benchmarks I showed you in the talk are actually significantly out of date. We've been doing a lot of performance optimization since then, and I would say that our numbers are now perhaps 20% or 30% better than that, which is really fantastic work, and I'm so pumped to show that to people. That's really cool. Yeah, there's always more that you can do, more performance you can squeeze out. So yeah.
Performance Improvements and Fast Tools
At Rome, we have implemented performance improvements by optimizing the parsing process and minimizing redundant work in formatting. We aim to make parsing faster by reducing backtracking when the parser encounters a different syntactic construct than expected. Additionally, we focus on ensuring the formatter performs only the necessary actions, without unnecessary text rearrangement. Fast tools are essential not only for CI but also for developers using slower hardware. It is unfair to write code that only works on high-end machines.
Just, I guess, share a little bit: what kind of performance improvements have you implemented at Rome? Just off the top of your head. So for instance, a big thing that you always have to do is parse the code. And parsing is notoriously slow for a bunch of reasons. First, you have to read in the code from I/O. And second, you have to create this big data structure. So we've been doing a lot of work to make sure that when you parse, you're doing the minimal amount of work, because oftentimes you might parse something and then realize that it's a different syntactic construct. For instance, your parser might be parsing some code and realize, oh, wait a minute, it's not a JSX expression, it's a TypeScript type assertion, because it turns out that these two things look very, very similar. When the parser realizes that something is different than expected, it has to backtrack. We're trying to do that less and less so that the parsing is way, way faster.
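Here's a small Python sketch of the backtracking being described. It's a hypothetical toy parser, not Rome's: the parser checkpoints its position, speculatively tries one interpretation, and rewinds on failure. Every rewind is wasted work, which is why doing it less often makes parsing faster.

```python
class Parser:
    """Toy speculative parser over a pre-tokenized input."""

    def __init__(self, tokens: list[str]):
        self.tokens = tokens
        self.pos = 0

    def checkpoint(self) -> int:
        return self.pos

    def rewind(self, mark: int) -> None:
        self.pos = mark

    def eat(self, kind: str) -> bool:
        if self.pos < len(self.tokens) and self.tokens[self.pos] == kind:
            self.pos += 1
            return True
        return False

    def parse_angle_bracket(self) -> str:
        # `<T>...` is ambiguous: JSX element or TypeScript type assertion?
        mark = self.checkpoint()
        if self.eat("<") and self.eat("ident") and self.eat(">") and self.eat("jsx_text"):
            return "jsx"
        self.rewind(mark)   # wrong guess: throw away the work and retry
        if self.eat("<") and self.eat("ident") and self.eat(">") and self.eat("expr"):
            return "type_assertion"
        self.rewind(mark)
        return "error"

kind = Parser(["<", "ident", ">", "expr"]).parse_angle_bracket()
```

The optimization the talk describes amounts to deciding earlier (with cheaper lookahead) which branch is right, so the expensive speculative attempt and rewind happen as rarely as possible.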
We're also putting in a lot of work into formatting, into making sure that the formatter is doing just the bare minimum it needs to. And it really isn't doing redundant work in moving text around and rearranging the code.
Okay, I see that, folks, I'm just going to encourage folks to first, reminding you about the Slido, if you want to tell us what other tools you find annoying and slow, we'd love to hear from the 11% of you. And also which IDEs you're using. We'd love to hear that. And also if you answer in time, maybe Nicholas can give us thoughts on why he thinks those are specifically slow tools. But, I mean, I guess from my personal perspective, I'd like to say that I don't find my CI very slow. So why is this something that should matter? Why should I care about this? That's a really good point. And, you know, I'm happy that your CI isn't slow. I'm happy that, you know, you're not waiting like some other people are. I would say that I think Fast Tools are universally a good thing, even if it's not for your CI. I think, you know, it's really, again we have to remember that not everybody is running, you know, a super modern computer. Like, as I said in the talk, you know, we, like often as developers at, you know, companies we get, you know, modern hardware, like, you know, pretty often. But like, you know, there's plenty of people out there who are just starting out, or maybe at companies who don't have a budget where they're running on, you know, slower hardware and we have to accommodate them. I think it's really not fair to write code that only works on, you know, a brand new MacBook Pro. It makes sense. That's, you know, yeah. Yeah, I can appreciate that, obviously. So that was spoken from very deeply within Turbulent.
Slow Internet Connections and Linter Details
Running code on slower internet connections is an important lesson for developers. Our linter focuses on the slowest rules in ESLint to provide fast feedback. While there is currently no collaboration to share a single AST among popular tools, the Tree-sitter project is an inspiring example of a parser that can parse multiple languages. Sharing ASTs would require significant changes. Cece Miller asks about AI interpretation of code, such as GitHub Copilot.
That was one of those things, and I'm like, okay. Yeah, I mean, I've been guilty of that too. You know, I run the code and I think, oh, it runs fast. But I forget which company it is, they had a habit of deliberately throttling their own internet connection for the express purpose of accommodating people who are on a 3G or even a 2G connection, because I think that's a really important lesson to give to developers.
Yeah, that's very true. Dainy asks, can you share some details about the linter? How will it compare to ESLint in terms of rules, plugins, and other factors? Yeah, so I'm really excited about this one, because linting is something that is and has been slow for a while. And obviously, ESLint is a fantastic project, and they have just a lot of rules, and I'm not going to come out and say that we're going to capture all these rules at once, because, as everyone always says to me, Rome was not built in a day. So what I would say is that we're focusing on the rules that are the slowest in ESLint, the things that really make it painful, and then we're going to build a linter for those specific rules. The idea, maybe, is that you can run our linter and get fast feedback for certain things. And if you want a full, complete linting setup, you can still run ESLint and get all of the feedback from ESLint, but you'll get the fast feedback first.
ML and Parser Error Recovery
I've played around with Tabnine and Copilot, and I find them to be really cool projects. While Rome is not currently heading in that direction, we are open to exploring ML plus code in the future. One area we're considering is using machine learning for parser error recovery, leveraging data sets of incorrect code to estimate the intended code and offer fixes or better semantic analysis. Regarding supporting custom linter rules, we're still figuring that out. We aim to provide a wide variety of rules, but we acknowledge the flourishing ESLint community.
Personally, I've played around with them. I'm a huge fan of Tabnine and Copilot as well. I think those are really cool projects. Yeah, Tabnine is a fantastic project. I wouldn't say that that's necessarily the direction we're taking at Rome right now, but it's definitely something that we're open to exploring in the future. I think ML plus code is a really fascinating area that will definitely develop more in the coming years.
I do think one thing that we have looked into, and that I have been considering, is perhaps using machine learning for parser error recovery. As I noted in the talk, one big thing about our parser is that it has very good error recovery. Basically, it can notice when there's an error and try to guess what you meant by it. Right now we do that with some heuristics, but in the future it could be a really interesting opportunity for machine learning, where you can use data sets of incorrect code to estimate what the intended code is and then offer a fix, or just better semantic analysis.
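As a rough illustration of the heuristic kind of recovery being described (a toy example with made-up rules, not Rome's actual heuristics): when the parser sees a token that can only start a new statement, it guesses that the previous statement is missing its terminator, records a diagnostic, and keeps parsing instead of giving up.

```python
def parse_statements(tokens: list[str]) -> tuple[list[str], list[str]]:
    """Toy error-tolerant statement splitter with one recovery heuristic."""
    stmts: list[str] = []
    errors: list[str] = []
    current: list[str] = []
    for tok in tokens:
        if tok == ";":
            stmts.append(" ".join(current))
            current = []
        elif tok == "let" and current:
            # Heuristic: `let` can only start a statement, so the previous
            # statement must be missing its ';'. Recover by closing it and
            # recording a diagnostic, rather than aborting the parse.
            errors.append(f"missing ';' before 'let' (after '{' '.join(current)}')")
            stmts.append(" ".join(current))
            current = [tok]
        else:
            current.append(tok)
    if current:
        stmts.append(" ".join(current))
    return stmts, errors

# Input is missing the ';' after `let x = 1` -- the parser recovers anyway.
stmts, errors = parse_statements(["let", "x", "=", "1", "let", "y", "=", "2", ";"])
```

The ML angle from the talk would replace hand-written rules like the `let` check with a model trained on real-world broken code, so the "guess what the user meant" step generalizes beyond the cases the tool authors anticipated.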
That's a really good question. I'm gonna be honest, we're still trying to figure that one out ourselves. Again, we have this idea of Rome being the scout that goes out ahead: if our rules flag something, you should fix those issues first. And then, once your code is correct for our rules, you can run ESLint and get the complete feedback. I think we're going to do our best to offer a wide variety of rules, and we're going to keep adding more, but the reality is that ESLint does a wonderful job and is also a very flourishing community. And we have to acknowledge that.
Awesome. You're getting really great feedback here. Just gonna tell you what the community thinks: really good insights from your partner in the discussion room, in the build room, Ante Tomić. Really good insights. He's curious about Rome, wants to check it out, and there's a lot of really great feedback about the talk. Now is the time, folks, this is it. We're coming down to the wire. Just get your last questions in for Nicholas.