How I Automated Code Changes for 100 Repositories: Getting Started With Codemods



So, hello everyone! How's it going? Right, cool. Are you enjoying the conference? Okay, cool. How many developers are here? Wow, quite a lot. How many lazy developers are here? All right, a bit fewer; let me rephrase. How many lazy developers who like to automate things? Whoa, I like it, now we're talking. Okay, today I will share how I automated code changes for 100 repositories. My name is Konstantin. I work at Xebia in the Netherlands as a software development consultant, and I also help the GitNation team organize conferences. So feel free to reach out to me on Twitter or LinkedIn.
All right, so my talk is connected to my current customer, and to understand the problem and the solution better, I'll give you a few details about the scale of the product itself. I'm working at the largest marketplace in the Netherlands. It has millions of visitors daily, and today it has 90 million live ads, with hundreds of thousands of new ads every day. So if you come from the Netherlands, you for sure know this product. If you are from Germany, you can compare it to eBay DE.
As of today, we run 59 standalone BFF (backend for frontend) services in production. On average, there are more than 15 production releases daily, and there is no limit on that. To support all the required features, we have more than 100 frontend-related repositories with JavaScript or TypeScript code. And to make it work properly, the code ownership is spread across different frontend teams. Those frontend teams are part of bigger domain teams, or full-stack teams, you can call them. And those are strongly focused only on building business logic. So ideally, nothing else. And to support the product teams, we have a frontend platform team. This is the team I work in. Our goal basically is to enable other frontend engineers to deliver secure, reliable, and efficient software. So my team is responsible for many things in the frontend platform itself.
So we can say it's builds, deployments, the React server-side rendering engine, design system libraries, performance, security, monitoring, observability. All those issues are handled by the frontend platform team.
And this distributed setup has a lot of advantages. At the same time, it obviously has some challenges, right? For example, the libraries are published as npm packages, and a change in one library usually leads to multiple updates in dependent projects. Of course, we have some internal tools to track and orchestrate all these dependencies between the projects. And we utilize RenovateBot, for instance, to perform automated updates for them. Well, that might look like overhead, maybe a hassle, but we benefit a lot from this approach. If you want to discuss the architecture in more depth, you can find me after the talk or reach out to me on Twitter. Yeah, there is a Twitter handle at the bottom. You can follow me, so we can discuss the architecture. But today I'm going to talk about other things.
Let's get closer to a more real use case. Imagine you're maintaining a design system library that exports a PrimaryButton component. So far, this component has only one property, an onClick handler. And there's another project, for instance, some backend for a frontend, which has a Toast component. It doesn't really matter what this component does. You only need to know that it imports and uses PrimaryButton, which is provided by the design system package, version 1. And as the design system library maintainer, one day you decide to make some changes. So PrimaryButton no longer exists. It's replaced by just Button, which is now exported. And the new component has an extra property, as you can see, the property kind, which is basically responsible for how the button looks. So from the perspective of the design system package, this is a breaking change. That's why we release version 2, to follow semantic versioning.
Now, to update the project to the latest design system package, we need to fix our Toast component. So we basically need to replace PrimaryButton with Button. We need to add the new property kind to Button, and we use the value primary for it. And everything else should stay the same, like children or other properties. So far it's obvious what to do. This git diff looks really simple, but someone has to make those changes. We can wait until other frontend teams pick up the changes and fix these dependencies. But developers are lazy, right? You know it. And as a platform team, we don't want our changes to go stale and somehow block other frontend teams. So we need to support them with that. But how can we do it? Because there are many files using this component and the library. And let's even multiply it by the number of projects that, for instance, use the design system.
Automation to the rescue. This is an open-ended pipeline I use to automate such changes. The first step is codebase preparation, like GitHub Codespaces or checking out some projects on your local machine. Then we modify the code the way we want. Then we apply auto-fixes, like code styling or ESLint fixes. And to make sure that nothing is broken, we run the tests and the build. And the final step, of course, is to deliver or commit your changes.
Okay, let's focus only on the code modification step. We can update code in many automated ways, but the most powerful way is based on the abstract syntax tree. If you are not familiar with the abstract syntax tree, you can basically think of it as an object representation of code after it has been parsed. So, for example, this is a function named grid, right? This function returns a string, hibernium. And this is the abstract syntax tree, which looks like a tree of nodes with attributes. The defined function corresponds to a function declaration node that has the identifier grid, and it doesn't have any params. The body of this function is just an array of statements.
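As a rough illustration of the tree described here, the following is a hand-written, heavily simplified AST for that function, using the function name and string as they appear in the transcript. The node type names follow Babel's conventions (for example `StringLiteral`), which is an assumption about what the slide showed; real parser output carries many more fields such as source locations. The small `walk` helper is hypothetical and only exists to show that the tree is an ordinary nested object.

```javascript
// Simplified, hand-written AST for:
//   function grid() { return "hibernium"; }
const ast = {
  type: "FunctionDeclaration",
  id: { type: "Identifier", name: "grid" },
  params: [], // the function takes no parameters
  body: {
    type: "BlockStatement",
    body: [
      {
        type: "ReturnStatement",
        argument: { type: "StringLiteral", value: "hibernium" },
      },
    ],
  },
};

// Hypothetical helper: visit every nested object that has a `type` field.
function walk(node, visit) {
  if (Array.isArray(node)) {
    node.forEach((child) => walk(child, visit));
  } else if (node && typeof node === "object") {
    if (typeof node.type === "string") visit(node);
    Object.values(node).forEach((child) => walk(child, visit));
  }
}

const nodeTypes = [];
walk(ast, (node) => nodeTypes.push(node.type));
console.log(nodeTypes.join(" > "));
// FunctionDeclaration > Identifier > BlockStatement > ReturnStatement > StringLiteral
```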
But our function has only one return statement, which returns the string literal hibernium. Okay, so we can parse JavaScript or TypeScript code into an abstract syntax tree, or basically just an object. You can think of it as an object. And that gives us the possibility to update the tree easily, because we can manipulate this object. For instance, we can add, remove, replace, or even update some nodes. And if we generate new source code from the mutated abstract syntax tree, we get new, modified source code. You know tools like Babel or ESLint? This is basically what's happening there under the hood.
So let's get back to our example with Toast and the design system package. We can use Babel or another parser to convert source code to an abstract syntax tree, but how do we mutate the nodes effectively? We can traverse manually, search it somehow, but there are better ways. There are several great tools that could help us with that, but I find jscodeshift to be one of the most powerful tools for that. And I really think that it is an underused and underestimated tool. It is basically a codemod toolkit that was open sourced by Facebook in 2015. So it's not a new library, but React is two years older than jscodeshift, and we are still talking about React at this conference today. jscodeshift is used under the hood by many other projects. For instance, you can find it as a dependency in React, Remix, Blitz.js, Storybook, you name it. There are many of them.
And to modify your tree, jscodeshift provides you with a simple API. For instance, you provide the location of a transformation function as one of the parameters, and you need to provide a path to the files you want to modify. A lot of codemods are already written. You can find them on the Internet and on GitHub. But let's create one for our specific use case. Well, this is the source code of a transformation function.
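The slide itself is not part of the transcript, so the following is only a sketch of the usual shape of a jscodeshift transform, not the speaker's actual code. The file name `transform.js` is taken from later in the talk; the comment marking where mutations go is a placeholder.

```javascript
// transform.js: a minimal jscodeshift transform that changes nothing.
// jscodeshift calls this function once for every file it processes.
function transform(fileInfo, api, options) {
  const j = api.jscodeshift;

  // Parse the file's source text into a collection wrapping the AST.
  const root = j(fileInfo.source);

  // ...find and mutate nodes on `root` here...

  // Print the (possibly mutated) AST back to source code.
  return root.toSource();
}

module.exports = transform;
```

A transform like this is typically run with something along the lines of `npx jscodeshift -t transform.js path/to/files`, where the path is whatever part of the project should be rewritten.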
It looks scary at first glance, but you will for sure understand it in the next couple of minutes. Bear with me. The transformation, or transform in this case, is just one JavaScript function that accepts several parameters. You can see fileInfo: these are details about the file being processed. Then api: this is an object that provides access to the jscodeshift helper functions. And options is a kind of extra parameters you can use for parsing or manipulating the tree. You can provide them from external calls, for instance. And this slide is crucial for understanding. At the first highlighted line, you see that the code is parsed from basically a string, or source, into an abstract syntax tree. And at the last line, we generate our code back from the abstract syntax tree. So as you might already guess, all mutations happen in between these lines.
Okay. Let's implement a codemod for the breaking change from our design system library, and let's apply it to our Toast component. This is the part of the abstract syntax tree that represents the node with our import declaration. You see the source, the design system package, and it has three import specifiers. And we are looking specifically at the one for our PrimaryButton, which is in the middle. So let's fix it with jscodeshift transformations. First, we need to find the node that is responsible for imports from the design system package. jscodeshift provides an API for that. In this case, we use the find method. We are looking for the import declaration node type. The find method also accepts some filter parameters. For instance, we're looking for a node which has a source attribute with the value design system. Within this collection of imports, we're basically looking for the import specifiers that have the imported name PrimaryButton. Quite simple so far. And we have another method, a utility function from jscodeshift, that helps us replace the existing node with another one. So this is a mutation operation on the AST. We replace this PrimaryButton node with a new one.
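The import fix being described here could look roughly like the sketch below. The package name `"design-system"` is an assumption (the transcript only says "design system"), and `PrimaryButton` and `Button` are the component names from the talk written as identifiers. The sketch deliberately ignores aliased imports such as `PrimaryButton as PB` and does not check whether `Button` is already imported.

```javascript
// Sketch: rename the PrimaryButton import specifier to Button.
function transform(fileInfo, api) {
  const j = api.jscodeshift;
  const root = j(fileInfo.source);

  root
    // Import declarations whose source is the (assumed) package name.
    .find(j.ImportDeclaration, { source: { value: "design-system" } })
    // Within those, the specifier that imports PrimaryButton.
    .find(j.ImportSpecifier, { imported: { name: "PrimaryButton" } })
    // Swap it for a freshly built specifier that imports Button.
    .replaceWith(() => j.importSpecifier(j.identifier("Button")));

  return root.toSource();
}

module.exports = transform;
```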
And to create a new one, we use another API. We can construct a new import specifier with the identifier Button. So after this step, basically our import is fixed and our AST is mutated.
But how should we deal with the PrimaryButton JSX element within the component? You already know the concept. You know the approach. We are looking for all the JSX elements, and we are looking specifically for a PrimaryButton opening element, because it's not a self-closing element. As our component can basically use several PrimaryButton elements, we get a collection as the response from this method. And for each of these elements, we need to mutate the legacy element into a new one. In order to do that, let's create a new JSX element. We are using the same approach with the jscodeshift API: create a new JSX element. Here we provide Button as the opening element. And for our opening element, we can also add the attribute kind we were talking about before, and we provide the value primary. Don't forget to copy all existing attributes from the legacy node. In our case, it's the onClick property. And of course, the closing Button element. And as the last step, we need to provide all the children. So we can copy all the children from the legacy node. In our case, it's just the Close text.
Okay. Final step. We are familiar with this API. We replace all legacy nodes with the newly created ones. And that's it. We just implemented the codemod to update the six components. Those six components, right? Now we can apply the same codemod to the entire codebase. For instance, jscodeshift accepts not only one file path; it accepts paths to many files. And this command will basically update all the usages of PrimaryButton within one project. But how should we deal with other projects? We have more than a hundred projects. So, yes, remember, we have this automated pipeline, which I mentioned before.
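Putting the JSX steps described above together, a sketch of that part of the codemod might look like this. It is an approximation rather than the code from the slides: the attribute order, the use of `j.stringLiteral` for the value, and the handling of self-closing elements are choices made for this example.

```javascript
// Sketch: replace <PrimaryButton ...> with <Button kind="primary" ...>.
function transform(fileInfo, api) {
  const j = api.jscodeshift;
  const root = j(fileInfo.source);

  root
    // All JSX elements whose opening tag is named PrimaryButton.
    .find(j.JSXElement, { openingElement: { name: { name: "PrimaryButton" } } })
    .replaceWith((path) => {
      const legacy = path.node;
      const selfClosing = legacy.openingElement.selfClosing;

      // The new attribute: kind="primary".
      const kind = j.jsxAttribute(j.jsxIdentifier("kind"), j.stringLiteral("primary"));

      return j.jsxElement(
        j.jsxOpeningElement(
          j.jsxIdentifier("Button"),
          // Keep every attribute the legacy element had (onClick, ...).
          [kind, ...legacy.openingElement.attributes],
          selfClosing
        ),
        // A self-closing element has no closing tag.
        selfClosing ? null : j.jsxClosingElement(j.jsxIdentifier("Button")),
        // Keep the children unchanged.
        legacy.children
      );
    });

  return root.toSource();
}

module.exports = transform;
```

Run against a whole project, the command is of the form `npx jscodeshift -t transform.js src/`; depending on the file types involved, flags such as `--extensions` or `--parser` may also be needed.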
So we can integrate the codemod into this pipeline. And, yeah, with this approach we can update the entire codebase for all the frontends.
And, yeah, takeaways. Automate repetitive tasks. Use the power of the abstract syntax tree to make code changes. And go home and write your first codemod if you haven't done it yet. Thank you all.
Thank you ever so much. Please join me over here for some questions. Wow, we've had a lot of questions. Thank you ever so much to our audience members for submitting them. We'll crack right on and get through however many we can in the 12-ish minutes we have. The first question is, does source code to AST to source code transformation keep the original source code formatting?
That's a good question, because we can use different parsers, and different parsers provide different outputs. For instance, we have the Babel parser, Recast, or the TypeScript parser. And some of them preserve your original indentation or the location of every node. And, yeah, by default, yes. But we can also provide this options parameter, which is the third one. And there we can also add some extra things, like prettifying the code the way we want and so on. So, yeah, to answer this question, yes.
Excellent. Thank you. Next question. Wow, we have so many. Do you create these transformers, or are teams forced to also commit them?
Well, originally I introduced this approach of creating these transformers within our frontend guild, or chapter, I would say. And now I'm onboarding the teams to use these transformers. Transformers are the same, because it's really powerful. And there are some cases where we need to update code in many places. And, yeah, this is already used by other developers. So basically we have a CLI tool, and we have this CLI tool integrated into our pipeline. So we can just provide a transformation function, and it will apply the changes everywhere. By the way, you have this transform.js function. It can even be a URL.
So jscodeshift can download it from somewhere outside and just apply the transformation.
Interesting. And I think this might come up again in some of the other questions that have been asked. But before we get to those, how good have you found TypeScript or TSX support to be with jscodeshift?
Yeah, it's fully supported. We can use the proper parsers for that. So you can update types the same way you update other nodes. So, yeah, TypeScript also has all the node types within the AST.
Cool. Thank you. So, next one. Angular has nice tooling for this already. For example, automatically running code updates when updating libraries using ng update. Do you use similar automation?
Not yet, because our case is not really related to updating libraries. This is just an example, and this is really rare. But, for instance, jscodeshift has quite a big community. And there's a website called CodeShift community something. They built a CLI which basically provides some extra API on top of jscodeshift. And you can provide a config where you can define: okay, we are migrating the design system from version 1 to version 2, and this is the transformation function to apply. So basically you can provide transformation functions for your library releases, like ng update does. So, yeah.
Cool. There are a few that are kind of similar. I'll ask the one that's at the top of my list here, which is: how do you, or how do you suggest others, culturally embed writing codemods? Is there a concept like "who writes breaking changes must supply a codemod"?
Well, it depends, of course. But if you want to make the consumers of your library or project happy, just ship it with codemods, right? You can release breaking changes, but no one will be happy about it. But if people can just update smoothly, you will get more benefit from it.
Yeah. And that's completely fair.
I wonder, though, if there are more processes or mental models or cultural rituals that you build within your engineering teams that encourage people to build codemods specifically when they make breaking changes.
Yeah. The thing is that not every project or company needs codemods, right? Because it's quite low-level code; not low-code, but hard code. And it requires some skills. But, for instance, if you have a specific case, you can fix it with codemods. Just try it once, see the smiles on the faces of other developers, just show them my presentation, and it will be okay.
Yeah. And just a reminder for the audience: when we're done with our Q&A, we've still got some time. We can always go over to the speaker Q&A area, and you're welcome to ask Konstantin some questions directly, because there are so many that I don't think we're going to get through them all in eight minutes. The next one, I think currently the most upvoted question, is: is there any good tooling for testing codemods? Or approaches, perhaps, not necessarily just tooling.
Well, yeah. Basically, the approach is quite simple. You can just create files, like an input file, as fixtures, and you can also test the output. And basically, you can provide those fixtures to a testing framework and run the codemods. So it will take the original file, run the codemod, compare the result with the expected output, and that's your test.
Cool. Oh, because you can output new files, you can do an easy comparison. Awesome. What is your take on monorepos, or asked differently, would you rather run a codemod in one monorepo or in many non-monorepos?
It doesn't really make sense to run a codemod in a monorepo, unless you have some edge case. For instance, if you are exporting one button, as in the case I just showed, you can just use the built-in IDE tools to rename it everywhere. So it doesn't really matter. But if you, for instance, have independent code which has to be updated in many places, maybe a codemod could be a good fit.
But it's not so relevant to monorepos.
Yeah, makes sense. At what point, and I think this is a forever developer question, at what point is automation, in your opinion, even worth it? Have you had any examples where it would take more time to build automation scripts than to do the manual work of updating instances?
Well, it's kind of related to the topic of laziness, and also curiosity. We can endlessly do this job, like changing things here and there all the time. But I'm a software developer by nature, and I like to build these things. Of course, as a software development consultant, I have to sell this approach to my customers. So I built a POC, for instance, and showed how it works. And I did not spend a lot of time on it. To start with jscodeshift is just easy, really. You can try it and get the benefit in one hour. It's more about integration into the current process. But once I achieved this and brought in this CLI, it was already used multiple times, and it has already, how to say, paid for itself.
I recently had a scenario where I spent a whole day writing a script to do something, and it was a boring, menial task. I didn't want to do it, but it would have only taken me an hour or two to do it manually. How do codemods work with the as keyword to alias imports?
I'm pretty sure this structure is part of the AST. So if it's parsed by Babel, for instance, it will be supported by jscodeshift as well. So jscodeshift, just once again, is kind of just an abstraction above the AST, which provides you with some helper functions to traverse and mutate the tree. And the parser could be different. It can be TypeScript, it can be Babel, it can be another parser. So if it's supported in JavaScript, if it's supported in Babel, it will work with jscodeshift as well.
Excellent. Is it possible to parameterize the transform functions to create common reusable actions, for example, rename component.js, change props.js, and so on?
Exactly. So this is the third parameter, options, which is for that.
So you can provide extra parameters, any custom parameters, and you can get them from this object.
Awesome. This is a very flexible and extensible way to be doing tasks like this. That's kind of the vibe I'm getting. Every question is like, does it work in this context? Does it work in that context? And you're like, yeah, yeah, it does. I suppose I'll ask the inverse question, which is: in what circumstances have you found that the tools you presented today are not necessarily the most appropriate way to make these kinds of changes? Or are there any?
Well, the first one we just mentioned is the monorepo. So you have to think twice: do you need a codemod or not? Second one: you don't even have a monorepo, you have one project. There is no need for a codemod there either. So if you have some job which is not repetitive and does not require automation, then maybe a codemod is not a good choice.
A little bit of a heavy-handed approach, perhaps. As we're rounding out and we have just a few minutes left, I'm going to take a scroll through the questions and find those which have the most upvotes, so the ones that most people have asked for. Could you use a codemod to migrate from Enzyme to Testing Library?
Well, probably you could implement one, if it does not exist yet. Just check first, maybe it's already there. But probably you could, because with a codemod you can basically do a lot. For instance, you can find a codemod that migrates class components to functional components in the react-codemod library. It doesn't support advanced cases, but it does some basic migration. So if you just need to use static analysis to understand what was there and what needs to be done, you can base some algorithm on that, and you can basically implement this codemod.
Cool.
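The answer above about parameterizing transforms through the third argument can be sketched as a reusable "rename a named import" transform. The option names `from`, `to` and `pkg`, and the file name `rename-import.js`, are made up for this example; the sketch relies on jscodeshift passing extra command-line flags through to `options`.

```javascript
// rename-import.js: generic import rename driven by options, e.g.
//   npx jscodeshift -t rename-import.js src/ \
//     --from=PrimaryButton --to=Button --pkg=design-system
function transform(fileInfo, api, options) {
  const { from, to, pkg } = options;
  if (!from || !to || !pkg) {
    throw new Error("rename-import: expected --from, --to and --pkg");
  }

  const j = api.jscodeshift;
  const root = j(fileInfo.source);

  const matches = root
    .find(j.ImportDeclaration, { source: { value: pkg } })
    .find(j.ImportSpecifier, { imported: { name: from } });

  // Returning nothing tells jscodeshift to leave the file as it is.
  if (matches.size() === 0) return undefined;

  matches.replaceWith(() => j.importSpecifier(j.identifier(to)));
  return root.toSource();
}

module.exports = transform;
```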
I think we might have time for one more really quick one, which is: can you use the diff of the ASTs of two different library versions to generate transformers rather than writing them yourself?
Well, not really, I guess, because you need to reverse the approach, right? You have to make a transform function out of these two source codes. It's kind of tricky. I guess it's time for AI for that.
Yeah, absolutely. It saves us all time writing code. Perhaps any code at all, and then what do we do? Thank you so much for the amazing talk. Thank you so much for being so flexible in using up the time we have today when we asked you to fill this slot. Thank you on behalf of the whole audience. Thank you to everyone who asked questions. I know we didn't get around to all of them. You can go to the speaker Q&A section and pose those now.
28 min
02 Dec, 2022
