How to Use Gamification to Improve Quality on Your Project

Rate this content
Bookmark

Building software is all about hitting the right spot between features and quality. But we rarely talk about how to measure quality. Let’s look at how a gamification system (points & leaderboard) within a GitHub action helped developers on my team care about quality!


13 min
20 Jun, 2022

Video Summary and Transcription

Welcome to my talk on using gamification to improve quality. Code quality is crucial and involves maintaining technical debt, ensuring maintainability, and prioritizing delivery and quality. To improve code quality, create a new standard, automate enforcement, and motivate teams. Resolving merge conflicts by removing warnings and automating warning decrease and error reduction can prevent future issues. Strive for zero errors by finding a balance, enhancing tools, blocking pull requests with errors, and incentivizing developers.

Available in Español

1. Introduction to Talk

Short description:

Welcome to my talk on using gamification to improve quality. I'm Jonathan Wagner, an engineering manager at Theda UK. I've always struggled with finding the right balance between fast delivery and code quality.

Hi, everyone. Welcome to my talk on using gamification to improve quality. To quote about myself, I'm Jonathan Wagner. I'm an engineering manager at Theda UK and I've been working as a tech lead for the past four years on about 10 and more projects in production. I can tell you I've always struggled with finding the right balance between pushing for fast delivery and having a good code quality on my projects. I've seen both extremes, like project with 100% code coverage and project were just going to stretch production without testing anything. So, it's always been a struggle and I want to talk to you about all of this.

2. Understanding Code Quality

Short description:

Code quality is a crucial aspect of software engineering. It involves maintaining technical debt, ensuring maintainability, and prioritizing delivery and quality. It's important to address both quick fixes and root causes, prioritizing the quick fix first and then investing time in preventing future issues.

Something classic that people say in software engineering is that there are three hard things. You have caching validation, naming things and prioritizing code quality. What do I mean by code quality? Let's dive a bit more into this. It's everything that contains technical debt, maintainability, factoring and so on. So, it's finding the right balance between delivery and quality. But it's also deciding when to do the fix for the root cause versus the quick fix. Ideally you want to do both, but maybe in the right order. So, prioritize the quick fix and then invest time in looking to the root cause and preventing the issue from happening again. But that's the first question of prioritizing.

3. Improving Code Quality

Short description:

To improve code quality, start by creating a new standard for new code, automating its enforcement using tools like ESLint. Address legacy code by motivating teams to prioritize its improvement. I'll share a story about how we approached this, starting with a project with 1500 warnings. We used gamification and CI to incentivize reducing warnings, but encountered challenges with simultaneous pull requests.

How do you improve it? It can be quite complex. I started to develop a theory on this and I'm going to try to explain this to you. So, let's start by trying to split the problem into smaller parts. So, let's say you have a code base and you want to improve on it. The first thing you can try is to look at the new code you add. And then after that, focus on the legacy code.

So, first of all, the simple part, the new code. You can start by creating a new standard, training your team with this new standard and then making it hard for people to write bad code. That's an important step, that step you can automate. And to automate it, you can use tools like ESLint. It's not the only solution, it's definitely not perfect, it doesn't catch everything, but it helps prevent bugs. And often when analyzing the root cause of a bug, you can identify that an ESLint rule could have prevented it. So it's a good occasion to add a new rule, train your team on it, make sure they know how to fix it and bit by bit do something about it. And that means a new code you write has better standards and hopefully, you improve the legacy code as well.

But this legacy code, it's hard to decide when to look at it or not. And it's even harder to motivate everyone in your team or multiple teams, if you have to look at it. So that's when it gets tricky. What do you do there? So let me tell you a little story about how I approached the problem and explain some other things I've learned along the way. So little story, we'll explain you in what state we started in, how we played with the CI and gamification and then what happened and what kind of results we had. Initially, we had a project with about 1500 warnings. So quite a lot of warnings and this number was decreasing very slowly. So every time developers were adding features, it was known that they shouldn't be adding new warnings and Bluecross would be blocked by the tech lead or the developers in the number increase. That means there was a place in the code where you can say, okay, this is the number of max warnings. And if it changes, it has to go down, it cannot go up. But in some cases, we broke the deployment pipeline. That would happen when two amazing devs would want to decrease the number kind of at the same time with a different pull request. So let's say the first developer fixes two warnings. The max warning is now 1498. And the other developer fixes three different warnings. That means he gets a max warning down to 1497.

4. Resolving Merge Conflicts

Short description:

We encountered an issue where merging code with a different number of warnings caused unexpected breaks. To avoid this, we decided to remove warnings and only allow errors. By modifying the YesLimConfig and using the Klinter tool, we successfully eliminated all errors and prevented future merge conflicts.

First one merges. Everything is green, all good. Second one merges. Didn't replace beforehand. Everything was green without any merge conflicts. And boom. It breaks. Why did it break? It's because we now have minus five warnings instead of the expected minus three or minus two that we had before. And that means everything is broken. Someone has to fix. People are not sure why it's broken. You might not have the alerting. Might take forever to fix.

So we want to avoid that at all cost. And one way to do this is to basically remove warnings. Let's say we don't want warnings anymore. We just want errors. That's one way to look at it. That's what we tried. So basically I went to the YesLimConfig. We pressed all the warnings by errors and overwrote the one that was defaulted by the plugins we had. And we went from 1500 warnings to zero. But then we had the same amount of errors and that meant RCA was broken.

But thank god we had a little tool that already existed which is a LimConfig generator called Klinter. It's open source. You can use it as well. And it helps you automatically add YesLim disabled comments everywhere you have an error. And that means we don't have any more errors. The CR was clean again and we never had any more merge conflict like this one. So first step, quite simple, straight forward, fixes everything.

5. Automating Warning Decrease and Error Reduction

Short description:

I automated the process of decreasing warnings by creating a GitHub action that posts a comment on pull requests, providing points earned and rankings. It takes less than 10 seconds and does the job well. After some initial bugs, the system ran smoothly for the next three months. We started with around 1600 errors and, despite a few bad weeks, managed to decrease the number of errors by 235 in three months. At a rate of about 78 errors per month, it would take us 4 years to reach zero errors.

But then we have the problem of decreasing our warnings. So that's when I started thinking about, OK, let's try to automate this. Let's try to put kind of a little incentive in place. Let's try to make it like a game with a leaderboard. So I went on it, spent a little quick encoding and came up with this.

So it's a GitHub action. Could be adapted to SQL CI, GitLab or any other CI tool you use. It's quite simple. It's basically posting a comment on your pull request and it's telling you how many points you've earned in the pull request. How many points you've earned since the beginning of the week and your current rank for this week. And you can see the podium, the full leaderboard and the explanation about how to earn points. Basically what happens is, it's taking the git diff on the pull request and then counting how many lines you added contained a nested disavowal and how many lines you removed that contained a nested disavowal. Then based on this, we get the score. We compute the score for everyone as well and print the leaderboard. No need to store anything anywhere, it's just computed there every time you open a pull request. It takes less than 10 seconds and does the job just well.

So I put that in place, I was super proud of myself. We released it, first week, so many bugs. Lots of pull requests had zero points when people should have had points. People complained, people were unhappy, I worked a bit on it. And after that, it was a smooth ride for the next three months.

So here's the data for the three months on the project. So we started out around 1600 and then you can see a few new recipient rules added. And overall, it looks like it has increased quite a bit. But let's zoom in a bit more and see what happens. So first, zooming in on the 3000s and then adding some baselines, we can actually see that apart from a couple of weeks in April that were a bit bad at the beginning and at the end, we had some good times. And there's a bit more computation on that. If we forget about every rule we've added, we actually decreased our number of errors by 235 in about three months. So that's a rate of about 78 a month. And assuming we started, we now start at 3500, it would take us 4 years to get down to zero.

6. Striving for Zero Errors and Ensuring Code Quality

Short description:

We can aim for zero errors in code, but when dealing with legacy code, fixing everything may introduce new bugs. It's important to find a balance and prioritize improvements. To ensure code quality, we can enhance the tool, block pull requests with errors, show potential points, and track the weekly decrease in errors. Exploring alternatives like tests and incentivizing developers can also be beneficial. Ultimately, the goal is to make coding fun and encourage contributions to improve quality.

We can do a bit more math and be like, okay, 78 per month. If we have a team of say 35 developers, that's about each dev fixing one error every two weeks. It's definitely not a lot. You can probably expect them to be fixing like two bugs in a week. That means divide by four that number, that means in one year we could get down to zero. So is it good? Is it bad? What do you guys think? That's the kind of question that we were asking ourselves and that I can explain what I've learned.

First of all, should we aim for no errors? Is this something we should be... Should we fix everything? Should we go down to zero? My opinion is that when you touch legacy code, you might introduce new bugs because the code is working. You might not be well tested and by touching it, you just increase the chances of introducing new regressions. So aiming for no errors means probably inserting new bugs in your code base. It's like the complete opposite of what we want to do in the first case. So maybe that's not actually the goal. Maybe we'll just be like normal to have a slope that is quite good at the beginning and then after a while it stabilizes and it's normal because the new code is up to standard and you don't need to fix anything more.

Something else that you can think of then is how do you make sure that you get into the situation as best as possible and how do you detect when you get there. So maybe you can improve the tool. Maybe we can have new features like making sure we block the pull request to prevent people from adding new errors. It could be showing potential points by looking at the files that have been touched. So if we look at each file and how many warnings they have then we can say okay had you fixed everything in there you would have won 20 points instead of just three. Then we can also show the total weekly diff so that every week you can actually make sure that the number is decreasing and not staying average like we had in April. And maybe there are other alternatives. Maybe we shouldn't be just looking at easelint. Maybe we could be looking at tests and make sure that when we test files we earn points as well. Maybe we could also incentivize people a bit more like people are earning a prize when they get first place or you could also like put the leaderboard on the tv for everyone to see it at all time and make sure that it's present in everyone's mind that it's a priority. But then that depends on, is it actually a priority? But then most importantly, was it fun to code? Definitely. I had so much fun coding this and i really hope that you guys start trying it, contributing and yeah setting it up on your project playing with it, opening pull requests and improving quality everywhere. So that's it.

Check out more articles and videos

We constantly think of articles and videos that might spark Git people interest / skill us up or help building a stellar career

TechLead Conference 2023TechLead Conference 2023
35 min
A Framework for Managing Technical Debt
Let’s face it: technical debt is inevitable and rewriting your code every 6 months is not an option. Refactoring is a complex topic that doesn't have a one-size-fits-all solution. Frontend applications are particularly sensitive because of frequent requirements and user flows changes. New abstractions, updated patterns and cleaning up those old functions - it all sounds great on paper, but it often fails in practice: todos accumulate, tickets end up rotting in the backlog and legacy code crops up in every corner of your codebase. So a process of continuous refactoring is the only weapon you have against tech debt.In the past three years, I’ve been exploring different strategies and processes for refactoring code. In this talk I will describe the key components of a framework for tackling refactoring and I will share some of the learnings accumulated along the way. Hopefully, this will help you in your quest of improving the code quality of your codebases.

React Day Berlin 2022React Day Berlin 2022
29 min
Fighting Technical Debt With Continuous Refactoring
Top Content
Let’s face it: technical debt is inevitable and rewriting your code every 6 months is not an option. Refactoring is a complex topic that doesn't have a one-size-fits-all solution. Frontend applications are particularly sensitive because of frequent requirements and user flows changes. New abstractions, updated patterns and cleaning up those old functions - it all sounds great on paper, but it often fails in practice: todos accumulate, tickets end up rotting in the backlog and legacy code crops up in every corner of your codebase. So a process of continuous refactoring is the only weapon you have against tech debt. In the past three years, I’ve been exploring different strategies and processes for refactoring code. In this talk I will describe the key components of a framework for tackling refactoring and I will share some of the learnings accumulated along the way. Hopefully, this will help you in your quest of improving the code quality of your codebases.
React Summit 2023React Summit 2023
26 min
Principles for Scaling Frontend Application Development
After spending over a decade at Google, and now as the CTO of Vercel, Malte Ubl is no stranger to being responsible for a team’s software infrastructure. However, being in charge of defining how people write software, and in turn, building the infrastructure that they’re using to write said software, presents significant challenges. This presentation by Malte Ubl will uncover the guiding principles to leading a large software infrastructure.
React Day Berlin 2022React Day Berlin 2022
25 min
Building High-Performing Cross-Cultural Teams
Everything we do, from the way in which we write our emails, to the method in which we provide negative feedback and evaluate performance, governs the performance of our teams. And understanding how culture impacts our efficacy as a team can drastically improve our day-to-day collaboration. In this session you'll learn: How different cultures communicate, How different cultures evaluate performance and give constructive criticism, How different cultures make decisions, How different cultures trust, How different cultures perceive time.
React Summit 2022React Summit 2022
21 min
Scale Your React App without Micro-frontends
As your team grows and becomes multiple teams, the size of your codebase follows. You get to 100k lines of code and your build time dangerously approaches the 10min mark 😱 But that’s not all, your static CI checks (linting, type coverage, dead code) and tests are also taking longer and longer...How do you keep your teams moving fast and shipping features to users regularly if your PRs take forever to be tested and deployed?After exploring a few options we decided to go down the Nx route. Let’s look at how to migrate a large codebase to Nx and take advantage of its incremental builds!
Node Congress 2023Node Congress 2023
30 min
Next Generation Code Architecture for Building Maintainable Node Applications
In today's fast-paced software development landscape, it's essential to have tools that allow us to build, test, and deploy our applications quickly and efficiently. Being able to ship features fast implies having a healthy and maintainable codebase, which can be tricky and daunting, especially in the long-run.In this talk, we'll explore strategies for building maintainable Node backends by leveraging tooling that Nx provides. This includes how to modularize a codebase, using code generators for consistency, establish code boundaries, and how to keep CI fast as your codebase grows.

Workshops on related topic

DevOps.js Conf 2022DevOps.js Conf 2022
76 min
Bring Code Quality and Security to your CI/CD pipeline
WorkshopFree
In this workshop we will go through all the aspects and stages when integrating your project into Code Quality and Security Ecosystem. We will take a simple web-application as a starting point and create a CI pipeline triggering code quality monitoring for it. We will do a full development cycle starting from coding in the IDE and opening a Pull Request and I will show you how you can control the quality at those stages. At the end of the workshop you will be ready to enable such integration for your own projects.