Introducing CI/CD to your project can be a challenging process. At GitLab, iteration is one of our key values, and in the spirit of iteration we will be happy to share how GitLab can help you gradually bring your project to CI/CD heaven.
Build your JS Pipeline in Incremental Fashion with GitLab
AI Generated Video Summary
GitLab supports the entire DevOps cycle and uses tools like ESLint, Jest, Docker, and Kubernetes. Caching and cache invalidation are major challenges in DevOps. GitLab's Auto DevOps feature simplifies Docker, Kubernetes, and Helm. Customization and advanced options are available in GitLab. GitLab's pipeline allows for optimizing job dependencies and continuous improvement. The average duration of front-end build pipelines is under 10 minutes for most people. Running a build and pipeline process in GitLab involves job calculations, runner setup, and hidden logic. GitLab can help with running front-end in Kubernetes and has a DAG visualizer. Dealing with flaky tests in the frontend is a challenge in GitLab pipelines.
1. Introduction to DevOps Cycle and Tools
I'm Ilya Klimov from GitLab, a senior frontend engineer. GitLab supports the entire DevOps cycle; here we focus on verify, package, and release. In the Verify stage, we use tools like ESLint and Jest for code quality and testing. Packaging is now standardized with Docker. Lastly, I will discuss Kubernetes for release.
Hello, everyone. My name is Ilya Klimov. I'm from GitLab, from the managed import team. I'm a senior frontend engineer, and I love fast things. I drive a Tesla car, I try to use my one-gigabit Internet connection wherever possible, and I also love GitLab for fast build times. While the first two are obviously outside the context of our conference, I will be happy to share my knowledge about the third one.
At GitLab we are trying to support you across the entire DevOps cycle, starting from Create, where you create your source code, manage issues, plan epics, and so on, and ending with protecting you from malicious activity and monitoring the health of all your production, staging, and other environments. But covering the entire DevOps cycle would take forever, so let's focus on just three things: verify, package, and release, which is basically what continuous integration is about. Continuous delivery comes right after that: delivering things, after the release, to your actual running environment.
2. Challenges with Tools in the DevOps Pipeline
Even the tools are hard. Building a good test is complex. Ensuring code runs properly across different versions of Node.js can lead to unpredictable errors. The complexity of the pipeline grows quickly as more tools are added.
I realize your pipeline, current or future, may not follow this exact shape. You may pick another way of running code, or run Docker containers on bare metal, whatever. But just for now, let's start with this one. And the problem here is that even the tools are hard. Building a good test is a very complex thing, which is probably worth another talk. Making sure that your code runs properly when your development environment and continuous integration environment have different versions of Node.js can be tricky and may lead to unpredictable errors. One day I spent half a day debugging an unknown crash, literally a segfault, caused by a Node.js version that differed by one in the patch part of the version number. I never wish to do that again. And as you see, as we add more and more tools to our pipeline, even just for these three steps, the complexity grows very quickly.
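One simple guard against this class of bug is pinning the exact same Node.js version in CI as in development. In GitLab CI that can be a tiny sketch like the one below (the image tag and commands are illustrative, not a recommendation):

```yaml
# .gitlab-ci.yml — pin the exact Node.js image so CI matches local development
image: node:20.11.1  # example tag; match whatever your team's .nvmrc says

test:
  script:
    - node --version   # shows up in the job log, so version drift is visible
    - npm ci
    - npm test
```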
3. Challenges with Caching, Invalidation, and Registries
In the DevOps world, caching and cache invalidation are two major problems. Caching node_modules and delivering artifacts, such as test results and coverage results, are important for improving pipeline performance. Registries, like a Docker registry or an npm registry, vary depending on the type of release.
But, well, you still think, hey, let's start with a very simple step, let's do some verification in our continuous integration pipeline. And here come the troubles. One day your boss comes and says, hey, your pipeline works pretty slowly. And welcome to the world of one of the two biggest problems in programming, cache invalidation, but now in a DevOps world. So you start learning all these fancy things: how to cache, for example, node_modules between your build steps so you can avoid running npm install or yarn install on each step; how to deliver artifacts, things that should be persisted across the pipeline, for example test results, coverage results, or, if you do integration testing, screenshots of failed runs; and registries. And registries can be different: a Docker registry for your Docker containers, or, if you're releasing something to npm, a private or public npm registry. Things differ.
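The caching and artifacts mentioned above can be sketched in a `.gitlab-ci.yml` fragment roughly like this (job names and paths are illustrative):

```yaml
# Cache node_modules between pipeline runs, keyed on the lockfile,
# so the cache is invalidated exactly when dependencies change.
cache:
  key:
    files:
      - yarn.lock
  paths:
    - node_modules/

test:
  script:
    - yarn install --frozen-lockfile   # cheap when the cache is warm
    - yarn test
  artifacts:
    when: always          # keep results even when tests fail
    paths:
      - coverage/         # coverage report persisted past this job
```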
4. GitLab's Auto DevOps Feature
GitLab can help you with the complexity of Docker, Kubernetes, and Helm in the first stage of the DevOps cycle. The Auto DevOps feature in GitLab CI is powerful and allows for easy setup. By importing a real-world repository and using the default Auto DevOps pipeline, you can automatically build and test your application. GitLab utilizes Heroku buildpacks to provide various features such as code quality checks, security scans, and more. The available features depend on the tier of your GitLab solution. GitLab also has an auto code quality tool that detects ESLint and runs checks specific to it. Zero configuration is the default, but additional customization is possible.
But one day, when you're just starting with the very first things and have a tiny fancy setup, the boss comes to you and says, hey, you know, automated testing is not enough. I want a quality assurance person to be able to go through the changes in a branch and check that everything goes smoothly. In GitLab, we call these review applications, a kind of manual approval. And hello, all this complexity, Docker, Kubernetes, Helm, whatever, arrives already at the first stage. So let's see how GitLab can help you with it.
So, let's start with the first step, which is Auto DevOps. Frankly speaking, the power of GitLab CI was the main reason I joined GitLab long, long ago. I was a great fan of it for maybe three years before I joined GitLab, and I'm super happy about the Auto DevOps feature, and even happier that, as a front-end engineer, I contributed to it. So, let's assume you just go and import some real-world repository. I've chosen the Vue RealWorld example application, just because my main stack is Vue.js, enable the default Auto DevOps pipeline in the settings of the new project, and the magic happens. You get this pipeline, which automatically has steps for build, code quality, ESLint, and so on, and so on.
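For reference, enabling Auto DevOps through the settings UI is essentially equivalent to pulling in GitLab's built-in template; if you prefer to opt in explicitly in code, your whole `.gitlab-ci.yml` can be as small as:

```yaml
# Opt into Auto DevOps explicitly instead of via project settings
include:
  - template: Auto-DevOps.gitlab-ci.yml
```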
5. Customization and Advanced Options
Auto DevOps allows you to get a quick taste of continuous integration. Customization is simple with environment variable configuration. You can configure build steps, disable certain steps, and enable or disable review apps. GitLab encourages contributions and learning. Use the Auto DevOps configuration and GitLab UI as inspiration. Consider the untamper-my-lockfile project for security. Explore the frontend.gitlab-ci.yml in the main GitLab repository for advanced options and speed optimization.
And Auto DevOps will let you quickly get a taste of what continuous integration means for you. So the next step is obviously customization. I don't want to go too deep into that because it would be like reading 10 pages of documentation, but it is extremely simple. As long as you know how to configure your application with environment variables, you know how to set up Auto DevOps. You can configure every build step and you can disable certain steps in the Auto DevOps pipeline. You can enable or disable review apps if you want them built for each branch. If you want review apps, you will need the Kubernetes cluster integration, by the way.
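As a sketch of that variable-driven customization (these switch names are taken from the Auto DevOps docs as I recall them; double-check the current documentation before relying on them):

```yaml
include:
  - template: Auto-DevOps.gitlab-ci.yml

variables:
  TEST_DISABLED: "1"           # skip the auto test job
  CODE_QUALITY_DISABLED: "1"   # skip code quality scanning
  REVIEW_DISABLED: "1"         # don't spin up review apps
```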
And this is pretty simple, but one day that will not be enough, and you will want to get your hands dirty and contribute something which we probably did not take into account in our Auto DevOps pipeline. Remember GitLab's motto, everyone can contribute, and we will be happy if you open a merge request when you find any issues or potential improvements. And I don't want to stop here at learning: yes, you need to learn the new YAML configuration, and you can read the docs, they are pretty awesome, trust me, but sometimes every one of us needs a source of inspiration.
I suggest you do just three things. First of all, take a look at the Auto DevOps configuration you started with. It will show you how we invoke certain things. It's all open source in your project, and it will give you a head start: if you just want to change something minor, copy, paste, tune, and you're awesome. Then look at two projects that are built entirely in GitLab. One is GitLab UI, our UI component library, which has a very tiny pipeline that just runs tests, releases to npm, and does visual regression testing, comparing screenshots. I always use its YAML file as a source of inspiration. Because, for example, have you ever thought that every time a contributor, internal or external, updates your yarn lock file or npm lock file, they could point a resolution to some malicious version, or even to entirely different third-party code? Do you really want to check this manually, with your eyes? Obviously not. And just a few days ago, when I looked at this file again while preparing for the talk, I discovered the untamper-my-lockfile project, which deals with this: it checks against the npm registries and makes sure the lock file is telling you the truth and was not altered in that specific and probably malicious way. And if you want to see something really, really insane, check our main GitLab repository and the frontend.gitlab-ci.yml we keep separately there. Well, it is really close to insanity. If you don't know, approximately 60% of GitLab CI SaaS runner time is spent building GitLab itself. So, we really do care about speed. Just think how much money we could save by making our pipelines run faster, and how much happier your developers will be when they get feedback faster. So, let's talk about the DAG. This is our GitLab pipeline. It is not even the full one; the test stage is way bigger.
6. Optimizing Job Dependencies in the Pipeline
Usually, a pipeline runs stage by stage. We split our jobs into stages: prepare, build, images, fixtures, test. Our Jest job, which runs tests, needs a frontend fixtures job. ESLint and GraphQL Verify are two other jobs. ESLint could be moved to an earlier stage, but it belongs in the test stage. The new directed acyclic graph feature allows for specifying job dependencies. The pipeline now has jobs with different dependencies on one another.
And usually, it runs one stage at a time. This is the philosophy everyone follows. We split our jobs into stages: the first stage runs first, we wait for all its jobs to complete, then the second runs, and so on and so on. Obviously, this is our flow.
Prepare, build, images, fixtures, test. Could we theoretically make it run faster? Yes, of course. Let's zoom in. If you take a look here, our Jest job, which runs the tests, needs a frontend fixtures job, which is cut off here, but trust me, the second word here is fixtures. That job generates some mock data which is consumed by our tests. So these two jobs are a hard dependency; they need each other. But let's take a look at these other jobs.
ESLint and GraphQL Verify. For ESLint, we need just the code. There is no reason to wait for anything. So maybe we could move it to an earlier stage? I don't like that, because it breaks the entire idea: ESLint is in the test stage and is required to be there. So, what to do? Here comes to the rescue a still quite new feature, the directed acyclic graph. After you've gotten your hands dirty in the previous step and understand which jobs exist and what their names are, you just specify: hey, this job, ESLint, doesn't need anything, so it can run immediately, as soon as possible. And this other job needs our frontend fixtures to be completed, so please wait for that and then start the job as soon as possible. So, part of our pipeline now looks this way. And, well, it's pretty fun to look at, and pretty fun to understand how fast we can go with this. As you can see, there are a lot of different dependencies from one job to another.
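The `needs:` keyword described above can be sketched like this (job names and commands are illustrative, not GitLab's actual configuration):

```yaml
stages: [prepare, fixtures, test]

frontend-fixtures:
  stage: fixtures
  script:
    - yarn generate:fixtures          # hypothetical fixture generator
  artifacts:
    paths:
      - tmp/fixtures/

eslint:
  stage: test
  needs: []                           # no dependencies: starts immediately
  script:
    - yarn eslint .

jest:
  stage: test
  needs: [frontend-fixtures]          # starts as soon as fixtures are ready
  script:
    - yarn jest
```

With `needs: []`, the `eslint` job still lives in the test stage, where it conceptually belongs, but no longer waits for earlier stages to finish.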
7. Optimizing Pipeline and Continuous Improvement
We utilize the DAG in GitLab UI for various tasks, such as running tests as early as possible and deploying the review job. Docker image builds trigger automatic screenshot updates and container scanning. Using a DAG in GitLab simplifies the pipeline and makes it more discoverable. Consider a fully custom pipeline or contributing to the Auto DevOps template for your company. Pipeline improvement is an ongoing process. Feel free to reach out with any questions or suggestions on Twitter. Thank you for joining us!
For example, we cannot calculate coverage before all our tests have completed, but we want to start them as early as possible. For example, we want to run tests only after we understand which tests we need to run; we put a lot of effort into not running tests for things that definitely did not change, and a lot of other approaches. If you still think, hey, I don't need this, it looks like overkill for my small project: we also utilize it in GitLab UI, and look, it's pretty simple and still shiny. For example, the review job, which allows people to check things on a separate URL, will be deployed as soon as our Storybook is built. Yeah, we could probably do a better job of laying out the labels, but hey, we are constantly improving. Similarly, as soon as a Docker image is built, we can run update screenshots, a manual job which automatically updates the screenshots, a visual check to verify that our screenshots look identical, and container scanning to make sure nothing malicious is running. This is still awesome, and it feels like a significant improvement, because it is so easy, at least for me, to understand the needs approach. Long, long ago, when I was not yet working at GitLab, I would split my jobs across many, many different stages just to impose an order that made sense to me. Look at your own configuration: if you have stages like test one, test two, test three, test four, get rid of them. With a DAG, you can make your pipeline smooth and quite discoverable, because it's a graph, and every software engineer loves a graph, I believe. If there is one thing you should give a try in GitLab, it is obviously the DAG: 10 out of 10 points, recommended.
So, what to do next? Probably you've optimized things and you're happy with the results, and there is one more thing to do, depending on what you actually want. You could go with a fully custom pipeline, which is, for example, what we're currently doing at GitLab, because sometimes we have unusual tasks and unusual requirements, and you might want to be a true DevOps engineer. But there is another option I suggest you consider. If you're running multiple projects in your company, for example, my previous company was an outsourcing one, so we created many quite similar projects, discover in our docs how you can contribute a custom Auto DevOps template for your company, assuming you're running a self-managed GitLab instance. If you do this, you could save a lot of time for other people in your company. So just remember that pipeline improvement is a constant process, not a one-time task. Take a look at your pipeline regularly and discover the things that go slowly. And if you have any questions or suggestions, always feel free to reach me on Twitter, Xanth-UA. And never stop improving your flows. Thank you. Drop the mic, Ilya. What a great talk. Thank you so much for joining us. It's really excellent. Let's welcome Ilya to answer the question that he asked all of us before the talk.
8. Average Duration of Front End Build Pipeline
What is the average duration of your front-end build pipeline? Most people have a pipeline under ten minutes, either because they have a small project or because their pipeline is super-optimized. However, there is a gradual increase in the 10-to-30-minute range.
9. Pipeline Duration and CI Adoption
Not so many people have joined the over-30-minute club for pipelines. GitLab's own pipelines range from 40 minutes to six hours, depending on luck and scale. The under-10-minutes and 10-to-30-minutes categories are both at 40%, indicating that the majority of people are doing CI.
So, well, I'm really happy that not so many people joined the over-30-minute club. Hello from GitLab: we have pipelines from 40 minutes to six hours, depending on your luck and the scale. So, yes, you know that story, like, okay, I'll probably grab a coffee while the pipeline is running; for GitLab, that would be a lot of coffee. I see that the under-10-minutes and 10-to-30-minutes buckets are now head to head, both at 40%. And there are folks that are still not even doing CI. So, I guess these numbers are good. Good enough: at least we know the large majority of folks are actually doing CI.
10. Running a Build and Pipeline Process
When a pipeline is triggered, GitLab reads the .gitlab-ci.yml file, calculates which jobs should run based on who and what triggered the pipeline, and runners pick up and execute the jobs. Even on the free tier, you can register your own runners to build on your own hardware. Behind the scenes there is a lot of hidden logic: retries, stuck-job detection, and shared caches.
Remind me, what is the timeout on running a build? I remember that used to always be an issue, where there were really long builds. I never remember how long it takes before they time out.
So, okay. Questions from our watchers. Alexa asks, what happens in the background when a pipeline is triggered, especially with GitLab pipelines? Yeah, for sure. It is quite simple and quite complicated simultaneously. Internally in GitLab you have a .gitlab-ci.yml file which may or may not specify certain requirements for the build machine. For certain steps, not common for front end, but, hey, our webpack sometimes likes to consume a lot of memory, you could have different machines with runners installed that provide different capabilities: different CPUs, different amounts of memory, whatever. So, GitLab says, hey, here's the pipeline. Depending on certain conditions, who triggered the pipeline, why it was triggered, and so on, it calculates which jobs will run in this pipeline. After that, the runners ping GitLab, like at McDonald's, saying, hey, I'm free, could you give me some job? A runner pulls the job description and does exactly what is described there. So, each runner, and it might be the same one or not, depending on how many you have, clones your repository and then follows the exact job description. The pipeline is the big thing; it consists of jobs. Also, one thing people sometimes do not realize: even if you are on the free tier on gitlab.com and you run out of our limit, I believe it's 2,000 CI minutes per month for the free tier, you can either buy minutes or just set up your own runner on any virtual or hardware PC you have, register it on gitlab.com, and you will still be able to build your project on your own hardware, or on whatever you love, Amazon, Google Cloud, Azure, while still remaining on the free tier. So you can always take control of your hardware.
Why am I saying it is hard? Because while this sounds quite easy, under the hood there is a lot of hidden logic: retries; detecting that jobs are stuck, because everyone loves infinite loops; retrying a job if a runner fails; shared caches, so that a cache created on one runner can be safely pulled from another. It is a big, big system written in Go and Ruby, and there is a lot of dark code, at least from my front-end perspective, tying this together. Got it. That's a great answer.
11. Running Front-End in Kubernetes and DAG Visualizer
Is it worth running front-end in Kubernetes? It depends on what you're optimizing for, cost or speed. Kubernetes scales well from tiny projects to huge setups like GitLab, but if your project is small, GitLab can still help you. You can optimize your pipeline based on your current stage. The DAG visualizer is available in all GitLab versions: it was introduced in 12.2 and the feature flag was removed in 12.10.
Question from Vritr. Is it worth running front-end in Kubernetes? Since there's no CDN, you have to scale the infrastructure all the time as the workloads grow, and you can also hit node IO limitations, overhead, etc. Taking into account cost plus time and the simple setup of S3 plus CloudFront, what would you do? Would you set up a front-end on Kubernetes? Oh, it's a good question, and it really depends on what you're optimizing for: either cost, I mean money, or speed. It is very hard to answer without a precise problem. What I mentioned in my talk, and I would like to insist on again, is that this setup with Kubernetes and so on is there because it scales well in terms of infrastructure, from tiny projects to huge setups like GitLab. But if you've decided to skip Kubernetes because your project is quite small, that does not mean GitLab cannot help you. You can do anything in your pipeline, and for some of my pet projects I'm just building the code, not even using Docker for running it, just copying the resulting build files to the server and running them. Quite messy, and I will never show it in my resume as a good infrastructure setup, but come on, we are talking about building infrastructure step by step, so feel free to optimize what you need at your current stage. If you have spare money, you could probably already go for the things I call the bloody enterprise, but if not, feel free to go as dirty as you want. We will not prevent you from doing that. Yeah, cost is not an issue. I see that you've started to do stand-up here as well. I'm kidding. So Louise asks, in which versions of GitLab Enterprise is the DAG visualizer enabled? It is everywhere, and, I'm just not sure, I need to quickly check in the GitLab docs whether it might be behind a feature flag. No, I believe it is already available. It was introduced in GitLab 12.2 and the feature flag was removed in GitLab 12.10. And, like, my math with time is hard.
That is quite long ago, approximately 10 months, so it should be available in any reasonably fresh GitLab.
12. Challenges with Flaky Tests in GitLab Pipelines
The hardest part with GitLab pipelines is dealing with flaky tests in the frontend. Due to the asynchronous nature of frontend development, it's easy to write tests that don't wait for certain conditions, leading to failing tests when the CI environment is not in sync. Unfortunately, there is no perfect solution for ensuring test reliability, and it can be frustrating when tests fail intermittently.
Okay, very cool. So, just one more question. Now is the time, folks. We still have Ilya here with us, so drop your questions in DevOps Talk Q&A, and don't forget that Ilya will be around in his speaker room and in the spatial chat, so you can definitely chat with him there.
But one more question. So, what would you say, Ilya, is the hardest part of GitLab pipelines, if you had to think about it? Oh, the hardest part of GitLab pipelines, if we speak about frontend, is flaky tests, because, due to the asynchronous nature of frontend, it is so easy to write a test which does not wait for something but just relies on your CI environment being slow enough, or even fast enough. And that means that you change something in the CI configuration and suddenly see a lot of failing tests, and you think, hey, probably I messed up my configuration, but it is not that. Unfortunately, I do not have a perfect solution for making sure your tests are solid, and it is not actually a pipeline question, but for me this is the darkest part of any front-end-related pipeline work in GitLab. Because, well, when a test fails every time, you know what to do. When it fails only sometimes, everyone is annoyed, and no one is happy when you go, oh, it failed, let's retry it again and pray for the best.
I hear that. Folks, now's the time. Any last questions for Ilya? We'll wait a couple of seconds; it takes folks time to type. Many folks are saying thanks. Awesome talk. Cool talk. Thank you. Very useful. So, you should be happy, Ilya. You had a really great talk, so congrats on a really excellent session. I think we're going to wrap it up. Thank you so much for being with us and sharing your insights on GitLab. Ilya will be around in the speaker room, in the spatial chat, and on Discord. Feel free to reach out and ask any more questions that didn't make it live into the session. Thank you so much.