Performance testing is an expertise that takes a long time to develop. In order to measure your server performance, you need a tool that can efficiently simulate a lot of scenarios and give you good measurements according to your analysis criteria.
The Autocannon NPM library gave me exactly that. The library is super easy to install and has a very simple API to work with. Within a really short amount of time you can start doing performance testing on your application, get good measurements in your development environment and in your performance labs, and generate complicated testing scenarios.
In this talk I will introduce Autocannon, explain how to efficiently analyse your server performance with it, and show how it helped me to understand complicated performance issues in my Node.js servers. At the end of this lecture, developers will be able to integrate a fast and easy tool in order to measure their server performance.
Hi everyone, I'm very happy that you have come to my session about performance testing with AutoCannon. But first before we will really go into some technical stuff, I would like to introduce myself.
So I am Tamar. I've been writing code for a lot of years, and it's my passion to write code. In addition to that, I have managed large development groups and I have worked as an architect in several places. Currently I lead the backend development in a startup called XM Cyber. It's a really cool startup.
The mystery of performance testing
[02:24] Okay. So now let's go into the technical part of the lecture. And I would like to talk a little bit about the mystery of performance testing. Why do I call it a mystery?
Let's say that the first time that I did performance testing, I felt like I was climbing a mountain. Well, it was very, very hard and confusing. So why was it very hard and confusing? Because I had so many questions, because everybody was talking about a lot of terminology that I did not understand. So which terminology do I mean?
[03:10] Well, when you're doing performance testing, you're talking about a lot of terms and a lot of measurements that you're not familiar with. And at least for me, at the beginning they left me a little bit confused. So first of all, the throughput, the throughput of the server. So how do you measure this throughput of the server? What does that mean? I can simulate a lot of scenarios in a lot of ways. So what is the best way to measure the throughput of the server?
In addition to that, concurrent users. Well, concurrent users, how would that affect my scale? What are concurrent users? What does that mean? How is it measured? How do you simulate that? What is the difference between that and HTTP pipelining?
[04:15] Another thing that is very common when you're talking about performance testing and talking about benchmarking is the 99 percentile. What is the 99 percentile? Why is it very important? Because sometimes when I measure and when people measure, we're looking at the 99 percentile much more than we're looking at the average. So why is the 99 percentile so important?
And the last thing is the average response time or the response time. So the response time, how do you measure it? Whether you have to look at the average or the 99 percentile, there's also the standard deviation of the benchmark that needs to be taken into account.
[05:09] So all of those, when I first encountered them, left me very confused. And I had to understand exactly what I'm doing in order to understand how to simulate load on my server, so that the test means something and really improves my performance.
Explanation of the above terms
[05:29] So let's explain a little bit about all those terms, just at a high level, to give you some order in all of this.
So first of all, of course, the main goal for performance testing is to understand how much load our server can handle. Well, usually you are working with one Docker container, in my opinion, in performance testing. And then you're simulating HTTP requests to that one Docker container in order to understand what throughput this one container can handle. And if this one container can handle 100 concurrent requests, when you duplicate it and you create another instance of it, you create another replica, then you're able to handle 200 requests, et cetera. If you create three replicas, then 300 requests... But it's really important to understand how much load one Docker container actually can handle.
So the important questions that we need to ask: what is the 99 percentile of our response time? And what is the throughput? How many concurrent requests can we handle on average? Those are very important questions, and why are those questions important?
[07:04] First of all, the 99 percentile of the response time. In performance testing, it's really important to look at the 99 percentile. And the question is, why is it so important to look at the 99 percentile?
So imagine that you have a commitment for a third party, meaning that somebody's using your system and you're telling them, "Listen, my requests are all always faster than, let's say, three seconds."
[07:42] If you go by the average, then it's not data that you can rely on. And why is it not something that you can rely on? Because you have standard deviation. Usually most of your instances are not around the average. You can have instances that are far from the average, and in that case it is better to look at the 99 percentile, because if it's three seconds, it means that 99% of my requests were faster than three seconds. And only 1% was slower than three seconds.
So yeah, this is why. And then you are very sure to give the commitment. You feel confident in that commitment to a third party to say, "Hey, yeah, this is something that I can rely on. My requests are faster than three seconds because my 99 percentile is three seconds." So this is why this is important.
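To make the difference between the average and the 99 percentile concrete, here is a minimal sketch in plain JavaScript. The sample data is hypothetical (98 fast requests and 2 slow ones), and the nearest-rank percentile used here is one common definition, not necessarily the exact method a benchmarking tool applies internally.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

function mean(samples) {
  return samples.reduce((sum, x) => sum + x, 0) / samples.length;
}

// Hypothetical response times in milliseconds: 98 requests at 200 ms
// and 2 requests at 3000 ms.
const responseTimesMs = [...Array(98).fill(200), 3000, 3000];

console.log(mean(responseTimesMs));           // 256
console.log(percentile(responseTimesMs, 99)); // 3000
```

The average of 256 ms says nothing about the requests that took three seconds, while the 99 percentile exposes them, which is why it is the safer number to base a commitment on.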
[09:00] Another thing, and I think it is the most valuable measurement in order to understand the throughput of your server, is the average concurrent requests. What is the amount of concurrent requests that can run simultaneously? Here we're looking at the average and not at the 99 percentile, because in some cases or most of the cases, the 99 percentile represents a peak. Because when you have a peak, then your server can have more. And the maximum concurrent requests is like your throughput during the peak, but that is also a really important measurement.
What is AutoCannon?
[09:45] All right. So after speaking about all of that, let's speak about AutoCannon and how AutoCannon gets into the picture. So you need to have a tool that can simulate requests. I'm talking about sending a lot of requests simultaneously. So you need something that will help you send those requests simultaneously. You need to control the amount of concurrent users, and you would like to control the run time: you want the tool to run for 15 minutes or 30 minutes, for a period of time. And that is how AutoCannon gets into the picture, as a really good tool to simulate load, to do performance testing and benchmarking.
So what is AutoCannon? So AutoCannon is a tool for performance testing and a tool for benchmarking. It is written in Node.js, which is really cool, it's written in the language. It supports HTTP pipelining, it supports HTTPS, and it supports concurrent connections. But yeah, I'm talking about HTTP pipelining and I'm talking about concurrent connections and you know what? Let's talk about what HTTP pipelining is, what concurrent connections are, and what the difference between them is.
[11:16] Well, the diagram is on the left side of this screen. HTTP pipelining means that I'm sending three requests and I don't have to wait for the first one to return and then send the second one, but I'm sending them simultaneously and I'm sending them without waiting for the response. So here on the left side of the picture, I'm sending three requests without waiting for a response.
Then on the right side, we have concurrent connections. Well, what does it mean? It means that we have a socket open from the client to the server, and you have requests on that socket. That is a good simulation of users, for example, approaching your website. Because if you have like a thousand users approaching your website, you map them to concurrent connections. So that is what I can say.
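The two ideas can be written down as two sets of autocannon options. This is only a sketch: `url`, `connections`, `pipelining` and `duration` are autocannon option names, while the URL, the numbers, and the `maxInFlight` helper are illustrative assumptions and not taken from the slides.

```javascript
// HTTP pipelining: one socket, up to three requests sent on it without
// waiting for the previous response.
const pipelined = {
  url: 'http://localhost:3000', // assumed address of the server under test
  connections: 1,
  pipelining: 3,
  duration: 10, // seconds
};

// Concurrent connections: a thousand sockets, each sending one request
// at a time, roughly modelling a thousand users.
const concurrent = {
  url: 'http://localhost:3000',
  connections: 1000,
  pipelining: 1,
  duration: 10,
};

// Hypothetical helper: upper bound on requests that can be in flight at once.
function maxInFlight(options) {
  return options.connections * options.pipelining;
}

console.log(maxInFlight(pipelined));  // 3
console.log(maxInFlight(concurrent)); // 1000
```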
How do you install the AutoCannon itself?
But now, after we spoke about it, I would like to show you a cool demo about testing server with AutoCannon. And then we will show the improved version of the server and try to compare some results. All right, then.
[13:42] Cool. So let's look at this nice server code. Okay. So this is the server code that we have, and I hope that you're all familiar with Express. Express is an extremely popular library for hosting and publishing HTTP APIs for your servers. The syntax is really clear here. You're requiring Express. You're exposing it on a specific port. In my case, 3000. Here, you're exposing a route. This is called an Express route. I'm exposing one simple route of GET with a slash. It would be an HTTP GET.
What is this route doing? Well, you have a hash function here. This is hashing a password. I gave it a password, which is a constant here, a random password, as you can see. But well, when you enter here, you have this function. It's running a cryptographic algorithm. It's generating a hash of the string that I have passed to it, using a salt, a random salt.
[15:15] This algorithm is what's called CPU intensive, but worse than that, it's synchronous. If you look here, I'm using the synchronous API of Node.js. There is no callback, no promises or anything like that here, which means that it would be executed inside the event loop itself. And it would cause a freeze in my code.
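The server code from the demo is not part of the transcript, so the following is only a sketch of what such a blocking route could look like. Two things are assumptions: it uses `crypto.pbkdf2Sync` (the talk does not name the hash function or its parameters), and it uses Node's built-in `http` module instead of Express so that it runs without extra packages.

```javascript
const crypto = require('crypto');
const http = require('http');

// Placeholder for the constant password used in the demo.
const PASSWORD = 'random-password';

// CPU-intensive and synchronous: the event loop is blocked until the
// hash is ready, so no other request is served in the meantime.
function hashPasswordSync(password) {
  const salt = crypto.randomBytes(16).toString('hex');
  // Iteration count, key length and digest are illustrative values.
  return crypto.pbkdf2Sync(password, salt, 100000, 64, 'sha512').toString('hex');
}

// One GET / route that hashes the constant password on every request.
function createBlockingServer() {
  return http.createServer((req, res) => {
    if (req.method === 'GET' && req.url === '/') {
      res.end(hashPasswordSync(PASSWORD));
    } else {
      res.statusCode = 404;
      res.end();
    }
  });
}
```

Calling `createBlockingServer().listen(3000)` would start it on the port used in the demo.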
So let's try to test this server a little bit with AutoCannon and look at the results. All right, then. So this is a command line, and I think the instance of the server is here. Let's take it down and let's run it again. Here it is up and running. The server is up on 3000, and then let's run AutoCannon. Which parameters do I give it? c is the number of concurrent connections, d, as you can see, is the duration, and w is the workers. So right now I don't want to give it workers. I would like it to work with -c and -d. Let's hit enter. Now let's count to ten. One, two, three, four, five, six, seven, eight. Ooh, cool. We have the results. Okay. Let's see what we have here.
[16:37] So, first of all, let's look at the latency. Here's the latency, meaning the response time. So the 99 percentile is 1.5 seconds. Meaning that 99% of my requests were faster than 1.5 seconds. And 1% was slower than 1.5 seconds. As you can see, the average was close to 800 milliseconds and the standard deviation is 118 milliseconds. The maximum is around 1.5 seconds. So this is the data that we have here. Let's remember it. 1.5 is the 99 percentile.
Then we're looking at how many requests per second. So we can see that on average we were handling 12 requests per second, but at most we were handling 13 requests per second. We never handled less than 10 requests per second. So as we can see, our server right now can handle 10 requests per second.
[17:59] Now, after doing that, let's stop this server and let's see the other server that I wanted to show you. All right. Sorry, let's go to VS Code, I didn't want to go back to my presentation. But yeah, you can see here this is the second server. This is also implemented as a simple Express server. Very simple, and it exposes the same API, but here we are working with the asynchronous API, as you can see, it's right here. So how do you know that it's asynchronous? Here there is a callback that is being passed and resolving it, meaning that this is not a synchronous operation anymore, but asynchronous, and it is transferred to the event loop. I'm sorry. It's transferred to the worker pool, and is not blocking my event loop. So this is what we can say about that operation. So now after we've seen that, let's go to the command line.
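As with the first server, the demo code itself is not in the transcript. A sketch of the asynchronous variant, again assuming PBKDF2 as the hash and the built-in `http` module in place of Express, might look like this. The callback-based `crypto.pbkdf2` runs the hashing on Node's worker pool, so the event loop stays free to accept other requests.

```javascript
const crypto = require('crypto');
const http = require('http');

// Placeholder for the constant password used in the demo.
const PASSWORD = 'random-password';

// Asynchronous hashing: the work runs on the worker pool and the
// callback resolves the promise when the hash is ready.
function hashPasswordAsync(password) {
  return new Promise((resolve, reject) => {
    const salt = crypto.randomBytes(16).toString('hex');
    // Iteration count, key length and digest are illustrative values.
    crypto.pbkdf2(password, salt, 100000, 64, 'sha512', (err, key) => {
      if (err) reject(err);
      else resolve(key.toString('hex'));
    });
  });
}

// Same GET / route as before, but it no longer blocks the event loop.
function createNonBlockingServer() {
  return http.createServer((req, res) => {
    if (req.method !== 'GET' || req.url !== '/') {
      res.statusCode = 404;
      res.end();
      return;
    }
    hashPasswordAsync(PASSWORD)
      .then((hash) => res.end(hash))
      .catch(() => {
        res.statusCode = 500;
        res.end();
      });
  });
}
```

Calling `createNonBlockingServer().listen(3000)` would start it on the same port, so the same AutoCannon command can be pointed at either version.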
And well, first of all, find this server that we had. Sorry about that. And as we said, this is the server. Now I'm running the asynchronous version of the server. That should be more efficient and let's look at the results now. Okay. We're running it for 10 seconds, remember? One, two, three, four, five, six, seven, eight. All right, let's see what's going on here.
[19:41] First of all, the 99 percentile is 1.4 seconds, which is better than what we had by almost 100 milliseconds. So it's better. The next thing is the average, which is close to 700 milliseconds. And if you remember, the previous average was around 800 milliseconds, so this is good. And now let's look at how many requests we can handle. So the average, if you remember, was around 12 requests per second. So the average went up to 14 requests per second, and the 99 percentile is 20 requests per second. Meaning that at peak we can handle 20 requests simultaneously, which is in contrast to the last run where the 97 percentile was 14 requests per second.
So yeah, we can see that all of the measurements have improved, which is good, but this is like a standard run of AutoCannon. And this is how I can see the results and I can analyze them. And this is a process of, well here, I had a server and I knew what the problem was in advance, but you can do this process and change your server and then rerun AutoCannon, and look at those measurements, like basic measurements, and see whether there are improvements.
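The compare-and-rerun process described here can be sketched as a small helper. The objects below hold the rounded figures quoted in the demo and follow the shape of autocannon's result object (`latency.p99`, `latency.average`, `requests.average`); the `compareRuns` function itself is a hypothetical helper, not part of autocannon.

```javascript
// Rounded figures quoted in the demo: latency in milliseconds,
// throughput in requests per second.
const syncRun = { latency: { p99: 1500, average: 800 }, requests: { average: 12 } };
const asyncRun = { latency: { p99: 1400, average: 700 }, requests: { average: 14 } };

// Negative latency deltas and positive throughput deltas are improvements.
function compareRuns(before, after) {
  return {
    p99LatencyDeltaMs: after.latency.p99 - before.latency.p99,
    averageLatencyDeltaMs: after.latency.average - before.latency.average,
    throughputDeltaRps: after.requests.average - before.requests.average,
  };
}

console.log(compareRuns(syncRun, asyncRun));
// { p99LatencyDeltaMs: -100, averageLatencyDeltaMs: -100, throughputDeltaRps: 2 }
```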
[21:18] All right, cool. So now let's go back to our presentation, to my presentation, and let's continue with AutoCannon.
One thing that I would like to say about AutoCannon, which is pretty cool. Well, AutoCannon actually uses worker threads. I hope that you're familiar with worker threads. It's a cool module that is implemented in... It started in Node 10. It became non-experimental in Node 12, and it enables us to run several event loops in parallel, and this module is used here in AutoCannon, which is really cool. So if you want to work with several workers, you're able to do that with a flag of -w.
First of all, the basic example. So my main goal is to write testing tools, cool testing tools that can help me test my application. So here is a basic example. We're getting an instance of AutoCannon, and then look at what we're doing. We're just starting the run, and at the end of the run we're printing the result. Here, I have 10 concurrent connections. I don't do HTTP pipelining, meaning that it sends a request and waits for a response before it sends another request. And the duration here is 10 seconds. And this is how it can simulate load.
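The slide itself is not in the transcript, so here is a hedged reconstruction of what such a basic run could look like with autocannon's callback API. The URL and the `formatSummary` helper are assumptions; `connections`, `pipelining` and `duration` are autocannon options, and `latency.p99` and `requests.average` are fields of its result object.

```javascript
const options = {
  url: 'http://localhost:3000', // assumed address of the server under test
  connections: 10, // 10 concurrent connections
  pipelining: 1,   // no pipelining: wait for each response before the next request
  duration: 10,    // seconds
};

// Hypothetical helper that turns autocannon's result into a one-line summary.
function formatSummary(result) {
  return `p99 latency: ${result.latency.p99} ms, average throughput: ${result.requests.average} req/s`;
}

// Starts the run and prints the summary when it finishes.
function startRun() {
  // Required here so the helpers above can be loaded without the package installed.
  const autocannon = require('autocannon');
  return autocannon(options, (err, result) => {
    if (err) {
      console.error(err);
      return;
    }
    console.log(formatSummary(result));
  });
}
```

With `npm install autocannon` done and a server listening on the assumed URL, calling `startRun()` performs the benchmark.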
[23:08] Another example, with async/await. Well, most of us are working with async/await in modern Node.js code. Here, with async/await, you're creating an instance and you're waiting for it to finish the run. And then you print the results.
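A sketch of the async/await variant, under the same assumptions as before: when no callback is passed, autocannon returns a promise that resolves with the result. The `run` parameter is a hypothetical addition that defaults to autocannon and exists only so the function can be exercised with a stub.

```javascript
// Runs a 10 second benchmark with 10 connections and prints two key numbers.
// `run` defaults to autocannon; it is loaded only when no stub is passed in.
async function benchmark(url, run = require('autocannon')) {
  const result = await run({ url, connections: 10, duration: 10 });
  console.log(`p99 latency: ${result.latency.p99} ms`);
  console.log(`average throughput: ${result.requests.average} req/s`);
  return result;
}
```

Against a real server this would be called as `await benchmark('http://localhost:3000')`, where the URL is again an assumption.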
Another nice thing that you can do in your code with AutoCannon is that it has an API of client events, meaning that you can receive an event and do something with it. For example, every time that you receive a specific response, you can handle it in whichever way you want, which can be useful. So this is another API that is really nice to explore.
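As one possible use of the events API, the sketch below tallies responses by status code. It relies on the `response` event that an autocannon instance emits with the client, status code, response bytes and response time; the counter helper, the URL and the run settings are illustrative assumptions.

```javascript
// Hypothetical helper: counts responses per HTTP status code.
function createStatusCounter() {
  const counts = {};
  return {
    counts,
    // Matches the arguments of autocannon's instance-level 'response' event.
    onResponse(client, statusCode, resBytes, responseTime) {
      counts[statusCode] = (counts[statusCode] || 0) + 1;
    },
  };
}

// Runs a benchmark and prints the status code tally when it is done.
function runWithEvents(url) {
  // Required here so the counter can be used without the package installed.
  const autocannon = require('autocannon');
  const counter = createStatusCounter();
  const instance = autocannon({ url, connections: 10, duration: 10 }, () => {
    console.log(counter.counts);
  });
  instance.on('response', counter.onResponse);
  return instance;
}
```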
[24:59] And the last thing, well, usually when you want to do performance testing, you don't want to send the same request all the time. You would like to have a variety of requests. And here is where the requests feature of AutoCannon comes into action. So here, you can see, in the example we have two requests. The first one is a POST request. And what we're doing here is we're posting a product and then we're getting, in the response, the ID that the server has generated for us. And then for that product, we are posting a catalog. So in the first request, this is the flow, we're posting an ID. I'm sorry, we're posting a product. The server is generating an ID. And then in the second request, we can take the ID and post more data. So this is a flow that can give you variety, with multiple requests, and this is closer to simulating real life scenarios.
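The flow described here matches autocannon's `requests` option, where `onResponse` can store data in a shared context and `setupRequest` can use it to build the next request. The sketch below assumes hypothetical `/product` and `/catalog` endpoints, a hypothetical product body, and a server that answers the first POST with status 200 and a JSON body containing an `id`.

```javascript
const scenario = {
  url: 'http://localhost:3000', // assumed address of the server under test
  connections: 10,
  duration: 10,
  requests: [
    {
      // First request: create a product and remember what the server returned.
      method: 'POST',
      path: '/product', // hypothetical endpoint
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify({ name: 'example product' }),
      onResponse: (status, body, context) => {
        if (status === 200) {
          context.product = JSON.parse(body);
        }
      },
    },
    {
      // Second request: post catalog data for the product created above.
      method: 'POST',
      path: '/catalog', // hypothetical endpoint
      headers: { 'content-type': 'application/json' },
      setupRequest: (req, context) => {
        // If the first request failed, send the request unchanged.
        if (!context.product) return req;
        return {
          ...req,
          path: `/catalog/${context.product.id}`,
          body: JSON.stringify({ productId: context.product.id }),
        };
      },
    },
  ],
};
```

Passing `scenario` to autocannon in place of the basic options would alternate between the two requests on every connection.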
Tips for testing production scenarios
[25:19] All right, so we are really close to the end, so just really small tips for testing production scenarios.
First of all, a thing that at the beginning I wasn't aware of, and when I became aware of it, it improved my performance testing very much. You have to make sure that the data that you're testing on is similar to what you have in production, meaning that if a collection is of a specific size, make sure that the mock data that you're testing with is also of the same size, and that the fields in production are identical to the fields that appear in your database. And yeah, look at your production flows and try to mock them as much as you can.
That was it. I hope that my lecture gave you some knowledge about performance testing, and I hope that AutoCannon, you would explore it a little bit in order to try to write your own performance testing tests. So that was me, that's my Twitter handle. And thank you very much for listening.
[26:58] Tamar Twena-Stern: All right. So I see that a lot of people have done performance testing, even like most…
[27:08] Stefania Ioana Chiorean: Yeah. I think we are on a majority slightly close, but still we are on a good path. They do it. Would you expect these results?
[27:18] Tamar Twena-Stern: Actually no. I thought that most of the people don't do it, but good to know that most of the people are getting professional in this area. It's a hard one.
[27:38] Stefania Ioana Chiorean: Oh, definitely. And it's very important, I think. Especially with so many devices and we have so many issues with the performance all the time.
[27:50] Tamar Twena-Stern: Yeah. Also there's a lot of types of performance testing. There is stress testing, load testing, peak testing.
[28:00] Stefania Ioana Chiorean: Yeah. I think that's a good point that you mention though. Because one of the questions I had was: "How many of these types do you think we can cover with a good amount of work? Not to sacrifice and not publish anything anymore because it's not properly tested. Or which will be the most important of them to be tested for sure, from the types that you mentioned?"
[28:27] Tamar Twena-Stern: All of them should be tested. And by the way, in AutoCannon you can now write scripts that can help you do it very efficiently. For example, it's very important to do performance testing, stress testing, to understand how much load your server can handle. And also it's really important to understand, if you have a specific amount of traffic and then suddenly you have a peak, it's very, very important to make sure that your server won't collapse. Like in peaks, there are a lot of other issues like auto scale. Like you have to see that you bring up other instances of your server quickly and that your users are getting responses fast. So I think if you're talking about prioritization, of course stress and load testing, in my opinion at least, are the tasks that you should start with and then you should go to peak testing.
[29:50] Stefania Ioana Chiorean: Yeah. I agree, agree. Now. Yeah. We need a bit of prioritization. Because sometimes we just don't have the resources. That's why. Okay. So one was asking "How does AutoCannon compare to Gatling regarding injection profiles"…I'm sorry?
Tamar Twena-Stern: Can you please repeat how AutoCannon compares to...
Stefania Ioana Chiorean: Gatling.
Tamar Twena-Stern: Okay.
Stefania Ioana Chiorean: Gatling. And more regarding injection profiles, scenarios or feeders.
[30:29] Tamar Twena-Stern: Well, I have to say that usually I don't work with Gatling. Actually the good thing with AutoCannon is that as a developer, I can write whatever I want, regarding scenarios and all the things that I... Well, the person who asked this question asks about how to build testing scenarios. So I'm just really comfortable doing it in code. And so I actually like the flexibility of AutoCannon more. The thing that I … any scenario that I want. I prefer it to Gatling.
Stefania Ioana Chiorean: I agree.
[31:37] Tamar Twena-Stern: To me, it's much more simple and gives better results, to me at least.
Stefania Ioana Chiorean: Yeah, yeah. Personal experience. Yeah. Mark has another question. "If my computer does not have enough power to challenge the server, what should I do? May I orchestrate a swarm of computers to run AutoCannon?"
[32:01] Tamar Twena-Stern: He was saying that his computer doesn't have enough power to run AutoCannon, if I understand correctly. Yeah, definitely he can deploy it in the cloud, and he can build a cluster in the cloud and deploy the software that does the testing with AutoCannon over there.
[32:38] Stefania Ioana Chiorean: Oh, good to know. We do have another one from MandalorQA. And the question is "Can you integrate AutoCannon with end to end test in Cypress?"
[32:49] Tamar Twena-Stern: Oh yeah. Sorry. Actually I didn't try to integrate it with Cypress to be honest. But well, you can do coded integration everywhere. It shouldn't be hard.
[33:10] Stefania Ioana Chiorean: Yeah. Yeah. I agree. And also, in case they give it a try, maybe they can let us know afterwards. And maybe one last question, I think we have time, "Are you integrating it into a CI or running performance testing continuously, such as once a day or every deployment"?
[33:27] Tamar Twena-Stern: All right. So I am personally supporting the approach of running performance testing once a day, let's say, and not in every CI, because I think it a little bit slows down the development. Of course, it has the downside of not knowing on which specific commit things were broken. But I think that faster development happens when you're running once a day.
[34:01] Stefania Ioana Chiorean: Oh, that's an interesting opinion because these days everything is pushed to be like DevOps, everything we integrate in the continuous pipeline. And maybe we should…
[34:13] Tamar Twena-Stern: The thing is that this test is long, and at least in XM Cyber, we have a huge CI/CD that does a lot of things. And actually it's very efficient and finds a lot of bugs, but this specific test is long. Yeah. So it would mean that a developer would have to wait for a few hours. It's adding a few hours to every commit, it's long.
[34:51] Stefania Ioana Chiorean: Yeah, definitely. And you have to balance quality and time invested. Thanks so much Tamar for joining. Thanks so much for being here and thanks for the great talk, it was nice having you.
Tamar Twena-Stern: Thank you very much.