Modern distributed architectures are more complex than ever before, with a majority of companies operating multiple languages, protocols and architectural styles. This poses significant challenges for engineering teams increasingly asked to deliver more at speed. Whilst the practice of contract testing rose to prominence during the RESTful microservices boom to address similar challenges, the problem statement has evolved. In this talk, we'll discuss these new challenges and demonstrate how contract testing is still as relevant as it has ever been in helping to solve them.
Beyond REST - Contract Testing in the Age of gRPC, Kafka and GraphQL
TestJS Summit 2022
Okay. Well, thank you everybody for coming to my talk, Beyond REST: Contract Testing in the Age of gRPC, Kafka and GraphQL. My name is Matt Fellows. I'm a principal product manager at SmartBear. I was a co-founder of PactFlow, which joined the SmartBear family back in April this year. And I'm a core maintainer of Pact, which is an open source consumer-driven contract testing framework and, of course, the subject of much of today's talk.
Prior to working at PactFlow I was a consultant. I'm a recovering consultant, and I was lucky enough to work with some of Australia's biggest and most well-known organizations on their distributed systems and really see them evolve over my career. There has been, of course, a huge amount of technological change since the days of SOAP, which was the predominant technology when I joined the industry. In my relatively short career I've worked from that SOAP starting point to REST and microservices, the rise of public cloud and IoT, event sourcing, event streaming, modern data pipelines and, of course, serverless architectures.
I found in many of these situations and contexts that contract testing was still really relevant, and I would often look to introduce it into places where there were benefits to be had in shifting left, moving faster and solving those problems. But of course during those rollouts I'd often be on the receiving end of some kind of snarky comment, usually from another competing consultancy, that had the following shape: "If we just used [insert technology here], then we wouldn't need contract testing."
But is it true? Well, today we're going to examine that statement. We're going to ask the question: is REST really the problem, and could we save ourselves the trouble of having to think about contract testing and writing tests by using a superior technology for API communications?
We'll learn briefly what contract testing is, why it exists and the problem that it solves, and we're going to look to see if history is repeating itself or if these newer technologies and architectural trends really do solve the problem. Specifically, we're going to look at a few classes of technologies. We're going to look at specifications such as OAS and AsyncAPI, to a degree GraphQL, and also IDLs, things like protobufs, Avro and Thrift. There are of course others, but these are by far the most common alternatives suggested to me by my consultant interlocutors.
We'll look at these from a general contract testing lens, but of course we'll use Pact, which is obviously a tool that I work on, as a concrete implementation to help us understand how it works in practice. And hopefully you'll see that while Pact has evolved to meet some of these needs, the problem and solution are much more general than any specific technology or language we discuss today.
Let's quickly understand why contract testing exists and the context in which it operates. I think starting with a customer quote is a good way to set the scene, just to help us feel the problem. This is a quote from a PactFlow prospect reaching out for some help. For argument's sake, and to protect the innocent, let's call him Bill. Bill is a leader of a testing organization for over 40 teams in a very large banking institution, and you can see here he's basically describing working in this big environment with highly volatile testing environments, which makes it challenging.
He's trying to work out how he can use contract testing to test all the things he's got: RESTful services, GraphQL, Kafka, third-party systems, all these things. What we can take away from this is that he works in a chaotic, complex environment and he's looking for ways to bring some process and control to that situation.
If that sounds anything at all like your company or architecture, you're not alone. Research from SmartBear's State of Quality report as well as Postman's API report really backs this up. For one, internal microservices are becoming a massive focus for teams. You can see here that 61% said they're going to see the most growth in microservices, and just behind that are internal services, which are about making data available internally to other teams to create more value, which is really interesting. You can also see that most companies operate in a multi-protocol environment, 81 percent or so, and almost 60% have three or more protocols.
Now, of course, while microservices aren't new and many lessons have been learned, there are issues we're starting to see, really a decade on, that are still emerging from this new way of doing things. In both these reports we can see that 50% of people stated that experience or skills were a barrier to getting microservices going, and 35% stated the complexity of systems. That second issue is becoming the problem. The obstacles are around the speed of delivery: the expected speed of delivery versus the time to actually test and build stuff. These are really at odds with one another.
What I found most interesting of all was that mature organizations are the ones feeling the pain. And so you think, why? It's a bit counterintuitive: if they're mature, they've probably got the practices and technology and whatnot to deal with it. But I actually think the reasons are well understood.
One of those reasons is how we test microservices today. Most companies rely on testing their distributed systems using a form of integration testing called end-to-end integrated tests. This involves basically taking all your applications, deploying them into one big shared environment and running a battery of tests against the whole system, so all the layers of your system. And then if that works, you can deploy.
Now, this might give you a high degree of confidence if the tests pass, but they do tend to be quite slow, they tend to be very flaky and they tend to be very hard to debug. And because of all this they give you feedback much further down the life cycle, because you've had to deploy before you can get that feedback. It means things are very difficult to deploy. You probably can't deploy things independently, and you probably have a distributed monolith rather than a nice coherent set of cooperative components working with a single end in mind.
This creates a problem as you start to scale the size of your ecosystem, both people and software. As you add new components and people into the system, you see a nonlinear response in things like cost, complexity, time, number of environments, build time, the costs associated with change and developer idle time. But if you look really carefully, you'll notice that we only start to feel the pain a bit further down. That inflection point is not there at the start. So you come into the system thinking it's easy to use, but then as you scale you eventually hit a tipping point where it becomes really painful. So no wonder Bill's having a bad time. This explains what is going on there.
Okay. So what's the solution? Well, one of the solutions is using things like contract testing.
Contract testing can help by reducing a lot of the end-to-end tests and replacing them with a way of testing your API communications, which is often what end-to-end testing aims to do. Contract testing is a technique to ensure that two systems are able to communicate, and we look at the integration points. We do it by mocking out the dependencies. So if you're an API consumer, you mock out the provider, and you replay those mocks later on against the real provider. If those mocks have been validated, we can feel confident these two systems are able to communicate.
The benefit of this way of working is that it's much simpler to understand: you're just testing a single integration point at a time. You don't need dedicated test environments; you can run it on your dev machine. Because of this you get fast, reliable feedback that's easy to debug. These tests scale linearly, and we can deploy the services independently. And of course we can track this over time, which gives us the ability to evolve them. We talked about this at a previous TestJS Summit, so go back and watch that video if you want the rest of the detail.
Okay, so hopefully we've got a bit of a grasp of the problem we're trying to deal with and how contract testing might help with that. Let's now return to the original hand-waving argument from my consultant interlocutor and ask the question: is REST really the problem, and could we save ourselves all the trouble by using something else?
Let's start with OpenAPI and its counterpart JSON Schema, but you can also think about this as applying to things like AsyncAPI and SOAP and other specifications. To an extent GraphQL comes into this mix as well, but we don't have time to talk about GraphQL specifically today. So how does it aim to solve the problem?
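Before moving on to specifications, the mock-and-replay mechanic described above can be sketched in a few lines. This is a minimal illustration of the idea, not Pact's API: the `Interaction` record, the `MockProvider`, the `/orders/1` endpoint and both provider functions are hypothetical names invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Interaction:
    """One recorded request/response pair; a contract is a list of these."""
    method: str
    path: str
    status: int
    body: dict


class MockProvider:
    """Stands in for the provider in the consumer's test and records what was asked."""

    def __init__(self, canned):
        self.canned = canned   # {(method, path): (status, body)}
        self.recorded = []     # becomes the contract

    def handle(self, method, path):
        status, body = self.canned[(method, path)]
        self.recorded.append(Interaction(method, path, status, body))
        return status, body


def fetch_order_total(send, order_id):
    """Consumer code under test: it only ever reads `total`."""
    status, body = send("GET", f"/orders/{order_id}")
    if status != 200:
        raise LookupError(order_id)
    return body["total"]


def real_provider(method, path):
    """Hypothetical real provider; extra fields are fine for this consumer."""
    if method == "GET" and path == "/orders/1":
        return 200, {"id": "1", "total": 42.5, "currency": "AUD"}
    return 404, {}


def renamed_provider(method, path):
    """A later provider build that renamed `total` to `grandTotal`."""
    status, body = real_provider(method, path)
    if "total" in body:
        body["grandTotal"] = body.pop("total")
    return status, body


def verify(contract, provider):
    """Replay every recorded request and compare status, field presence and type."""
    failures = []
    for i in contract:
        status, body = provider(i.method, i.path)
        name = f"{i.method} {i.path}"
        if status != i.status:
            failures.append(f"{name}: expected status {i.status}, got {status}")
        for key, example in i.body.items():
            if key not in body:
                failures.append(f"{name}: missing field '{key}'")
            elif type(body[key]) is not type(example):
                failures.append(f"{name}: field '{key}' changed type")
    return failures


# Step 1: the consumer test runs against the mock and produces the contract.
mock = MockProvider({("GET", "/orders/1"): (200, {"total": 10.0})})
total = fetch_order_total(mock.handle, "1")
contract = mock.recorded

# Step 2: the contract is replayed against each provider build.
ok_failures = verify(contract, real_provider)
broken_failures = verify(contract, renamed_provider)
print(ok_failures)
print(broken_failures)
```

Neither step needs a shared environment: the consumer test runs against the mock, and the provider check runs against the provider alone. The contract only mentions `total`, so the provider remains free to add or change fields this consumer never reads.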
Well, the first thing is that specifications contain all the bits needed for humans and computers to communicate what an API can do. They use things like JSON Schema to tell us what the shape of the incoming data is and what the response shape should look like, or can be, as well. And then we can generate API clients and servers from that OAS, so we know we're not going to have breaking changes, right? If we can generate client code from the OAS, are we always guaranteed to have a working system?
Well, of course the answer is no, otherwise we wouldn't be here. And if you're old enough to remember SOAP, you'll remember that we had all the same features there too: a clear specification, a clear schema, client generation and service generation. But it didn't solve the problem. REST obviously has some better robustness principles built into its design, Postel's law and extensibility, but it really isn't enough.
If you look at this quote from the JSON Schema website, it basically says that if you're going to do this kind of testing with JSON Schema to validate your requests, you're going to need two types of validation. One is structural, at the schema level, which is what JSON Schema can do. The other is at the semantic level, and that probably needs to be done in code. So JSON Schema still can't do that, and it even tells you so.
So the aphorism that an API is "not incompatible" with a spec starts to sum up how I think about OpenAPI. What I mean by that is that a schema is abstract. So it's very difficult to say that an API actually does implement a spec, because the spec is abstract and not exactly that clear.
Some examples: optional and nullable fields. In sufficiently advanced OpenAPI documents you'll see the use of optional and nullable fields, but you won't know in what situations those fields will or will not be present.
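The gap between structural and semantic validity can be illustrated with a small sketch. It uses a hand-rolled structural check standing in for a JSON Schema validator, so nothing here is a real library call; the schema, the field names (`contactPreference`, `email`, `phone`) and the consumer rule are all assumptions made up for the example.

```python
NoneType = type(None)

# Hypothetical schema: `email` and `phone` are both optional and nullable.
SCHEMA = {
    "required": ["id", "contactPreference"],
    "properties": {
        "id": (int,),
        "contactPreference": (str,),
        "email": (str, NoneType),
        "phone": (str, NoneType),
    },
}


def structurally_valid(doc, schema=SCHEMA):
    """Schema-level check: required keys exist and present keys have an allowed type."""
    if any(key not in doc for key in schema["required"]):
        return False
    return all(
        isinstance(doc[key], allowed)
        for key, allowed in schema["properties"].items()
        if key in doc
    )


def semantically_valid(doc):
    """Consumer-level rule the schema cannot state: the preferred channel must be filled in."""
    return doc.get(doc["contactPreference"]) is not None


with_email = {"id": 1, "contactPreference": "email", "email": "bill@example.com"}
without_email = {"id": 2, "contactPreference": "email", "phone": None}

for doc in (with_email, without_email):
    print(doc["id"], structurally_valid(doc), semantically_valid(doc))
```

Both documents pass the structural check, but only the first one is usable by a consumer that needs to email the customer. The schema says what a response may look like, not which optional fields appear under which conditions.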
In some cases a lot of those fields turn out to be optional, and so now it's really difficult to understand what data will be available at what point in time, yet certain consumers may need that information. Combine this with polymorphic endpoints, so endpoints that accept different shaped inputs and return different shaped outputs. It now becomes very difficult to say, for every single resource in your document, which input is going to correspond to which output and under what conditions. In the case of SOAP, we actually had Schematron to help us with this. It was an extra layer of protection you could put over the top, but it needed XSLT to do that, which used functions to do it.
Of course, we lose sight of the API surface area as well. If we're using generated client SDKs, we don't know what a consumer is doing. So we have to assume they're using the entire interface, which means we need a different mechanism for evolution. The standard one here is going to be versioning, which we'll touch on in a second.
And last but not least, client SDKs can be modified and used in unexpected ways. I was talking to one of our solution engineers in Europe, who works with some of the world's largest SwaggerHub customers, and he mentioned that basically 90% of all the customers who use codegen will actually modify the generated code when the OAS changes after that initial generation. So that opportunity for drift is definitely still present.
Let's briefly touch on versioning. Versioning is the most common practice here, but it is painful. We don't want to do it if we can avoid it. Teams need to build another version of the software, maintain it, test it and release it, and now we've got multiple versions we need to maintain, so we've got more code to maintain. And of course, the cost of maintaining code is usually much greater over its lifetime than the cost of building it.
Because we don't know what consumers are using, functionality always carries forward through those versions: we have to assume they need it and want it. And then we need to get consumers onto those new versions eventually, which again requires us to monitor, coordinate and communicate to move people across. Really, this is a cost, and if we can avoid this overhead we should.
So let's talk about the second class of technologies here: interface definition languages. We're talking about things like protobuf, Avro and Thrift, and I should say, anecdotally, that protobuf is by far the most common. The comment that comes up with me is that if we just used protobufs, we wouldn't need contract testing, because it's got schema evolution built into it from the start. In fact, it's so good you can go forwards and backwards in time. You can use older clients with newer servers and newer clients with older servers, and they can all communicate magically. And of course it supports codegen as well, so we can create services and clients from those definitions.
So if that's the case, then why is protobuf and gRPC the number one requested feature on our open source roadmap?
Well, the answer here is: colorless green ideas sleep furiously. What am I saying here? Obviously this is an absurd statement. It actually comes from a guy named Noam Chomsky, who was writing about this in the 1950s in his dissertation about language. The point he's making is about syntax and semantics: that something is syntactically well-formed does not mean it's semantically understandable. So syntax and semantics are different. The sentence is grammatical, but there's no meaning here. What we're trying to say is that just because we can communicate with one another, we still need to understand what it is we're saying. So let's take a bit of an example here.
Let's say we've got a Kafka topic where we're posting orders that have been completed, and we have a consumer on that topic that's reading all the orders coming in and tallying up the totals, just to do a report or something. So every time an order comes through, it looks at the order, reads the total value and prints it out or saves it somewhere. Now let's say we need to change that order structure. Maybe we need to split the total up into some different values, say GST or tax or what have you, or we need to move it somewhere else in the payload. Either way it's changed, and now we've pushed out this new message. Well, just because the consumer can still read the message doesn't mean it didn't need that total value, right? It still needs the value. Otherwise, it can't do its job.
This problem manifests itself even further, and gets much more difficult, when you combine it with optional fields. In researching this talk I spoke with a team that worked with gRPC and protobufs in a global payments processing company. They had a strange bug appearing where merchants were occasionally not receiving payments from the provider. After careful analysis and debugging they discovered that the merchant service's configuration API had been changed: it had a new Boolean field on it for enabling or disabling payments for a merchant. They then found there was one single client that didn't know about this update, and every time it sent a request to the merchant service it would inadvertently disable the merchant's payments, because the default value of the field, of course, is going to be false.
So you can see how this kind of bug could be particularly nefarious to find and resolve. And it's actually a case where you probably don't want the forwards and backwards compatibility; maybe an explicit failure would be better.
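The merchant bug can be reproduced in miniature. The sketch below does not use the protobuf library; it imitates proto3-style decoding, where an absent Boolean reads as `False`, with plain dataclasses and dictionaries. The message shapes, the field name `payments_enabled` and the `MerchantService` class are assumptions for illustration, not the real system from the anecdote.

```python
from dataclasses import asdict, dataclass


@dataclass
class MerchantConfigV1:
    """What the out-of-date client knows about: no payments flag."""
    merchant_id: str
    display_name: str


@dataclass
class MerchantConfigV2:
    """What the updated service expects."""
    merchant_id: str
    display_name: str
    payments_enabled: bool = False


def decode_v2(wire):
    """Imitates proto3-style decoding: a missing Boolean silently becomes False."""
    return MerchantConfigV2(
        merchant_id=wire["merchant_id"],
        display_name=wire["display_name"],
        payments_enabled=wire.get("payments_enabled", False),
    )


class MerchantService:
    """Hypothetical service that stores whatever configuration it is sent."""

    def __init__(self):
        self.store = {}

    def update(self, wire):
        config = decode_v2(wire)
        self.store[config.merchant_id] = config
        return config


service = MerchantService()

# An up-to-date client enables payments for the merchant.
service.update({"merchant_id": "m-1", "display_name": "Cafe", "payments_enabled": True})

# The old client only wants to rename the merchant. The message decodes without
# error, but payments are switched off as a side effect.
service.update(asdict(MerchantConfigV1("m-1", "Corner Cafe")))

print(service.store["m-1"])
```

Nothing fails at the wire level, which is exactly why the problem is hard to spot. A contract recorded from the old client's point of view would at least make visible that it never sends the new field.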
And in fact, to make things worse, version 3 of protobuf gets rid of required fields altogether, which feels very much like what happened with SOAP in the old days, where everything was an optional field and it was just really an incomprehensible API.
So coming back to the challenges with protobufs: we talked about semantics and optional fields. We still need to manage breaking changes. Field descriptors and things like this should theoretically be easy to manage, but in practice it's really easy to accidentally refactor your code and change field descriptors. We may need to consider how we manage transport safety, since protobuf can go over different types of transports. We may want to look at narrowing the types. A lot of these technologies, IDLs in particular, don't have really rich type systems. GraphQL is a good example of one that does. But let's say you want to store a date, such as an ISO date or some other particular date format, or a semver, all these kinds of things. The language won't have those primitives built into it, so you need to put that on top, and that can be a challenge. And of course if we use SDKs and whatnot, we're going to lose visibility into real-world client usage, just like before. And if we accept these problems, we now need a new way of coordinating changes.
So hopefully you're starting to see a bit of a theme here. Whilst these technologies obviously do give us some benefits, they don't actually solve the one problem we've been talking about. So let's return to what lessons we can learn from all this and see if we can understand why consumer-driven contract testing is so effective in helping us.
The first point I want to make is that your provider contract is not your API. It is just one representation of your API. And in fact, any observable change in the behavior of an API will be deemed breaking by certain consumers.
This observation is known as Hyrum's Law. While we can't make this law disappear, we can find ways to reduce the ambiguity, and one of those ways is to bring the consumer's perspective into the picture, which is of course how contract testing and Pact can help us.
We use record and replay to test representative examples against the real provider. This gives us confidence that it's actually going to work, because we're testing it. We also get specification by example, because that is what we're doing, and that reduces the ambiguity. We now see how it's supposed to work, and it improves API comprehension. We can actually do time travel and service evolution, because we now know pairs of application versions and what contracts are valid between them at any point in time, so we can go forwards and backwards in time. We can ask: can this version of the consumer read messages from this version of the producer, and vice versa? Very cool. We encode those transport concerns in the contract. We can do narrow type matching in the contract. And of course, because of the contracts, we know what all the consumers are doing. We have the surface area, and that gives us a mechanism to evolve without having to do versioning.
So I hope you can see that contract testing provides a generalized data exchange model that works across transports, protocols and content types. Whether you're providing RESTful APIs over HTTP, or protobufs over gRPC or Kafka, or writing Avro to a file system as part of an ETL data pipeline, the problem we're trying to solve is really the same: can we communicate some data between multiple versions of evolving applications?
So while Pact doesn't have native support for all of these technologies out of the box, it can be extended by plugins. You can write a plugin in your language of choice to support things like transports, protocols or matching rules.
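The "time travel" and "surface area" ideas above can be made concrete with a small sketch built on the earlier Kafka order example. This is an illustration under stated assumptions, not how Pact stores or evaluates contracts: the application names, version labels, message fields and regular-expression matchers are all hypothetical.

```python
import re

# Narrow matchers layered on top of a plain string type.
DATE = r"\d{4}-\d{2}-\d{2}"
MONEY = r"\d+\.\d{2}"

# Each consumer version declares only the fields it actually reads.
CONSUMER_CONTRACTS = {
    "report-consumer@1": {"total": MONEY, "completedAt": DATE},
    "report-consumer@2": {"subtotal": MONEY, "tax": MONEY, "completedAt": DATE},
}

# One representative message per producer version.
PRODUCER_EXAMPLES = {
    "order-producer@1": {"orderId": "o-1", "total": "110.00", "completedAt": "2022-11-03"},
    "order-producer@2": {"orderId": "o-1", "total": "110.00", "subtotal": "100.00",
                         "tax": "10.00", "completedAt": "2022-11-03"},
    "order-producer@3": {"orderId": "o-1", "subtotal": "100.00", "tax": "10.00",
                         "completedAt": "2022-11-03"},
}


def satisfies(message, contract):
    """True when every field the consumer reads is present and matches its pattern."""
    return all(
        isinstance(message.get(field), str)
        and re.fullmatch(pattern, message[field]) is not None
        for field, pattern in contract.items()
    )


def compatibility_matrix(contracts, examples):
    """Which consumer version can read messages from which producer version."""
    return {
        (consumer, producer): satisfies(message, contract)
        for consumer, contract in contracts.items()
        for producer, message in examples.items()
    }


def surface_area(contracts):
    """The set of fields that at least one consumer depends on."""
    return set().union(*(contract.keys() for contract in contracts.values()))


matrix = compatibility_matrix(CONSUMER_CONTRACTS, PRODUCER_EXAMPLES)
for pair, compatible in sorted(matrix.items()):
    print(pair, compatible)
print(sorted(surface_area(CONSUMER_CONTRACTS)))
```

In this sketch `order-producer@2` carries both the old and new fields, so it is safe for every consumer, while `order-producer@3` is only safe once `report-consumer@1` is retired. `orderId` is outside the surface area, so the producer could change it without breaking either consumer, which is the sense in which evolution no longer requires a new API version.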
And then you can just distribute that out to all the languages that support plugins. We've added that support to Pact JVM, so Java, Kotlin and whatnot, and to Pact Go, and we've just released beta support for JS, so you're all the first to know about that. The first official plugin we created was for protobufs, of course. So here's an example of a gRPC/protobuf test failing where the provider has not matched the exact content type of the consumer.
So what are the takeaways here? Multi-protocol internal microservice adoption is accelerating, and the lack of standardization for design and test is contributing to the challenges of microservices for all. We learned about Hyrum's Law and the need to reduce ambiguity, and how an API is not the specification itself; the specification is just one view of the API. Contract testing is an approach that can help reduce both the complexity of our testing and the ambiguity inherent in those API specifications. And lastly, we saw how Pact can wrap all this up into a standardized workflow for testing those API communications across languages, transports and protocols.
So, thank you so much for coming to my talk. I really hope that was interesting and that you learned something. If you have any questions, do reach out to me on any of the channels below, and enjoy the rest of the conference. Thank you.