Creating and deploying new software is risky. We've all seen how easily bugs arise, causing software to be delivered poorly or to the wrong people. What's more, depending on how tightly we couple our systems and services, they can interact in unexpected and unfortunate ways with existing software or hardware. Beyond unintended consequences, we can also find people using our services for nefarious purposes. It's essential to have safety nets in place for when things don't go as planned or people attempt to break the rules. In this session, we'll discuss how feature flags can work in both temporary and permanent scenarios to enable you to break the quality triangle and deliver quality promptly.
Keep Calm and Deploy On: Creating Safer Releases with Feature Flags
AI Generated Video Summary
Feature flags can be used to mitigate risk in software development by altering the visibility of features to end users. By using flags, you can protect against single points of failure and pivot to a fallback service in worst-case scenarios. Monitoring and managing complexity is crucial, and using feature flags allows for dynamic changes and adjusting values based on proven correctness. Operating in the unknown is inevitable in software development, so it's important to manage complexity and embrace learning. Collaboration is key in making feature failures less painful.
1. Introduction to Flags and Risk Mitigation
Hey everyone at React Advanced. I'm Jessica, and I'm going to talk to you about how you can use flags to mitigate risk in your software development. Feature flags are typically used to alter the visibility of a feature to end users. They can be used for testing, rolling out features to a subset of users, and more. At LaunchDarkly, we can flag based on different types of data, allowing you to mitigate risk in complex scenarios.
Hey everyone at React Advanced. Hope you're having a good time. I'm Jessica, and I'm going to talk to you about how you can use flags to mitigate risk in your software development. So, let's get into it.
Now you've likely heard about feature flags solving sort of release-shaped problems, right? And they're often used in these sort of entitlement scenarios, changing what's available to certain users. And it's typically used in that sort of boolean state. We take a feature, we wrap it in a flag, and that effectively becomes our control point within our code, allowing us to alter its visibility to our end users. The feature's either visible or it isn't. It's on or it's off. And once we've validated the changes in production and are confident that our feature can be on for 100% of our audience, we get rid of the flag. That's the kind of typical lifecycle that we see with flags.
As you know, this is super useful when it comes to, say, previewing features for testing and production without going out to our end users or for rolling out customer-facing features to just a subset of our user base. But what if the problem we're trying to solve requires more than just a binary state change or A-B testing? At LaunchDarkly, when we're talking about flags, we're not simply talking about two states. We can actually deal with a whole spectrum of stages in your release process. We can flag based on a number, a string. We even have JSON flags. And that allows you to mitigate risk in these more sort of complex scenarios.
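To make that concrete, here's a minimal sketch of a flag as a control point. This is illustrative pseudocode, not the LaunchDarkly SDK API — the flag names, the in-memory `Map`, and the `variation` helper are all invented for the example — but it shows the same idea: a flag can carry more than a boolean, and evaluation always falls back to a safe default.

```typescript
// A flag value can be a boolean, a number, a string, or a JSON blob.
type FlagValue = boolean | number | string | Record<string, unknown>;

// Stand-in for a flag service (illustrative only).
const flags = new Map<string, FlagValue>([
  ["new-checkout", false],                          // classic on/off release flag
  ["api-timeout-ms", 2500],                         // numeric tuning value
  ["search-provider", "primary"],                   // string for pivoting between services
  ["banner-config", { text: "Hi", color: "#0af" }], // JSON flag
]);

// Evaluate a flag, falling back to a safe default if it's missing.
function variation<T extends FlagValue>(key: string, fallback: T): T {
  const value = flags.get(key);
  return (value ?? fallback) as T;
}

// The wrapped feature is only visible when the flag says so.
if (variation("new-checkout", false)) {
  // renderNewCheckout();
} else {
  // renderClassicCheckout();
}
```

The important detail is the fallback argument: if the flag can't be evaluated, your code still has a known-safe value to run with.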
2. Flags for Risk Mitigation
It's important to protect yourself from single points of failure and mitigate risk by using flags. By flagging around potential failure points, you can create a system that lets you pivot to a fallback service in worst-case scenarios. This helps you roll back gracefully and maintain uptime, even in the presence of bad actors. Switching flags can support online stability and provide agility in resolving issues. Staged rollouts ensure a fix can be applied safely across your user base. Flags give you certainty and the ability to operate from one version of the truth. It's crucial to take care when deploying in complex environments.
It ultimately helps you maintain availability of all of your applications. And it's super common, as we all know, to rely on downstream services and providers. But things start to get scary when you have a single point of failure in your delivery. Well, why not protect yourself? De-risk that element. By flagging around that point, you can effectively create a system that allows you to pivot to a fallback service in case the worst-case scenario does in fact occur, which we know it often does, unfortunately.
This gives you the ability to roll back gracefully, without having to go offline altogether, all within about 200 milliseconds. You're protecting your uptime, you're supporting your team's SLOs, and everyone's much happier. This can also be done in the case of bad actors. Say someone's using your service for something they really shouldn't be. You can isolate that one endpoint. You can return a 404 for that one bad-acting device while everyone else still gets their 200s. In essence, you get to define how you degrade. You get to ring-fence your blast radius and make a decision about how you roll back. This is perfect for scenarios like load shedding or for manual control of certain problems. This process is all about putting you back in control of a situation that you likely didn't anticipate or ask for.
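Both of those patterns can be sketched together. Again, everything here is hypothetical — the provider names, the `handlePayment` function, and the device ID are invented for illustration; in production these values would come from your flag service and update in near-real-time rather than living in local variables.

```typescript
type Context = { deviceId: string };

// A string flag picks which provider serves a request (stand-in variable).
let paymentsProvider: "primary" | "fallback" = "primary";

// A targeting rule ring-fences known bad actors (stand-in set).
const blockedDevices = new Set<string>(["device-1337"]);

function handlePayment(ctx: Context): { status: number; via?: string } {
  // Bad actor? Give just that one device a 404; everyone else keeps their 200s.
  if (blockedDevices.has(ctx.deviceId)) {
    return { status: 404 };
  }
  // Single point of failure protected by a flag: flip paymentsProvider
  // to pivot every request to the fallback service without a redeploy.
  return { status: 200, via: paymentsProvider };
}
```

Flipping `paymentsProvider` to `"fallback"` reroutes traffic immediately, which is the "define how you degrade" idea: the blast radius is exactly the code behind the flag, nothing more.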
And of course, when we're talking about resolution of these sorts of scenarios, let's take the situation where a safety valve is able to maintain uptime by rolling back to a previous state after a breaking change. Switching a flag can not only support you in staying online, but also gives you the agility needed to fix the issue at hand. Using the audit log and your observability platform, you're able to pinpoint the issue: see when and where it occurred, and what change contributed to the outage. And when your fix is, in fact, ready to deploy, you of course need to be sure that it can actually go out to all of your users, that it is going to be a solution that can be applied across your user base and isn't going to cause further problems when implemented. Because we can stage your rollout, you can stage your fix by going out to a subset of your users at first and gradually rolling out to more and more people as your confidence grows. Flags give you the gift of certainty here. They give everyone the ability to operate from one singular version of the truth. And now that you're back online, your fix is live to your entire user base.
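The staged-rollout idea can be sketched in a few lines. The bucketing hash below is a toy I've made up for illustration — real flag platforms use a similar idea with much better hashing — but it captures the key property: each user lands in a stable bucket, so they don't flip between variations as the percentage grows.

```typescript
// Hash a user key to a stable bucket in [0, 100). Toy hash, not production-grade.
function bucket(userKey: string): number {
  let h = 0;
  for (const ch of userKey) {
    h = (h * 31 + ch.charCodeAt(0)) >>> 0; // keep it in unsigned 32-bit range
  }
  return h % 100;
}

// Start small and grow this number as confidence in the fix grows.
let rolloutPercent = 10;

// The fix is served only to users whose bucket falls under the rollout line.
function fixEnabled(userKey: string): boolean {
  return bucket(userKey) < rolloutPercent;
}
```

Raising `rolloutPercent` from 10 to 50 to 100 only ever adds users to the fixed experience; nobody who already had the fix loses it, which is what makes a gradual rollout feel like "one version of the truth."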
Of course, we want to stay online, right? Sometimes it's hard to know if your configuration is truly good to go. You may make some guesses based on your platform and how it behaves in certain scenarios. But the thing is that assumptions can easily be proven incorrect and preconceptions proven wrong. When you have a myriad of microservices, or you're dealing with processes requiring numerous network calls, there's often some complex tuning required. A lot of the time you have to take a great deal of care when deploying.
3. Monitoring and Managing Complexity
Monitoring is crucial for understanding the impact of CPU and performance. Accepting guesswork as part of your workflow can be beneficial. Storing configurations in feature flags allows dynamic changes. Starting with assumptions and adjusting values based on proven correctness. Different configurations can be pushed based on IP addresses, node sets, or methods. Operating in the unknown is inevitable in software development. We must manage complexity and embrace learning. Exploring new possibilities with added protection and confidence. Collaborating to make feature failures less painful.
Monitoring can quickly become your best friend here: you need to get a real sense of the impact on overall CPU and performance that you're incurring.
How about accepting the guesswork as part of your workflow? Store your configuration in a feature flag. This approach gives you the ability to dynamically change which value you're going out with. You start with your assumptions and roll towards a more suitable value as those assumptions are, hopefully, proven correct or incorrect as you go. You can even push different configurations based on different IP addresses, node sets, or methods to get where you want to go.
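Here's one way that configuration-in-a-flag idea might look. All the names are illustrative — the `TuningConfig` shape, the `canary` node set, and the specific numbers are assumptions for the sketch — but the pattern is real: a JSON-shaped flag holds the tuning values, and different targets get different variations.

```typescript
// Tuning values you'd otherwise hard-code or guess at deploy time.
type TuningConfig = { maxConnections: number; timeoutMs: number };

// The default variation: your starting assumption.
const defaultConfig: TuningConfig = { maxConnections: 50, timeoutMs: 3000 };

// Overrides keyed by node set — push a more aggressive config to canary
// nodes first, then promote it once monitoring proves it out.
const overrides: Record<string, TuningConfig> = {
  canary: { maxConnections: 200, timeoutMs: 1500 },
};

// Resolve the config for a given node set, falling back to the default.
function configFor(nodeSet: string): TuningConfig {
  return overrides[nodeSet] ?? defaultConfig;
}
```

Because the values live behind the flag rather than in a deploy, adjusting them is a flag change, not a release: you watch the canary's CPU and latency, then copy the winning values into the default variation.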
And when it comes to software, operating within the unknown sometimes just can't really be avoided. We've got to manage complexity, and we've always got to be learning. We're going to need to venture into uncharted territory to discover better ways of doing things. That's just how this works. But let's make sure that we're exploring these new possibilities with a sense of added protection and confidence. So while features will inevitably fail sometimes, and things will go wrong, let's work together to make them less painful.