Building Reliable Backends with Durable Execution

Sylwia Vargas
21 min
04 Apr, 2024


  • James S
    I was disappointed that this ended up being an ad for a SaaS instead of ways to build reliable backends without throwing money at the problem.

Video Summary and Transcription

This Talk explores the paradigm of message queues for reliable backend execution. It highlights the benefits of message queues, such as guaranteed delivery and offloading of long-running processes. The drawbacks of using queues are discussed, including the complexity of managing infrastructure and applications. The solution of using a reliability layer called Inngest is presented, which allows for non-blocking background tasks and provides a dashboard for monitoring and managing jobs. The Talk also emphasizes the importance of reliability in building software systems and introduces the expanding scope and functionality of Inngest.

1. Introduction to Message Queues

Short description:

Hello, everyone. Welcome to my talk about reliability, backend, and execution. I will discuss the paradigm that makes life easier. We are now living in a constant 90s nostalgia. The 90s brought us many great things, but there is one thing we could say goodbye to: queues. Message queues are a form of asynchronous service-to-service communication. They allow for guaranteed delivery and offloading of long-running processes.

Hello, everyone. Welcome to my talk where, for the next 20 minutes, I will talk about reliability, backend, and execution. Just a quick introduction. My name is Sylwia Vargas. I'm from Poland. I really love pierogi, and previously I worked at StackBlitz. Now I'm a developer relations lead at Inngest.

This talk is about the paradigm that makes life easier. But before we talk about the good, let's talk about the bad. We are now living in a constant 90s nostalgia. And, of course, this is no surprise. The 90s brought us a lot of different things, great stuff that is still with us. However, there is one thing that we could possibly say goodbye to. And that is queues.

So let's look at what message queues are. A message queue is a form of asynchronous service-to-service communication used in serverless and microservices architectures. Messages are stored on the queue until they are processed and deleted. Each message is processed only once, by a single consumer. But here I need to interject because, in actuality, multiple workers can consume messages from a queue; it's only when you need to preserve the ordering of tasks that they have to execute serially. But back to the definition now. Message queues can be used to decouple heavyweight processing, to buffer or batch work, and to smooth spiky workloads. So you can think about it like this: once you add something to the queue, it will reach its destination, one message at a time. The delivery is guaranteed. And what's happening in the queue does not impact other parts of the infrastructure. And queues can be really massive.
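To make the definition concrete, here is a minimal in-process sketch using Python's standard `queue` and `threading` modules. It is only an illustration, not a production broker: the `run_workers` helper and the worker IDs are hypothetical names, and a real message queue would also persist messages and survive restarts.

```python
import queue
import threading


def run_workers(messages, worker_count=2):
    """Process every message exactly once using several consumer threads."""
    q = queue.Queue()
    for message in messages:
        q.put(message)  # the message waits on the queue until a worker takes it

    processed = []           # (worker_id, message) pairs, in completion order
    lock = threading.Lock()  # guards the shared list

    def worker(worker_id):
        while True:
            try:
                # get_nowait() removes the message and hands it to this consumer only
                message = q.get_nowait()
            except queue.Empty:
                return  # nothing left to consume
            with lock:
                processed.append((worker_id, message))
            q.task_done()  # mark this message as fully handled

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(worker_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return processed
```

Calling `run_workers(["a", "b", "c"], worker_count=3)` returns one entry per message. Which worker handles which message is not deterministic, which is why ordering guarantees require serial execution.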

So let's recap. With queues, you get guaranteed delivery because you know that once something is added to the queue, it will leave it only once it's processed. And queues allow developers to offload long-running processes to the background so that your application does not choke. You would use queues for data-intensive processes or when integrating with external systems.

2. Drawbacks of Using Queues

Short description:

And another benefit of queues is horizontal scalability. However, there are drawbacks to using queues. Building additional infrastructure and managing complex applications can be a lot of work. In times of limited budgets and resources, it's worth considering if managing queues is the right choice. Instead, durable execution allows us to define workflow logic in our application code and ensures reliable execution.

And another benefit is horizontal scalability, because multiple messages can be processed in parallel. As the workload increases, multiple instances of an application can handle high throughput while remaining reliable.

However, there is a but. So let's look at this Reddit comment. Queues are great for data-intensive processes, as I said, that don't need to run on the main thread, because they execute asynchronously. The tasks are processed in the background and the application stays responsive. However, there are some drawbacks to queues, which this Reddit user delicately mentions in this quote. Once you take something from the queue, the rest is on you. And the queuing service does not care anymore. So what does that even mean? Let's look at that. Queues are great when your application is simple. When it grows in complexity, or if it's distributed, you all of a sudden need to worry about a whole wealth of additional infrastructure that you need to build.

And it's going to be you who needs to build it. So, for example, you will need to build concurrency control because you want to be able to control how many steps are executed at one time. Or, for example, debouncing, because we all know how costly it is when functions execute multiple times. Or state persistence and management, because now that you have a distributed or complex application, you have to share state across different functions and queues. Then there's also error handling because what if, just hypothetically, one service provider has an outage? You will need retries for failures, and also timeouts. And in that case, you also need recovery tooling to understand and process the errors and failed events.
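As a rough idea of what just one item on that list involves, here is a minimal sketch of retries with exponential backoff. Everything in it is an assumption made for illustration: `call_with_retries` and `FlakyProvider` are hypothetical names, and the attempt count and delays are arbitrary defaults rather than recommendations.

```python
import time


def call_with_retries(fn, max_attempts=3, base_delay=0.01, sleep=time.sleep):
    """Call fn(); on failure wait and retry, re-raising after max_attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # exponential backoff: base_delay, then 2x, then 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))


class FlakyProvider:
    """Hypothetical stand-in for an external service that is briefly down."""

    def __init__(self, failures_before_success):
        self.remaining_failures = failures_before_success
        self.calls = 0

    def send(self):
        self.calls += 1
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise ConnectionError("provider outage")
        return "sent"
```

With `FlakyProvider(2)`, `call_with_retries(provider.send)` succeeds on the third call. Timeouts, concurrency limits, debouncing, and tooling for inspecting failed events would each need similar hand-written code on top of this.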

So this already sounds like a lot of work, and it's not even an exhaustive list. And you don't have to take my word for it. In times like these, when engineering budgets and headcounts are being slashed, we as individual developers and engineers need to do more with less. So it is really worth asking at this point: do you really want to be in the business of managing and operating your own queues? Well, Matthew Druker, the CEO of SoundCloud, doesn't think we should. So if this is now common knowledge, why are people still using queues? Well, we are used to them. It feels familiar and cozy, even if it's not the coziest solution. You can make everything work with just enough effort.

Fortunately, there is a better solution that builds on the concept of message queues. So instead of separating our infrastructure, such as queues, from our code, what if we could define our workflow logic purely in our application code and ensure it executes reliably? This is what durable execution gives us. Durable execution is, as the name says, durable. It guarantees that our code will run to completion, even if there are failures along the way.
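One common way to picture this idea is step memoization: each step's result is saved as soon as it completes, so a re-run after a failure skips the steps that already finished. The sketch below is a toy model under that assumption, not Inngest's API. `StepRunner`, `signup_workflow`, and the plain dict used as a "durable" store are all hypothetical.

```python
import json


class StepRunner:
    """Saves each step's result so that a re-run skips completed steps."""

    def __init__(self, store):
        self.store = store  # dict standing in for a durable store (assumption)

    def run(self, step_id, fn):
        if step_id in self.store:
            # step already completed in an earlier run: replay the saved result
            return json.loads(self.store[step_id])
        result = fn()
        self.store[step_id] = json.dumps(result)  # persist before moving on
        return result


def signup_workflow(step, calls, fail_email=False):
    """Hypothetical two-step workflow written as ordinary application code."""

    def create_user():
        calls.append("create_user")
        return {"user_id": 42}

    def send_email():
        calls.append("send_email")
        if fail_email:
            raise ConnectionError("email provider outage")
        return "sent"

    user = step.run("create-user", create_user)
    status = step.run("send-welcome-email", send_email)
    return {"user": user, "email": status}
```

If the first run fails while sending the email, running the workflow again with the same store does not create the user a second time; only the failed step executes again.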

