Serverless in Production, Lessons from the Trenches

Rate this content
Bookmark

Serverless technologies help us build better and more scalable applications in the cloud. It's a powerful paradigm that allows us to focus on creating business values and letting the cloud provider handle the undifferentiated heavy-liftings such as managing the underlying infrastructure. In this session, Yan Cui would take us through many of the lessons that he has learnt from running serverless workloads in production over the last five years, including development tips, testing and observability strategies, and much more.


You can check the slides for Yan's talk here.

Yan Cui
Yan Cui
34 min
18 Feb, 2022

Comments

Sign in or register to post your comment.

Video Summary and Transcription

This Talk provides valuable insights for those considering serverless in 2022, with a focus on troubleshooting and observability using Lumigo. It emphasizes the use of multiple AWS accounts and Org Formation for better control and scalability. Security considerations include securely loading secrets at runtime and implementing zero-trust networking. Optimizing Lambda performance is discussed, along with updates on serverless frameworks and the role of Terraform. The Talk also compares Honeycomb and Lumigo for observability in serverless applications.

1. Introduction to Serverless Lessons

Short description:

In this talk, I will share the lessons I've learned from running serverless workloads in production over the past five years. I'll provide valuable insights for those considering serverless in 2022.

Hi everyone, thank you for joining this talk where I'm going to tell you about some of the lessons I've learned running production workloads with serverless the last five years. My name is Yen Chui. I'm one of the first AWS serverless heroes and nowadays I work as a developer advocate with Lumigo, which I think is the hands down best ability tool for serverless applications. The other half of my time I work as an independent consultant where I work with companies around the world to help them succeed with serverless, so I've been running workloads in production using several technologies since 2016 and quite a few number of things that's come up along the way and I've categorized them into a number of lessons, which I think will be really useful for everyone who's thinking about using serverless in 2020, in 2022 rather.

2. Importance of Troubleshooting and Observability

Short description:

You need to think about how you're going to troubleshoot your system from the start. Observability is crucial because bad things will happen. Logs are overrated for troubleshooting under time pressure. I use Lumego, structured logs, CloudWatch, and Lumego's alerts for troubleshooting.

The first and arguably the most important lesson is that you really need to think about how you're going to troubleshoot your system right from the start because it's going to be a much harder problem for you to fix after the fact. And observability is a measure of how well the internal state of a system can be inferred from its external output, and it's absolutely crucial because bad things are going to happen to your system. It might not have happened yet, but you will eventually, because everything fails all the time, as Werner Vogel famously said.

And when things go wrong and users are impacted, you need to be able to fix the problems as quickly as possible. And that requires us to be able to both identify the issue, but also to resolve them in a timely fashion. I've spent many years just swimming around in the log messages, and I've come to the conclusion that logs are overrated. They are useful, but they're not the most effective means of troubleshooting problems when you're under time pressure.

And I think at the best of times, it sometimes feels like an exercise of finding a needle in a haystack when you're browsing through huge amounts of log messages. So nowadays I have adopted a different approach whereby I'm using a combination of Lumego, and not writing many logs, but when I do, I make sure that my logs are structured, and that they cover the blind spots that I don't see in Lumego. And then I also use the CloudWatch for system metrics and the alerts to complement the alerts that I get from Lumego. And most of my troubleshooting is done inside Lumego, where I either be notified via a Slack alert, or I go to the issues page, where I can see all the recent errors, and they've been captured and categorized by function and error type.

Check out more articles and videos

We constantly think of articles and videos that might spark Git people interest / skill us up or help building a stellar career

It's a Jungle Out There: What's Really Going on Inside Your Node_Modules Folder
Node Congress 2022Node Congress 2022
26 min
It's a Jungle Out There: What's Really Going on Inside Your Node_Modules Folder
Top Content
Do you know what’s really going on in your node_modules folder? Software supply chain attacks have exploded over the past 12 months and they’re only accelerating in 2022 and beyond. We’ll dive into examples of recent supply chain attacks and what concrete steps you can take to protect your team from this emerging threat.
You can check the slides for Feross' talk here.
Towards a Standard Library for JavaScript Runtimes
Node Congress 2022Node Congress 2022
34 min
Towards a Standard Library for JavaScript Runtimes
Top Content
You can check the slides for James' talk here.
ESM Loaders: Enhancing Module Loading in Node.js
JSNation 2023JSNation 2023
22 min
ESM Loaders: Enhancing Module Loading in Node.js
Native ESM support for Node.js was a chance for the Node.js project to release official support for enhancing the module loading experience, to enable use cases such as on the fly transpilation, module stubbing, support for loading modules from HTTP, and monitoring.
While CommonJS has support for all this, it was never officially supported and was done by hacking into the Node.js runtime code. ESM has fixed all this. We will look at the architecture of ESM loading in Node.js, and discuss the loader API that supports enhancing it. We will also look into advanced features such as loader chaining and off thread execution.
Out of the Box Node.js Diagnostics
Node Congress 2022Node Congress 2022
34 min
Out of the Box Node.js Diagnostics
In the early years of Node.js, diagnostics and debugging were considerable pain points. Modern versions of Node have improved considerably in these areas. Features like async stack traces, heap snapshots, and CPU profiling no longer require third party modules or modifications to application source code. This talk explores the various diagnostic features that have recently been built into Node.
You can check the slides for Colin's talk here. 
You Don’t Know How to SSR
DevOps.js Conf 2024DevOps.js Conf 2024
23 min
You Don’t Know How to SSR
A walk-through of the evolution of SSR in the last twelve years. We will cover how techniques changed, typical problems, tools you can use and various solutions, all from the point of view of my personal experience as a consumer and maintainer.
Node.js Compatibility in Deno
Node Congress 2022Node Congress 2022
34 min
Node.js Compatibility in Deno
Can Deno run apps and libraries authored for Node.js? What are the tradeoffs? How does it work? What’s next?

Workshops on related topic

AI on Demand: Serverless AI
DevOps.js Conf 2024DevOps.js Conf 2024
163 min
AI on Demand: Serverless AI
Top Content
Featured WorkshopFree
Nathan Disidore
Nathan Disidore
In this workshop, we discuss the merits of serverless architecture and how it can be applied to the AI space. We'll explore options around building serverless RAG applications for a more lambda-esque approach to AI. Next, we'll get hands on and build a sample CRUD app that allows you to store information and query it using an LLM with Workers AI, Vectorize, D1, and Cloudflare Workers.
Node.js Masterclass
Node Congress 2023Node Congress 2023
109 min
Node.js Masterclass
Top Content
Workshop
Matteo Collina
Matteo Collina
Have you ever struggled with designing and structuring your Node.js applications? Building applications that are well organised, testable and extendable is not always easy. It can often turn out to be a lot more complicated than you expect it to be. In this live event Matteo will show you how he builds Node.js applications from scratch. You’ll learn how he approaches application design, and the philosophies that he applies to create modular, maintainable and effective applications.

Level: intermediate
Build and Deploy a Backend With Fastify & Platformatic
JSNation 2023JSNation 2023
104 min
Build and Deploy a Backend With Fastify & Platformatic
WorkshopFree
Matteo Collina
Matteo Collina
Platformatic allows you to rapidly develop GraphQL and REST APIs with minimal effort. The best part is that it also allows you to unleash the full potential of Node.js and Fastify whenever you need to. You can fully customise a Platformatic application by writing your own additional features and plugins. In the workshop, we’ll cover both our Open Source modules and our Cloud offering:- Platformatic OSS (open-source software) — Tools and libraries for rapidly building robust applications with Node.js (https://oss.platformatic.dev/).- Platformatic Cloud (currently in beta) — Our hosting platform that includes features such as preview apps, built-in metrics and integration with your Git flow (https://platformatic.dev/). 
In this workshop you'll learn how to develop APIs with Fastify and deploy them to the Platformatic Cloud.
0 to Auth in an Hour Using NodeJS SDK
Node Congress 2023Node Congress 2023
63 min
0 to Auth in an Hour Using NodeJS SDK
WorkshopFree
Asaf Shen
Asaf Shen
Passwordless authentication may seem complex, but it is simple to add it to any app using the right tool.
We will enhance a full-stack JS application (Node.JS backend + React frontend) to authenticate users with OAuth (social login) and One Time Passwords (email), including:- User authentication - Managing user interactions, returning session / refresh JWTs- Session management and validation - Storing the session for subsequent client requests, validating / refreshing sessions
At the end of the workshop, we will also touch on another approach to code authentication using frontend Descope Flows (drag-and-drop workflows), while keeping only session validation in the backend. With this, we will also show how easy it is to enable biometrics and other passwordless authentication methods.
Table of contents- A quick intro to core authentication concepts- Coding- Why passwordless matters
Prerequisites- IDE for your choice- Node 18 or higher
Building a Hyper Fast Web Server with Deno
JSNation Live 2021JSNation Live 2021
156 min
Building a Hyper Fast Web Server with Deno
WorkshopFree
Matt Landers
Will Johnston
2 authors
Deno 1.9 introduced a new web server API that takes advantage of Hyper, a fast and correct HTTP implementation for Rust. Using this API instead of the std/http implementation increases performance and provides support for HTTP2. In this workshop, learn how to create a web server utilizing Hyper under the hood and boost the performance for your web apps.
Deploying React Native Apps in the Cloud
React Summit 2023React Summit 2023
88 min
Deploying React Native Apps in the Cloud
WorkshopFree
Cecelia Martinez
Cecelia Martinez
Deploying React Native apps manually on a local machine can be complex. The differences between Android and iOS require developers to use specific tools and processes for each platform, including hardware requirements for iOS. Manual deployments also make it difficult to manage signing credentials, environment configurations, track releases, and to collaborate as a team.
Appflow is the cloud mobile DevOps platform built by Ionic. Using a service like Appflow to build React Native apps not only provides access to powerful computing resources, it can simplify the deployment process by providing a centralized environment for managing and distributing your app to multiple platforms. This can save time and resources, enable collaboration, as well as improve the overall reliability and scalability of an app.
In this workshop, you’ll deploy a React Native application for delivery to Android and iOS test devices using Appflow. You’ll also learn the steps for publishing to Google Play and Apple App Stores. No previous experience with deploying native applications is required, and you’ll come away with a deeper understanding of the mobile deployment process and best practices for how to use a cloud mobile DevOps platform to ship quickly at scale.