Tools for better Observability in NodeJS Serverless IoT Applications

Observability is crucial for successfully operating large IoT fleets. IoT incorporates different components, including hardware, network, on-device software, and cloud. Devices operate under unreliable conditions and constraints, and need to be monitored remotely. Cloud applications become complex and costly, as they are built to handle device activity at scale. Answering questions such as the following can be challenging:
-    Do I have a problem in my IoT application?
-    Where is the problem?
-    What is causing the problem?
-    How much of my fleet is affected?
-    Is my code expensive to run and, if so, how can I fix that?
Logging, monitoring and tracing are fundamental observability pillars. However, they are often viewed as non-functional requirements in IoT applications, and can fall off the radar or go unstandardized during development.

This session will show how to leverage open source tools, such as AWS Lambda Powertools, in a fully functional serverless IoT application to ease the adoption of best practices for modern application development. It also shows how to integrate services such as AWS X-Ray and Amazon CloudWatch with AWS IoT Core features to achieve end-to-end observability.

Alina Dima
8 min
14 Apr, 2023


Video Summary and Transcription

The talk discusses the challenges of IoT development in production, such as fleets going offline, missing data, alerts that do not fire, inconsistent data, and slow-loading dashboards. It explores how to build observability into IoT applications using metrics, logging, and tracing. The integration between the AWS IoT rules engine and Lambda is explained, highlighting the use of tools like AWS Lambda Powertools and X-Ray for logging, monitoring, and tracing. The Lambda invocation process and the tracing capabilities of X-Ray are also mentioned.

1. Introduction to IoT Challenges

Short description:

Everybody starts in the IoT space thinking it's all sunshine and butterflies, but when you go to production, it becomes a maze. Prototyping and testing in the lab may go well, but then issues arise: part of the fleet goes offline, data goes missing, alerts do not fire, data is inconsistent, and dashboards load slowly.

All right, so everybody who starts in the IoT space kind of thinks that this is what the IoT journey looks like. You know, everything is sunshine and butterflies, and you work with devices and they're very cool. You prototype with devices. You learn new protocols and so on, so you think it looks like that.

But actually, it doesn't look like that. When you're going to production with an IoT solution, it looks more like that. So you're constantly putting out fires of one kind or another. So what's really going on? So when you're working in IoT, you are actually part of an ecosystem. You are working on the device side, or you're working with teams who work on the device side. You have backends on the cloud side. You're working with cloud teams. You're working with data teams. So it's all just really relatively crazy, and it can get even crazier really fast. So it is a maze.

You are prototyping or testing with your devices in the lab, and it all works, and everything's fine, everybody's happy. And then you go to production, and suddenly 50% of your fleet is offline from one day to the next. And you try to investigate why, and you don't know why. Then you've got data missing. You're sending data. You're using MQTT. You've done all the right things. Ideally, you've used quality of service 1. Ideally, you've actually used local storage at the edge as well, but still data's gone missing. So your data team is complaining. You don't know where the problem is. What about the alerts you built in? Well, you're not seeing any of them. Have you actually built them? Well, I don't know. It would be a good idea if you did. Data is inconsistent. Your users are basically complaining that loading a dashboard for, I don't know, 50 devices, just to see aggregate metrics, takes too long.

2. Building Observability in IoT

Short description:

To build observability in an IoT application, you need metrics, logging, and tracing in a standardized way. Let's explore how to achieve this in a serverless backend scenario, where an IoT device sends data over MQTT, picked up by an AWS IoT rule and pushed into a Lambda function.

So it's all crazy. So what do you do about all of this? So clearly, you actually need to build observability into your application. So you need metrics. You need logging. You need tracing. And, ideally, you need all of this in a standardized way, so an operations team who is actually looking at this data can understand what's going on. So let's see how we can build observability into an IoT application.
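To make "standardized" a bit more concrete, here is a minimal sketch of what a shared log and metric shape could look like. It is not the code from the talk and it does not use the Powertools API; the names `LogEntry`, `buildLogEntry`, and `buildCountMetric`, as well as the `IoTDemo` namespace and `service` dimension, are assumptions made for illustration. The metric object follows the CloudWatch Embedded Metric Format, which lets CloudWatch extract a metric from a JSON log line. Tracing is left out of this sketch.

```typescript
// Hypothetical shared shape for every log line the backend emits.
interface LogEntry {
  level: "INFO" | "WARN" | "ERROR";
  service: string;
  message: string;
  timestamp: string; // ISO 8601, so entries sort and parse consistently
  deviceId?: string; // present only when the entry relates to one device
}

// Build a log entry with the same keys everywhere, so an operations team
// can query by level, service, or deviceId without guessing field names.
function buildLogEntry(
  level: LogEntry["level"],
  service: string,
  message: string,
  now: Date,
  deviceId?: string
): LogEntry {
  const entry: LogEntry = {
    level,
    service,
    message,
    timestamp: now.toISOString(),
  };
  if (deviceId !== undefined) {
    entry.deviceId = deviceId;
  }
  return entry;
}

// Build a count metric in CloudWatch Embedded Metric Format (EMF).
// The namespace and the "service" dimension are illustrative choices.
function buildCountMetric(
  namespace: string,
  service: string,
  name: string,
  value: number,
  now: Date
): Record<string, unknown> {
  return {
    _aws: {
      Timestamp: now.getTime(), // milliseconds since the epoch
      CloudWatchMetrics: [
        {
          Namespace: namespace,
          Dimensions: [["service"]],
          Metrics: [{ Name: name, Unit: "Count" }],
        },
      ],
    },
    service,
    [name]: value,
  };
}

// Emitting both as single JSON lines on stdout is enough inside Lambda,
// because stdout ends up in CloudWatch Logs.
const exampleTime = new Date();
console.log(JSON.stringify(buildLogEntry("INFO", "telemetry", "message processed", exampleTime, "sensor-1")));
console.log(JSON.stringify(buildCountMetric("IoTDemo", "telemetry", "MessagesProcessed", 1, exampleTime)));
```

A library such as Powertools takes care of this kind of plumbing for you; the point of the sketch is only that every function in the backend agrees on the same keys.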

I'm going to make the assumption that the back end here is mostly serverless. So I'm imagining a situation where you've got an IoT device sending some data over MQTT and you've got an IoT rule, you know, an AWS IoT rule, picking up this data and pushing it into a Lambda function. You're using this amazing cool integration that AWS IoT has with the rules engine. And you think everything is perfect, right? So if you scan that QR code, you can actually look at the code for what I'm going to show you. You can do that. I have it linked at the end as well. So I'll give, like, two seconds for people to look at that.
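For readers without the slides, the scenario above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code behind the QR code: the topic filter `devices/+/telemetry`, the rule SQL in the comment, and the `deviceId` and `temperature` fields are all hypothetical. What the sketch relies on is that the rules engine hands the result of the rule's SQL statement to the Lambda function as its event.

```typescript
// Hypothetical event shape. A rule such as
//   SELECT temperature, topic(2) AS deviceId FROM 'devices/+/telemetry'
// would produce an object like this for each matching MQTT message.
interface TelemetryEvent {
  deviceId: string;
  temperature: number;
}

interface HandlerResult {
  accepted: boolean;
  reason?: string;
}

// Devices send whatever they send, so validate before trusting the payload.
function isTelemetryEvent(event: unknown): event is TelemetryEvent {
  if (typeof event !== "object" || event === null) {
    return false;
  }
  const candidate = event as Record<string, unknown>;
  const temperature = candidate["temperature"];
  return (
    typeof candidate["deviceId"] === "string" &&
    typeof temperature === "number" &&
    Number.isFinite(temperature)
  );
}

// Pure processing step, kept separate from the handler so it is easy to test.
function processTelemetry(event: unknown): HandlerResult {
  if (!isTelemetryEvent(event)) {
    // Log the rejected payload so missing data can be traced back later.
    console.log(JSON.stringify({ level: "ERROR", message: "malformed payload", event }));
    return { accepted: false, reason: "malformed payload" };
  }
  console.log(
    JSON.stringify({
      level: "INFO",
      message: "telemetry received",
      deviceId: event.deviceId,
      temperature: event.temperature,
    })
  );
  return { accepted: true };
}

// Lambda entry point. The rules engine invokes the function asynchronously,
// so the return value mostly matters for tests and direct invocations.
export const handler = async (event: unknown): Promise<HandlerResult> =>
  processTelemetry(event);
```

Without the log lines, a malformed message would simply disappear between the device and the data team, which is exactly the "data's gone missing" situation described earlier.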

