Getting Weird with Video Manipulation and HTML5 Canvas


In this lightning talk we will be pushing the boundaries of HTML5 Canvas browser APIs. Join us while we do some experiments with video in the browser to see what's truly possible. DISCLAIMER: No promises of cross-browser compatibility. Not for the faint of heart. Must be this tall to ride. Attending this lightning talk may result in serious injury or death. All participants must bring a life jacket.

16 min
17 Jun, 2021

Video Summary and Transcription

Today's talk at React Summit focused on the Canvas and HTML5 video APIs, showcasing the capabilities and possibilities they offer for video manipulation and interactivity. The speaker demonstrated how to use the HTML5 video element and canvas to manipulate and draw images, apply filters, and add real-time text overlays. They also showcased real-time object detection on video frames using machine learning. The talk concluded with an example of enhancing a marketing website with interactive video using the canvas element. Overall, the talk highlighted the power and potential of these APIs for video development.


1. Introduction to Canvas and HTML5 Video APIs

Short description:

Hello, everyone, at React Summit. Today, we're going to talk about the Canvas and HTML5 video APIs and some cool stuff you can do with them. I'm Dylan Jhaveri from Mux, where we provide Video for Developers. We focus on creating easy-to-use APIs for video. If you're interested, let's chat.

Hello, everyone, at React Summit. I'm very excited to be talking to you here today. We're going to be talking about the Canvas and HTML5 video APIs, and some cool stuff that we found that you can do with them.

So quick intro: I'm Dylan Jhaveri, I work at Mux. If you have not heard of Mux, Mux is Video for Developers. Maybe you know of Stripe, which is payments for developers, or Twilio, which is phone calls and text messages for developers. We like to be like those companies, built first with developers in mind, trying to make great, easy-to-use APIs, but for video.

I'm not going to be talking too much more about Mux today, but if you are interested, come talk to me. I'd love to chat with you.

2. Introduction to React App and Player Component

Short description:

I'd love to chat with you. Let's start with a simple demo of a React app using the player component and canvas. The player component is a video element that uses HLS technology for video streaming.

I'd love to chat with you. Cool, so now to jump into some code. I have this CodeSandbox set up. CodeSandbox is a great tool, by the way; it's become one of my favorite pieces of software. I think there are some CodeSandbox folks here at this conference, so shout-out to you all. I love this product. And I'll be sharing this after, so you can fork it, play with the code, and try things yourself.

And let's just start out with a really simple demo. So this is a very straightforward React app. We have a few different routes, the five different examples I'm going to show, and we're using React Router and React DOM. Let's start with the first one, a simple demo. So right here we have simple.js. This is the component that we're rendering. We have this player component and then we have this canvas. Right now you can't see the canvas on the page, but that's what we'll be manipulating and doing some fun stuff with as we go along.

So real quickly, let's just take a look at this player component. This player component is really just a video element. But if you're familiar with video... How many of y'all have done video on the internet? Video streaming, video on demand, live streaming, anything like that. You might have used the video element before, maybe with an MP4 file, and that can kind of work. But when you really want to do video streaming properly, you need to use something like HLS. HLS is a technology that allows you to download video in segments, at different bit rates and different quality levels, according to the user's bandwidth. So that's something Mux does for you. We're not going to get too deep into that, but that's what we're using here on this video player.
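The talk doesn't show the player's internals, but wiring HLS into a browser video element typically looks something like this sketch. `HlsLib` stands in for a library such as hls.js and is passed in explicitly; everything here is an assumption about the wiring, not the talk's actual code:

```javascript
// Hedged sketch: attach an HLS stream to a <video> element.
// `HlsLib` stands in for a library like hls.js — the talk doesn't
// show its player internals, so this wiring is an assumption.
function attachHls(videoEl, src, HlsLib) {
  if (videoEl.canPlayType("application/vnd.apple.mpegurl")) {
    videoEl.src = src; // Safari supports HLS natively
    return null;
  }
  if (HlsLib && HlsLib.isSupported()) {
    const hls = new HlsLib();
    hls.loadSource(src);      // fetch the .m3u8 manifest
    hls.attachMedia(videoEl); // feed segments in via Media Source Extensions
    return hls;
  }
  throw new Error("HLS is not supported in this browser");
}
```

The library handles segment selection and bitrate switching; the video element just plays whatever it is fed.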

3. Exploring HTML5 Video Element and Canvas

Short description:

So this is the HTML5 video element with extra JavaScript for HLS capabilities. When the play event fires, the onPlayCallback is called. The video is duplicated on a canvas element below. The code uses the video element and a canvas context to manipulate and draw images onto the canvas. The drawImage function copies each frame from the video element to the canvas. Let's take it one step further and look at the filter example.

So this is... It's really just the HTML5 video element. And then we're attaching some extra JavaScript to give it some HLS capabilities. And then when the play event fires, that play event is when the playback begins on the video, and we're gonna call this onPlayCallback.

So let's jump back into the component that's rendering this page. Zoom in a little bit here. Make sure you can see that. So right here, we have the player, onPlayCallback. And when that fires, see what happens. What we see is this video is playing in the video element. And then it's being duplicated on this canvas element right below.

Let's jump into some of this code. So when onPlay is called, we grab the video element and we create this context ref. What this is, is sort of a handle onto the canvas element. We can then call functions on that context that let us manipulate the canvas element and change how it's displayed; that's our hook into manipulating the actual canvas itself. So onPlay, we call requestAnimationFrame with updateCanvas. And what that's going to do is just call this one-liner, drawImage, and we pass the video element into it. This tells the canvas to draw this image onto the canvas; these are the coordinates where to start, and these are the dimensions to draw. And this is actually called recursively: every time this runs, we call requestAnimationFrame again, and the callback calls updateCanvas again. So you can see what's happening: we're basically copying that video element down onto the canvas right below it. So that's how that works. To quickly review what we did there: video element, copy each frame, draw them onto the canvas. Pretty simple, right?
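A minimal sketch of that loop, with `raf` injectable so it isn't hard-wired to the browser. Names like `startCopyLoop`, `videoEl`, and `canvasEl` are illustrative, not from the sandbox:

```javascript
// Copy each video frame onto the canvas, once per animation frame.
function startCopyLoop(videoEl, canvasEl, raf = requestAnimationFrame) {
  const ctx = canvasEl.getContext("2d");
  function updateCanvas() {
    // Draw the current frame at (0, 0), scaled to the canvas size.
    ctx.drawImage(videoEl, 0, 0, canvasEl.width, canvasEl.height);
    // Recurse: schedule the next copy as long as playback continues.
    if (!videoEl.paused && !videoEl.ended) raf(updateCanvas);
  }
  raf(updateCanvas);
}
// Kicked off from the play event, e.g.:
// <Player onPlay={() => startCopyLoop(video, canvas)} />
```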

So now let's take this one step further. Let's go to this filter example. So what the filter does... let me press play. Okay, same kind of thing, but you can see something else is going on here. What we're doing is the same kind of callback, updateCanvas.

4. Manipulating Canvas and Video Frames

Short description:

We can manipulate and work with raw image data from the canvas. By iterating through the image data and adjusting color values, we can achieve effects like grayscale. Additionally, we can add text on top of the canvas, allowing for real-time modifications. This opens up possibilities for interactive video manipulation using browser APIs. Let's explore more examples, including grabbing individual frames from a video and manipulating them. The video we're using is Big Buck Bunny, a popular example in the video streaming community.

And what we do is we draw that image onto the canvas, then we extract the image data off the canvas. Now we have raw image data that we can actually manipulate and work with. We iterate through that image data and mess with the color values: if we average out the red, green, and blue values, that gives us this grayscale effect. So we're manipulating the video image one frame at a time, and then putting it back onto the canvas, redrawing it. And you can see it has that effect, and you can see this canvas is always staying synced with the frame of video that the video element is rendering.
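The pixel loop described here can be sketched as a pure function over the flat RGBA array that `ctx.getImageData(...)` returns (the function name is mine, not from the talk's sandbox):

```javascript
// getImageData(...).data is a flat RGBA array: four bytes per pixel.
function toGrayscale(data) {
  for (let i = 0; i < data.length; i += 4) {
    // Average red, green, and blue to get the gray value...
    const avg = (data[i] + data[i + 1] + data[i + 2]) / 3;
    // ...and write it back to all three color channels.
    data[i] = data[i + 1] = data[i + 2] = avg;
    // data[i + 3] is alpha — leave it untouched.
  }
  return data;
}
// In updateCanvas: const frame = ctx.getImageData(0, 0, w, h);
// toGrayscale(frame.data); ctx.putImageData(frame, 0, 0);
```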

Okay. Pretty cool, right? So let's look at the steps we took there, where we pushed this a little bit further. Instead of just drawing each frame onto the canvas, after we do that, we're extracting the frame, manipulating the colors to grayscale, and then redrawing it back onto the canvas. Okay. So now we have a few more examples. Let's see what else we can do. It's going to get better and better each time. This is Layla, my coworker Phil's dog. And let's look at this example. So now in the updateCanvas function, we draw the image, and then we're going to call this fillText method on the canvas context. What we're doing there is actually just adding text on top of the canvas. So we're rendering the video image into the canvas, and then adding text on top. Now you can imagine this could get pretty useful, right? If we take a video, play it in the video element, and draw it onto the canvas, then we can do all these cool things like adding text in real time, frame by frame, on the client side in the browser, all with these browser APIs. So that's how we're adding a name. Let's see what else we can do.
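The per-frame overlay amounts to drawing the frame first and then painting text on top of it, along these lines (the function name, font, and position are illustrative, not the sandbox's code):

```javascript
// Each animation frame: copy the video frame, then overlay text.
function drawFrameWithText(ctx, videoEl, text, width, height) {
  ctx.drawImage(videoEl, 0, 0, width, height); // the frame first
  ctx.font = "48px sans-serif";
  ctx.fillStyle = "white";
  ctx.fillText(text, 20, height - 20);         // then the label on top
}
```

Because this runs once per frame, the text can change in real time — a name, a timestamp, captions — without touching the underlying stream.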

Okay. So now let's get into this one. This is called classify. So far we've seen that we can grab individual frames from the video in real time, draw them onto a canvas, and, before we draw them, manipulate them, right? So what else can we do? When we have a raw frame of a video, let's think about what else we can do. If you don't recognize this video, it's Big Buck Bunny. It's sort of the canonical "hello world" video example in the video streaming community. I've watched this video way too many times, and it makes a good example.

5. Real-time Object Detection and Use Cases

Short description:

In this classify demo, we run machine learning object detection on each video frame, drawing rectangles around detected objects. We use the TensorFlow COCO-SSD model to detect objects in real time. By extracting image data from the canvas, we can map predictions and draw boxes with labels on the video. Although not perfect for animated content, it can detect real-life objects accurately. This opens up possibilities for real-time object detection in live video streams. Let's explore more use cases.

So I'm gonna use this for the purposes of this classify demo. And let's just push play here. What's happening is that for every frame of the video, we're running some machine learning object detection functionality on the image frame, and after we detect an object, we're drawing a rectangle around it onto the frame. Right now it thinks this is a person; go a little further, and now it thinks it's a bird. So we're actually detecting, frame by frame, what's going on with the objects in this video.

So let's take a look at the code. We draw the image onto the context, and we extract the image data. This is the same image data where we were manipulating the colors, but we have this extra call here, which is model.detect, and we pass in that image data. So model is something that comes from this TensorFlow COCO-SSD model, which is a TensorFlow model that will do object detection on images. It's made to work with images. And when we pass in this image data that we've extracted from the canvas, it's going to run the object detection and send us back an array of what they call predictions, okay? So now once we have an array of predictions, we can pass those into this outlineStuff function that's going to map those predictions. Each one has the X, Y coordinates and the width and the height of a bounding box. And then we can actually just draw those boxes with the labels directly onto that canvas element that we're already using to render the video. So you can see it thinks it's a bird, still thinks it's a bird. And dog, we saw there was a dog there for a second; here it thinks that is a sports ball. So, you know, it's not the most accurate object detection for this animated content. Now it's a sheep, and it kinda looks like a sheep, but we're actually able to do some pretty cool stuff. And remember, this is happening in real time. A lot of times when you're doing image detection on a video, you would do that out of band, on a server, once the video is finalized. But imagine this was a live stream, right? If we're dealing with a live stream of video, we'd be able to run this on the client and actually detect objects in real time. The sky's the limit there, and we can do all kinds of things with the detection. Let's look at one more example of the classification. Let's pull up Layla, Phil's dog, again, and you can see here TensorFlow running on a real live video.
It gets the type of dog; it's actually pretty good at detecting real-life things. Animated things, animated giant bunnies, maybe not so much, but a dog it can get. So, to really quickly review what we did there: the key part to pay attention to is that once we get images into a canvas, we can extract that raw image data. And this red circle, where we're doing live object detection, you could replace that with anything, right? Manipulate the colors, add text overlays. And then we can redraw those back onto the canvas, with all the canvas APIs that are available. So that's what we did there. Now, let's take a quick look at some real world use cases of this.
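Before moving on, the detect-and-outline step above can be sketched in code. The `{ bbox, class, score }` shape matches what the COCO-SSD model's `detect` call returns; the drawing helper itself (name, color, score cutoff) is illustrative:

```javascript
// Draw a labeled box for each sufficiently confident prediction.
function outlinePredictions(ctx, predictions, minScore = 0.5) {
  const drawn = [];
  for (const p of predictions) {
    if (p.score < minScore) continue; // skip low-confidence guesses
    const [x, y, w, h] = p.bbox;      // top-left corner plus size
    ctx.strokeStyle = "#00ff00";
    ctx.strokeRect(x, y, w, h);
    ctx.fillStyle = "#00ff00";
    ctx.fillText(`${p.class} ${Math.round(p.score * 100)}%`, x, y > 10 ? y - 4 : 10);
    drawn.push(p.class);
  }
  return drawn;
}
// Per frame: const predictions = await model.detect(imageData);
// outlinePredictions(ctx, predictions);
```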

6. Enhancing Marketing Website with Interactive Video

Short description:

We recently did a design refresh on our marketing website, adding an API demo in the top hero section. Previously, we had a single video, but this time we wanted to make it more interactive. By using a strategy of copying frames from a video element and rendering them to canvas elements, we achieved the desired effects. This approach eliminated the need for double streaming, avoided playback sync issues, and allowed for a more interactive experience. If you're interested in video, let's chat!

We at Mux actually used this on our marketing website recently. We recently did a design refresh on our marketing website, and we have this API demo in the top hero section; you can see what's going on here. Previously on our marketing site, before this iteration, we had a similar sort of API demo, but it was all one video. So you can imagine if all of this here was just one video, with this device and the browser popping out. That worked pretty well, but we wanted to make it better this time.

What we were thinking is that you'll notice that as I'm hovering over this, it pops out. If I hover over the browser, the browser pops out, and I can copy text here; I can interact with it. That's what we wanted: say a developer comes here and wants to copy this text — we just wanted to make it more interactive. We also have these bleeding colors in the back, and we want those to bleed outside the bounds of this element, into the top header and into the bottom. If this was just a static video, we wouldn't be able to get that effect.

So, the way we were able to pull this off... I have a Storybook example here. The way we were actually able to do this is through the strategy that I described. So let's inspect these elements. Okay, we inspect these elements, and you can see that this right here is a canvas. Let me replay this. And then we see that this right here is another canvas. And if we look further down here in the DOM, we can actually see that there's a video element. So this is the video element that is streaming the video, and we're copying the frames of that video and rendering them to these two canvas elements in real time. As for the benefits of that strategy: alternatively, we could pull off the same design by having this browser be one video element and this device be another video element. And that would work okay, except for two downsides. Number one, we're double-streaming the same video, which is going to double the bandwidth for the user. More video data being downloaded seems unnecessary and repetitive. Number two, the two videos could get out of sync, right? If one video buffers because you're on a slow connection and the other one hasn't buffered yet, then you get playback sync issues, so we'd probably have to write some JavaScript that keeps the play heads aligned and in sync, and that seems buggy, not a great solution. So what we did is apply the strategy of taking this video element, grabbing the frames from it, and rendering them to the canvases. That way, these two canvases will always stay in sync, and we're only downloading the video once. It works well. Let's play this one more time. And that's the solution we came to. So you'll notice now I can hover over this, hover over this, and the devices pop out; it's more interactive. I can copy code. And now this video, this is a happy birthday video for React Summit.
It's a video I found online of kids crying when they blow out their birthday candles, and it's kind of funny. So happy birthday, React Summit. I'm excited to be here, excited to talk with you all. And if you have anything to talk about video, I'd love to chat. If you're adding video to your product, building video, doing cool things, please chat with me, and thanks for having me. Find me on Twitter, @dylanjha, and that is the end.
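To recap that final strategy in code: a single video element decodes the stream once, and each frame is copied to every canvas, so they can never drift out of sync. Element names and the injectable `raf` are illustrative, not Mux's actual implementation:

```javascript
// One decoded video, mirrored to any number of canvases per frame.
function mirrorToCanvases(videoEl, canvasEls, raf = requestAnimationFrame) {
  const ctxs = canvasEls.map((c) => c.getContext("2d"));
  function tick() {
    // Same frame drawn everywhere: no double download, no drift.
    ctxs.forEach((ctx, i) => {
      ctx.drawImage(videoEl, 0, 0, canvasEls[i].width, canvasEls[i].height);
    });
    if (!videoEl.paused && !videoEl.ended) raf(tick);
  }
  raf(tick);
}
```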
