When developers think about broadcasting live from the browser, the immediate assumption is to use WebRTC. While WebRTC is amazing technology, server-side implementations are...lacking right now. We'll talk about a (totally hacky) way to get video from the browser via technology you're using today.
Going Live from your Browser without WebRTC
AI Generated Video Summary
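Mux provides an API for live streaming, and its customers want to keep their own users inside their applications when going live. Live broadcast and live chat are different: live chat uses WebRTC, while live broadcast uses RTMP for ingest and HLS for delivery. Getting from the browser to RTMP always involves a server, either a server-side WebRTC implementation such as headless Chrome, or getUserMedia and MediaRecorder output sent over WebSockets to FFmpeg. In the Q&A, Mux is described as targeting developers building platforms, an accessible div is suggested where a button cannot semantically wrap the content, and Ionic is described as mixing a web view with custom native views.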
1. Introduction to Live Streaming and User Experience
I'm Matthew McClure, co-founder of Mux. We provide an API for live streaming. Users often ask how to let their users go live from a browser. Current recommendations involve using native software, but people want to keep users in their own applications.
So, let's get started. My name is Matthew McClure, I'm a co-founder of a company called Mux. We provide an API to online video infrastructure, and one of the things that we provide is this live streaming solution. So, you can create live streams, we give you back a stream key, and then you can just push RTMP feeds to it. So, it's great for live broadcasts, but a really common question we get is how can I let my users go live directly from a browser? So, right now the current recommendations are usually to use native software like OBS Studio or Wirecast or something along those lines. But people want to be able to just keep people in their applications themselves and not ship them off to another solution and make them download and learn a new technology. So, it's understandable why they want to, but unfortunately it's not quite that easy.
2. Difference between Live Broadcast and Live Chat
Live broadcast and live chat are often confused, but they are different. Live chat is for direct communication between two users, while live broadcast is for one person broadcasting to many. Live chat uses WebRTC, while live broadcast uses RTMP and HLS. WebRTC cannot directly communicate with RTMP, so a server is needed for the conversion.
So, let's talk about a hack I've been working on that's probably a really terrible idea. So, first of all let's talk about what we mean by live broadcast. Some quick background here. Live broadcast is not live chat. So, it's a really common misconception, but these two things are actually quite different.
So, in live chat you have just two people, two users talking directly back and forth. They can just share video potentially even peer-to-peer. So, it doesn't need to get routed through a centralized server. It can just go directly from one to the other. This latency needs to be like 300 milliseconds or less. You start getting up to like 500 milliseconds, it gets really hard to actually have that one-to-one conversation. You can maybe even have a few peers here. So, that can get 3, 5, 10, really depends on how much bandwidth each user can have, because you're kind of constrained by whoever has the least bandwidth to be able to share video back and forth between every person in the chat.
Live broadcast, on the other hand, is from one camera feed to hundreds, thousands, tens of thousands, hundreds of thousands of people at once. So now, you're not talking about one-to-one communication anymore, it's one person kind of broadcasting out to a bunch. So you need to be able to scale it, you need to be able to have affordable costs, but then those viewers don't necessarily need to talk back in real time. So, latency of, you know, 10 seconds, 15 seconds, is pretty fine. By the time a person's responded in chat, it should be pretty responsive. The same technology doesn't really work well for both, for a few reasons.
So, live chat is powered by browser technologies like WebRTC, so that's a suite of APIs that can be used for browsers to communicate with each other, peer-to-peer, get the browser's media, all those kind of things. Live broadcasting, on the other hand, is powered by technologies like RTMP. So RTMP is a communication protocol for delivering video to a server. It used to be used a lot more on the delivery side, but now it's kind of the standard for getting a broadcast feed into a server. Then that server will transcode that to something for delivery to the end users that's cheaper and more scalable, like HLS. And HLS is a format that basically takes video, chunks it up into small chunks, lists those chunks in a manifest, and then players can download the manifest and then keep polling it for updates. But it can just be hosted on normal CDNs, it's just delivered like normal files, it's just HTTP, so it's really, really easy and understandable to scale, and it's cheap, relatively speaking. Ok, so you're probably thinking, if I need to get to RTMP first, let's just go WebRTC to RTMP in the browser. The browsers are mostly like, nah. You unfortunately can't get low level enough in the networking stack to be able to communicate over RTMP. Ok, so that's the technology we can't access. What about the things that we do have in our toolkit? So spoiler alert, a server is going to be involved either way.
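To make the "chunks listed in a manifest" idea concrete, here is a minimal TypeScript sketch of the text a live HLS media playlist might contain. The `Segment` type, the `buildLivePlaylist` function, and the segment file names are hypothetical and only for illustration; in practice a packager or a hosted service generates this file for you.

```typescript
// Hypothetical shape for one chunk of video that is already sitting on a CDN.
interface Segment {
  uri: string;
  durationSeconds: number;
}

// Build the text of a live HLS media playlist that lists the given segments.
// mediaSequence is the sequence number of the first segment in the list.
function buildLivePlaylist(segments: Segment[], mediaSequence: number): string {
  const longest = segments.reduce((max, s) => Math.max(max, s.durationSeconds), 0);
  const lines: string[] = [
    "#EXTM3U",
    "#EXT-X-VERSION:3",
    `#EXT-X-TARGETDURATION:${Math.ceil(longest)}`,
    `#EXT-X-MEDIA-SEQUENCE:${mediaSequence}`,
  ];
  for (const segment of segments) {
    lines.push(`#EXTINF:${segment.durationSeconds.toFixed(3)},`);
    lines.push(segment.uri);
  }
  // There is deliberately no #EXT-X-ENDLIST tag: without it, players treat the
  // stream as still live and keep re-fetching the playlist for new segments.
  return lines.join("\n") + "\n";
}

const playlist = buildLivePlaylist(
  [
    { uri: "segment120.ts", durationSeconds: 6 },
    { uri: "segment121.ts", durationSeconds: 5.5 },
  ],
  120,
);
console.log(playlist);
```

Because both the playlist and the segments are plain files served over HTTP, an ordinary CDN can cache and deliver them, which is where the scaling and cost advantages come from.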
3. WebRTC to RTMP Streaming
You can go from WebRTC in the browser to a server-side WebRTC implementation such as headless Chrome, which lets you incorporate browser technologies and overlays into the live stream. However, running Chrome at scale for each user can be complex. Alternatively, you can use the getUserMedia API to capture the browser's microphone and camera, record the stream with MediaRecorder, and send the chunks over WebSockets. On the server side, you can use FFmpeg to process the WebSocket data and deliver it via RTMP. This method works well and there is a demo available on Glitch.
You're just not going to get from a browser to RTMP without in some way involving a server. So WebRTC to server-side implementations of WebRTC, it can be done, it's a little bit complicated, the WebRTC spec is large and daunting. This has gotten a lot better recently, there are projects like Pion that really make this much easier, but it's still pretty daunting.
So you're thinking WebRTC to a server side WebRTC implementation, but now that implementation is just headless Chrome, and this can be done. It's actually done really well, once you basically can have a chat, the headless Chrome instance can just join that chat and record it. And the nice thing here is you're just using browser technologies, you can do overlays, you can do whatever you would do on the browser there, and then it's just in the stream. It's really, really cool. The problem is now you're having to run Chrome at scale for every single person that wants to do a live stream, which can be complicated.
Okay, so what if we just used a piece of that WebRTC spec, getUserMedia, which is the process of getting the browser's microphone and camera, and then we'll just send that over WebSockets. WebSockets are understandable. The server-side implementations are common, and things that we've all worked with, or a lot of us have worked with in the past, so let's try that.
You might be thinking, how would that work? So first, we would request the browser's media, so getUserMedia is what I was talking about earlier. You can set different constraints. We'll just set audio and video to true, but you could adjust that if you wanted to. We'll add that stream to a video element so we can see it, then we'll capture that stream and pass it to a new instance of the MediaRecorder API, which just allows you to basically record content from a browser. And then, that recorder will expose this dataavailable event, so every time that event fires, we have a chunk of video, so we'll just fire that chunk of video down a WebSocket connection. Now, that's skipping over the whole process of creating that WebSocket connection, but assuming we have a WebSocket connection, now we can just send that video down that WebSocket connection, which is great.
And then, the server side of things is also pretty simple and straightforward. We have this WebSocket server, and every time we get a new connection, we'll spin up an FFmpeg process. Here, I'm using a Mux RTMP endpoint, but that could be anything. We'll do some cleanup if the FFmpeg process dies, or if the WebSocket closes, but otherwise, every time we get a new message and it's a buffer, we'll just write it to FFmpeg and then FFmpeg will deliver it via RTMP.
So this actually works pretty well. If you want to see a demo of this, you can check out the Glitch. It's got everything running. You can see it working in the browser.
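The browser half of that flow can be sketched roughly as follows. This is a minimal TypeScript sketch under stated assumptions, not the code from the demo: `makeChunkForwarder`, `ChunkEvent`, `SocketLike`, the `previewVideo` element, and the WebSocket URL are all hypothetical names. The forwarding logic is written against small structural types so it can be read and exercised without a DOM, and the actual browser wiring is shown in comments because it needs a page with camera permission.

```typescript
// Structural stand-ins for the browser types. A real BlobEvent (from MediaRecorder)
// and a real WebSocket both satisfy these shapes.
interface ChunkEvent {
  data: { size: number };
}
interface SocketLike {
  readyState: number;
  send(data: unknown): void;
}

const SOCKET_OPEN = 1; // same numeric value as WebSocket.OPEN

// Returns a "dataavailable" handler that forwards each recorded chunk down the socket.
function makeChunkForwarder(socket: SocketLike): (event: ChunkEvent) => void {
  return (event) => {
    // Skip empty chunks, and anything recorded while the socket is not open.
    if (event.data.size > 0 && socket.readyState === SOCKET_OPEN) {
      socket.send(event.data);
    }
  };
}

// Browser wiring (comments only; requires a page, a camera, and user permission):
//
//   const stream = await navigator.mediaDevices.getUserMedia({ audio: true, video: true });
//   previewVideo.srcObject = stream;                               // hypothetical <video> element
//   const socket = new WebSocket("wss://example.invalid/ingest");  // hypothetical URL
//   const recorder = new MediaRecorder(stream);
//   recorder.addEventListener("dataavailable", makeChunkForwarder(socket));
//   socket.addEventListener("open", () => recorder.start(1000));   // emit a chunk about every second
```

Starting the recorder only once the socket is open is one simple way to avoid dropping the first chunk, which matters because the first chunk carries the container header that the decoder on the other end needs.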
It's actually pretty cool. It works pretty well. Here, you just put in a stream key. If you wanted to remix the Glitch and use a different RTMP endpoint, totally cool. And I also wrote a blog post on this whole thing.
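The server half described in the talk (one FFmpeg process per WebSocket connection, fed through stdin and publishing over RTMP) might look something like the TypeScript sketch below. Everything here is an assumption rather than the demo's actual code: the ingest URL is a placeholder, `rtmpUrlFor`, `buildFfmpegArgs`, and `relayToRtmp` are hypothetical names, the `ClientSocket` interface is modeled on the sockets of the `ws` package, and the encoder flags assume an FFmpeg build that includes libx264.

```typescript
import { spawn } from "node:child_process";

// Hypothetical ingest base URL; substitute your provider's RTMP endpoint.
const RTMP_BASE_URL = "rtmp://example.invalid/app";

function rtmpUrlFor(streamKey: string): string {
  return `${RTMP_BASE_URL}/${streamKey}`;
}

// FFmpeg arguments: read recorded chunks from stdin, transcode, publish as FLV over RTMP.
function buildFfmpegArgs(rtmpUrl: string): string[] {
  return [
    "-i", "-",              // input arrives on stdin
    "-c:v", "libx264",      // assumes FFmpeg was built with libx264
    "-preset", "veryfast",
    "-tune", "zerolatency",
    "-c:a", "aac",
    "-f", "flv",            // RTMP carries FLV
    rtmpUrl,
  ];
}

// Modeled on the socket objects of the `ws` package (an assumption; adapt to your library).
interface ClientSocket {
  on(event: string, listener: (...args: any[]) => void): unknown;
  close(): void;
}

// Call once for every new WebSocket connection.
function relayToRtmp(socket: ClientSocket, streamKey: string): void {
  const ffmpeg = spawn("ffmpeg", buildFfmpegArgs(rtmpUrlFor(streamKey)), {
    stdio: ["pipe", "ignore", "inherit"], // write to stdin, show FFmpeg's log on our stderr
  });

  // If FFmpeg exits or cannot be started, drop the client.
  ffmpeg.on("close", () => socket.close());
  ffmpeg.on("error", () => socket.close());
  ffmpeg.stdin?.on("error", () => socket.close()); // e.g. a write after FFmpeg has exited

  // Every binary message is a chunk of recorded video: hand it straight to FFmpeg.
  socket.on("message", (data: unknown) => {
    if (Buffer.isBuffer(data)) ffmpeg.stdin?.write(data);
  });

  // If the browser goes away, stop FFmpeg so the RTMP stream ends too.
  socket.on("close", () => ffmpeg.kill("SIGINT"));
}
```

With the `ws` package, this would typically be hooked up inside the server's `connection` handler, reading the stream key from wherever your application passes it (for example, the request URL). One process per broadcaster is simple but not free, so a real deployment would want limits on concurrent connections.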
Q&A: Mux Target Market and Using a Div as a Button
Mux is a developer-facing product, purely just APIs for developers to build into their platform. If you're a streamer just looking to go live without writing any code whatsoever, Twitch and YouTube are great platforms to use. If you're trying to build a platform, Mux is probably a better fit. On using a div as a button: putting HTML inside of a button is not actually semantic HTML, so a team may want to wrap that content in a div and make that an accessible button instead of putting a button around it.
So if you want to check it out and get more details, I go a little bit more into it. Thanks everybody! Wow! That's a lot of knowledge in just 20 or 28 minutes it was. Four great topics. I would like to invite all the Lightning Talk speakers with me on the stage to do the last round of Q&A of the day.
Hey everyone! Hello! Hey there. Hello! Good evening, night, whatever it is for you. I'm going to go straight into the questions. I'm going to start with the first question for Matt McClure. What market are you targeting and why would someone use Mux instead of Twitch or YouTube? Yeah, that's a valid question. We're a developer-facing product so we're purely just APIs for developers to build into their platform as opposed to Twitch and YouTube which are much more consumer-facing products. So if you're a streamer just looking to go live without writing any code whatsoever, those are great platforms, you should probably use them. If you're trying to build a platform, we're probably a better fit there. Okay, so it's more about the target audience, I guess, and that you have more control over what you're doing? Yeah, I would think about it a little bit like, a bad analogy that I mentioned in Slack is they're more like the PayPal or Venmo. We're more of the Stripe, if you're thinking about it in terms of payment APIs. Okay, thank you.
Next question is for Jen. What are the reasons why someone like the React Native web team would want to use a div as a button? The reason is that putting HTML inside of a button is not actually semantic HTML. So they may want to wrap that content, for instance, a card or a block of like an image and text into a div and make that an accessible button instead of putting a button around it. Yeah, so if you have a completely clickable card with different elements inside it, you can't do that semantically within a button. Correct. So in that case, you'll want to make an accessible div. Well, you should want to, at least. Perhaps. And can I say, if you don't, Jen is going to come and get you? I will very kindly tap you on the shoulder and make suggestions. How about that? Yes. Yeah, but tap doesn't work. Then I might hack it. Yeah. Yeah. Not really a question, but just for you, a nice tap on the shoulder from Martin van Houten.
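For readers who want to see what "an accessible div" involves, here is a minimal TypeScript sketch, not taken from the talk. The `accessibleButtonProps` helper and its types are hypothetical; the idea is only that a non-button element needs a button role, a place in the tab order, and keyboard activation in addition to a click handler. As a simplification, Space is handled on keydown here, whereas a native button activates on keyup.

```typescript
// Minimal stand-in for a keyboard event, so the handler can be shown without a DOM.
interface KeyLikeEvent {
  key: string;
  preventDefault(): void;
}

interface ButtonLikeProps {
  role: "button";
  tabIndex: number;
  onClick: () => void;
  onKeyDown: (event: KeyLikeEvent) => void;
}

// Props that let a non-button element behave like a button for keyboard and
// screen-reader users.
function accessibleButtonProps(onPress: () => void): ButtonLikeProps {
  return {
    role: "button", // announced as a button by assistive technology
    tabIndex: 0,    // reachable with the Tab key
    onClick: onPress,
    onKeyDown: (event) => {
      if (event.key === "Enter" || event.key === " ") {
        event.preventDefault(); // stop Space from scrolling the page
        onPress();
      }
    },
  };
}
```

In a React component these props could be spread onto the wrapping element of a clickable card, so that the card's image and text stay as ordinary block content rather than being nested inside a button element.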
Appreciation for Mesh and Ionic Native Support
Not really a question, just wanted to express appreciation for Mesh. At Albert Heijn, we've been using it and it's been a pleasure. Does Ionic support native apps like React Native, or is it more like a Cordova standard app with a web UI? It's a mix of both, allowing integration with custom native views. Thanks everyone for the talk, and goodbye for now.
Not really a question. Just want to say, mesh looks awesome. Well, that's always nice to hear. Thank you very much. I hope that you feel the same and not hate me. Well, actually, at the company I work for, Albert Heijn, we are using it. And I have to say, it's been a pleasure. So, thanks a lot.
Oh, you are my neighbor. I can come visit you. Would be gezellig. Mike, does Ionic support native apps, similar to React Native? Or it's like a Cordova standard app where it's a web UI instead of a native app? So, it's kind of a mix of both where the majority of the UI is displayed in a web view. You can integrate with custom native views or activities on Android and kind of mix which one gets displayed, the web view or the native view, or even just overlay the native view on top of the web view. So, you get kind of the best of both worlds. That feels powerful.
Okay. Thanks guys and my lady for this great talk. For the people watching, they're also going to be in the Zoom rooms for questions. But the formal part is now over. I'm going to say goodbye to you for a little bit. So, thanks for joining. Thank you. Thank you.