WebHID API: Control Everything via USB


Operating systems have long allowed applications to control devices through the Human Interface Device (HID) protocol. With the WebHID API you can do the same right from the browser. Let’s talk about the protocol's features and limitations. We’ll try to connect some devices to a laptop and control them with JavaScript.



Hi, my name is Nikita, and today we will talk about the WebHID API as a way to control real devices right from the browser via USB. A little disclaimer: it is not only USB, you can work with Bluetooth too, but today we will talk only about USB.
Who am I? I am Nikita Dubko, a web developer with about 15 years of experience. I'm a bit of a podcaster and a D&D player. I really love kayaking with my friends, and I play drums and piano a little, which we will use in this talk. I am also a Google Developer Expert. For me, GDE is about knowledge, and it's really cool to visit GDE meetups and learn about new web APIs, for example WebHID. We are talking about it right now because of those meetups.
How did it all start? Maybe you have seen something like this on Twitter: "Frontend is not real programming. You just move some pixels, you colorize some buttons, and you don't even have access to real devices. We C++ programmers are the real developers. We work with real devices, we use drivers, and so on." But maybe we do have a way to work with real devices. So, let's explore.
Let's talk about languages like C++, and not only C++. To work with real devices, you use the HID interface. HID stands for Human Interface Device, and you use such devices a lot: keyboards, mice, gamepads, all the devices that help people communicate with laptops and PCs. It's a layer between the PC and the human. It can be a USB class or a Bluetooth class. Today we will talk about the USB class, but working with Bluetooth is not much more complicated.
And let's talk about drivers. To write a driver, you need to know a lot about your device. Drivers are an abstraction layer that helps an application work with the operating system. If an application wants to communicate with a real device, with hardware, it calls an operating system library, which can be a driver.
This driver uses raw data to communicate with the real device. It asks the device, "Do you have some data?" The hardware can respond, "Yes, I have. Take it." The operating system processes this data and returns the result to the application. So it's an abstraction layer. You can write this layer and call it a driver.
But to be honest, the operating system does polling. When you connect a real device via USB, the operating system asks it, "Do you have something?" "No, I haven't." "Do you have?" "No, I haven't." "Do you have?" "Okay, I have something." That's polling, like HTTP polling on your websites. By default it runs at 125 Hz, that is, 125 times per second. That default is about USB 2.0. Today we have USB 3.0 and USB Type-C, which are much faster. For comparison, the browser redraws the screen only 60 times per second. I understand that it's not quite right to compare these values, but it gives you an idea.
When you connect your device to the operating system, the device has to be enumerated. Why? For example, a USB hub has only one connection to your laptop, and you can connect a lot of devices to this hub. The USB protocol has features that let many devices share one USB bus. Every device gets a device ID, which is a number, and every packet can carry this device ID to help the operating system understand: okay, that's the keyboard, that's the gamepad, and so on.
So the host polls devices many times per second, asking them for data. The entities that get exchanged are input reports, feature reports, and output reports. A report is just a set of bytes. Input reports carry data from the real device to your laptop, to your operating system.
Feature reports and output reports go the other way: they carry data from the laptop to the real device. Feature reports are like output reports, but with some special features. To push data to a real device you can, for example, use driver methods named set report and set feature.
And what is a report? A report is just a plain set of numbers, an array of bytes. This is a report too, and it's not really readable, right? It's just byte, byte, byte, some offsets. What is it? You need to read the USB specification, which has a part about human interface devices. You will find that it dates from 2001. It's a really old specification.
It has a lot of descriptors. A descriptor is a way to describe what an input or output report should look like. It has usage pages, collections, logical minimum, logical maximum, and so on. It's a way to describe these bytes, this raw data, and it helps us developers pack our data the way the descriptor says.
You can debug the data from your USB device using Wireshark. It's a great application, available on macOS and on Windows, and it can help you sniff the data between your laptop and a USB device.
Of course, even if I were a C++ developer, I wouldn't want to write all these lines of code every time. So there are libraries, for example the libusb hidapi library. It's an abstraction layer that you can use in your application.
But I am a front-end developer. I don't want to use C++ libraries in my code. And HID is already in your browser. In the Chromium source code you can find a file showing that Chrome works with gamepads, joysticks, mice, keyboards, keypads, and so on. And that's fine: you work with a keyboard every day, you type something into your browser, so the browser can work with HID. But what if the browser could help us developers?
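To make the idea of a report as a plain array of bytes more concrete, here is a minimal sketch that decodes a 3-byte mouse input report. It assumes the layout of the USB HID boot protocol for mice (one byte of button bits, then signed X and Y movement), which is not a device from the talk. The function name `decodeBootMouseReport` is hypothetical.

```javascript
// Decode a 3-byte mouse input report.
// ASSUMPTION: USB HID boot protocol layout (buttons, dx, dy), used here
// only to illustrate "a report is just bytes plus a description".
function decodeBootMouseReport(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const buttons = view.getUint8(0);
  return {
    left: (buttons & 0b001) !== 0,   // bit 0
    right: (buttons & 0b010) !== 0,  // bit 1
    middle: (buttons & 0b100) !== 0, // bit 2
    dx: view.getInt8(1),             // signed horizontal movement
    dy: view.getInt8(2),             // signed vertical movement
  };
}

const report = Uint8Array.of(0x01, 0x05, 0xfe);
console.log(decodeBootMouseReport(report));
// { left: true, right: false, middle: false, dx: 5, dy: -2 }
```

Without the description of the layout, the three bytes `01 05 fe` mean nothing. That is the job a report descriptor does for real devices.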
What if it could give us an API to work with real devices too? We have human interface devices, so let's try to use them. Ta-da! WebHID.
WebHID is a browser API, and it's not a W3C standard. You can find the specification in the Web Platform Incubator Community Group. It has a lot of text about methods, about what the navigator.hid attribute is, and so on, so please read it to understand the API. It has been enabled by default since Chrome 89, so you don't even need to turn on experimental web platform features. You can use it right now, as progressive enhancement, for example.
Unfortunately, Firefox doesn't support WebHID: according to Mozilla's standards positions, they will not implement the specification because of fingerprinting, privacy, and so on. Safari doesn't support WebHID either, because of tracking prevention. But Chrome has a huge audience, so you can use WebHID as progressive enhancement.
Also, WebHID doesn't work with trusted input. Trusted input is when private data goes through the input device. For example, with a keyboard you type passwords and credit card numbers, and with a mouse you can click on a CAPTCHA. That's sensitive data, so WebHID will not work with these inputs. You can find which devices are trusted in the Chromium source code.
And it requires a user gesture. That's okay. It's the same with audio: we cannot autoplay audio without a user gesture. And I really want to know when some website wants access to my real device. It's my private device, so I want to be the one who allows the browser to work with it.
Let's have a demo and try to connect my devices to web pages. How do we work with HID? I have a PlayStation DualShock, and we will try to connect it to the page. First, we have the navigator.hid attribute, which is the HID interface.
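The progressive enhancement and user gesture points above can be sketched roughly like this. It is a minimal sketch, not code from the talk: `isWebHidSupported` and `enhanceWithWebHid` are hypothetical helper names, and the button is any element you already have on the page.

```javascript
// True when the given navigator-like object exposes the WebHID entry point.
function isWebHidSupported(nav) {
  return typeof nav === 'object' && nav !== null && 'hid' in nav;
}

// Progressive enhancement: hide the connect button where WebHID is missing,
// otherwise run onClick only from a click, which counts as a user gesture.
function enhanceWithWebHid(nav, button, onClick) {
  if (!isWebHidSupported(nav)) {
    button.hidden = true;
    return false;
  }
  button.addEventListener('click', onClick);
  return true;
}
```

On a real page you might call `enhanceWithWebHid(navigator, connectButton, handler)`, where `connectButton` and `handler` are your own. The device chooser should be opened inside `handler`, because the browser rejects the request without a user gesture.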
It has properties and methods to work with real devices. If navigator.hid is available, we can use filters. Filters are the way to pick the specific device you want to work with. You call the requestDevice() method with these filters. You can specify a vendor ID and a product ID, and you will get a list of devices that match them.
Where can I find the vendor ID and product ID? First, you can find them on the device log page in Chromium (chrome://device-log). It looks like this. You can find the vendor ID, product ID, name, serial number, and so on. It's a really interesting page: you can find a lot of devices connected to your laptop. For example, the Magic Trackpad is an HID device too, so you can work with it somehow. You need to convert the numbers from this page, because they are decimal there, while in code I use hexadecimal numbers, and specifications are full of hexadecimal numbers too. So watch out for this. And if you want to find some rare device, you can use websites for that. For example, I use devicehunt.com.
Okay, we have a list of devices. If we want to work with just one device, I will take devices[0]. Then we need to open a connection to this device, so we call await device.open(), and we have a connection between the real device and our web page.
After that, we get events. We can listen to them to receive data and react to something. I want to use the inputreport event. When my real device sends me some data, for example a button press, I can catch this event. The event has a data field, which is just plain data that I can parse and work with. It also has a report ID. The report ID is a kind of label on your packets. You can read in the specification what every report ID means. For example, for some devices 01 is a way to say "I clicked something."
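The steps above (filters, requestDevice(), devices[0], open(), the inputreport event) can be put together as a rough sketch. The helper names `toHexId` and `connectFirstDevice` are hypothetical. The vendor ID in the example filter is an assumption based on the device log value 1356 for Sony, which is 0x054c in hexadecimal.

```javascript
// Format a decimal ID from the device log as the hex literal used in code.
function toHexId(decimalId) {
  return '0x' + decimalId.toString(16).padStart(4, '0');
}

// Ask the user for a device, open the first match, and forward input reports.
// `hid` is expected to be navigator.hid. Call this from a user gesture.
async function connectFirstDevice(hid, filters, onReport) {
  const devices = await hid.requestDevice({ filters });
  if (devices.length === 0) return null; // the user dismissed the chooser
  const device = devices[0];
  if (!device.opened) await device.open();
  device.addEventListener('inputreport', (event) => {
    // event.data is a DataView, event.reportId labels the kind of report.
    onReport(event.reportId, event.data);
  });
  return device;
}

// ASSUMPTION: 1356 in the device log is Sony's vendor ID.
const filters = [{ vendorId: 0x054c }];
console.log(toHexId(1356)); // "0x054c"
```

In a page this would be called as `connectFirstDevice(navigator.hid, filters, handler)` from a click handler. Adding a `productId` to the filter narrows the chooser to one model.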
The data is a DataView, so you can use DataView methods. I found the data format for my DualShock, and there is an interesting thing in it: one byte encodes eight bits of information, so a button press on my DualShock takes just one bit. It's really cool and really efficient. I will use this table to convert the signals to Booleans, with some binary logic in the code that tells me: triangle is pressed, cross is pressed, and so on.
The DualShock has two additional features. First, it has a light bar that can change its color. And it has vibration. I want to use both. To do that, I just need to send some data. I create a Uint8Array and fill it with specific numbers. For that I really need to read the spec for the PlayStation DualShock. Then I call device.sendReport(), and it works.
So it's demo time. Let me switch pages. Okay, this is a page that will try to connect to my wireless controller. It's already connected. You can see gyroscope and accelerometer data, so I can read that from the reports too. Let's try to change the color of our light bar, not just read the buttons. Yeah, it changes color. Green, orange. It works. And I can even try vibration. Let me use my microphone. I believe you can hear the vibrations. It has two rumble motors, and I can control the heavy one and the light one. Some ASMR. Yeah, I can control my gamepad, and it's really cool.
Of course, there is a library, WebHID-DS4, which you can find on GitHub. It can help you control a DualShock via WebHID. Of course, use it. Don't write my primitive code every time.
But I want to work with other devices, and I'll try to do it. I have a Stream Deck, and it has 15 buttons. I want to use it like a drum pad, so let's try. Give me some time to disconnect the gamepad and connect the Stream Deck. Yeah. So this is my Stream Deck.
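Going back to the DualShock part above, the bit logic for buttons and the Uint8Array for the light bar and vibration might look roughly like the sketch below. The byte offsets, bit masks, flag value, and report length are assumptions taken from community documentation of the DualShock 4 over USB, not from the talk, so check them against your own device. Both function names are hypothetical.

```javascript
// ASSUMED layout: in USB input report 0x01, byte 4 of event.data holds the
// face buttons in its upper four bits (the lower four bits are the d-pad).
const FACE_BUTTONS = { square: 0x10, cross: 0x20, circle: 0x40, triangle: 0x80 };

// Turn one byte into four Booleans with bit masks.
function decodeFaceButtons(data) {
  const byte = data.getUint8(4);
  const pressed = {};
  for (const [name, mask] of Object.entries(FACE_BUTTONS)) {
    pressed[name] = (byte & mask) !== 0;
  }
  return pressed;
}

// ASSUMED layout for USB output report 0x05: a flags byte first,
// rumble strength at bytes 3-4, light bar RGB at bytes 5-7.
function buildLightAndRumbleReport({ r, g, b, light = 0, heavy = 0 }) {
  const report = new Uint8Array(31);
  report[0] = 0xff; // assumed flags: apply both rumble and light bar values
  report[3] = light;
  report[4] = heavy;
  report[5] = r;
  report[6] = g;
  report[7] = b;
  return report;
}
```

With an opened device, a green light bar would then be something like `await device.sendReport(0x05, buildLightAndRumbleReport({ r: 0, g: 255, b: 0 }))`, and `decodeFaceButtons(event.data)` would go inside the inputreport listener.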
I will try to configure it to use it like a drum pad. I'm a drummer, so I'm happy. I'm really happy that Pete LePage used his Stream Deck to control Google Meet. He has a Meet Stream Deck helper on GitHub, and I can use his code. And I'm really happy that Bramus Van Damme used this Stream Deck helper to create his own drum pad. So I just use their code and Bramus's article. I also found an interesting gist on GitHub about the Stream Deck protocol. It's really useful, because the Stream Deck protocol is not public, and you have to debug it to discover its features. So, guys, thank you. Your articles and demos really helped me.
And it's demo time again. Let's open one more page. You can see that the device shows the same buttons as the page, so you can use WebHID to display something on the buttons of an Elgato Stream Deck. Let's use it to play something. Okay, I hope my sound will be recorded. So let's get started. Yeah, cool. I can play guitar without a guitar. Just five notes and you can play Seven Nation Army. Really cool. I used a browser to play guitar.
Be a good developer. Don't forget to close the connection before the page unloads, so that other pages can connect to these devices and work with them.
And what's next? I recommend the Awesome WebHID repository on GitHub. It has a lot of articles, specifications, and examples of how to use WebHID. I'm inspired by it a lot. If you want to start working with HID devices, I really recommend an article on web.dev about how to connect your device to your laptop. It's basic knowledge, but it's really useful. You can find my slides on mefody.dev, and I'm dark_mefody on Twitter. So let's keep in touch. Be safe, and use front-end power for really interesting features on your websites. Thank you.
23 min
20 Jun, 2022
