WebHID API: Control Everything via USB
AI Generated Video Summary
Today's Talk introduces the WebHID API, which allows developers to control real devices from the browser via USB. The HID interface, including keyboards, mice, and gamepads, is explored. The Talk covers device enumeration, input reports, feature reports, and output reports. The use of HID in the browser, especially in Chrome, is highlighted. Demos showcase working with different devices, including a DualShock controller and a Stream Deck used as a drum pad. The Talk concludes with recommendations and resources for further exploration.
1. Introduction to WebHID API and HID Interface
Today we will talk about the WebHID API as a way to control real devices from the browser via USB. We will explore the HID interface, which includes keyboards, mice, and gamepads. Let's also discuss drivers, which act as an abstraction layer between applications and real hardware.
Hi, my name is Nikita, and today we will talk about the WebHID API as a way to control real devices right from the browser via USB. A little disclaimer: it's not only USB, you can work with Bluetooth too, but today we will talk only about USB.
Who am I? I'm Nikita Dubko, a web developer with about 15 years of experience. I'm a bit of a podcaster and a D&D player, I really love kayaking with my friends, and I play drums and piano a little, which we will use in this talk. I am also a Google Developer Expert, and for me GDE is about knowledge. It's really cool to visit GDE meetups and learn about new web APIs. For example, I learned about WebHID at a GDE meetup, and that's why we are talking about it right now.
And how did it all start? Maybe you saw on Twitter something like: front end is not real programming. You just move some pixels, you colorize some buttons, and you don't even have access to real devices. We C++ programmers, we are real developers, we work with real devices, we use drivers and so on. But maybe we really do have a possibility to work with real devices. So let's explore. Let's talk about some of those C++ kinds of things, and not only C++.
To work with real devices, you should use the HID interface. HID stands for Human Interface Devices, and you use it a lot. For example, keyboards, mice, gamepads: all the devices that help people communicate with laptops and PCs. It's like a layer between the PC and the human. It can be a USB class or a Bluetooth class. Today we will talk about the USB class, but working with Bluetooth is not so complicated either.
And let's talk about drivers. To write a driver, you should know a lot about your device. Drivers are an abstraction layer that helps our application work with hardware through the operating system. If we have an application and it wants to communicate with a real device, with hardware, it calls an operating system library, which can be a driver. This driver uses raw data to communicate with the real device. It asks the device: do you have some data? The hardware can respond: yes, I have, take it. The operating system processes this data and returns the result to the application. So it's an abstraction layer; you can write this layer and name it a driver. But to be honest, we have polling in our operating system. When you connect a real device via USB, the operating system asks it: do you have something? The device: no, I haven't. Do you have? No, I haven't.
2. Polling, Device Enumeration, and Reports
Today we will discuss polling in the context of the WebHID API and USB devices. We'll explore device enumeration and the use of device IDs. Additionally, we'll cover input reports, feature reports, and output reports, which are sets of bytes exchanged between devices and the operating system. Drivers send data to a device with methods like set report and set feature. For more details, refer to the human interface devices section of the USB specification.
Do you have? Okay, I have something. That's polling. It's like HTTP polling on your websites. By default, the rate is 125 Hz, or 125 times per second. That's the default value for USB 2.0, but today we have USB 3.0 and USB Type-C, and it's much faster.
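To make the polling rate more tangible, here is a small sketch that converts a rate into the time between two polls. The function name `intervalMs` is made up for this illustration, and the 60 Hz value is only a typical screen refresh rate used for comparison.

```typescript
// Convert a rate in hertz (events per second) into the time between two events, in milliseconds.
function intervalMs(rateHz: number): number {
  if (rateHz <= 0) {
    throw new RangeError("rate must be positive");
  }
  return 1000 / rateHz;
}

// Default USB 2.0 polling rate mentioned in the talk.
const usbPollIntervalMs = intervalMs(125);
// A typical 60 Hz screen refresh, for comparison only.
const frameIntervalMs = intervalMs(60);

console.log("USB poll every " + usbPollIntervalMs + " ms");
console.log("Frame every " + frameIntervalMs.toFixed(1) + " ms");
```

So at the default rate the device is asked roughly twice per rendered frame.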
For example, we redraw our screen only 60 times per second in a browser. I understand that it's not right to compare such values, but it's just an example. And when you connect your device to the operating system, this device should be enumerated. Why? For example, if you use a USB hub, you have only one output from this hub, one input to your laptop, and you can connect a lot of devices to this hub. So the USB protocol has some features that help you use one USB bus to work with a lot of devices. Every device has a device ID, and it's a number. So every packet can carry this device ID to help the operating system understand: okay, that's the keyboard, okay, that's the gamepad, and so on. And yeah, it has polling, it asks devices a lot of times per second: do you have something? Please give me some data. If we talk about the entities, these entities are input reports, feature reports, and output reports. Each one is just a set of data, a set of bytes. Input reports carry data from a real device to your laptop, to your operating system. Feature reports and output reports are output from your laptop, so it's data from the laptop to your real device.
And feature reports are like output reports, but with some special features. To push some data to your real device, you can, for example, use methods in the driver named set report and set feature. And what is a report? A report is just a plain set of numbers. It's an array of bytes. So this is a report, and it's not really readable, right? It's just byte, byte, byte, some offsets. What is it? You should read the USB specification. It has a part about human interface devices. You will find that it dates from 2001. It's a really old specification. It has a lot of descriptors, and a descriptor is a way to describe what an input or output report should look like.
3. Introduction to HID and USB Device Data
HID descriptors are a way for developers to describe and pack data from USB devices. Wireshark can help debug USB device data. C++ developers can use the libusb hidapi library, but front-end developers can leverage HID in the browser. Chrome already works with HID gamepads, joysticks, mice, keyboards, and keypads.
It has usage pages, collections, logical minimum, logical maximum, and so on. It's a way to describe these bytes, this raw data. It helps us developers collect our data and pack it in the way the descriptor describes.
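As a rough illustration of what a descriptor gives you, the sketch below reads one field from a raw report and scales it using its logical minimum and maximum. The `FieldDescription` type, the `readNormalized` function, and the sample values are all hypothetical simplifications; a real report descriptor is a binary structure with many more item types.

```typescript
// Hypothetical, simplified stand-in for what a report descriptor says about one field.
interface FieldDescription {
  label: string;
  byteOffset: number;     // where the field sits inside the report
  logicalMinimum: number; // smallest raw value the device sends
  logicalMaximum: number; // largest raw value the device sends
}

// Read a one-byte field and scale it to the range 0..1.
function readNormalized(report: DataView, field: FieldDescription): number {
  const span = field.logicalMaximum - field.logicalMinimum;
  if (span <= 0) {
    throw new RangeError("logicalMaximum must be greater than logicalMinimum");
  }
  const raw = report.getUint8(field.byteOffset);
  return (raw - field.logicalMinimum) / span;
}

// Made-up example: a stick axis stored in the second byte of a three-byte report.
const stickX: FieldDescription = { label: "stick X", byteOffset: 1, logicalMinimum: 0, logicalMaximum: 255 };
const sampleReport = new DataView(new Uint8Array([0x01, 0xff, 0x00]).buffer);
console.log(stickX.label, readNormalized(sampleReport, stickX));
```

Without the description, `0xff` is just a byte; with it, the same byte reads as "stick fully to one side".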
And yeah, you can debug data from your USB device using Wireshark. It's a great application, it's available on macOS and on Windows, and it can help you sniff the data going between your laptop and a USB device.
And of course, even if I am a C++ developer, I don't want to write all these lines of code every time. I want a library, and you can use, for example, the hidapi library from libusb. It's an abstraction layer that you can use in your application. But I am a front-end developer. I don't want to use C++ libraries in my code. And HID is already in your browser. You can open the Chromium source code, and in this file you can find that Chrome works with gamepads, joysticks, mice, keyboards, keypads, and so on. And it's okay. You work with a keyboard every day. You type something into your browser, and the browser can work with HID.
4. Introduction to Web HID API
But what if a browser can help us, developers? What if it can give us some API that will help us to work with real devices, too? WebHID is a browser API, not a W3C standard. It's enabled by default in Chrome 89. Unfortunately, Firefox and Safari do not support WebHID due to their respective standards positions. WebHID does not work with trusted input, which includes sensitive data like passwords and credit card numbers. It requires a user gesture and gives the user control over device access. Let's try a demo.
But what if a browser can help us, developers? What if it can give us some API that will help us to work with real devices, too? And yeah, we have some human interface devices. Let's try to use it. Ta-da! Web HID.
WebHID is a browser API, and it's not a W3C standard. You can find the specification in the Web Platform Incubator Community Group. It has a lot of text about methods, what the navigator.hid attribute is, and so on. So please just read it to understand this specification. And it's enabled by default in Chrome 89, so you don't even need to enable experimental web platform features. It's just enabled. So you can use it right now as a progressive enhancement, for example.
Unfortunately, Firefox doesn't support WebHID because of Mozilla's standards position: they will not implement the specification because of fingerprinting, privacy and so on. Safari doesn't support WebHID either, because of tracking prevention. But Chrome has a huge audience, so you can use WebHID as a progressive enhancement.
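Progressive enhancement here usually starts with a feature check. A minimal sketch, assuming the check is done on a navigator-like object passed in as an argument (the helper name `supportsWebHid` is made up for this example):

```typescript
// Returns true when the given navigator-like object exposes the WebHID entry point.
function supportsWebHid(nav: object | undefined): boolean {
  return nav !== undefined && "hid" in nav;
}

// In a page this would be called as supportsWebHid(navigator), for example to decide
// whether to show a "Connect device" button at all. Plain objects stand in for it here.
console.log(supportsWebHid({ hid: {} })); // browser with WebHID
console.log(supportsWebHid({}));          // browser without WebHID
```

Passing the object in, instead of reading the global directly, keeps the check easy to try outside a browser.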
And yeah, WebHID doesn't work with trusted input. Trusted input is when you pass some private data through an input device. For example, when you use a keyboard, you type passwords and credit card numbers. When you use a mouse, you can click on a CAPTCHA, I don't know. So it's sensitive data, and WebHID will not work with such input. You can find which devices are trusted in the Chromium source code, for example. And yeah, it requires a user gesture. It's okay. If we talk about audio, we cannot autoplay audio without a user gesture either. And I really want to know when some website wants to have access to my real device. It's my device. It's my private device. So I want to be the one who allows the browser to work with it. And let's try a demo.
5. Connecting Devices and Working with HID
Let's try to connect my devices to webpages. We can use the HID interface to work with real devices. By using filters with specific vendor IDs and product IDs, we can request devices that match our criteria. To find these IDs, we can refer to the device log page in Chromium or use websites like devicehunt.com. Once we have a device, we can open a connection to it and listen for events, such as input reports. These reports contain data that we can work with, including a report ID that identifies the kind of packet. For example, a report ID of 01 may indicate a button click.
Let's try to connect my devices to webpages. How do we work with HID? I have a PlayStation DualShock, and we will try to connect it to the page. First, we have the navigator.hid attribute, and it is the HID interface. It has some properties and methods to work with real devices. And if navigator.hid exists, we can use some filters.
Filters are the way to pick a specific device to work with, and you pass them to the requestDevice method. You can set a vendor ID and a product ID, and you will get a list of devices that have these vendor IDs and product IDs. And where can I find the vendor ID and product ID? First, you can find them on the device log page in Chromium. It looks like this, and you can find the vendor ID, product ID, name, serial number and so on. It's a really interesting page. You can find a lot of devices connected to your laptop. For example, the Magic Trackpad is an HID device too. You can work with it somehow. And yeah, you should convert the numbers from this tab, because here they are decimal, while in code I use hexadecimal numbers, and in specifications we find a lot of hexadecimal numbers. So please pay attention to this. And if you want to find some rare device, you can use some websites. For example, I use devicehunt.com.
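The decimal-to-hexadecimal step is easy to get wrong, so here is a small sketch of it together with a filter list. The `DeviceFilter` type is written out locally so the example is self-contained, `toHexId` is a made-up helper, and the two ID values are placeholders for illustration; look up the real ones for your own device.

```typescript
// Shape of one filter entry, declared here so the sketch needs no extra type packages.
interface DeviceFilter {
  vendorId?: number;
  productId?: number;
}

// Format a decimal ID from the device log as the four-digit hexadecimal form used in specs.
function toHexId(id: number): string {
  return "0x" + ("0000" + id.toString(16)).slice(-4);
}

// Placeholder decimal values, as they might appear in the device log.
const vendorIdFromLog = 1356;
const productIdFromLog = 2508;

const filters: DeviceFilter[] = [{ vendorId: vendorIdFromLog, productId: productIdFromLog }];
console.log(toHexId(vendorIdFromLog), toHexId(productIdFromLog));

// In a page, inside a click handler (a user gesture is required):
//   const devices = await navigator.hid.requestDevice({ filters });
```

The numbers themselves are the same whether written as `1356` or `0x054c`; the conversion only matters when comparing against specifications or other people's code.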
Okay, we have a list of devices. If we want to work with just one device, I will use devices[0], and we should open a connection to this device. So we will use await device.open(), and we will have a connection between a real device and our web page. After that, we will have some events. We can listen to these events to get some data and react to something. I want to use the inputreport event, which fires when my real device sends me some data. For example, a button click: I can catch this event, listen to it, and the event has a data attribute, a data field. It's just plain data that I can parse and work with. And it has a report ID. A report ID is a way to label your packets. If you have a packet with some report ID, you can read in the specification about every report ID. For example, for some devices, 01 is a way to say: I clicked something.
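The open-and-listen flow described here could look roughly like the sketch below. The two interfaces are minimal local stand-ins that cover only the WebHID members used (`opened`, `open()`, the `inputreport` event with `reportId` and `data`); `describeReport` and `listenToDevice` are names invented for this example.

```typescript
// Minimal structural types for the parts of WebHID used below,
// declared locally so the sketch compiles without extra type packages.
interface InputReportEvent {
  reportId: number;
  data: DataView;
}
interface HidDeviceLike {
  opened: boolean;
  open(): Promise<void>;
  addEventListener(type: "inputreport", listener: (event: InputReportEvent) => void): void;
}

// Turn a report into a readable hex line, handy while exploring an unknown device.
function describeReport(event: InputReportEvent): string {
  const bytes: string[] = [];
  for (let i = 0; i < event.data.byteLength; i++) {
    bytes.push(("0" + event.data.getUint8(i).toString(16)).slice(-2));
  }
  return "report " + event.reportId + ": " + bytes.join(" ");
}

// Open the device if it is not open yet, then log every input report it sends.
async function listenToDevice(device: HidDeviceLike): Promise<void> {
  if (!device.opened) {
    await device.open();
  }
  device.addEventListener("inputreport", (event) => {
    console.log(describeReport(event));
  });
}

// In a page:
//   const [device] = await navigator.hid.requestDevice({ filters });
//   await listenToDevice(device);
```

Dumping reports as hex while pressing buttons one by one is a simple way to see which bytes change.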
6. Working with DualShock Controller
I found a data format for my DualShock controller in which each button takes just one bit, so a single byte holds the state of eight buttons. Using a table, I can convert signals to Booleans and control the controller's features like the light bar and vibration. I can also access gyroscope and accelerometer data and control the color of the light bar. It's demo time!
It's a way to say it. And yeah, data has the type DataView, so you can use the methods of DataView. I found a data format for my DualShock, and you can find an interesting thing there: one byte encodes eight separate pieces of information. If we press a button on my DualShock, it takes just one bit. It's really cool. It's really efficient.
I will use this table to convert signals to Booleans, and I'll have code that uses some binary logic to tell me: triangle is pressed, cross is pressed, and so on. Okay.
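The binary logic can be sketched with a bit mask per button. The bit positions in `BUTTON_BITS` are hypothetical and chosen only for illustration; the real positions come from the data format table for the device.

```typescript
// Hypothetical bit positions inside one report byte, for illustration only.
const BUTTON_BITS = { square: 4, cross: 5, circle: 6, triangle: 7 };

interface FaceButtons {
  square: boolean;
  cross: boolean;
  circle: boolean;
  triangle: boolean;
}

// True when the given bit (0 = least significant) is set in the byte.
function isBitSet(byte: number, bit: number): boolean {
  return (byte & (1 << bit)) !== 0;
}

// Convert one raw byte into four Booleans, one per button.
function decodeButtons(byte: number): FaceButtons {
  return {
    square: isBitSet(byte, BUTTON_BITS.square),
    cross: isBitSet(byte, BUTTON_BITS.cross),
    circle: isBitSet(byte, BUTTON_BITS.circle),
    triangle: isBitSet(byte, BUTTON_BITS.triangle),
  };
}

// Bits 7 and 5 are set here, so with the layout above triangle and cross read as pressed.
const pressedButtons = decodeButtons(0b10100000);
console.log(pressedButtons);

// In a page the byte would come from the report, e.g. event.data.getUint8(someOffset).
```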
And the DualShock has two additional features. First, it has a light bar, and this light bar can change its color. And it has vibration. I want to use them, and to do that I just need to send some data. I will use a Uint8Array and fill this array with specific numbers. For that I really need to read the spec about the PlayStation DualShock. Then I will use the device.sendReport method, and it will work. So it's demo time. Let me switch the page. OK. That's a page that will try to connect to my wireless controller. It's already connected, and you can see gyroscope and accelerometer data, so I can read that from the data too. Let's try to colorize our light bar, not just press the buttons. Yeah. Hop! It changes the color. Green, orange. Yeah, it works. And I can even try to vibrate.
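Filling the array and sending it could be sketched as follows. Everything about the byte layout here is an assumption: `demoLayout` uses placeholder offsets and a placeholder report ID that are not taken from any DualShock documentation, and the type and function names are invented for the example. Only the `sendReport(reportId, data)` call shape mirrors the WebHID method.

```typescript
// Hypothetical description of where each value lives in the output report.
interface OutputLayout {
  reportId: number;
  length: number;
  red: number;
  green: number;
  blue: number;
  lightRumble: number;
  heavyRumble: number;
}

interface Feedback {
  red: number;
  green: number;
  blue: number;
  lightRumble: number;
  heavyRumble: number;
}

// Keep a value inside the 0..255 range of a single byte.
function clampByte(value: number): number {
  return Math.max(0, Math.min(255, Math.round(value)));
}

// Build the report body: a zero-filled array with the feedback values at their offsets.
function buildOutputReport(layout: OutputLayout, feedback: Feedback): Uint8Array {
  const data = new Uint8Array(layout.length);
  data[layout.red] = clampByte(feedback.red);
  data[layout.green] = clampByte(feedback.green);
  data[layout.blue] = clampByte(feedback.blue);
  data[layout.lightRumble] = clampByte(feedback.lightRumble);
  data[layout.heavyRumble] = clampByte(feedback.heavyRumble);
  return data;
}

// Only the one WebHID method used here, declared locally.
interface ReportSender {
  sendReport(reportId: number, data: Uint8Array): Promise<void>;
}

async function sendFeedback(device: ReportSender, layout: OutputLayout, feedback: Feedback): Promise<void> {
  await device.sendReport(layout.reportId, buildOutputReport(layout, feedback));
}

// Placeholder layout, NOT the real DualShock format.
const demoLayout: OutputLayout = { reportId: 0x01, length: 8, red: 0, green: 1, blue: 2, lightRumble: 3, heavyRumble: 4 };
console.log(buildOutputReport(demoLayout, { red: 0, green: 255, blue: 0, lightRumble: 0, heavyRumble: 0 }));
```

Keeping the layout in one object means that when the real offsets are known, only that object has to change.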
7. Using Microphone and Gamepad
Let's bring the gamepad to my microphone so you can hear its two rumbles, heavy and light. The WebHID-DS4 library from GitHub can help you control a DualShock with WebHID. But I want to work with other devices as well.
Let's use my microphone. I believe you can hear them, these vibrations. It has two rumbles I can control: heavy and light. Ooh, some ASMR. Yeah, I can control my gamepad, and it's really cool. But of course, there is a library, WebHID-DS4, that you can find on GitHub, and it can help you control a DualShock with WebHID. And use it, of course, use it. Don't write my primitive code every time. But I want to work with other devices, and I'll try to do it.
8. Using Stream Deck as a Drum Pad
I have a Stream Deck with 15 buttons that I want to use as a drum pad. I will configure it using code from Pete LePage, who built a Stream Deck helper for Google Meet, and Bramus Van Damme, who used that helper to create his own drum pad. I found an interesting gist on GitHub about the Stream Deck protocol, which is not public. With WebHID, I can display something on the Elgato Stream Deck buttons and play something. Now I can play guitar without a guitar!
I have a stream deck and it has 15 buttons. I want to use it like a drum pad. So let's try to do it. Give me some time to disconnect gamepad and connect stream deck.
Yeah. So this is stream deck. This is my stream deck. And I will try to configure it to use it like a drum pad. I'm a drums player. So I am happy.
I'm really happy that Pete LePage used his Stream Deck to control Google Meet, and he has a Meet Stream Deck helper on GitHub. I can use his code. And I'm really happy that Bramus Van Damme used this Stream Deck helper to create his own drum pad. So I just use their code and the article from Bramus. And I found an interesting gist on GitHub about the Stream Deck protocol. It's really useful, because the Stream Deck protocol is not public. You have to debug it to find some interesting features.
So, guys, really, thank you. Your articles and demos were really helpful for me. And it's demo time again. Let's open one more page. Oops. You can see that it has the same buttons as on the web page. So you can use WebHID to display something on the Elgato Stream Deck, on its buttons. And let's use it to play something. Okay. I hope my sound will be recorded. Yeah, cool! I can play guitar without a guitar.
9. Conclusion and Recommendations
Just five notes and you can play Seven Nation Army. I used a browser to play guitar. Don't forget to close the connection before unloading the page. Check out the Awesome WebHID repository on GitHub for articles, specifications, and examples. Find my slides on mefodi.dev.tlk.org or use the QR code. Connect with me on Twitter as DarkMefodi.
Just five notes and you can play Seven Nation Army. Really cool. I used a browser to play guitar. Cool. Be a good developer. Don't forget to close the connection before you unload the page, just to let another page connect to these devices and work with them.
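A minimal sketch of that cleanup, assuming the page keeps its open devices in an array. `ClosableDevice` is a local stand-in for the two WebHID members used (`opened` and `close()`), and `closeAll` is a made-up helper name.

```typescript
// Only the members needed for cleanup, declared locally.
interface ClosableDevice {
  opened: boolean;
  close(): Promise<void>;
}

// Ask every device that is still open to close; resolves when all of them have closed.
function closeAll(devices: ClosableDevice[]): Promise<void[]> {
  return Promise.all(
    devices
      .filter((device) => device.opened)
      .map((device) => device.close())
  );
}

// In a page, assuming openDevices is the array of devices the page has opened:
//   window.addEventListener("pagehide", () => { closeAll(openDevices); });
```

The same helper can also back an explicit "Disconnect" button, which does not depend on the page being unloaded.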
And what's next? I recommend the Awesome WebHID repository on GitHub. It has a lot of articles, specifications, and examples of how to use WebHID. I am inspired by it a lot. And if you want to start working with HID devices, I really recommend an article on web.dev about how to connect your device to your laptop. It's basic knowledge, but it's really useful.
And you can find my slides on mefodi.dev.tlk.org, or use this QR code. And I'm DarkMefodi on Twitter. So let's keep in touch. Be safe and use front-end power for really interesting features on your websites.