WebHID API: Control Everything via USB

Operating systems have long allowed us to control different devices using the Human Interface Device protocol. With the WebHID API you can do the same right from the browser. Let’s talk about the protocol's features and limitations. We’ll try to connect some devices to the laptop and control them with JavaScript.

23 min
20 Jun, 2022

AI Generated Video Summary

Today's Talk introduces the WebHID API, which allows developers to control real devices from the browser via USB. The HID interface, including keyboards, mice, and gamepads, is explored. The Talk covers device enumeration, input reports, feature reports, and output reports. The use of HID in the browser, especially in Chrome, is highlighted. Demos showcase working with a DualShock controller (light bar and rumble) and a Stream Deck used as a drum pad. The Talk concludes with recommendations and resources for further exploration.

1. Introduction to WebHID API and HID Interface

Short description:

Today we will talk about the WebHID API as a way to control real devices from the browser via USB. We will explore the HID interface, which includes keyboards, mice, and gamepads. Let's also discuss drivers, which act as an abstraction layer between applications and the operating system.

Hi, my name is Nikita and today we will talk about the WebHID API as a way to control some real devices right from the browser via USB. But a little disclaimer: it's not only USB, you can work with Bluetooth too, but today we will talk only about USB.

Who am I? I'm Nikita Dubko, I am a web developer with about 15 years of experience. I'm a bit of a podcaster, a D&D player, I really love kayaking with my friends, and I play drums and piano a little, and we will use that in this talk. I am also a Google Developer Expert, and for me GDE is about knowledge. It's really cool to visit GDE meetups and learn something about new web APIs. For example, about WebHID: we are talking about it right now because of GDE meetups.

And how did it all start? Maybe you saw something like this on Twitter: front end is not real programming. You just move some pixels, you color some buttons, and you don't even have access to real devices. We C++ programmers, we are real developers, we work with real devices, we use drivers and so on. But maybe we do have a way to work with real devices. So let's explore. Let's talk about languages like C++, and not only C++.

To work with real devices, you should use the HID interface. HID stands for Human Interface Devices. And you use it a lot. For example, keyboards, mice, gamepads: all devices that help people communicate with laptops and PCs. It's like a layer between the PC and the human. There is a USB class and a Bluetooth class. Today we will talk about the USB class, but working with Bluetooth is not much more complicated.

And let's talk about drivers. To write a driver, you should know a lot about your device. Drivers are an abstraction layer that helps our application work with the operating system. If we have an application and it wants to communicate with a real device, with hardware, it should call an operating system library, which can be a driver. And this driver will use raw data to communicate with the real device. It asks the real device: do you have some data? The hardware can respond: yes, I have, take it. The operating system processes this data and returns the result of that processing to the application. So it's an abstraction layer; you can write this layer and call it a driver. But to be honest, there is polling in our operating system. When you connect a real device via USB, the operating system asks the real device: do you have something? The real device: no, I haven't. Do you have? No, I haven't.

2. Polling, Device Enumeration, and Reports

Short description:

Today we will discuss polling in the context of the WebHID API and USB devices. We'll explore the concept of device enumeration and the use of device IDs. Additionally, we'll cover input reports, feature reports, and output reports, which are sets of data exchanged between devices and the operating system. Data can be sent to a device using driver methods such as set report and set feature. For more details, refer to the USB specification's human interface devices section.

Do you have? Okay, I have something. It's polling. It's like HTTP polling on your websites. By default, it's 125 Hz, or 125 times per second. That is the default value for USB 2.0, but today we have USB 3.0, we have USB Type-C, and it's much faster.

For example, we redraw our screen only 60 times per second in a browser. I understand that it's not right to compare such values, but it's just an example.

And when you connect your device to the operating system, this device should be enumerated. Why? For example, if you use a USB hub, you have only one output from this hub, one input into your laptop, and you can connect a lot of devices to this hub. So the USB protocol has some features that let many devices share one USB bus. Every device has a device ID, and it's a number. So every packet can carry this device ID to help the operating system understand: okay, that's a keyboard, okay, that's a gamepad, and so on. And yes, it has polling, it asks devices many times per second: do you have something? Please give me some data.

But if we talk about entities, these entities are input reports, feature reports, and output reports. Each is just a set of data, a set of bytes. Input reports carry data from a real device to your laptop, to your operating system. Feature reports and output reports are output from your laptop, so it's data from the laptop to your real device.

And feature reports are like output reports, but with some special features. To push some data to your real device, you can, for example, use driver methods named set report and set feature. And what is a report? A report is just a plain set of numbers. It's an array of bytes. And it's not really readable, right? It's just byte, byte, byte, some offsets. What is it? You should read the USB specification. It has a part on human interface devices. You will find that it dates from 2001. It's a really old specification. It has a lot of descriptors, and a descriptor is a way to describe what an input or output report should look like.
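To make "a report is just an array of bytes" concrete, here is a minimal sketch that decodes a three-byte report. The layout (one byte of button bits followed by two signed movement bytes) is an illustrative assumption in the style of a simple mouse, and the function name decodeMouseLikeReport is hypothetical; a real device's layout comes from its report descriptor.

```javascript
// Illustrative layout only (an assumption, not taken from the talk):
//   byte 0: button bits (bit 0 = left, bit 1 = right, bit 2 = middle)
//   byte 1: X movement, signed 8-bit
//   byte 2: Y movement, signed 8-bit
function decodeMouseLikeReport(bytes) {
  // A DataView lets us read signed and unsigned values at byte offsets.
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const buttons = view.getUint8(0);
  return {
    left: (buttons & 0b001) !== 0,
    right: (buttons & 0b010) !== 0,
    middle: (buttons & 0b100) !== 0,
    dx: view.getInt8(1),
    dy: view.getInt8(2),
  };
}

// Without the layout, these three bytes mean nothing; with it, they read as
// "left button held, moved 5 right and 5 up".
const sampleReport = new Uint8Array([0x01, 0x05, 0xfb]);
console.log(decodeMouseLikeReport(sampleReport));
```

The same bytes would decode to something entirely different under another layout, which is why the descriptor matters.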

3. Introduction to HID and USB Device Data

Short description:

HID gives developers a way to describe and pack data from USB devices. Wireshark can help debug USB device data. C++ developers can use the HIDAPI library from libusb, but front-end developers can leverage HID in the browser. Chrome already works with HID gamepads, joysticks, mice, keyboards, and keypads.

It has usage pages, collections, logical minimum, logical maximum, and so on. It's a way to describe these bytes, this raw data. It's a way to help us, developers, collect our data and pack it the way the descriptor says.

And you can debug data from your USB device using Wireshark. It's a great application, it's available on macOS, it's available on Windows, and it can help you sniff the data passing between your laptop and a USB device.

And of course, even if I am a C++ developer, I don't want to write all these lines of code every time. I want a library, and you can use, for example, the HIDAPI library from libusb. It's an abstraction layer that you can use in your application. But I am a front-end developer. I don't want to use C++ libraries in my code. But HID is already in your browser. You can open the Chromium source code, and in this file you can see that Chrome works with gamepads, joysticks, mice, keyboards, keypads, and so on. And that makes sense. You work with a keyboard every day. You type something into your browser, so the browser can work with HID.

4. Introduction to WebHID API

Short description:

But what if a browser can help us, developers? What if it can give us an API that will help us work with real devices, too? WebHID is a browser API, not a W3C standard. It has been enabled by default since Chrome 89. Unfortunately, Firefox and Safari do not support WebHID due to their respective standards positions. WebHID does not work with trusted input, which includes sensitive data like passwords and credit card numbers. It requires a user gesture and gives the user control over device access. Let's try a demo.

But what if a browser can help us, developers? What if it can give us an API that will help us work with real devices, too? We have some human interface devices. Let's try to use them. Ta-da! WebHID.

WebHID is a browser API, and it's not a W3C standard. You can find the specification in the Web Platform Incubator Community Group. It has a lot of text about methods, what the navigator.hid attribute is, and so on. So please just read it to understand the specification. And it has been enabled by default since Chrome 89. So you don't even need to enable experimental web platform features. It's just enabled. So you can use it right now as progressive enhancement, for example.

Unfortunately, Mozilla doesn't work with HID because of Mozilla's standards positions. They will not implement the specification because of fingerprinting, privacy, and so on. Safari doesn't work with HID either, because of tracking prevention, but Chrome has a huge audience, so you can use HID as progressive enhancement.

And WebHID doesn't work with trusted input. Trusted input is when you pass private data through an input device. For example, when you use a keyboard, you type passwords and credit card numbers. When you use a mouse, you can click on a CAPTCHA, I don't know. It's sensitive data. So WebHID will not work with such input. You can find which devices are trusted in the Chromium source code, for example. And it requires a user gesture. That's okay. If we talk about audio, we cannot autoplay audio without a user gesture. And I really want to know when a website wants access to my real device. It's my device. It's my private device. So I want to be the one who allows the browser to work with my device. And let's try a demo.
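Since the API is only available in some browsers, a feature check is the usual starting point for progressive enhancement. The sketch below is one way to do it; the helper name isWebHidSupported is hypothetical, and it takes the navigator object as a parameter so it can also be exercised outside a browser.

```javascript
// Returns true when the given navigator-like object exposes the WebHID entry point.
function isWebHidSupported(nav) {
  return typeof nav === 'object' && nav !== null && 'hid' in nav;
}

// In a page, pass the real navigator; outside a browser there may be none.
const currentNavigator = typeof navigator !== 'undefined' ? navigator : undefined;

if (isWebHidSupported(currentNavigator)) {
  console.log('WebHID is available: show the "Connect device" button.');
} else {
  console.log('WebHID is not available: keep the regular keyboard and mouse UI.');
}
```

The "Connect device" button matters for more than layout: the device picker can only be opened in response to a user gesture such as a click.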

5. Connecting Devices and Working with HID

Short description:

Let's try to connect my devices to web pages. We can use the HID interface to work with real devices. By using filters with specific vendor IDs and product IDs, we can request devices that match our criteria. To find these IDs, we can refer to the device log page in Chromium or use websites like devicehunt.com. Once we have a device, we can open a connection to it and listen for events, such as input reports. These reports contain data that we can work with, including a report ID that identifies the packet. For example, a report ID of 01 may indicate a button click.

Let's try to connect my devices to web pages. How do we work with HID? I have a PlayStation DualShock, and we will try to connect it to the page. First, we have the navigator.hid attribute. It is the HID interface. It has properties and methods to work with real devices. And to work with it, if we have navigator.hid, we can use filters.

Filters are the way to pick a specific device to work with, and you should call the requestDevice method with these filters. You can pass a vendor ID and a product ID, and you will get a list of devices that have these vendor IDs and product IDs. And where can I find the vendor ID and product ID? First, you can find them on the device log page in Chromium. It looks like this, and you can find the vendor ID, product ID, name, serial number, and so on. It's a really interesting page. You can find a lot of devices connected to your laptop. For example, the Magic Trackpad is an HID device too. You can work with it somehow. And you should convert the numbers from this tab, because here they are decimal, while in code I use hexadecimal numbers, and in specifications we find a lot of hexadecimal numbers. So please keep an eye on this. And if you want to find some rare device, you can use some websites. For example, I use devicehunt.com.
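The filter step and the decimal-to-hexadecimal conversion can be sketched as follows. The helper names toHexId and pickDevice are hypothetical. The IDs are assumptions rather than values from the talk: 0x054C is the USB vendor ID registered to Sony, and 0x05C4 and 0x09CC are product IDs commonly reported for DualShock 4 controllers, so check them against your own device list.

```javascript
// Convert a decimal ID, as shown in a device list, to the 0x-prefixed form
// that specifications and code samples usually use.
function toHexId(decimalId) {
  return '0x' + decimalId.toString(16).padStart(4, '0').toUpperCase();
}

// Assumed IDs for a DualShock 4 (see the note above).
const DUALSHOCK_FILTERS = [
  { vendorId: 0x054c, productId: 0x05c4 },
  { vendorId: 0x054c, productId: 0x09cc },
];

// `hid` is expected to be navigator.hid. requestDevice() opens the browser's
// device picker, so this must be called from a user gesture (a click handler).
async function pickDevice(hid, filters) {
  const devices = await hid.requestDevice({ filters });
  return devices.length > 0 ? devices[0] : null;
}

console.log(toHexId(1356)); // decimal vendor ID from a device list, in hexadecimal
```

In a page, this would typically be wired up as something like button.addEventListener('click', () => pickDevice(navigator.hid, DUALSHOCK_FILTERS)). If the user cancels the picker, the returned list is empty and the helper returns null.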

Okay, we have a list of devices. If we want to work with just one device, I will use devices[0], and we should open a connection to this device. So we will use await device.open(), and we will have a connection between the real device and our web page. After that, we will have some events. We can listen to these events to get data and react to something. I want to use the inputreport event, which fires when my real device sends me some data. For example, on a button click I can catch this event, and the event has a data field. It's just plain data that I can parse and work with. And it has a report ID. A report ID is a way to label your packets. If you have a packet with some report ID, you can read in the specification what every report ID means. For example, for some devices, 01 is a way to say: I clicked something.
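Opening the connection and listening for input reports might look like the sketch below. The helper names describeInputReport and listenToDevice are hypothetical; open(), the opened flag, the inputreport event, and its reportId and data fields are part of the WebHID API.

```javascript
// Turn an inputreport-style event into a printable hex summary.
// event.data is a DataView over the report bytes (without the report ID).
function describeInputReport(event) {
  const { reportId, data } = event;
  const bytes = [];
  for (let i = 0; i < data.byteLength; i++) {
    bytes.push(data.getUint8(i).toString(16).padStart(2, '0'));
  }
  const id = reportId.toString(16).padStart(2, '0');
  return `report 0x${id}: ${bytes.join(' ')}`;
}

// `device` is expected to be an HIDDevice returned by requestDevice().
async function listenToDevice(device, onReport) {
  if (!device.opened) {
    await device.open();
  }
  device.addEventListener('inputreport', (event) => {
    onReport(describeInputReport(event));
  });
}
```

Logging the raw bytes this way, for example with listenToDevice(device, console.log), is a practical first step: press one button at a time and watch which byte changes.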

6. Working with DualShock Controller

Short description:

I found a data format for my DualShock controller that encodes eight bits of information in just one byte. Using a table, I can convert signals to Booleans and manipulate the controller's features like the light bar and vibration. I can also access gyroscope and accelerometer data and control the color of the light bar. It's demo time!

It's a way to tell what the packet is about. And the data has the type DataView, so you can use the methods of DataView. And I found a data format for my DualShock. You can notice an interesting thing: we use only one byte to encode eight bits of information. If we press a button on my DualShock, it takes just one bit of information. It's really cool. It's really efficient.

I will use this table to convert signals to Booleans, and I'll have code that, using some binary logic, tells me: triangle is pressed, cross is pressed, and so on. Okay.
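The "binary logic" step can be sketched with bit masks. The byte offset and masks below are assumptions based on community write-ups of the DualShock 4 USB input report, not on an official specification, so verify them against your own controller. The function name readFaceButtons is hypothetical.

```javascript
// Assumed layout: the byte at offset 4 of event.data holds the D-pad state in
// its low four bits and one bit per face button in its high four bits.
const FACE_BUTTONS_OFFSET = 4;
const BUTTON_MASKS = {
  square: 0x10,
  cross: 0x20,
  circle: 0x40,
  triangle: 0x80,
};

// `data` is the DataView from an inputreport event.
function readFaceButtons(data) {
  const byte = data.getUint8(FACE_BUTTONS_OFFSET);
  const pressed = {};
  for (const [name, mask] of Object.entries(BUTTON_MASKS)) {
    // A button is pressed when its bit is set in the shared byte.
    pressed[name] = (byte & mask) !== 0;
  }
  return pressed;
}

// 0xA8 is 1010 1000 in binary: the triangle and cross bits are set.
const sampleData = new DataView(new Uint8Array([0x80, 0x80, 0x80, 0x80, 0xa8]).buffer);
console.log(readFaceButtons(sampleData));
```

Each button costs a single bit, which is why one byte is enough for four buttons plus the D-pad.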

And the DualShock has two additional features. First, it has a light bar. This light bar can change its color. And it has vibration. So I want to use them, and to do that I just need to send some data. I will use a Uint8Array. I just fill this array with specific numbers. I really need to read the spec for the PlayStation DualShock for that. And I will use the method device.sendReport, and it will work. So it's demo time. Let me switch the page. OK. That's a page that will try to connect to my wireless controller. It's already connected, and you can see gyroscope and accelerometer data. So I can read that from the data too. Let's try to color our light bar, not just read the buttons. Yeah. Hop! It changes the color. Green, orange. Yeah, it works. And I can even try to vibrate.
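Filling a Uint8Array and passing it to sendReport could look roughly like this. Everything about the byte layout here is an assumption modeled on community reverse-engineering notes for the DualShock 4 over USB (report ID 0x05, a flags byte, two rumble bytes, then red, green, and blue), and the function names are hypothetical. Treat it as a sketch to check against a protocol reference, not as the format used in the talk's demo.

```javascript
const DS4_USB_OUTPUT_REPORT_ID = 0x05; // assumed report ID for USB output

// Builds the report payload. The report ID is not part of the array because
// sendReport() takes it as a separate argument.
function buildLightbarRumbleReport({ r, g, b, lightRumble = 0, heavyRumble = 0 }) {
  const clamp = (value) => Math.max(0, Math.min(255, Math.round(value)));
  const data = new Uint8Array(15);
  data[0] = 0xf3; // assumed flags: enable rumble (0x01) and light bar (0x02)
  data[3] = clamp(lightRumble); // assumed: light (high-frequency) motor
  data[4] = clamp(heavyRumble); // assumed: heavy (low-frequency) motor
  data[5] = clamp(r);
  data[6] = clamp(g);
  data[7] = clamp(b);
  return data;
}

// `device` is expected to be an opened HIDDevice.
async function setLightbarAndRumble(device, state) {
  await device.sendReport(DS4_USB_OUTPUT_REPORT_ID, buildLightbarRumbleReport(state));
}
```

A call such as setLightbarAndRumble(device, { r: 0, g: 255, b: 0 }) would then request a green light bar with both motors off. Keeping the byte packing in a pure function makes it easy to inspect the payload before sending anything to real hardware.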

7. Using Microphone and Gamepad

Short description:

Let's use my microphone so you can hear the gamepad's two rumble motors. There is also a library, WebHID-DS4, on GitHub that can help you control a DualShock via WebHID. But I want to work with other devices as well.

Let's use my microphone. I believe you can hear these vibrations. The gamepad has two rumble motors I can control. Heavy and light. Ooh, some ASMR. Yeah, I can control my gamepad, and it's really cool. But of course, there is a library, WebHID-DS4, that you can find on GitHub, and it can help you control a DualShock via WebHID. So use it, of course, use it. Don't write my primitive code every time. But I want to work with other devices, and I'll try to do it.

8. Using Stream Deck as a Drum Pad

Short description:

I have a Stream Deck with 15 buttons that I want to use as a drum pad. I will configure it using code from Pete LePage and Bramus Van Damme; Bramus created his own drum pad using Pete's Stream Deck helper. I found an interesting gist on GitHub about the Stream Deck protocol, which is not public. With their help, I can use WebHID to display something on the Elgato Stream Deck buttons and play something. Now I can play guitar without a guitar!

I have a Stream Deck and it has 15 buttons. I want to use it as a drum pad. So let's try to do it. Give me some time to disconnect the gamepad and connect the Stream Deck.

Yeah. So this is a Stream Deck. This is my Stream Deck. And I will try to configure it to use it as a drum pad. I'm a drum player. So I am happy.

I'm really happy that Pete LePage used his Stream Deck to control Google Meet. He has a Meet Stream Deck helper on GitHub. I can use his code. And I'm really happy that Bramus Van Damme used this Stream Deck helper to create his own drum pad. So I just use their code and the article from Bramus. And I found an interesting gist on GitHub about the Stream Deck protocol. It's really useful because the Stream Deck protocol is not public. You have to debug it to find interesting features.

So, guys, really, thank you. Your articles and demos helped me a lot. And it's demo time again. Let's open one more page. Oops. You can see that it has the same buttons as on the web page. So you can use WebHID to display something on the Elgato Stream Deck, on its buttons. And let's use it to play something. Okay. I hope my sound will be recorded. Yeah, cool! I can play guitar without a guitar.

9. Conclusion and Recommendations

Short description:

Just five notes and you can play Seven Nation Army. I used a browser to play guitar. Don't forget to close the connection before unloading the page. Check out the awesome-webhid repository on GitHub for articles, specifications, and examples. Find my slides on mefodi.dev.tlk.org or use the QR code. Connect with me on Twitter as DarkMefodi.

Just five notes and you can play Seven Nation Army. Really cool. I used a browser to play guitar. Cool. Be a good developer. Don't forget to close the connection before you unload the page, just to let other pages connect to these devices and work with them.

And what's next? I recommend the awesome-webhid repository on GitHub. It has a lot of articles, specifications, and examples of how to use WebHID. I was inspired by it a lot. And if you want to start working with HID devices, I really recommend an article on web.dev about how to connect your device to your laptop. It's basic knowledge, but it's really useful.
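Closing the connection when the page goes away can be sketched as below. The helper names closeIfOpen and closeOnPageHide are hypothetical, and using the pagehide event is an assumption made for this sketch; the talk only says to close before the page unloads.

```javascript
// Close the device only if it is currently open.
// Resolves to true when a close was requested, false otherwise.
async function closeIfOpen(device) {
  if (device && device.opened) {
    await device.close();
    return true;
  }
  return false;
}

// `target` is expected to be `window`; `device` an HIDDevice.
function closeOnPageHide(target, device) {
  target.addEventListener('pagehide', () => {
    // Fire and forget: the page is going away, so the result is not awaited.
    closeIfOpen(device);
  });
}
```

In a page this would be registered once after connecting, for example closeOnPageHide(window, device). Calling closeIfOpen(device) from a "Disconnect" button covers the case where the user stays on the page.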

And you can find my slides on mefodi.dev.tlk.org, or use this QR code. And I'm DarkMefodi on Twitter. So let's keep in touch. Be safe and use front-end power for really interesting features on your websites.
