Finding Stealthy Bots in Javascript Hide and Seek

Rate this content
Bookmark

JavaScript has a lot of use cases - one of them is automated browser detection. This is a technical talk overviewing the state of the art automated browser for ad fraud, how it cheats many bot detection solutions, and the unique methods that have been used to identify it anyway.


FAQ

There are both beneficial and malicious bots on the web. Beneficial bots include those used for testing, while malicious bots can perform activities like ad fraud, social media manipulation, and denial-of-service attacks.

Basic methods for detecting bots include analyzing the user agent string, checking for the absence of JavaScript execution, and observing if a bot can handle complex behaviors like running JavaScript or generating tokens that mimic human interactions.

Browser quirks are inconsistencies or unique behaviors in how different browsers operate. These can be used to detect bots because while bots can mimic many human behaviors, replicating these specific quirks accurately can be more challenging.

Automated browsers have significantly advanced bot capabilities by allowing bots to emulate human browsing more convincingly. These tools can run a DOM, execute JavaScript, and fake user interactions, making bots harder to detect.

Puppeteer is an automated browser developed by Google that is particularly adept at mimicking human browser interactions without being easily detected. It supports extensions and modifications that can enhance bot operations, making it a key tool in sophisticated botting activities.

Advanced bots have evolved to bypass CAPTCHAs by employing techniques that mimic human responses or by using machine learning algorithms to solve CAPTCHA challenges, effectively diminishing the effectiveness of CAPTCHAs as a standalone deterrent.

Emerging techniques for detecting sophisticated bots include analyzing inconsistencies in browser fingerprinting, using behavioral analysis to detect unnatural patterns of interaction, and monitoring session level data for anomalies.

As bot technology evolves, bots are becoming better at mimicking human behaviors and browser characteristics, making it increasingly difficult to distinguish them from real human users without advanced detection techniques.

Adam Abramov
Adam Abramov
11 min
20 Jun, 2022

Comments

Sign in or register to post your comment.

Video Summary and Transcription

The Talk discusses the challenges of detecting and combating bots on the web. It explores various techniques such as user agent detection, tokens, JavaScript behavior, and cache analysis. The evolution of bots and the advancements in automated browsers have made them more flexible and harder to detect. The Talk also highlights the use of canvas fingerprinting and the need for smart people to combat the evolving bot problem.

1. Introduction to Web Bots

Short description:

I'm here to ask what's going on with bots on the web. We'll talk about simple detections, how the bots got better. We'll talk about what's possibly the best bot out there cheating on most detection solutions. And we'll lastly get to my favorite part, which is how you can find it anyways. My job is playing hide and seek with these bots, so advertisers can avoid them. It's going to be social media, concert ticket sellers, a lot of people facing this issue because the internet was not designed with bot detection in mind. When you do that, yeah, real story, when I was 16, high school product projects may or may have not dropped service to some site. So to make the internet better, we want to detect them. Let's talk detections. Starting with the basics. User agent. Does the HTTP request header identifying the browser? You guys know this. You see it's a Python bot. You block that. Probably not a real user behind that. They figured this out, the bot makers know, they hide the user agent. Let's say you don't run JavaScript on your bot.

Hey, everyone. I'm Adam. I'm super happy to be here, and I'm here to ask what's going on with bots on the web. I'm not talking about the nice ones, the testing. I'm talking about the bad ones. We'll talk about simple detections, how the bots got better. We'll talk about what's possibly the best bot out there cheating on most detection solutions. And we'll lastly get to my favorite part, which is how you can find it anyways.

But before all that, one reason I'm here is because I always like packing stuff, and now I'm the reverse engineer for DoubleVerify. They measure ads. But my job is playing hide and seek with these bots, so advertisers can avoid them. But it's not just advertisers and the games. It's going to be social media, concert ticket sellers, a lot of people facing this issue because the internet was not designed with bot detection in mind. Seriously. The only real standard is bots.txt telling bots what they're allowed and disallowed to do. Basically the honor system asking good people to play nice. When you do that, yeah, real story, when I was 16, high school product projects may or may have not dropped service to some site. But some people actually do this on purpose and at scale, denying service to real users, using what they have to steal, sneakers, sneaking around social media with fake users. I practice that part. So to make the internet better, we want to detect them.

Let's talk detections. Starting with the basics. Not because bot makers can't play around these, but because they're usually the first thing you rely on when you come up with something more complicated because simple detections are pretty straightforward. User agent. Does the HTTP request header identifying the browser? You guys know this. You see it's a Python bot. You block that. Probably not a real user behind that. They figured this out, the bot makers know, they hide the user agent. Let's say you don't run JavaScript on your bot.

2. Detecting Bots with Tokens and JavaScript

Short description:

You can use tokens and JavaScript behavior to detect bots on your site. Browser quirks can be used to verify the true nature of a browser. Digging deep into JavaScript can reveal attempts to hide something.

Maybe you make a token as the detection as the site. In Azure, actually make sure it's created. So if you have a bot that's navigating to your site, not generating this token, not running JavaScript, you know something's going wrong. But let's say they do run JavaScript. All of a sudden, you can check how the browser behaves. You people probably hate browser quirks. Bot makers hate them too, because they can be used to verify what's under the hood and not what the browser is reporting at face value. And sometimes you can dig deep in JavaScript to see if somebody's trying to hide something.

Check out more articles and videos

We constantly think of articles and videos that might spark Git people interest / skill us up or help building a stellar career

A Framework for Managing Technical Debt
TechLead Conference 2023TechLead Conference 2023
35 min
A Framework for Managing Technical Debt
Top Content
Let’s face it: technical debt is inevitable and rewriting your code every 6 months is not an option. Refactoring is a complex topic that doesn't have a one-size-fits-all solution. Frontend applications are particularly sensitive because of frequent requirements and user flows changes. New abstractions, updated patterns and cleaning up those old functions - it all sounds great on paper, but it often fails in practice: todos accumulate, tickets end up rotting in the backlog and legacy code crops up in every corner of your codebase. So a process of continuous refactoring is the only weapon you have against tech debt.In the past three years, I’ve been exploring different strategies and processes for refactoring code. In this talk I will describe the key components of a framework for tackling refactoring and I will share some of the learnings accumulated along the way. Hopefully, this will help you in your quest of improving the code quality of your codebases.

Debugging JS
React Summit 2023React Summit 2023
24 min
Debugging JS
Top Content
As developers, we spend much of our time debugging apps - often code we didn't even write. Sadly, few developers have ever been taught how to approach debugging - it's something most of us learn through painful experience.  The good news is you _can_ learn how to debug effectively, and there's several key techniques and tools you can use for debugging JS and React apps.
It's a Jungle Out There: What's Really Going on Inside Your Node_Modules Folder
Node Congress 2022Node Congress 2022
26 min
It's a Jungle Out There: What's Really Going on Inside Your Node_Modules Folder
Top Content
Do you know what’s really going on in your node_modules folder? Software supply chain attacks have exploded over the past 12 months and they’re only accelerating in 2022 and beyond. We’ll dive into examples of recent supply chain attacks and what concrete steps you can take to protect your team from this emerging threat.
You can check the slides for Feross' talk here.
Building a Voice-Enabled AI Assistant With Javascript
JSNation 2023JSNation 2023
21 min
Building a Voice-Enabled AI Assistant With Javascript
Top Content
In this talk, we'll build our own Jarvis using Web APIs and langchain. There will be live coding.
Power Fixing React Performance Woes
React Advanced Conference 2023React Advanced Conference 2023
22 min
Power Fixing React Performance Woes
Top Content
Next.js and other wrapping React frameworks provide great power in building larger applications. But with great power comes great performance responsibility - and if you don’t pay attention, it’s easy to add multiple seconds of loading penalty on all of your pages. Eek! Let’s walk through a case study of how a few hours of performance debugging improved both load and parse times for the Centered app by several hundred percent each. We’ll learn not just why those performance problems happen, but how to diagnose and fix them. Hooray, performance! ⚡️
Monolith to Micro-Frontends
React Advanced Conference 2022React Advanced Conference 2022
22 min
Monolith to Micro-Frontends
Top Content
Many companies worldwide are considering adopting Micro-Frontends to improve business agility and scale, however, there are many unknowns when it comes to what the migration path looks like in practice. In this talk, I will discuss the steps required to successfully migrate a monolithic React Application into a more modular decoupled frontend architecture.

Workshops on related topic

Building a Shopify App with React & Node
React Summit Remote Edition 2021React Summit Remote Edition 2021
87 min
Building a Shopify App with React & Node
Top Content
WorkshopFree
Jennifer Gray
Hanna Chen
2 authors
Shopify merchants have a diverse set of needs, and developers have a unique opportunity to meet those needs building apps. Building an app can be tough work but Shopify has created a set of tools and resources to help you build out a seamless app experience as quickly as possible. Get hands on experience building an embedded Shopify app using the Shopify App CLI, Polaris and Shopify App Bridge.We’ll show you how to create an app that accesses information from a development store and can run in your local environment.
0 to Auth in an hour with ReactJS
React Summit 2023React Summit 2023
56 min
0 to Auth in an hour with ReactJS
WorkshopFree
Kevin Gao
Kevin Gao
Passwordless authentication may seem complex, but it is simple to add it to any app using the right tool. There are multiple alternatives that are much better than passwords to identify and authenticate your users - including SSO, SAML, OAuth, Magic Links, One-Time Passwords, and Authenticator Apps.
While addressing security aspects and avoiding common pitfalls, we will enhance a full-stack JS application (Node.js backend + React frontend) to authenticate users with OAuth (social login) and One Time Passwords (email), including:- User authentication - Managing user interactions, returning session / refresh JWTs- Session management and validation - Storing the session securely for subsequent client requests, validating / refreshing sessions- Basic Authorization - extracting and validating claims from the session token JWT and handling authorization in backend flows
At the end of the workshop, we will also touch other approaches of authentication implementation with Descope - using frontend or backend SDKs.
Build a chat room with Appwrite and React
JSNation 2022JSNation 2022
41 min
Build a chat room with Appwrite and React
WorkshopFree
Wess Cope
Wess Cope
API's/Backends are difficult and we need websockets. You will be using VS Code as your editor, Parcel.js, Chakra-ui, React, React Icons, and Appwrite. By the end of this workshop, you will have the knowledge to build a real-time app using Appwrite and zero API development. Follow along and you'll have an awesome chat app to show off!
Hard GraphQL Problems at Shopify
GraphQL Galaxy 2021GraphQL Galaxy 2021
164 min
Hard GraphQL Problems at Shopify
WorkshopFree
Rebecca Friedman
Jonathan Baker
Alex Ackerman
Théo Ben Hassen
 Greg MacWilliam
5 authors
At Shopify scale, we solve some pretty hard problems. In this workshop, five different speakers will outline some of the challenges we’ve faced, and how we’ve overcome them.

Table of contents:
1 - The infamous "N+1" problem: Jonathan Baker - Let's talk about what it is, why it is a problem, and how Shopify handles it at scale across several GraphQL APIs.
2 - Contextualizing GraphQL APIs: Alex Ackerman - How and why we decided to use directives. I’ll share what directives are, which directives are available out of the box, and how to create custom directives.
3 - Faster GraphQL queries for mobile clients: Theo Ben Hassen - As your mobile app grows, so will your GraphQL queries. In this talk, I will go over diverse strategies to make your queries faster and more effective.
4 - Building tomorrow’s product today: Greg MacWilliam - How Shopify adopts future features in today’s code.
5 - Managing large APIs effectively: Rebecca Friedman - We have thousands of developers at Shopify. Let’s take a look at how we’re ensuring the quality and consistency of our GraphQL APIs with so many contributors.
0 To Auth In An Hour For Your JavaScript App
JSNation 2023JSNation 2023
57 min
0 To Auth In An Hour For Your JavaScript App
WorkshopFree
Asaf Shen
Asaf Shen
Passwordless authentication may seem complex, but it is simple to add it to any app using the right tool.
We will enhance a full-stack JS application (Node.js backend + Vanilla JS frontend) to authenticate users with One Time Passwords (email) and OAuth, including:
- User authentication – Managing user interactions, returning session / refresh JWTs- Session management and validation – Storing the session securely for subsequent client requests, validating / refreshing sessions
At the end of the workshop, we will also touch on another approach to code authentication using frontend Descope Flows (drag-and-drop workflows), while keeping only session validation in the backend. With this, we will also show how easy it is to enable biometrics and other passwordless authentication methods.