Pixels, Promises, and Panic: Horror Stories of Production Nightmares

Join me for "Pixels, Promises, and Panic" as we delve into the world of frontend mishaps. We'll share 4-5 real-life horror stories from the trenches of web development. From baffling browser bugs to cringe-worthy code catastrophes, these tales are a mix of humor and caution. Whether you're a seasoned developer or just starting out, these stories will entertain, enlighten, and remind us all of the unexpected twists and turns in the world of coding.

Neciu Dan
28 min
15 Jun, 2024


Video Summary and Transcription

The Talk covers the importance of preventing and dealing with bugs in software development, with the speaker sharing their experiences and solutions. They discuss the impact of bugs on user experience and revenue, the importance of stress testing and implementing alerts, and the surprising impact of CSS. The speaker also emphasizes the importance of simplicity, monitoring, and refactoring, as well as the need to address security threats and learn from failure. They provide tips for writing postmortems and highlight common mistakes to avoid, especially for junior developers.

1. Introduction to Bugs Bunny

Short description:

Welcome to the last talk of the day. I'm here to talk to you about how to make mistakes and get away with it. Bugs are the leading cause of serious issues for developers. They bypass tests and wreak havoc on users. Bugs Bunny is inevitable, but we can prevent him from breaking our production service. Let me share my experiences and solutions to help you avoid making the same mistakes. I have 4 or 5 stories to tell. I've been a front-end engineer for 12 years and worked on all three big frameworks.

Welcome to the last talk of the day. As Daniel said when he introduced me, my name is Neciu Dan. I usually introduced myself as Dan Neciu, but that was a mistake because on Twitter it doubled the N, and I decided that didn't look good, so I switched it up.

I'm here to talk to you about how to make mistakes and get away with it. That's the title of my talk. In production, we usually have issues, outages, downtime, service failures, technical stoppages, and system breakdowns. These are all serious, serious issues and problems that are affecting millions of developers everywhere. And, at the root of all this, what is the leading cause of these terrible issues that keep developers from sleeping at night? Well, it's actually bugs.

These ferocious little rascals bypass all our tests and manual quality assurance, they go to production, and they wreak havoc on our users, causing churn, loss of revenue, and frustrating users every day. These are bugs, but developers like to call them little oopsies. But who here knows why they're actually called bugs? A couple. I assume they're called bugs because they're gross and scary, so why not call them spiders or snakes? Those are scary, deadly, and more dangerous than bugs. Or why not something really, really scary, like clowns?

The reason is that in 1947, at Harvard University, a team of computer scientists saw that their very big computer was not working correctly. After tinkering with it a little bit on the software side and finding nothing, they opened the computer up, and what they saw inside was one bug that was stuck to the motherboard. This was also made popular by the story of Dr. Grace Hopper, who was one of the inventors of COBOL.

Personally, I don't find bugs scary, because I usually like to call them Bugs Bunny. The reason for this is, think about it, Bugs Bunny never gave up. He always outwitted Elmer Fudd and found a way to get what he wanted. He repeatedly changed duck season to rabbit season and tricked Elmer Fudd into shooting himself in the face. It is the same with bugs. No matter how much you try, how many people you throw at it, how many automated tests you write, they usually find a way to be clever and make us, the developers, shoot ourselves in the face. So Bugs Bunny is inevitable. That's why it is our job as engineers to make sure he doesn't get too fat and break into our production service and eat all our carrots.

So that's why I'm here today. I want to talk to you about all the times that Bugs Bunny tricked me into shooting myself in the face. I have four or five stories, depending on the time. Each of these stories had an obvious solution in hindsight, but thinking about them and implementing some of these practices might help you not make the same mistakes that I did.

So about me, I've been a front-end engineer for 12 years. I was born in Romania, but I live in beautiful sunny Barcelona. I like to write tech articles, and I worked on all three big frameworks.

2. How 20 Children Broke the Leaderboard

Short description:

I joined the company with little programming knowledge and got involved in building multiple projects. We developed a browser game for kids, but a bug allowed them to cheat the leaderboard. The issue caused a backlash from parents and the media, but we eventually fixed it. This experience taught me the importance of stress testing and implementing proper alert systems.

My first story is from when I joined a company as a total programming noob. I knew a lot of things like data structures, dynamic programming, how to reverse a linked list, and all sorts of things that were absolutely useless when I started my first job. As for web development, I just knew how to position things with position absolute all over the place. But the company that I worked at had a lot of projects and very few developers, so I immediately got stuck into building a lot of projects and got my hands dirty.

This company was an outsourcing company, and we liked to do browser games. So there we were, building this fantastic gaming experience, a point-and-click game that helped children learn about the benefits of milk. The game itself was very straightforward, even for seven or eight-year-old kids. You had three little games, and a milk cart would appear, and if the child saw milk, he had to click it and he got points. If he clicked multiple milks in a row, he got double the points, and he could create a streak. Now these were three mini games, and outside you had a leaderboard that showed the top 20 scores that the children could reach. And they all tried to build it.

Now one tricky mechanic about these games is that as you moved between them, the game kept the same score, and if you paused, you could restart it in another game. What I didn't count on is how driven seven-year-olds would be to be at the top of the leaderboard, especially considering there were no prizes. This was just a fun game, and it was a game from before we knew about gamification, like Duolingo habits and all the fancy stuff that we know now. Back then, we added the leaderboard just for fun. We tested the game for months, but a week after launching, we came back to work and, to our surprise, saw the leaderboard like this. The scores were in the billions.

So immediately, we thought, okay, it's one person who created 20 accounts and cheated. He found a way to send the score with Postman or something and update the values in our database. So we assumed the worst, discussed it with our client, and banned all 20 people from the leaderboard, or, as we considered them, one person.

Everything was fine for a couple of days, the quiet before the storm, and then we got bombarded with emails and calls from clients and from the news, because all of them were blaming us for making 20 children from the same class cry. When we banned them, obviously the entire school saw that they were no longer on the leaderboard, and then the other kids started calling them cheaters, and then they started crying, and they told their parents in turn. And the only thing scarier in this world than a pissed-off mother with a crying child is actually 20 pissed-off mothers with crying children.

So they banded together, called the company, called the news, and went after us, and it took some time for the shitstorm, let's say, to actually reach us, the small company that built the game. When it did, we actually found out what happened. A seven-year-old child found out that if he paused the game while he had a streak going, and then started and paused and started and paused, his score would double. The first thing he did, of course, was become first on the leaderboard. Then he went to his friends and bragged about it, and then his friends started doing the same thing, and it became a competition over who could get the best score by doing this trick that they had found. We actually thought they were hackers, but in reality they were just gaming the system.

And of course, once I found the bug, fixing it was just one line of code, a wrongly placed if statement that checked some local score against the database score. So one line of code made 20 children cry. That code was written in PHP, so we can actually say that PHP makes even young children cry, not just adults. In the end we fixed it, we apologized, we restored the accounts of the top 20 children but gave them adequate scores, and then we let the friendly competition continue.

Now what did I learn from all of this? First of all, you need to stress-test everything. We could have easily prevented this if we had had an alert system that notified us when a score was an outlier far above the median.
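
As a rough illustration of that kind of alert, here is a minimal TypeScript sketch. It is not the code from the talk (the original game was written in PHP); the names `median`, `findOutlierScores`, and `ScoreEntry`, as well as the "10 times the median" threshold, are assumptions made up for this example. It also assumes the median is computed over all submitted scores rather than only the top 20, since a leaderboard made up entirely of inflated scores would have an inflated median too.

```typescript
// Hypothetical sketch: flag scores that sit far above the median score.
// All names and the default threshold are illustrative assumptions.

interface ScoreEntry {
  player: string;
  score: number;
}

// Median of a non-empty list of numbers.
function median(values: number[]): number {
  if (values.length === 0) {
    throw new Error("median of an empty list is undefined");
  }
  // Copy before sorting so the caller's array is not mutated.
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0
    ? (sorted[mid - 1] + sorted[mid]) / 2
    : sorted[mid];
}

// Returns every entry whose score is more than `thresholdMultiplier`
// times the median of all submitted scores.
function findOutlierScores(
  entries: ScoreEntry[],
  thresholdMultiplier: number = 10,
): ScoreEntry[] {
  if (entries.length === 0) {
    return [];
  }
  const med = median(entries.map((e) => e.score));
  return entries.filter((e) => e.score > med * thresholdMultiplier);
}

// Usage: three ordinary scores and one in the billions.
const allScores: ScoreEntry[] = [
  { player: "a", score: 120 },
  { player: "b", score: 150 },
  { player: "c", score: 90 },
  { player: "d", score: 2000000000 },
];

const suspicious = findOutlierScores(allScores);
for (const entry of suspicious) {
  // In a real system this would page someone or post to a team channel.
  console.warn(`Suspicious score for ${entry.player}: ${entry.score}`);
}
```

A check like this only tells you that something looks odd. As the story shows, the flagged players may be exploiting a bug rather than cheating, so an alert is a prompt to investigate before anyone gets banned.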

QnA