Come On Barbie, Let’s Go Party: Using AI for Music Mixing

As a DJ, I use many techniques to mix and create new sounds that get people's hands in the air. In this talk I'll describe AI algorithms based on neural networks that can break music down into its elements. I'll cover how our brain differentiates between dozens of different sound signals when we listen to music. Can we instruct AI to do the same?

The cool part: live DJing on stage using AI algorithms.

Ziv Levy
27 min
13 Jun, 2024

Video Summary and Transcription

Today, we explore DJ mixing and how deep learning revolutionizes the art by discussing sound processing, extracting features, and using machine learning. Deep learning allows for efficient extraction of audio features and high-resolution track separation. Neural networks can achieve source separation by converting audio to spectrograms and applying convolutional and recurrent neural networks. This has immediate impact on industries such as karaoke and music transcription.

1. Introduction to DJ Mixing and Deep Learning

Short description:

Today, we're going to explore DJ mixing and how deep learning revolutionizes the art. I work in the data science group at Wix, and I'm also a DJ. DJing is more than curating playlists; it's about reading the crowd. Sometimes, when I try to blend in a song that sounds perfect in my headphones, it crashes on the dance floor. Let me show you an example. We'll discuss sound processing, extracting features, and using machine learning. And then, we'll dive into the revolutionary deep learning approach.

Today, we're going to explore and dig into the art of DJ mixing, and I'm going to talk about it from my perspective as a DJ. We're also going to talk about how deep learning brings a whole new revolution to this art of mixing and, in general, what can be done with sound signals and neural networks.

So, a bit more about myself. I've been working at Wix for the past seven years, in the data science group. My day job is building machine learning pipelines for data scientists across the organization. For those of you who are not familiar with Wix, Wix is a website-building platform. And I'm also a DJ. I mix Dark 80s, synthwave, and techno sounds, and this is what we're going to talk about today: this aspect of my life as a DJ.

And I don't need to tell you that being a DJ is not only about curating the right playlist, but also about the ability to read the crowd and decide which track comes next according to the energy on the dance floor. And the problem is that sometimes I hear something in my headphones that seems to fit the dance floor perfectly, and when I try to blend it in, it crashes. Let me show you how I crash a mix, and how awful it sounds. So I picked these two songs. One of them is by Adele. You are familiar with this song, right? And the next one is, oh, not this one. The next one is this one. Also familiar. By the way, everything I do, I'm doing live. So if I have some glitches or mess-ups, just excuse me. Okay?

So in my head, those songs match perfectly. But let's try to play it, and let's skip to the highlight of the Adele song. I'll try to mix in the other song exactly at its highest point. Okay. As you heard, it's a lot of noise. This is where some of you would probably make a face like, hmm, what? What's wrong with this DJ? Fortunately for me, you would be surprised what a very drunk crowd can tolerate. But for me, it's devastating. It really ruins the moment, the energy is unbalanced, I need to recover from it, and it's very stressful. But again, in my head, it was perfect. So what went wrong? What we're going to talk about today is what sound is, how we process audio with computers, how we pull features out of this audio, and how we use them in machine learning. Okay? And then we're going to talk about the deep learning approach, which is pretty much revolutionary.

2. Exploring Source Separation and Sound Modeling

Short description:

It all started with an email about a unique technology for separating track sources. I didn't pay much attention until a friend asked for help in separating vocals. I rediscovered the tool in my DJ software and was amazed by its real-time capabilities. Intrigued, I delved into music source separation using neural networks. Sampling measures amplitude levels, resulting in a waveform that holds information about frequency, intensity, and timbre. Computers struggle to distinguish between instrument overtones, unlike our brains.

And as we speak, things are really happening right now. So it all started a couple of years ago, when I got a release-notes email from the DJ software that I'm using, saying something like: dear DJs, we are now able to provide you with a unique technology that will allow you to separate the sources of your track, and by that, you know, be creative and do something with it. And I thought to myself at first that, well, it's not so interesting. I mean, it has probably been solved already. But you know, it was the post-Covid era, there were still limitations on crowding and everything, so I really didn't pay attention to it.

And recently, a friend of mine came to me and said, I want your help to separate the vocals out of a track that I have. This is a very old track, and there are no studio versions or anything. What can I do? And you know, I have my equalizer here, and I can play with it and, in some manner, reduce some sounds or enhance others, but that's not really creating a karaoke version, like peeling apart the layers. But suddenly I remembered that I have this tool in my DJ software, so I read the step-by-step guide on what I needed to configure, clicked a few buttons, and boom, I had it. And it was nice, she was happy, but then I played with it on another song and another song, and it wasn't just nice. I was amazed by it, and everything was happening in real time.

And this is something that was not in the release notes, by the way, or maybe it was, but I didn't read the entire thing. But I was amazed, so this really triggered the engineering part of my brain. And, you know, what do I do when I want to know how things work? I go to Google. I looked for music source separation using neural networks, and I downloaded an article, read it, another article, read it, downloaded the dataset, downloaded the Python code, trained the model myself, and then I tested it with another track and another track and another track, and my mind was actually blown by this technology. And after a few hours of playing with it, this is how I looked. Like, a whole new world opened up to me.

So, the first thing is how we model sound, okay? What is sound? Sound, eventually, is changes in air pressure caused by the vibrations of air molecules. Our ears are sensitive to those vibrations, and eventually this is what our brain perceives as sound. Computers do something similar called sampling. I'm not going to dig into this technique because of time constraints, but the computer measures the amplitude levels of those vibrations. Eventually, what we get is a waveform, which is the most common visual representation of sound, and this waveform actually holds multifactorial information about the sound.

The first thing is the frequency, okay? If we zoom in, we can get the frequency of the sound. The second thing is the intensity of the sound. The intensity is measured from the amplitude: we take the squared area of the waveform and look at the peak in proportion to the minimum and maximum points.

And then we have something very important, which is the timbre of the sound. The timbre is also known as the tone quality or the tone color. It's not quality in the sense of how clearly I hear the sound; it's the tone quality that comes from the overtones of different instruments layered on top of each other. For example, if I'm playing a C chord on a guitar at the same time someone plays a C chord on the piano, I want to be able to distinguish between those instruments, and this is something very hard for computers to do. Actually, if you think about it, our brain can do it pretty much instantly.
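
To make frequency, intensity, and timbre a bit more concrete, here is a minimal Python sketch that uses only the standard library. It is an illustration under stated assumptions, not the method used by the DJ software or by the models mentioned in the talk. The sample rate, the 260 Hz note (close to middle C, rounded so it fits the analysis window exactly), and the two overtone mixes labeled "guitar-like" and "piano-like" are all made-up values, not measurements of real instruments. Root mean square (RMS) stands in for intensity as one common choice, and a single note stands in for the chord to keep things short.

```python
import math

SAMPLE_RATE = 8000   # samples per second (assumed for illustration)
DURATION = 0.1       # seconds, so each tone has 800 samples


def sample_tone(fundamental_hz, harmonic_amps):
    """Sample a tone made of a fundamental plus overtones.

    harmonic_amps[k] is the amplitude of the (k + 1)-th harmonic.
    """
    n = int(SAMPLE_RATE * DURATION)
    return [
        sum(
            amp * math.sin(2 * math.pi * fundamental_hz * (k + 1) * i / SAMPLE_RATE)
            for k, amp in enumerate(harmonic_amps)
        )
        for i in range(n)
    ]


def rms(samples):
    """Root mean square of the amplitudes: one simple intensity measure."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def magnitude_at(samples, freq_hz):
    """Strength of one frequency in the waveform (a single DFT bin)."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * freq_hz * i / SAMPLE_RATE)
             for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
             for i, s in enumerate(samples))
    return 2 * math.hypot(re, im) / n


def dominant_frequency(samples, candidates_hz):
    """Return the candidate frequency that is most strongly present."""
    return max(candidates_hz, key=lambda f: magnitude_at(samples, f))


def overtone_profile(samples, fundamental_hz, n_harmonics=4):
    """Relative strength of each harmonic: a rough fingerprint of timbre."""
    return [
        round(magnitude_at(samples, fundamental_hz * (k + 1)), 3)
        for k in range(n_harmonics)
    ]


# Hypothetical overtone mixes: same note, different tone color.
NOTE_HZ = 260
guitar_like = sample_tone(NOTE_HZ, [1.0, 0.5, 0.3, 0.1])
piano_like = sample_tone(NOTE_HZ, [1.0, 0.2, 0.05, 0.4])

if __name__ == "__main__":
    candidates = range(10, 4000, 10)  # 10 Hz steps up to the Nyquist limit
    for name, wave in [("guitar-like", guitar_like), ("piano-like", piano_like)]:
        print(name,
              "| dominant frequency:", dominant_frequency(wave, candidates),
              "| RMS:", round(rms(wave), 3),
              "| overtones:", overtone_profile(wave, NOTE_HZ))
```

Both tones report the same dominant frequency, because they are the same note, while their overtone profiles differ. That difference is the timbre. In this toy case the two tones are analyzed separately, so telling them apart is easy; once both play at the same time, their overtones land on the same frequencies and add up, which hints at why separating instruments from a mixed track is hard for a computer.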