Let AI Be Your Docs


Unleash the power of AI in your documentation with this hands-on workshop. In this interactive session, I'll guide you through the process of creating a dynamic documentation site, supercharged with the intelligence of AI (ChatGPT).


Imagine a world where you don't have to sift through pages of documentation to find that elusive line of code. With this AI-powered solution, you'll get precise answers, succinct summaries, and relevant links for deeper exploration, all at your fingertips.


This workshop isn't just about learning; it's about doing. You'll get your hands dirty with some of the most sought-after technologies in the market today: Next.js 13.4 (App Router), Tailwind CSS, shadcn/ui (Radix UI), OpenAI, LangChain, and MongoDB Vector Search.

88 min
14 Dec, 2023

AI Generated Video Summary

AI is a revolutionary change that helps businesses solve real problems and increase productivity. The workshop covers the demand for intelligent apps, limitations of LLMs, and how to overcome them. It explores the tech stack, integrating GPT, and optimizing the user experience. MongoDB Atlas Search and Vector Search are used to store embeddings and enable semantic search. Prompt engineering allows customization of AI responses.

1. Introduction to AI and its Impact on Applications

Short description:

Welcome everyone. AI is not a fad, it's a revolutionary change that helps businesses solve real problems and increase productivity. In this workshop, we'll build AI into a React application, discuss the demand for intelligent apps, practical use cases, limitations of LLMs, and how to overcome them. We'll also cover the tech stack, integrating GPT, and optimizing the user experience. The workshop will start with an overview, followed by hands-on activities, including setting up the application, using MongoDB's Atlas Search, creating embeddings and a search index, and implementing RAG. We'll conclude with a Q&A session. AI is crucial for modern applications, driving user engagement, efficiency, and profitability. It's used in various industries, such as retail, healthcare, finance, and manufacturing. AI has evolved from analytics to batch machine learning, enabling predictions and informed business decisions based on historic data. Let's explore the power of AI in this workshop.

Welcome everyone. So what we're going to do first off is let me go ahead and share my screen. We are going to get started with a little bit of an introduction before we get into the hands-on. So AI is, or artificial intelligence, it's just a fad. Right? Well, actually, I don't think it's a fad. It's actually going to be here to stay and it's a revolutionary change. It helps businesses solve real problems and it's helping employees and individuals become more productive. Let's talk about why AI matters now more than ever and how AI can take your React applications to the next level.

So through this workshop we're going to build AI into a React application that only answers questions from the documentation that we provided, our custom data. So I'm Jesse Hall, a senior developer advocate at MongoDB. You might know me from my YouTube channel CodeStacker. So what we're going to do today, if you didn't get a chance to watch my talk at React Day Berlin, then this first part is going to be an overview of that talk, just so that we're on the same page. So we're going to talk about the demand for intelligent apps, practical use cases, limitations of LLMs, overcoming these limitations, the tech that we're going to use to build this app, and then how to integrate GPT, make it smarter, and optimize the user experience.

And so the introduction is going to be an overview. I personally hate slides, but I have to talk for a little bit just to make sure that we're on the same page, and then we're going to get to the hands-on. So the prerequisites, we're going to make sure everyone's up to speed locally. That shouldn't take more than 15 minutes. We're going to get the application set up. We're going to understand how we're going to use MongoDB's Atlas Search, which includes vector search, and that's how we're going to be actually making this app work. We're going to create our embeddings. We're going to create a search index, and then implement RAG. We're going to find out what RAG means in a bit, and then I'll leave some time at the end for Q&A. All right. So let's get right into it.

There's a huge demand for building intelligence into our applications in order to make these modern highly engaging applications and to make differentiating experiences for each of our users. And so you could use this for fraud detection, for chat bots, personalized recommendations, and so many other use cases. To compete and win, we need to make our applications smarter and surface insights faster. Smarter apps use AI-powered models to take action autonomously for the user, and the results are twofold. Firstly, our apps drive competitive advantage by deepening user engagement and satisfaction as they interact with your application. And secondly, our apps can unlock higher efficiency and profitability by making intelligent decisions faster on fresher, more accurate data.

And so, going back, almost every application going forward is going to use AI in some capacity. AI is not going to wait for anyone. So we have to stay competitive and build intelligence into our applications in order to gain those rich insights from our data. And so, AI is being used both to power the user-facing aspect, and the fresh data and insights that we gain from that are going to help power more efficient business decision models as well. And so there are many different use cases, and here are just a few. Retail, healthcare, finance, manufacturing. Now, these are very different use cases, but they're all unified in their critical need to work with the freshest data in order to achieve their objectives in real time. They all consist of AI-powered apps that drive the user-facing experience and then predictive insights that make use of that fresh data to automate and drive more efficient business processes. But how did we get to this stage of AI? Well, let's look at AI through the ages. First, we start out with analytics. In the early days of computing, applications primarily relied on analytics to make sense of data. This involved analyzing large data sets, extracting insights that could inform business decisions. And as computing power increased, it became easier to analyze these large data sets in less time. And so, this is where batch AI came in. So, as computing power continued to increase, the focus shifted towards machine learning. And traditional batch machine learning involves training models on historic data and using this to make predictions or inferences about future events. And we could possibly see how users might interact in the future. So the more data that we feed the model over time, the better it gets, the more we can tune it, and the more accurate the future predictions become. And so, as you can imagine, this is really powerful because if we can predict what's going to happen tomorrow, we can make really good business decisions today.

2. AI Evolution and Real-time Applications

Short description:

Batch AI analyzes historic data to make predictions, but it can't react to real-time events. Real-time AI trains models on live data for quick decision-making. Generative AI creates new content, transforming how we interact with technology.

So, batch AI, as the name implies, is usually run offline on a schedule. And so, it's analyzing historic data to make predictions about the future. But that's where the problem is with batch AI. It's working on historic data. It can't react to events that are happening quickly in real-time. Now, although, I mean, it's good for some industries, such as maybe finance and healthcare, where we can look at the history. But we need data on things that are happening now. So we can make those real-time decisions. And so, this is where real-time AI comes in.

Real-time AI represents a significant step forward from traditional AI. This approach involves training models on live data and using them to make predictions or inferences in real-time. And this is particularly useful in fraud detection, for instance, where decisions need to be made quickly based on what's happening right now. And I mean, what good is fraud detection if the person defrauding you has already gotten away with it, right?

And then, this finally brings us to generative AI. This represents the cutting edge. This approach involves training models to generate new content. And this could be images, text, music, video. It's not simply making predictions anymore. It's actually creating the future. And so, a fun fact about this slide here, all of these images were created using DALL·E. And so, over the years, we've seen AI evolve from analytics to real-time machine learning and now to generative AI. And these are not incremental changes. They're transformative. And they shape how we interact with technology every single day.

3. Introduction to RAG and Vectors

Short description:

GPTs are powerful but have limitations. They rely on a static knowledge base and can't access real-time proprietary data. To overcome this, we'll explore Retrieval Augmented Generation (RAG) and use vectors to represent complex data. Vectors enable semantic search and can be created using encoders. RAG leverages vectors to augment GPT models with real-time contextually relevant data.

So, let's zoom in a bit. We have something called generative pre-trained transformers, GPT. These large language models perform a variety of tasks from natural language processing to content generation and even some elements of common sense reasoning. These are the brains that are making our applications smarter.

But there's a catch to this. GPTs are incredible, but they are not perfect. One of the key limitations is their static knowledge base. And they only know what they are trained on. So, there are some integrations with some models now that can search the internet for newer information. But how do we know that the information that they're finding on the internet is accurate? They can also hallucinate very confidently, I might add. And so, how do we minimize this? They can't access or learn from real-time proprietary data, your specific data. And that's a big limitation. So, the need for real-time proprietary domain-specific data is why we can't rely on the LLMs as they are. This is especially true in the business context where up-to-date information can be a game-changer.

And so, what is the solution? How do we make these models adaptable, real-time, and more aligned with our specific needs? Well, this brings us to the focus of the workshop today. It's not merely about leveraging the power of GPTs in React, but it's about taking your React applications to the next level by making them intelligent and context-aware. So, we're going to explore how to augment React apps with smart capabilities using these large language models and boost those capabilities even further with Retrieval Augmented Generation, or RAG. So, we're not just integrating AI into React, we're optimizing it and making it smarter and more context-aware.

So, what's involved in Retrieval Augmented Generation? First up is vectors. What are vectors? We have to understand what this is. These are the building blocks that allow us to represent complex, multidimensional data in a format that is easier to manipulate and understand. So, in the simplest explanation, a vector is a numerical representation of data, and it's basically an array of numbers. These numbers are coordinates in an n-dimensional space, where n represents the length. So, however many numbers we have in this array is how many dimensions we have. We also may hear vectors referred to as vector embeddings or just embeddings.

Here's a real-life example of vectors in use. When you go to a store and you ask one of the workers where to find something, they might say, go to aisle 30, bay 15. That is a two-dimensional vector. You'll also notice in the stores that similar items are placed next to each other for ease of searching and finding. So, the light bulbs, for instance, are not scattered all around the store, they're strategically placed so that you can find them easily. Another example is games. Games use 2D and 3D coordinates to know where objects are in the game's world. With these coordinates, we can compute the proximity between objects to detect collisions, for instance. So, the same math is used to compute the similarity between vectors during vector search. If you're a Stargate fan, the gate address is made up of at least seven dimensions that are like vectors. To locate Stargates in other galaxies, you can add an eighth or a ninth dimension, just like you would add an area code or a country code to a phone number. And so, this shows how adding dimensions significantly increases the size of that virtual space in which our data is organized.
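To make that "same math" concrete, here is a minimal sketch in TypeScript of comparing two vectors with cosine similarity. The three-dimensional example vectors are made up purely for illustration; real embeddings have hundreds or thousands of dimensions.

```ts
// Cosine similarity: how closely two vectors point in the same direction (1 = identical direction).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let magA = 0;
  let magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}

// Two made-up "embeddings" that point in roughly the same direction...
const lightBulb = [0.9, 0.1, 0.3];
const lampBulb = [0.8, 0.2, 0.35];
// ...and one that points somewhere else entirely.
const gardenHose = [0.1, 0.9, 0.05];

console.log(cosineSimilarity(lightBulb, lampBulb));   // close to 1, very similar
console.log(cosineSimilarity(lightBulb, gardenHose)); // much lower, not similar
```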

So, again, what makes vectors so special? They enable semantic search. In simpler terms, they let us find information that is contextually relevant and not just a keyword search. And this data is not limited to just text. It can also be images, video, audio. These can all be converted into vectors. So, how do we go about creating these vectors? This is done through an encoder. The encoder defines how the information is organized in that virtual space. Now, there are different types of encoders that can organize these vectors in different ways depending on our use case. There are specific encoders for text, for audio, images, video, and so on. And many of the most popular encoders can be accessed through Hugging Face, OpenAI, and many others. And so, how do we tie all of this back to retrieval augmented generation? Well, RAG leverages vectors to pull in real-time contextually relevant data to augment the capabilities of the LLM. Vector search capabilities can augment the performance and accuracy of GPT models by providing a memory or a ground truth to reduce hallucinations, providing up-to-date information, and allowing us to access our proprietary private data. So, first, we take our custom data, whatever it is, and generate our embeddings using an embedding model.
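As a rough sketch (not the workshop's exact code) of what "generate our embeddings using an embedding model" looks like with OpenAI's Node SDK; text-embedding-ada-002 is the encoder used later in this workshop, and the sample input text is made up:

```ts
import OpenAI from "openai";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

async function embed(text: string): Promise<number[]> {
  const response = await openai.embeddings.create({
    model: "text-embedding-ada-002",
    input: text,
  });
  // ada-002 returns an array of 1,536 numbers for each input string.
  return response.data[0].embedding;
}

embed("How do I install the Fancy Widget library?").then((vector) => {
  console.log(vector.length); // 1536
});
```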

4. Storing Embeddings and Tech Stack

Short description:

We store embeddings in a vector database. User queries are vectorized and used to search for relevant information. A React app with RAG and vector embeddings is adaptable and context-aware. The tech stack includes Next.js, OpenAI, LangChain, the Vercel AI SDK, and MongoDB Vector Search.

And then, we store those embeddings in a vector database. So, again, this data, it could be documents. It could be documentation, blog articles, videos, images, PDFs, anything that we have. And through here, we create those embeddings. And now, you don't have to use LangChain to facilitate all of this, but it is very helpful and we're going to talk more about that later.

So, those embeddings are created. They're stored now in our vector database. And now, we're able to accept user queries to find relevant information within our custom data. Now, to do this, we send the user's natural language query to the embedding model to vectorize the query as well. And then, we use vector search to find information that is closely related, semantically related to the user's query, and we return those results. And we can do anything that we want with these results once we have them. We can... It looks like someone's waiting in the waiting room. One second. Here we go. So, we can summarize and answer their questions based on our custom data. We can respond with links to specific documentation pages and so on.
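In code, the query side of that flow looks roughly like the sketch below: embed the user's question and let Atlas Vector Search return the closest stored chunks. The database, collection, index, and environment variable names are assumptions that mirror the setup used later in this workshop, not the exact route code.

```ts
import { MongoClient } from "mongodb";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
import { MongoDBAtlasVectorSearch } from "langchain/vectorstores/mongodb_atlas";

const client = new MongoClient(process.env.MONGODB_ATLAS_URI!);

async function findRelevantChunks(question: string) {
  const collection = client.db("docs").collection("embeddings");

  // LangChain wires the embedding model and the Atlas collection together.
  const vectorStore = new MongoDBAtlasVectorSearch(new OpenAIEmbeddings(), {
    collection,
    indexName: "default",
    textKey: "text",
    embeddingKey: "embedding",
  });

  // Return the 3 chunks whose embeddings are most similar to the question.
  const results = await vectorStore.similaritySearch(question, 3);
  return results.map((doc) => doc.pageContent);
}

findRelevantChunks("How do I configure the widget theme?").then(console.log);
```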

And so, imagine your React app has an intelligent chatbot. And with RAG and vector embeddings, this chatbot can pull in real-time data, let's say the latest product inventory and it knows what's in stock and what's not in stock. And it can offer the customer some other products during their interaction. With RAG and vector embeddings, your React app isn't just smart, it's adaptable with real-time and incredibly context-aware. And so, how are we going to go about this? Let's take a look at the tech stack that we're going to use.

The first thing that we're going to use is Next.js. And of course, we're going to use the App Router. Next.js and Vercel just make building AI apps and working with AI technology so easy. Next, OpenAI. They have been spearheading advancements in language models like GPT-3.5 Turbo, 4, and so on. And so, there are many other language models out there, but today we're going to focus on OpenAI. We're going to use them for the embedding and for the generating of the responses. LangChain is another crucial part of our tech stack. It helps us in data preprocessing, routing data to the proper storage, and making the AI part of our app more efficient and just easier to write. And then there is the Vercel AI SDK. This is an open source library designed specifically for building conversational streaming UIs. So, any time you have a chatbot, it reduces so much boilerplate code that you would have to write otherwise. And last, but definitely not least, we're going to store all of our vector embeddings in MongoDB, and we're going to use MongoDB Vector Search to find the similarity between vectors. And this is a game changer for AI applications, enabling us to provide more contextual and meaningful user experiences by storing our vector embeddings directly in our application database. So, instead of bolting on another external service, and it's not just vector search, but MongoDB Atlas itself brings a new level of power to generative AI applications, and we'll take a look at that. And each of these technologies in this stack was chosen for a specific reason, and when these are combined, they enable us to build smarter, more powerful React applications.
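To give a sense of the boilerplate the Vercel AI SDK removes, here is a minimal sketch of its useChat hook, which is roughly what the starter's chat component does for us: it manages the message state, the input field, and the streaming response coming back from the /api/chat route.

```tsx
"use client";
import { useChat } from "ai/react";

export default function Chat() {
  // useChat handles message history, input state, and streaming updates.
  const { messages, input, handleInputChange, handleSubmit } = useChat();

  return (
    <div>
      {messages.map((m) => (
        <p key={m.id}>
          {m.role === "user" ? "You: " : "AI: "}
          {m.content}
        </p>
      ))}
      <form onSubmit={handleSubmit}>
        <input
          value={input}
          onChange={handleInputChange}
          placeholder="Say something..."
        />
      </form>
    </div>
  );
}
```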

Okay, so it's time for me to stop talking for a bit. There's another person in the waiting room. Let me admit them. And this is where we're going to start out. So, let everyone locally go to mdb.link slash vs-demo. So this is the workshop page, and let's enter the workshop. Now, this is a link that you can share with anyone that you'd like. This is actually a self-paced workshop. It's designed to be a self-paced workshop. It's better when you have someone walking you through it. So some of the things in here, we're going to kind of skip over, because I've already talked about them, but we're going to use this as our guide for this workshop. So, some prerequisites.

5. Setting Up MongoDB Atlas Account

Short description:

We need a MongoDB Atlas account, an OpenAI account, Node.js 18+, and basic Git knowledge. MongoDB Atlas is the cloud-hosted version with additional capabilities like App Services, serverless functions, and Atlas Search. It's deployed on the major clouds and offers a free plan for testing and prototyping.

We are going to need a MongoDB Atlas account, an OpenAI account, and Node.js version 18 or later locally. You also just need some basic Git knowledge. And then, let's move on to the next part.

So, what is... The first thing that we're going to do is make sure that we're all up to speed and we have an Atlas account. So, I'm sure most of you are probably familiar with MongoDB, the open source database. MongoDB Atlas is the cloud-hosted version, and it adds a ton of capabilities that the local version does not have. There's app services for fully managed backend for your web applications. There's serverless functions and triggers and a whole bunch of other stuff included with that. There's Atlas Search. There's Data Federation. You can store stuff in S3, Google Cloud Storage, and access those in MongoDB. But what we're going to look at, what we're going to use today is the Atlas Vector Search. And that is going to allow us to find the similarities between our vectors during our query time. And so, where is MongoDB Atlas deployed? It's on all of the major clouds. AWS, Azure, Google Cloud, and many, many regions all around the world. You can also, how much does it cost? Well, today, we're going to, of course, use the free plan. There is a very generous free cluster. It's free forever. It does not require a credit card. So that's what we're going to use today. It is perfect for testing and for prototyping. I would never recommend it for production, but it is completely free.

6. Creating Atlas Account and Cluster

Short description:

To get started, create your Atlas account and follow the guide provided. Verify your email address and log in. Once you have your Atlas account, click the green check under reactions. Next, create a cluster by selecting the free M0 cluster and choosing a provider and region. Create a username and password, whitelist your IP address, and ensure it's secure. Now, let's move on to the OpenAI part.

All right, so first thing we're going to do is create your Atlas account if you don't already have one. So just follow this guide here. There's a link here to the website. You're going to fill it out, or you can log in with Google. That makes it easier to fill out the information. Verify your email address. Last time I went through this workshop, it took about a minute for that verification email to come in, and some people were a little quick on the draw and hit resend too quickly, and it took us a little bit. So just be a little bit patient on that email. And I'll kind of wait here for that spot.

Now at the bottom of Zoom in the menu bar, you're going to see something that says... where is it? Reactions. Now under Reactions, there's a whole bunch of things you can do. You can wave and thumbs up and all that sort of stuff. There's a green check, and I'm going to put it on mine. When you're good at this stage and you have your Atlas account, hit that green check under Reactions. That way I know that we're all ready to go.

After we get this, we're going to work on creating a cluster. Some green check marks, good. I'm just going to go to this next phase here. If you're ready to go to this next phase, feel free. Creating a cluster is pretty simple. As soon as you get that email and you get logged in, just kind of go through the wizard that pops up. Create your new deployment. Be sure to select the free M0 cluster. Choose whatever provider you're comfortable with, and a region that's closest to you would be great. For the name, you can leave the default. It doesn't matter what you name it. And then hit create. Any provider, it doesn't matter which one you choose, whichever one you're comfortable with. AWS, Google Cloud, Azure, you can choose any provider. And then the region, whatever is closest to you. So it really doesn't matter which one you choose. As long as you've got the M0 cluster selected, it's all free. Some providers have more regions in different areas. So you might go through those just to find a region that's closest to you. It'll just be faster. After you've chosen your provider and your region, you'll be prompted to create a user. You can create a username and a password, whatever you'd like to choose there. Just save it because we'll need that later. Then you'll need to whitelist your IP address. That should be done by default. You should see your IP address in there, whitelisted. There is an option to allow access from anywhere. We don't recommend this, especially in production, because this will open up your database to anyone in the world. So ideally, it will just be opened up to your specific IP address. That makes it super secure because not only do they have to be on your network, they also have to know your username and password to access it.

All right. I'm going to move on now to the OpenAI part, unless anyone is still having issues. I think there's a way for me to clear the reactions, is there not? There used to be a way in Zoom to clear the reactions. All right. Just go ahead and remove your check marks for me, please, so we can move on to the next part.

7. Setting Up OpenAI and Application

Short description:

The next part is getting OpenAI set up. OpenAI is the creator of amazing large language models like GPT-3, 3.5, 4, 4 Vision, and more. Setting up a brand new account will provide enough credits to complete this workshop. API calls are affordable, and if you've exhausted your credits, this workshop won't cost more than five cents. Create an account on OpenAI.com and follow the steps. If you encounter any issues, I'll share an API key. Once signed up, create a new secret API key and save it securely. We'll then move on to setting up the application. We'll create a documentation application with a chat bot using Next.js, Tailwind, OpenAI, LangChain, the Vercel AI SDK, and MongoDB. Clone the GitHub repo, navigate to the directory, and run npm install. If you encounter any permission issues, ensure you have write permissions and are in the correct directory on your Mac.

All right. The next part is getting OpenAI set up. All right. So what is OpenAI? I'm pretty sure most of us are familiar with them. They are the creators of some of the amazing large language models like GPT-3, 3.5, 4, 4 Vision, and so on.

Now, how much does OpenAI cost? Well, if you set up a brand new account, you're going to receive credits, and those credits will be enough to complete this workshop. The API calls are a fraction of a cent. If you've already exhausted your credits, this workshop is not going to cost more than five cents. So you can look at the pricing details here.

So we'll go through here and create an account. So you go to OpenAI.com. So in the last time I ran this workshop, we had a bit of an issue here, because a lot of people already had accounts and they've already exhausted their credits. So you can create a second account, but you have to have a second email address, and you have to have a second phone number, because they've added phone number validation, and you only get credits if both are unique. So if we're going through here and creating accounts and we have any issues, then I'll share an API key with you so that you can use that, because I don't want you to have to put a credit card in and all that stuff for five cents. So let me know if you're creating a brand new account and you've got credits. Great. If you can't get to that spot, ideally I would love you to be able to go through this so that you can see the steps. If not, then I'll share an API key with you. So once you've signed up, you go to API keys, you'll create a new secret, and then you'll want to copy and save this somewhere safe, because you'll never see it again. But it's okay. You can always delete it and create a new one, so it's not a big deal. Green checkmarks, whenever you get to this phase and you've got your API key. We'll move on to the application setup.

So what are we going to do here? We're going to create a documentation application with a chat bot that only answers questions from the information that we've provided it. So the base of this application uses a starter kit from Vercel that includes Next.js, Tailwind, OpenAI, LangChain, and the Vercel AI SDK. And then to that I've added the MongoDB package. So you can see that starter here, but you don't need to go there. It's just for information. The actual application is on the next page. So this is where the GitHub repo is. So you can git clone this, cd into that directory, and then do an npm install. However you like to normally do this, I'm going to go through these same steps. I'm going to clone the repo and do an npm install. So over here in the GitHub repo, this is the way I like to do it. I just go here and grab the HTTPS, and then where is my VS Code? I've got too many windows open. There it is. VS Code, I go over to source control, clone a repo, and then choose where I want to put it. I'll just put it in my desktop and select open. I've got the repo here. Open up the terminal and run npm install. Oh, yes, sure, I would love to update right now in the middle of a workshop. Green check marks again. Whenever you've completed this, looks like many of you have completed it before me. That's great. All right, npm install. Permission denied when cloning. Okay, so that's going to be a local issue. Permission denied: where are you cloning it to? Assuming you're probably on a Mac. Just be sure that you have permission to write to whatever directory you're trying to clone into, and be sure you're in your user directory on Mac.

8. Configuring Application and API Keys

Short description:

Create a blank folder and clone the repository. Configure the application by renaming the .env.example file to .env and adding your OpenAI API key and MongoDB Atlas connection string. Retrieve the Atlas connection string from the connect button under drivers. Copy the string with your user ID and password, and paste it into the .env file. Save and rename the file to .env. If there are any issues, the API key will be shared.

Create a blank folder there or something and clone into there. We'll give this another couple of minutes and see if we can help.

Let's go ahead and configure the application. If you could clear your check marks. So there's a .env.example file that you'll find in that repo. We need to rename that to .env and add in some keys here. So you've got your OpenAI API key that you just got and your MongoDB Atlas connection string.

So to get the Atlas connection string, that is going to be under connect and then drivers, and then this is what it looks like. So let me see. Let's go. I'll walk you through that on my page. So here I'm under overview. You could find it from overview. You can also find it under databases. Either way. So let's go to overview, and this connect button is what you're looking for. So connect. These all basically do the same thing. But if we go to drivers, then this string right here is what you want. Everyone's string is going to be a little bit different. It contains your user ID and then a place for your password. So let me copy this and then let's go back over to VS Code. And then the .env file. So I'm going to paste that right here. And some strange formatting. What's going on in my VS Code? It's weird. All right. So MongoDB is my username. And then my password, I think I said it was MongoDB as well. And then, you know, yours is going to be different with a different cluster, etc. But just be sure that you have your username there and your password there. And that is good to go. I'm not worried about you knowing this because you're not on my internet. So you won't have the same IP address. So you won't be able to access it. And then the OpenAI API key. Let me grab that. And put that here as well. And then save the file. Actually, we need to rename it. So let me go over here and rename that and just make it .env. There we go. Now the formatting looks better. That's what it was, because it was a .example file. And then, again, like I said, if anyone has issues on the next stage, then I'll share this API key with you. Let me see. Go back. Make sure. Okay. So, .example.
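For reference, the finished .env ends up looking roughly like this. The exact variable names come from the repo's .env.example, so double-check them there; your API key and connection string will of course be your own.

```
OPENAI_API_KEY="sk-..."
MONGODB_ATLAS_URI="mongodb+srv://<username>:<password>@<your-cluster>.mongodb.net/?retryWrites=true&w=majority"
```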

9. Testing the App and Setting Up OpenAI API Key

Short description:

Ensure your connection string is correctly entered in the .env file. Test the app by running 'npm run dev' and open it on localhost:3000. Verify the connection to OpenAI and the default LLM. Ask a question like you would in Chat GPT. If you encounter any errors, check the console in VS code. If you exceed your quota, enter the provided OpenAI API key in the .env file. Restart the server and try again. The main goal is to understand how to obtain and set up the OpenAI API key.

Yeah. Into your connection string. And then rename it to .env. Okay. Let me know if anyone is having any issues with that. Finding your connection string, et cetera, should be good.

Next phase, we're going to test the app. This is where we'll find if there are any issues. Oh, yeah. One other thing. I'll go back here. So here in the connection string that you got from MongoDB, be sure that you replace the brackets as well. Do not leave the brackets in. So it should just be username and the colon and then your password. So double-check that. All right. Let's move on to testing the app. So pretty simple, npm run dev. So let's go back to the app in VS Code, open the terminal, and run npm run dev. And you should be able to open it up on localhost 3000. And at this point, we should have a connection to OpenAI, to the default LLM. And you should be able to ask questions, basically just exactly like ChatGPT. So this is what you should see at the bottom. Say something. I'm gonna zoom in a bit, and then let's just say hi. See what happens. Hello, how can I assist you today? So we're getting a connection here to ChatGPT. This is where we're probably gonna run into an error somewhere. So let me know who is having problems. If I go back into VS Code, and I check my console, this is where you'll see any errors, any issues. The reply won't come back. Okay, so go back to your terminal. And you should see some sort of an error or something happening here in your terminal. You exceeded, okay, you exceeded your quota. Yep, insufficient quota. Yep. Okay, so everyone's quota is expired. All right. So let's see, where is that key? Okay, so enter that as your OpenAI API key in your .env file. If you're having a quota issue, you can use that, and it should work. Go ahead and, yes, if you're changing an environment variable, go ahead and kill the server and restart it. And then try again. Awesome, glad it's working for everybody. The main point was being able to see, in OpenAI, you know, where to get the key and how to set it up and all of that, so at least you were able to go through there.

10. Overview of MongoDB Atlas Search and Vector Search

Short description:

MongoDB Atlas Search is a full text search engine built directly into MongoDB Atlas. It provides features like keyword searching, scoring, language support, auto complete, and highlighting. Unlike other search engines, Atlas Search is built on Lucene and integrated into the MongoDB Query API, making it faster and eliminating the need for syncing. It also supports vector search, which allows for semantic search based on the meaning of the data. We'll be using cosine search, the most efficient method for textual search. To create vectors, we'll vectorize and create embeddings. Let's explore the package.json file in our cloned repo to see the dependencies.

And some were even having issues, and the last time I ran this workshop, they were trying to add their credit cards, and there was like a $5 minimum or something like that. I've never run into that before. So it's kind of strange. So definitely didn't want to put all of you through that for a few cents. I'll be sure that MongoDB reimburses me. It doesn't look like anyone's having any issues. If you are, please let me know. But what I'm gonna do for now is move on to the next step, which is more of an explanation, which I've already sort of gone through. So let me just give you a brief overview of MongoDB Atlas Search. As a whole, it is a full text search engine. So similar to Elasticsearch or other third-party search engines that you can bolt on to your existing database, but this is built right into MongoDB Atlas. So it's directly on your database, it's not an extra feature. And so it helps us with full text search, so keyword searching, scoring, there's a ton of language support, auto complete, highlighting, all kinds of stuff there that you would expect in a search engine.

And then why right there in Atlas? Why is it a great thing? Well, again, like we talked about, Elasticsearch, Solr is another one. Those are built on Lucene, and normally you would have Elasticsearch or something on the side, bolted on to your existing database, and you have to worry about syncing back and forth, CI/CD pipelines, etc. So Atlas Search is also built on Lucene, but it's directly on the database. You don't have to worry about any of the back and forth, syncing, etc. So it's much faster as well, because there isn't that line in between. And it integrates very well into the MongoDB Query API; it's just a $search operator to perform those search operations. And so how does that work? You can do simple text search, full text search. And then, of course, you have to create indexes to tokenize, to create these searches, and make these searches faster. And that is the basis of what we're gonna use today, but what is built on top of that is vector search. So vector search is not just a keyword search, it's a semantic search. So it allows us to see meaning in data. So there's an example here, if you search for how to make a cake, a keyword search would look for how to make a cake, those words. But in a semantic search, it would actually translate that to multiple different things. It could be how to bake a cake, how to make a pie, because those things are similar to each other. Even though those aren't the exact words that you typed in, it still could return those results because they are so similar. So that's the difference between keyword search and semantic search. It's all about meaning.
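For a sense of what that $search operator looks like in practice, here is a rough sketch of a plain keyword query through the normal aggregation pipeline. The database, collection, index, and field names here are made up purely for illustration.

```ts
import { MongoClient } from "mongodb";

const client = new MongoClient(process.env.MONGODB_ATLAS_URI!);

async function keywordSearch() {
  // A full-text $search stage followed by a limit, like any other aggregation.
  const results = await client
    .db("cookbook")
    .collection("recipes")
    .aggregate([
      {
        $search: {
          index: "default",
          text: { query: "how to make a cake", path: "title" },
        },
      },
      { $limit: 5 },
    ])
    .toArray();

  console.log(results);
}

keywordSearch();
```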

And so, again, I'm kinda skipping over a lot of this stuff because I've already covered it, and this is meant to be self-paced if I weren't here talking you through it. So this goes into vectors, why do we need vectors? This is a graph showing you that we can find vectors that are close to each other. There are different types of vector searches, actually. So if we're looking for something here, one is gonna be the closest. Euclidean search is one type. And it finds vectors that are closest to the search vector that you're looking for. So for example, one, two, three, four, and five, these are the vectors that we have, our embeddings that we've created in our database. The little search icon here is the user's query. The user queries something, and the closest things are one and four. So that's what gets returned in a vector search using the Euclidean search method. The cosine search method is a little different because it doesn't include anything negative or behind it. It uses everything positive. So one and three are in this top right quadrant, which means they're more similar to each other than four is. Even though four is physically closer, four is not in the same category. So a cosine search would actually return one and three. So there are different types of searches, there are different ways to find vectors, and it all depends on your use case. So for today, we're gonna be using cosine because it is actually the most efficient way to return vectors in the type of textual search that we're doing. So how do we create vectors? We kind of talked about this, we have to vectorize and create those embeddings. And so that is what we're gonna do right now. So we are going to look in our repo that we cloned, and let's take a look at the package.json. We can see the different dependencies that we have here.

11. Setting Up Embeddings and MongoDB Atlas

Short description:

We have dev, build, start, and embed scripts. The create-embeddings.mjs node app runs in the root directory. It uses the LangChain text splitter to split documents into chunks. The LangChain MongoDB Atlas vector search method and LangChain OpenAI embeddings method are utilized. The MongoDB client is set up to connect to Atlas and look for a database named docs and a collection named embeddings. The demo uses fake docs from a technology company. The documents are grabbed, their contents logged, and then vectorized using the recursive character text splitter from LangChain. The output is the text used to create the embeddings using the MongoDB Atlas vector search and OpenAI embeddings call.

We kind of talked about those, but the scripts, we have a dev script, a build script, a start script, and an embed script. So what this is gonna do is it's gonna run a node app that I've created called create-embeddings.mjs. So if we go here in the root directory, we'll find that file.

So in this file, it's basically just a Node.js app. So we are bringing in the file system from Node.js. We're using the LangChain text splitter because we need to split our documents into chunks. We can't put them all in, we can't create an embedding on the entire thing because it's just too big. So we're gonna split our documents into chunks. We're going to use the LangChain MongoDB Atlas vector search method. So this is a direct integration with LangChain through MongoDB. We're also gonna use the LangChain OpenAI embeddings method. We'll bring in the MongoDB client and dotenv to bring in our secrets.

So the first thing we do here is we set up our MongoDB client. So this is going to allow us to connect to Atlas using our Atlas connection string that we've already put in our env file. We're going to look for a database named docs and a collection named embeddings. Now those are not in any of your databases so far. So far we don't have anything there. But this is gonna automatically be created for us when we run this file. So we're going to have a database named docs and a collection named embeddings. And we are then going to use that collection a little bit further down.

So in the root of the repo here, we have a folder called underscore workshop assets. And in there there's some fake docs for a fake technology company. And we're gonna use those for this demo. So we're gonna grab those files. We're gonna console log the file names. We're gonna loop through the files. And then we are going to grab... well, I clicked on the wrong thing. There we go. We're gonna grab the contents of those files. And then just console log that we're about to vectorize that file. All right, we're gonna send that through our recursive character text splitter from LangChain. What that does is, again, it takes a long document and splits it into small chunks that we can grab context from. And this is something that is use case specific. You can define the chunk size and the chunk overlap. So the more they overlap, the more context you're gonna have. Too much overlap could be a bad thing. Too small chunks could be a bad thing. Too large of chunks could be a bad thing. It all depends on use case. So these are numbers where there's no magic number. You have to kind of play around with this depending on your use case. Then we're gonna just grab the output of those documents, or actually the text from those documents, to create the embeddings. So this is going to use the MongoDB Atlas vector search. We're gonna create an OpenAI embeddings call to OpenAI. We're gonna tell it which collection, index name, text key, and embedding key. And we're gonna look at what all this means in just a bit.
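Condensed down, the script described above does roughly this. This is a sketch rather than the repo's exact file; the docs folder path, chunk size, and overlap are illustrative, and the database, collection, index, and field names mirror the ones used in the rest of the workshop.

```ts
import fs from "fs";
import { MongoClient } from "mongodb";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
import { MongoDBAtlasVectorSearch } from "langchain/vectorstores/mongodb_atlas";
import "dotenv/config";

const client = new MongoClient(process.env.MONGODB_ATLAS_URI!);
const collection = client.db("docs").collection("embeddings");

async function run() {
  // Chunk size and overlap are use-case specific; these numbers are illustrative.
  const splitter = new RecursiveCharacterTextSplitter({
    chunkSize: 500,
    chunkOverlap: 50,
  });

  const docsDir = "_workshop_assets/docs"; // hypothetical path to the fake docs
  for (const file of fs.readdirSync(docsDir)) {
    console.log(`Vectorizing ${file}...`);
    const text = fs.readFileSync(`${docsDir}/${file}`, "utf8");
    const chunks = await splitter.createDocuments([text]);

    // Send each chunk to OpenAI for embedding and store text + vector in Atlas.
    await MongoDBAtlasVectorSearch.fromDocuments(chunks, new OpenAIEmbeddings(), {
      collection,
      indexName: "default",      // the Atlas Search index created later
      textKey: "text",           // field holding the original chunk
      embeddingKey: "embedding", // field holding the 1,536-number vector
    });
  }

  console.log("Closing connection");
  await client.close();
}

run();
```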

12. Running the Embedding Script and Troubleshooting

Short description:

This script takes our documents, cuts them into chunks, sends them to OpenAI, and retrieves the embeddings stored in MongoDB. Run the script once using 'NPM run embed' in the terminal. If you encounter an error, it's likely due to an authentication issue. It's recommended to use your own OpenAI API key. If the script is not working for anyone, it may be due to hitting a limit. Check for any MongoDB connection issues, such as missing hostname, domain name, or TLD in the URI. Verify your MongoDB database and connection settings. If necessary, try a different connection string.

And then just console log that we're closing the connection. So basically, in a nutshell, this takes our documents, cuts them up into chunks, and sends them to OpenAI. OpenAI sends back the embeddings, which we store in MongoDB. I guess I could have started with that. That's a much easier explanation.

Okay, so what we're going to do is run this script. So in your terminal, you should be able to run it. Now, only run this once, because this is not like a fully tested app. If you run it more than once, you're gonna just create a bunch of extra data in your database that you don't need. So npm run embed is what you want to run. So let me go do that. npm run embed. All right, so it grabbed all of our documents. It's vectorizing, it's looping through each one. Looks like it worked for Tomas; normally it doesn't take this long. I'm wondering if all using the same API key might be a bad thing. Is it not working for anyone else? I mean, I'm not sure. Yeah, I got an issue. It threw an error. Yeah, okay, error, bad. Auth, auth failed, okay. Well, at least we're all getting the same error. Let's see what this means. But it worked for Tomas. It did. It worked for Kira. Are you all using my OpenAI key or your own? I'm curious. Using mine, okay, well it worked for you all, that's great. It might just be a waiting game because we might have hit some limit. Okay, all right, let's see, so what happened here? Server selection timed out, okay, so why did it time out? Okay, so my error here looks like a MongoDB connection issue. Mongo URI must include hostname, domain name, and TLD. Yeah, this looks like a MongoDB error that I'm having here. Server description, all right, let me double check. Yeah, all right, let me double check my... well, first of all, go to MongoDB. And I'm gonna go to database and see if anything happened. Looks like we hit something. We got some somethings. I'm gonna go to browse collections and see if anything got written. Yeah, so I have no data. So nothing got written, but we did try to make a connection. So let me... Database access, MongoDB. MongoDB, I'm pretty sure I have the right password. And make sure my IP address, okay. I think my IP address changed. Or did it not? Yeah, that should be. That's fine. All right, so let me go back to databases, connect, and I'm gonna grab this more simple connection string and see if it's any different, see if it works.

13. Troubleshooting Database Connection

Short description:

Go to your env file and replace the password. Check the IP address and database connection. Delete and add the user in the database access. Wait for updates to deploy and try again. Allow access from anywhere if necessary. Check the database for the collections and verify the information. The database contains split documents with embeddings from OpenAI. Each document has an ID, text field, embedding field, and location information. Ensure others have resolved their issues before proceeding.

Go over to my .env file. And replace that, make sure I put in my password. And let's save that and go back to the terminal and try again. Yeah, see, I'm getting the response from OpenAI, but I'm not connecting to my database. That's where I'm getting the timeout. That's odd, okay, let me... the joys of live coding, there's always an issue. I'm gonna double check something on my end here. Yeah, I got that same error, okay. Bonus points to whoever solves it first. Yeah, yeah, I checked my IP address, it is my current one. So that should be good. Yeah, I did double check. Let's just go... All right, I'm assuming others are still having issues as well. What I'm gonna do is just go to my database access, just delete the user and add the user back, MongoDB, MongoDB. And I want this to be read and write to any database, and add. So that should be good, and then network access. Let's just delete that, and add my current IP address, yeah, that's the same one. We'll wait for these updates to deploy, shouldn't take more than just a few seconds. And if I go to my database, see, I'm trying to make a connection. Just make sure to browse collections, yep, there's still nothing there. Okay, this should all still be correct. Still not, okay, are others still having the same issue, I'm curious. Yeah, I could add 0.0.0.0, but it shouldn't be necessary, because my IP address is on there. So it should allow me right now to do this. Yeah, same issue. All right, I'll just go back and give it a shot. I mean, allow access from anywhere. I mean, if that works, then I'm gonna be upset, because that shouldn't matter. I'll wait till this says active. So what it's doing here is, whenever you set up one of these clusters, it's actually a three node cluster. So there are three servers running in failover, and one is a primary, and two are the backups. And so what it's doing is it's propagating this change to all three of those servers. All right, active. Let's go back, and this should all be good. Let's see, I'm running it again. Come on. Okay, it worked. Gene, try 0.0.0.0 and give it a shot. That should not have caused this issue. I'm upset. Okay, so now, if I go to database, under browse collections, we should now see our information. Awesome, so I have a docs database and an embeddings collection, and in here, I have all of the split documents with their embeddings that came from OpenAI. So I have a few fields here. We have an underscore id, which is a default id that gets added to everything in a MongoDB collection. We have the text field, which is the original text that we split into chunks and sent to OpenAI, and then we have the embedding field, which is the array of 1,536 numbers that came back to us from OpenAI. If I expand that down, we'll see this is what a vector looks like, a vector embedding, so it's 1,536 numbers in an array. And then under location, or loc, is the location of where this specific chunk came from in the file. So we have the lines from and to. All right, let me give this one more minute and make sure, if anyone else is still having issues, let us know.

14. Creating the Vector Search Index in MongoDB

Short description:

We successfully created the vector search index in MongoDB. To do this, go to the database deployments tab and click on Atlas Search. Choose the JSON editor and enter the necessary configuration for the index. Specify the mappings and fields, including the name and dimensions of the embeddings. Select the similarity type as cosine and the search type as knnVector. Finally, define the path to the embeddings and create the index. Keep in mind that the UI may have changed, so adapt accordingly.

It works now, awesome. All right, well, I'm glad it's working for everyone. Let's go back up here. So we ran that, we vectorized, awesome, awesome, looks like it's working. Okay, and then we checked it out, we looked at the embeddings, we saw what it actually did, the results, awesome.

All right, so the next thing that we need to do is create our vector search index in MongoDB. This is the key thing that is going to allow us to search those vectors in MongoDB. So on your database deployments tab, you should see at the bottom right here, Atlas Search and then create an index. There are a few different ways to do this, but this is kind of the easiest way. Create index, you'll click that and then you'll click Create Search Index on the next page. And then choose the JSON editor and click Next, and then you'll need this. You can copy this from this page here, but this is the JSON configuration for this index. So I'm gonna walk through this myself as well, and I'll explain it. So under Databases, bottom right, create an index and then click the button Create Search Index. I'm going to choose Atlas Vector, so this is... the UI has changed again. I'm gonna have to update my workshop. All right, Atlas Vector Search, JSON Editor, that's what you wanna choose. Next. For the index name, I think I have it set to default, cuz that's what it used to be, it's different now. So change your index name to default. If it's something else, it's not an issue, we can change it somewhere else, I'll show you where. But I'm gonna name mine default. On the left side, we're gonna choose the docs database and then the embeddings collection. We want to say where we want to create this index. And then I'm just gonna highlight this whole thing here and then hit paste. So what we're looking at is our mappings. What do we want? Someone asked to show the path to the JSON Editor. Yeah, so we went to Databases, create an index, and then choose Vector Search, JSON Editor. I'll go back in just a second. In this JSON schema that we're putting in here, we have our mappings, we're telling it it's going to be dynamic, and then this is the important part, the fields. All right, so remember, we just created, in this embeddings collection, a bunch of documents, and they have a field called embedding. So that is where our vector embeddings live. And so we're telling it in the index where that is as well. So this right here could be whatever you called it. It could be document embeddings, or whatever you called that field. That's what you would put here. And then we're telling it the dimensions, because every encoder is different; we used the OpenAI encoder, and it has 1,536 dimensions. So depending on the encoder that you used, you might want to change this. For similarity, we're going to use cosine. Remember, we talked about there being different types. There's Euclidean, there's cosine, there's several others. And then for type, it's knnVector. KNN stands for k-nearest neighbors, and that is the type of vector search that we're going to use. So I got all that selected and chose that, and... "Please define the type property in your index structure." Let's see. I thought I did. All right. The joys of when they change the user interface without telling you. Type is vector, path, I guess path is going to be embeddings. All right, this is going to be totally off the cuff, different than what it was last time I did this.
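For reference, the classic JSON editor definition described above looks like this; the field name, number of dimensions, and similarity have to match your embeddings (the newer Vector Search editor expresses the same thing with a fields array and a "vector" type, as comes up below):

```json
{
  "mappings": {
    "dynamic": true,
    "fields": {
      "embedding": {
        "type": "knnVector",
        "dimensions": 1536,
        "similarity": "cosine"
      }
    }
  }
}
```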

15. Creating the Search Index

Short description:

We're going to create a search index using the vector search and vector embeddings types. Let's give it a try and see if it works. If anyone needs assistance, I'll provide the necessary information. Once the search index is created, we can proceed with the workshop. Any issues so far?

Embeddings, number of dimensions, 1536. Similarity, we're going to use cosine. All right, so if you go back and choose the conventional JSON editor, you'll be able to click Next, yep. Yep, I think you're right here. I think either way that you do it, it should work just fine. They've changed this, and I'll have to give them some feedback on this, because this is a type. Okay, type is going to be vector search, and then type again is vector embeddings. All right, let's give this a shot and see if it works. Let me copy this, and I'm going to paste this in the chat, in case anybody needs that. And let me double check embeddings, that's what it was supposed to be, right? What the field is called. The field is... no, no S at the end, no S, embedding. I got that wrong. Let me go back. The path is embedding. All right, next. Create the search index, and close that. It shouldn't take more than a couple of seconds, or maybe 30 seconds or so, for this index to be created. While it's doing that, let me check in with everybody. Again, I'll run through how we create a search index. So, the last time I did this workshop, there was not a third option. There were just these two options. And so, JSON Editor, what's in the workshop walkthrough would probably still work in here, I would assume. The new editor is where I had to fill it out a little differently. Anyone having any issues at this point?

16. Updating Vector Search and Injecting Functionality

Short description:

We're updating the vector search route and creating a new route. In the vector search route, we're using the MongoDB Atlas vector search method to create embeddings for the user query and compare them with the vectors in the database. The closest results are returned based on maximal marginal relevance. In the chat route, we're injecting the vector search functionality. The last message from the user is retrieved from the array of messages.

Anyone having any issues at this point? All right, looks like our search index is created. So, if we go back and check out what's going on here. And in the default starter app that we got from Next.js, there is a route under api slash vector search slash route.ts. And this is what we used when we ran npm run dev and we're talking directly to the OpenAI LLM. This is how it's set up. It's using OpenAI embeddings and then, let's see, it is using the text-embedding-ada-002 model to, again, we talked about this before, when the user submits a query, an embedding has to be created for that as well. So that we have vectors of the user's query, and we have our vector embeddings, and we can compare the two using vector search. So it sends that off, it does all of that, and then it returns the answer. But what we want to do is we want to inject some extra stuff before we send it to the LLM. And that is where RAG comes into play. That is how we augment the capabilities. We give it extra information, extra context from our proprietary data. So we're gonna create a new route. Actually, we're gonna update this route and then create a new route. So, under api slash, I think we're up here. Yeah, so the first one, under add a vector search route, we can go ahead and copy this text block, api/vector-search/route.ts. We go into VS Code and all of this. So under app, api, we currently have the generic chat route. We're gonna create a new route. And so I'm gonna say that was vector-search/route.ts. And it put it in the wrong spot, didn't it? Yeah, so I want the vector search folder to be in the api folder, right? So we have a chat route, and we have a vector search route. And then I'm just gonna paste in that code block that I copied. So again, in this vector search route, it is going to use the MongoDB Atlas vector search method from LangChain. And it is going to create the embedding for the user query. And it's going to compare the vectors from the user's query with the vectors that are already in the database. And it's gonna return the closest results. And how we tell it what the closest result should be is right here. So we're using something called maximal marginal relevance. And what we're able to tell it is how many results it should find, like the closest matches that it should find, but then return only a certain amount of that, the top portion of those results. So we're able to, again with these numbers, kind of play around with them, depending on your use case, to make our queries more accurate. So here we're saying fetch 20 results, but only return a fraction of that, 0.1. So we're going to fetch 20, and we're gonna return 2. So it's gonna return the top 2, basically, out of that larger sample set. So we're gonna save that. And then, go back into here, and now we're gonna modify the chat route. And this is where we're going to implement the magic. This is where that all happens. So in this code block here, go ahead and copy this code block. And I'll go back into VS Code. And let's open up the default chat route. So this is what came with the starter from Vercel. So in here, this is where we would normally just interact with OpenAI. And we want to inject some extra stuff. I'm gonna highlight all of this and just paste in that new code block. And so in here, what we're doing is, what we've added to it, is this vector search right here.
So we're gonna call this new vector search route that we just created within this route. We are grabbing the last message from the user. So as the messages come in and out of this chat, they're stored in an array. The last message is gonna be at the end of that array.

17. Modifying the Template and Testing

Short description:

The last message is gonna be at the end of that array. We're gonna send that as context to vector search to return our results. Here's the good stuff, this template. Instead of just sending the user's query, we're gonna send this whole thing. We inject our vector search results as context. We've modified it so that we now have vector search capabilities. Let's run npm run dev again. It should only answer questions from our specific documentation. We got an error, it looks like an index error. Let's delete this index and create a new search index using the regular JSON editor.

The last message is gonna be at the end of that array, so we just pluck out that last message — we know it was from the user — and we send it to vector search to return our nearest results. Once we get those, here's the good stuff: this template. This entire thing is what we're actually going to send to OpenAI. Instead of just sending the user's query, we send this whole thing, and here's where we can modify it to be whatever we want it to be. Here I'm saying: you're an enthusiastic Fancy Widget representative who loves to help people. (Fancy Widget is the fake JavaScript library I had ChatGPT help me create docs for.) Given the following sections from the Fancy Widget documentation, answer the question using only that information, outputted in markdown format. If you're unsure and the answer is not explicitly written in the documentation, say, "Sorry, I don't know how to help you with that." Then for context, this is where we inject our vector search results — that's the context we're giving the LLM — and we also include the user's original question. That then gets put back into our array of messages. The rest of this is unmodified; it's what came with the Vercel starter kit. We're chatting with OpenAI, this is where we can specify the model we want to use, and streaming is true so that we don't have to wait on the whole request to come back — it streams to the user.

All right, so we saved that. We've modified it so that we now have vector search capabilities, so it's time to test it out. Go ahead and run npm run dev again, and this time it should only answer questions from our specific documentation. If you go into the folders under workshop assets, there's a questions.txt file with some sample questions that should work with the bot; any other questions should not. The fake documents are in there too if you want to take a look at them — they're all AI generated. So let's run npm run dev, and let me copy this top question, go back to localhost:3000, and refresh. I'm gonna ask it: what is the last change made to the Fancy Widget library? It should know the answer to this.

Hmm — did I get an error too? Somebody said they got an error. Yes, I got an error as well; the joys of live coding. Okay, so this looks like a search index issue: a syntax error, unexpected token... Lucene vector index, unknown error. Did it work for anyone? I'm gonna assume this is an index error, so let me go back and do it the other way: I'm gonna delete this index and create a new search index using the regular JSON editor.
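Setting that index error aside for a moment (we'll fix it in the next section), here's a rough sketch of what the modified chat route looks like once everything above is wired in. The fetch URL, the exact template wording, and the model name are assumptions you can adjust:

```ts
// app/api/chat/route.ts — a rough sketch of the modified chat route.
// Assumptions: the Vercel AI SDK ("ai") streaming helpers, the OpenAI Node SDK,
// and an absolute localhost URL for the vector search route during development.
import OpenAI from "openai";
import { OpenAIStream, StreamingTextResponse } from "ai";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

export async function POST(req: Request) {
  const { messages } = await req.json();

  // Pluck the last message out of the array — the user's latest question.
  const currentMessageContent = messages[messages.length - 1].content;

  // Ask our vector search route for the closest documentation chunks.
  const vectorSearch = await fetch("http://localhost:3000/api/vector-search", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ question: currentMessageContent }),
  }).then((res) => res.json());

  // The template: persona, rules, injected context, and the original question.
  const TEMPLATE = `You are an enthusiastic Fancy Widget representative who loves to help people!
Given the following sections from the Fancy Widget documentation, answer the question
using only that information, outputted in markdown format. If you are unsure and the
answer is not explicitly written in the documentation, say
"Sorry, I don't know how to help you with that."

Context sections:
${JSON.stringify(vectorSearch)}

Question: """
${currentMessageContent}
"""`;

  // Replace the user's last message with the augmented prompt.
  messages[messages.length - 1].content = TEMPLATE;

  // Stream the completion back so the user isn't waiting on the full response.
  const response = await openai.chat.completions.create({
    model: "gpt-3.5-turbo", // pick whichever model you want here
    stream: true,
    messages,
  });

  return new StreamingTextResponse(OpenAIStream(response));
}
```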

18. Troubleshooting Search Index Creation

Short description:

The issue was under the search index creation. The new Atlas Vector Search JSON editor did not work, but the regular Atlas Search JSON editor does. You can have multiple indexes and embeddings; set up an index for each type of search. The text key specifies the field that holds the original text, and the embedding key is the field where the vector embeddings are stored. The text-embedding-ada-002 model is used to create embeddings for user queries. Everything should be working now.

Default, docs, embeddings, and then — okay, it's still deleting the other index, so I have to wait for that to finish. Okay, there we go. All right, let's create this one with the regular JSON editor: name it default, paste the definition in there, pick the embeddings collection, then Next and Create. Give this just a second to create the index, and then I'll try it again. Apologies — when things get changed and your workshop doesn't get updated, things break.
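For reference, a knnVector definition for the regular Atlas Search JSON editor looks roughly like this. The field name "embedding" is an assumption here — it has to match the field where your documents actually store the vectors, and the index name you give it has to match the indexName in the vector search route:

```json
{
  "mappings": {
    "dynamic": true,
    "fields": {
      "embedding": {
        "type": "knnVector",
        "dimensions": 1536,
        "similarity": "cosine"
      }
    }
  }
}
```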

All right, so it looks like we should be good to go there. I'm gonna go back into VS Code, kill the server, and run it again. Go back and refresh... still spinning... all right, should be good. Let me make sure — yep, no errors so far. And let's ask the question. There we go, okay — that was the issue. So we got the answer, and it came from the documentation. If I say something like "hi", it's going to say "I don't know how to help with that", because the answer to "hi" is not in the docs. If I ask it another question, like how do you install the Fancy Widget library, it should respond with some markdown, which it does: it tells me npm install fancy-widget --save, which it got from the docs. So it should all be working now.

Again, the issue was under the search index creation. This new Atlas Vector Search JSON editor — obviously I need to do a little research into what's needed there, because what I created did not work. Under the regular Atlas Search JSON editor, the definition that's in the walkthrough works to create your search index, so if you create it that way, it will work. And note that I named this index default — that's another thing you have to be careful about. If we go back into VS Code, into our vector search route, you'll find the index name here. You can have multiple indexes, and you can have multiple embeddings from different places; you have to set up an index for each different type of search. This one is named default, so I just left the code as-is, but if yours is named anything else, you'd want to update it in this code as well. The text key tells the vector search algorithm which field in the collection's documents holds the original text, so it knows what to return. The embedding key is the field where the vector embeddings live. And text-embedding-ada-002 is the model we used to create our original embeddings for the documentation, and it's the model being used right here to create the embedding for the user's query. Okay, so everything should be working now. Let me know in the chat if anyone is still having issues. "Works for me now" — okay, awesome.

19. Hands-on Learning and Prompt Engineering

Short description:

I'm glad it's working for everyone. I love hands-on learning and being able to see how things actually work. We create embeddings from OpenAI, compare them to existing embeddings, find the top results, and have the flexibility to do anything we want. This is where prompt engineering comes in. We intercept the user's query, inject context, and customize the returned results.

Again, sorry about the issues — I'll have to go back to the product team and ask them to let me know when they change things next time. Okay, nice. I'm glad it's working for everyone. Again, I love hands-on; that's the way I love to learn. I don't learn much from slides, so being able to see how it actually works really helps: we created the embeddings with OpenAI, we stored them somewhere, we had to create an embedding for the user's query, we compare that to our existing embeddings, we find the top results, and then we can do anything we want. Let me go back to the regular chat route — this, really, is where all of the secret sauce comes in, and it comes down to prompt engineering. That's what this is. You can tell it to do whatever you want: you're intercepting the user's query, you're injecting some extra context, and then you're having it return that context however you want. You can do whatever you want here with how that works.

QnA

Embedding, Querying, and Local Setup

Short description:

We convert user queries into embeddings to compare them in a multidimensional virtual space. Traditional keyword search is limited compared to vector search. The dimension value depends on the encoder used. The same model should be used for embedding and querying. You can ask relevant questions based on the prompt. You can run an LLM and MongoDB locally, but the vector search functionality requires Atlas in the cloud.

There's a question: if we already have embeddings and the corresponding text, why convert the user's question into an embedding to retrieve the corresponding text? Can't it just be a lookup? Oh, okay, yeah. So traditional search would just be keyword search — keyword for keyword. But what we've done here is translate those keywords into embeddings. Let's say, for instance, we have a graph: cats and dogs are clustered up here, humans are down here, and plants are over here. Someone searching for plants is going to find everything in the plant cluster and nothing else — food and so on sit in their own areas too. So we convert the user's query into an embedding so that we can compare the embeddings to each other. We're putting all of our data into this virtual space that is multidimensional — 1,536 dimensions — so we also have to put the user's query into that same space to see what's closest to it. Traditional keyword search would just be a straight lookup on the word they're searching for. You can actually do some sort of similarity even with keyword search, but it's not as extensive as vector search. So that's why we have to convert their query as well, so that everything is in the same space and being compared.
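To make "comparing in the same space" a bit more concrete, here's a tiny, purely illustrative sketch of the kind of similarity measure involved — cosine similarity between two vectors. The numbers in the comments are made up; in practice Atlas Vector Search does this comparison for you inside the database:

```ts
// Illustrative only — Atlas Vector Search performs this comparison server-side.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// A query about plants lands closer to plant-related chunks than to pet-related ones:
// cosineSimilarity(queryEmbedding, plantDocEmbedding) // e.g. ~0.85 (made-up value)
// cosineSimilarity(queryEmbedding, catDocEmbedding)   // e.g. ~0.10 (made-up value)
```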

Let's see, let's go back — and that is the end of this, so let's move on to some more questions. Why 1,536 as the dimension value and not another value? Okay, that's perfect. That's not an arbitrary number that I came up with; it's the value OpenAI chose for their specific encoder. It depends on your use case. If you're encoding audio, video for search, images, text — there are tons of different encoders out there. OpenAI has a bunch, and Hugging Face has a whole bunch as well, so you can find an encoder that's specific to your data type and use that. The dimension value is given to you by your encoder: there are encoders in the hundreds of dimensions and encoders in the thousands. It just depends on the encoder you use, and you have to specify that number for the vector search so that it knows how many dimensions it's searching across.
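As a quick illustrative check — assuming the OpenAI Node SDK and an OPENAI_API_KEY environment variable; the sample input string is arbitrary — you can see the dimension count the encoder hands back:

```ts
import OpenAI from "openai";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

async function checkDimensions() {
  // The encoder decides the dimensionality — you just read it back.
  const res = await openai.embeddings.create({
    model: "text-embedding-ada-002",
    input: "How do I install the Fancy Widget library?", // arbitrary sample text
  });

  console.log(res.data[0].embedding.length); // 1536 for text-embedding-ada-002
}

checkDimensions();
```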

Question two: should we use the same model for indexing and querying? Yeah — for embedding and querying, they do need to use the same model, because you're embedding your proprietary data into a particular virtual space, and the query that comes in has to go into that same space. So you have to use the same embedding model on both sides for it to work, and then in your index you have to specify which one so that it knows how to do the search.

Okay, so should I be able to ask different questions, since we're just comparing text using similarity? Yeah, I mean, you should be able to ask any questions that are relevant to the information we've given it, because of the way I set up the prompt — this prompt engineering. I've specifically told it: don't answer any questions you can't find in our documentation. If I left that out, it would still act like regular ChatGPT and would try to answer your question even if it couldn't find the answer, which could well be a hallucination. So in my prompt, I specifically told it not to answer any other questions unless they apply to this documentation.

Okay, so how can I use this as a doc? How can I use this as — I'm not sure, can you clarify your question? Could I also set this up locally, self-hosted, without OpenAI? Yes, totally possible to do that. You can run your own LLM locally — there are tons of self-hosted LLMs — and you can run MongoDB locally. The vector search functionality specifically, though, is a feature of Atlas. For development there is a dev server you can run locally, but for production the vector search functionality requires Atlas in the cloud. So for development, all of that can be done locally, yes.

Using AI in Documentation

Short description:

If you want to see an example of using AI in documentation, visit the MongoDB Docs page. It provides a fully functional version where you can ask questions and get detailed answers. The AI provides a basic summary and instructions, as well as links for further reading. This is how we have implemented AI in our documentation.

Okay, Alfie, if you could rephrase your question or expand on that, how can I use this as a doc? For a more in-depth look into an example of this, you can actually go to the MongoDB Docs page and we have the AI actually working there. Let's see, MongoDB... A much more fully functional version of this is working there. So if I go to resources. Deployment, adaptation, I just want the regular docs. There we go. So let me zoom in a bit. At the top here, you can say, how do you deploy a free cluster? There's some suggestions, but you can ask it whatever you want. And so here it's replying to me, here's how you do it, da, da, da. Gives you some context, and then it also gives you further reading. Here are the links to dive deeper into the docs. So you can ask it anything about Atlas or about MongoDB in general, and it will return some basic summary, some basic instructions, and then a more deep dive. So this is how we've implemented it.

Q&A on Vector Search and Customization

Short description:

I'm available for questions and appreciate your feedback. There may be alternatives to locally hosted vector search, but I can't think of any off the top of my head. You can create your own question and answer by providing additional context through embeddings. This allows you to search for specific information that is not in the existing knowledge base. In the code, you can customize the LLM's behavior by injecting extra information and summarizing the results. Prompt engineering is an iterative process to fine-tune the responses. I'm curious about the use cases you have in mind. The vector search returns the text based on the embeddings, allowing for more context-specific results.

Okay, cool. All right, let's... I'm free to stick around for some time for some questions if anyone has any questions. And again, I really appreciate everyone working through the hiccups here and there. And I hope you learned something and really also looking forward to your feedback on how we can improve this and what you thought about the workshop. So open for any questions. Feel free to unmute if you want, or use the chat.

All right, let's start with a few questions; feel free to speak up. Is there an alternative to vector search that can be self-hosted? Probably, but off the top of my head I can't think of any that are locally hosted. There are tons of vector databases popping up everywhere, and they all have vector search capabilities. Again, those vector databases are a bolt-on, because you have to do the whole CI/CD thing, syncing back and forth, et cetera. But locally hosted — that's where I'm drawing a blank. I'm sure there are some, but I can't think of any that are locally hosted for production.

I appreciate that, everyone. Can I create my own question and answer — say I wanted to narrow it down to one sport, basic tennis questions? Yeah, for sure. You have to start with some information for it to use. The idea here is to give the LLM more context. When you're using ChatGPT, it just pulls from its existing knowledge base; the idea here is to give it extra stuff. For instance, let's say you have a company that has work orders and you need to search through that information for some reason. You want to ask questions about those work orders: which ones are still active, the information that's in them, who the clients are, et cetera. You would feed that information as embeddings to the vector search and ask questions about that proprietary information. That way you're not relying on ChatGPT by itself, because it doesn't know anything about your work orders. For sports specifically, ChatGPT probably already knows a good bit about tennis and other sports, but if you have some specific information about those sports that you want to make sure it references, that's what you would embed and have it do a vector search on. I hope that answers your question.

More about how to do it in code? Okay. In the code, this is really the part you want to customize. Again, we're intercepting the user's question and telling the LLM what to do with it: we're adding some extra context and telling it what to do. In our implementation at MongoDB, we actually tell it something very similar, but we also tell it to return the references it used — the embedded documents it pulled from — as links, so we can add those at the bottom for the user to dive deeper. We tell it to take all of the information that comes back, summarize it, and return a short summary instead of handing back an entire document, which would be too much context. So in the code, this is where you inject that extra information, beyond what we did earlier when we created the embeddings. And yeah, prompt engineering is a thing you have to iterate on: sometimes it doesn't return exactly what you wanted, and you have to come back in here and tweak it until it's exactly what you're looking for. Same thing in ChatGPT — you ask a question, you get a terrible response, but if you add more context, you get a better response. Any specific use cases that any of you might be looking into for this? I'm always curious how this might be used.
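As a rough illustration of that kind of tweak — not MongoDB's actual docs prompt; the variable names and the assumption that each stored chunk carries a url field are hypothetical — the template could be extended like this:

```ts
// Hypothetical variation on the template — assumes each vector search result
// includes a url field alongside its text, which your ingestion step would
// need to store as metadata. `vectorSearchResults` and `userQuestion` are
// placeholders for the values the chat route already has in hand.
const TEMPLATE = `You are a helpful documentation assistant.
Given the following sections from our documentation, answer the question using only
that information, in markdown. Keep the answer to a short summary rather than
repeating whole documents. After the answer, list the URLs of the sections you used
under a "Further reading" heading. If the answer isn't in the documentation, say
"Sorry, I don't know how to help you with that."

Context sections (each includes a url):
${JSON.stringify(vectorSearchResults)}

Question: """
${userQuestion}
"""`;
```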

Yeah, so in this, it's actually returning the text, and that's what's being sent to the LLM. Right here, this vector search result, which came from our vector search route, is actually the text field from our database — the original markdown content. The vector search itself uses the embeddings: it finds the documents in the database that are the closest matches, and then it returns the actual text. And this current message content is the message the user typed in, also in text format.
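In other words, the documents come back as text and get flattened into the prompt — a tiny sketch, assuming `results` is what the LangChain MMR search returned in the vector search route:

```ts
// `results` is assumed to be the Document[] returned by maxMarginalRelevanceSearch.
// Only the text (pageContent) goes into the prompt; the vectors stay in the database.
const context = results.map((doc) => doc.pageContent).join("\n---\n");
```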

Q&A and Resources

Short description:

I'll stick around for a few more minutes to answer any questions. You'll receive a feedback email soon. Check out the MongoDB documentation and developer center for resources on vector search and AI. We have articles, videos, and tutorials covering various topics and frameworks. Feel free to explore and learn more. Thank you all for your participation, and I hope to see you again next year in Berlin!

All right, well, I'm gonna stick around for another few minutes in case anybody has any questions, but you're free to leave at any time. You're gonna receive a feedback email shortly. Again, I really appreciate your feedback. Thank you for sticking around and I hope I helped you learn something.

Do you have any documentation links or workshops to go further with customization — customized questions? Yeah, you're welcome. The documentation on the MongoDB web page for vector search is great. We also have a lot of resources — why is this taking so long? Oh, I had to escape that. Okay, so if we go over to the Developer Center, it's at mongodb.com/developer, and if we go down to all technologies and then AI specifically, we've got a bunch of articles and videos on a lot of different topics. Some of them are Python related, some are JavaScript and Node.js related, but there are a bunch of different use cases and tutorials here that could get you started. And MongoDB actually has integrations with tons of frameworks in this space — LangChain is one of them, and this new one, Rivet, looks pretty cool. So yeah, there's a lot of cool stuff going on in this space, and you can learn a lot from the Developer Center as well.

If there's no more questions, I'll go ahead and let everyone go. I appreciate it again. And hopefully we'll see you next year in Berlin possibly.