This is a natural evolution from the previous lesson. You can implement LangChain agents to connect to a provided SQL database, which is one of the scenarios that you can leverage in your professional activities. This lesson is all about SQL databases. You will leverage SQL light database and perform the RAG pattern against it. The result will be a database agent in which generative AI will help you translate natural language to SQL code. Let's have some fun! We are back for lesson three. You have successfully deployed your instances of Azure OpenAI and LangChain and connected to a CSV file. What's next? If our goal is to connect to a database an interact with this information? Exactly. We now create an agent to connect directly to an SQL database instead of a CSV file. This may sound complex, but I will guide you through it step by step. First, let's check the illustrative architecture for connecting to a SQL database. In building, our AI agent it will rely on the baseline GPT-4 model from Azure OpenAI. and use the RAG system. LangChain will orchestrate the knowledge, helping to find information and explain each step in obtaining it. Just like in lesson two, you will see traces with detail information which is powerful and useful. So we will keep this approach. This time will use a SQLite database, which is a local instance of an open source relational database. You will find all the necessary information in the notebook. And of course will access everything using APIs. Although we are using notebooks for this course, in a real world application, you could integrate the results into a web app, mobile application, or any other platform. The process remains the same. Replacing the CSV file with a SQL database. You could combine different data sources such as images, data lakes, and databases. The RAG mechanism will efficiently locate information from the appropriate source. Let's head to our notebook and get started. We are launching our environment again and this time we are incorporating a SQL database. The main difference is that we don't need to communicate with a SQL database. So you will see that this is the primary change compared to before. While the rest remains very similar. We will start by recovering the CSV data we used earlier. The idea is to load this CSV file into a SQL database, and then communicate with the database. If you already have your information in a SQL database, you will then need to load the CSV file. This step is just for the lesson. For simplicity, the data is available in your lesson directory under data all the states history CSV. wWe'll create the SQL Alchemy engine to facilitate communication with this SQL database, and will use pandas to handle the dataframes. Let's get started with connecting to our SQL database and loading the data. Basically what we are saying is we have a folder, a local folder called DB and there you have this test db file. This is basically the SQLite database file that you will use right? This is basically the template that you need to use. You can put it there. And then you had your local instance of that SQL database. Is everything you need. You can recover it from the website SQLite. They have examples. You can create your own. But this is good enough for what we want to do. And now we are creating the engine to connect to the database. And this is the interesting part. We'll be creating the engine. We'll be recovering the CSV file that will be on your data folder. Then we are creating the dataframe df dataframe with pandas. We are reading the csv file and this is the key part here. The delta compared to the second lesson, we are getting the dataframe and we are using the two SQL function. And basically what we're saying is yes, we are getting the all the states history. We are getting the engine that we have mentioned already. They say SQL engine if exist, imagine it's the first time we are doing it. The second time, we'll just replace it. No incremental, we are just replacing. So that's good. Let's run it. Let's see how this works. Good basically is telling you there are so many records and we have to load them into the SQL database. As simple as that. Let us go to the details of what will be about printing a SQL engine. So this will be our first SQL agent will evolve this kind of agent. We'll replace pieces later on the next lessons. But this is the first time you will be interacting truly with a SQL database. I hope you are excited, I'm very excited. Let's see. I'll be copying some information I have created that you will have available on your notebook. Don't worry, because a lot of information. Then I'll put it here. Let me show you. So look at this. You have the prefix. This is before, we are creating something called like a SQL agent prefix. We are explaining what we want to do. You are an agent designed to deal with that SQL database. So you are giving some context. Additional context before it was a CSV file Now we are saying hey you are, you are connecting to a SQL. And you will be running SQL queries and you will be doing this and that. You can change this. But this is a good initial template that you can leverage. And then you have the format instructions. So you have the format of how you want the engine to answer, like with an action the input, the situation, so the final answer, you have seen that before are the explanation. So what you are giving here, is an example for the model to understand what you are expecting as a user. The kind of answer you want, the format, the information you want to get, and then the trace of the SQL query. This powerful. We will be able to communicate with the SQL database. We will do it with natural language. But still we get the information about the query, right? This will be good for learning cases. within your company. Like people wanting to not only use the data, but also to learn about how to perform SQL queries. So that's good. Let's just put it here. Now let's put the second part of this, which is very similar to the sequence of steps we have followed before. So basically what you have here is the same. We are creating the LLM, the actual chat, open the instance with the endpoint with everything we have again database, the toolkit for the database. So we can actually combine the LLM database. And then we have again the question. Remember we can have any kind of question here, I'm putting something different. "How many patients were specialized during October 2020 in New York and nationwide as the total of the states". here we are creating the agent executer SQL, which is basically that agent, that LangChain agent with specific format that is expecting to have the prefix that we have sent, the format instructions that we have sent. Of course, we need to know which LLM we are using, which is Azure OpenAI. I do the version with that endpoint. Then the toolkit that we are using, which is a SQL database toolkit, and we will be retrieving information that okay, the you can customize these kind of parameters we'll execute this and see what happens. Like basically you execute this, there is no trace because we haven't sent any message or prompt to the system yet. This is the next step. So imagine even if I don't tell you what kind of function you need to send a prompt to the system. Remember from lesson one and two. Exactly, I know you are thinking about this is. The invoke function that we will put here. So basically we have the agent executer the SQL. This is different compared to previous lesson. And then we invoke, and then we send the question. So let's execute this. See how it goes. Remember before, we had similar trace. We are talking about an agent executer for this CSV file. Now we are talking about a SQL. This will take some time. Just enjoy the magic of a LangChain agent performing all these operations on your behalf. You will get a big lot of information here, all the traces. You can take your time for sure to analyze what this is saying, that basically, this is the part that will start to be interesting because explaining the results, remember that here the the query is different. We were talking about October 2020, New York and I believe we were saying hospitalized. So we have all the information the engine is finding where to find information, which columns, which values. You can see you already the selects the from the like the where all the queries that you will need to perform but that now are automatic with this agent. So let's see the answer. Remember the Phoenix chain is the clean version right of the question and the answer the problem and the output, the compilation. So basically here you have how many patients were hospitalized. And then on the output in October 2020 there were no new hospitalizations in New York. And there where like 53 new hospitalizations nationwide. Again, for whatever reason, you can get one another value depending on the state. Just go. You want to explore before this is the file and confirm that the engine is providing you with a good answer. I can confirm this is a good answer because I have checked the information before. So this is good. We'll continue with lesson 4. I hope you are enjoying it. I'm enjoying it a lot.