How you handle Personally Identifiable Information, or PII, is a critical concern when building any application. But it's especially important in genAI applications that use a third-party model, or where you retrieve text from lots of diverse internal documents. In this lesson, you'll build a validator to check your user prompts for any PII and, if PII is present, prevent it from being passed over the internet to a third-party model provider. You'll also use a state-of-the-art validator from Guardrails Hub to check LLM outputs for PII and redact it before it's shown to the user. Let's dive in.

PII is something that should be taken seriously. PII stands for personally identifiable information: names, emails, social security numbers, anything sensitive that could be used to identify you. Handling it is one of the most important risks associated with using LLMs, and with getting your customers to use them. Most developers and organizations rely on LLMs built by third-party proprietary providers, which means you have to make an external API request that sends over your data, your customers' data, and so on. So when you're being mindful of PII leakage, you want to do two things. First, make sure that none of your customers' data, your employees' data, or any of your organization's private data is ever leaked to the third-party LLM provider. Second, make sure that your responses to users never accidentally expose your organization's data, via the LLM's responses, to somebody who should not be seeing it.

For this lesson, we're going to use a really great open-source project by Microsoft called Microsoft Presidio. It's a tool that helps analyze and anonymize many different types of PII, and we'll dig into the details of how to use it later in the lesson.

I'm going to start with our usual practice of copying over our warning filters, and then the imports we need for this lesson. Now that we have our warnings and imports set up, we'll set up our chatbot example. We've done this multiple times by now, so I won't go through it in much detail: we have an unguarded client and the same system message we've been using, and then we initialize a RAG chatbot app that uses all of these components. To refresh your memory, let's copy in our chat message and look at the failure in action.

So this is somebody named Hank Tate who shared their phone number. For a pizza shop, a name and a phone number aren't really that sensitive; your local pizzeria probably has that information about you. But for banks, government organizations, or especially healthcare services, even details that seem as innocuous as these need to be handled very carefully. Here you see that the LLM responds with something short, but the response isn't what we need to focus on. The problem is that Hank shared his private information when he wasn't meant to, and if you scroll through the messages on the back end of our chat application, you'll see that Hank's data is saved in our message store.
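To make that failure concrete, here's a minimal sketch of what an unguarded setup like this looks like. The helper names, the system message wording, and the phone number are assumptions for illustration; the lesson's notebook provides its own RAG chatbot helpers.

```python
from openai import OpenAI

# Hypothetical stand-in for the notebook's unguarded RAG chatbot helpers.
unguarded_client = OpenAI()

system_message = "You are a customer support chatbot for a pizza shop."  # assumed wording
message_history = [{"role": "system", "content": system_message}]

def chat(user_message: str) -> str:
    # The raw user message, PII and all, is appended to the stored history
    # and sent over the wire to the third-party model provider.
    message_history.append({"role": "user", "content": user_message})
    response = unguarded_client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=message_history,
    )
    reply = response.choices[0].message.content
    message_history.append({"role": "assistant", "content": reply})
    return reply

# Hank's name and (illustrative) phone number now live in message_history.
chat("Can you tell me the orders I've placed? My name is Hank Tate, phone 555-010-1234.")
```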
And once again, if you're an enterprise or a large organization that cares about sensitive information, you need to be very careful about how you store customer data. What we'd like to do, ideally, is detect when a user or customer of your genAI application is sharing private, sensitive information, filter and detect it at the source, and also alert some underlying system so you can kick in whatever measures you use for handling that information correctly. Those of you with keen eyes might notice that there's some other PII here: PII that was retrieved by the retriever from the vector database. Toward the end of this lesson, we'll deal with how to handle PII that's already in your database.

Before we start building our validator, let's look at how Microsoft Presidio works under the hood. We're going to use Microsoft Presidio's analyzer and anonymizer engines. As the names suggest, the analyzer engine takes in a string of text and tells you which sensitive entities are present: are there names, are there phone numbers, and so on. The anonymizer engine takes the detected entities and anonymizes them, so the rest of the text is still usable with just the PII filtered out. I'm going to initialize both of these here.

First, let's look at the analyzer engine in action. This is the text I'm using; once again, it's Hank's message. Let's see what our analyzer spits out. We identify three types of PII here: DATE_TIME, which starts 43 characters in and ends at 60; PERSON, which starts at 73 and ends here (the person is Hank Tate); and PHONE_NUMBER, the phone number toward the end of the string. What's interesting is that, for our pizzeria, the name and phone number are what we're sensitive about, whereas the date-time is maybe not something we care as much about. The specific entities you want to filter out will vary with your organization, your use case, and the industry you work in.

Now let's look at Presidio's anonymizer engine and the anonymized version of this text. We can see that our updated message reads: "Can you tell me the orders I've placed in <DATE_TIME>? My name is <PERSON> and my phone number is <PHONE_NUMBER>," along with the same detection outputs we saw before. The anonymized text ensures that we've filtered out the PII that's sensitive for us, but at the same time we can still use the rest of the text and respond to Hank's question in the same manner, without mishandling his private information.

Now that we know how Presidio works, let's use it to build our PII validator. We'll start by writing a function that just does PII detection for some entities we define. This function uses the Microsoft Presidio analyzer to detect just the PERSON and PHONE_NUMBER entities, and then returns the entity types identified in our text string. If you want to see the full list of entities that Microsoft Presidio supports, we've provided a link to that resource in the learner notebook.
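Here's a sketch of the Presidio calls described above, followed by the detection helper. The example text and phone number are illustrative, and the helper's name (`detect_pii`) is an assumption; the analyzer and anonymizer calls are Presidio's standard APIs.

```python
from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine

analyzer = AnalyzerEngine()
anonymizer = AnonymizerEngine()

text = (
    "Can you tell me the orders I've placed in the last 3 months? "
    "My name is Hank Tate and my phone number is 555-010-1234."  # illustrative number
)

# Analyzer: returns RecognizerResult objects (entity_type, start, end, score).
results = analyzer.analyze(text=text, language="en")
for r in results:
    print(r.entity_type, r.start, r.end, r.score)

# Anonymizer: replaces each detected span with a placeholder such as <PERSON>.
anonymized = anonymizer.anonymize(text=text, analyzer_results=results)
print(anonymized.text)

def detect_pii(text: str) -> list[str]:
    """Return the entity types found, restricted to the entities we care about."""
    results = analyzer.analyze(
        text=text, entities=["PERSON", "PHONE_NUMBER"], language="en"
    )
    return [r.entity_type for r in results]
```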
So with our function defined, let's go on to step two, which is creating a guardrail that filters out this PII. You've seen me do this a few times before: all I'm really doing is creating a validator class that I register, and the validate method of that class contains the core validation logic. Our logic in the validate method uses the function we created above, passing in the text to detect whether a name or phone number is present. If we detect any PII, we return a fail result with the error message "PII detected", the types of PII found, and some metadata. If no PII is detected, we return a pass result.

Now that we have our guardrail created, let's try using it in a guard and see how we do. Here we initialize the PII guard so that it raises an exception any time a name or phone number is detected, and then try it on the same sentence as before. Once again, this is the sentence with Hank Tate and the phone number; let's see what we get. We get the error we defined: validation failed, with PII detected for both PERSON and PHONE_NUMBER. Now, what happens if we remove the phone number? Then just PERSON is detected.

Next, let's look at how to set up Guardrails Server with a PII guard so that we can use it in a production system. For this example, we're not using the PII guardrail you just created; we're going to use the guardrail we pulled from the hub, because it supports many different types of entities and also supports real-time streaming, which we'll take a look at in just a second. Same as before, all we're doing is using the same OpenAI API, but with a PII guard that runs on the input side. With that, let's also create a guarded version of our RAG chatbot, use the chatbot we've just created, and see what happens with my example.

Message history validation failed. That was also really quick, because we're running this guardrail on the input side. What happens as soon as you send that message is that, before the PII could leak out to an AI system, we raise an exception and never get to that stage. If you look at the logs of what's being stored on the back end of our chatbot, you'll see that the only message stored is the system message that was initially used; we did not store Hank's message containing sensitive information. The important thing, when you're building your application and thinking about how to handle sensitive data well, is making sure that you're detecting it and anonymizing it. Then, according to your organization's policies, you can figure out how you want to handle any sensitive information.

You might remember that at the beginning of this lesson, we talked about retrieving sensitive information. It's always best practice to sanitize your data before you add it to your vector database. But accidents happen: sometimes data creeps in that contains sensitive information, or maybe it's even okay for that information to be in there.
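Here's a rough sketch of the validator and guard described above, reusing the `detect_pii` helper from the previous sketch. It assumes a recent Guardrails release exposing `guardrails.validator_base`; the validator name and error wording are illustrative.

```python
from typing import Any, Dict

from guardrails import Guard
from guardrails.validator_base import (
    FailResult,
    PassResult,
    ValidationResult,
    Validator,
    register_validator,
)

@register_validator(name="pii_detector", data_type="string")
class PIIDetector(Validator):
    def _validate(self, value: Any, metadata: Dict = {}) -> ValidationResult:
        detected = detect_pii(value)  # helper from the previous sketch
        if detected:
            return FailResult(
                error_message=f"PII detected: {', '.join(detected)}",
            )
        return PassResult()

# Raise an exception whenever a name or phone number is detected on the input side.
guard = Guard().use(PIIDetector, on_fail="exception")

guard.validate("My name is Hank Tate and my phone number is 555-010-1234.")  # raises
```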
But depending on the level of authorization that a user of your application has, not everybody should have access to that information. So what you want to do is make sure you're sanitizing any output that the LLM generates, so that even if the LLM sees private information as part of the retrieved context, that private information is not part of the answer shown to your end user. That's a really cool thing about the guardrail we just pulled from the hub: you can validate, in real time, that no PII is being leaked. We're going to see a little demo of this, where we now run our PII filter on the output side instead of the input side.

Same as before, we create a guard here. You might see some warnings, but everything is still working as expected, and I wouldn't worry about them too much. Now that we have our guard initialized, we have some boilerplate to copy over. All this is really doing is calling a guard outside of the server, directly in code, where we call gpt-3.5-turbo with messages asking GPT to generate a two-sentence short story about an unnamed protagonist, while making up a ten-digit phone number for the protagonist. This is just dummy data that gives you an example of the kind of sensitive information your LLM might generate. Here we're doing it without any retrieved data, but in a real system the output the LLM generates might actually be grounded in real sensitive information that you accidentally exposed to the LLM.

So this is the request we're going to make. If you've used OpenAI streaming before, you'll recognize what happens here: we stream the output generated by the LLM so you can see validation happening in real time. All that's really happening is that we iterate over the output of our LLM chunk by chunk, with streaming set to true, and validate each chunk we get from the output; a rough sketch of this pattern follows below. You can see how fast and instantaneous this feels in real time. Even though the LLM generated a couple of phone numbers, we detected them and returned the validated result to you. You can use this to make sure you're not incurring any substantial latency hit, while ensuring that any output your LLM generates is sanitized and free of any personal or private information.

Now, with that, let's go on to our next lesson, which looks at how to make sure your LLM doesn't generate any output that might harm your reputation. Specifically, the example we'll look at is making sure that your RAG chat application doesn't mention any competitors.
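For reference, here's a minimal sketch of that output-side streaming pattern, using the DetectPII validator from Guardrails Hub (installed with `guardrails hub install hub://guardrails/detect_pii`). The exact streaming interface varies between Guardrails versions, so treat the call signature and the chunk attribute (`validated_output`) as assumptions rather than a definitive implementation.

```python
from guardrails import Guard
from guardrails.hub import DetectPII  # from Guardrails Hub
from openai import OpenAI

client = OpenAI()

# "fix" mode redacts detected entities instead of raising an exception.
guard = Guard().use(DetectPII, ["PERSON", "PHONE_NUMBER"], on_fail="fix")

# Wrap the streaming LLM call so each chunk is validated before it reaches the user.
validated_stream = guard(
    client.chat.completions.create,
    model="gpt-3.5-turbo",
    messages=[{
        "role": "user",
        "content": "Write a two-sentence short story about an unnamed protagonist "
                   "and make up a ten-digit phone number for them.",
    }],
    stream=True,
)

for chunk in validated_stream:
    # Each chunk is a validation outcome; detected phone numbers should already be redacted.
    print(chunk.validated_output, end="")
```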