Introducing Prompt Shield in Content Safety

An implementation with Azure AI Studio and Python

Valentina Alto

--

As AI-powered applications become more and more widespread, malicious actors are finding new ways of attacking. The thing is that today, in addition to the traditional applications’ vulnerabilities, with LLM-powered systems we are adding a new set of components that could potentially be the entry point for malicious attackers.

One of these new component is the meta-prompt, that is the set of instructions (including the context coming from external knowledge base) that we provide the LLM with and that allows us to:

  • Instruct the LLM to answer with a defined style
  • Limit the LLM’s responses within a specified perimeter (this practice is called grounding)
  • Incorporating responsible AI practices to avoid potentially harmful responses

Meta-prompts are the key component in an AI powered application: in fact, since the LLM act as the “brain” or reasoning engine of the app and it orchestrates all the other components (including the VectorDB where we store our knowledge base and the tools we provide the model with), anyone accessing the system message has the power to modify the application’s behavior.

--

--

Valentina Alto
Valentina Alto

Written by Valentina Alto

Data&AI Specialist at @Microsoft | MSc in Data Science | AI, Machine Learning and Running enthusiast

No responses yet