Azure AI Foundry Responsible AI Guardrails: A Complete Implementation Guide

TL;DR

A complete, code-first guide to building a production-grade Responsible AI safety layer on Azure. It separates Azure AI Content Safety (harm categories and optional shield_prompt signals) from Azure OpenAI deployment filters—Prompt Shields, Groundedness, and Protected Material—and wires them through FastAPI middleware, with Terraform pinned to francecentral so groundedness works in supported regions.

The Layered Guardrail Architecture

Across Azure AI deployments in regulated industries, the pattern is often the same: the team nails the RAG pipeline, the vector store, and the streaming UI—and then ships with zero guardrails. This was always a risky move, but with the EU AI Act now in force, the risk calculus has changed permanently. What was once a 'nice-to-have' is now a hard compliance requirement for any serious enterprise LLM application.

Building production-grade LLM applications isn't just about getting the right answer; it's about ensuring the safe and responsible answer. Organisations, particularly those in finance, healthcare, and the public sector across Europe, are grappling with model hallucinations, prompt injection attacks, the generation of harmful content, and intellectual property infringement. These aren't abstract risks; they translate directly into regulatory fines, reputational damage, and a complete loss of user trust.

This guide provides a comprehensive, code-first approach to building a robust Responsible AI safety layer on Azure. We'll move beyond marketing concepts and dive into concrete engineering controls using two complementary paths: the standalone Azure AI Content Safety analyze API for fast, policy-driven harm-category screening (and optional jailbreak-style signals via shield_prompt on that API), and Azure OpenAI Service for deployment-level controls—Prompt Shields, Groundedness Detection, and Protected Material Detection—that you enable and read from the Azure OpenAI completion response, not from ContentSafetyClient alone. I'll show you how to compose these into a single, modular FastAPI middleware chain, provision the entire stack with Terraform, and map each guardrail to its corresponding obligation under the EU AI Act.

This is the pillar article for our 'Responsible AI Guardrails with Azure AI Foundry' series. It introduces all the key guardrails with standalone, copy-paste code examples and will link out to dedicated spoke articles for even deeper dives.

When I architect these systems on Azure, I don't treat LLM safety as a single component. It's a layered, defense-in-depth model. The core idea is to implement a sequence of checks—some before the LLM is ever called, some after it generates a response—to ensure that both user inputs and model outputs adhere to our predefined safety policies. It's a pipeline of trust.

Our architecture uses two key Azure services, composed in a specific order:

  1. Azure AI Content Safety: A standalone, high-performance service we use as a first line of defense. Through ContentSafetyClient.analyze_text, it scores user input against harm categories (hate, sexual, violence, self-harm) and can raise jailbreak / indirect-attack signals when shield_prompt is enabled—before the text ever reaches the expensive LLM.
  2. Azure OpenAI Service: The LLM and its deployment-level filters. Prompt Shields (user-prompt attacks), Groundedness, and Protected Material are integrated with this service: you configure them for the deployment and interpret the annotations returned on the Azure OpenAI API response alongside the completion.

The diagram below illustrates this layered flow from user request to final, safe application response.

This composable model, which we'll implement as a FastAPI middleware, is powerful because it allows for environment-specific configurations. Your dev environment can have permissive thresholds for testing, while your production environment remains locked down and highly secure.
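To make that concrete, here is a minimal sketch of how I externalise those thresholds per environment. The TextCategory names come from the Content Safety SDK; the module name, the APP_ENV variable, and the specific severity values are my own assumptions that you would adapt to your policy.

# config/thresholds.py -- hypothetical module name
import os
from azure.ai.contentsafety.models import TextCategory

# Severity at or above which we block, per environment.
# Values are illustrative; tune them against your own red-team data.
_THRESHOLDS = {
    "dev": {   # permissive: only block the most severe content
        TextCategory.HATE: 6,
        TextCategory.SEXUAL: 6,
        TextCategory.VIOLENCE: 6,
        TextCategory.SELF_HARM: 6,
    },
    "prod": {  # locked down: block at low severities
        TextCategory.HATE: 2,
        TextCategory.SEXUAL: 2,
        TextCategory.VIOLENCE: 2,
        TextCategory.SELF_HARM: 4,
    },
}

def get_harm_thresholds() -> dict:
    """Returns the blocking thresholds for the current environment."""
    return _THRESHOLDS[os.getenv("APP_ENV", "prod")]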

Prerequisites

Before we write a line of code, let's get our environment set up. This is the standard toolkit I use for all my Azure-based AI projects.

  • Azure CLI: Make sure you have the CLI installed and are authenticated to the correct Azure subscription.
az login
az account set --subscription "your-azure-subscription-id"
  • Terraform CLI: We'll use Terraform for declarative infrastructure provisioning. I'm using version 1.5+, but any recent version should work.
terraform --version
  • Python 3.12+: Our application layer is built exclusively with Python. I insist on using a virtual environment for every project to manage dependencies cleanly.
python3.12 -m venv .venv
source .venv/bin/activate
python3.12 --version
  • Required Python Packages: Install the necessary Azure SDKs, FastAPI for our web layer, and a few utilities.
pip install "azure-ai-contentsafety==1.0.0b2" "azure-identity>=1.15.0" "fastapi>=0.110.0" "uvicorn[standard]>=0.29.0" "python-dotenv>=1.0.0" "openai>=1.23.0"
  • Environment Variables: We use environment variables for configuration. Create a .env file in your project root. DefaultAzureCredential will use these for local development and seamlessly switch to Managed Identity in Azure.
# .env file
AZURE_TENANT_ID="your-tenant-id"
AZURE_CLIENT_ID="your-service-principal-app-id"
AZURE_CLIENT_SECRET="your-service-principal-password"
AZURE_SUBSCRIPTION_ID="your-azure-subscription-id"

# Endpoints from Terraform output
AZURE_CONTENT_SAFETY_ENDPOINT="https://your-content-safety-resource.cognitiveservices.azure.com/"
AZURE_OPENAI_ENDPOINT="https://your-aoai-resource.openai.azure.com/"

# Azure OpenAI Configuration
AZURE_OPENAI_API_VERSION="2024-05-01-preview"
AZURE_OPENAI_DEPLOYMENT_NAME="gpt-4o-demo"

Security Best Practice: Managed Identities

While I've shown service principal credentials here for local development, in any real deployment (staging, production), I *always* use Azure Managed Identities. This eliminates the need to manage client secrets entirely. DefaultAzureCredential is smart enough to detect when it's running in an Azure environment (like an App Service or VM) with a managed identity assigned and will use it automatically. It's the most secure and frictionless way to authenticate to Azure services.
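Here is a minimal sketch of that credential in action. DefaultAzureCredential and get_bearer_token_provider are standard azure-identity APIs, and the token scope shown is the usual Cognitive Services scope; nothing else needs to change between local and cloud environments.

# auth_example.py
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

# Locally: reads AZURE_TENANT_ID / AZURE_CLIENT_ID / AZURE_CLIENT_SECRET from the environment.
# In Azure: silently switches to the resource's managed identity. No code changes needed.
credential = DefaultAzureCredential()

# For SDKs that expect a token callback (like the openai package), wrap it:
token_provider = get_bearer_token_provider(
    credential, "https://cognitiveservices.azure.com/.default"
)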

With our environment ready, let's provision the cloud infrastructure we need.

Terraform: Provisioning the AI Safety Stack

First things first: we need to create the Azure resources. I use Terraform for this to ensure our infrastructure is repeatable, version-controlled, and documented as code. We will provision everything in francecentral. That region is on Microsoft's current list where Groundedness Detection is available (alongside regions such as East US and Canada East); several EU-adjacent regions are not on that list, so picking a supported region avoids silent failures when you enable groundedness in code.

This configuration will create: 1. A Resource Group to contain our services. 2. An Azure Machine Learning Workspace, which acts as our AI Foundry hub. 3. A standalone Azure AI Content Safety account. 4. An Azure OpenAI account with a gpt-4o deployment. 5. The necessary Role Assignment to allow the ML Workspace to access the OpenAI service.

Here is the complete main.tf file:

# main.tf
terraform {
  required_providers {
    azurerm = {
      # Pinned to the 3.x provider line: the cognitive_deployment `scale`
      # block used below was replaced by `sku` in azurerm 4.x.
      source  = "hashicorp/azurerm"
      version = "~> 3.90"
    }
  }
}

provider "azurerm" {
  features {}
}

data "azurerm_client_config" "current" {}

resource "azurerm_resource_group" "rg" {
  name     = "rg-ai-foundry-guardrails-francecentral"
  location = "francecentral"
}

# Supporting resources required by the ML workspace (application insights,
# key vault, and storage account are mandatory arguments of this resource).
resource "azurerm_application_insights" "appi" {
  name                = "appi-aifoundry-frc"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  application_type    = "web"
}

resource "azurerm_key_vault" "kv" {
  name                = "kv-aifoundry-frc"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  tenant_id           = data.azurerm_client_config.current.tenant_id
  sku_name            = "standard"
}

resource "azurerm_storage_account" "sa" {
  name                     = "staifoundryguardfrc" # must be globally unique
  location                 = azurerm_resource_group.rg.location
  resource_group_name      = azurerm_resource_group.rg.name
  account_tier             = "Standard"
  account_replication_type = "LRS"
}

# 1. AI Foundry Hub (Azure AI Workspace)
resource "azurerm_machine_learning_workspace" "ai_foundry_hub" {
  name                    = "mlw-aifoundry-hub-frc"
  location                = azurerm_resource_group.rg.location
  resource_group_name     = azurerm_resource_group.rg.name
  application_insights_id = azurerm_application_insights.appi.id
  key_vault_id            = azurerm_key_vault.kv.id
  storage_account_id      = azurerm_storage_account.sa.id
  sku_name                = "Premium" # Premium SKU for advanced features

  identity {
    type = "SystemAssigned"
  }

  tags = {
    environment = "production"
    project     = "AI_Foundry_Guardrails"
  }
}

# 2. Standalone Content Safety Service
resource "azurerm_cognitive_account" "content_safety" {
  name                = "cogs-contentsafety-guardrails-frc"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  kind                = "ContentSafety"
  sku_name            = "S0"

  tags = {
    environment = "production"
    project     = "AI_Foundry_Guardrails"
  }
}

# 3. Azure OpenAI Service Account
resource "azurerm_cognitive_account" "openai" {
  name                = "cogs-openai-guardrails-frc"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  kind                = "OpenAI"
  sku_name            = "S0"
}

# 4. Azure OpenAI Deployment (e.g., GPT-4o)
resource "azurerm_cognitive_deployment" "gpt4o" {
  name                 = "gpt-4o-demo" # This must match your AZURE_OPENAI_DEPLOYMENT_NAME env var
  cognitive_account_id = azurerm_cognitive_account.openai.id
  model {
    format  = "OpenAI"
    name    = "gpt-4o"
    version = "2024-05-13"
  }
  scale {
    type = "Standard"
  }
}

# 5. RBAC: Granting the ML workspace access to OpenAI
resource "azurerm_role_assignment" "mlw_to_openai" {
  scope                = azurerm_cognitive_account.openai.id
  role_definition_name = "Cognitive Services OpenAI User"
  principal_id         = azurerm_machine_learning_workspace.ai_foundry_hub.identity[0].principal_id
}

# Outputs for our .env file
output "content_safety_endpoint" {
  description = "Endpoint for the Azure AI Content Safety service."
  value       = azurerm_cognitive_account.content_safety.endpoint
}

output "openai_endpoint" {
  description = "Endpoint for the Azure OpenAI service."
  value       = azurerm_cognitive_account.openai.endpoint
}

Run terraform init, terraform plan, and terraform apply to create these resources. Once complete, copy the output values into your .env file.
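If you prefer not to copy values by hand, terraform output can append them to your .env directly. This is a convenience sketch; the output names match the main.tf above.

terraform init && terraform plan -out=tfplan && terraform apply tfplan

# Append the endpoints to .env (output names defined in main.tf)
echo "AZURE_CONTENT_SAFETY_ENDPOINT=\"$(terraform output -raw content_safety_endpoint)\"" >> .env
echo "AZURE_OPENAI_ENDPOINT=\"$(terraform output -raw openai_endpoint)\"" >> .env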

Guardrail Implementation: The Code

Now we'll implement each guardrail as a distinct component in Python. This modular approach makes the system easier to test, maintain, and configure.

Guardrail 0: The Safety System Message

Before any other check, our first line of defense is the system message we send to the LLM. This is where we define the model's persona, scope, and core safety instructions. A well-crafted system message can prevent a huge range of undesirable behaviors at the source.

For a RAG application, I recommend a pattern that explicitly instructs the model to rely only on the provided context, to refuse to answer if the context is insufficient, and to adopt a safe, helpful persona.

# prompts/system_prompts.py

def create_rag_system_message(company_name: str = "Contoso Inc.") -> str:
    """Creates a robust system message for a RAG assistant."""
    return f"""
    You are a helpful and harmless AI assistant for {company_name}.
    Your primary function is to answer questions based *only* on the provided context documents.

    **Core Instructions:**
    1.  **Strict Grounding:** Base your entire answer on the information contained within the documents provided in the 'CONTEXT' section. Do not use any external knowledge or information you were trained on.
    2.  **Cite Sources:** When you use information from a document, cite it using the document's ID (e.g., [doc-1]).
    3.  **Refuse if Unrelated:** If the user's question cannot be answered using the provided context, you MUST respond with: 'I'm sorry, but I cannot answer that question based on the information I have.' Do not try to guess or infer an answer.
    4.  **Safety First:** Do not engage in any harmful, unethical, discriminatory, or offensive behavior. Do not generate content related to violence, hate speech, self-harm, or sexually explicit topics. If a user asks for such content, politely refuse.
    5.  **Persona:** Be professional, polite, and objective.
    """

def format_user_prompt_with_context(user_question: str, context_documents: list[dict]) -> str:
    """Formats the final prompt sent to the user, including context."""
    context_str = "\n".join([f"[doc-{i+1}] {doc['content']}" for i, doc in enumerate(context_documents)])

    return f"""
    **CONTEXT:**
    {context_str}

    **QUESTION:**
    {user_question}
    """

This metaprompt sets clear boundaries before the model even starts generating tokens.
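For clarity, here is how these two helpers come together into the messages array we will send to Azure OpenAI later in this guide; the sample document and question are placeholders.

# quick sanity check of the prompt helpers
from prompts.system_prompts import create_rag_system_message, format_user_prompt_with_context

docs = [{"content": "The refund window is 30 days."}]
messages = [
    {"role": "system", "content": create_rag_system_message()},
    {"role": "user", "content": format_user_prompt_with_context("What is the refund window?", docs)},
]
print(messages[1]["content"])  # shows the CONTEXT/QUESTION layout with [doc-1] tagging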

Guardrail 1: Input Analysis with Azure AI Content Safety

Next, we build a service to pre-screen every user prompt with the Azure AI Content Safety analyze API (ContentSafetyClient). This is a critical step to block malicious or harmful input before it gets processed by the LLM.

Within that single API call we combine:

  1. Harm categories: Scanning for Hate, Sexual, Violence, and Self-Harm content against thresholds you choose (severity values follow the current Content Safety API contract—confirm allowed ranges in Microsoft Learn for your API version).
  2. shield_prompt: Jailbreak and indirect-attack signals exposed by the Content Safety analyze API when shield_prompt=True. This is not the same thing as Prompt Shields on an Azure OpenAI deployment; treat those as an additional, model-side layer you inspect from the Azure OpenAI response (for example prompt-filter annotations), alongside the completion body.
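To make that distinction tangible, here is a hedged sketch of reading those model-side annotations. Azure OpenAI returns prompt_filter_results as an extra, untyped field on the completion response, so the exact shape (including the jailbreak entry) depends on your API version and should be verified against Microsoft Learn.

def extract_prompt_shield_signal(response) -> bool:
    """Best-effort check of Azure OpenAI's prompt-filter annotations.

    Returns True if a jailbreak was flagged on any prompt. Field names follow
    the documented Azure annotations, but treat this as version-dependent.
    """
    prompt_filters = getattr(response, "prompt_filter_results", None) or []
    for entry in prompt_filters:
        jailbreak = entry.get("content_filter_results", {}).get("jailbreak", {})
        if jailbreak.get("detected") or jailbreak.get("filtered"):
            return True
    return False

With that model-side layer noted, let's return to the first-line Content Safety service.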

Here’s the service class implementation:

# services/content_safety_service.py
import os
from azure.ai.contentsafety.aio import ContentSafetyClient
from azure.ai.contentsafety.models import AnalyzeTextOptions, TextCategory
from azure.core.exceptions import HttpResponseError
# The async client needs the async credential variant
from azure.identity.aio import DefaultAzureCredential

class PreemptiveContentSafety:
    def __init__(self):
        endpoint = os.environ.get("AZURE_CONTENT_SAFETY_ENDPOINT")
        if not endpoint:
            raise ValueError("AZURE_CONTENT_SAFETY_ENDPOINT is not set.")

        # Use DefaultAzureCredential which handles Managed Identity in prod
        self.client = ContentSafetyClient(endpoint, DefaultAzureCredential())

    async def analyze_input(self, prompt: str, thresholds: dict[TextCategory, int]) -> tuple[bool, dict]:
        """
        Analyzes input text for jailbreak attacks and harm categories.

        Args:
            prompt: The user input text.
            thresholds: A dictionary mapping TextCategory to the minimum severity that
                should trigger a block, using the integer scale returned by the
                Content Safety API for your version.

        Returns:
            A tuple (is_safe, analysis_details).
        """
        request = AnalyzeTextOptions(
            text=prompt,
            categories=list(thresholds.keys()),
            # Enables jailbreak and indirect-attack detection. Availability of this
            # flag depends on your SDK and API version; verify on Microsoft Learn.
            shield_prompt=True
        )

        try:
            response = await self.client.analyze_text(request)
        except HttpResponseError as e:
            print(f"Content Safety analysis failed: {e}")
            # Fail open or closed? In a high-risk environment, I fail closed.
            return False, {"error": f"Content Safety API error: {e.message}"}

        # 1. Check shield_prompt (jailbreak / indirect attack) results from Content Safety
        if response.shield_prompt_result and response.shield_prompt_result.attack_detected:
            return False, {"reason": "jailbreak_attack", "confidence": "high"}

        # 2. Check harm category results against thresholds
        violated_categories = {}
        if response.categories_analysis:
            for analysis in response.categories_analysis:
                # Default to 7 (max severity) so unconfigured categories only block at the extreme
                if analysis.severity is not None and analysis.severity >= thresholds.get(analysis.category, 7):
                    violated_categories[analysis.category.value] = analysis.severity

        if violated_categories:
            return False, {"reason": "harm_category_violation", "details": violated_categories}

        return True, {"reason": "safe"}
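A quick usage sketch: the thresholds dictionary mirrors the one we configure in the middleware below, and the example prompt is obviously illustrative.

# usage sketch for PreemptiveContentSafety
import asyncio
from azure.ai.contentsafety.models import TextCategory
from services.content_safety_service import PreemptiveContentSafety

async def main():
    service = PreemptiveContentSafety()
    thresholds = {TextCategory.HATE: 2, TextCategory.VIOLENCE: 2}
    is_safe, details = await service.analyze_input("How do I reset my password?", thresholds)
    print(is_safe, details)  # expected: True {'reason': 'safe'}

asyncio.run(main())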

Guardrail 2 & 3: Integrated filters on Azure OpenAI (Prompt Shields, Groundedness, Protected Material)

After an input passes our Content Safety pre-check, we call Azure OpenAI. Deployment-level Prompt Shields run as part of that service; consult the completion and prompt-filter metadata from the Azure OpenAI API for user-prompt attack signals. For output analysis in this walkthrough we focus on:

  • Groundedness Detection: Checks if the model's response is based on the source material we provided in the prompt (our RAG context). This is our primary defense against hallucinations.
  • Protected Material Detection: Scans the output for text or code that matches known third-party intellectual property.

We enable these by adding the extra_body parameter to our openai client call.

# services/openai_service.py
import os
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AsyncAzureOpenAI
from prompts.system_prompts import create_rag_system_message, format_user_prompt_with_context

class GuardedOpenAIService:
    def __init__(self):
        # The openai SDK does not pick up DefaultAzureCredential on its own;
        # we pass it explicitly as a bearer-token provider.
        token_provider = get_bearer_token_provider(
            DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
        )
        self.client = AsyncAzureOpenAI(
            api_version=os.environ["AZURE_OPENAI_API_VERSION"],
            azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
            azure_ad_token_provider=token_provider,
        )
        self.deployment_name = os.environ["AZURE_OPENAI_DEPLOYMENT_NAME"]
        self.system_message = create_rag_system_message()

    async def get_grounded_completion(self, user_question: str, grounding_docs: list[str]) -> dict:
        """
        Calls Azure OpenAI with Groundedness and Protected Material detectors enabled.
        """
        formatted_prompt = format_user_prompt_with_context(
            user_question,
            [{'content': doc} for doc in grounding_docs]
        )

        try:
            response = await self.client.chat.completions.create(
                model=self.deployment_name,
                messages=[
                    {"role": "system", "content": self.system_message},
                    {"role": "user", "content": formatted_prompt}
                ],
                # These extra_body fields assume an API version and a content filter
                # configuration on the deployment that support groundedness and
                # protected material annotations; verify the exact contract for
                # your API version on Microsoft Learn.
                extra_body={
                    "groundedness_detection": {
                        "enabled": True,
                        "sources": grounding_docs
                    },
                    "protected_material_detection": {"enabled": True}
                },
                stream=False,
                temperature=0.0
            )
            return self.parse_response(response)
        except Exception as e:
            print(f"Azure OpenAI call failed: {e}")
            return {"error": str(e)}

    def parse_response(self, response) -> dict:
        """
        Parses the AOAI response to extract content and safety annotations.
        """
        choice = response.choices[0]
        content = choice.message.content
        safety_results = {}

        # Azure attaches content_filter_results as an untyped extra field on the
        # choice, so we read it defensively as a plain dict.
        filter_results = getattr(choice, "content_filter_results", None) or {}

        # Groundedness check
        groundedness = filter_results.get("groundedness")
        if groundedness:
            safety_results["groundedness"] = {
                "detected": groundedness.get("detected"),
                "score": groundedness.get("score"),
                "ungrounded_segments": groundedness.get("ungrounded_segments", []),
            }

        # Protected material checks
        protected_text = filter_results.get("protected_material_text")
        if protected_text and protected_text.get("filtered"):
            safety_results["protected_material_text"] = True

        protected_code = filter_results.get("protected_material_code")
        if protected_code and protected_code.get("filtered"):
            safety_results["protected_material_code"] = {
                "filtered": True,
                "citation": (protected_code.get("citation") or {}).get("url", "N/A"),
            }

        return {"content": content, "safety_analysis": safety_results}

Notice how we check for ungrounded_segments. In a production system, I use this information to either append a warning to the user or, in high-stakes scenarios, to block the response and flag it for human review. For protected material, the best practice is to include the citation if available or block the response to avoid IP infringement.
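Here is a small sketch of that decision logic as I would factor it out; the segment-count threshold and the high_stakes flag are assumptions you would tune per application, not an official Azure recommendation.

def decide_groundedness_action(safety_analysis: dict, high_stakes: bool = False) -> str:
    """Maps the parsed groundedness annotation to an action.

    Returns one of 'pass', 'warn', or 'block'. The thresholds here
    are illustrative only.
    """
    groundedness = safety_analysis.get("groundedness")
    if not groundedness or not groundedness.get("detected"):
        return "pass"
    if high_stakes or len(groundedness.get("ungrounded_segments", [])) > 2:
        return "block"  # route to human review in production
    return "warn"       # append a caveat to the user-facing answer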

Composing the Guardrail Chain with FastAPI Middleware

Now, we bring it all together. A FastAPI middleware is the perfect place to orchestrate this chain of guardrails. It allows us to intercept every incoming request to our chat endpoint, apply our safety checks, and modify or block the response before it ever reaches the user.

This implementation defines a ResponsibleAIMiddleware class that executes our pre- and post-processing logic.

# main.py
import os
import json
from fastapi import FastAPI, Request, Response, HTTPException
from starlette.middleware.base import BaseHTTPMiddleware, RequestResponseEndpoint
from starlette.responses import JSONResponse

from azure.ai.contentsafety.models import TextCategory
from services.content_safety_service import PreemptiveContentSafety
from services.openai_service import GuardedOpenAIService

# --- App and Service Initialization ---
app = FastAPI(
    title="Secure AI Chat API",
    description="An API for chat completions with Responsible AI guardrails."
)

safety_service = PreemptiveContentSafety()
openai_service = GuardedOpenAIService()

# --- Middleware Configuration ---
# In a real app, load this from a config file or env vars
PROD_HARM_THRESHOLDS = {
    TextCategory.HATE: 2,
    TextCategory.SEXUAL: 2,
    TextCategory.VIOLENCE: 2,
    TextCategory.SELF_HARM: 4,
}

class ResponsibleAIMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next: RequestResponseEndpoint) -> Response:
        if request.url.path != "/chat/invoke":
            return await call_next(request)

        try:
            # Starlette caches the parsed body, so the downstream endpoint can re-read it
            body = await request.json()
            user_prompt = body.get("prompt")
            if not user_prompt:
                return JSONResponse(status_code=400, content={"detail": "'prompt' field is required."})
        except json.JSONDecodeError:
            return JSONResponse(status_code=400, content={"detail": "Invalid JSON body."})

        # === GUARDRAIL CHAIN: PRE-PROCESSING ===
        # Note: HTTPException raised inside BaseHTTPMiddleware bypasses FastAPI's
        # exception handlers, so we return JSONResponse objects directly.
        is_safe, analysis = await safety_service.analyze_input(user_prompt, PROD_HARM_THRESHOLDS)
        if not is_safe:
            return JSONResponse(
                status_code=400,
                content={"error": "Input rejected by content safety filter", "details": analysis},
            )

        # If input is safe, proceed to the actual endpoint
        response = await call_next(request)

        # === GUARDRAIL CHAIN: POST-PROCESSING ===
        if response.status_code == 200:
            # Consuming body_iterator drains the original response, so every
            # 200 path below must return a *new* JSONResponse.
            response_body = b""
            async for chunk in response.body_iterator:
                response_body += chunk
            response_data = json.loads(response_body)

            safety_analysis = response_data.get("safety_analysis", {})

            # Check for protected material
            if safety_analysis.get("protected_material_text") or safety_analysis.get("protected_material_code"):
                # For this example, we block. You could also replace with a citation.
                return JSONResponse(
                    status_code=400,
                    content={"error": "Response blocked due to protected material detection."},
                )

            # Check for ungroundedness: 'detected' is True when ungrounded content
            # was found (score semantics depend on your API version)
            groundedness = safety_analysis.get("groundedness", {})
            if groundedness.get("detected") or groundedness.get("score", 1.0) < 0.5:
                # Append a warning instead of blocking
                response_data["content"] += "\n\n[Warning: This response may contain information not present in the source documents and should be verified.]"
                response_data["safety_analysis"]["warning"] = "low_groundedness_score"

            return JSONResponse(content=response_data)

        return response

app.add_middleware(ResponsibleAIMiddleware)

# --- API Endpoint ---
@app.post("/chat/invoke")
async def invoke_chat(request: Request):
    """
    This endpoint is protected by the ResponsibleAIMiddleware.
    It expects a body with {'prompt': '...', 'documents': ['doc1', 'doc2']}
    """
    body = await request.json()
    user_prompt = body.get("prompt")
    documents = body.get("documents", [])

    # The middleware has already validated the prompt. Now call the LLM.
    result = await openai_service.get_grounded_completion(user_prompt, documents)

    if "error" in result:
        raise HTTPException(status_code=500, detail=result)

    return JSONResponse(content=result)

With this setup, any request to /chat/invoke is automatically passed through our entire safety pipeline. This is a clean, scalable, and non-intrusive way to enforce Responsible AI policies across your application.
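To try it end to end, run the app with uvicorn and send a request; the documents payload is a stand-in for whatever your retrieval step returns.

uvicorn main:app --reload --port 8000

curl -s -X POST http://localhost:8000/chat/invoke \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What is the refund window?", "documents": ["The refund window is 30 days."]}'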

Mapping Guardrails to EU AI Act Obligations

For organisations in Europe, the most pressing question is: "How does this help me comply with the EU AI Act?" The answer is that these technical controls map directly to specific legal obligations. Building this safety layer isn't just good engineering; it's a core component of your compliance strategy.

Here’s how each guardrail aligns with key articles of the act for high-risk AI systems:

  • Content Safety API (Art. 9, Risk Management System): Identifies, evaluates, and mitigates the risks of generating harmful content (hate, violence, etc.) at the input stage.
  • Prompt Shields on the Azure OpenAI deployment (Art. 15, Accuracy, Robustness, and Cybersecurity): Defends the system against foreseeable misuse, manipulation, and prompt injection attacks at the model endpoint; complements the Content Safety pre-scan.
  • Groundedness Detection (Art. 13, Transparency & Provision of Information): Mitigates hallucinations by ensuring outputs are based on provided data, improving factual accuracy and transparency for users.
  • Groundedness Detection (Art. 14, Human Oversight Measures): Flags ungrounded or low-confidence content, creating a signal that enables effective human review and intervention.
  • Protected Material Detection (Art. 9, Risk Management System): Manages legal and intellectual property risks by detecting and filtering third-party copyrighted text and code.
  • Safety System Messages (Art. 13, Transparency & Provision of Information): Instructs the model to scope its behavior, refuse inappropriate requests, and be transparent about its limitations.
  • Comprehensive Logging (Art. 12, Record-keeping): Every middleware decision (blocks, flags, warnings) must be logged, creating an auditable trail of safety measures in action.
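Since Art. 12 makes record-keeping a first-class obligation, here is a minimal sketch of the structured audit logging I add to the middleware; the logger name and field set are my own conventions, not an Azure or regulatory requirement.

# observability/audit_log.py -- hypothetical module
import json
import logging
import time

audit_logger = logging.getLogger("rai.audit")

def log_guardrail_decision(stage: str, decision: str, details: dict) -> None:
    """Emits one structured record per guardrail decision (Art. 12 record-keeping)."""
    audit_logger.info(json.dumps({
        "ts": time.time(),
        "stage": stage,        # e.g. "content_safety", "groundedness"
        "decision": decision,  # "allowed" | "blocked" | "warned"
        "details": details,    # category scores, ungrounded segments, etc.
    }))

# Example: log_guardrail_decision("content_safety", "blocked", {"reason": "jailbreak_attack"})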

Conclusion: From Risky Bet to Production-Ready

Shipping a raw LLM into a production environment, especially in a regulated industry, is no longer a viable option. The risks of harmful content, catastrophic hallucinations, and prompt injection attacks are too great, and the regulatory landscape, led by the EU AI Act, demands concrete, demonstrable controls.

We've walked through a complete, field-tested pattern for building a multi-layered defense. By composing Azure AI Content Safety for fast input scanning and leveraging Azure OpenAI's deeply integrated filters for advanced threats like jailbreaks and ungroundedness, you can construct a robust, compliant, and trustworthy AI application. The FastAPI middleware pattern I've shown provides a flexible and scalable way to enforce these policies centrally, ensuring your LLM operates safely within the guardrails you define.

My Field Recommendation: Don't try to boil the ocean. Start with two guardrails: Azure AI Content Safety on all user inputs with conservative thresholds, and Groundedness Detection on all RAG-based outputs. In my experience, these two controls alone mitigate the large majority of common safety and quality issues in enterprise deployments. From there, enable Azure OpenAI Prompt Shields on the deployment and add Protected Material Detection as your application's risk profile requires.

Actionable Next Step: Take the FastAPI middleware code from this guide and integrate it into a new branch of your existing LLM application. Configure it with lenient thresholds and deploy it to a staging environment. Start collecting logs on what it flags and blocks. This data will be invaluable for tuning your policies before you enforce them in production. This is how you move from theory to a tangible, enterprise-grade safety system.
