Azure-Samples

AI Gateway

Built by Azure-Samples • 895 stars

What is AI Gateway?

Labs to explore AI Models, MCP servers, and Agents with the AI Gateway powered by Azure API Management and Microsoft Foundry 🚀

How to use AI Gateway?

1. Install a compatible MCP client (like Claude Desktop).
2. Open your configuration settings.
3. Add AI Gateway using the following command: npx @modelcontextprotocol/ai-gateway
4. Restart the client and verify the new tools are active.

🛡️ Scoped (Restricted)
npx @modelcontextprotocol/ai-gateway --scope restricted

🔓 Unrestricted Access
npx @modelcontextprotocol/ai-gateway

Key Features

Native MCP Protocol Support
Real-time Tool Activation & Execution
Verified High-performance Implementation
Secure Resource & Context Handling

Optimized Use Cases

Extending AI models with custom local capabilities
Automating system workflows via natural language
Connecting external data sources to LLM context windows

AI Gateway FAQ

Is AI Gateway safe?

Yes, AI Gateway follows the standardized Model Context Protocol security patterns and only executes tools with explicit user-granted permissions.

Is AI Gateway up to date?

AI Gateway is currently listed as active in the registry and has 895 stars on GitHub, which indicates community interest and support.

Are there any limits for AI Gateway?

Usage limits depend on the specific implementation of the MCP server and your system resources. Refer to the official documentation below for technical details.

Official Documentation

<!-- markdownlint-disable MD033 --> <div align="center">

✨ AI Gateway Labs

Explore the enterprise-grade gateway for managing AI Models, Tools, and Agents

<br/>

</div>

Why AI Gateway?

Building production-ready AI applications requires more than just calling model APIs. You need security, reliability, observability, and cost control, without slowing down innovation.

AI Gateway powered by Azure API Management provides:

  • 🔐 Security: OAuth 2.0, managed identities, content safety filtering
  • ⚡ Performance: Load balancing, semantic caching, request routing
  • 📊 Observability: Token metrics, built-in logging, tracing
  • 💰 Cost Control: Rate limiting, quota management, FinOps framework
  • 🔌 Extensibility: MCP protocol support, function calling, multi-model routing

📚 Explore the Labs

🔗 Browse all 30+ labs at aka.ms/ai-gateway/labs

Each lab is a hands-on Jupyter notebook with step-by-step instructions, Bicep infrastructure templates, and APIM policies you can deploy to your Azure subscription.

🧠 AI Gateway for Models

Manage and control access to Large Language Models with enterprise-grade policies.

| Lab | Description |
| --- | --- |
| Backend Pool Load Balancing | Distribute requests across multiple model endpoints |
| Token Rate Limiting | Control token consumption with rate limiting policies |
| Semantic Caching | Cache responses using vector similarity for faster, cheaper completions |
| Model Routing | Route requests to different backends based on model and version |
| FinOps Framework | Manage AI budgets with automated quota controls |
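
To make the semantic caching idea concrete, here is a minimal sketch of a cache that matches prompts by vector similarity instead of exact text. It is not the lab's implementation: the `embed` function is a toy bag-of-words stand-in for a real embedding model, and `SemanticCache`, `threshold`, and the sample prompts are hypothetical names chosen for illustration.

```python
import math
from collections import Counter
from typing import List, Optional, Tuple


def embed(text: str) -> Counter:
    """Toy embedding (assumption): lowercase bag-of-words counts.
    A real gateway would call an embedding model here."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """Returns a stored response when a new prompt is similar enough to an old one."""

    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold
        self.entries: List[Tuple[Counter, str]] = []

    def store(self, prompt: str, response: str) -> None:
        self.entries.append((embed(prompt), response))

    def lookup(self, prompt: str) -> Optional[str]:
        query = embed(prompt)
        best_score, best_response = 0.0, None
        for vector, response in self.entries:
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_response = score, response
        # Only a sufficiently similar prompt counts as a cache hit.
        return best_response if best_score >= self.threshold else None


cache = SemanticCache(threshold=0.8)
cache.store("what is the capital of france", "Paris")
hit = cache.lookup("What is the capital of France")
miss = cache.lookup("how do I bake bread")
```

A lower threshold produces more hits (cheaper, faster) at the risk of returning an answer to a different question, which is the trade-off a semantic cache has to tune.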

🔧 AI Gateway for Tools

Enable secure tool access with MCP protocol and function calling capabilities.

| Lab | Description |
| --- | --- |
| Model Context Protocol (MCP) | Plug & play tools with OAuth credential management |
| MCP Client Authorization | Implement MCP with the client authorization flow |
| Function Calling | Use OpenAI function calling with Azure Functions backend |
| Realtime Audio + MCP | Combine realtime voice API with MCP tools |
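
As a rough illustration of the function calling pattern, the sketch below dispatches tool calls in the shape the OpenAI chat completions API returns them (a function name plus a JSON-encoded `arguments` string) and builds the `tool` messages to send back. The `get_weather` tool, the `TOOLS` registry, and `run_tool_calls` are hypothetical; in the lab the function would live behind an Azure Functions backend rather than in local Python.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical tool; stands in for a call to a backend function."""
    return f"Sunny in {city}"


# Registry mapping tool names (as advertised to the model) to callables.
TOOLS = {"get_weather": get_weather}


def run_tool_calls(tool_calls: list) -> list:
    """Execute each requested function and build the `tool` reply messages."""
    results = []
    for call in tool_calls:
        fn = call["function"]
        handler = TOOLS.get(fn["name"])
        if handler is None:
            content = json.dumps({"error": f"unknown tool {fn['name']}"})
        else:
            # Arguments arrive as a JSON string, not as a dict.
            content = handler(**json.loads(fn["arguments"]))
        results.append(
            {"role": "tool", "tool_call_id": call["id"], "content": content}
        )
    return results


sample_calls = [
    {
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
    }
]
replies = run_tool_calls(sample_calls)
```

Each reply carries the originating `tool_call_id`, which is how the model pairs a result with the call that requested it.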

🤖 AI Gateway for Agents

Build and control agentic applications with orchestration frameworks.

| Lab | Description |
| --- | --- |
| AI Agent Service | Explore Foundry Agent Service with multi-service control |
| OpenAI Agents SDK | Use OpenAI Agents with Azure OpenAI and APIM-managed tools |
| Gemini MCP Agents | Integrate Google Gemini models with MCP tools |
| A2A Enabled Agents | A2A-enabled Agents with models and MCP plug & play tools |

🚀 Quick Start

Prerequisites

Get Started

# Clone the repository
git clone https://github.com/Azure-Samples/AI-Gateway.git
cd AI-Gateway

# Open VS Code and start with a lab
code .

Or launch instantly with GitHub Codespaces ☁️

🔨 Developer Tools

The tools/ folder provides utilities for testing and development:

| Tool | Description |
| --- | --- |
| Tracing | Invoke AI Foundry APIs with tracing enabled |
| Streaming | Test streaming responses from AI models |
| Rate Limit Tester | Validate rate limiting configurations |
| Mock Server | OpenAI API mock for local development and testing |
| OAuth Client | Test OAuth authentication flows |
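
When validating a rate limiting configuration, it helps to have a reference model of what the limiter should allow. The sketch below models a tokens-per-minute limit as a fixed 60-second window per caller. It is an assumption about the general technique, not the Rate Limit Tester's code or the exact algorithm APIM applies; `TokenWindowLimiter` and its parameters are hypothetical names.

```python
class TokenWindowLimiter:
    """Fixed-window model: at most `tokens_per_minute` tokens
    per 60-second window for each caller key."""

    def __init__(self, tokens_per_minute: int):
        self.limit = tokens_per_minute
        self.windows = {}  # key -> (window_index, tokens_used)

    def allow(self, key: str, tokens: int, now: float) -> bool:
        """Return True and record usage if the request fits in the current window."""
        window = int(now // 60)
        start, used = self.windows.get(key, (window, 0))
        if start != window:
            used = 0  # a new window began, so the budget resets
        if used + tokens > self.limit:
            self.windows[key] = (window, used)
            return False  # a gateway would typically answer with HTTP 429 here
        self.windows[key] = (window, used + tokens)
        return True


limiter = TokenWindowLimiter(tokens_per_minute=100)
decisions = [
    limiter.allow("app-a", 60, now=0.0),    # fits
    limiter.allow("app-a", 50, now=10.0),   # would exceed 100
    limiter.allow("app-a", 40, now=20.0),   # exactly reaches 100
    limiter.allow("app-a", 100, now=61.0),  # next window, budget reset
]
```

Passing `now` explicitly keeps the model deterministic, so the same sequence of requests can be replayed against a real gateway and compared.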

👩‍💻 Build Your Own Labs with AI

This repository includes Copilot Agent Skills that help you create new labs using AI-assisted development in VS Code.

Available Skills

| Skill | Description |
| --- | --- |
| lab-creator | Scaffolds new labs with notebooks, Bicep, and policies |
| apim-bicep | Generates Azure Bicep templates for APIM resources |
| apim-terraform | Generates Terraform configurations for APIM |
| apim-policies | Creates APIM XML policies for AI gateway scenarios |
| apim-kql | Generates queries in KQL to control models, tools and agents |
| mcp-builder | Builds MCP servers for tool integration |

Example: Create a New Lab

Open this repo in VS Code with GitHub Copilot and use this prompt:

Create a new lab called "multi-model-failover" that demonstrates how to 
implement automatic failover between different AI models when the primary 
model is unavailable or throttled. Include:
- A backend pool with priority-based routing
- Retry policy with exponential backoff
- Circuit breaker pattern for unhealthy backends
- Built-in LLM logging to track usage across all backends
- Test the model with a LangChain agent: https://docs.langchain.com/oss/python/langchain/agents
Use gpt-4.1-mini as primary and gpt-4.1-nano as fallback, deploy to Sweden Central.

Copilot will generate the complete lab structure including:

  • 📓 Jupyter notebook with step-by-step instructions
  • 🦾 Bicep infrastructure template
  • ⚙️ APIM policy XML
  • 📖 README documentation
  • 🧹 Cleanup notebook
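
The failover behaviour requested in the example prompt (priority-based routing, retries with exponential backoff, and a circuit breaker for unhealthy backends) can be sketched in plain Python as follows. This is a conceptual model only, not what Copilot generates and not APIM policy: `CircuitBreaker`, `call_with_failover`, and the backend callables are hypothetical, and the thresholds are arbitrary.

```python
import time


class CircuitBreaker:
    """Opens after `failure_threshold` consecutive failures,
    then allows a trial call once `cooldown_seconds` have passed."""

    def __init__(self, failure_threshold=3, cooldown_seconds=30.0):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self.failures = 0
        self.opened_at = None

    def available(self, now):
        if self.opened_at is None:
            return True
        return now - self.opened_at >= self.cooldown_seconds

    def record(self, ok, now):
        if ok:
            self.failures, self.opened_at = 0, None
        else:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = now


def call_with_failover(backends, breakers, request, max_retries=2,
                       base_delay=0.5, sleep=time.sleep, clock=time.monotonic):
    """Try backends in priority order; `backends` is a list of (name, callable)."""
    for name, send in backends:
        breaker = breakers[name]
        for attempt in range(max_retries + 1):
            if not breaker.available(clock()):
                break  # circuit is open: skip to the next backend
            try:
                response = send(request)
            except Exception:
                breaker.record(False, clock())
                if attempt < max_retries and breaker.available(clock()):
                    sleep(base_delay * 2 ** attempt)  # exponential backoff
                continue
            breaker.record(True, clock())
            return name, response
    raise RuntimeError("all backends failed or are circuit-open")
```

A usage sketch, with a primary that always fails and a fallback that succeeds (the model names come from the prompt above; the callables are stand-ins for real requests):

```python
def primary(req):
    raise TimeoutError("throttled")

def fallback(req):
    return "ok:" + req

breakers = {"gpt-4.1-mini": CircuitBreaker(), "gpt-4.1-nano": CircuitBreaker()}
backends = [("gpt-4.1-mini", primary), ("gpt-4.1-nano", fallback)]
served_by, answer = call_with_failover(backends, breakers, "hi", sleep=lambda s: None)
```

Injecting `sleep` and `clock` keeps the sketch testable without real delays.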

🏛️ Well-Architected Framework

Labs are designed following Azure Well-Architected Framework principles:

| Pillar | Labs |
| --- | --- |
| Security | Access controlling, Content safety, Private connectivity |
| Reliability | Backend pool load balancing, Token rate limiting |
| Performance | Semantic caching, Model routing |
| Operations | Built-in logging, Token metrics emitting |
| Cost | FinOps framework, Semantic caching |

📕 Enterprise AI Gateway e-Book

<img align="left" width="200" src="images/ebook.png" alt="Enterprise AI Gateway eBook">

Download the <a href="docs/media/Enterprise%20AI%20Gateway%20eBook%20-%20Feb%202026.pdf">Enterprise AI Gateway e-Book</a> for a comprehensive end-to-end view of the Enterprise AI Gateway pattern. It explains why a centralized governance layer is essential for organizations adopting AI at scale and how it can be practically implemented using Azure API Management and Microsoft Foundry.
It describes the AI Gateway as a control plane that mediates all interactions between AI apps and agents and the underlying models, data, and tools, enabling consistent enforcement of security, safety, cost controls, resiliency, scalability, observability, and governance. Overall, the e-Book positions the Enterprise AI Gateway as a foundational architectural component that allows enterprises to innovate rapidly with AI while maintaining trust, compliance, visibility, and control. <br clear="left"/>

🎬 Conferences & Webcasts

Learn from experts through these videos covering AI Gateway concepts and implementations.

  • Build 2025
  • Reactor Jan 2025
  • MCP Workflows
  • Supercharge your API's
  • Ignite 2024
  • Reactor Nov 2024
  • Content Safety
  • Semantic Caching
  • Token Emit Metric
  • GenAI Gateway
  • Control AI Services
  • John Savill
  • Houssem Dellai
  • Turbo360
  • A2A MCP Multiagents

📖 Resources

🤝 Contributing

Contributions are welcome! Please read CONTRIBUTING.md for guidelines.

⚠️ Disclaimer

This software is provided for demonstration purposes only. It is not intended to be relied upon for any purpose. The creators make no representations or warranties about the completeness, accuracy, reliability, or suitability of this software.

Manual Config

{
  "mcpServers": {
    "ai-gateway": {
      "command": "npx",
      "args": ["ai-gateway"]
    }
  }
}
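
If the client already has other servers configured, the entry above needs to be merged into the existing `mcpServers` object rather than replacing it. The following is a minimal sketch of that merge, assuming the configuration has been loaded as a Python dict; `add_mcp_server` and the `other-server` entry are hypothetical, and the location of the configuration file depends on the MCP client and is not specified here.

```python
import json


def add_mcp_server(config: dict, name: str, command: str, args: list) -> dict:
    """Return a copy of `config` with one entry added under `mcpServers`.
    The input dict is left unchanged."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args)}
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing configuration with one unrelated server.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
updated = add_mcp_server(existing, "ai-gateway", "npx", ["ai-gateway"])
print(json.dumps(updated, indent=2))
```

Building a new dict instead of mutating the loaded one makes it easy to diff the result against the original before writing anything back to disk.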