MCPHub Lab › Registry › llamastack/llama-stack

llamastack/llama-stack

Built by llamastack · 8,302 stars

What is llamastack/llama-stack?

Composable building blocks to build LLM Apps

How to use llamastack/llama-stack?

1. Install a compatible MCP client (such as Claude Desktop).
2. Open your configuration settings.
3. Add llamastack/llama-stack using the following command: npx @modelcontextprotocol/llamastack-llama-stack
4. Restart the client and verify the new tools are active.
🛡️ Scoped (Restricted)
npx @modelcontextprotocol/llamastack-llama-stack --scope restricted
🔓 Unrestricted Access
npx @modelcontextprotocol/llamastack-llama-stack

Key Features

Native MCP Protocol Support
Real-time Tool Activation & Execution
Verified High-performance Implementation
Secure Resource & Context Handling

Optimized Use Cases

Extending AI models with custom local capabilities
Automating system workflows via natural language
Connecting external data sources to LLM context windows

llamastack/llama-stack FAQ

Q

Is llamastack/llama-stack safe?

Yes, llamastack/llama-stack follows the standardized Model Context Protocol security patterns and only executes tools with explicit user-granted permissions.

Q

Is llamastack/llama-stack up to date?

llamastack/llama-stack is currently active in the registry with 8,302 stars on GitHub, indicating ongoing maintenance and strong community support.

Q

Are there any limits for llamastack/llama-stack?

Usage limits depend on the specific implementation of the MCP server and your system resources. Refer to the official documentation below for technical details.

Official Documentation

View on GitHub
<h1 align="center">Llama Stack</h1> <p align="center"> <a href="https://pypi.org/project/llama_stack/"><img src="https://img.shields.io/pypi/v/llama_stack?logo=pypi" alt="PyPI Version"></a> <a href="https://pypi.org/project/llama-stack/"><img src="https://img.shields.io/pypi/dm/llama-stack" alt="PyPI Downloads"></a> <a href="https://hub.docker.com/u/llamastack"><img src="https://img.shields.io/docker/pulls/llamastack/distribution-starter?logo=docker" alt="Docker Hub Pulls"></a> <a href="https://github.com/meta-llama/llama-stack/blob/main/LICENSE"><img src="https://img.shields.io/pypi/l/llama_stack.svg" alt="License"></a> <a href="https://discord.gg/llama-stack"><img src="https://img.shields.io/discord/1257833999603335178?color=6A7EC2&logo=discord&logoColor=ffffff" alt="Discord"></a> <a href="https://github.com/meta-llama/llama-stack/actions/workflows/unit-tests.yml?query=branch%3Amain"><img src="https://github.com/meta-llama/llama-stack/actions/workflows/unit-tests.yml/badge.svg?branch=main" alt="Unit Tests"></a> <a href="https://github.com/meta-llama/llama-stack/actions/workflows/integration-tests.yml?query=branch%3Amain"><img src="https://github.com/meta-llama/llama-stack/actions/workflows/integration-tests.yml/badge.svg?branch=main" alt="Integration Tests"></a> <a href="https://llamastack.github.io/docs/api-openai/conformance"><img src="https://img.shields.io/badge/dynamic/json?url=https%3A%2F%2Fraw.githubusercontent.com%2Fmeta-llama%2Fllama-stack%2Fmain%2Fdocs%2Fstatic%2Fopenai-coverage.json&query=%24.summary.conformance.score&suffix=%25&label=OpenResponses%20Conformance&color=brightgreen" alt="OpenResponses Conformance"></a> <a href="https://deepwiki.com/llamastack/llama-stack"><img src="https://deepwiki.com/badge.svg" alt="Ask DeepWiki"></a> </p>

Quick Start | Documentation | OpenAI API Compatibility | Discord

Open-source agentic API server for building AI applications. OpenAI-compatible. Any model, any infrastructure.

<p align="center"> <img src="docs/static/img/architecture-animated.svg" alt="Llama Stack Architecture" width="100%"> </p>

Llama Stack is a drop-in replacement for the OpenAI API that you can run anywhere — your laptop, your datacenter, or the cloud. Use any OpenAI-compatible client or agentic framework. Swap between Llama, GPT, Gemini, Mistral, or any model without changing your application code.

from openai import OpenAI

# Point any OpenAI client at the local Llama Stack server; the api_key is
# a placeholder, used when the server does not enforce authentication.
client = OpenAI(base_url="http://localhost:8321/v1", api_key="fake")
response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[{"role": "user", "content": "Hello"}],
)

What you get

  • Chat Completions & Embeddings — standard /v1/chat/completions, /v1/completions, and /v1/embeddings endpoints, compatible with any OpenAI client
  • Responses API — server-side agentic orchestration with tool calling, MCP server integration, and built-in file search (RAG) in a single API call (learn more)
  • Vector Stores & Files — /v1/vector_stores and /v1/files for managed document storage and search
  • Batches — /v1/batches for offline batch processing
  • Open Responses conformant — the Responses API implementation passes the Open Responses conformance test suite
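As a sketch of the wire format behind the endpoints above, the request bodies can be assembled with the standard library alone. The field names follow the OpenAI API; the store name and file id are made-up placeholders:

```python
import json

# Body for POST /v1/vector_stores -- create a managed document store
# ("product-docs" is an illustrative name).
create_store = {"name": "product-docs"}

# Body for POST /v1/batches -- submit an offline batch job over an
# uploaded file (file id "file-abc123" is a placeholder).
create_batch = {
    "input_file_id": "file-abc123",
    "endpoint": "/v1/chat/completions",
    "completion_window": "24h",
}

print(json.dumps(create_store))
print(json.dumps(create_batch, sort_keys=True))
```

Because the server speaks the OpenAI wire format, the same bodies work whether they are sent via the official client or raw HTTP.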

Use any model, use any infrastructure

Llama Stack has a pluggable provider architecture. Develop locally with Ollama, deploy to production with vLLM, or connect to a managed service — the API stays the same.

See the provider documentation for the full list.
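One way this pluggability shows up in application code: the backend is chosen purely by configuration, and only the endpoint URL changes. A minimal sketch, where both URLs and the APP_ENV variable are illustrative assumptions:

```python
import os

# Map deployment environments to OpenAI-compatible endpoints.
# Both URLs are illustrative placeholders.
BACKENDS = {
    "dev": "http://localhost:8321/v1",      # Llama Stack over Ollama
    "prod": "http://vllm-gateway:8321/v1",  # Llama Stack over vLLM
}

def base_url(env=None):
    """Return the endpoint for the given (or current) environment."""
    return BACKENDS[env or os.environ.get("APP_ENV", "dev")]

print(base_url("prod"))
```

Everything else in the application, including model calls and tool use, stays identical across environments.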

Get started

Install and run a Llama Stack server:

# One-line install
curl -LsSf https://github.com/llamastack/llama-stack/raw/main/scripts/install.sh | bash

# Or install via uv
uv pip install llama-stack

# Start the server (uses the starter distribution with Ollama)
llama stack run

Then connect with any OpenAI client — Python, TypeScript, curl, or any framework that speaks the OpenAI API.
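For clients without an SDK, the same call works over raw HTTP. A sketch using only the standard library; the request is constructed but not sent, so it can be inspected offline (port 8321 is the server's default):

```python
import json
import urllib.request

# JSON body in the OpenAI chat-completions format.
body = json.dumps({
    "model": "llama-3.3-70b",
    "messages": [{"role": "user", "content": "Hello"}],
}).encode()

# Build the POST request against the local server started above.
req = urllib.request.Request(
    "http://localhost:8321/v1/chat/completions",
    data=body,
    headers={"Content-Type": "application/json"},
    method="POST",
)
print(req.get_method(), req.full_url)
```

Sending it with `urllib.request.urlopen(req)` against a running server returns the standard OpenAI-style JSON response.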

See the Quick Start guide for detailed setup.

Resources

Client SDKs:

| Language   | SDK                           | Package |
|------------|-------------------------------|---------|
| Python     | llama-stack-client-python     | PyPI    |
| TypeScript | llama-stack-client-typescript | NPM     |

Community

We hold regular community calls every Thursday at 09:00 AM PST — see the Community Event on Discord for details.

Star History Chart

Thanks to all our amazing contributors!

<a href="https://github.com/meta-llama/llama-stack/graphs/contributors"> <img src="https://contrib.rocks/image?repo=meta-llama/llama-stack" alt="Llama Stack contributors" /> </a>

Global Ranking

Trust Score: 8.5 (MCPHub Index)

Based on codebase health & activity.

Manual Config

{
  "mcpServers": {
    "llamastack-llama-stack": {
      "command": "npx",
      "args": ["llamastack-llama-stack"]
    }
  }
}
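Before pasting the manual config into an MCP client's settings file, it can be round-tripped through a JSON parser to catch syntax errors. A small stdlib sketch (the string mirrors the config above):

```python
import json

# The manual config as a raw string; json.loads raises ValueError on
# malformed JSON, catching mistakes before the MCP client sees them.
raw = ('{"mcpServers": {"llamastack-llama-stack": '
       '{"command": "npx", "args": ["llamastack-llama-stack"]}}}')
config = json.loads(raw)

server = config["mcpServers"]["llamastack-llama-stack"]
print(server["command"], *server["args"])
```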