MCPHub Lab › Registry › llamastack/llama-stack

llamastack/llama-stack

Built by llamastack · 8,302 stars

What is llamastack/llama-stack?

Composable building blocks to build LLM Apps

How to use llamastack/llama-stack?

1. Install a compatible MCP client (such as Claude Desktop).
2. Open your configuration settings.
3. Add llamastack/llama-stack using the following command: npx @modelcontextprotocol/llamastack-llama-stack
4. Restart the client and verify the new tools are active.
🛡️ Scoped (Restricted)
npx @modelcontextprotocol/llamastack-llama-stack --scope restricted
🔓 Unrestricted Access
npx @modelcontextprotocol/llamastack-llama-stack

Key Features

Native MCP Protocol Support
Real-time Tool Activation & Execution
Verified High-performance Implementation
Secure Resource & Context Handling

Optimized Use Cases

Extending AI models with custom local capabilities
Automating system workflows via natural language
Connecting external data sources to LLM context windows

llamastack/llama-stack FAQ

Q

Is llamastack/llama-stack safe?

Yes, llamastack/llama-stack follows the standardized Model Context Protocol security patterns and only executes tools with explicit user-granted permissions.

Q

Is llamastack/llama-stack up to date?

llamastack/llama-stack is currently active in the registry with 8,302 stars on GitHub, indicating ongoing maintenance and strong community support.

Q

Are there any limits for llamastack/llama-stack?

Usage limits depend on the specific implementation of the MCP server and your system resources. Refer to the official documentation below for technical details.

Official Documentation

View on GitHub
<h1 align="center">Llama Stack</h1> <p align="center"> <a href="https://pypi.org/project/llama_stack/"><img src="https://img.shields.io/pypi/v/llama_stack?logo=pypi" alt="PyPI Version"></a> <a href="https://pypi.org/project/llama-stack/"><img src="https://img.shields.io/pypi/dm/llama-stack" alt="PyPI Downloads"></a> <a href="https://hub.docker.com/u/llamastack"><img src="https://img.shields.io/docker/pulls/llamastack/distribution-starter?logo=docker" alt="Docker Hub Pulls"></a> <a href="https://github.com/meta-llama/llama-stack/blob/main/LICENSE"><img src="https://img.shields.io/pypi/l/llama_stack.svg" alt="License"></a> <a href="https://discord.gg/llama-stack"><img src="https://img.shields.io/discord/1257833999603335178?color=6A7EC2&logo=discord&logoColor=ffffff" alt="Discord"></a> <a href="https://github.com/meta-llama/llama-stack/actions/workflows/unit-tests.yml?query=branch%3Amain"><img src="https://github.com/meta-llama/llama-stack/actions/workflows/unit-tests.yml/badge.svg?branch=main" alt="Unit Tests"></a> <a href="https://github.com/meta-llama/llama-stack/actions/workflows/integration-tests.yml?query=branch%3Amain"><img src="https://github.com/meta-llama/llama-stack/actions/workflows/integration-tests.yml/badge.svg?branch=main" alt="Integration Tests"></a> <a href="https://llamastack.github.io/docs/api-openai/conformance"><img src="https://img.shields.io/badge/dynamic/json?url=https%3A%2F%2Fraw.githubusercontent.com%2Fmeta-llama%2Fllama-stack%2Fmain%2Fdocs%2Fstatic%2Fopenai-coverage.json&query=%24.summary.conformance.score&suffix=%25&label=OpenResponses%20Conformance&color=brightgreen" alt="OpenResponses Conformance"></a> <a href="https://deepwiki.com/llamastack/llama-stack"><img src="https://deepwiki.com/badge.svg" alt="Ask DeepWiki"></a> </p>

Quick Start | Documentation | OpenAI API Compatibility | Discord

Open-source agentic API server for building AI applications. OpenAI-compatible. Any model, any infrastructure.

<p align="center"> <img src="docs/static/img/architecture-animated.svg" alt="Llama Stack Architecture" width="100%"> </p>

Llama Stack is a drop-in replacement for the OpenAI API that you can run anywhere — your laptop, your datacenter, or the cloud. Use any OpenAI-compatible client or agentic framework. Swap between Llama, GPT, Gemini, Mistral, or any model without changing your application code.

from openai import OpenAI

# Point any OpenAI client at the local Llama Stack server; the api_key is
# a placeholder, used when the server does not enforce authentication.
client = OpenAI(base_url="http://localhost:8321/v1", api_key="fake")
response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[{"role": "user", "content": "Hello"}],
)

What you get

  • Chat Completions & Embeddings — standard /v1/chat/completions, /v1/completions, and /v1/embeddings endpoints, compatible with any OpenAI client
  • Responses API — server-side agentic orchestration with tool calling, MCP server integration, and built-in file search (RAG) in a single API call (learn more)
  • Vector Stores & Files — /v1/vector_stores and /v1/files for managed document storage and search
  • Batches — /v1/batches for offline batch processing
  • Open Responses conformant — the Responses API implementation passes the Open Responses conformance test suite
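As a sketch of the wire format behind the endpoints above, the request bodies can be assembled with the standard library alone. The field names follow the OpenAI API; the store name and file id are made-up placeholders:

```python
import json

# Body for POST /v1/vector_stores -- create a managed document store
# ("product-docs" is an illustrative name).
create_store = {"name": "product-docs"}

# Body for POST /v1/batches -- submit an offline batch job over an
# uploaded file (file id "file-abc123" is a placeholder).
create_batch = {
    "input_file_id": "file-abc123",
    "endpoint": "/v1/chat/completions",
    "completion_window": "24h",
}

print(json.dumps(create_store))
print(json.dumps(create_batch, sort_keys=True))
```

Because the server speaks the OpenAI wire format, the same bodies work whether they are sent via the official client or raw HTTP.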

Use any model, use any infrastructure

Llama Stack has a pluggable provider architecture. Develop locally with Ollama, deploy to production with vLLM, or connect to a managed service — the API stays the same.

See the provider documentation for the full list.
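One way this pluggability shows up in application code: the backend is chosen purely by configuration, and only the endpoint URL changes. A minimal sketch, where both URLs and the APP_ENV variable are illustrative assumptions:

```python
import os

# Map deployment environments to OpenAI-compatible endpoints.
# Both URLs are illustrative placeholders.
BACKENDS = {
    "dev": "http://localhost:8321/v1",      # Llama Stack over Ollama
    "prod": "http://vllm-gateway:8321/v1",  # Llama Stack over vLLM
}

def base_url(env=None):
    """Return the endpoint for the given (or current) environment."""
    return BACKENDS[env or os.environ.get("APP_ENV", "dev")]

print(base_url("prod"))
```

Everything else in the application, including model calls and tool use, stays identical across environments.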

Get started

Install and run a Llama Stack server:

# One-line install
curl -LsSf https://github.com/llamastack/llama-stack/raw/main/scripts/install.sh | bash

# Or install via uv
uv pip install llama-stack

# Start the server (uses the starter distribution with Ollama)
llama stack run

Then connect with any OpenAI client — Python, TypeScript, curl, or any framework that speaks the OpenAI API.
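For clients without an SDK, the same call works over raw HTTP. A sketch using only the standard library; the request is constructed but not sent, so it can be inspected offline (port 8321 is the server's default):

```python
import json
import urllib.request

# JSON body in the OpenAI chat-completions format.
body = json.dumps({
    "model": "llama-3.3-70b",
    "messages": [{"role": "user", "content": "Hello"}],
}).encode()

# Build the POST request against the local server started above.
req = urllib.request.Request(
    "http://localhost:8321/v1/chat/completions",
    data=body,
    headers={"Content-Type": "application/json"},
    method="POST",
)
print(req.get_method(), req.full_url)
```

Sending it with `urllib.request.urlopen(req)` against a running server returns the standard OpenAI-style JSON response.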

See the Quick Start guide for detailed setup.

Resources

Client SDKs:

| Language   | SDK                           | Package |
|------------|-------------------------------|---------|
| Python     | llama-stack-client-python     | PyPI    |
| TypeScript | llama-stack-client-typescript | NPM     |

Community

We hold regular community calls every Thursday at 09:00 AM PST — see the Community Event on Discord for details.

Star History Chart

Thanks to all our amazing contributors!

<a href="https://github.com/meta-llama/llama-stack/graphs/contributors"> <img src="https://contrib.rocks/image?repo=meta-llama/llama-stack" alt="Llama Stack contributors" /> </a>

Global Ranking

Trust Score: 8.5 (MCPHub Index)

Based on codebase health & activity.

Manual Config

{
  "mcpServers": {
    "llamastack-llama-stack": {
      "command": "npx",
      "args": ["llamastack-llama-stack"]
    }
  }
}
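Before pasting the manual config into an MCP client's settings file, it can be round-tripped through a JSON parser to catch syntax errors. A small stdlib sketch (the string mirrors the config above):

```python
import json

# The manual config as a raw string; json.loads raises ValueError on
# malformed JSON, catching mistakes before the MCP client sees them.
raw = ('{"mcpServers": {"llamastack-llama-stack": '
       '{"command": "npx", "args": ["llamastack-llama-stack"]}}}')
config = json.loads(raw)

server = config["mcpServers"]["llamastack-llama-stack"]
print(server["command"], *server["args"])
```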