Overview
Open WebUI is a user-friendly, browser-based interface for interacting with large language models (LLMs), often used for development, testing, and end-user-facing chat experiences. When used in conjunction with LiteLLM, which acts as a proxy layer for LLM endpoints, Open WebUI becomes a powerful front end for accessing and managing model-backed conversational workflows.
By connecting Open WebUI to a LiteLLM-proxied model, users benefit from a flexible and consistent API layer that can abstract away the complexity of managing multiple backends (OpenAI, Anthropic, local vLLMs, etc.), and allow seamless switching, monitoring, and control.
What This Connection Enables
When Open WebUI connects to a LiteLLM proxy, it enables the following:
Chat-based interaction with deployed models, including fine-tuned or experimental variants.
Live testing of prompts, personas, or workflows before integrating with production systems.
Dynamic model switching without modifying frontend logic, thanks to LiteLLM’s uniform interface (see the sketch after this list).
Access control, rate limiting, and logging, all centralized at the proxy layer, ensuring more secure and observable operations.
Preconfigured system prompts and parameter tuning managed through LiteLLM’s configuration files or API.
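As a rough illustration of the model-switching point above: because LiteLLM exposes every backend through the same OpenAI-compatible API, only the model name changes between calls. This is a minimal sketch, assuming the openai Python package, a LiteLLM virtual key, and placeholder model names; substitute the models your proxy actually serves.

```python
# Minimal sketch: the same OpenAI-compatible call works for any model behind the
# LiteLLM proxy; only the `model` string changes between backends.
from openai import OpenAI

client = OpenAI(
    base_url="https://litellm.<cake url>",  # LiteLLM proxy base URL (placeholder)
    api_key="sk-...",                       # virtual key issued by LiteLLM (placeholder)
)

# Hypothetical model names; use the ones configured on your proxy.
for model in ["gpt-4o", "claude-3-5-sonnet", "my-local-vllm-model"]:
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Say hello in one sentence."}],
    )
    print(model, "->", resp.choices[0].message.content)
```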
Why Use Open WebUI with LiteLLM?
Fast Prototyping and Feedback Loops
Developers and non-technical stakeholders can quickly iterate on prompt designs or assess model behaviors in a familiar chat interface. This helps close the feedback loop between model training, deployment, and UX testing.
Unified Access Point Across Models
LiteLLM supports multiple backends, and by centralizing access through a proxy, Open WebUI users don’t need to worry about backend-specific quirks. Whether you're testing a model hosted on vLLM, HuggingFace, or OpenAI, the experience remains consistent.
Safe Public or Internal Exposure
With LiteLLM in front, security features like token authentication, logging, or access gating can be applied before exposing the chat UI to end users or internal teams. This is especially useful in staging environments or internal tooling.
Rich Observability and Control
When combined with Langfuse and Prometheus/Grafana, this setup allows every Open WebUI session to be logged, traced, and visualized. Engineers can see how the model behaves in live settings and monitor usage patterns or failures.
Customization and Extensions
Open WebUI can be configured to add memory, tools, and plugins, extending beyond static chat into interactive agent-based workflows. When used with LiteLLM and orchestration layers (e.g., LangChain, LangGraph), it becomes a gateway to advanced applications like RAG, planning agents, or multi-step pipelines.
Connecting Open WebUI to LiteLLM enables a modular, secure, and observable way to interact with powerful LLMs, streamlining development, evaluation, and deployment workflows across the Cake AI platform.
Instructions
Create a key in LiteLLM for Open WebUI to use.
We recommend creating a new key for every application that uses LiteLLM so that usage stats can be tracked per application. It can also make sense to generate separate keys for each model proxied through LiteLLM. (A scripted alternative using the LiteLLM API is sketched after the steps below.)
Navigate to LiteLLM
Create a new key
Copy the secret key for the next step.
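If you prefer to script key creation rather than use the LiteLLM UI, the proxy also exposes a key-generation endpoint. This is a minimal sketch, assuming the requests package, an admin (master) key, and the shared proxy URL shown later in this guide; the key alias and commented-out model scope are illustrative placeholders.

```python
# Minimal sketch: create a LiteLLM virtual key for Open WebUI via the proxy API.
import requests

LITELLM_URL = "https://litellm.<cake url>"  # placeholder; see the shared instance URL below
ADMIN_KEY = "sk-..."                        # placeholder admin/master key

resp = requests.post(
    f"{LITELLM_URL}/key/generate",
    headers={"Authorization": f"Bearer {ADMIN_KEY}"},
    json={
        "key_alias": "open-webui",          # one key per application, per the recommendation above
        # "models": ["gpt-4o"],             # optionally scope the key to specific proxied models
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["key"])                   # the secret key to copy into Open WebUI
```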
Add the LiteLLM key to Open WebUI as an Administrator
This means all Open WebUI users will be able to access the models this key exposes
You may also add keys to Open WebUI as an individual user
While individual user configuration of LiteLLM is slower than having an admin set it for everyone, there are a few advantages to this approach:
Users use their own keys, so their model usage can be allocated and budgeted separately
Logging of LLM requests can be more individualized
User requests are made through the browser rather than the backend, which allows users to call models deployed on their own machines
Navigate to Open WebUI > Settings > Connections
Add a connection:
URL: Open WebUI’s URL with litellm appended to it.
Key: The key from the previous step.
Check the connection
Save
The current shared instance of LiteLLM is available at:
https://litellm.<cake url>/models
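To sanity-check the URL and key outside of Open WebUI, you can ask the proxy to list the models the key can see. A minimal sketch, assuming the requests package, the shared instance URL above, and the key created in the first step:

```python
# Minimal sketch: confirm the key can reach the LiteLLM proxy and list models.
import requests

LITELLM_URL = "https://litellm.<cake url>"  # shared instance base URL (placeholder)
OPEN_WEBUI_KEY = "sk-..."                   # the key created for Open WebUI (placeholder)

resp = requests.get(
    f"{LITELLM_URL}/models",
    headers={"Authorization": f"Bearer {OPEN_WEBUI_KEY}"},
    timeout=30,
)
resp.raise_for_status()
for m in resp.json()["data"]:
    print(m["id"])  # each id should show up as a selectable model in Open WebUI
```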
NOTE: The Projects feature of Cake will make additional LiteLLM URLs available for project-based LiteLLM instances
You should now have Open WebUI connected to your model
Security
A great overview of Open WebUI permission configuration is here:
https://docs.openwebui.com/features/workspace/permissions/
Troubleshooting
A troubleshooting guide to Open WebUI is here:
https://docs.openwebui.com/troubleshooting/
Logs
System logs for the Open WebUI app can be accessed via Lens, generally in the langfuse-web log.
Metrics
Metrics for Open WebUI can be found in Grafana in the pods dashboard.