How to use Cake - Overview

Cake Overview

Cake is an end-to-end environment for managing the entire AI lifecycle, from data engineering and model training, all the way to inference and monitoring. This article will guide you through the high-level Cake platform architecture, providing an overview of how its design choices streamline AI operations while maintaining flexibility, security, and control.

Holistic Lifecycle Management

The Cake platform integrates the entire range of capabilities needed for managing the AI lifecycle, including:

  • Advanced large language and embedding models

  • Multi-agent systems

  • 3D parallel training and fine-tuning capabilities

  • Model monitoring and observability tools

  • GPU auto-scaling (from zero to thousands of nodes) for both training and inference

  • Exploratory data analysis (EDA) and AutoML frameworks

  • Cloud cost monitoring and optimization

  • PII/PHI anonymization utilities

Built to handle both traditional ML and generative AI workloads, Cake provides centralized management—a “single pane of glass”—to oversee every AI project.

Deployment Flexibility

Cake deploys directly into your own virtual private cloud (VPC) or on-premises infrastructure. This ensures no sensitive data ever leaves your environment. With encryption both in transit and at rest, along with robust Kubernetes role-based access controls (RBAC), Cake prioritizes security at every layer.

Every component is authenticated, and platform access is scoped based on user roles, ensuring a least-privilege model. Even the deployment itself adheres to infrastructure-as-code (IaC) principles, where all changes are version-controlled through Git repositories. This gives teams full transparency and control over their infrastructure.
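
The least-privilege idea described above can be sketched in a few lines of Python. This is a minimal illustration of role-scoped, default-deny access checks in the spirit of Kubernetes RBAC, not Cake's implementation or the Kubernetes API. The role names, users, resources, and the `is_allowed` helper are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Role:
    """A named set of (resource, verb) permissions. All names are hypothetical."""
    name: str
    permissions: frozenset = field(default_factory=frozenset)


# Hypothetical roles for illustration; these are not Cake's actual role names.
VIEWER = Role("viewer", frozenset({("models", "get"), ("models", "list")}))
TRAINER = Role("trainer", VIEWER.permissions | {("training-jobs", "create")})

# Hypothetical user-to-role bindings.
BINDINGS = {"alice": [TRAINER], "bob": [VIEWER]}


def is_allowed(user: str, resource: str, verb: str) -> bool:
    """Return True only if one of the user's roles explicitly grants the action.

    Anything not explicitly granted is denied, which is the core of least privilege.
    """
    roles = BINDINGS.get(user, [])  # unknown users have no roles, so no access
    return any((resource, verb) in role.permissions for role in roles)
```

In this sketch, `is_allowed("alice", "training-jobs", "create")` is granted through the trainer role, while the same request from `bob` (viewer only) or from a user with no binding is denied. Because the bindings are plain data, they could be kept in a Git repository and reviewed like any other change, in line with the infrastructure-as-code approach described above.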

Cake and Open Source

Using Cake effectively means knowing how to work with a carefully curated stack of open source tools and frameworks. The Cake platform is built on a modular architecture that stitches together best-in-class open technologies, augmented with Cake's custom tooling, to support every phase of the generative AI lifecycle. To make the most of the platform, users should become familiar with the foundational components across four key domains: Core, ML Ops, AI Ops, and Data Engineering.

  • Core encompasses the base platform components, such as Kubernetes, Istio, and Prometheus.

  • ML Ops focuses on model development workflows, including training, fine-tuning, experiment tracking (e.g., with MLflow), and inference (via Ray and KubeRay).

  • AI Ops deals with monitoring, tracing, alerting, and serving infrastructure for generative and agentic components, using tools like vLLM, Langfuse, and LiteLLM to ensure the operational reliability and observability of those components.

Key Core Platform Documentation

Key Components

Key Security and Access Control Documentation

Key ML Ops Documentation

Key Components

Key AI Ops Documentation

Overview Flow

AI Ops for Local and Fine-Tuned Models - Cake AI Ops for Fine-Tuned Models

Key Components

Key Data Engineering Documentation