Getting Started with Anomaly Detection Toolkit (ADTK)

Introduction

The Anomaly Detection Toolkit (ADTK) is an open-source Python library for building, testing, and deploying interpretable anomaly detection pipelines, particularly for univariate and multivariate time series. ADTK lets teams monitor key metrics such as model latency, traffic volumes, engagement rates, and embedding drift in a lightweight, customizable, and explainable way. It is well suited to catching issues early in both ML-driven systems and operational workflows where thresholds, trends, and statistical expectations are critical.

Key benefits of using ADTK include:

Modular and Declarative: Provides a simple, scikit-learn–like API for chaining together preprocessing, smoothing, thresholding, and anomaly detection models.
Flexible Detection Methods: Supports statistical approaches (e.g., Z-score, quantiles), rule-based methods (e.g., level shift, volatility shift), and customizable detectors.
Time Series Native: Designed specifically for time-indexed data, with native handling of missing values, resampling, and rolling windows.
Explainable and Lightweight: Each detection method is interpretable and transparent—ideal for operational dashboards and root-cause analysis.
Python and Pandas Compatible: Operates on pandas Series and DataFrame objects with a datetime index, so it integrates with Cake’s Python-based data stack for quick adoption in Airflow, Prefect, or Dagster pipelines.
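
To make the idea of chaining smoothing and thresholding steps concrete, here is a minimal sketch in plain Python using only the standard library. It is not ADTK's API: the helper names (`rolling_mean`, `zscore_flags`, `run_pipeline`) and the latency values are hypothetical and exist only for illustration. The sketch smooths a series with a trailing rolling mean, then flags smoothed points whose z-score exceeds a threshold.

```python
from statistics import mean, pstdev


def rolling_mean(values, window):
    """Trailing rolling mean; positions before the window fills are None."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(mean(values[i + 1 - window : i + 1]))
    return out


def zscore_flags(values, threshold):
    """Flag points whose absolute z-score exceeds `threshold`; None stays None."""
    present = [v for v in values if v is not None]
    mu, sigma = mean(present), pstdev(present)
    flags = []
    for v in values:
        if v is None:
            flags.append(None)
        elif sigma == 0:
            flags.append(False)
        else:
            flags.append(abs(v - mu) / sigma > threshold)
    return flags


def run_pipeline(values, steps):
    """Apply each step to the output of the previous one."""
    for step in steps:
        values = step(values)
    return values


# Hypothetical latency readings in milliseconds with a short sustained spike.
latency_ms = [100, 102, 98, 101, 99, 100, 103, 97,
              400, 410, 405, 100, 99, 101, 100, 102]

flags = run_pipeline(latency_ms, [
    lambda s: rolling_mean(s, window=3),     # smoothing step
    lambda s: zscore_flags(s, threshold=2.0),  # thresholding step
])
anomalous_positions = [i for i, f in enumerate(flags) if f]
print(anomalous_positions)
```

Because the smoothing step averages three readings, only the position where the window sits entirely inside the spike clears the threshold. A shorter window or a lower threshold would flag the spike's edges as well, which is the kind of trade-off a modular pipeline makes easy to adjust.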

Use Cases

ADTK is used to detect:

Model inference anomalies such as latency spikes, throughput drops, or unexpected error rates in serving platforms like vLLM, KServe, or Triton.
Drift in user interaction data, including sudden changes in input length, query volume, or downstream engagement with LLM applications.
Operational metric deviations like DAG execution failures, scheduled task delays, or unexpected load on Kubernetes resources.
Data quality issues in structured pipelines—such as missing features, abnormal feature distributions, or upstream data outages.
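
As an illustration of how a level shift, such as a sudden drop in query volume, can be detected, the following sketch compares the medians of two adjacent sliding windows. It is a simplified, standard-library-only approximation written for this page, not ADTK code: the function name `level_shift_flags`, the fixed `min_shift` threshold, and the sample data are all assumptions.

```python
from statistics import median


def level_shift_flags(values, window, min_shift):
    """Flag index i when the median of the `window` points starting at i
    differs from the median of the `window` points just before i by more
    than `min_shift`. Indices without two full windows stay False."""
    flags = [False] * len(values)
    for i in range(window, len(values) - window + 1):
        left = median(values[i - window : i])
        right = median(values[i : i + window])
        flags[i] = abs(right - left) > min_shift
    return flags


# Hypothetical requests-per-minute series: the level drops from about 500
# to about 300 at index 8.
query_volume = [500, 505, 498, 502, 497, 503, 499, 501,
                300, 304, 298, 302, 299, 301, 297, 303]

flags = level_shift_flags(query_volume, window=4, min_shift=150)
shift_positions = [i for i, f in enumerate(flags) if f]
print(shift_positions)
```

The flagged positions cluster around the change point rather than marking a single index, because every pair of windows that mostly straddles the drop sees a large median difference. For this kind of change ADTK provides a `LevelShiftAD` detector, which the sketch above does not use.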

ADTK integrates well with observability tools like Grafana, Prometheus, and Arize Phoenix, and complements more advanced ML monitoring frameworks like NannyML, Deepchecks, or Evidently for multi-layer visibility. By adopting ADTK, Cake adds a lightweight, interpretable, and production-ready layer of anomaly detection that helps teams catch issues faster, reduce downtime, and build trust in automated and ML-powered systems.

Important Links

Main Site

Documentation