Wandb | Awesome Repository

Wandb (Weights & Biases) is a centralized platform for machine learning experiment tracking, model registry management, and workflow orchestration. It provides tools for logging, visualizing, and versioning training metrics, model artifacts, and hyperparameter sweeps so that results stay reproducible across development cycles. The platform also serves as an observability tool for large language model applications, tracing execution steps, token usage, and reasoning processes.
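
To make the tracing idea concrete, here is a minimal sketch of nested execution steps with per-step token counts. It uses only the Python standard library and is not the wandb API: `Tracer`, `Span`, and the step names are hypothetical names chosen for illustration, and the token counts are made-up inputs.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Span:
    """One execution step, with its own token count and nested child steps."""
    name: str
    tokens: int = 0
    children: List["Span"] = field(default_factory=list)

    def total_tokens(self) -> int:
        # Tokens used by this step plus everything nested under it.
        return self.tokens + sum(child.total_tokens() for child in self.children)


class Tracer:
    """Hypothetical tracer for illustration only; not part of wandb."""

    def __init__(self) -> None:
        self.roots: List[Span] = []
        self._stack: List[Span] = []

    @contextmanager
    def span(self, name: str) -> Iterator[Span]:
        node = Span(name)
        # Attach to the currently open step, or record as a top-level step.
        if self._stack:
            self._stack[-1].children.append(node)
        else:
            self.roots.append(node)
        self._stack.append(node)
        try:
            yield node
        finally:
            self._stack.pop()


tracer = Tracer()
with tracer.span("answer_question") as root:
    with tracer.span("retrieve") as step:
        step.tokens = 120  # illustrative value
    with tracer.span("generate") as step:
        step.tokens = 450  # illustrative value

print([child.name for child in root.children])  # ['retrieve', 'generate']
print(root.total_tokens())  # 570
```

The context manager keeps a stack of open steps, so nesting in the trace mirrors nesting in the code without any explicit parent bookkeeping by the caller.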

The project distinguishes itself through its event-driven automation capabilities, which allow users to trigger workflows, manage training job lifecycles, and execute serverless fine-tuning tasks based on experiment results or metric thresholds. It supports complex model development by providing standardized interfaces for connecting to foundation models, deploying lightweight model adapters, and enforcing output constraints. Additionally, the platform offers deep observability into model behavior, including the ability to capture intermediate reasoning, validate long-context processing, and assess model safety.
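
The metric-threshold triggering described above can be sketched as follows. This is a standard-library illustration under stated assumptions, not the platform's automation API: `ThresholdRule`, `MetricWatcher`, the `val_loss` metric name, and the 0.30 threshold are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ThresholdRule:
    """Run `action` once, the first time `metric` is at or below `threshold`."""
    metric: str
    threshold: float
    action: Callable[[str, float], None]
    fired: bool = False


@dataclass
class MetricWatcher:
    """Hypothetical stand-in for an automation engine; not part of wandb."""
    rules: List[ThresholdRule] = field(default_factory=list)

    def log(self, metrics: Dict[str, float]) -> None:
        # Check every rule against the newly logged values.
        for rule in self.rules:
            value = metrics.get(rule.metric)
            if value is None or rule.fired:
                continue
            if value <= rule.threshold:
                rule.fired = True  # each rule fires at most once
                rule.action(rule.metric, value)


triggered: List[str] = []
watcher = MetricWatcher(
    rules=[ThresholdRule("val_loss", 0.30, lambda m, v: triggered.append(f"{m}={v}"))]
)

# Simulated validation losses from successive training epochs.
for val_loss in (0.52, 0.41, 0.29, 0.25):
    watcher.log({"val_loss": val_loss})

print(triggered)  # ['val_loss=0.29']
```

In a real workflow the action would start a downstream job rather than append to a list. The fire-once flag is a design choice that prevents the same workflow from being launched again on every later epoch that also satisfies the condition.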

Beyond core tracking, the platform monitors system resources and hardware accelerator performance, and logs rich media such as audio, video, and molecular structures. It supports team collaboration through interactive reports and provides data management features, including versioned artifact lineage, automated retention policies, and secure storage.

The system integrates into existing development environments through a command-line utility and a software development kit (SDK) that handles authentication, local service management, and asynchronous data synchronization.

Features

Machine Learning Experiment Trackers - Provides a centralized dashboard for logging, visualizing, and versioning machine learning training metrics and artifacts.
Model Lineage Trackers - Tracks dependencies between datasets, model weights, and training runs using immutable snapshots for reproducibility.
Experiment Tracking - Logs, versions, and visualizes machine learning training metrics and model artifacts to ensure reproducibility.
LLM Observability - Captures and monitors nested execution steps, token usage, and costs for complex language model applications.
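
The lineage tracking listed above (dependencies between datasets, training runs, and model weights, recorded as immutable snapshots) can be pictured with a small content-addressing sketch. It is an assumption-laden illustration using only the standard library, not wandb's artifact implementation: `Snapshot`, `LineageStore`, the artifact names, and the byte contents are hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Snapshot:
    """An immutable, content-addressed version of an artifact."""
    name: str
    digest: str
    parents: Tuple[str, ...]  # digests of the snapshots this one was derived from


class LineageStore:
    """Hypothetical in-memory store for illustration; not part of wandb."""

    def __init__(self) -> None:
        self._snapshots: Dict[str, Snapshot] = {}

    def commit(self, name: str, content: bytes, parents: Tuple[str, ...] = ()) -> Snapshot:
        # The digest covers name, parents, and content, so any change yields a new version.
        header = json.dumps({"name": name, "parents": sorted(parents)}).encode()
        digest = hashlib.sha256(header + content).hexdigest()[:12]
        # Committing identical input again returns the existing snapshot.
        return self._snapshots.setdefault(digest, Snapshot(name, digest, tuple(parents)))

    def ancestors(self, digest: str) -> List[str]:
        """Return the names of all upstream snapshots, nearest first."""
        seen, order = set(), []
        queue = list(self._snapshots[digest].parents)
        while queue:
            current = queue.pop(0)
            if current in seen:
                continue
            seen.add(current)
            order.append(self._snapshots[current].name)
            queue.extend(self._snapshots[current].parents)
        return order


store = LineageStore()
dataset = store.commit("dataset", b"rows: 10000")
run = store.commit("training-run", b"lr=3e-4", parents=(dataset.digest,))
model = store.commit("model-weights", b"\x00\x01\x02", parents=(run.digest,))

print(store.ancestors(model.digest))  # ['training-run', 'dataset']
```

Because each snapshot is frozen and identified by a hash of its inputs, a model version can always be traced back to the exact run and dataset that produced it.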
