Vowpal Wabbit is an open-source machine learning system designed for online learning, where models update incrementally from streaming data without requiring full retraining. It provides a reduction-based learning framework that composes complex tasks from simpler algorithms, and includes a feature hashing trick that maps unbounded feature names into a fixed-size vector space to keep memory usage constant regardless of dataset size. The system supports distributed training across a cluster using an allreduce protocol for synchronized updates, and offers an active learning query strategy that s
This repository contains the code to replicate the synthetic experiment conducted in the paper "Doubly Robust Off-Policy Evaluation for Ranking Policies under the Cascade Behavior Model" by Haruka Kiyohara, Yuta Saito, Tatsuya Matsuhiro, Yusuke Narita, Nobuyuki Shimizu, and Yasuo Yamamoto, which…
This repository contains the code for replicating the experiments of the paper "Optimal Off-Policy Evaluation from Multiple Logging Policies" (ICML2021, proceedings.mlr.press/v139/kallus21a.html)