Vowpal Wabbit is an open-source machine learning system designed for online learning, where models update incrementally from streaming data without requiring full retraining. It provides a reduction-based learning framework that composes complex tasks from simpler algorithms, and includes a feature hashing trick that maps unbounded feature names into a fixed-size vector space to keep memory usage constant regardless of dataset size. The system supports distributed training across a cluster using an allreduce protocol for synchronized updates, and offers an active learning query strategy that s
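The combination of feature hashing and incremental updates described above can be illustrated with a minimal sketch. This is not Vowpal Wabbit's own implementation: `zlib.crc32` stands in for whatever hash function the system uses, and `NUM_BITS`, `hash_feature`, `hash_features`, and `sgd_step` are hypothetical names chosen for illustration. It only shows why memory stays fixed (the weight table never grows) while the model learns one example at a time.

```python
import zlib

NUM_BITS = 18               # assumed table size for illustration: 2**18 slots
NUM_SLOTS = 1 << NUM_BITS


def hash_feature(name: str) -> int:
    """Map an arbitrary feature name to a slot index in [0, NUM_SLOTS)."""
    # crc32 is a deterministic stand-in hash; the mask keeps the low NUM_BITS bits.
    return zlib.crc32(name.encode("utf-8")) & (NUM_SLOTS - 1)


def hash_features(features: dict) -> dict:
    """Turn {name: value} into a sparse {slot: value} map; colliding names add up."""
    hashed = {}
    for name, value in features.items():
        idx = hash_feature(name)
        hashed[idx] = hashed.get(idx, 0.0) + value
    return hashed


def sgd_step(weights: list, features: dict, label: float, lr: float = 0.1) -> float:
    """One online squared-loss update on a single example; returns the pre-update prediction."""
    x = hash_features(features)
    pred = sum(weights[i] * v for i, v in x.items())
    err = pred - label
    for i, v in x.items():
        weights[i] -= lr * err * v  # gradient step touches only the active slots
    return pred


# Usage: the weight table has a fixed length no matter how many distinct
# feature names appear in the stream.
weights = [0.0] * NUM_SLOTS
stream = [
    ({"word=cat": 1.0, "word=sat": 1.0}, 1.0),
    ({"word=dog": 1.0, "word=ran": 1.0}, 0.0),
]
for features, label in stream:
    sgd_step(weights, features, label)
```

The trade-off in this design is that distinct feature names can collide in the same slot; a larger `NUM_BITS` reduces collisions at the cost of a bigger table.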
This repository contains the code to replicate the synthetic experiment conducted in the paper "Doubly Robust Off-Policy Evaluation for Ranking Policies under the Cascade Behavior Model" by Haruka Kiyohara, Yuta Saito, Tatsuya Matsuhiro, Yusuke Narita, Nobuyuki Shimizu, and Yasuo Yamamoto, which…
This repository contains the code used for the experiments in "Off-Policy Evaluation for Large Action Spaces via Embeddings (ICML2022)" by Yuta Saito and Thorsten Joachims.