liumy2010/UFT

UFT

Mingyang Liu, Gabriele Farina, Asuman Ozdaglar

Features

Off-Policy Optimization: Unifying supervised and reinforcement fine-tuning processes.