Safe RLHF

PKU-Alignment/safe-rlhf

1,605 stars · 133 forks · Python · Apache-2.0 · pku-beaver.github.io

Features

Jailbreak Defenses - Implements safe reinforcement learning from human feedback.

Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback