PKU-Alignment/safe-rlhf
1,605 stars · 133 forks · Python · Apache-2.0 · pku-beaver.github.io

Safe RLHF

Features

Jailbreak Defenses - Implements safe reinforcement learning from human feedback.