[NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward (see the sketch after this list)
[ACL 2024 Findings] Knowledgeable Preference Alignment for LLMs in Domain-specific Question Answering
[NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$
Source code for "A Dense Reward View on Aligning Text-to-Image Diffusion with Preference" (ICML'24).
Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization
Survey of preference alignment algorithms
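For context on the headline SimPO entry, here is a minimal sketch of its reference-free objective as described in the paper title: the reward is the length-normalized average log-probability of a response under the policy itself, so no frozen reference model is needed, and the loss is a Bradley-Terry-style sigmoid loss with a target reward margin. The function name `simpo_loss`, the tensor arguments, and the `beta`/`gamma` defaults are hypothetical and illustrative, not the repository's actual API or tuned values.

```python
import torch
import torch.nn.functional as F

def simpo_loss(
    chosen_logps: torch.Tensor,    # summed token log-probs of preferred responses
    rejected_logps: torch.Tensor,  # summed token log-probs of dispreferred responses
    chosen_lens: torch.Tensor,     # token counts of preferred responses
    rejected_lens: torch.Tensor,   # token counts of dispreferred responses
    beta: float = 2.0,             # reward scale (illustrative default)
    gamma: float = 0.5,            # target reward margin (illustrative default)
) -> torch.Tensor:
    # Reference-free reward: length-normalized average log-probability,
    # in contrast to DPO's log-ratio against a frozen reference model.
    r_chosen = beta * chosen_logps / chosen_lens
    r_rejected = beta * rejected_logps / rejected_lens
    # Bradley-Terry-style loss: push the chosen reward to exceed the
    # rejected reward by at least the margin gamma.
    return -F.logsigmoid(r_chosen - r_rejected - gamma).mean()
```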