Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KTO training with datasets in alpaca format #3803

Closed
Cheungki opened this issue May 18, 2024 · 5 comments
Closed

KTO training with datasets in alpaca format #3803

Cheungki opened this issue May 18, 2024 · 5 comments
Labels
solved This problem has been already solved

Comments

@Cheungki
Copy link

Nice work!

I'm glad to find that LLaMA-Factory supports KTO training. But training with datasets in alpaca format will lead to an error that all datapoints will be described as desired examples. A possible reason might be that examples["response"][i][0]["content"] here will always be true.

@hiyouga
Copy link
Owner

hiyouga commented May 18, 2024

ok I'll fix it, thanks for pointing it out

@hiyouga
Copy link
Owner

hiyouga commented May 18, 2024

fixed

@hiyouga hiyouga added the solved This problem has been already solved label May 18, 2024
@svjack
Copy link

svjack commented May 18, 2024

fixed

Where is kto_chosen_weight and kto_rejected_weight in ui ?
And if will add a auto calculate logic of this two value based on ratio between chosen and rejected sample ?

@hiyouga
Copy link
Owner

hiyouga commented May 18, 2024

@svjack the webui will be updated later

@svjack
Copy link

svjack commented May 19, 2024

@svjack the webui will be updated later

是否存在根据正副样本比例计算一个相对稳健的两个权重的方法呢

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
solved This problem has been already solved
Projects
None yet
Development

No branches or pull requests

3 participants