KTO training with datasets in alpaca format #3803

Cheungki · 2024-05-18T07:50:04Z

Nice work!

I'm glad to find that LLaMA-Factory supports KTO training. But training with datasets in alpaca format will lead to an error that all datapoints will be described as desired examples. A possible reason might be that examples["response"][i][0]["content"] here will always be true.

The text was updated successfully, but these errors were encountered:

hiyouga · 2024-05-18T07:58:59Z

ok I'll fix it, thanks for pointing it out

hiyouga · 2024-05-18T08:13:20Z

fixed

svjack · 2024-05-18T15:36:39Z

fixed

Where is kto_chosen_weight and kto_rejected_weight in ui ?
And if will add a auto calculate logic of this two value based on ratio between chosen and rejected sample ?

hiyouga · 2024-05-18T16:05:16Z

@svjack the webui will be updated later

svjack · 2024-05-19T02:49:21Z

@svjack the webui will be updated later

是否存在根据正副样本比例计算一个相对稳健的两个权重的方法呢

hiyouga closed this as completed in 0edc167 May 18, 2024

hiyouga added the solved This problem has been already solved label May 18, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

KTO training with datasets in alpaca format #3803

KTO training with datasets in alpaca format #3803

Cheungki commented May 18, 2024

hiyouga commented May 18, 2024

hiyouga commented May 18, 2024

svjack commented May 18, 2024

hiyouga commented May 18, 2024

svjack commented May 19, 2024

KTO training with datasets in alpaca format #3803

KTO training with datasets in alpaca format #3803

Comments

Cheungki commented May 18, 2024

hiyouga commented May 18, 2024

hiyouga commented May 18, 2024

svjack commented May 18, 2024

hiyouga commented May 18, 2024

svjack commented May 19, 2024