
Unable to interrupt multi-GPU training in the webui #3978

Closed
1 task done
injet-zhou opened this issue May 30, 2024 · 0 comments · Fixed by #3987
Labels
solved This problem has been already solved

Comments

@injet-zhou
Contributor

Reminder

  • I have read the README and searched the existing issues.

Reproduction

  1. Launch the webui with: CUDA_VISIBLE_DEVICES='0,1' llamafactory-cli webui
  2. Fill in the model name or model path
  3. Click the Start button


Expected behavior

Training terminates successfully when interrupted.
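For context on why interrupting multi-GPU runs is harder than single-GPU ones: with CUDA_VISIBLE_DEVICES='0,1', the CLI typically spawns a distributed launcher that in turn spawns one worker per GPU, so signalling only the direct child can leave the per-GPU workers running. A common remedy (a minimal sketch only, not necessarily what PR #3987 does) is to start the trainer as the leader of its own process group and signal the whole group:

```python
import os
import signal
import subprocess

# Hypothetical sketch: run the training command as the leader of a new
# process group, so an interrupt reaches every descendant worker process.
# The sleeping child below stands in for a multi-process trainer.
proc = subprocess.Popen(
    ["python", "-c", "import time; time.sleep(60)"],
    start_new_session=True,  # child becomes leader of a new session/group
)

# To interrupt training, signal the entire group rather than proc.pid
# alone; killing only the parent would orphan per-GPU worker processes.
os.killpg(os.getpgid(proc.pid), signal.SIGTERM)
proc.wait(timeout=10)

# A negative return code means the child died from that signal.
print(proc.returncode)
```

This relies on POSIX process-group semantics, so it applies to the Linux setup reported below but not to Windows.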

System Info

  • current llama-factory revision (commit id): 97346c1
  • OS: Ubuntu 22.04
  • transformers version: 4.41.1
  • Platform: Linux-5.15.0-43-generic-x86_64-with-glibc2.35
  • Python version: 3.10.14
  • Huggingface_hub version: 0.23.2
  • Safetensors version: 0.4.3
  • Accelerate version: 0.30.1
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.3.0+cu121 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?: Yes
  • Using distributed or parallel set-up in script?: Yes

Others

No response

@hiyouga hiyouga added the pending This problem is yet to be addressed label May 30, 2024
@hiyouga hiyouga added solved This problem has been already solved and removed pending This problem is yet to be addressed labels Jun 3, 2024