-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
webui训练GLM4-9B-chat模型时候报错,数据集少的时候没问题,一多就报错 #4928
Comments
我用的当前能下载的最新的llamafactory版本,遇到了同样的问题 |
Reminder
System Info
llamafactory
version: 0.8.3.dev0Reproduction
Traceback (most recent call last):
File "C:\Users\Administrator\PycharmProjects\LLaMA-Factory\venv\lib\site-packages\multiprocess\pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "C:\Users\Administrator\PycharmProjects\LLaMA-Factory\venv\lib\site-packages\datasets\utils\py_utils.py", line 678, in _write_generator_to_queue
for i, result in enumerate(func(**kwargs)):
File "C:\Users\Administrator\PycharmProjects\LLaMA-Factory\venv\lib\site-packages\datasets\arrow_dataset.py", line 3552, in _map_single
batch = apply_function_on_filtered_inputs(
File "C:\Users\Administrator\PycharmProjects\LLaMA-Factory\venv\lib\site-packages\datasets\arrow_dataset.py", line 3421, in apply_function_on_filtered_inputs
processed_inputs = function(*fn_args, *additional_args, **fn_kwargs)
File "C:\Users\Administrator\PycharmProjects\LLaMA-Factory\src\llamafactory\data\aligner.py", line 123, in convert_sharegpt
if dataset_attr.system_tag and messages[0][dataset_attr.role_tag] == dataset_attr.system_tag:
IndexError: list index out of range
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in run_code
exec(code, run_globals)
File "D:\AIchatgpt\venv\Scripts\llamafactory-cli.exe_main.py", line 7, in
sys.exit(main())
File "C:\Users\Administrator\PycharmProjects\LLaMA-Factory\src\llamafactory\cli.py", line 111, in main
run_exp()
File "C:\Users\Administrator\PycharmProjects\LLaMA-Factory\src\llamafactory\train\tuner.py", line 50, in run_exp
run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
File "C:\Users\Administrator\PycharmProjects\LLaMA-Factory\src\llamafactory\train\sft\workflow.py", line 46, in run_sft
dataset = get_dataset(model_args, data_args, training_args, stage="sft", **tokenizer_module)
File "C:\Users\Administrator\PycharmProjects\LLaMA-Factory\src\llamafactory\data\loader.py", line 174, in get_dataset
all_datasets.append(load_single_dataset(dataset_attr, model_args, data_args, training_args))
File "C:\Users\Administrator\PycharmProjects\LLaMA-Factory\src\llamafactory\data\loader.py", line 140, in load_single_dataset
return align_dataset(dataset, dataset_attr, data_args, training_args)
File "C:\Users\Administrator\PycharmProjects\LLaMA-Factory\src\llamafactory\data\aligner.py", line 233, in align_dataset
return dataset.map(
File "C:\Users\Administrator\PycharmProjects\LLaMA-Factory\venv\lib\site-packages\datasets\arrow_dataset.py", line 602, in wrapper
out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
File "C:\Users\Administrator\PycharmProjects\LLaMA-Factory\venv\lib\site-packages\datasets\arrow_dataset.py", line 567, in wrapper
out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
File "C:\Users\Administrator\PycharmProjects\LLaMA-Factory\venv\lib\site-packages\datasets\arrow_dataset.py", line 3253, in map
for rank, done, content in iflatmap_unordered(
File "C:\Users\Administrator\PycharmProjects\LLaMA-Factory\venv\lib\site-packages\datasets\utils\py_utils.py", line 718, in iflatmap_unordered
[async_result.get(timeout=0.05) for async_result in async_results]
File "C:\Users\Administrator\PycharmProjects\LLaMA-Factory\venv\lib\site-packages\datasets\utils\py_utils.py", line 718, in
[async_result.get(timeout=0.05) for async_result in async_results]
File "C:\Users\Administrator\PycharmProjects\LLaMA-Factory\venv\lib\site-packages\multiprocess\pool.py", line 774, in get
raise self._value
IndexError: list index out of range
Expected behavior
No response
Others
No response
The text was updated successfully, but these errors were encountered: