
cannot access local variable 'kwargs' where it is not associated with a value #763

Closed
crankyadmin opened this issue Sep 1, 2023 · 2 comments
Labels: solved (This problem has been already solved)

Comments


crankyadmin commented Sep 1, 2023

Hi,

Firstly can I just say, awesome project! Thank you for your efforts.

I'm currently facing an issue when running training from the web UI.

```
Loading cached processed dataset at /home/david/.cache/huggingface/datasets/text/default-c1c19be682713dfa/0.0.0/c4a140d10f020282918b5dd1b8a49f0104729c6177f60a6b49ec2a365ec69f34/cache-0aebf50c61b7948a.arrow
Running tokenizer on dataset:   0%|          | 0/50 [00:00<?, ? examples/s]
Exception in thread Thread-10 (run_exp):
Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1038, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 975, in run
    self._target(*self._args, **self._kwargs)
  File "/home/david/Code/github/LLaMA-Efficient-Tuning/src/llmtuner/tuner/tune.py", line 24, in run_exp
    run_pt(model_args, data_args, training_args, finetuning_args, callbacks)
  File "/home/david/Code/github/LLaMA-Efficient-Tuning/src/llmtuner/tuner/pt/workflow.py", line 26, in run_pt
    dataset = preprocess_dataset(dataset, tokenizer, data_args, training_args, stage="pt")
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/david/Code/github/LLaMA-Efficient-Tuning/src/llmtuner/dsets/preprocess.py", line 165, in preprocess_dataset
    dataset = dataset.map(
              ^^^^^^^^^^^^
  File "/home/david/Code/github/LLaMA-Efficient-Tuning/venv/lib/python3.11/site-packages/datasets/arrow_dataset.py", line 592, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/david/Code/github/LLaMA-Efficient-Tuning/venv/lib/python3.11/site-packages/datasets/arrow_dataset.py", line 557, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/david/Code/github/LLaMA-Efficient-Tuning/venv/lib/python3.11/site-packages/datasets/arrow_dataset.py", line 3097, in map
    for rank, done, content in Dataset._map_single(**dataset_kwargs):
  File "/home/david/Code/github/LLaMA-Efficient-Tuning/venv/lib/python3.11/site-packages/datasets/arrow_dataset.py", line 3474, in _map_single
    batch = apply_function_on_filtered_inputs(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/david/Code/github/LLaMA-Efficient-Tuning/venv/lib/python3.11/site-packages/datasets/arrow_dataset.py", line 3353, in apply_function_on_filtered_inputs
    processed_inputs = function(*fn_args, *additional_args, **fn_kwargs)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/david/Code/github/LLaMA-Efficient-Tuning/src/llmtuner/dsets/preprocess.py", line 42, in preprocess_pretrain_dataset
    tokenized_examples = tokenizer(examples["prompt"], **kwargs)
                                                         ^^^^^^
UnboundLocalError: cannot access local variable 'kwargs' where it is not associated with a value
```
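
The failing frame is `tokenized_examples = tokenizer(examples["prompt"], **kwargs)` in `preprocess_pretrain_dataset`, and Python raises `UnboundLocalError` when a local is assigned on one branch but read on a path where that assignment never ran. A minimal sketch of that pattern (a hypothetical simplification; the actual branch condition in `preprocess.py` may differ):

```python
# Hypothetical simplification of the failure pattern in preprocess_pretrain_dataset.
def preprocess_pretrain_dataset(examples, tokenizer):
    if hasattr(tokenizer, "tokenizer"):  # assumed special case for some tokenizers
        kwargs = dict(allowed_special="all")  # `kwargs` is bound only on this branch
    # Any tokenizer that skips the branch above (e.g. the LLaMA tokenizer) reaches
    # this line with `kwargs` never assigned, raising:
    # UnboundLocalError: cannot access local variable 'kwargs' ...
    return tokenizer(examples["prompt"], **kwargs)
```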

And the model config:

```
[INFO|configuration_utils.py:775] 2023-09-01 15:25:44,828 >> Model config LlamaConfig {
  "_name_or_path": "meta-llama/Llama-2-7b-hf",
  "architectures": [
    "LlamaForCausalLM"
  ],
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 11008,
  "max_position_embeddings": 4096,
  "model_type": "llama",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 32,
  "pretraining_tp": 1,
  "rms_norm_eps": 1e-05,
  "rope_scaling": null,
  "tie_word_embeddings": false,
  "torch_dtype": "float16",
  "transformers_version": "4.32.1",
  "use_cache": true,
  "vocab_size": 32000
}
```

Any hints on what may be causing it?

Thanks again.

Edit: the following command produces the same error:

```sh
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
    --stage pt \
    --model_name_or_path meta-llama/Llama-2-7b-hf \
    --do_train \
    --dataset wiki_demo \
    --template default \
    --finetuning_type lora \
    --lora_target q_proj,v_proj \
    --output_dir path_to_pt_checkpoint \
    --overwrite_cache \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --save_steps 1000 \
    --learning_rate 5e-5 \
    --num_train_epochs 3.0 \
    --plot_loss \
    --fp16
```

hiyouga closed this as completed in 370bdb6 on Sep 1, 2023
hiyouga (Owner) commented Sep 1, 2023

Fixed, please update the code
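
For anyone landing here before pulling: the general shape of the fix is to make sure `kwargs` is bound on every path before the tokenizer call. A hedged sketch, not the literal patch in 370bdb6:

```python
# Sketch of the fix shape (assumed; see commit 370bdb6 for the actual change):
# give `kwargs` a default so every path through the function binds it.
def preprocess_pretrain_dataset(examples, tokenizer):
    kwargs = dict(add_special_tokens=True)  # assumed default for regular tokenizers
    if hasattr(tokenizer, "tokenizer"):  # assumed special-case branch
        kwargs = dict(allowed_special="all")
    return tokenizer(examples["prompt"], **kwargs)
```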

hiyouga added the solved label on Sep 1, 2023
crankyadmin (Author) commented

That solved it. Thanks!
