- Set up a new conda env and install the required packages:

  ```bash
  # create conda env
  conda create -n minigpt python=3.10 -y
  conda activate minigpt

  # install packages
  pip install -r requirements.txt
  ```
- This project relies on apex, which, unfortunately, you need to compile from source. Please follow the official instructions. Some tips for compiling successfully (see the version-check sketch after this list):
  - Clone the repository:

    ```bash
    git clone https://github.com/NVIDIA/apex
    ```

  - Make sure the CUDA version on your machine is equal to the version with which your installed PyTorch was built.
  - Make sure `pip >= 23.1`; otherwise run `pip install --upgrade pip`.
  - Build and install:

    ```bash
    pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings "--build-option=--cpp_ext" --config-settings "--build-option=--cuda_ext" ./
    ```
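A quick sanity check for the CUDA version match, assuming PyTorch and the CUDA toolkit are already installed:

```bash
# CUDA version your installed PyTorch was built against
python -c "import torch; print(torch.version.cuda)"
# CUDA toolkit version installed on the machine; the two should match
nvcc --version
```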
- LLaMA ckpt preparation. Please request access to the pre-trained LLaMA via this form, and organize the directory like:

  ```
  {path/to/llama}/
  |- consolidated.00.pth
  |- params.json
  |- tokenizer.model
  ```
- The `torchhub_train.json` from the Gorilla official repository has a different format from `tensorflow_train.json` and `huggingface_train.json`, so we have not conducted experiments on it yet.
- First specify `llama_path` in `finetune/scripts/finetune/finetune_7B_gorilla_{tf,hf,th}.sh`.
- Then run the script:

  ```bash
  cd finetune
  bash scripts/finetune/finetune_7B_gorilla_{tf,hf,th}.sh sdp 1
  ```

  The last parameter is the model parallel size; increase it if GPU memory is low (see the example below). For A100/A6000, you can leave it at 1.
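For example, to shard the model across 2 GPUs when a single GPU runs out of memory (the `tf` script is used here purely for illustration):

```bash
cd finetune
# last argument is the model parallel size
bash scripts/finetune/finetune_7B_gorilla_tf.sh sdp 2
```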
- To evaluate model performance, we first need to generate responses with the finetuned model.
- First copy `params.json` to the model folder:

  ```bash
  cp {path/to/llama}/params.json finetune/output/{exp_name}/{epoch*}/
  ```
- Then run the command (a fully expanded example follows this list):

  ```bash
  cd inference
  torchrun --nproc_per_node 1 gorilla_inference_full_finetune.py --dataset_path ../gorilla-main/eval/eval-data/questions/{tensorflowhub, huggingface, torchhub}/questions_{tensorflowhub, huggingface, torchhub}_0_shot.jsonl --ckpt_dir ../finetune/output/{exp_name}/{epoch*}/ --tokenizer_path {path/to/llama}/tokenizer.model
  ```

  Note: `ckpt_dir` should be a FOLDER, not a `.pth` file. Only inference on one GPU is supported.
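For instance, a fully spelled-out command for the TensorFlow Hub split might look like this, with `{exp_name}` and `{epoch*}` left as placeholders for your own experiment name and epoch folder:

```bash
cd inference
torchrun --nproc_per_node 1 gorilla_inference_full_finetune.py \
    --dataset_path ../gorilla-main/eval/eval-data/questions/tensorflowhub/questions_tensorflowhub_0_shot.jsonl \
    --ckpt_dir ../finetune/output/{exp_name}/{epoch*}/ \
    --tokenizer_path {path/to/llama}/tokenizer.model
```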
- First specify `llama_path` in `alpaca_finetuning_v1/finetune_{tf,hf,th}.sh`.
- Then run the script:

  ```bash
  cd alpaca_finetuning_v1
  bash finetune_{tf,hf,th}.sh
  ```

  Note: `--blr` is the base learning rate; we have `lr = blr * eff_batch_size / 256` in `alpaca_finetuning_v1/finetuning.py` line 237 (see the worked example below). Adjust it when you change the number of GPUs.
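As a worked example of the scaling rule (the `blr` value and effective batch size here are hypothetical, just to illustrate the arithmetic):

```bash
# lr = blr * eff_batch_size / 256
# e.g. blr = 9e-3 with eff_batch_size = 32 (4 GPUs x per-GPU batch size 8)
python -c "blr = 9e-3; eff_batch_size = 32; print(blr * eff_batch_size / 256)"  # 0.001125
```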
- After adapter finetuning, we can extract the adapter parameters from the checkpoint. Run:

  ```bash
  python extract_adapter_from_checkpoint.py --model_path ./checkpoint/{exp_name}/{pth_file}
  ```
- Run the command:

  ```bash
  cd inference
  torchrun --nproc_per_node 1 gorilla_inference_llama_adapter_v1.py --ckpt_dir {path/to/llama} --tokenizer_path {path/to/llama}/tokenizer.model --adapter_path ../alpaca_finetuning_v1/checkpoint/{exp_name}/{adapter_pth_file} --dataset_path ../gorilla-main/eval/eval-data/questions/{tensorflowhub, huggingface, torchhub}/questions_{tensorflowhub, huggingface, torchhub}_0_shot.jsonl
  ```

  Note: `ckpt_dir` should be a FOLDER, not a `.pth` file. Only inference on one GPU is supported.
- Run the Gorilla official evaluation code:

  ```bash
  cd gorilla-main/eval/eval-scripts/

  # For full finetune
  python ast_eval_{tf,hf,th}.py --api_dataset ../../data/api/{tensorflowhub_api, huggingface_api, torchhub_api}.jsonl --apibench ../../data/apibench/{tensorflow,huggingface,torchhub}_eval.json --llm_responses ../../../finetune/output/{exp_name}/{epoch*}/model_prediction_results.jsonl

  # For llama-adapter
  python ast_eval_{tf,hf,th}.py --api_dataset ../../data/api/{tensorflowhub_api, huggingface_api, torchhub_api}.jsonl --apibench ../../data/apibench/{tensorflow,huggingface,torchhub}_eval.json --llm_responses ../../../alpaca_finetuning_v1/checkpoint/{exp_name}/model_prediction_results.jsonl
  ```
Our finetuned LLaMA-adapter models and their predictions can be found at this link.
| Methods | TensorFlow Hub (overall) | TensorFlow Hub (hallu) | HuggingFace (overall) | HuggingFace (hallu) |
|---|---|---|---|---|
| Official | 83.79 | 5.40 | 71.68 | 10.95 |
| Full finetune | 88.02 | 1.02 | 69.69 | 10.29 |
| LLaMA-adapter | 86.90 | 0.74 | 63.62 | 11.83 |