This is the official repository for scPEFT: Parameter-Efficient Fine-Tuning Enhances Adaptation of Single Cell Large Language Model for Cell Type Identification
Download the scGPT_human model checkpoint and place it at ./scGPT_human
- Clone the repository:
  `git clone https://github.com/laolintou/scPEFT.git`
- Navigate to the project directory and create a conda environment:
  `cd scPEFT && conda env create -f environment.yaml`
- Activate the conda environment:
  `conda activate scGPT`
All data used in this study are publicly available.
Dataset | Link |
---|---|
M.S. | M.S. |
Zheng68k | Zheng68k |
NSCLC | NSCLC |
COVID-19 | COVID-19 |
First, enter the tutorials folder: `cd scPEFT-main/tutorials`
python Tutorial_Reference_Mapping.py --data_name "ms"
python Tutorial_Annotation_Finetune.py --data_name "ms" --finetune_type "Full_finetune" --use_prompt False
python Tutorial_Annotation_Finetune.py --data_name "ms" --finetune_type "Cls_finetune" --use_prompt False
python Tutorial_Annotation_Prompt.py --data_name "ms" --prompt_type "Gene_token_prompt" --use_prompt True
python Tutorial_Annotation_Prompt.py --data_name "ms" --prompt_type "Gene_encoder_prompt" --use_prompt True
python Tutorial_Annotation_Prompt.py --data_name "ms" --prompt_type "prefix_prompt" --use_prompt True
python Tutorial_Annotation_Prompt.py --data_name "ms" --prompt_type "LoRA" --use_prompt True
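data_name: name of the dataset (e.g. "ms")
finetune_type: the fine-tuning mode (Full_finetune or Cls_finetune)
prompt_type: the type of prompt added to the model (Gene_token_prompt, Gene_encoder_prompt, prefix_prompt, or LoRA)
use_prompt: whether to use a prompt (True or False)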
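To run every configuration listed above in one pass, the commands can be assembled programmatically. The following is a minimal sketch, not part of the repository: the `RUNS` table and the `build_commands` helper are hypothetical names introduced here, and the flag values are copied from the commands above. It only prints the commands; launching them is left commented out.

```python
import shlex
import sys

# Script names and flags copied from the commands listed in this README.
# RUNS and build_commands are hypothetical helpers, not part of scPEFT.
RUNS = [
    ("Tutorial_Reference_Mapping.py", {}),
    ("Tutorial_Annotation_Finetune.py", {"finetune_type": "Full_finetune", "use_prompt": "False"}),
    ("Tutorial_Annotation_Finetune.py", {"finetune_type": "Cls_finetune", "use_prompt": "False"}),
    ("Tutorial_Annotation_Prompt.py", {"prompt_type": "Gene_token_prompt", "use_prompt": "True"}),
    ("Tutorial_Annotation_Prompt.py", {"prompt_type": "Gene_encoder_prompt", "use_prompt": "True"}),
    ("Tutorial_Annotation_Prompt.py", {"prompt_type": "prefix_prompt", "use_prompt": "True"}),
    ("Tutorial_Annotation_Prompt.py", {"prompt_type": "LoRA", "use_prompt": "True"}),
]


def build_commands(data_name="ms", python=sys.executable):
    """Return one argument list per tutorial run for the given dataset."""
    commands = []
    for script, options in RUNS:
        cmd = [python, script, "--data_name", data_name]
        for key, value in options.items():
            cmd += [f"--{key}", value]
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    for cmd in build_commands():
        print(shlex.join(cmd))
        # To actually launch a run from the tutorials folder:
        # import subprocess; subprocess.run(cmd, check=True)
```

Passing a different `data_name` reuses the same set of runs for another dataset, assuming the tutorial scripts accept that name.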
Weighted Accuracy: XXX, Weighted Precision: XXX, Weighted Recall: XXX, Weighted F1: XXX
-------------------------------------------------------------
accuracy precision recall f1-score support
XX cell type - - - - -
...
...
accuracy - -
macro avg - - - -
weighted avg - - - -
Weighted Accuracy: balanced accuracy (the average of per-class recall), used in binary and multiclass classification to deal with imbalanced datasets.
Weighted Precision: precision averaged across cell types, weighted by the number of cells of each type.
Weighted Recall: recall averaged across cell types, weighted by the number of cells of each type.
Weighted F1: F1 score averaged across cell types, weighted by the number of cells of each type.
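The four metrics above can be illustrated with a small pure-Python sketch. It assumes the usual definitions (balanced accuracy as the unweighted mean of per-class recall, and the other three as support-weighted averages of the per-class values); the function name `weighted_metrics` and the toy labels are hypothetical, and the repository's own implementation may differ in detail.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Balanced accuracy plus support-weighted precision, recall and F1.

    Assumes standard definitions; not taken from the scPEFT code base.
    """
    labels = sorted(set(y_true))
    support = Counter(y_true)      # number of true cells per type
    predicted = Counter(y_pred)    # number of predicted cells per type
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    n = len(y_true)

    balanced_acc = precision = recall = f1 = 0.0
    for label in labels:
        r = correct[label] / support[label]
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = support[label] / n          # share of cells of this type
        balanced_acc += r / len(labels)      # every type counts equally
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return {
        "balanced_accuracy": balanced_acc,
        "weighted_precision": precision,
        "weighted_recall": recall,
        "weighted_f1": f1,
    }


# Toy example with three hypothetical cell types.
y_true = ["B", "B", "B", "T", "T", "NK"]
y_pred = ["B", "B", "T", "T", "T", "NK"]
print(weighted_metrics(y_true, y_pred))
```

Because balanced accuracy gives each cell type equal weight, a rare type that is misclassified lowers it more than it lowers the support-weighted scores.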
@article {He2024.01.27.577455,
author = {Fei He and Ruixin Fei and Mingyue Gao and Li Su and Xinyu Zhang and Dong Xu},
title = {Parameter-Efficient Fine-Tuning Enhances Adaptation of Single Cell Large Language Model for Cell Type Identification},
year = {2024},
doi = {10.1101/2024.01.27.577455},
publisher = {Cold Spring Harbor Laboratory},
URL = {https://www.biorxiv.org/content/early/2024/01/30/2024.01.27.577455},
journal = {bioRxiv}
}