laolintou/scPEFT

scPEFT

This is the official repository for scPEFT: Parameter-Efficient Fine-Tuning Enhances Adaptation of Single Cell Large Language Model for Cell Type Identification.

A Quick Overview

overview

Requirements

Download the scGPT_human model checkpoint and place it at ./scGPT_human.

  1. Clone the repository:

    git clone https://github.com/laolintou/scPEFT.git
  2. Navigate to the project directory and create a conda environment:

    cd scPEFT
    conda env create -f environment.yaml
  3. Activate the conda environment:

    conda activate scGPT
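
Before running the tutorials, it can help to confirm that the checkpoint folder is in place. The following is a minimal sketch, not part of the repository: the helper name `missing_checkpoint_files` is hypothetical, and the expected file names (`args.json`, `best_model.pt`, `vocab.json`) are an assumption based on the layout used in the scGPT tutorials. Adjust the list if your download differs.

```python
from pathlib import Path

# Assumed file names, based on the layout used in the scGPT tutorials.
# This list is an assumption, not something this README specifies.
EXPECTED_FILES = ("args.json", "best_model.pt", "vocab.json")


def missing_checkpoint_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are not present in model_dir."""
    model_dir = Path(model_dir)
    return [name for name in expected if not (model_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoint_files("./scGPT_human")
    if missing:
        print("Missing from ./scGPT_human:", ", ".join(missing))
    else:
        print("Checkpoint folder looks complete.")
```

Run it from the directory that contains the scGPT_human folder.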

Data preparation

All data used in this study are publicly available.

Dataset     Link
M.S.        M.S.
Zheng68k    Zheng68k
NSCLC       NSCLC
COVID-19    COVID-19

Get Started

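First, enter the tutorials folder (the repository folder is named scPEFT if you cloned it as above, or scPEFT-main if you downloaded it as a ZIP archive):

    cd scPEFT-main/tutorials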

native

python Tutorial_Reference_Mapping.py --data_name "ms"

full finetune

train & test

python Tutorial_Annotation_Finetune.py --data_name "ms" --finetune_type "Full_finetune" --use_prompt False

finetune classifier

train & test

python Tutorial_Annotation_Finetune.py --data_name "ms" --finetune_type "Cls_finetune" --use_prompt False

Gene token prompt

train & test

python Tutorial_Annotation_Prompt.py --data_name "ms" --prompt_type "Gene_token_prompt" --use_prompt True

Gene encoder prompt

train & test

python Tutorial_Annotation_Prompt.py --data_name "ms" --prompt_type "Gene_encoder_prompt" --use_prompt True

prefix prompt

train & test

python Tutorial_Annotation_Prompt.py --data_name "ms" --prompt_type "prefix_prompt" --use_prompt True

LoRA prompt

train & test

python Tutorial_Annotation_Prompt.py --data_name "ms" --prompt_type "LoRA" --use_prompt True
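
To compare the four prompt variants on one dataset, the commands above can be generated in a loop. This is a minimal sketch under stated assumptions: the helper names `build_prompt_commands` and `run_all` are hypothetical, the script is assumed to be launched from the tutorials folder, and only the flags shown in this README are used.

```python
import subprocess
import sys

# Prompt types taken from the commands listed in this README.
PROMPT_TYPES = ["Gene_token_prompt", "Gene_encoder_prompt", "prefix_prompt", "LoRA"]


def build_prompt_commands(data_name, prompt_types=PROMPT_TYPES):
    """Build one command (as an argument list) per prompt type."""
    return [
        [
            sys.executable, "Tutorial_Annotation_Prompt.py",
            "--data_name", data_name,
            "--prompt_type", prompt_type,
            "--use_prompt", "True",
        ]
        for prompt_type in prompt_types
    ]


def run_all(data_name, dry_run=True):
    """Print each command; execute it as well when dry_run is False."""
    for cmd in build_prompt_commands(data_name):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the loop at the first failing run.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all("ms", dry_run=True)
```

With `dry_run=True` the script only prints the commands, which is a cheap way to check them before starting long training runs.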

Command Line Arguments

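data_name: the dataset name (for example "ms")

finetune_type: the fine-tuning mode for Tutorial_Annotation_Finetune.py ("Full_finetune" or "Cls_finetune")

prompt_type: the type of prompt added to the model by Tutorial_Annotation_Prompt.py ("Gene_token_prompt", "Gene_encoder_prompt", "prefix_prompt", or "LoRA")

use_prompt: whether to use a prompt (True or False)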
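
For readers adapting the scripts, here is a sketch of how flags like these could be parsed with `argparse`. It is not the repository's actual parser: the function names, the defaults, and the `choices` list are assumptions for illustration. The explicit `str2bool` converter matters because `bool("False")` evaluates to `True` in Python, so a plain `type=bool` would not treat `--use_prompt False` as false.

```python
import argparse


def str2bool(value):
    """Convert 'True'/'False' style strings to a real boolean."""
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    """Hypothetical parser mirroring the flags documented above."""
    parser = argparse.ArgumentParser(description="Illustrative scPEFT-style CLI")
    parser.add_argument("--data_name", type=str, default="ms",
                        help="dataset name")
    parser.add_argument("--prompt_type", type=str, default="LoRA",
                        choices=["Gene_token_prompt", "Gene_encoder_prompt",
                                 "prefix_prompt", "LoRA"],
                        help="type of prompt added to the model")
    parser.add_argument("--use_prompt", type=str2bool, default=True,
                        help="whether to use a prompt")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```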

Result Output Format

Weighted Accuracy: XXX, Weighted Precision: XXX, Weighted Recall: XXX, Weighted F1: XXX
-------------------------------------------------------------
                accuracy   precision    recall    f1-score    support
XX cell type         -         -          -          -           -
...
...

              accuracy                               -           -
             macro avg         -          -          -           -
          weighted avg         -          -          -           -

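Weighted Accuracy: the balanced accuracy, used in binary and multiclass classification to deal with imbalanced datasets; it is the average of the recall obtained on each cell type.

Weighted Precision: the per-cell-type precision, averaged with weights proportional to the number of cells of each type.

Weighted Recall: the per-cell-type recall, averaged with weights proportional to the number of cells of each type.

Weighted F1: the per-cell-type F1 score, averaged with weights proportional to the number of cells of each type.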
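
The definitions above can be made concrete with a small pure-Python sketch. It illustrates the formulas only and is not the repository's evaluation code; the function name `weighted_metrics` and the toy labels are hypothetical. If scikit-learn is available, `balanced_accuracy_score` and `precision_recall_fscore_support(..., average="weighted")` compute the same quantities.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Compute balanced accuracy and support-weighted precision/recall/F1."""
    labels = sorted(set(y_true) | set(y_pred))
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    support = Counter(y_true)    # number of true cells per type
    predicted = Counter(y_pred)  # number of predicted cells per type
    n = len(y_true)

    recalls = []
    w_precision = w_recall = w_f1 = 0.0
    for label in labels:
        prec = true_pos[label] / predicted[label] if predicted[label] else 0.0
        rec = true_pos[label] / support[label] if support[label] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0
        if support[label]:
            # Balanced accuracy averages recall over classes present in y_true.
            recalls.append(rec)
        weight = support[label] / n  # share of cells belonging to this type
        w_precision += weight * prec
        w_recall += weight * rec
        w_f1 += weight * f1

    return {
        "balanced_accuracy": sum(recalls) / len(recalls),
        "weighted_precision": w_precision,
        "weighted_recall": w_recall,
        "weighted_f1": w_f1,
    }


if __name__ == "__main__":
    # Toy labels for illustration only.
    y_true = ["B", "B", "B", "T", "T", "NK"]
    y_pred = ["B", "B", "T", "T", "T", "T"]
    for name, value in weighted_metrics(y_true, y_pred).items():
        print(f"{name}: {value:.4f}")
```

Note that the support-weighted recall equals the plain accuracy, whereas the balanced accuracy gives every cell type the same weight regardless of how many cells it has.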

Built With

PyTorch

scGPT

Citation

@article {He2024.01.27.577455,
	author = {Fei He and Ruixin Fei and Mingyue Gao and Li Su and Xinyu Zhang and Dong Xu},
	title = {Parameter-Efficient Fine-Tuning Enhances Adaptation of Single Cell Large Language Model for Cell Type Identification},
	year = {2024},
	doi = {10.1101/2024.01.27.577455},
	publisher = {Cold Spring Harbor Laboratory},
	URL = {https://www.biorxiv.org/content/early/2024/01/30/2024.01.27.577455},
	journal = {bioRxiv}
}
