Maybe a conflict between accelerate and transformers CLIPVisionModel #3339

striveAgain · 2025-01-13T06:52:59Z

System Info

- `Accelerate` version: 1.0.1
- Platform: Linux-5.15.0-124-generic-x86_64-with-glibc2.31
- `accelerate` bash location: ~
- Python version: 3.9.19
- Numpy version: 1.24.4
- PyTorch version (GPU?): 2.0.1+cu117 (True)
- PyTorch XPU available: False
- PyTorch NPU available: False
- PyTorch MLU available: False
- PyTorch MUSA available: False
- System RAM: 503.53 GB
- GPU type: NVIDIA A100-SXM4-40GB
- `Accelerate` default config:
        Not found

Information

The official example scripts
My own modified scripts

Tasks

One of the scripts in the examples/ folder of Accelerate or an officially supported no_trainer script in the examples folder of the transformers repo (such as run_no_trainer_glue.py)
My own task or dataset (give details below)

Reproduction

from accelerate import Accelerator
from transformers import CLIPVisionModelWithProjection

# opt, accelerator_project_config, ddp_kwargs and weight_dtype are defined
# elsewhere in the training script and are not shown in this snippet.
accelerator = Accelerator(
    gradient_accumulation_steps=opt["training_opt"]["gradient_accumulation_steps"],
    mixed_precision=opt["training_opt"]["mixed_precision"],
    log_with=opt["tracker_opt"]["report_to"],
    project_config=accelerator_project_config,
    kwargs_handlers=[ddp_kwargs],
)

image_encoder = CLIPVisionModelWithProjection.from_pretrained(opt["module_opt"]["image_encoder"]["pretrained_model_path"])
image_encoder.to(accelerator.device, dtype=weight_dtype)

Expected behavior

I am using accelerate and transformers to train my model in a single-node, multi-GPU environment, with the parameters of CLIPVisionModelWithProjection frozen. When I start an experiment on, for example, GPUs 4, 5, 6, and 7 (4 GPUs in total), I notice that the processes for GPUs 5, 6, and 7 also consume memory on GPU 4, approximately 510 MB. Once I remove CLIPVisionModelWithProjection, the extra memory usage on GPU 4 disappears. What could be the reason for this? Is there a solution?
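
To make the symptom easier to confirm and report, here is a minimal diagnostic sketch, assuming `nvidia-smi` is on the PATH of the training node. It lists every process that holds memory on more than one GPU, which is the pattern described above (ranks for GPUs 5, 6, and 7 also appearing on GPU 4). The helper names `parse_compute_apps`, `pids_on_multiple_gpus`, and `report` are hypothetical and not part of accelerate or transformers; the sketch only observes the problem and does not identify its cause.

```python
import subprocess
from collections import defaultdict

# Query per-process GPU memory; these nvidia-smi fields print one CSV row
# per (process, GPU) pair, with memory in MiB when "nounits" is set.
QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,gpu_uuid,used_memory",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(csv_text):
    """Map each PID to {gpu_uuid: used_memory_mib} from nvidia-smi CSV output."""
    usage = defaultdict(dict)
    for line in csv_text.strip().splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 3:
            continue  # skip blank or malformed rows
        pid, gpu_uuid, mem = parts
        try:
            usage[int(pid)][gpu_uuid] = int(mem)
        except ValueError:
            continue  # skip rows where a field is not numeric
    return dict(usage)


def pids_on_multiple_gpus(usage):
    """Keep only the processes that hold memory on more than one GPU."""
    return {pid: gpus for pid, gpus in usage.items() if len(gpus) > 1}


def report():
    """Run nvidia-smi and print every PID that appears on several GPUs."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    for pid, gpus in pids_on_multiple_gpus(parse_compute_apps(out)).items():
        print(pid, gpus)
```

Calling `report()` once with CLIPVisionModelWithProjection loaded and once without it should show whether the extra allocations on the first GPU come from the other ranks' PIDs.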
