
fix torch-npu dependency #4561

Merged
merged 8 commits on Jun 27, 2024

Conversation

hashstone

What does this PR do?

Fixes a torch-npu dependency conflict like the following:

#0 11.87 Collecting triton==2.1.0 (from torch==2.1.0->llamafactory==0.8.3.dev0)
#0 11.90   Downloading triton-2.1.0-0-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (1.3 kB)
#0 11.90 INFO: pip is looking at multiple versions of torch-npu to determine which version is compatible with other requirements. This could take a while.
#0 11.90 ERROR: Cannot install llamafactory and llamafactory[metrics,torch-npu]==0.8.3.dev0 because these package versions have conflicting dependencies.
#0 11.90 
#0 11.90 The conflict is caused by:
#0 11.90     llamafactory[metrics,torch-npu] 0.8.3.dev0 depends on torch==2.1.0; extra == "torch-npu"
#0 11.90     torch-npu 2.1.0.post3 depends on torch==2.1.0+cpu
#0 11.90 
#0 11.90 To fix this you could try to:
#0 11.90 1. loosen the range of package versions you've specified
#0 11.90 2. remove package versions to allow pip to attempt to solve the dependency conflict
#0 11.90 
#0 11.90 ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts


@hiyouga hiyouga added the pending This problem is yet to be addressed label Jun 26, 2024
@hiyouga
Owner

hiyouga commented Jun 26, 2024

cc @MengqingCao

@MengqingCao
Contributor

@hashstone Thanks for the fix! But the installation of torch-npu differs by architecture. Maybe adding an extra "torch-npu-x86" to setup.py and choosing to install "torch-npu" or "torch-npu-x86" depending on the architecture in the Dockerfile would be a solution. @hiyouga What do you think?
https://pypi.org/project/torch-npu/2.1.0.post3/
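The architecture split proposed above could be sketched in setup.py roughly like this. This is a hypothetical sketch: the extra names come from this thread, but the exact dependency lists are assumptions inferred from the conflict log, not the code that was merged.

```python
# Hypothetical sketch of architecture-specific extras, inferred from the
# pip conflict log above -- NOT the exact lists merged in the PR.
EXTRAS_REQUIRE = {
    # aarch64: the torch-npu wheel is compatible with the plain torch wheel.
    "torch-npu": ["torch==2.1.0", "torch-npu==2.1.0.post3"],
    # x86_64: torch-npu 2.1.0.post3 pins torch==2.1.0+cpu, which comes from
    # the PyTorch CPU wheel index rather than plain PyPI.
    "torch-npu-x86": ["torch==2.1.0+cpu", "torch-npu==2.1.0.post3"],
}

# In setup.py this dict would be passed as:
#   setup(..., extras_require=EXTRAS_REQUIRE)
```

With two separate extras, pip only ever sees one torch pin per install, so the torch==2.1.0 vs torch==2.1.0+cpu clash from the log cannot arise.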

@hashstone
Author

@MengqingCao I have modified it as you proposed; please review.

@@ -1,28 +1,35 @@
 # Use the Ubuntu 22.04 image with CANN 8.0.rc1
 # More versions can be found at https://hub.docker.com/r/cosdt/cann/tags
-FROM cosdt/cann:8.0.rc1-910b-ubuntu22.04
+FROM --platform=$TARGETPLATFORM cosdt/cann:8.0.rc1-910b-ubuntu22.04
Contributor
This can be done automatically by Docker; let's use uname -i to get the architecture info.


 # Copy the rest of the application into the image
 COPY . /app

 # Install the LLaMA Factory
-RUN EXTRA_PACKAGES="torch-npu,metrics"; \
+RUN EXTRA_PACKAGES="metrics"; \
+if [ "$TARGETPLATFORM" == "linux/arm64" ]; then \
Contributor
if [ "$(uname -i)" == "aarch64" ]; then \
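The reviewer's uname -i suggestion can be sketched as a small shell helper. This is illustrative only: the function name pick_extras is hypothetical, and the extras strings follow this thread rather than the final merged Dockerfile.

```shell
# Hypothetical helper: map a machine architecture string to the extras set.
# uname -i typically prints aarch64 on arm64 hosts and x86_64 on amd64 hosts
# (on some distros it can print "unknown" -- verify before relying on it).
pick_extras() {
    if [ "$1" = "aarch64" ]; then
        echo "torch-npu,metrics"
    else
        echo "torch-npu-x86,metrics"
    fi
}

EXTRA_PACKAGES="$(pick_extras "$(uname -i)")"
echo "$EXTRA_PACKAGES"
```

Because Docker executes RUN steps under the target architecture (via QEMU emulation for cross-builds), uname -i reflects the image being built without needing ARG TARGETPLATFORM to be declared in the Dockerfile.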

Owner

@hiyouga hiyouga left a comment


We made some modifications, and this PR can be merged.

@hiyouga hiyouga merged commit a6bf74c into hiyouga:main Jun 27, 2024
@hiyouga hiyouga added solved This problem has been already solved and removed pending This problem is yet to be addressed labels Jun 27, 2024
@hiyouga
Owner

hiyouga commented Jun 27, 2024

Hi @hashstone , if possible, could you please help us verify whether the problem has been fixed or not, using the current version: https://github.com/hiyouga/LLaMA-Factory/blob/main/docker/docker-npu/Dockerfile ? thanks!

@hashstone
Author

> Hi @hashstone, if possible, could you please help us verify whether the problem has been fixed or not, using the current version: https://github.com/hiyouga/LLaMA-Factory/blob/main/docker/docker-npu/Dockerfile ? thanks!

In my local build environment, cross-building via docker build --platform linux/arm64 or docker build --platform linux/amd64 passes.

@hiyouga
Owner

hiyouga commented Jun 28, 2024

> Hi @hashstone, if possible, could you please help us verify whether the problem has been fixed or not, using the current version: https://github.com/hiyouga/LLaMA-Factory/blob/main/docker/docker-npu/Dockerfile ? thanks!

> In my local build environment, cross-building via docker build --platform linux/arm64 or docker build --platform linux/amd64 passes.

Great news, thanks for the verification

Labels
solved This problem has been already solved
3 participants