
Bump to PyTorch 2.0.0 #165

Merged 3 commits on Apr 4, 2023
Conversation

Tobias-Fischer
Contributor

@Tobias-Fischer Tobias-Fischer commented Apr 2, 2023

Checklist

  • Used a personal fork of the feedstock to propose changes
  • Bumped the build number (if the version is unchanged)
  • Reset the build number to 0 (if the version changed)
  • Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
  • Ensured the license file is being packaged.

@conda-forge-webservices
Contributor

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@Tobias-Fischer
Contributor Author

Fixes #151

@Tobias-Fischer Tobias-Fischer marked this pull request as draft April 2, 2023 21:10
@Tobias-Fischer
Contributor Author

Still struggling with some issues.

  1. For the non-CUDA build on linux-64, at least locally, the run dependencies of the pytorch package are not pulled in by the pytorch-cpu package (which is really weird), and this causes the build to fail because sympy cannot be found when importing torch.

  2. Locally the cuda build fails with

conda.CondaMultiError: The package for cudnn located at /home/conda/feedstock_root/build_artifacts/pkg_cache/cudnn-8.4.1.50-hed8a83a_0
appears to be corrupted. The path 'bin/.cudnn-post-link.sh'
specified in the package manifest cannot be found.

The package for cudnn located at /home/conda/feedstock_root/build_artifacts/pkg_cache/cudnn-8.4.1.50-hed8a83a_0
appears to be corrupted. The path 'include/cudnn.h'
specified in the package manifest cannot be found.
  3. Have not tested the osx builds.

Let's see what happens in CI.

@Tobias-Fischer
Contributor Author

It seems like this is working quite well :)

Unfortunately linux-aarch64 errors with:

ImportError: /home/conda/feedstock_root/build_artifacts/pytorch-recipe_1680469918528/_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_p/lib/python3.11/site-packages/torch/lib/libtorch_cpu.so: undefined symbol: _ZNK6google8protobuf7Message11GetTypeNameEv

Any ideas where this issue might be from @conda-forge/pytorch-cpu @conda-forge/libprotobuf @conda-forge/protobuf? It seems like it could be related to the gcc version (NVIDIA-AI-IOT/torch2trt#53)? I can only find this issue on conda-forge but it does not seem to apply here (mixing defaults and conda-forge channels): conda-forge/paraview-feedstock#23

Several issues hint towards problems with onnx/caffe2.
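One way to investigate this kind of undefined-symbol failure is to probe the library directly. A minimal stdlib sketch follows; it uses libc's printf as a stand-in target, since probing libtorch_cpu.so for the mangled protobuf symbol requires the built artifact:

```python
import ctypes

# Sketch: check whether a loaded shared object resolves a given symbol.
# We probe the running process's own namespace (CDLL(None)) for "printf"
# as a stand-in; on the failing build one would dlopen libtorch_cpu.so
# and look up the mangled protobuf symbol from the traceback instead.
def has_symbol(lib: ctypes.CDLL, symbol: str) -> bool:
    try:
        getattr(lib, symbol)  # raises AttributeError if unresolved
        return True
    except AttributeError:
        return False

lib = ctypes.CDLL(None)  # handle to the current process's symbols
print(has_symbol(lib, "printf"))          # True on Linux/macOS
print(has_symbol(lib, "no_such_symbol"))  # False
```

On the actual artifact, `nm -D libtorch_cpu.so | grep GetTypeName` would also show whether the protobuf symbol is undefined (`U`) or provided (`T`).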

@hmaarrfk
Contributor

hmaarrfk commented Apr 3, 2023

My guess is that somebody is re-exporting the symbols publicly. It may be the vendored onnx?

@Tobias-Fischer Tobias-Fischer marked this pull request as ready for review April 3, 2023 09:33
@Tobias-Fischer
Contributor Author

Looks like this is now ready for review. Could someone else please test a cuda build locally?

@hmaarrfk
Contributor

hmaarrfk commented Apr 3, 2023

I'm building now. About 6 hours per build, so about 12 hours to go.

Do we have a "test" script for GPU usage?

I'm kinda out of creative stamina for the day, so if you have an idea I would be all ears.

@Tobias-Fischer
Contributor Author

Thanks @hmaarrfk!

We can test the gpu builds with

>>> import torch
>>> torch.cuda.is_available()
True
>>> torch.cuda.device_count()
1

@hmaarrfk
Contributor

hmaarrfk commented Apr 3, 2023

>>> import torch
>>> torch.cuda.is_available()
True
>>> torch.cuda.device_count()
1
>>> print(torch.__version__)
2.0.0.post200

is the post expected?

@hmaarrfk
Contributor

hmaarrfk commented Apr 3, 2023

Builds starting to upload https://anaconda.org/mark.harfouche/pytorch/files if people want a more thorough test.

@Tobias-Fischer
Contributor Author

There are a few instances of post200 when searching on the PyTorch repo: https://github.com/search?q=repo%3Apytorch%2Fpytorch+post200&type=issues

Not sure what it refers to.

@hmaarrfk
Contributor

hmaarrfk commented Apr 3, 2023

Of course, pytorch needs to have a custom version numbering system ^_^

https://github.com/pytorch/pytorch/blob/73b06a0268bb89c09a86f16fa0f72818baa4b250/tools/generate_torch_version.py#L51

They use the CONDA_BUILD_NUMBER and add it to the version number.

I can't fully trace it, but it seems to be related to the build number.

Let's maybe flag this as an issue, but I don't want to patch too much at this stage.
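As a rough sketch of the behavior described above, assuming the suffix simply comes from appending the build number as a PEP 440 post-release segment (the function name is illustrative, not pytorch's actual code):

```python
# Illustrative reconstruction of how a ".postN" suffix can arise from
# the conda build number, per the generate_torch_version.py logic
# linked above. This is a sketch, not pytorch's actual implementation.
def torch_style_version(base: str, build_number: int) -> str:
    if build_number > 1:
        return f"{base}.post{build_number}"
    return base

print(torch_style_version("2.0.0", 200))  # 2.0.0.post200
print(torch_style_version("2.0.0", 0))    # 2.0.0
```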

@Tobias-Fischer
Contributor Author

I don't think it's a big deal, is it? If you really wanted to, we could change https://github.com/Tobias-Fischer/pytorch-cpu-feedstock/blob/5f00d8033d7cc23eb9831c74dc038fe3e4047562/recipe/build_pytorch.sh#LL56 to 0 instead.

@hmaarrfk
Contributor

hmaarrfk commented Apr 4, 2023

I think it is ok as is.
Ah, thank you. I searched pytorch's repo for that statement but forgot that maybe we just set it...

Eventually, I think somebody's version check will be broken, but we can deal with it later.

@Tobias-Fischer
Contributor Author

So far no one has complained, and we've had the same situation in past conda-forge builds:

>>> torch.__version__
'1.13.0.post200'

@hmaarrfk
Contributor

hmaarrfk commented Apr 4, 2023

Yeah. I agree. I think it's fine.

Just waiting for the builds at this stage.

@hmaarrfk
Contributor

hmaarrfk commented Apr 4, 2023

Linux CUDA log files

log_files.zip

@hmaarrfk hmaarrfk merged commit 1a3257e into conda-forge:main Apr 4, 2023
@h-vetinari
Member

I thought that 2.0 needed Triton as a backend, or is that optional...?

@ngam
Contributor

ngam commented Apr 4, 2023

The post.xxx has always been there. I believe it was due to building from a clone of the git repo...

Any comment on triton btw? Are we building with all new 2.x capabilities?

@hmaarrfk
Contributor

hmaarrfk commented Apr 4, 2023

Hm, sorry, I forgot to test for triton. Is there a "test" you would like me to try on a GPU?

@ngam
Contributor

ngam commented Apr 4, 2023

Not sure, I haven't been using GPUs for a few months now, but I can test when I get a chance in a few weeks. Let's keep this on our radar for 2.0.1 (unless someone complains before then).

Note "torchtriton" in the uploads from the PyTorch channel:

linux-64/pytorch-2.0.0-py3.8_cuda11.8_cudnn8.7.0_0.tar.bz2

No Description

Uploaded Fri Mar 10 00:57:19 2023
md5 checksum fc92239ea8aa4ba12cd8305a1505f78a
arch x86_64
build py3.8_cuda11.8_cudnn8.7.0_0
constrains cpuonly <0
depends blas * mkl, filelock, jinja2, mkl >=2018, networkx, python >=3.8,<3.9.0a0, pytorch-cuda >=11.8,<11.9, pytorch-mutex 1.0 cuda, sympy, torchtriton 2.0.0, typing_extensions
has_prefix True
license BSD 3-Clause
license_family BSD
machine x86_64
operatingsystem linux
platform linux
subdir linux-64
target-triplet x86_64-any-linux
timestamp 1678406617283

@hmaarrfk
Contributor

hmaarrfk commented Apr 4, 2023

Linux compiling just takes "time" but it is easy to start, so if somebody wants to add triton support, I can rebuild linux.

@Tobias-Fischer
Contributor Author

  • I am a bit confused about what depends on what. As far as I can see, pytorch depends on torchtriton, which in turn seems to depend on pytorch.
  • Is there any difference between torchtriton and triton?
  • We already package an old version of triton in conda-forge and have a PR open for version 2.0.0 (triton v2.0.0 triton-feedstock#2). Would this 2.0.0 version be suitable as dependency for here?

@Tobias-Fischer
Contributor Author

Another issue seems to be that I can't install a recent version of torchvision and pytorch=2 side-by-side; I managed to kill mamba ;)

> mamba create -n pytorch21 pytorch=2 torchvision=0.14
python: /home/conda/feedstock_root/build_artifacts/mamba-split_1680002410624/work/libmamba/src/core/satisfiability_error.cpp:1767: mamba::{anonymous}::TreeExplainer::node_t mamba::{anonymous}::TreeExplainer::concat_nodes(const std::vector<long unsigned int>&): Assertion `std::all_of( ids.begin(), ids.end(), [&](auto id) { return m_pbs.graph().node(ids.front()).index() == m_pbs.graph().node(id).index(); } )' failed.

If I don't specify the torchvision version, it pulls in a very old torchvision which is definitely not compatible...

@hmaarrfk
Contributor

hmaarrfk commented Apr 4, 2023

pytorch run-exports itself, so torchvision has to be rebuilt against the new version.
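For context, run exports are declared in a recipe's meta.yaml; a hedged sketch of the mechanism (illustrative pin, not necessarily the feedstock's actual recipe):

```yaml
# Illustrative run_exports stanza for a conda recipe: packages built
# against this one automatically gain a matching runtime pin, which is
# why a torchvision built against pytorch 1.x cannot co-install with 2.0.
build:
  run_exports:
    - {{ pin_subpackage('pytorch', max_pin='x.x') }}
```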

@ngam
Contributor

ngam commented Apr 5, 2023

Let's move the discussion to #166 to better keep track. The issue with torchvision should be resolved with the migrator or with a manual rebuild.
