[ci][up] python 3.11 for ray-ml (ray-project#44075)
Upgrade dependencies so that the ray-ml image can be built for Python 3.11, and fix the related test issues.

Signed-off-by: can <[email protected]>
can-anyscale authored Apr 5, 2024
1 parent 835e126 commit ddc0064
Showing 20 changed files with 368 additions and 364 deletions.
2 changes: 2 additions & 0 deletions .buildkite/_forge.rayci.yml
@@ -56,6 +56,7 @@ steps:
  python:
  - "3.9"
  - "3.10"
+ - "3.11"
  cuda:
  - "11.8.0"
  env:
@@ -72,5 +73,6 @@ steps:
  matrix:
  - "3.9"
  - "3.10"
+ - "3.11"
  env:
  PYTHON_VERSION: "{{matrix}}"
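As a rough illustration of what the matrix additions above amount to, the sketch below expands a list of matrix values into one environment per Python version by substituting the `{{matrix}}` placeholder. The helper `expand_matrix` is hypothetical and only mimics the templating idea; it is not how rayci or Buildkite implements matrix expansion.

```python
def expand_matrix(env_template, matrix_values):
    """Return one env dict per matrix value, replacing '{{matrix}}' in each value.

    Hypothetical helper for illustration only; not part of rayci or Buildkite.
    """
    return [
        {key: value.replace("{{matrix}}", item) for key, value in env_template.items()}
        for item in matrix_values
    ]


# Matrix after this commit: "3.11" is added alongside "3.9" and "3.10".
envs = expand_matrix({"PYTHON_VERSION": "{{matrix}}"}, ["3.9", "3.10", "3.11"])
for env in envs:
    print(env)
```

Under this reading, adding `"3.11"` to a matrix simply yields one more job with `PYTHON_VERSION` set to `3.11`.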
1 change: 1 addition & 0 deletions .buildkite/base.rayci.yml
@@ -56,6 +56,7 @@ steps:
  wanda: ci/docker/base.gpu.wanda.yaml
  matrix:
  - "3.10"
+ - "3.11"
  env:
  PYTHON: "{{matrix}}"

1 change: 1 addition & 0 deletions .buildkite/build.rayci.yml
@@ -85,3 +85,4 @@ steps:
  matrix:
  - "3.9"
  - "3.10"
+ - "3.11"
4 changes: 2 additions & 2 deletions .buildkite/others.rayci.yml
@@ -17,10 +17,10 @@ steps:
  - ./ci/ci.sh compile_pip_dependencies
  - cp -f ./python/requirements_compiled.txt /artifact-mount/
  soft_fail: true
- job_env: oss-ci-base_build
+ job_env: oss-ci-base_build-py3.11
  depends_on:
  - forge
- - oss-ci-base_build
+ - oss-ci-base_build-multipy

  # test
  - label: doc tests
3 changes: 1 addition & 2 deletions .buildkite/serve.rayci.yml
@@ -10,8 +10,7 @@ steps:
  - name: servebuild-multipy
  label: "wanda: servebuild-py{{matrix}}"
  wanda: ci/docker/serve.build.wanda.yaml
- matrix:
- - "3.11"
+ matrix: ["3.11"]
  env:
  PYTHON: "{{matrix}}"
  depends_on: oss-ci-base_build-multipy
4 changes: 0 additions & 4 deletions ci/docker/docgpu.build.Dockerfile
@@ -21,8 +21,4 @@ set -euo pipefail
  DOC_TESTING=1 TRAIN_TESTING=1 TUNE_TESTING=1 ci/env/install-dependencies.sh
  pip install -Ur ./python/requirements/ml/dl-gpu-requirements.txt

- # TODO(amogkam): Remove when https://github.com/ray-project/ray/issues/36011
- # is resolved.
- pip install -U transformers==4.34.1
-
  EOF
2 changes: 0 additions & 2 deletions ci/docker/ml.build.Dockerfile
@@ -21,8 +21,6 @@ RUN <<EOF

  set -euo pipefail

- # TODO (can): Move mosaicml to train-test-requirements.txt
- pip install "mosaicml==0.12.1"
  DOC_TESTING=1 TRAIN_TESTING=1 TUNE_TESTING=1 DATA_PROCESSING_TESTING=1 \
  INSTALL_HOROVOD=1 INSTALL_HDFS=1 \
  ./ci/env/install-dependencies.sh
12 changes: 3 additions & 9 deletions ci/docker/serve.build.Dockerfile
@@ -20,19 +20,13 @@ RUN <<EOF

  set -euo pipefail

- pip install -U torch==2.0.1 torchvision==0.15.2
- pip install -U tensorflow==2.13.1 tensorflow-probability==0.21.0
  pip install -U --ignore-installed \
  -c python/requirements_compiled.txt \
  -r python/requirements.txt \
  -r python/requirements/test-requirements.txt

- # doc requirements
- #
- # TODO (shrekris-anyscale): Remove transformers after core transformer requirement
- # is upgraded
- pip install transformers==4.30.2
- pip install -c python/requirements_compiled.txt aioboto3
+ pip install -U -c python/requirements_compiled.txt \
+ tensorflow tensorflow-probability torch torchvision \
+ transformers aioboto3

  git clone https://github.com/wg/wrk.git /tmp/wrk && pushd /tmp/wrk && make -j && sudo cp wrk /usr/local/bin && popd
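The rewritten install line drops the per-package pins and relies on the `-c` constraints file to choose versions. A constraints file only limits the version of a package that is being installed; it does not request any package by itself. The sketch below imitates that idea with a deliberately simplified parser. The sample text merely stands in for `python/requirements_compiled.txt`, whose contents are not shown in this diff; the three pins in it are taken from the requirements files changed in this commit, and the helper names are hypothetical.

```python
def parse_constraints(text):
    """Parse 'name==version' lines into a dict, skipping comments and blanks.

    Simplified on purpose: real constraints files may also carry markers or hashes.
    """
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip().lower()] = version.strip()
    return pins


def resolve(requested, pins):
    """Attach the constrained version to each requested name, if one exists."""
    return [f"{name}=={pins[name]}" if name in pins else name for name in requested]


# Assumed sample standing in for python/requirements_compiled.txt.
SAMPLE_CONSTRAINTS = """
# stand-in constraints (contents assumed for illustration)
tensorflow==2.15.1
tensorflow-probability==0.23.0
transformers==4.36.2
"""

pins = parse_constraints(SAMPLE_CONSTRAINTS)
resolved = resolve(["tensorflow", "transformers", "aioboto3"], pins)
print(resolved)
```

In this sketch a package with no entry in the constraints (here `aioboto3`) stays unpinned, while constrained packages pick up the compiled version.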

3 changes: 1 addition & 2 deletions ci/env/install-horovod.sh
@@ -2,5 +2,4 @@

  # This script installs horovod.

- # TODO: eventually pin this to master.
- HOROVOD_WITH_GLOO=1 HOROVOD_WITHOUT_MPI=1 HOROVOD_WITHOUT_MXNET=1 pip install --no-cache-dir -U git+https://github.com/horovod/horovod.git@0b19c5ce6c5c93e7ed3bbf680290f918b2a0bdbb
+ HOROVOD_WITH_GLOO=1 HOROVOD_WITHOUT_MPI=1 HOROVOD_WITHOUT_MXNET=1 pip install --no-cache-dir -U horovod==0.28.1
2 changes: 1 addition & 1 deletion python/ray/train/tests/test_torch_transformers_train.py
@@ -34,7 +34,7 @@ def ray_start_8_cpus():


  # We are only testing Causal Language Modeling here
- MODEL_NAME = "hf-internal-testing/tiny-random-gpt2"
+ MODEL_NAME = "hf-internal-testing/tiny-random-BloomForCausalLM"

  # Training Loop Configurations
  NUM_WORKERS = 2
4 changes: 2 additions & 2 deletions python/requirements/ml/core-requirements.txt
@@ -8,8 +8,8 @@ xgboost==1.7.6
  lightgbm==3.3.5

  # Huggingface
- transformers==4.19.1 # TODO(ml-team): This should be upgraded.
- accelerate==0.20.3
+ transformers==4.36.2
+ accelerate==0.28.0

  # DL libraries
  -r dl-cpu-requirements.txt
6 changes: 3 additions & 3 deletions python/requirements/ml/dl-cpu-requirements.txt
@@ -1,9 +1,9 @@
  # These requirements are used for the CI and CPU-only Docker images so we install CPU only versions of torch.
  # For GPU Docker images, you should install dl-gpu-requirements.txt afterwards.

- tensorflow==2.11.0; sys_platform != 'darwin' or platform_machine != 'arm64'
- tensorflow-macos==2.11.0; sys_platform == 'darwin' and platform_machine == 'arm64'
- tensorflow-probability==0.19.0
+ tensorflow==2.15.1; sys_platform != 'darwin' or platform_machine != 'arm64'
+ tensorflow-macos==2.15.1; sys_platform == 'darwin' and platform_machine == 'arm64'
+ tensorflow-probability==0.23.0
  tensorflow-io-gcs-filesystem==0.31.0
  tensorflow-datasets
  array-record==0.5.0; sys_platform != 'darwin' and platform_system != "Windows"
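The two tensorflow lines above use complementary environment markers, so exactly one of them applies on any given machine. The sketch below spells out that selection logic in plain Python. The function name `tensorflow_package` is hypothetical; the mapping of `sys_platform` to `sys.platform` and `platform_machine` to `platform.machine()` follows the environment-marker definitions in PEP 508.

```python
import platform
import sys


def tensorflow_package(sys_platform, platform_machine):
    """Mirror the two markers in the requirements file (illustrative only).

    tensorflow-macos applies when sys_platform == 'darwin' and
    platform_machine == 'arm64'; tensorflow applies in every other case.
    """
    if sys_platform == "darwin" and platform_machine == "arm64":
        return "tensorflow-macos"
    return "tensorflow"


print(tensorflow_package("linux", "x86_64"))
print(tensorflow_package("darwin", "arm64"))
print(tensorflow_package("darwin", "x86_64"))

# For the current interpreter, the marker variables come from these calls.
print(tensorflow_package(sys.platform, platform.machine()))
```

Only the version pin changes in this commit (2.11.0 to 2.15.1); the markers themselves are unchanged.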
6 changes: 3 additions & 3 deletions python/requirements/ml/dl-gpu-requirements.txt
@@ -1,8 +1,8 @@
  # If you make changes below this line, please also make the corresponding changes to `dl-cpu-requirements.txt`!

- tensorflow==2.11.0; sys_platform != 'darwin' or platform_machine != 'arm64'
- tensorflow-macos==2.11.0; sys_platform == 'darwin' and platform_machine == 'arm64'
- tensorflow-probability==0.19.0
+ tensorflow==2.15.1; sys_platform != 'darwin' or platform_machine != 'arm64'
+ tensorflow-macos==2.15.1; sys_platform == 'darwin' and platform_machine == 'arm64'
+ tensorflow-probability==0.23.0
  tensorflow-datasets

  --extra-index-url https://download.pytorch.org/whl/cu118 # for GPU versions of torch, torchvision
9 changes: 3 additions & 6 deletions python/requirements/ml/rllib-requirements.txt
@@ -3,12 +3,9 @@ higher==0.2.1
  # For auto-generating an env-rendering Window.
  pyglet==1.5.15
  imageio-ffmpeg==0.4.5
- # ONNX
- # ONNX 1.13.0 depends on protobuf > 3.20, conflicting with tensorflow.
- # ONNX 1.12.0 is not published for mac arm64, so we exclude it for now.
- onnx==1.12.0; sys_platform != 'darwin' or platform_machine != 'arm64'
- onnxruntime==1.14.1; sys_platform != 'darwin' or platform_machine != 'arm64'
- tf2onnx==1.13.0; sys_platform != 'darwin' or platform_machine != 'arm64'
+ onnx==1.15.0; sys_platform != 'darwin' or platform_machine != 'arm64'
+ onnxruntime==1.16.3; sys_platform != 'darwin' or platform_machine != 'arm64'
+ tf2onnx==1.15.1; sys_platform != 'darwin' or platform_machine != 'arm64'
  rich==12.6.0
  # Msgpack checkpoint stuff.
  msgpack
6 changes: 3 additions & 3 deletions python/requirements/ml/rllib-test-requirements.txt
@@ -28,16 +28,16 @@ kaggle_environments==1.7.11
  mlagents_envs==0.28.0

  # For tests on minigrid.
- minigrid==2.1.1
+ minigrid
  # For tests on RecSim and Kaggle envs.
  # Explicitly depends on `tensorflow` and doesn't accept `tensorflow-macos`
  recsim==0.2.4; (sys_platform != 'darwin' or platform_machine != 'arm64')
  # recsim depends on dopamine-rl, but dopamine-rl pins gym <= 0.25.2, which break some envs
  dopamine-rl==4.0.5; (sys_platform != 'darwin' or platform_machine != 'arm64')
  tensorflow_estimator
  # DeepMind's OpenSpiel
- open-spiel==1.2
+ open-spiel==1.4

  # Requires libtorrent which is unavailable for arm64
  autorom[accept-rom-license]; platform_machine != "arm64"
- h5py==3.7.0
+ h5py==3.10.0
1 change: 1 addition & 0 deletions python/requirements/ml/train-test-requirements.txt
@@ -1,2 +1,3 @@
  evaluate==0.4.0
+ mosaicml
  sentencepiece==0.1.96
4 changes: 2 additions & 2 deletions python/requirements/ml/tune-test-requirements.txt
@@ -2,13 +2,13 @@

  aim==3.17.5

- gpy==1.10.0; python_version <= '3.9'
+ gpy==1.13.1

  jupyterlab==3.6.1
  matplotlib!=3.4.3

  pytest-remotedata==0.3.2
- pytorch-lightning
+ pytorch-lightning==1.8.6
  fairscale==0.4.6
  shortuuid==1.0.1
  timm==0.9.2
4 changes: 3 additions & 1 deletion python/requirements/test-requirements.txt
@@ -60,7 +60,7 @@ pytest-timeout==2.1.0
  pytest-virtualenv==1.7.0
  pytest-sphinx @ git+https://github.com/ray-project/pytest-sphinx
  redis==4.4.2
- scikit-learn==1.0.2; python_version < '3.11'
+ scikit-learn==1.3.2
  smart_open[s3]==6.2.0
  tqdm==4.64.1
  trustme==0.9.0
@@ -91,6 +91,8 @@ importlib-metadata==6.11.0
  # Some packages have downstream dependencies that we have to specify here to resolve conflicts.
  # Feel free to add (or remove!) packages here liberally.
  tensorboardX
+ tensorboard
+ tensorboard-data-server==0.7.2
  h11==0.12.0
  markdown-it-py==1.1.0
  attrs==21.4.0
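The old scikit-learn line carried the marker `python_version < '3.11'`, so the pin applied on Python 3.9 and 3.10 but was skipped on 3.11. The sketch below shows that evaluation under one simplifying assumption: versions are plain `major.minor` strings compared numerically, which is how marker version comparison behaves for such values. The helper names are hypothetical.

```python
def version_tuple(version):
    """Turn 'major.minor' into a tuple of ints so comparison is numeric."""
    return tuple(int(part) for part in version.split("."))


def old_pin_applies(python_version):
    """Evaluate the removed marker: python_version < '3.11'."""
    return version_tuple(python_version) < version_tuple("3.11")


results = {v: old_pin_applies(v) for v in ["3.9", "3.10", "3.11"]}
print(results)

# A naive string comparison would misorder these versions,
# because the character "9" sorts after "1".
print("3.9" < "3.11")
```

With the marker removed and the pin raised to 1.3.2, the same scikit-learn requirement applies to every Python version in the matrix.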