[Housekeeping] Make flytekit lighter weight/less opinionated about dependencies #4418

Open
2 tasks done
cosmicBboy opened this issue Nov 13, 2023 · 3 comments
Labels
backlogged For internal use. Reserved for contributor team workflow. housekeeping Issues that help maintain flyte and keep it tech-debt free

Comments

@cosmicBboy
Contributor

cosmicBboy commented Nov 13, 2023

Describe the issue

Currently, flytekit has a bunch of dependencies, many of which are pinned to specific versions or have restrictive constraints: https://github.com/flyteorg/flytekit/blob/38c76876dfe7fc2c62536ca6a195bce8a56c6270/setup.py#L30-L80

This makes flytekit painful to install, especially into existing projects that may have conflicting constraints on shared dependencies.

What if we do not do this?

Users will continue to experience issues with conflicting version pins.

Related component(s)

No response

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes
@cosmicBboy cosmicBboy added housekeeping Issues that help maintain flyte and keep it tech-debt free backlogged For internal use. Reserved for contributor team workflow. labels Nov 13, 2023
@thomasjpfan
Member

thomasjpfan commented Nov 15, 2023

I went through all the dependencies and categorized:

Required Dependencies (for now)

  • click
  • cloudpickle
  • croniter
  • dataclasses-json
  • docker
  • flyteidl
  • googleapis-common-protos
  • grpc
  • grpcio-status
  • importlib-metadata (Can get rid of after Python >= 3.10)
  • jsonpickle
  • keyring
  • kubernetes
  • marshmallow-enum
  • marshmallow-jsonschema
  • mashumaro
  • protobuf
  • pyarrow
  • pytz
  • pyyaml
  • requests
  • rich
  • rich_click
  • statsd
  • typing_extensions

Other dependencies

@pingsutw
Member

pingsutw commented Nov 16, 2023

gcsfs and s3fs could be moved into extras, like flytekit[s3] or flytekit[gcs].
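
As a rough illustration of that suggestion, here is a minimal sketch of how the extras could be declared in a setuptools-based `setup.py`. The extra names `s3` and `gcs` come from the comment above; the `all` extra, the shortened core list, and the unpinned requirement strings are assumptions for illustration, not flytekit's actual packaging metadata.

```python
# Hypothetical sketch of optional dependency groups ("extras") for setup.py.
# Only the extra names "s3" and "gcs" come from the discussion; the rest is
# illustrative and does not reflect flytekit's real setup.py.

# Shortened, illustrative subset of the required dependencies.
CORE_REQUIRES = ["click", "cloudpickle", "flyteidl"]

# Each extra maps to the packages that only that storage backend needs.
EXTRAS_REQUIRE = {
    "s3": ["s3fs"],
    "gcs": ["gcsfs"],
}

# Assumed convenience extra that installs every optional backend at once.
EXTRAS_REQUIRE["all"] = sorted(
    {req for reqs in EXTRAS_REQUIRE.values() for req in reqs}
)

# In setup.py these would be passed to setuptools, for example:
#   setup(..., install_requires=CORE_REQUIRES, extras_require=EXTRAS_REQUIRE)
```

With a layout like this, `pip install "flytekit[s3]"` would pull in s3fs on top of the core requirements, while a plain `pip install flytekit` would skip both storage backends.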

@thomasjpfan
Member

thomasjpfan commented Feb 9, 2024

On the size of all the dependencies (including indirect ones), here are the wheel sizes for dependencies larger than 1M:

 23M	pyarrow-15.0.0-cp311-cp311-macosx_11_0_arm64.whl
 13M	numpy-1.26.4-cp311-cp311-macosx_11_0_arm64.whl
 11M	botocore-1.31.17-py3-none-any.whl
9.2M	grpcio-1.60.1-cp311-cp311-macosx_10_10_universal2.whl
5.6M	cryptography-42.0.2-cp39-abi3-macosx_10_12_universal2.whl
1.5M	kubernetes-29.0.0-py2.py3-none-any.whl
1.1M	pygments-2.17.2-py3-none-any.whl
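
For a rough sense of scale, the sizes listed above can be tallied. This small sketch only covers the seven wheels in the list (macOS builds for Python 3.11), so it ignores every dependency under 1M and any platform differences; the dictionary and helper names are made up for illustration.

```python
# Wheel sizes in megabytes, copied from the list above.
WHEEL_SIZES_MB = {
    "pyarrow": 23.0,
    "numpy": 13.0,
    "botocore": 11.0,
    "grpcio": 9.2,
    "cryptography": 5.6,
    "kubernetes": 1.5,
    "pygments": 1.1,
}

# Combined size of the listed wheels only.
total_mb = sum(WHEEL_SIZES_MB.values())


def share(names):
    """Return the fraction of the listed total taken up by the given packages."""
    return sum(WHEEL_SIZES_MB[name] for name in names) / total_mb


print(f"listed wheels total: {total_mb:.1f}M")                      # 64.4M
print(f"pyarrow + numpy: {share(['pyarrow', 'numpy']):.0%}")        # 56%
print(f"botocore + cryptography: {share(['botocore', 'cryptography']):.0%}")  # 26%
```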

With flyteorg/flytekit#1818, I suspect we can make pyarrow an optional dependency. Making numpy optional should be doable as well.
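
One common way to make a package like pyarrow optional is to import it only at the point of use and fail with an actionable message when it is missing. The sketch below uses only the standard library; `is_installed`, `require_optional`, and the extra name in the error message are hypothetical helpers for illustration, not flytekit's API and not necessarily what flyteorg/flytekit#1818 does.

```python
import importlib
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


def require_optional(module_name: str, extra: str):
    """Import an optional dependency on demand, or explain how to install it.

    `extra` is the (assumed) name of the extras group that provides the module.
    """
    if not is_installed(module_name):
        raise ImportError(
            f"'{module_name}' is required for this feature. "
            f"Install it with: pip install 'flytekit[{extra}]'"
        )
    return importlib.import_module(module_name)


def to_arrow_table(records):
    """Example call site: pyarrow is only imported when this function runs."""
    pa = require_optional("pyarrow", extra="pyarrow")
    return pa.Table.from_pylist(records)
```

Code paths that never touch Arrow data would then work without pyarrow installed, and users who do need it get a clear install hint instead of a bare `ModuleNotFoundError`.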

After that, there is botocore, which is required for AWS, and cryptography, which is required for Azure. To make progress on those, we'll need to go with #4418 (comment). I do not think AWS users should be required to install dependencies that only Azure needs.
