Python version incorrect by default. Add checks and add docs. #1312

Closed
robertlugg opened this issue Feb 21, 2020 · 6 comments

Comments

@robertlugg
Contributor

This is a real ease-of-use killer, especially for new users (and those of us who don't install new versions often). Consider this page:
https://www.tensorflow.org/tfx/tutorials/tfx/components#export_to_pipeline
It makes no mention that you need a particular version of Python. If I use Anaconda and try to install using this YAML file:

name: tfx-0.21
dependencies:
    - pip
    - pip:
        - tensorflow==2.1
        - tfx==0.21.0
        - tensorboard==2.1

This fails because no 2.1 release can be found that works with Python 3.8 (today's default).

My requests:

  1. If at all possible, build the dependency requirements into the pip and conda packages. They should know that if you use TensorFlow/TFX you must use Python N.N -> M.M.

  2. Audit and improve your documents.

     a. In the link above, there is this line:
        pip install tfx==0.21.0 tensorflow==2.1 tensorboard==2.1
        but there is no reference to the version of Python to use. I think that should be fixed.

     b. You have a nice table of dependencies at:
        https://github.com/tensorflow/tfx#compatible-versions
        but it doesn't include Python, the most important one. Add a column for Python.

@gowthamkpr gowthamkpr self-assigned this Feb 21, 2020
@gowthamkpr

@robertlugg Thank you for pointing out this issue.

@chongkong
Contributor

Thanks for your help on perfecting our user experience. We are definitely missing the mention of python version in our docs. We're currently supporting up to python 3.7 as our dependencies don't support python 3.8+ yet. (This includes tensorflow/tensorflow#33374 and tfx-bsl package.)

That said, I wouldn't call Python 3.8 today's default, for two reasons: Python doesn't deprecate the previous minor version (without a significant reason), and PyPI stats indicate that Python 3.6 and 3.7 are the dominant minor versions (10x more than 3.8 as of Feb 2020). This does NOT mean TFX won't support Python 3.8 any time soon. We're tracking it internally and trying to make it available; one prerequisite is for TensorFlow to support it.

Moving on to the notebook check: I hope we can have a better mechanism for specifying which Python version to use, as well as which packages to install in the notebook environment, but our best effort so far has been to install packages using !pip install magic commands. If we were to check the Python version as well, I can think of two options: raise an error if sys.version_info >= (3, 8), or use a compound version specifier such as tfx==0.21, python<3.8. However, I don't see a clear benefit: either way we raise an error when the Python version is incompatible, which is the same outcome as not specifying the Python version and letting the pip installation phase fail. This temporary workaround would also be short-lived code that requires frequent updates, and we don't have a good mechanism to update it automatically, so it would only add complexity.
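The first option mentioned above, raising an error when sys.version_info >= (3, 8), might look roughly like the sketch below if placed in an early notebook cell. The helper name `ensure_supported_python` and the message wording are hypothetical and not part of TFX; the 3.8 cutoff is the one stated in this thread for TFX 0.21.

```python
import sys

# Cutoff taken from this thread: TFX 0.21 dependencies did not yet support 3.8+.
FIRST_UNSUPPORTED = (3, 8)


def ensure_supported_python(version_info=None, first_unsupported=FIRST_UNSUPPORTED):
    """Raise RuntimeError if the interpreter is at or above the cutoff.

    Hypothetical helper for illustration only; it is not part of TFX.
    Returns the (major, minor) tuple when the version is acceptable.
    """
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor) so that 3.8.0 and 3.8.1 are treated alike.
    current = tuple(version_info[:2])
    if current >= first_unsupported:
        raise RuntimeError(
            "Python {}.{} is not supported here; use a version below {}.{}.".format(
                current[0], current[1], first_unsupported[0], first_unsupported[1]
            )
        )
    return current
```

Calling `ensure_supported_python()` before the install cell would surface a readable message instead of a resolver failure, at the cost of the maintenance burden described in the comment.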

As for the version compatibility table, I think its purpose is to illustrate version compatibility between the TFX family libraries (which is not visible through pip dependencies), so adding the Python version would not be very appropriate. Still, since I found no other place to mention Python compatibility, I'll add a column. FYI, the single source of truth for version compatibility, including Python version compatibility, is the PyPI package metadata. Best practice would be to install tfx through a proper version resolver (unfortunately pip isn't one yet, and due to a past bug in tensorflow you can't even use other well-defined package resolvers such as poetry), so that you don't have to worry about version compatibility. Or at least, you'll be notified when your installation fails.
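For a package that is already installed, the Python constraint recorded in its metadata can be read back with the standard library. This is a minimal sketch, assuming Python 3.8 or newer (where `importlib.metadata` is available); the function name `declared_python_requirement` is hypothetical.

```python
from importlib import metadata


def declared_python_requirement(dist_name):
    """Return the Requires-Python string of an installed distribution.

    Returns None if the distribution is not installed or if it declares
    no Python constraint. Hypothetical helper for illustration only.
    """
    try:
        meta = metadata.metadata(dist_name)
    except metadata.PackageNotFoundError:
        return None
    # "Requires-Python" is the core metadata field that installers consult.
    return meta.get("Requires-Python")
```

For example, `declared_python_requirement("tfx")` would return whatever constraint the installed tfx release declares, or None if tfx is not installed in the current environment.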

TL;DR, I'll update the README.md.

@chongkong
Contributor

It seems we already have a PyPI badge for Python version support in the README. I'm closing the PR. Sorry for the confusion.
image

@robertlugg
Contributor Author

@chongkong:

I don't mean to be too critical. I know you are at 0.21. But as TFX matures I think more and more time needs to be spent on your user documentation.

Python default version

By default, I mean the one I get when I don't specify a version. The default is 3.8.1

Please consider this on the python website:
Image 2733

Also, when I install python using anaconda, I get 3.8.1 by default.

Mention the requirement.

Please consider my request regarding the tutorial in this issue description.

I agree that somewhere in your docs you do mention 3.7. In fact, there are a couple of places. The problem is that a user doesn't go through the entire site or GitHub. They might, for instance, jump to the getting started page, which makes no mention of the Python version. conda error messages are really poor, so there is little hope of determining the problem.

Run the tutorial

Would someone run your tutorial:
https://www.tensorflow.org/tfx/tutorials/tfx/components#export_to_pipeline

Be a new user who doesn't know where to look for everything. Start from that tutorial and work your way through it. In fact, it doesn't run! The problem again is with versioning, this time not with Python but with TensorFlow. The only way I could get it to run through is with the following:

name: tfx-0.21
dependencies:
    - python==3.6.8
    - pip
    - pip:
        - google-cloud-bigquery==1.24.0
        - google-cloud-storage==1.26.0
        - tensorflow-data-validation==0.21.0
        - tensorflow-model-analysis==0.21.1
        - tensorflow-metadata==0.21.0
        - tensorflow-transform==0.21.0
        - ml-metadata==0.21.0
        - apache-beam==2.17.0
        - pyarrow==0.15.0
        - tfx-bsl==0.21.0
        - tfx==0.21.0
        - tensorflow==2.1
        - tensorboard==2.1
        - matplotlib==3.1.3
        - ipympl==0.4.1
        - pillow
        - pylint
    - requests

(Please disregard some of the dependencies towards the bottom). For some reason I needed to download google-cloud packages even though I'm running locally!

For the reader of that tutorial, they would have no idea what is going on. I generated the above by looking at the table in the readme.md and through trial and error.

My suggestion is that you start from a clean virtualenv or conda environment and follow that tutorial explicitly. I think you will find it impossible to run through the tutorial (PS: confirm that all the jupyter tables and charts display correctly)

I'm not looking for a work-around

I am already able to get it running, but I spent maybe half a day doing so. I'm just writing this issue because I think it's important to improve the first-time user experience.

No github

Your documentation is being published at: https://www.tensorflow.org/tfx As someone who only wishes to run the tutorial, I should not need to go to github at all.

@gowthamkpr gowthamkpr self-assigned this Aug 9, 2022
@gowthamkpr gowthamkpr assigned gowthamkpr and 1025KB and unassigned gowthamkpr and chongkong Aug 19, 2022
@singhniraj08
Contributor

@robertlugg,

The official TFX documentation states that TFX requires Python 3. For more detail, the PyPI page for TFX lists the Python version requirement: Requires: Python >=3.7, <3.10.
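A constraint such as ">=3.7, <3.10" has to be compared numerically, component by component, because 3.10 is newer than 3.9 even though it sorts lower as a string or a float. The sketch below is a deliberately simplified checker for plain dotted versions; real installers follow PEP 440, which covers many more cases, and the names `satisfies`, `_parse` and `_pad` are hypothetical.

```python
def _parse(version_text):
    """Turn '3.10' into (3, 10). Only plain dotted integers are handled."""
    return tuple(int(part) for part in version_text.strip().split("."))


def _pad(a, b):
    """Right-pad the shorter tuple with zeros so (3, 7) equals (3, 7, 0)."""
    width = max(len(a), len(b))
    return a + (0,) * (width - len(a)), b + (0,) * (width - len(b))


# Two-character operators come first so ">=" is not mistaken for ">".
_OPERATORS = (
    (">=", lambda v, bound: v >= bound),
    ("<=", lambda v, bound: v <= bound),
    ("==", lambda v, bound: v == bound),
    ("!=", lambda v, bound: v != bound),
    (">", lambda v, bound: v > bound),
    ("<", lambda v, bound: v < bound),
)


def satisfies(version, requirement):
    """Check a tuple of ints, e.g. (3, 9, 7), against a comma-separated constraint.

    Simplified sketch: no pre-releases, wildcards, or '~=' support.
    """
    for clause in requirement.split(","):
        clause = clause.strip()
        if not clause:
            continue
        for symbol, compare in _OPERATORS:
            if clause.startswith(symbol):
                bound = _parse(clause[len(symbol):])
                v, b = _pad(tuple(version), bound)
                if not compare(v, b):
                    return False
                break
        else:
            raise ValueError("unsupported clause: " + clause)
    return True
```

To check the running interpreter, pass `sys.version_info[:3]` as the version, since the full `sys.version_info` also carries non-numeric fields.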

Hope this helps. Thank you!

@singhniraj08
Contributor

Closing this due to inactivity. Please take a look at the answers provided above, and feel free to reopen and post your comments if you still have queries on this. Thank you!
