Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use multiple outputs for dask & distributed #172

Draft
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

jakirkham
Copy link
Member

Combines dask-core & distributed into a single recipe with dask by using multiple outputs to produce the different packages.


Checklist

  • Used a personal fork of the feedstock to propose changes
  • Bumped the build number (if the version is unchanged)
  • Reset the build number to 0 (if the version changed)
  • Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
  • Ensured the license file is being packaged.

@conda-forge-linter

This comment was marked as outdated.

@conda-forge-linter

This comment was marked as outdated.

@conda-forge-linter
Copy link

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@jakirkham jakirkham force-pushed the use_split_pkgs branch 6 times, most recently from c208d2b to 2148e84 Compare February 26, 2022 00:23
@conda-forge-linter
Copy link

Hi! This is the friendly automated conda-forge-linting service.

I wanted to let you know that I linted all conda-recipes in your PR (recipe) and found some lint.

Here's what I've got...

For recipe:

  • The outputs section contained an unexpected subsection name. entry_points is not a valid subsection name.

@jakirkham jakirkham force-pushed the use_split_pkgs branch 2 times, most recently from 5558bf6 to 38391f6 Compare February 26, 2022 00:32
@jakirkham
Copy link
Member Author

jakirkham commented Feb 26, 2022

  • The outputs section contained an unexpected subsection name. entry_points is not a valid subsection name.

Seems to be covered by conda-build's tests. So this lint doesn't seem right

Raised as issue ( conda-forge/conda-smithy#1598 )

Copy link
Member Author

@jakirkham jakirkham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have done some work to combine dask-core & distributed into this recipe so we can update just one feedstock based on our recent conversation here ( conda-forge/dask-core-feedstock#59 (comment) ). Highlighted some differences that come out of this approach below

Comment on lines +38 to +44
run_constrained:
- cytoolz >=0.8.2
- numpy >=1.18
- pandas >=1.0
- distributed {{ version }}
- bokeh >=2.1.1
- jinja2
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These come from the extra requirements in setup.py for dask (with exception for cytoolz, which is added to line up with toolz)

build:
noarch: python
number: {{ number }}
script: build-dask-core.sh
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are no longer able to list commands in the recipe directly with outputs. So move the install steps to this script.

Comment on lines +67 to +70
entry_points:
- dask-scheduler = distributed.cli.dask_scheduler:go
- dask-ssh = distributed.cli.dask_ssh:go
- dask-worker = distributed.cli.dask_worker:go
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As noted above the linter does not like this, but as seen from CI this builds without issue. See upstream issue for resolution of lint

recipe/meta.yaml Outdated Show resolved Hide resolved
#!/bin/bash

pushd "${SRC_DIR}/dask"
$PYTHON -m pip install . --no-deps --ignore-installed -vv
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pip was trying to install things when it shouldn't have. So needed to add --no-deps to bypass this. pip was still trying to do some things so used --ignore-installed to clamp down on the remaining issues.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really thought we had centrallized these issues. However, I alwyas see pip changing how they work with their flags.

I had to recently add --no-build-isolation as well. I think :/

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be an issue related to conda/conda-build#4292

Copy link
Member Author

@jakirkham jakirkham Mar 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that does look like it, Jaime. Have seen this in other split recipes as well.

recipe/meta.yaml Outdated Show resolved Hide resolved
recipe/meta.yaml Outdated Show resolved Hide resolved
Comment on lines 164 to 171
about:
home: https://dask.org/
summary: Parallel PyData with Task Scheduling
license: BSD-3-Clause
license_file: LICENSE.txt
license_file: dask/LICENSE.txt
description: |
Dask is a flexible parallel computing library for analytics.
doc_url: https://dask.org/
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still need the metadata in the global scope. So we leave that here even though it is a bit duplicative.

Comment on lines 176 to 188
- alimanfoo
- jakirkham
- jcrist
- jrbourbeau
- koverholt
- marcelotrevisani
- martindurant
- mrocklin
- ogrisel
- pitrou
- shoyer
- sinhrks
- tomaugspurger
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merged all the packages' maintainers lists together into one sorted list

@@ -0,0 +1,5 @@
#!/bin/bash

pushd "${SRC_DIR}/dask"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cd into the particular source directory for the install

@jakirkham
Copy link
Member Author

@jrbourbeau @jsignell would be good to get your thoughts on this 🙂

Copy link
Member

@jsignell jsignell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this @jakirkham! It looks pretty good to me. I just have some questions.

recipe/meta.yaml Outdated Show resolved Hide resolved
recipe/meta.yaml Outdated


package:
name: dask
name: dask-split
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the package need a name and if it does can the name just be dask?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah unfortunately. It also can't duplicate one of the existing subpackages (have struggled with trying to do that before). Typically we just add -split here, but we could add anything as long as it doesn't duplicate other packages

That said, agree this is not the most ergonomic thing

README.md Outdated Show resolved Hide resolved
README.md Outdated

Installing `dask` from the `conda-forge` channel can be achieved by adding `conda-forge` to your channels with:
Installing `dask-split` from the `conda-forge` channel can be achieved by adding `conda-forge` to your channels with:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if we need a name for this recipe of outputs, I don't think dask-split should be referenced anywhere.

Copy link
Member Author

@jakirkham jakirkham Feb 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this text is autogenerated. Not too much we can do here.

Edit: Raised issue ( conda-forge/conda-smithy#1599 )


```
conda install dask
conda install dask dask-core distributed
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these should all just say dask since dask installs them all.

Copy link
Member Author

@jakirkham jakirkham Feb 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also autogenerated. So not much we can do here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah ok I was kind of worried that was the case. I guess it's not too bad becuase it would be a pretty strange user who encountered this as readme as their install instructions. It a little bit seems like this is not the place for install instructions at all.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well it seems reasonable to provide some kind of install instructions on the feedstock given this is where the recipe is maintained and packages published.

While this particular piece is duplicative in our case, it is not always so (there is not always a kitchen sink package to install) and it should still work (despite being a bit verbose).

The portion above with installing dask-split is a bit worse IMHO as that is actually providing inaccurate information that seems unlike to ever be correct.

That all being said, if you would like to raise a conda-smithy issue on how to improve this with some suggestions, would be supportive of that. Whenever conda-smithy has the intended improvement, it should be a simple re-render to replace what is currently here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm commenting on the issue, but it looks like there might already be a way to specify a different package_name

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I tried to use package_name at the time, but it didn't seem to solve the issue.

…nda-forge-pinning 2022.02.28.13.05.22

Now that `dask-core`'s `about/home` points to the webpage, re-render to
fix the README.
Should fix some README templating.
@jakirkham
Copy link
Member Author

Thanks Julia 🙂 Pushed some fixes based on feedback above

@jsignell
Copy link
Member

It looks like there are some unmerged suggestions. Should we merge them in and give this a shot on friday?

@jakirkham
Copy link
Member Author

Sorry not following. What unmerged suggestions?

@jsignell
Copy link
Member

Sorry not following. What unmerged suggestions?

I commented on them, so hopefully that will help make them visible. I realized though that we probably can't merge this until the linter failure is fixed upstream.

@conda-forge-linter
Copy link

Hi! This is the friendly automated conda-forge-linting service.

I was trying to look for recipes to lint for you, but it appears we have a merge conflict.
Please try to merge or rebase with the base branch to resolve this conflict.

Please ping the 'conda-forge/core' team (using the @ notation in a comment) if you believe this is a bug.

@conda-forge-linter
Copy link

Hi! This is the friendly automated conda-forge-linting service.

I wanted to let you know that I linted all conda-recipes in your PR (recipe) and found some lint.

Here's what I've got...

For recipe:

  • The outputs section contained an unexpected subsection name. entry_points is not a valid subsection name.

@jakirkham
Copy link
Member Author

I commented on them, so hopefully that will help make them visible.

Thanks! Included those. Also fixed merge conflicts.

I realized though that we probably can't merge this until the linter failure is fixed upstream.

We can. We just have to accept a linter failure in the near term.

Am able to build locally with this change, but maybe there are other
tooling issues we are missing by doing this.
name: dask-split
name: dask
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any issues having a split package with the same name used here and in one of the outputs? In the past had issues with building, being converted by staged-recipes, etc. ( for example conda-forge/staged-recipes#7386 (comment) ), but am wondering how much of an issue this still is. For example this builds fine locally and was able to re-render. Thoughts? 🙂

cc @conda-forge/core

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just find that it is an edge case. So eventually, are you going to be waiting for it to fail because somebody has a bug in their code somewhere. Do you really want to play with that risk?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming the top-level package something like *-split has some effects that are undesirable ( #172 (review) ).

What else is the top-level package used for?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a pretty weak opinion on this front.

I do think that the autogenerated README should not be used as the end all documentation.

I do also think it is easier to wolve nits like this in conda smithy.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same was mainly hoping to address the remaining reviewer comments. If we are wary of this, happy to go back to a different name here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see this comment conda-forge/conda-smithy#1599 (comment). I'd appreciate it if you could try it before asking for more help.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants