Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make license file comment a hint #1102

Closed
wants to merge 4 commits into from

Conversation

jakirkham
Copy link
Member

This relaxes the license file check to a hint.

xref: #1088

Checklist

  • Added a news entry

This relaxes the license file check to a hint.
@scopatz
Copy link
Member

scopatz commented Jul 2, 2019

I actually think it should be a lint and not a hint. If there is a license, then that license requires it to be present.

The cases where the project claims it is licensed in a certain way, but doesn't provide a license file anywhere, really just means that the project is unlicensed. I think whether are comfortable packaging unlicensed packages is a separate issue from lint vs hint.

@jakirkham
Copy link
Member Author

I think whether ere comfortable packaging unlicensed packages is a separate issue from lint vs hint.

I would disagree. This is the essence of lint vs. hint in my mind. Adding a lint means the recipe is broken. Adding a hint means we have added a warning. In the latter case, there is still a problem, but not one that blocks things from moving forward.

We could have a more nuanced discussion about the degrees to which a recipe is wrong and what options we provide to bypass different types of complaints. Though this is a separate discussion.

@scopatz
Copy link
Member

scopatz commented Jul 2, 2019

Right, I guess I consider a package that claims one license but provides another license or no license a (legally) broken package, and thus a broken recipe.

@jakirkham
Copy link
Member Author

@conda-forge/core, other thoughts?

@isuruf
Copy link
Member

isuruf commented Jul 2, 2019

-1 on this. If you'd like to, we can ask NumFOCUS legal.

@jakirkham
Copy link
Member Author

I'm only softening the change that was here. It still encourages the license file be added (much as before). It just is a bit less forceful when requesting.

@dhirschfeld
Copy link
Member

If a package is effectively unlicensed can you not just add a LICENSE file saying Unlicensed.

That way a license file is always required but you're also telling the user of the package that it is unlicensed and it is then up to them to decide what to do about that.

@jakirkham
Copy link
Member Author

FWIW I've generated a list of all feedstocks that lack license_file currently. Admittedly I have not done any sophisticated metadata parsing to determine whether license_file could be skipped. In all I found 1321 feedstocks lacked license_file. So ~20% of the feedstocks in conda-forge.

It's also worth noting that we likely use packages from defaults (certainly true of CDTs) that lack license files.

I think it is going to be very difficult for us to enforce every feedstock (or package used) has a license file. That said, having a warning (not an error) would be a good first step in that direction. To be clear I don't disagree with the direction of the original PR, only the rate at which it moves in that direction.

@scopatz
Copy link
Member

scopatz commented Jul 3, 2019

I think it is going to be very difficult for us to enforce every feedstock (or package used) has a license file. That said, having a warning (not an error) would be a good first step in that direction. To be clear I don't disagree with the direction of the original PR, only the rate at which it moves in that direction.

Yeah, I think that has convinced me. Basically, I would want an error on stage-recipes and a warning to fix for the existing feedstocks. But that is too annoying to implement, practically.

@isuruf
Copy link
Member

isuruf commented Jul 3, 2019

On the other hand, the large number of feedstocks missing license_file is motivation for to make this a lint. I (and I assume other maintainers are too) mostly ignore comments by the linter when the PR is green and just merge. If the status is red, then the maintainers will notice and do something about it.

@scopatz
Copy link
Member

scopatz commented Jul 3, 2019

right but maybe we should go through a "deprecation" cycle of sorts where we hint for a while (saying that we will lint in a fee months) and then we lint.

@isuruf
Copy link
Member

isuruf commented Jul 3, 2019

We've had the license_file is packaged checkbox in PR template for months now.

@jakirkham
Copy link
Member Author

I (and I assume other maintainers are too) mostly ignore comments by the linter when the PR is green and just merge. If the status is red, then the maintainers will notice and do something about it.

I'm not sure this is true. We frequently hear back from people when re-rendering fails with a "nothing to do" message. Also have seen people actively mitigate warnings (like switching to pip). Though ultimately we need metrics to really understand how impactful warnings are. In any event, a warning is better than no warning.

@isuruf
Copy link
Member

isuruf commented Jul 3, 2019

As I said earlier, there is a checkbox for "license_file is packaged" in PR template for more than a year (since 3.0.0) and still there are more than a thousand packages without a license_file. To fix this problem in conda-forge, I think we should make the linter fail which captures the attention of maintainers.

@jakirkham
Copy link
Member Author

Again, as noted here, I don't think the package maintainers are the issue. Instead there are many libraries that simply don't ship license files in the first place. Some of these are deep in our stack. We've had discussions with prominent members of the community ( please see jaraco/tempora#1 and jaraco/skeleton#1 ) to impress the value of shipping licenses. We've slowly changed peoples' minds on this issue, but still there is much work to be done here. To be clear the work to be done is in the libraries themselves and only then the packages. Making the linter a bit more agro doesn't really facilitate this. Educating people at PyCON and SciPy on the value of shipping license files might be.

@isuruf
Copy link
Member

isuruf commented Jul 3, 2019

There are several cases here

  1. Packages with a license like BSD, MIT, GPL, but the recipe doesn't have a license_file.
  2. Packages that are public domain, but the recipe doesn't have a license_file.
  3. Packages without a license file, but mentions BSD, MIT somewhere

We should fix 1, and the linter complains.
There's nothing to do for 2, and the linter doesn't complain.
We shouldn't package 3, because BSD, MIT requires the copyright notice and the license text and if there's no such license file, we simply can't package it.

From your comment, you are concerned about 3 and wants to simply create a conda package for that package without a license file. My point is that we don't have the right to ship packages in case 3.

Can you please explain why you think we have the right to ship packages in case 3?

@CJ-Wright
Copy link
Member

I see two problems, the first is that we have existing packages which might not meet out standards, and we may accept future packages which don't meet our standards.

I think the fix for the future problem is to lint. I think we do have ways to lint on staged-recipes which don't run on the feedstocks themselves (maybe deleting the examples dir as one instance of this?)

To what extent can we inspect/migrate our way out of the existing package problem?
For instance, xonsh ships a license file in its tarball. We could inspect that a file which is named license exists and try to link to that.

If it isn't in the tarball but a repo is listed on the source/dev/doc urls maybe we can go there and look for a license to copy over.

If all that fails then try to make the maintainers aware? (or maybe inspect the license metadata (for those which ship metadata but not files) and have the bot open a PR into the project itself with the associated license file if we want to be super forward)

I think we'll get more conversions to shipping a license for those things which provide a license, in some form, if the bot does it than if we force the maintainers to do it themselves. At that point we will be left with, hopefully a handful of, packages which don't have any license info provided and we can take corrective action as needed.

@isuruf
Copy link
Member

isuruf commented Jul 9, 2019

@CJ-Wright, from what I gather from your comment, your issue is with packages in category 1 as in #1102 (comment) and not with packages in category 3 as @jakirkham is concerned about, right?

@CJ-Wright
Copy link
Member

I'm concerned with all the cases.

I think

(or maybe inspect the license metadata (for those which ship metadata but not files) and have the bot open a PR into the project itself with the associated license file if we want to be super forward)

helps to address category 3 by moving it to category 1. As does the option to

take corrective action as needed

I think it would be good to assign numbers to these categories. 20% of conda-forge is a large number. It may be that very few fall into category 3. This doesn't diminish that category's importance but does help to define the scope of the problem and may help us understand our path forward.

@isuruf
Copy link
Member

isuruf commented Jul 9, 2019

From my point of view, making the lint a hint especially for category 1 is equivalent to,

We did something illegal sometime ago and let's just keep doing it if the maintainer finds it convenient.

@mariusvniekerk
Copy link
Member

I think making tooling to go fix the existing feedstocks automatically is a good idea, but WILDLY our of the scope of this pull request.
We can make the linter a little smarter about providing a suggestion potentially for the license file if we can detect one in the sdist

@CJ-Wright
Copy link
Member

I agree with the scoping @mariusvniekerk most likely this PR should generate an issue on https://github.com/conda-forge/conda-forge.github.io discussing what mitigation steps we can take and how best to engage with the community on this issue. (And maybe host some more detailed numbers)

@dhirschfeld
Copy link
Member

dhirschfeld commented Jul 9, 2019

Is it really that hard to enforce that a feedstock maintainer has to provide a license file?

I maintain packages where the license file wasn't included in the sdist (because it wasn't listed in MANIFEST.in) and in those cases I've had to manually copy the contents of the LICENSE file into the feedstock repo. It isn't a great burden, and it has prompted me to open PRs on the upstream repos to fix the problem. In that way, by requiring a license for conda-forge it can improve the practices of the whole ecosystem.

  1. Packages with a license like BSD, MIT, GPL, but the recipe doesn't have a license_file.

Require the license be copied in the feedstock if it is not in the source. Submit PRs (or even just issues) upstream to highlight the problem

  1. Packages that are public domain, but the recipe doesn't have a license_file.

Require the feedstock maintainer to package a LICENSE file stating the code is public domain - assuming you somehow know for a fact that it is (if so, you can add references in the commit message)

  1. Packages without a license file, but mentions BSD, MIT somewhere

Require the feedstock maintainer to package a LICENSE file stating the code is unlicensed as it doesn't meet the conditions of the stated license. Encourage the maintainer to submit an upstream issue to highlight the problem

@isuruf
Copy link
Member

isuruf commented Jul 16, 2019

@conda-forge/core, can we please get a decision so that we can release a new version of conda-smithy for the R migration

@bgruening
Copy link

I'm with @isuruf here. We need to be more strict here not less.

@jakirkham
Copy link
Member Author

Both proposals are more strict. The linter will no longer be silent about missing license files. It’s just a question of how much volume we want 😉

@isuruf
Copy link
Member

isuruf commented Jul 18, 2019

@jakirkham, can you regenerate the list of feedstocks that have missing license_file entries. I want to compare the two lists to see how much progress we've made with the R migration.

@isuruf
Copy link
Member

isuruf commented Nov 10, 2019

@conda-forge/core any word from NumFOCUS legal?

@scopatz
Copy link
Member

scopatz commented Nov 10, 2019

I don't think anyone has asked formally yet, though there was some discussion at PyData NYC

@ocefpaf
Copy link
Member

ocefpaf commented Nov 13, 2019

We discussed this on today's meeting and asking NumFOCUS or not we should ask what we want to do. Do we want to enforce licenses as much as we can? If the answer is yes we should make that a linter error, not a hint, and merge the PR with the linter error on special cases only.

@ocefpaf
Copy link
Member

ocefpaf commented Nov 18, 2019

Closing this to see of people interested in having this will comment on #1102 (comment), otherwise it is solved and merging with a failed linter on special cases only will be the way to go.

@isuruf
Copy link
Member

isuruf commented Nov 20, 2019

Do we want to enforce licenses as much as we can?

Yes. It is required for the license types given and not something we can choose to do or not.

@isuruf
Copy link
Member

isuruf commented Nov 20, 2019

PR reverting this is at conda-forge/conda-smithy-feedstock#171

@ocefpaf
Copy link
Member

ocefpaf commented Nov 20, 2019

Let's go with #171.

@ocefpaf ocefpaf closed this Nov 20, 2019
@jakirkham jakirkham deleted the lic_hint branch November 20, 2019 22:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants